A Time Series Representation Framework Based on Learned Patterns
Mustafa Gokce Baydogan●, George Runger*, Didem Yamak†
● Boğaziçi University, * Arizona State University, † DeVry University
10/5/2013
8th INFORMS Workshop on Data Mining and Health Informatics (DM-HI 2013)

Outline
• Time series data mining
  - Motivation
  - Representing time series
  - Measuring similarity
• Learning a pattern-based representation
  - Pattern (relationship) discovery
  - Learned pattern similarity (LPS)
• Computational experiments and results
• Conclusions and future work

Mustafa Gokce Baydogan, George Runger and Didem Yamak, DM-HI 2013, Minneapolis

Time Series Data Mining: Motivations
• People measure things, and things (with rare exceptions) change over time
• Time series are everywhere (e.g. ECG, heartbeat, stock)
• Consider a patient's medical record: test values, observations, actions and related responses

Time Series Data Mining: Motivations
• Other types of data can be converted to time series
• Everything is about the representation
• Example: recognizing words
  - An example word "Alexandria" from the dataset of word profiles for George Washington's manuscripts
  - A word can be represented by two time series created by moving over and under the word
Images from E. Keogh. A quick tour of the datasets for VLDB 2008. In VLDB, 2008.

Challenges
• Local patterns are important
• Translations and dilations (warping)
  - The four observed peaks are related to a certain event in the manufacturing process
  - The time of the peaks may change (two peaks are observed earlier for the blue series)
  - Indication of a problem: the problem occurred over a shorter time interval

Challenges
• Time series are usually noisy
• Multivariate time series (MTS)
  - Relation of patterns within the series and interactions between series may be important
• High dimensionality

Motivations
• Time series representation
  - To reduce high dimensionality and noise
  - To capture trends, shapes and patterns, as they provide more information than the exact values of each time series data point
• Time series similarity
  - Accurate
  - Handles warping
  - Fast

Time series representation
[Figure not recovered]
* Allows lower bounding for similarity computations

Time series similarity
[Figure not recovered: three similarity measures; their names did not survive extraction]
• First measure
  - Popular (no parameter)
  - Intuitive
  - Fast computation
  - Performs poorly
• Second measure
  - Very popular (no parameter)
  - Handles warping (accurate)
  - Hard to beat
  - May perform poorly (long series with noise)
• Third measure
  - Handles warping (accurate)
  - Too many parameters to tune
  - Computationally inefficient

Learning a pattern-based representation
• A regression tree-based approach is used to learn a representation
• Earlier (Geurts, 2001), [text not recovered]
Your data matrix:
    t     observed value
    1     0.440
    2     0.363
    3     0.081
    4     0.083
    ...   ...
    127   0.962
    128   0.553

A new learning approach
• Predicting (forecasting) a segment
• Your data matrix [figure not recovered]
• Forecast ∆ (gap) time units forward

Representation: learned patterns
• Time series is 128 units long
• Predictor segment: 1-60
• Response segment: 51-111

Multiple segments
• Concatenate for all time series to create [text not recovered]
  1. Randomly select a response segment (column) of length L
  2. Build a regression tree
     - Multiple random ∆ levels
     - At each split decision, select a random predictor column (one segment at each time)*
• Build J trees with depth D
*Known to work well for regression: P. Geurts, D. Ernst, and L. Wehenkel. Extremely randomized trees. Machine Learning, 63(1):3-42, 2006.

Multiple segments (cont.)
[Figure: Tree #1, Tree #2, Tree #3, ..., Tree #J, each with its own plot; axis ticks and example node values omitted]
1. Aggregate the information over all trees for prediction (i.e. denoising)
   - Each terminal node defines a basis
2. Pattern-based representation (a vector of size R×J)

Similarity measure: Learned Pattern Similarity (LPS)
• A time series is represented by a vector [symbol not recovered]; the similarity is defined over the kth entries of two such vectors* [equation not recovered]
• Penalizes the number of mismatches
• Robust to noise
• Series with mismatching observations in the patterns are different
• Implicitly works on the discrete values
• Robust to warping
  - Representation learning handles the problem of warping
*Assuming each tree has R terminal nodes

Similarity measure (cont.)
• The computations are similar to Euclidean distance
  - Fast
  - Allows for bounding schemes
• Early abandon
  - Similarity search: find the reference time series that is most similar to the query series
  - Keep a record of the best distance found so far
  - Stop computing the distance for a reference series if the current distance is larger than the best-so-far
  - Known to improve the testing time (query time) significantly

S-MTS Experiments
• 45 univariate time series datasets from the UCR database*
  - Compared to popular NN classifiers with different distance measures: Euclidean, DTW (constrained and unconstrained versions), SpADe, Sparse Spatial Sample Kernels (SSSK)
• Addition of difference series
  - Taking trend information into consideration
• A multivariate time series extension
  - If time permits
• Parameters
  - Cross-validation to set parameters for each dataset
    Segment length (L): 0.25, 0.5 or 0.75 times the time series length
    Depth of trees: 4, 6 or 8
  - Number of trees = 150 (not important if set large enough)
*http://www.cs.ucr.edu/~eamonn/time_series_data/

Univariate datasets
• Health
• Energy
• Robotics
• Astronomy
• Chemistry
• Gesture recognition
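
The early-abandon search described under "Similarity measure (cont.)" can be sketched as follows. This is a minimal illustration, not the LPS package's code: the function names are hypothetical, and the distance is assumed to be a running sum of absolute differences between the entries of two representation vectors, since the slides' own formula did not survive extraction. Any distance that only grows as terms are added would work the same way.

```python
from math import inf
from typing import Sequence, Tuple


def early_abandon_distance(query: Sequence[float],
                           reference: Sequence[float],
                           best_so_far: float) -> float:
    """Sum |q_k - r_k| entry by entry, stopping once the sum exceeds best_so_far.

    Returns the full distance, or inf if the computation was abandoned.
    The absolute-difference form is an assumption made for illustration.
    """
    total = 0.0
    for q, r in zip(query, reference):
        total += abs(q - r)
        if total > best_so_far:
            return inf  # this reference can no longer beat the current best
    return total


def nearest_reference(query: Sequence[float],
                      references: Sequence[Sequence[float]]) -> Tuple[int, float]:
    """Return (index, distance) of the reference closest to the query."""
    best_index, best_distance = -1, inf
    for i, ref in enumerate(references):
        d = early_abandon_distance(query, ref, best_distance)
        if d < best_distance:
            best_index, best_distance = i, d  # record the new best-so-far
    return best_index, best_distance


if __name__ == "__main__":
    refs = [[0, 0, 0, 9], [3, 1, 2, 1], [9, 9, 9, 9]]
    print(nearest_reference([3, 1, 2, 2], refs))
```

In this toy run the third reference is abandoned after its first entry, because its running sum already exceeds the best distance found so far. The saving grows with the length of the representation vector.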

Parameters
• Illustration over 6 datasets (L = 0.5×T)
[Figure not recovered]

Average error rates over 10 replications
[Table not recovered]

Multivariate time series
• While training, randomly select one univariate time series and a target segment
  - Complexity does not change
• Find splits over randomly selected predictor segments of randomly selected univariate time series
  - More trees with larger depth may be required
• uWaveGestureLibrary: the accelerometer readings in three dimensions (i.e. x, y and z)
  - The same parameters result in an error rate of 0.022

LPS: Conclusions and future work
• A new approach for time series representation
  - Captures relations between and within the series
  - Features are learned within the algorithm (not pre-specified)
  - Handles nominal and missing values
  - Handles warping by representation learning
• Scalable (also allows for parallel implementation)
  - Training complexity: O(JNTD), linear in the time series length and the number of training series
  - Training took at most 6 minutes for 45 datasets (single thread, J=150, D=8, N=1800, T=750)
  - SpADe did not return a result after a week of running
  - Similarity search takes less than a millisecond
• Fast and accurate results with few parameters

LPS: Conclusions and future work
• This approach can be extended to many data mining tasks (for both univariate and multivariate time series and images), such as
  - Denoising (in progress)
  - Forecasting (in progress)
  - Anomaly detection (in progress)
  - Clustering (in progress)
  - Indexing
  - …
• The LPS package is provided at http://www.mustafabaydogan.com/learned-pattern-similarity-lps.html

Thanks! Questions and Comments?