NTU/Intel M2M Project: Wireless Sensor Networks
Content Analysis and Management Special Interest Group
Data Analysis Team
Sub-Teams: Discriminant Classification, Graphical Learning, Anomaly Detection, Pattern Mining
Monthly Report: September, 2011

1. Team Organization
Principal Investigator: Shou-De Lin
Co-Principal Investigator: Yung-Jen (Jane) Hsu
Team Leader: Todd McKenzie
Team Members: Peng-Hua Gong, Hsun-Ping Hsieh, Fu-Chun Hsu, Chung-Yi Li, Ting-Wei Lin, Wei-Lun Su, En-Hsu Yen, Tu-Chun Yin

2. Discussion with Champions
Number of meetings with the champion in the current month: 1 (8/29)

3. Progress between last month and this month

Time Series Mining
• A thorough survey of papers from the following top conferences / workshops:
  – SIGKDD 2011, 2010, 2009
  – ICML 2011, 2010, 2009, 2008
  – IJCAI 2011, 2009
  – ICDM 2010, 2009
  – NIPS 2010, 2009, 2008
  – SensorKDD 2011, 2010 (workshop)
• We mainly focused on two MTS (multivariate time series) topics:
  – Feature extraction techniques
  – Semi-supervised learning
• We identified four sensor network scenarios (shown at right in the original slides).

Literature Review: Feature Extraction Techniques
• Y. Liu et al., Decoding Ipsilateral Finger Movements from ECoG Signals in Humans, NIPS 2010
  – Uses ECoG (electrocorticography) signals in three classification tasks: binary classification (whether a particular finger moves), multiclass classification (which finger moves), and multitask classification (whether any finger moves).
  – Feature vector: the one-shot readouts of all ECoG channels.
• Z. Xing et al., Extracting Interpretable Features for Early Classification on Time Series, SIGKDD 2011
  – Introduces the "shapelet", a subsequence that maximizes information gain, which can be used as a discriminative and interpretable feature.
• J. Wiens et al., Active Learning Applied to Patient-Adaptive Heartbeat Classification, NIPS 2010
  – Performs heartbeat classification from ECG recordings.
• J. R. Kwapisz et al., Activity Recognition using Cell Phone Accelerometers, SensorKDD 2010
  – Performs activity recognition using a smartphone's accelerometer (x, y, z readings).
  – Feature vector: average acceleration (3), standard deviation of acceleration (3), average absolute difference (3), average resultant acceleration (1), time between peaks (3), binned distribution (30). A sketch of this kind of window-level feature extraction appears after this list.
• R. Srinivasan et al., Activity Recognition using Actigraph Sensor, SensorKDD 2010
  – Uses a wearable actigraph watch to recognize ADL (Activities of Daily Living).
  – Feature vector: min ZC value, max ZC value, sleep status, time length, begin hour, number of events, bin, total ZC value, previous activity, next activity.
• C. Chen et al., Energy Prediction Based on Resident's Activity, SensorKDD 2010
  – Predicts energy usage by tracking residents' motion and activities in a smart home environment in which different types of sensors (motion sensors, temperature sensors, etc.) are deployed.
  – Feature vector: activity length, previous activity, next activity, number of kinds of motion sensors involved, total number of motion sensor events triggered, and motion sensors M1…M51 (on/off).
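As a quick illustration of the window-level accelerometer features listed for Kwapisz et al. above, here is a minimal Python sketch. This is not the authors' code: the function name is ours, and the peak-timing and binning details are simplified assumptions.

```python
import numpy as np

def accel_features(window, n_bins=10):
    """Window-level features for a (T, 3) array of x/y/z accelerometer
    readings, loosely following the Kwapisz et al. feature list."""
    feats = []
    mean = window.mean(axis=0)                       # average acceleration (3)
    std = window.std(axis=0)                         # standard deviation (3)
    abs_diff = np.abs(window - mean).mean(axis=0)    # average absolute difference (3)
    resultant = np.sqrt((window ** 2).sum(axis=1)).mean()  # average resultant acceleration (1)
    feats.extend(mean); feats.extend(std); feats.extend(abs_diff); feats.append(resultant)

    # Time between peaks (3): mean gap between local maxima per axis,
    # a simple stand-in for the peak-based timing feature in the paper.
    for axis in range(3):
        s = window[:, axis]
        peaks = np.where((s[1:-1] > s[:-2]) & (s[1:-1] > s[2:]))[0] + 1
        feats.append(np.diff(peaks).mean() if len(peaks) > 1 else 0.0)

    # Binned distribution (3 * n_bins): per-axis histogram, normalized per axis.
    for axis in range(3):
        hist, _ = np.histogram(window[:, axis], bins=n_bins)
        feats.extend(hist / max(hist.sum(), 1))

    return np.asarray(feats, dtype=float)

# Example: one 10-second window sampled at 20 Hz (200 readings).
window = np.random.randn(200, 3)
print(accel_features(window).shape)   # (43,) = 3 + 3 + 3 + 1 + 3 + 30
```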
Literature Review: Semi-supervised Learning
• Recent literature reveals that time series stream data often has a small number of labeled samples and a large number of unlabeled samples.
  – Solution: use the positively labeled data for semi-supervised learning, and build clusters over the unlabeled set to compare against the initial classifier.
• Wei and Keogh, Semi-Supervised Time Series Classification, SIGKDD 2006
  – First paper to apply semi-supervised learning to time series classification.
  – Given a set P of positively labeled samples and a set U of unlabeled samples, train a 1NN classifier, select the most confidently positive samples, add them to P, and retrain.
• Nguyen et al., Positive Unlabeled Learning for Time Series Classification, IJCAI 2011
  – Partition U into local clusters and compute each cluster's distance to P; if the distance exceeds a threshold, mark the cluster as reliable negative (RN), otherwise as ambiguous (AMBI).
  – The decision boundary is then derived by assigning AMBI samples to RN or P.
• Zhang et al., Classifier and Cluster Ensembles for Mining Concept Drifting Data Streams, ICDM 2010
  – Segment the time series and build classifiers or clusters depending on whether the samples are labeled.
  – Propagate label information from classifiers to clusters, then iteratively refine the results by propagating similarities among all clusters.

Anomaly Detection
• Existing work on spatial anomaly detection:
  – Identify the neighbors of every sensor and use their readings to predict / interpolate that sensor's value.
  – Neighbor identification: (1) physical connection, (2) spatial information, (3) feature similarity.
  – Then compare the distance between the predicted and observed values.
• Our idea: model the original distances and their relationships together.
  – A sensor whose predicted physical distance (or rank), as inferred from our model, is far from its original distance (or rank) may be a potential outlier.
• Location prediction (a minimal sketch of this pairwise setup follows this section):
  – Randomly divide the sensors into a training set and a testing set.
  – Use the training set to predict the physical distances of the testing set; the results can then be ranked.
  – Features: multi-dimensional similarity, correlation, etc.
  – Training instances are sensor pairs. For example, with train = {1, 2, 3, 4} and test = {5, 6, 7, 8}: training instances are 1-2, 1-3, 1-4, 2-3, 2-4, 3-4; test instances are 5-1, 5-2, 5-3, 5-4, 5-6, 5-7, 5-8, 6-1, …
  – Regression methods tried: SVR, linear regression, Pace regression, SMOreg, …
• SMOreg performs best:
  – Mean absolute error: 1.1
  – Correlation coefficient: 0.5133
  – Kendall tau rank correlation: 66.4%
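Below is a minimal sketch of the pairwise location-prediction setup described above. It uses scikit-learn's SVR as a stand-in for Weka's SMOreg, and the pairwise features (correlation, mean absolute difference) and synthetic readings/positions are placeholders for the team's actual similarity and correlation features.

```python
import numpy as np
from itertools import combinations
from sklearn.svm import SVR

def pair_features(a, b):
    """Illustrative pairwise features from two sensors' reading series."""
    return [np.corrcoef(a, b)[0, 1], np.abs(a - b).mean()]

def build_pairs(ids, readings, positions):
    """Training instances are sensor pairs; the target is their physical distance."""
    X, y = [], []
    for i, j in combinations(ids, 2):
        X.append(pair_features(readings[i], readings[j]))
        y.append(np.linalg.norm(positions[i] - positions[j]))
    return np.array(X), np.array(y)

# Toy data: 8 sensors, 2-D positions, 100 readings each (all synthetic).
rng = np.random.default_rng(0)
positions = {s: rng.uniform(0, 10, size=2) for s in range(1, 9)}
readings = {s: rng.normal(size=100) for s in range(1, 9)}

train_ids, test_ids = [1, 2, 3, 4], [5, 6, 7, 8]
X_tr, y_tr = build_pairs(train_ids, readings, positions)

model = SVR()          # stands in for Weka's SMOreg
model.fit(X_tr, y_tr)

# For each test sensor, predict its distance to every other sensor and rank them.
for s in test_ids:
    others = [o for o in range(1, 9) if o != s]
    X_te = np.array([pair_features(readings[s], readings[o]) for o in others])
    pred = model.predict(X_te)
    print(s, [o for _, o in sorted(zip(pred, others))])
```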
Current Deployments
• Server & sensors:
  – 3F: R302
  – 5F: R513
• Test situations:
  – Switch and move the sensor
  – Same room
  – Different rooms
• Data is currently being collected. By creating different scenarios, we can discover how harsh environmental conditions affect wireless sensor network data.
• Four sensors are used in these experiments. Two are Taroko motes with no battery box, powered by a computer; the other two are Octopus motes, powered by batteries.
• Sensor roles:
  – Sensors No. 3 and No. 4 play the role of the experimental group in these scenario experiments.
  – Sensors No. 1 and No. 2 are the "normal" (i.e., control) sensors, with a full power supply and no extreme-condition testing.
  – All data are collected through a single-hop receiver.

Scenario 1
• Battery in a depleted power state.
• Expectation: error rate and data fault rate increase.
• Result: humidity and voltage readings are seriously affected when battery power becomes depleted.

Scenario 2
• Extreme-humidity environment: place the sensor in a covered bowl surrounded by water.
• Expectation: faults in the temperature readings increase or emerge.
• Result: drift was discovered, but the precise data still has to be examined and compared with the control group. Some data are missing, but the rate is small.

Scenario 3
• Similar to Scenario 2, but with boiled water.
• Expectation: the temperature and humidity outputs show drift and noise.
• Processing...

Further Scenarios
• Body sensor network with one sensor running out of battery.
• Other extreme conditions, such as on top of a refrigerator, in a moist bathroom, or exposed directly to sunlight.
• Experiments on "out-of-range" values, i.e., values over or under the sensor's sensing range.

Classification
• Investigated concepts introduced in activity recognition papers:
  – MMCRF
  – Latent SVM
  – Structural SVM
  – SVM-HMM
  – CRF
  – Segmentation
  – Denoising with SVD
  – PCA
  – K-SVD
  – Compressive sensing
  – MF + sensor network

Structural SVM
• Motivation:
  – We prefer a discriminant model for classification, which usually gives better performance.
  – Unlike standard supervised learning, the label y now has structure.
• How to solve:
  – View the problem as multi-class classification; the number of classes is exponential, so a speed-up is needed.
  – Joint feature map: one weight for each feature of (x, y).
  – Add the most violated constraints into the working set.

Graphical Models
Graphical models are a general approach for event tracking, pattern detection, and classification, and many papers devise such models for sensor networks (e.g., structural SVM, HHMM). However, inference on those models is often intractable for large systems due to:
1. A large number of tracked events in the database (e.g., object recognition with a large database).
2. Structural uncertainty (e.g., which observation belongs to which event/pattern?).
3. A large space of potential patterns.
Our goal: develop an indexing technique for models with a large search space so that inference time does not grow with system scale.
Model proposed by Nikos Komodakis and Georgios Tziritas (2006); search space of about 50,000 labels (from surrounding patches). For output of the same quality:
– Standard optimization algorithm: quadratic complexity, intractable (cost > 1 hr).
– Label pruning technique (Komodakis et al.): roughly linear complexity, cost 50 sec.
– Hashing technique (ours): sub-linear complexity, cost 4 sec. A generic sketch of hashing-based candidate pruning is appended at the end of this report.

5. Research Byproducts
n/a
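Appendix: the following is only a generic sketch of the idea behind hashing-based candidate pruning mentioned under Graphical Models, not the team's actual indexing method. It uses random-hyperplane LSH so that a query is scored against a small bucket of candidate labels instead of the full ~50,000-label space; all names and numbers are illustrative.

```python
import numpy as np

class RandomProjectionLSH:
    """Minimal random-hyperplane LSH index: each item (e.g. a candidate
    label/patch descriptor) is hashed to a short binary code, and a query
    only scores items sharing its bucket instead of all labels."""

    def __init__(self, dim, n_bits=8, seed=0):
        rng = np.random.default_rng(seed)
        self.planes = rng.normal(size=(n_bits, dim))
        self.buckets = {}

    def _code(self, x):
        return tuple((self.planes @ x > 0).astype(int))

    def add(self, label, x):
        self.buckets.setdefault(self._code(x), []).append((label, x))

    def candidates(self, q):
        return self.buckets.get(self._code(q), [])

# Toy usage: index 50,000 label descriptors, then fetch a small candidate
# set for one query instead of scoring every label.
dim = 64
index = RandomProjectionLSH(dim, n_bits=8)
descriptors = np.random.randn(50_000, dim)
for lbl, vec in enumerate(descriptors):
    index.add(lbl, vec)

query = descriptors[123] + 0.01 * np.random.randn(dim)
cands = index.candidates(query)
best = min(cands, key=lambda lv: np.linalg.norm(lv[1] - query)) if cands else None
print(len(cands), best[0] if best else None)   # e.g. ~200 candidates instead of 50,000
```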