Toward a Semi-Supervised Non-Intrusive Load Monitoring System for Event-based Energy Disaggregation
GlobalSIP 2015, 14th Dec.
Karim Said Barsim and Bin Yang
Institute of Signal Processing and System Theory
University of Stuttgart, Stuttgart, Germany
{karim.barsim,bin.yang}@iss.uni-Stuttgart.de
Outline
• Introduction
• Labeled and unlabeled data
• Task & tools
• Self-training 1 & 2
• Transductive learning 1 & 2
• Inductive learning 1 & 2

Introduction

Semi-Supervised Learning (SSL): leverage both labeled and unlabeled data.

Motivations:
• Labeled data are scarce, costly, and time-consuming to obtain.
• Unlabeled data are plentiful, cheap, and rapidly growing.

Advantages:
• Improved performance with reduced labeling effort.
• NILM systems that keep learning over time.

Requirements / Limitations:
• Cluster assumption: decision boundaries pass through sparse regions only.
• Manifold assumption: samples with the same label lie close in geometry.
• Lazy training (transductive SSL models).

The learning spectrum:
• Supervised machine learning: eager learning, labeled data.
• Semi-supervised learning (SSL):
  – Inductive learning: eager learning, labeled & unlabeled data.
  – Transductive learning: lazy learning, labeled & unlabeled data.
  – Constrained clustering: lazy learning, unlabeled data & labeling constraints.
• Unsupervised machine learning: lazy learning, unlabeled data.
Labeled & Unlabeled data

Labeled data:
• Sub-metered loads (eventless).
• Labeled events (event-based).

[Figure: power signal of a PC at 1 Hz, with labeled start-up and reboot events.]

Unlabeled data:
• Aggregate signals (eventless NILM).
• Segmented signals (event-based NILM).
Task and tools

Is semi-supervised learning suitable for NILM systems?

An SSL model: self-training
• Advantages:
  – Simple SSL model.
  – A wrapper model.
  – Does not require unsupervised components.
• Requirements:
  – A learning algorithm h or a seed classifier f̂^(0).
  – Confidence-rated predictions.
• Limitations:
  – Separable data/classes.

NILM test dataset: BLUED [1]
• Suitable for event-based NILM.

[1] K. Anderson, A. Ocneanu, D. Benitez, D. Carlson, A. Rowe, and M. Berges, "BLUED: a fully labeled public dataset for event-based non-intrusive load monitoring research," in Proceedings of the 2nd KDD Workshop on Data Mining Applications in Sustainability (SustKDD), Beijing, China, Aug. 2012.
Self-training 1

Labeled dataset:    ℒ = { (𝒙_n, y_n) },  n = 1, …, N_ℒ
Unlabeled dataset:  𝒰 = { 𝒙_n },  n = N_ℒ + 1, …, N_ℒ + N_𝒰

Training:    f̂^(t) = h(ℒ ∪ ℒ̂)
Prediction:  ŷ_n = f̂^(t)(𝒙_n),  𝒙_n ∈ 𝒰
Selection:   ℒ̂ = Sel( { (𝒙_n, ŷ_n) } )

[Figure: data splits — training dataset at t0 and at t1, validation dataset, transductive set, inductive set.]
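The training–prediction–selection loop above can be sketched in Python. The nearest-class-mean learner below is only an illustrative stand-in for the generic confidence-rated learner h; all function and variable names here are hypothetical, not from the talk.

```python
import numpy as np

def fit(X, y):
    """Minimal confidence-rated learner h: nearest class mean."""
    classes = np.unique(y)
    means = np.array([X[y == c].mean(axis=0) for c in classes])
    return classes, means

def predict_conf(model, X):
    """Return predicted labels and a confidence score per sample."""
    classes, means = model
    d = np.linalg.norm(X[:, None, :] - means[None, :, :], axis=2)
    return classes[d.argmin(axis=1)], -d.min(axis=1)  # closer => more confident

def self_train(X_lab, y_lab, X_unlab, n_iter=10, k=1):
    """Self-training loop: f^(t) = h(L ∪ L̂), predict on the unlabeled
    pool, then move the k most confident pseudo-labeled samples into L̂."""
    X_pseudo = np.empty((0, X_lab.shape[1]))
    y_pseudo = np.empty((0,), dtype=y_lab.dtype)
    X_pool = np.asarray(X_unlab, dtype=float)
    for _ in range(n_iter):
        if len(X_pool) == 0:
            break
        model = fit(np.vstack([X_lab, X_pseudo]),
                    np.concatenate([y_lab, y_pseudo]))   # training step
        y_hat, conf = predict_conf(model, X_pool)        # prediction step
        sel = np.argsort(conf)[-k:]                      # selection step
        X_pseudo = np.vstack([X_pseudo, X_pool[sel]])
        y_pseudo = np.concatenate([y_pseudo, y_hat[sel]])
        X_pool = np.delete(X_pool, sel, axis=0)
    return fit(np.vstack([X_lab, X_pseudo]),
               np.concatenate([y_lab, y_pseudo]))
```

Any eager learner with confidence-rated predictions can replace `fit`/`predict_conf`, which is what makes self-training a wrapper model.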
Self-training 2: the double crescent problem

Classification problem:
• The double moon (2-class).
• 1000 samples/class.

Learner/Classifier:
• Support Vector Machine (SVM) with a Gaussian kernel e^(−‖𝒙_i − 𝒙_j‖²).
• Minimal labeling (1 sample/class).

Selection: farthest from the boundary.

Results:
• 300 iterations: > 70%
• 500 iterations: > 80%
• 700 iterations: > 90%
• 1000 iterations: optimal decision boundary.
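The double-crescent experiment can be reproduced in sketch form with scikit-learn (assumed available). The noise level, kernel width, and iteration count below are illustrative choices, not the talk's exact settings.

```python
import numpy as np
from sklearn.datasets import make_moons
from sklearn.svm import SVC

# Two interleaved crescents, 1000 samples per class.
X, y = make_moons(n_samples=2000, noise=0.05, random_state=0)

# Minimal labeling: one seed sample per class.
L_idx = [int(np.flatnonzero(y == c)[0]) for c in (0, 1)]
U_idx = [i for i in range(len(X)) if i not in L_idx]
labels = {i: int(y[i]) for i in L_idx}           # seed labels are true labels

for _ in range(300):
    clf = SVC(kernel="rbf", gamma=2.0)           # Gaussian kernel
    clf.fit(X[L_idx], [labels[i] for i in L_idx])
    scores = clf.decision_function(X[U_idx])
    j = int(np.argmax(np.abs(scores)))           # selection: farthest from boundary
    pick = U_idx.pop(j)
    labels[pick] = int(clf.predict(X[[pick]])[0])
    L_idx.append(pick)
```

Each iteration absorbs the pseudo-labeled sample the current SVM is most certain about, so the labeled regions grow outward along the two crescents, matching the progression on the slide.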
Transductive learning: how much labeling?

• Reduced labeling effort: how much labeling is needed for near-optimal performance?
• When should SSL replace purely supervised models?

Setup:
• Object of classification: event-based feature vectors (dP, dQ)ᵀ.
• Classifier: Support Vector Machine (SVM) with a linear kernel.
• Selection: nearest to class mean (based on the labeled samples), 1 sample per class per iteration, 3 iterations.
• Dataset: BLUED (refined)
  – Phase A: 749 samples, 23 classes.
  – Phase B: 1284 samples, 45 classes.

[Figure: data split into a test dataset, a labeled set (5%, 10%, 15%, 20%), and a transductive set.]
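The "nearest to class mean" selection rule can be sketched as follows; the array names are hypothetical, and, as on the slide, the class means are computed from the labeled samples only.

```python
import numpy as np

def nearest_to_class_mean(X_lab, y_lab, X_unlab, y_hat):
    """For each class, pick the unlabeled sample whose pseudo-label
    y_hat matches that class and that lies closest to the class mean
    of the *labeled* samples. Returns one index per class."""
    picks = []
    for c in np.unique(y_lab):
        mean_c = X_lab[y_lab == c].mean(axis=0)
        cand = np.flatnonzero(y_hat == c)
        if cand.size:
            d = np.linalg.norm(X_unlab[cand] - mean_c, axis=1)
            picks.append(int(cand[d.argmin()]))
    return np.array(picks)
```

With 1 sample per class per iteration and 3 iterations, this rule adds at most 3 pseudo-labeled samples per class to the training set.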
Transductive learning: results

• At ~12% labeling, SSL is no longer required: the purely supervised model reaches near-optimal performance on its own.
• Remaining performance gaps are attributed to manifold-assumption violation.

[Figure: classification performance vs. labeling fraction, Phase A and Phase B.]
Inductive learning: learning over time?

• Effect of an increasing unlabeled dataset.
• The test dataset is fixed and includes inductive and transductive inference tests.

Setup (as in the transductive experiment):
• Object of classification: event-based feature vectors (dP, dQ)ᵀ.
• Classifier: Support Vector Machine (SVM) with a linear kernel.
• Selection: nearest to class mean (based on the labeled samples), 1 sample per class per iteration, 3 iterations.
• Dataset: BLUED (refined); Phase A: 749 samples, 23 classes; Phase B: 1284 samples, 45 classes.

[Figure: data split into a test dataset, a labeled set (5%), a transductive set, and a growing inductive set.]
Inductive learning: results

• Distinct learning phases appear as the unlabeled dataset grows.
• Irrelevant labeling: samples behind the support vectors contribute no new information.
• Performance dips reflect cluster-assumption violations.
• Unlabeled samples have no effect on purely supervised models.

[Figure: performance over a growing unlabeled dataset, Phase A and Phase B (4.67% labeling in both phases).]
Discussion

Thank you for your attention.