Toward a Semi-Supervised Non-Intrusive Load Monitoring System for Event-based Energy Disaggregation
GlobalSIP 2015, 14th Dec.
Karim Said Barsim and Bin Yang
Institute of Signal Processing and System Theory
University of Stuttgart, Stuttgart, Germany
{karim.barsim,bin.yang}@iss.uni-Stuttgart.de
Outline
• Introduction
• Labeled and unlabeled data
• Task & tools
• Self-training 1 & 2
• Transductive learning 1 & 2
• Inductive learning 1 & 2

Introduction

Semi-Supervised Learning (SSL): leverage both labeled and unlabeled data.

Motivations:
• Labeled data are scarce, costly, and time-consuming to obtain.
• Unlabeled data are plentiful, cheap, and rapidly growing.

Advantages:
• Improved performance with reduced labeling effort.
• NILM systems that keep learning over time.

Requirements / Limitations:
• Cluster assumption: decision boundaries pass through sparse regions only.
• Manifold assumption: samples with the same label lie close in geometry.
• Lazy training (transductive SSL models).

The learning spectrum:
• Supervised machine learning: eager learning, labeled data.
• Semi-supervised learning (SSL):
  – Inductive learning: eager learning, labeled & unlabeled data.
  – Transductive learning: lazy learning, labeled & unlabeled data.
  – Constrained clustering: lazy learning, unlabeled data & labeling constraints.
• Unsupervised machine learning: lazy learning, unlabeled data.
Labeled & Unlabeled data

Labeled data:
• Sub-metered loads (eventless).
• Labeled events (event-based).

[Figure: power signal of a PC at 1 Hz, with labeled start-up and reboot events.]

Unlabeled data:
• Aggregate signals (eventless NILM).
• Segmented signals (event-based NILM).
Task and tools

Is semi-supervised learning suitable for NILM systems?

An SSL model: self-training
• Advantages:
  – Simple SSL model.
  – A wrapper model.
  – Does not require unsupervised components.
• Requirements:
  – A learning algorithm h or a seed classifier f̂^(0).
  – Confidence-rated predictions.
• Limitations:
  – Separable data/classes.

NILM test dataset: BLUED [1]
• Suitable for event-based NILM.

[1] K. Anderson, A. Ocneanu, D. Benitez, D. Carlson, A. Rowe, and M. Berges, "BLUED: a fully labeled public dataset for event-based non-intrusive load monitoring research," in Proceedings of the 2nd KDD Workshop on Data Mining Applications in Sustainability (SustKDD), Beijing, China, Aug. 2012.
Self-training 1

Labeled dataset:    ℒ = { (𝒙_n, y_n) },  n = 1, …, N_ℒ
Unlabeled dataset:  𝒰 = { 𝒙_n },  n = N_ℒ + 1, …, N_ℒ + N_𝒰

Training:    f̂^(t) = h(ℒ ∪ ℒ̂)
Prediction:  ŷ_n = f̂^(t)(𝒙_n),  𝒙_n ∈ 𝒰
Selection:   ℒ̂ = Sel( { (𝒙_n, ŷ_n) } )

[Figure: data splits — training dataset at t0 and at t1, validation dataset, transductive set, inductive set.]
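The training–prediction–selection loop above can be sketched in Python. The nearest-class-mean learner below is only an illustrative stand-in for the generic confidence-rated learner h; all function and variable names here are hypothetical, not from the talk.

```python
import numpy as np

def fit(X, y):
    """Minimal confidence-rated learner h: nearest class mean."""
    classes = np.unique(y)
    means = np.array([X[y == c].mean(axis=0) for c in classes])
    return classes, means

def predict_conf(model, X):
    """Return predicted labels and a confidence score per sample."""
    classes, means = model
    d = np.linalg.norm(X[:, None, :] - means[None, :, :], axis=2)
    return classes[d.argmin(axis=1)], -d.min(axis=1)  # closer => more confident

def self_train(X_lab, y_lab, X_unlab, n_iter=10, k=1):
    """Self-training loop: f^(t) = h(L ∪ L̂), predict on the unlabeled
    pool, then move the k most confident pseudo-labeled samples into L̂."""
    X_pseudo = np.empty((0, X_lab.shape[1]))
    y_pseudo = np.empty((0,), dtype=y_lab.dtype)
    X_pool = np.asarray(X_unlab, dtype=float)
    for _ in range(n_iter):
        if len(X_pool) == 0:
            break
        model = fit(np.vstack([X_lab, X_pseudo]),
                    np.concatenate([y_lab, y_pseudo]))   # training step
        y_hat, conf = predict_conf(model, X_pool)        # prediction step
        sel = np.argsort(conf)[-k:]                      # selection step
        X_pseudo = np.vstack([X_pseudo, X_pool[sel]])
        y_pseudo = np.concatenate([y_pseudo, y_hat[sel]])
        X_pool = np.delete(X_pool, sel, axis=0)
    return fit(np.vstack([X_lab, X_pseudo]),
               np.concatenate([y_lab, y_pseudo]))
```

Any eager learner with confidence-rated predictions can replace `fit`/`predict_conf`, which is what makes self-training a wrapper model.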
Self-training 2: the double crescent problem

Classification problem:
• The double moon (2-class).
• 1000 samples/class.

Learner/Classifier:
• Support Vector Machine (SVM) with a Gaussian kernel e^(−‖𝒙_i − 𝒙_j‖²).
• Minimal labeling (1 sample/class).

Selection: farthest from the boundary.

Results:
• 300 iterations: > 70%
• 500 iterations: > 80%
• 700 iterations: > 90%
• 1000 iterations: optimal decision boundary.
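The double-crescent experiment can be reproduced in sketch form with scikit-learn (assumed available). The noise level, kernel width, and iteration count below are illustrative choices, not the talk's exact settings.

```python
import numpy as np
from sklearn.datasets import make_moons
from sklearn.svm import SVC

# Two interleaved crescents, 1000 samples per class.
X, y = make_moons(n_samples=2000, noise=0.05, random_state=0)

# Minimal labeling: one seed sample per class.
L_idx = [int(np.flatnonzero(y == c)[0]) for c in (0, 1)]
U_idx = [i for i in range(len(X)) if i not in L_idx]
labels = {i: int(y[i]) for i in L_idx}           # seed labels are true labels

for _ in range(300):
    clf = SVC(kernel="rbf", gamma=2.0)           # Gaussian kernel
    clf.fit(X[L_idx], [labels[i] for i in L_idx])
    scores = clf.decision_function(X[U_idx])
    j = int(np.argmax(np.abs(scores)))           # selection: farthest from boundary
    pick = U_idx.pop(j)
    labels[pick] = int(clf.predict(X[[pick]])[0])
    L_idx.append(pick)
```

Each iteration absorbs the pseudo-labeled sample the current SVM is most certain about, so the labeled regions grow outward along the two crescents, matching the progression on the slide.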
Transductive learning: how much labeling?

• Reduced labeling effort: how much labeling is needed for near-optimal performance?
• When should SSL replace purely supervised models?

Setup:
• Object of classification: event-based feature vectors (dP, dQ)ᵀ.
• Classifier: Support Vector Machine (SVM) with a linear kernel.
• Selection: nearest to class mean (based on the labeled samples), 1 sample per class per iteration, 3 iterations.
• Dataset: BLUED (refined)
  – Phase A: 749 samples, 23 classes.
  – Phase B: 1284 samples, 45 classes.

[Figure: data split into a test dataset, a labeled set (5%, 10%, 15%, 20%), and a transductive set.]
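The "nearest to class mean" selection rule can be sketched as follows; the array names are hypothetical, and, as on the slide, the class means are computed from the labeled samples only.

```python
import numpy as np

def nearest_to_class_mean(X_lab, y_lab, X_unlab, y_hat):
    """For each class, pick the unlabeled sample whose pseudo-label
    y_hat matches that class and that lies closest to the class mean
    of the *labeled* samples. Returns one index per class."""
    picks = []
    for c in np.unique(y_lab):
        mean_c = X_lab[y_lab == c].mean(axis=0)
        cand = np.flatnonzero(y_hat == c)
        if cand.size:
            d = np.linalg.norm(X_unlab[cand] - mean_c, axis=1)
            picks.append(int(cand[d.argmin()]))
    return np.array(picks)
```

With 1 sample per class per iteration and 3 iterations, this rule adds at most 3 pseudo-labeled samples per class to the training set.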
Transductive learning: results

• At ~12% labeling, SSL is no longer required: the purely supervised model reaches near-optimal performance on its own.
• Remaining performance gaps are attributed to manifold-assumption violation.

[Figure: classification performance vs. labeling fraction, Phase A and Phase B.]
Inductive learning: learning over time?

• Effect of an increasing unlabeled dataset.
• The test dataset is fixed and includes inductive and transductive inference tests.

Setup (as in the transductive experiment):
• Object of classification: event-based feature vectors (dP, dQ)ᵀ.
• Classifier: Support Vector Machine (SVM) with a linear kernel.
• Selection: nearest to class mean (based on the labeled samples), 1 sample per class per iteration, 3 iterations.
• Dataset: BLUED (refined); Phase A: 749 samples, 23 classes; Phase B: 1284 samples, 45 classes.

[Figure: data split into a test dataset, a labeled set (5%), a transductive set, and a growing inductive set.]
Inductive learning: results

• Distinct learning phases appear as the unlabeled dataset grows.
• Irrelevant labeling: samples behind the support vectors contribute no new information.
• Performance dips reflect cluster-assumption violations.
• Unlabeled samples have no effect on purely supervised models.

[Figure: performance over a growing unlabeled dataset, Phase A and Phase B (4.67% labeling in both phases).]
Discussion

Thank you for your attention.