AM207
Energy Disaggregation from Non-Intrusive Load Monitoring
Karen Yu, Nick Vasios, Thibaut Perol
Methods For Disaggregation (1)
Pros: easy to implement; mathematically attractive.
Cons: computationally inefficient; limited disaggregation accuracy.
Figure: Typical power consumption in a household within a time frame in which many appliances activate and deactivate
[Plot labels: Appliance Activations; Start time; Avg power; x-axis: Scaled Time]
Convolutional layer, relu activation
16 filters, filter size = 0.5 second
TP-FP-FN-TN
The acronyms stand for True Positives, False Positives, False Negatives and True Negatives, respectively. The definitions using the problem's notation are given in Eqs. (2)-(5), starting with the true positives of appliance $n$:
$$ TP^{(n)} = \sum_t \mathrm{AND}\left(x_t^{(n)} = \text{on},\ \hat{x}_t^{(n)} = \text{on}\right) \tag{2} $$
• The emission probability matrix B with dimensions $K^N \times 2$, typically drawn from a Gaussian distribution since observations are continuous.
Expectation-Maximization is used to train the model parameters and the Viterbi algorithm to
decode the most likely state sequence.
In this project we use the REDD data set [4], a public data set for energy disaggregation research. REDD is not readily accessible, since access can only be granted by the authors of [4] at MIT. While REDD is what made this project possible, it cannot be fed directly into the disaggregation algorithms developed or borrowed in this project. A data pipeline is needed to preprocess the data and feed it to the algorithms for both training and testing. While the training and disaggregation steps differ significantly between the methods we use, the remaining steps are the same and are outlined below.
Pros: state transitions are naturally modelled; improved disaggregation accuracy.
Cons: even more computationally inefficient; harder implementation.
The computational cost is $O(T K^{2N})$, where $N$ is the number of appliances, $K$ is the number of appliance states and $T$ is the number of time slices in the dataset. FHMM is more capable than CO but still extremely computationally inefficient: disaggregation across many appliances with multiple states over a large time window is almost impossible. Methods such as variational inference can produce approximations that are more computationally efficient.
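To give a feel for how quickly these bounds grow, the following minimal sketch evaluates the operation counts implied by the two complexity expressions quoted on this poster. The function names and the example sizes (T, K, N) are illustrative assumptions, not values from the project.

```python
def co_cost(T: int, K: int, N: int) -> int:
    """Operation count implied by the O(T * K**N) bound quoted for CO."""
    return T * K ** N


def fhmm_cost(T: int, K: int, N: int) -> int:
    """Operation count implied by the O(T * K**(2N)) bound quoted for FHMM."""
    return T * K ** (2 * N)


# Illustrative sizes only (assumed): 1000 time slices, 2 states per appliance.
T, K = 1000, 2
for N in (2, 5, 10):
    print(f"N={N}: CO ~ {co_cost(T, K, N)}, FHMM ~ {fhmm_cost(T, K, N)}")
```

Under these assumed sizes, adding appliances multiplies the FHMM count by K squared per appliance, against K for CO, which is why approximate inference becomes attractive.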
[Pipeline diagram labels: Data Interface (NILMTK, HDF5); Preprocessing; Training; Model; Disaggregation (algorithm: CO, FHMM or ConvNet)]
Convolutional Neural Networks are similar to ordinary Neural Networks (multi-layer perceptrons). Each neuron receives an input, performs a dot product with its weights and follows this with a non-linearity (here we only use ReLU). The whole network has a loss function, here the RMS error. The network implements the 'rectangle method': from the input sequence we infer the start time, the end time and the average power of ONE appliance.
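As an illustration of the rectangle method, the sketch below turns a predicted triple (start time, end time, average power) back into a rectangular power signal and computes an RMS error. It assumes that start and end are encoded as fractions of the input window, which the poster does not specify; the function names are hypothetical.

```python
import numpy as np


def rectangle_from_prediction(start, end, avg_power, seq_len):
    """Build a rectangular power signal from (start, end, avg_power).

    Assumption: start and end are fractions of the window in [0, 1].
    """
    signal = np.zeros(seq_len)
    i0 = int(round(float(np.clip(start, 0.0, 1.0)) * seq_len))
    i1 = int(round(float(np.clip(end, 0.0, 1.0)) * seq_len))
    if i1 > i0:
        signal[i0:i1] = avg_power  # constant power while the appliance is on
    return signal


def rms_error(pred, target):
    """Root-mean-square error between two equally sized arrays."""
    diff = np.asarray(pred, dtype=float) - np.asarray(target, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))


# Hypothetical prediction: on from 20% to 60% of a 10-sample window at 100 W.
rect = rectangle_from_prediction(0.2, 0.6, 100.0, 10)
print(rect)
```

Whether the RMS error is applied to the three outputs or to the reconstructed signal is not stated on the poster; the helper works for either.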
Figure: The NILM pipeline (final stage: Metrics). At each stage of the pipeline, results and data can be stored to or loaded from disk.
Pros: quite effective; fast disaggregation.
Cons: difficult to implement; training is time consuming.
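$$ FP^{(n)} = \sum_t \mathrm{AND}\left(x_t^{(n)} = \text{off},\ \hat{x}_t^{(n)} = \text{on}\right) \tag{3} $$
$$ FN^{(n)} = \sum_t \mathrm{AND}\left(x_t^{(n)} = \text{on},\ \hat{x}_t^{(n)} = \text{off}\right) \tag{4} $$
$$ TN^{(n)} = \sum_t \mathrm{AND}\left(x_t^{(n)} = \text{off},\ \hat{x}_t^{(n)} = \text{off}\right) \tag{5} $$
where $x_t^{(n)}$ and $\hat{x}_t^{(n)}$ are the ground-truth and predicted states of appliance $n$ at time $t$, respectively.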
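The four counts can be computed directly from two on/off sequences. The following is a minimal sketch, assuming the states of one appliance are available as boolean (or 0/1) arrays; the function name is hypothetical and is not taken from the project code.

```python
import numpy as np


def confusion_counts(x_true, x_pred):
    """Return (TP, FP, FN, TN) for one appliance.

    x_true: ground-truth on/off states, x_pred: predicted on/off states.
    """
    x_true = np.asarray(x_true, dtype=bool)
    x_pred = np.asarray(x_pred, dtype=bool)
    tp = int(np.sum(x_true & x_pred))    # on in truth, predicted on
    fp = int(np.sum(~x_true & x_pred))   # off in truth, predicted on
    fn = int(np.sum(x_true & ~x_pred))   # on in truth, predicted off
    tn = int(np.sum(~x_true & ~x_pred))  # off in truth, predicted off
    return tp, fp, fn, tn


# Hypothetical five-sample example.
print(confusion_counts([1, 1, 0, 0, 1], [1, 0, 1, 0, 1]))
```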
Precision / Recall
Precision is the fraction of the time for which an appliance was predicted to be ON that it was actually ON. Recall is the fraction of the time for which an appliance was actually ON that it was correctly predicted to be ON. The definitions are given in Eq. (6).
Convolutional Neural Networks (ConvNet)
[Pipeline diagram labels: Statistics; REDD Dataset]
Figure: The energy consumption of a household (Building 2 in the REDD dataset) during a period of 1 month
The REDD Data Set
[Bar charts: scores by method (CO, FHMM, ConvNet) for the Fridge and Microwave]
• The transition probability matrix A with dimensions $K^N \times K^N$, typically drawn from a multinomial distribution.
Figure: A typical diagram of the Factorial Hidden Markov Model considered here [4]
Accuracy Metrics
• A prior probability $\pi_0$ with $K^N$ entries, typically drawn from a multinomial distribution.
A number of metrics are needed to assess the accuracy of energy disaggregation for each method, both to facilitate comparison between the methods and to outline the flaws and limitations of each one. In this project, the following metrics were implemented and computed for all of the methods.
Figure: A schematic representation of the architecture of the convolutional neural network
Input sequence: batch size = 16, sequence length = 85 seconds
The power demand of each appliance can also be modelled with a hidden Markov model (HMM) [1], with the hidden states being the states of the appliance. To disaggregate the energy effectively, it is necessary to decode the power draw of all $N$ appliances simultaneously, and thus a Factorial Hidden Markov Model is needed. Such a model requires 3 parameters:
[Figure legend: Waste Disposal Unit, Dish Washer, Fridge, Sockets, Washer Dryer, Microwave, Electric Stove, Light, Site Meter]
In Eq. (1), $y_t$ is the aggregate power consumption at time $t$ and $\hat{y}_t^{(n)}$ is the power consumption of appliance $n$ in its current state, which is determined by training the CO model.
Factorial Hidden Markov Model (FHMM)
CO is a deterministic method that finds the optimal combination of appliance states, that is, the combination which minimizes the difference between the sum of the predicted appliance power and the observed aggregate power, subject to a set of appliance models [1]. The optimal state $\hat{x}_t^{(n)}$ of appliance $n$ at time instance $t$ is given by [5]
$$ \hat{x}_t^{(n)} = \underset{\hat{x}_t^{(n)}}{\arg\min} \left| y_t - \sum_{n=1}^{N} \hat{y}_t^{(n)} \right| \tag{1} $$
The computational cost is $O(T K^N)$, where $N$ is the number of appliances, $K$ is the number of appliance states and $T$ is the number of time slices in the dataset. CO is only suitable for small problems.
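The exhaustive search behind CO can be sketched in a few lines. The example below is a minimal illustration and not the NILMTK implementation: the function name, the per-state power values and the meter readings are all hypothetical, and each time step is decoded independently by trying every combination of appliance states.

```python
from itertools import product


def co_disaggregate(aggregate, appliance_powers):
    """Pick the state combination whose summed power is closest to the aggregate.

    appliance_powers[n][k] is the (assumed, pre-trained) power draw of
    appliance n in state k. Returns one state index per appliance.
    """
    best_states, best_err = None, float("inf")
    # Enumerate all K**N combinations of states (the source of the K**N cost).
    for states in product(*(range(len(p)) for p in appliance_powers)):
        total = sum(p[s] for p, s in zip(appliance_powers, states))
        err = abs(aggregate - total)
        if err < best_err:
            best_states, best_err = states, err
    return best_states


# Hypothetical two-state models: fridge (0 W / 120 W), microwave (0 W / 1500 W).
powers = [[0.0, 120.0], [0.0, 1500.0]]
readings = [5.0, 130.0, 1490.0, 1650.0]
decoded = [co_disaggregate(y, powers) for y in readings]
print(decoded)
```

Because every time step is decoded on its own, this sketch ignores state transitions, which is the limitation the FHMM addresses.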
Convolutional neural networks have revolutionized computer vision. From an image, the convolutional layer learns low-level features through its weights. In the case of an image, the feature detectors (filters) would be horizontal lines, blobs, etc. These filters are built using a small receptive field and share weights across the entire input, which makes them translation invariant. Similarly, in the case of time series, the filters extract low-level features in the time series. By experimenting, we found that using only 16 of these filters gives the ConvNet good predictive power. The output of this convolutional layer is then flattened, and we use 2 hidden layers of 1024 and 512 neurons with ReLU activation functions before the output layer of 3 neurons (start time, end time and average power).
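The layer sizes described above can be sketched as a plain NumPy forward pass. This is only an untrained illustration with random weights, not the network used in the project: the filter length in samples, the one-sample-per-second input length, the weight initialization and the linear output layer are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
SEQ_LEN, N_FILTERS, FILTER_LEN = 85, 16, 4  # FILTER_LEN is an assumed value


def relu(z):
    return np.maximum(z, 0.0)


def conv1d_valid(x, filters, bias):
    """Slide each filter over x (no padding): returns (n_filters, out_len)."""
    windows = np.lib.stride_tricks.sliding_window_view(x, filters.shape[1])
    return filters @ windows.T + bias[:, None]


def init_dense(n_in, n_out):
    """Random weights and zero biases for a fully connected layer."""
    return rng.normal(0.0, np.sqrt(2.0 / n_in), size=(n_out, n_in)), np.zeros(n_out)


conv_w = rng.normal(0.0, 0.1, size=(N_FILTERS, FILTER_LEN))
conv_b = np.zeros(N_FILTERS)
flat_dim = N_FILTERS * (SEQ_LEN - FILTER_LEN + 1)
w1, b1 = init_dense(flat_dim, 1024)
w2, b2 = init_dense(1024, 512)
w3, b3 = init_dense(512, 3)


def forward(x):
    """Conv layer + ReLU, flatten, two ReLU hidden layers, 3 linear outputs."""
    h = relu(conv1d_valid(x, conv_w, conv_b)).ravel()
    h = relu(w1 @ h + b1)
    h = relu(w2 @ h + b2)
    return w3 @ h + b3  # (start time, end time, average power)


out = forward(rng.random(SEQ_LEN))
print(out.shape)
```

The same filter weights are applied at every position of the input, which is the weight sharing the paragraph above refers to.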
Combinatorial Optimization (CO)
Energy disaggregation is the procedure that infers the energy consumption of individual appliances in a household given the total energy consumption from a single meter of that household. In recent years, this field has become increasingly popular as smart meters have begun to be deployed and installed in many households across the world. However, the field's popularity is mostly attributed to extremely powerful techniques that take advantage of continuously improving computational resources and perform this disaggregation in a non-intrusive manner. The term non-intrusive implies that the appliance-based energy consumption is not determined by installing individual meters on each appliance, thus interfering with the occupant's privacy, but rather by estimating it using both deterministic and stochastic techniques.
Results and Comparison
Convolutional Neural Networks (ConvNet)
Abstract
Methods For Disaggregation (2)
Introduction
$$ \mathrm{Precision} = \frac{TP}{TP + FP}, \qquad \mathrm{Recall} = \frac{TP}{TP + FN} \tag{6} $$
Discussion
Appliance-based energy disaggregation is a complicated and very difficult problem if it is to be performed on data from actual households over long periods of time and for a wide class of appliances. The disaggregation accuracy of deterministic methods commonly employed in these types of problems, such as Combinatorial Optimization, is extremely poor and the computational cost significant. Stochastic methods are certainly much more capable in these types of problems, and the metrics corresponding to the Factorial Hidden Markov Model underline this point. The computational cost of FHMM is even higher than that of CO, which suggests that even FHMM is not suitable for disaggregation over large data sets. In contrast, the performance achieved by the Convolutional Neural Network is remarkable: it outperforms all other methods in every metric and is ideal for these types of problems. Analysis performed by members of our team indicates that the neural network is capable of achieving high-accuracy disaggregation even with a few layers. The downside is that it requires considerable expertise to implement and a large data set to train. Training usually takes a very long time for neural networks, but the network, once trained, is able to disaggregate fairly quickly and certainly much more quickly than CO and FHMM.
References
[1] Batra, N., Kelly, J., Parson, O., Dutta, H., Knottenbelt, W., Rogers, A., Singh, A., Srivastava, M. (2014), NILMTK: An Open Source Toolkit for Non-intrusive Load Monitoring, Int. Conference on Future Energy Systems, Cambridge, UK
[2] Kelly, J., Knottenbelt, W. (2015), Neural NILM: Deep Neural Networks Applied to Energy Disaggregation, ACM BuildSys'15, Seoul
[3] Hart, G. W. (1992), Nonintrusive Appliance Load Monitoring, Proceedings of the IEEE, 80(12)
Accuracy / F1
Accuracy is defined as the fraction of time for which the appliance was correctly classified as either on or off, whereas the F1 score is the harmonic mean of the Precision and Recall.
$$ \mathrm{Accuracy} = \frac{TP + TN}{P + N}, \qquad F_1 = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} \tag{7} $$
where $P$ and $N$ are the numbers of positives and negatives in the ground truth.
Figure: The Accuracy, F1, Recall and Precision scores for each method used for disaggregation, per appliance disaggregated
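The four scores follow directly from the TP, FP, FN and TN counts. The sketch below is a minimal illustration with a hypothetical function name; returning 0.0 when a denominator is zero is an assumed convention that the poster does not specify.

```python
def disaggregation_scores(tp, fp, fn, tn):
    """Compute precision, recall, accuracy and F1 from the four counts.

    Assumed convention: a score with a zero denominator is reported as 0.0.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    total = tp + fp + fn + tn  # equals P + N, with P = TP + FN and N = TN + FP
    accuracy = (tp + tn) / total if total else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "accuracy": accuracy, "f1": f1}


# Hypothetical counts for one appliance.
print(disaggregation_scores(tp=2, fp=1, fn=1, tn=1))
```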
[4] Kolter, J. Z., Johnson, M. J. (2011), REDD: A Public Data Set for Energy Disaggregation Research, SustKDD 2011, San Diego, CA, USA
[5] Hart, G. W. (1992), Nonintrusive Appliance Load Monitoring, Proceedings of the IEEE, 80(12)