AM207: Energy Disaggregation from Non-Intrusive Load Monitoring
Karen Yu, Nick Vasios, Thibaut Perol

Introduction

Energy disaggregation is the procedure that infers the energy consumption of a household on a per-appliance basis, given the total energy consumption from a single meter of that household. In recent years, this field has become increasingly popular as smart meters have begun to be deployed and are installed in many households across the world. However, the field's popularity is mostly attributed to the extremely powerful techniques which are able to take advantage of the continuously improving computational resources and perform this disaggregation in a non-intrusive manner. The term non-intrusive implies that the appliance-based energy consumption is not determined by installing individual meters on each appliance, and thus interfering with the occupant's privacy, but rather by estimating it using both deterministic and stochastic techniques.

The REDD Data Set

In the context of this project we make use of the REDD data set [4], a public data set for energy disaggregation research. The REDD data set is not readily accessible, since access can only be granted by the authors of [4] at MIT. While REDD is what made this project possible, it cannot be fed directly into the disaggregation algorithms developed or borrowed in this project. A data pipeline is necessary to preprocess and feed the data for both training and testing purposes. While the training and disaggregation steps are significantly different in all of the methods that we use, the remaining steps are the same and are outlined below.

Figure: The energy consumption of a household (Building 2 in the REDD dataset) during a period of 1 month. Vertical axis: Power (W). Legend: waste disposal unit, dish washer, fridge, sockets, washer dryer, microwave, electric stove, light, sockets, site meter.

Figure: The NILM pipeline. Boxes shown: REDD dataset, data interface (NILMTK, HDF5), statistics, preprocessing, training, model, disaggregation (algorithm: CO, FHMM, ConvNet), metrics. At each stage of the pipeline, results and data can be stored to or loaded from disk.

Methods For Disaggregation (1)

Figure: Typical power consumption in a household over a time frame in which many appliances activate and deactivate. Axes: time, power consumption. Labelled appliance activations: space heater, stove, fridge.

Combinatorial Optimization (CO)

CO is a deterministic method that finds the optimal combination of appliance states, which minimizes the difference between the sum of the predicted appliance power and the observed aggregate power, subject to a set of appliance models [1]. The optimal combination of appliance states $\hat{x}_t^{(n)}$ of appliance $n$ at time instance $t$ is given by [5]

$$\hat{x}_t^{(n)} = \underset{\hat{x}_t^{(n)}}{\operatorname{argmin}} \left| y_t - \sum_{n=1}^{N} \hat{y}_t^{(n)} \right| \tag{1}$$

where $y_t$ is the aggregate power consumption at $t$ and $\hat{y}_t^{(n)}$ is the power consumption of appliance $n$ in the current state, which is determined by training the CO.

The computational cost is $O(T K^N)$, where $N$ is the number of appliances, $K$ is the number of appliance states and $T$ is the number of time slices in the dataset. CO is only suitable for small problems.

Pros: Easy to implement; mathematically attractive.
Cons: Computationally inefficient; limited disaggregation accuracy.

Factorial Hidden Markov Model (FHMM)

The power demand of every appliance can also be modelled with a hidden Markov model (HMM) [1], with the hidden states being the states of each appliance. To effectively disaggregate the energy it is necessary to simultaneously decode the power draw of $N$ appliances, and thus a Factorial Hidden Markov Model is necessary. Such a model requires 3 parameters:

• A prior probability $\pi_0$ with $K^N$ entries, typically drawn from a multinomial distribution.
• The transition probability matrix $A$ with dimensions $K^N \times K^N$, typically drawn from a multinomial distribution.
• The emission probability matrix $B$ with dimensions $K^N \times 2$, typically drawn from a Gaussian distribution since observations are continuous.

Expectation-Maximization is used to train the model parameters and the Viterbi algorithm to decode the most likely state sequence.

Figure: A typical diagram of the Factorial Hidden Markov Model considered here [4]. Nodes shown: hidden appliance states $x_t^{(n)}$ and observations $y_t$.

The computational cost is $O(T K^{2N})$, where $N$ is the number of appliances, $K$ is the number of appliance states and $T$ is the number of time slices in the dataset. FHMM is more capable than CO but still extremely computationally inefficient. Disaggregation across many appliances with multiple states and for a large time window is almost impossible. Methods such as variational inference can produce approximations that are more computationally efficient.

Pros: State transitions are naturally modelled; improved disaggregation accuracy.
Cons: Even more computationally inefficient; harder implementation.

Methods For Disaggregation (2)

Convolutional Neural Networks (ConvNet)

Convolutional neural networks are similar to ordinary neural networks (multi-layer perceptrons). Each neuron receives an input, performs a dot product with its weights and follows this with a non-linearity (here we only use ReLU). The whole network has a loss function, which here is the RMS error. The network implements the 'rectangle method': from the input sequence we invert for the start time, the end time and the average power of ONE appliance.

Convolutional neural networks have revolutionized computer vision. From an image, the convolutional layer learns low-level features through its weights. In the case of an image, the feature detectors (filters) would be horizontal lines, blobs, etc. These filters are built using a small receptive field and share weights across the entire input, which makes them translation invariant. Similarly, in the case of time series, the filters extract low-level features in the time series. By experimenting, we found that using only 16 of these filters gives good predictive power to the ConvNet. The output of this convolutional layer is then flattened, and we use 2 hidden layers of 1024 and 512 neurons with ReLU activation before the output layer of 3 neurons (start time, end time and average power).

Figure: A schematic representation of the architecture of the convolutional neural network. Input sequence: batch size = 16, sequence length = 85 seconds. Convolutional layer: ReLU activation, 16 filters, filter size = 0.5 second. Outputs: start time, end time, average power. Example panels shown for n = 1 (microwave); axes: scaled time, scaled power.

Pros: Quite effective; fast disaggregation.
Cons: Difficult to implement; training is time consuming.

Accuracy Metrics

It is necessary to introduce a number of metrics in order to assess the accuracy of energy disaggregation for each method, both to facilitate the comparison between them and to outline the flaws and limitations of each one. In the context of this project, the following metrics were implemented and computed for all of the methods.

TP-FP-FN-TN

The acronyms stand for True Positives, False Positives, False Negatives and True Negatives respectively. The definitions using the problem's notation are:

$$TP^{(n)} = \sum_t \mathrm{AND}\left( x_t^{(n)} = \mathrm{on},\ \hat{x}_t^{(n)} = \mathrm{on} \right) \tag{2}$$

$$FP^{(n)} = \sum_t \mathrm{AND}\left( x_t^{(n)} = \mathrm{off},\ \hat{x}_t^{(n)} = \mathrm{on} \right) \tag{3}$$

$$FN^{(n)} = \sum_t \mathrm{AND}\left( x_t^{(n)} = \mathrm{on},\ \hat{x}_t^{(n)} = \mathrm{off} \right) \tag{4}$$

$$TN^{(n)} = \sum_t \mathrm{AND}\left( x_t^{(n)} = \mathrm{off},\ \hat{x}_t^{(n)} = \mathrm{off} \right) \tag{5}$$

where $x_t^{(n)}$ and $\hat{x}_t^{(n)}$ are the ground truth and predicted states of appliance $n$ at time $t$, respectively.

Precision / Recall

Precision is the fraction of the time an appliance was predicted to be ON during which it was actually ON. Recall is the fraction of the time an appliance was actually ON during which it was predicted to be ON. The definitions are:

$$\mathrm{Precision} = \frac{TP}{TP + FP}, \qquad \mathrm{Recall} = \frac{TP}{TP + FN} \tag{6}$$

Accuracy / F1

Accuracy is defined as the fraction of time for which the appliance was correctly classified as either on or off, whereas the F1 score is the harmonic mean of the Precision and Recall.

$$\mathrm{Accuracy} = \frac{TP + TN}{P + N}, \qquad F_1 = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} \tag{7}$$

where $P$ and $N$ are the numbers of positives and negatives in the ground truth.

Results and Comparison

Figure: The Accuracy, F1, Recall and Precision scores for each method used for disaggregation (CO, FHMM, ConvNet) and per appliance disaggregated (fridge, microwave).

Discussion

Appliance-based energy disaggregation is a complicated and very difficult problem if it is to be performed on data from actual households over long periods of time and for a wide class of appliances. The disaggregation efficiency of deterministic methods commonly employed in these types of problems, such as Combinatorial Optimization, is extremely poor and the computational cost significant. Stochastic methods are certainly much more capable in these types of problems, and the metrics corresponding to the Factorial Hidden Markov Model underline this point. The computational cost of FHMM is even more significant than that of CO, which suggests that even FHMM is not suitable for disaggregation over large data sets. In contrast, the efficiency achieved by the Convolutional Neural Network is remarkable. It outperforms all other methods in every metric and is ideal for these types of problems.

Analysis performed by members of our team indicates that the neural network is capable of achieving high-accuracy disaggregation even with a few layers. The downside is that it requires considerable expertise to implement and a large data set to train. Training usually takes a very long time for neural networks, but the network, once trained, is able to disaggregate fairly quickly, and certainly much more quickly than CO and FHMM.

References

[1] Batra, N., Kelly, J., Parson, O., Dutta, H., Knottenbelt, W., Rogers, A., Singh, A., Srivastava, M. (2014), NILMTK: An Open Source Toolkit for Non-intrusive Load Monitoring, Int. Conference on Future Energy Systems, Cambridge, UK
[2] Kelly, J., Knottenbelt, W. (2015), Neural NILM: Deep Neural Networks Applied to Energy Disaggregation, ACM BuildSys'15, Seoul
[3] Hart, G. W. (1992), Nonintrusive Appliance Load Monitoring, Proceedings of the IEEE, 80(12)
[4] Kolter, J. Z., Johnson, M. J. (2011), REDD: A Public Data Set for Energy Disaggregation Research, SustKDD 2011, San Diego, CA, USA
[5] Hart, G. W. (1992), Nonintrusive Appliance Load Monitoring, Proceedings of the IEEE, 80(12)
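The three quantities targeted by the 'rectangle method' (start time, end time, average power of one appliance) can be illustrated with a minimal Python sketch. It only shows how such targets might be derived from a ground-truth appliance trace; it is not the poster's training code. The function name `rectangle_targets`, the 10 W on-threshold and the scaling of times to fractions of the window length are all assumptions made for illustration.

```python
def rectangle_targets(appliance_power, on_threshold=10.0):
    """Summarize one appliance trace as (start, end, average power).

    start and end are fractions of the window length (an assumed scaling).
    Returns (0.0, 0.0, 0.0) when the appliance never exceeds the threshold.
    """
    # Indices of the samples where the appliance is considered on.
    on_idx = [i for i, p in enumerate(appliance_power) if p > on_threshold]
    if not on_idx:
        return 0.0, 0.0, 0.0
    first, last = on_idx[0], on_idx[-1]
    n = len(appliance_power)
    # Average power between the first and last on-sample (inclusive).
    avg = sum(appliance_power[first:last + 1]) / (last - first + 1)
    return first / n, (last + 1) / n, avg


# Hypothetical 10-sample microwave trace in watts.
window = [0, 0, 1500, 1520, 1480, 0, 0, 0, 0, 0]
start, end, avg = rectangle_targets(window)
print(start, end, avg)
```

A network trained with an RMS-type loss would then regress these three numbers from the aggregate signal over the same window.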
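To see why the FHMM transition matrix has dimensions K^N × K^N, one can sketch how a joint transition matrix arises when the appliance chains evolve independently: it is the Kronecker product of the per-appliance matrices. The following Python sketch uses hypothetical two-state (off/on) transition probabilities for a fridge and a microwave; the numbers and the helper `kron` are illustrative assumptions, not values from the project.

```python
def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [
        [x * y for x in row_a for y in row_b]
        for row_a in a
        for row_b in b
    ]


# Hypothetical per-appliance transition matrices (rows: from-state off/on).
fridge = [[0.9, 0.1],
          [0.2, 0.8]]
microwave = [[0.99, 0.01],
             [0.5, 0.5]]

# Joint matrix over the K^N = 2^2 = 4 combined states.
joint = kron(fridge, microwave)
print(len(joint), len(joint[0]))
```

With K states per appliance, every additional appliance multiplies the number of joint states by K, which is the source of the exponential cost quoted for FHMM.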
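The combinatorial optimization objective can be sketched as a brute-force search: at every time step, try all K^N combinations of appliance power levels and keep the one whose sum is closest to the aggregate reading. The Python sketch below is a minimal illustration under stated assumptions; the function name `co_disaggregate` and the power levels are hypothetical, and in practice the per-state power levels would come from training.

```python
from itertools import product


def co_disaggregate(aggregate, appliance_levels):
    """For each reading, pick the state combination with the closest total power."""
    names = list(appliance_levels)
    # Enumerate all K^N joint assignments (one power level per appliance).
    combos = list(product(*(appliance_levels[name] for name in names)))
    estimates = []
    for y_t in aggregate:
        best = min(combos, key=lambda combo: abs(y_t - sum(combo)))
        estimates.append(dict(zip(names, best)))
    return estimates


# Hypothetical learned power levels in watts (off, on) per appliance.
levels = {"fridge": [0, 120], "microwave": [0, 1500]}
readings = [5, 118, 1490, 1630]
estimates = co_disaggregate(readings, levels)
for y_t, est in zip(readings, estimates):
    print(y_t, est)
```

The search over `combos` is repeated for each of the T readings, which matches the O(T K^N) cost and shows why the approach is only practical for a handful of appliances.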
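The accuracy metrics (TP, FP, FN, TN, precision, recall, accuracy and F1) can be computed for one appliance with a few lines of Python. This is a minimal sketch assuming that states are given as booleans (True = on) on a common, non-empty time grid; the function names and the example sequences are hypothetical, and ratios with a zero denominator are returned as 0.0 by convention.

```python
def confusion_counts(truth, predicted):
    """Count TP, FP, FN, TN over time for one appliance (True = on)."""
    tp = fp = fn = tn = 0
    for x, x_hat in zip(truth, predicted):
        if x and x_hat:
            tp += 1
        elif not x and x_hat:
            fp += 1
        elif x and not x_hat:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def scores(truth, predicted):
    """Precision, recall, accuracy and F1 from the confusion counts."""
    tp, fp, fn, tn = confusion_counts(truth, predicted)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    accuracy = (tp + tn) / (tp + fp + fn + tn)  # P + N = all time steps
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"precision": precision, "recall": recall,
            "accuracy": accuracy, "f1": f1}


# Hypothetical ground truth and prediction for 8 time steps.
truth = [True, True, False, False, True, False, False, False]
predicted = [True, False, False, True, True, False, False, False]
result = scores(truth, predicted)
print(confusion_counts(truth, predicted), result)
```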