Intelligent Outlier Detection for HVAC System Fault Detection

Ying Guo1*, Davood Dehestani2, Jiaming Li1, Josh Wall3, Sam West3, Steven Su1
1 CSIRO ICT Centre, Sydney, Australia
2 UTS, Sydney, Australia
3 CSIRO Division of Energy Technology, Newcastle, Australia
* Corresponding email: [email protected]

Keywords: HVAC systems, detection, energy performance, machine learning.

1 Introduction

This paper proposes methods for detecting outliers in order to improve real-time fault detection in HVAC systems. Detecting faults properly can significantly improve energy efficiency, reduce maintenance costs, and improve human comfort. However, measured data often contain outliers that mislead the training process for HVAC fault detection.

We propose a novel intelligent outlier detection approach based on an online soft-margin support vector machine (SVM), which introduces slack variables to measure the degree of misclassification of the training samples. Nonlinear penalty functions are then used to reduce the effect of outliers on the classifier. In addition, we apply an online incremental SVM to cope with large datasets in real time. Based on this online training procedure, we implement a semi-unsupervised fault detection method that can detect new unknown faults (outliers) in the training datasets in real time.

2 Incremental-decremental Algorithm of SVM

To identify the outliers within the training datasets, we need methods that can operate in real time and require less training data. The SVM has been studied extensively in the data mining and machine learning communities for the last two decades (Vapnik 1995). An SVM model is equivalent to a two-layer perceptron neural network.
Using a kernel function, the SVM is an alternative training method for multi-layer perceptron classifiers in which the weights of the network are found by solving a quadratic programming (QP) problem under linear constraints, rather than by solving a non-convex unconstrained minimization problem. Liang and Du (2007) studied fault detection for HVAC systems using a standard (off-line) SVM.

The main advantages of the SVM include the kernel trick (the non-linear mapping function does not need to be known explicitly), a globally optimal solution (the training problem is a quadratic problem), and the generalization capability obtained by optimizing the margin. However, training an SVM requires solving a QP, and standard numerical techniques for QP are infeasible for very large datasets, which is the situation in fault detection and isolation for HVAC systems (Thongkam et al. 2008). With an online SVM, large-scale classification problems can be handled in real time under limited hardware and software resources. In this paper, an incremental (online) SVM is applied to outlier detection in training datasets.

Training an SVM incrementally on new data by discarding all previous data except their support vectors gives only approximate results. The online incremental method offers an exact alternative: it formulates the solution for M+1 training data in terms of the solution for M data and one new data point. Cauwenberghs and Poggio (2001) consider incremental learning as an exact online method that constructs the solution recursively, one point at a time. The key is to retain the Kuhn-Tucker (KT) conditions on all previous data while adiabatically adding a new data point to the solution. Leave-one-out is a standard procedure for predicting the generalization power of a trained classifier, from both a theoretical and an empirical perspective (Vapnik 1995).
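
The role of the KT conditions can be illustrated with a small check. The sketch below is not the authors' implementation. It assumes the notation of Cauwenberghs and Poggio (2001), in which each training point i has a coefficient alpha_i in [0, C] and a quantity g_i = y_i f(x_i) - 1, and the conditions require g_i >= 0 when alpha_i = 0, g_i = 0 when 0 < alpha_i < C, and g_i <= 0 when alpha_i = C. The function names, the tolerance, the linear kernel, and the toy data are all hypothetical and chosen only for illustration.

```python
def linear_kernel(a, b):
    """Plain dot product; another kernel could be substituted here."""
    return sum(p * q for p, q in zip(a, b))


def decision_value(x, xs, ys, alphas, b, kernel=linear_kernel):
    """Kernel expansion f(x) = sum_j alpha_j * y_j * K(x_j, x) + b."""
    return sum(a * y * kernel(xj, x) for a, y, xj in zip(alphas, ys, xs)) + b


def kt_partition(xs, ys, alphas, b, C, kernel=linear_kernel, tol=1e-9):
    """Sort point indices by which KT condition they satisfy (or violate)."""
    sets = {"margin": [], "error": [], "reserve": [], "violation": []}
    for i, (x, y, a) in enumerate(zip(xs, ys, alphas)):
        g = y * decision_value(x, xs, ys, alphas, b, kernel) - 1.0
        if a <= tol:                      # alpha_i = 0 requires g_i >= 0
            key = "reserve" if g >= -tol else "violation"
        elif a >= C - tol:                # alpha_i = C requires g_i <= 0
            key = "error" if g <= tol else "violation"
        else:                             # 0 < alpha_i < C requires g_i = 0
            key = "margin" if abs(g) <= tol else "violation"
        sets[key].append(i)
    return sets


# Toy 1-D problem (illustrative values): two margin support vectors at +1 and
# -1, plus a correctly classified point at +3 that needs no coefficient.
xs = [(1.0,), (-1.0,), (3.0,)]
ys = [1, -1, 1]
alphas = [0.5, 0.5, 0.0]
solved = kt_partition(xs, ys, alphas, b=0.0, C=1.0)

# A new point at +2 labelled -1 enters with alpha = 0 and breaks the conditions.
candidate = kt_partition(xs + [(2.0,)], ys + [-1], alphas + [0.0], b=0.0, C=1.0)
```

In the exact incremental method, a violating point such as the last one is not simply appended: its coefficient is increased from zero while the other coefficients and b are adjusted so that all earlier points keep satisfying these conditions. That bookkeeping is omitted from this sketch.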

3 Implementation of Online SVM for Outlier Detection

Figure 1 shows the proposed fault detection scheme, which uses incremental-decremental SVM classification. The system can detect unknown faults existing in the training datasets by monitoring key HVAC variables during system operation. In this algorithm, faults present in the data are first detected (as unknown new faults) by comparing the output of the healthy model with that of the real system. If a detected fault is similar to a known fault, the algorithm categorizes it as an existing fault. Otherwise, the data are sent to the online SVM trainer to train for the new fault. Finally, the new fault is isolated by the online SVM as a known fault. The incremental procedure is reversible, and decremental unlearning of each training sample produces an exact leave-one-out estimate of faults using all HVAC data collected during operation.

Figure 1: Schematic of semi-unsupervised outlier detection with online SVM.

As mentioned earlier, the proposed algorithm is able to detect unknown faults in the training datasets through a semi-unsupervised learning process. To test the semi-unsupervised performance, an unknown sudden fault was imposed on the system in one test. The detection results are shown in Figure 2. The margin clearly drops from a high level to a low level when an incipient outlier is detected. For the unknown fault (outlier) this change is dramatic because the fault is of the abrupt type. To optimize the training process efficiently, samples from each normal and faulty condition should be used. A group containing the maximum number of faulty (outlier) training samples is selected and used for training. The experimental results in Figure 2 show that the designed SVM classifier can identify the unknown HVAC fault (outlier) accurately, and that unknown faults can also be detected efficiently.

Figure 2: Sudden unknown faults (outliers) in the training data, detected from the margin shown.
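
The margin drop described above suggests a simple monitoring rule, sketched here under stated assumptions. The paper does not specify how the drop is thresholded, so the function name `flag_margin_drops`, the sliding-window baseline, and the `window` and `drop_ratio` values are hypothetical. The sketch also assumes that margins of healthy samples are positive.

```python
from collections import deque
from statistics import mean


def flag_margin_drops(margins, window=5, drop_ratio=0.5):
    """Return indices whose margin falls well below a recent healthy baseline.

    margins    -- sequence of per-sample margin values, in time order
    window     -- number of recent unflagged samples forming the baseline
    drop_ratio -- a sample is flagged if margin < drop_ratio * baseline mean
    """
    baseline = deque(maxlen=window)   # recent margins considered healthy
    flagged = []
    for t, m in enumerate(margins):
        if len(baseline) == window and m < drop_ratio * mean(baseline):
            flagged.append(t)         # abrupt drop: candidate outlier
        else:
            baseline.append(m)        # only unflagged samples update baseline
    return flagged


# Illustrative margin trace: steady near 1.0, then an abrupt drop at t = 6, 7.
trace = [1.0, 1.1, 0.9, 1.0, 1.0, 1.05, 0.2, 0.1, 1.0]
outlier_steps = flag_margin_drops(trace)
```

Flagged samples are kept out of the baseline so that a run of faulty readings does not pull the reference level down. In the scheme described above, such samples would be the ones passed on to the online SVM trainer.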

4 Discussion

The main advantage of this algorithm is that it uses only a subset of useful data (including healthy data, old faults, and new faults) instead of the whole dataset. This reduces the computational cost dramatically, so the method can detect new unknown faults (outliers) in real time. Furthermore, the online approach trains the fault detection module more efficiently by discarding unnecessary data and using only the data with high priority for classification. The approach has been tested on real data from several commercial HVAC systems, where it successfully isolated outliers and detected HVAC system faults from unhealthy datasets. The experimental results were positive in all cases.

5 References

Cauwenberghs G. and Poggio T. 2001. Incremental and decremental support vector machine learning, Advances in Neural Information Processing Systems. 13, pp. 409-415.
Liang J. and Du R. 2007. Model-based fault detection and diagnosis of HVAC systems using support vector machine method, International Journal of Refrigeration. 30(6), pp. 1104-1114.
Thongkam J., Xu G., Zhang Y., and Huang F. 2008. Support Vector Machine for outlier detection in breast cancer survivability prediction, APWeb 2008 Workshops, LNCS 4977, pp. 99-109.
Vapnik V. 1995. The nature of statistical learning theory, Springer-Verlag.