IJCST Vol. 3, Issue 1, Jan. - March 2012
ISSN : 0976-8491 (Online) | ISSN : 2229-4333 (Print)
Long Term Electric Load Forecasting using Neural
Networks and Support Vector Machines
Renuka Achanta
Dept. of CSE, AKRG Engineering College, Nallagerla, AP, India
Abstract
Electric load forecasting is an important aspect of the electrical power
industry. It is important to determine the future demand for power
as far in advance as possible. According to the foreseen load,
the company makes investments and decisions on buying energy
from the generating companies, and planning for maintenance
and expansion. It is therefore absolutely necessary to have
some knowledge of future power consumption. Electric power
distributors require a tool that allows them to predict the load
in order to support management and make planning more
efficient. Accurate prediction of electric load is
difficult. A large number of the classical prediction models are
inappropriate for this modeling because of their requirement of
linearity and seasonality. This paper presents an application of
Artificial Neural Networks (ANN) and Support Vector Machines
(SVM) to predict electric load. The results obtained using both the
techniques are compared and the performance of SVM is found
to be consistently better.
Keywords
Electric Load Forecasting, Neural Networks, Support Vector
Machines
I. Introduction
Electric load is one of the key variables for electric power companies
[14], since it determines their main source of income, particularly
in the case of distributors. According to the foreseen load, the
company makes investments and takes decisions regarding
planning, maintenance and expansion. It is therefore necessary to
have some knowledge of future power load. Accurate prediction of
electric load is difficult, because it is determined largely by variables
that involve “uncertainty” and whose relation with the final load
is not deduced directly [18-19]. The load is also characterized
as a nonlinear and non-stationary process that can undergo rapid
changes due to weather, seasonal and macroeconomic variations. A
large number of the classical prediction models are inappropriate
for this modeling because of their requirement of linearity and
seasonality [10]. Neural network techniques have the potential
to handle complex, nonlinear problems in a better way when
compared to traditional techniques. Support vector regression
differs from conventional regression techniques because it uses
the Structural Risk Minimization (SRM) rather than the Empirical
Risk Minimization (ERM) induction principle, which is equivalent
to minimizing an upper bound on the generalization error rather
than the training error. Due to this feature it is expected to perform
better than conventional techniques, which may suffer from
overfitting. To solve this forecasting problem, various neural network
and support vector machine models were analyzed and applied to
a real-world dataset published on the web by the U.S. Energy
Information Administration (EIA) for the state of Alaska [20].
A. Long Term Load Forecasting
Long term load forecasting predicts load over a horizon of one
year or more. The main factor affecting long term load forecasting
is the load of previous years. The forecasting procedure
depends on the manner in which historical time series data is
analyzed and on the type of information available at the time the
forecast is prepared. Various techniques have been applied to
the problem of long term load prediction. Traditional models
employed for this task include the regression model, the grey
forecast model and combination models. During recent decades,
numerous intelligent methods have been proposed to improve
load forecasting accuracy, such as expert systems, fuzzy inference,
Particle Swarm Optimisation (PSO), Artificial Neural Networks
(ANN) and Support Vector Machines (SVM).
II. Artificial Neural Network
Artificial Neural Network (ANN) is a machine learning approach
inspired by the way in which the brain performs a particular
learning task [1, 17]. ANNs are modeled on the human brain and
consist of a number of artificial neurons. Neurons in ANNs tend
to have fewer connections than biological neurons. Each neuron
in an ANN receives a number of inputs, and an activation function
is applied to these inputs to determine the activation level of the
neuron. There are three different classes of network architectures:
single-layer feed-forward, multi-layer feed-forward and recurrent.
A. Multi Layer Perceptron
The Multi Layer Perceptron (MLP) is a class of neural networks
that consists of a set of sensory units constituting the input
layer, one or more hidden layers of computation nodes, and
an output layer of computation nodes. A nonlinear activation
function, namely the sigmoid function, is widely used to generate
the output activation in the computation nodes. In general, MLPs are
trained with the back propagation algorithm to develop successful
classification and regression systems.
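The sigmoid activation used in the computation nodes can be sketched directly (a minimal illustration, not code from the paper):

```python
import math

def sigmoid(z):
    # logistic sigmoid: squashes any real-valued input into the range (0, 1)
    return 1.0 / (1.0 + math.exp(-z))

print(sigmoid(0.0))   # 0.5: zero net input gives the midpoint activation
```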
B. Back Propagation Algorithm
A Back propagation network consists of at least three layers of
units:
1. An input layer,
2. At least one intermediate hidden layer, and
3. An output layer.
Typically, units are connected in a feed-forward fashion, with
input units fully connected to units in the hidden layer and hidden
units fully connected to units in the output layer. When a back-propagation
network is cycled, an input pattern is propagated
forward to the output units through the intervening input-to-hidden
and hidden-to-output weights. The output of back-propagation is
interpreted as a classification decision. With back-propagation
networks, learning occurs during a training phase. The steps
followed during learning are:
1. Each input pattern in a training set is applied to the input units
and then propagated forward.
2. The pattern of activation arriving at the output layer is compared
with the correct (associated) output pattern to calculate an error
signal.
3. The error signal for each such target output pattern is then
back-propagated from the outputs to the inputs in order to appropriately
adjust the weights in each layer of the network. After a back
propagation network has learned the correct classification for a
set of inputs, it can be tested on a second set of inputs to see how
well it classifies untrained patterns.
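The learning cycle above can be sketched as a minimal single-hidden-layer network trained by gradient descent. This is a toy illustration on synthetic data, not the model used in the paper; the layer sizes, learning rate and target function are arbitrary choices:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def mae(W1, W2, X, d):
    # mean absolute error of the network's predictions
    return float(np.abs(sigmoid(X @ W1) @ W2 - d).mean())

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, (100, 3))         # 100 training patterns, 3 inputs each
d = 0.5 * X.sum(axis=1, keepdims=True)   # toy target pattern

W1 = rng.normal(0.0, 0.5, (3, 8))        # input-to-hidden weights
W2 = rng.normal(0.0, 0.5, (8, 1))        # hidden-to-output weights
lr = 0.1
before = mae(W1, W2, X, d)

for epoch in range(2000):
    h = sigmoid(X @ W1)                  # forward pass: propagate the input
    y = h @ W2                           # linear output node for regression
    err = y - d                          # error signal at the output layer
    W2 -= lr * h.T @ err / len(X)        # back-propagate: hidden-to-output weights
    dh = (err @ W2.T) * h * (1.0 - h)    # chain rule through the sigmoid
    W1 -= lr * X.T @ dh / len(X)         # adjust input-to-hidden weights

after = mae(W1, W2, X, d)
print(after < before)                    # training reduces the error
```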
III. Support Vector Machines
Support Vector Machines (SVM) is a recent development that
serves as an alternative technique for dealing with complex
classification and regression problems [11-12]. The Support Vector
Machine algorithm was developed by Vapnik and is based on
statistical learning theory. The basic idea of Support Vector
Machines is to map the original data X into a feature space F
of high dimensionality through a nonlinear mapping function
and to construct an optimal hyperplane in the new space [5]. SVM
can be applied to both classification and regression. In the case
of classification, an optimal hyperplane is found that separates
the data into two classes, whereas in the case of regression a
hyperplane is constructed that lies close to as many points
as possible [16].
A. Support Vector Regression
Regression is the problem of estimating a function based on a
given dataset. Consider a dataset G = {(x_i, d_i)}, i = 1, …, N,
where x_i is the input vector, d_i is the desired result and N
corresponds to the size of the dataset. The general form of the
Support Vector Regression [3-4] estimating function is

F(x) = w·φ(x) + b                                                  (1)

where φ(x) represents the input feature mapping, and w and b are
the coefficients that have to be estimated from the data. A parameter
ε represents the deviation between the actual values and the
regression function, which can be treated as a tube around the
regression function. Points outside the tube are considered as
training errors. The coefficients w and b are determined by
minimizing the regularized risk function represented by Equation (2).
R(F) = (1/2)||w||² + (C/N) Σ_{i=1}^{N} L_ε(d_i, F_i)               (2)

L_ε(d_i, F_i) = |d_i − F_i| − ε    if |d_i − F_i| ≥ ε
L_ε(d_i, F_i) = 0                  otherwise                       (3)
where C and ε are user-determined parameters. The term
L_ε(d_i, F_i) is the ε-insensitive loss function and the term
(1/2)||w||² is used as a measurement of the function flatness. C is
a regularization constant determining the trade-off between the
training error and the model flatness. Two positive slack variables,
ξ_i and ξ_i*, represent the distance from the actual values to the
corresponding boundary values of the ε-tube. The two slack
variables are zero when the data points fall within the ε-tube.
After introducing the slack variables, the risk function can be
expressed in the following constrained form:
minimize    (1/2)||w||² + C Σ_{i=1}^{N} (ξ_i + ξ_i*)               (4)

subject to
d_i − w·φ(x_i) − b ≤ ε + ξ_i,      i = 1, 2, …, N
w·φ(x_i) + b − d_i ≤ ε + ξ_i*,     i = 1, 2, …, N
ξ_i, ξ_i* ≥ 0,                     i = 1, 2, …, N
This constrained optimisation problem is solved using the
following primal Lagrangian form:

minimize
L = (1/2)||w||² + C Σ_{i=1}^{N} (ξ_i + ξ_i*)
    − Σ_{i=1}^{N} α_i (ε + ξ_i − d_i + w·φ(x_i) + b)
    − Σ_{i=1}^{N} α_i* (ε + ξ_i* + d_i − w·φ(x_i) − b)
    − Σ_{i=1}^{N} (η_i ξ_i + η_i* ξ_i*)                            (5)

The last term is included to ensure that the optimality
constraints on the Lagrange multipliers η_i and η_i* assume
variable forms. Equation (5) is minimized with respect to the
primal variables w, b, ξ_i and ξ_i*, and maximized with respect
to the nonnegative Lagrangian multipliers α_i, α_i*, η_i and η_i*.
Finally, the Karush-Kuhn-Tucker conditions are applied to
Equation (4) and the dual Lagrangian
form represented by
maximize

Σ_{i=1}^{N} d_i (α_i − α_i*) − ε Σ_{i=1}^{N} (α_i + α_i*)
    − (1/2) Σ_{i=1}^{N} Σ_{j=1}^{N} (α_i − α_i*)(α_j − α_j*) K(x_i, x_j)   (6)

is derived, subject to the constraints

Σ_{i=1}^{N} (α_i − α_i*) = 0
0 ≤ α_i, α_i* ≤ C,    i = 1, 2, …, N
The Lagrange multipliers in Equation (6) satisfy the equality
α_i α_i* = 0. The Lagrangian multipliers α_i and α_i* are calculated,
and the optimal desired weight vector of the regression hyperplane
is expressed as

w* = Σ_{i=1}^{N} (α_i − α_i*) φ(x_i)                               (7)

Therefore the regression function is

F(x) = Σ_{i=1}^{N} (α_i − α_i*) K(x_i, x) + b                      (8)

where K(x_i, x_j) is the kernel function whose value is equal to
the inner product of the two vectors x_i and x_j in the feature
space: K(x_i, x_j) = φ(x_i)·φ(x_j).
For a nonlinear regression problem, the data are mapped to a
high-dimensional feature space. The kernel function can simply
involve the use of a mapping. Any function that satisfies Mercer’s
theorem can be used as a kernel function. Some of the functions
that satisfy this condition are shown in Table 1.
Table 1: Some of the kernel functions that satisfy Mercer's
theorem

Kernel        Function                          Comment
Polynomial    [1 + (X·X_i)]^p                   The power p is specified by the user
RBF           exp(−||X − X_i||² / (2σ²))        The width σ is specified by the user
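The two kernels in Table 1 can be evaluated directly. A minimal sketch with hypothetical vectors:

```python
import numpy as np

def polynomial_kernel(x, xi, p):
    # [1 + (x . xi)]^p, with the power p specified by the user
    return (1.0 + np.dot(x, xi)) ** p

def rbf_kernel(x, xi, sigma):
    # exp(-||x - xi||^2 / (2 sigma^2)), with the width sigma specified by the user
    return np.exp(-np.linalg.norm(x - xi) ** 2 / (2.0 * sigma ** 2))

x = np.array([1.0, 0.0])
xi = np.array([0.0, 1.0])
print(polynomial_kernel(x, xi, p=2))   # (1 + 0)^2 = 1.0
print(rbf_kernel(x, xi, sigma=1.0))    # exp(-2/2) = exp(-1) ≈ 0.368
```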
SVM is fast, accurate and less prone to overfitting than other
methods. It can handle high dimensional data efficiently. SVMs
have been applied successfully in applications that deal with
numerical attributes. They are applied successfully to handwritten
character recognition, object recognition, speaker identification
etc. The success of the model depends on proper setting of the
parameters [6-7].
IV. Methodology
Real world datasets are highly susceptible to noise and missing
values. The data can be preprocessed to improve the quality of
data and thereby improve the prediction results. In this work data
transformation has been applied to the data. Data transformation
improves the accuracy, speed and efficiency of the algorithms used.
The data is normalized using Z-score normalization [9], where
the values of an attribute A are normalized based on the mean
(Ā) and standard deviation (σ_A) of A. The normalized value
V' of a value V is obtained as

V' = (V − Ā) / σ_A
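This normalization can be expressed directly; a minimal sketch over a hypothetical attribute column:

```python
import numpy as np

def z_score_normalize(values):
    # V' = (V - mean) / standard deviation, computed per attribute
    return (values - values.mean()) / values.std()

load = np.array([5.0, 7.0, 9.0, 11.0])   # hypothetical load values
normalized = z_score_normalize(load)
# a z-scored attribute has mean 0 and standard deviation 1
print(round(float(normalized.mean()), 6), round(float(normalized.std()), 6))
```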
In this work electric load consumption is forecasted based on
the historical time series data. The available data is divided into
training, validation and test sets. The training set is used to build
the model, the validation set is used for parameter optimisation
and the test set is used to evaluate the model. Separate models are developed
using SVM and MLP trained with back propagation algorithm.
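The three-way split can be sketched as follows; the 70/15/15 proportions are an assumption for illustration, since the paper does not state its split ratios:

```python
import numpy as np

rng = np.random.default_rng(0)
series = rng.uniform(size=100)             # hypothetical historical load series

n_train = int(0.70 * len(series))          # training set: build the model
n_val = int(0.15 * len(series))            # validation set: tune parameters
train = series[:n_train]
val = series[n_train:n_train + n_val]
test_set = series[n_train + n_val:]        # test set: evaluate the final model

print(len(train), len(val), len(test_set))   # 70 15 15
```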
A nonlinear support vector regression method is used to train
the SVM. A kernel function must be selected from the functions
that satisfy Mercer's theorem; the polynomial kernel is adopted in
this study. The polynomial kernel function requires setting the
parameter p in addition to the regular parameters C and ε.
As there are no general rules to determine the free parameters,
the optimum values are set by the grid search method [2, 8, 13].
The search is performed to identify the best combination of
parameters. After experimentation it has been observed that the
model with parameters C = 1, ε = 0.001, p = 2 gives the least error.
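The grid search can be sketched as an exhaustive loop over candidate parameter values, scored on a validation set. Since the paper does not describe its implementation, this sketch uses a closed-form kernel ridge regressor as a self-contained stand-in for the SVR solver (so ε is omitted here), synthetic data, and an arbitrary parameter grid:

```python
import itertools
import numpy as np

def poly_kernel(A, B, p):
    # polynomial kernel [1 + (a . b)]^p between the row vectors of A and B
    return (1.0 + A @ B.T) ** p

def fit_predict(Xtr, ytr, Xval, C, p):
    # kernel ridge regression stands in for SVR; C acts as the
    # trade-off between training error and model flatness
    K = poly_kernel(Xtr, Xtr, p)
    alpha = np.linalg.solve(K + np.eye(len(Xtr)) / C, ytr)
    return poly_kernel(Xval, Xtr, p) @ alpha

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, (80, 3))                       # synthetic features
y = X @ np.array([0.5, -0.2, 0.3]) + 0.01 * rng.normal(size=80)
Xtr, ytr, Xval, yval = X[:60], y[:60], X[60:], y[60:]

best = None
for C, p in itertools.product([0.1, 1.0, 10.0], [1, 2, 3]):
    pred = fit_predict(Xtr, ytr, Xval, C, p)
    err = float(np.abs(pred - yval).mean())           # validation MAE
    if best is None or err < best[0]:
        best = (err, C, p)
print(best[1], best[2])                               # best (C, p) found
```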
Back propagation algorithm is used to develop the MLP model.
Several models [15], have been developed and tested, and finally
the best model is identified based on the mean absolute error which
is considered as performance measure in this work. The sigmoid
activation function is used in the models.
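The mean absolute error used as the performance measure is straightforward to compute; a short sketch with hypothetical values:

```python
import numpy as np

def mean_absolute_error(actual, predicted):
    # MAE = mean of the absolute deviations |actual - predicted|
    return float(np.abs(np.asarray(actual) - np.asarray(predicted)).mean())

print(mean_absolute_error([1.0, 2.0, 3.0], [1.1, 1.8, 3.3]))  # (0.1 + 0.2 + 0.3) / 3 ≈ 0.2
```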
V. Results and Discussion
Electric load is predicted using both neural networks and support
vector machines. The errors of both models are presented
below.
Table 2: Comparison of NN and SVM

Model                      Mean Absolute Error
Neural Networks            0.1088
Support Vector Machines    0.0916
From the table it can be observed that SVMs give better
performance than neural networks.
Fig. 1 and Fig. 2 show the performance of both models on
the test set.
Fig. 1: Actual and predicted values using the SVM model
Fig. 2: Actual and predicted values using the NN model
VI. Conclusion
An application of support vector regression and multi layer
perceptron neural networks for electric load forecasting is
presented in this paper. The performance of SVM was compared
with MLP for various models. The results obtained show that SVM
performs better than neural networks trained with back propagation
algorithm. It was also observed that parameter selection in the
case of SVM has a significant effect on the performance of the
model. It can be concluded that through proper selection of the
parameters, Support Vector Machines can replace some of the
neural network based models for electric load forecasting.
References
[1] Haykin, S.,“Neural Networks-A Comprehensive Foundation”,
Prentice Hall. 1999.
[2] Jae H. Min, Young-Chan Lee,"Bankruptcy prediction using
support vector machine with optimal choice of kernel function
parameters", Expert Systems with Applications, 28, pp. 603-614, 2005.
[3] Ronan Collobert, Samy Bengio,"SVMTorch: Support
Vector Machines for Large-Scale Regression Problems",
Journal of Machine Learning Research, 1, 2001, pp. 143-160.
[4] Smola A.J, Scholkopf B,“A Tutorial on support vector
regression”, Neuro COLT Technical Report NC-TR-98-030,
Royal Holloway College, University of London, UK. 1998
[5] Hsu, C.W., Chang, CC., Lin C.J.,“A Practical guide to
support vector classification”, Technical report, Department
of Computer science and information Engineering, National
Taiwan University, 2008.
[6] Y.Radhika., M.Shashi.,“Atmospheric temperature prediction
using Support Vector Machines”, International Journal of
Computer Theory and Engineering, Vol. 1, No. 1, pp. 55-58,
2009.
[7] Hand, D.J., Heikki Mannila, Padhraic Smyth,“Principles of
Data Mining”, The MIT Press, 2001.
[8] Radhika, Y., Shashi, M.,“A Recursive Algorithm for Parameter
Optimisation in Support Vector Regression”, International
Journal of Computer Applications in Engineering, Technology
and Sciences, Vol. 1, Issue 2, April 2009, pp. 451-454.
[9] Han, Kamber,“Data Mining: Concepts and Techniques”,
Morgan Kaufmann Publishers, 2002.
[10]Samson, D.C., Downs, T., Saha, T.K.,“Evaluation of
Support Vector Machine Regression Based Forecasting
Tool in Electricity Price Forecasting for Australian National
Electricity Market Participants”, J. Elect. Eloctron Eng.
Austr, Vol. 22, 2002, pp. 227-234.
[11] Scholkopf, B., Burges, C.J.C., Smola, A.J.,"Using Support
Vector Machine for Time Series Prediction", Advances in
Kernel Methods, Eds. Cambridge, MA: MIT Press, pp. 242-253, 1999.
[12]Muller, K.R., Smola, A.J., Ratsch, G., Scholkopf, B.,
Kohlmorgen, J., Vapnik, V.,“Predicting Time Series with
Support Vector Machine”, Proc. Int. Conf. Artificial Neural
Networks, ICANN 97.
[13]Lamamra, K., Belarbi, K., Mokhtari, F.,“Optimisation of the
Structure of a Neural Networks by Multi Objective Genetic
Algorithms”, ICGST-ARAS Journal, April 2007, pp. 1-4.
[14]Khan, M.R., Ondrusek C.,“Short Term Electric Demand
Prognosis Using Artificial Neural Networks”, Electr. Engg,
2000.
[15] Snehashish Chakraverty, Pallavi Gupta,"Comparison of
Neural Network Configurations in the Long-Range Forecast of
Southwest Monsoon Rainfall Over India", Neural Computing
and Applications, 2007.
[16]Kim, K J.,“Financial Time Series Forecasting Using Support
Vector Machines”, Neurocomputing, 2003, pp.307-319.
[17] Sivanandam, S.N., Sumati, S., Deepa, S.N.,"Introduction
to Neural Networks Using MATLAB 6.0", Tata McGraw
Hill, 2006.
[18]Changyin Sun, Jinya Song, Linfeng Li, Ping Ju,
“Implementation of Hybrid Short Term Load Forecasting
System with Analysis of Temperature Sensitivities”, Soft
Computing, 2008, pp. 633-638.
[19] Qiudan Li, Stephen Shaoyi Liao, Dandan Li,"A Clustering
Model for Mining Consumption Patterns from Imprecise
Electric Load Time Series Data", Lecture Notes in Computer
Science, Fuzzy Systems and Knowledge Discovery, Springer-Verlag
Berlin Heidelberg, FSKD 2006, LNAI 4223, pp. 1217-1220, 2006.
[20][Online] Available: http://www.eia.gov/cneaf/electricity/epa/
epa_sprdshts.html
Renuka Achanta received her B.E. (Computer Science Engineering)
and M.Tech. (Computer Science and Technology) degrees. She is
currently an Assistant Professor in the Department of Computer
Science Engineering at A.K.R.G. Engineering College.