Download F. Villa_Forecast electricity prices_v.5_Fer

Abstract
Forecasting electricity prices in liberalized and deregulated markets is considered a difficult task
because of the number and complexity of the factors that influence them. Traditional neural network
models can represent those complexities; however, they are often criticized for their lack of
statistical rigor. The cascade correlation neural network has been used to address this problem.
Although cascade correlation can be the best of the traditional neural network models, it can suffer
from overfitting. To control this problem, this paper proposes several regularization strategies
(weight decay, weight elimination, and ridge regression) to forecast the mean monthly prices of the
dispatch contracts in the wholesale electricity market of Colombia. We compare the obtained forecasts
with those of a multilayer perceptron and an ARIMA model. The results show that the regularized
cascade correlation network captures the intrinsic dynamics of the time series better than the other
traditional models, and that it gives a more accurate forecast for a horizon of twelve months ahead.
Keywords: time series forecast, cascade correlation
[1] INTRODUCTION
In the last decade, the electricity industry has experienced significant changes towards deregulation and
competition with the aim of improving economic efficiency. In many places, these changes have
culminated in the appearance of a wholesale electricity market. In Colombia, the residential public
services and the electricity laws have brought about the reconstruction of the electricity sector,
transforming it into a new scheme of free competition. In this new scheme, the current operation of the
generating units depends on the decentralized decisions of generation firms whose goals are to
maximize their own profits. All firms compete to provide generation services at a price set by the
market through two basic mechanisms: bilateral contracts between agents and trading on the energy
spot market.
Electricity price forecasting is a particularly complex problem because of the number and complexity
of the factors that influence prices, such as the physical characteristics of the generation system
(electricity cannot be stored and its transportation requires a transmission line), the business
decisions of individual market agents, and regulation. In general, the time series of electricity
prices exhibits these complexities through its features, including: pronounced daily, weekly, monthly,
and other seasonal cycles; time-varying volatility; strong variations from year to year and season to
season; long-term structural dynamics; leverage effects and an asymmetric response of volatility to
positive and negative values; outliers; high-order correlations; structural changes; local trends; and
mean reversion. Furthermore, prices depend on the conditions of the generating units in the short run
and on investment in capacity and demand growth in the long term, and the determinants of risk differ
across the short, medium, and long term.
Accurate price forecasts are critical for producers, consumers and retailers. They must set up bids
for the spot market in the short term, define contract policies in the medium term, and define their
expansion plans in the long term. For these reasons, all the decisions that each market player must
take are strongly affected by price forecasts. Consequently, it is necessary to develop
high-performance forecast models for these series. In this paper we analyze the forecast of the mean
monthly prices of the dispatch contracts in the wholesale electricity market of Colombia.
Neural networks have often been used to model complex time series, and their use in electricity
markets has been reported. However, estimating the parameters of traditional neural network models,
such as the multilayer perceptron (MLP), has been characterized as a particularly difficult problem.
The lack of statistical identifiability of the model is one of the aspects that hinder its
specification: the optimal parameters are not unique for a given specification of the model (inputs
or lags, hidden neurons, etc.). The Cascade Correlation (CASCOR) artificial neural network presents
interesting conceptual advantages with respect to the statistical identifiability problem of the MLP.
CASCOR follows a network-growing, or constructive learning, scheme in which the number of required
hidden neurons does not need to be known a priori, so learning can be faster and generalization can
be better than with an MLP.
Although CASCOR networks have advantages with respect to the statistical identifiability problem of
the traditional multilayer perceptron and have proven robust enough to model complex data sets, they
can suffer from overfitting. To control this problem, in this paper we propose to use several
regularization strategies: weight decay, weight elimination and ridge regression.
The main purpose of this paper is to forecast the mean monthly prices of the dispatch contracts in
the wholesale electricity market of Colombia using CASCOR networks, and to compare the results with
those of ARIMA and MLP models, in order to determine the best prediction model for the studied
series. The results support the efficiency and practicality of the proposed method.
The originality and importance of this paper rest on the following aspects:
• Although there is extensive experience in forecasting electricity prices in short-term markets,
there are no references in the literature on forecasting contract prices with CASCOR networks.
• There are few experiences in the literature comparing the performance of CASCOR networks with
other models when forecasting real-world series.
• It helps to promote the use of CASCOR for forecasting electricity price series, increasing the
number of tools available.
This article is organized as follows: Section 2 gives a general description of the CASCOR model
for time series forecasting. Section 3 presents the proposed time series CASCOR procedure with
regularization strategies, its application to the forecast of the mean monthly prices of the dispatch
contracts in the wholesale electricity market of Colombia, and the comparison of the results with
those of ARIMA and MLP models. Conclusions are drawn in Section 4.
[2] CASCOR MODEL FOR TIME SERIES FORECASTING
The artificial neural network known as Cascade Correlation (CASCOR) follows a network-growing, or
constructive learning, scheme, i.e. it starts with a minimal network without hidden layers and then
builds a multilayered structure by adding hidden neurons one at a time. Each new neuron receives a
synaptic connection from each input and from each hidden neuron that precedes it. After the new
hidden neuron is added, its input synaptic weights are frozen, while the output weights are trained
repeatedly. This process continues until the network reaches a satisfactory performance. Figure 1
shows the scheme of a CASCOR network: the boxes at the intersections of the lines indicate the
weights (parameters wp,h) that are frozen once a unit has been added to the hidden layer, and the
crosses indicate the weights that are modified after the neuron is inserted.
Thus, the CASCOR network combines two basic ideas. The first is the cascade architecture, in which
hidden neurons are added one at a time and are not changed after being added. The second is
incremental or constructive learning, which concerns how the new hidden neurons are created: for
each new hidden neuron, the algorithm maximizes the correlation between the output of the new
neuron and the residual error of the network, i.e. hidden neurons are added to reduce the error
until the performance of the network is satisfactory.
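The correlation criterion described above can be illustrated with a minimal sketch. It assumes a single network output, a tanh activation, and a fixed pool of candidate weight vectors that are only scored and compared; in a full CASCOR implementation the candidate weights would be trained to maximize this score. The function names and the synthetic data are hypothetical and do not come from the paper.

```python
import numpy as np

def candidate_score(candidate_out, residual):
    """Magnitude of the covariance between a candidate neuron's output
    and the residual error of the current network (single output)."""
    v = np.asarray(candidate_out, dtype=float)
    e = np.asarray(residual, dtype=float)
    return float(abs(np.sum((v - v.mean()) * (e - e.mean()))))

def pick_candidate(X, residual, candidate_weights):
    """Score every candidate in the pool and return the index of the best one.
    X holds the inputs (and, in a grown network, earlier hidden outputs)."""
    scores = [candidate_score(np.tanh(X @ w), residual) for w in candidate_weights]
    return int(np.argmax(scores)), scores

# Synthetic illustration (assumed data, not the electricity price series).
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 3))
w_good = np.array([1.0, -0.5, 0.25])
residual = np.tanh(X @ w_good)       # residual that one candidate matches exactly
pool = [np.zeros(3), w_good]         # a constant candidate and a matching one
best, scores = pick_candidate(X, residual, pool)
```

The constant candidate has zero covariance with the residual, so the matching candidate is selected; once installed, its input weights would be frozen and only the output weights retrained.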
However, CASCOR networks have been criticized, especially for overfitting problems, as discussed in
the next section. Given its improvements over an MLP, a CASCOR network could in theory approximate
non-linear regression functions better than an MLP. The general regression problem has already been
addressed in the literature, but modeling and forecasting time series is more complex than
regression, since it must take into account the order of the data and the statistical properties that
this ordering induces. CASCOR is therefore expected to forecast time series with greater accuracy
than an MLP. However, this hypothesis has not been verified in the literature, and it is examined
experimentally in this paper.
[3] RESEARCH METHODOLOGY
A. CasCor Regularization
Although cascade correlation neural networks (CasCor) can be better than multilayer perceptrons,
they can suffer from overfitting. To control this problem, we propose to use several regularization
strategies: weight decay, weight elimination and ridge regression.
Weight decay was proposed by Hinton (1989) and weight elimination by Weigend et al. (1991); both
strategies are described in Palit and Popovic (2005). Ridge regression was proposed by Hoerl and
Kennard (1970); its main idea is to control the bias-variance trade-off. Ridge regression can reduce
the variance of the weights, minimize the effect of outliers, and reduce the validation error of the
network.
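As an illustration of how ridge regression shrinks weights, the sketch below solves the standard closed form w = (H'H + lambda I)^(-1) H'y for a linear output layer. It assumes that the penalty is applied to the hidden-to-output weights, that H collects the signals feeding the output, and that lambda is chosen by the user; the function name, the lambda values and the synthetic data are hypothetical and are not taken from the paper.

```python
import numpy as np

def ridge_output_weights(H, y, lam):
    """Closed-form ridge solution for a linear output layer.
    H: (n_samples, n_features) signals feeding the output; y: targets."""
    H = np.asarray(H, dtype=float)
    y = np.asarray(y, dtype=float)
    A = H.T @ H + lam * np.eye(H.shape[1])
    return np.linalg.solve(A, H.T @ y)

# Synthetic illustration (assumed data).
rng = np.random.default_rng(1)
H = rng.normal(size=(50, 4))
y = H @ np.array([2.0, -1.0, 0.5, 0.0]) + 0.1 * rng.normal(size=50)

w_ols = ridge_output_weights(H, y, lam=0.0)     # no penalty: ordinary least squares
w_ridge = ridge_output_weights(H, y, lam=10.0)  # penalized: weights shrink toward zero
```

Increasing lambda trades a small bias for a lower variance of the weights, which is the trade-off mentioned above.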
For the forecast, we then consider the following five regularization schemes:
• WE, CasCor network regularized with weight elimination.
• WD, CasCor network regularized with weight decay.
• RR, CasCor network regularized with ridge regression.
• WE+RR, CasCor network regularized with weight elimination and ridge regression.
• WD+RR, CasCor network regularized with weight decay and ridge regression.
For weight decay, we set lambda to 0.0001. For weight elimination, we use the same lambda value
and set w0 to 100. Additionally, to estimate the parameters of the CasCor model we use the ConRprop
optimization algorithm, which was proposed and described by Villa et al. (2009).
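The two penalty terms can be sketched as follows, using the lambda and w0 values stated above. The formulas are the commonly used forms (a sum of squared weights for weight decay, and a saturating ratio for weight elimination); the exact scaling used by the authors, for example a factor of 1/2, is not given in the text, so the definitions below are an assumption.

```python
import numpy as np

LAMBDA = 1e-4   # lambda value stated in the text
W0 = 100.0      # w0 value stated in the text

def weight_decay_penalty(w, lam=LAMBDA):
    """Weight decay: lam * sum(w_i^2); grows without bound with the weights."""
    w = np.asarray(w, dtype=float)
    return float(lam * np.sum(w ** 2))

def weight_elimination_penalty(w, lam=LAMBDA, w0=W0):
    """Weight elimination: lam * sum((w_i/w0)^2 / (1 + (w_i/w0)^2)).
    Each term saturates at 1, so very large weights are not penalized further."""
    r = (np.asarray(w, dtype=float) / w0) ** 2
    return float(lam * np.sum(r / (1.0 + r)))

wd = weight_decay_penalty([1.0, 2.0])       # 1e-4 * (1 + 4)
we = weight_elimination_penalty([100.0])    # 1e-4 * (1 / 2), since |w| equals w0
```

Either penalty would be added to the training error before optimization. With w0 = 100, weights much smaller than 100 are penalized almost quadratically, so in that range weight elimination behaves like a rescaled weight decay.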
B. Data analysis
In this study we use the natural logarithm of the mean monthly prices of the dispatch contracts in
the wholesale electricity market of Colombia, in $/kWh, between January 1997 (1997:01) and
October 2009 (2009:10). This data series is available in the Neon system of the company XM
Compañía de Expertos en Mercados S.A. E.S.P.
Fig. 2 shows that this series has a long-term upward trend from 1997:01 to the first half of 2003.
During the same interval there is evidence of an annual cyclical component of variable amplitude,
possibly explained by the winter-summer cycle. The largest amplitude of the periodic component
coincides with the "El Niño" phenomenon that occurred between 1997 and 1998; this cyclical
component remains, with a less marked amplitude, until early 2004. From 2003 there is a slight
downward trend that ends sometime in the first half of 2006. At that point there is evidence of a
structural change in the series, both in its trend and in its cyclical component: on the one hand,
the series recovers the growth levels that characterized 2000, 2001 and 2002; on the other, a
seasonal cycle of annual period reappears, whose highest level coincides with the summer season.
The series consists of 154 observations, of which the first 130 (1997:01 to 2007:10) were used to
estimate the model parameters. Table I shows the models for time series forecasting. To test the
generalization ability of the models, we use two different forecast horizons: the first is one year,
consisting of 12 observations (from 2007:11 to 2008:10), and the second is two years, consisting of
24 observations (between 2007:11 and 2009:10).
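The data preparation described above can be sketched as follows: log-transform the prices, build a matrix of lagged values as inputs, and hold out the last 24 of the 154 observations. The placeholder series is synthetic, and the lag set (1, 2, 3) is an assumption chosen only for illustration; the lag sets actually used are those listed in Table I.

```python
import numpy as np

def make_lagged(series, lags):
    """Build inputs X (one column per lag) and targets y from a 1-D series."""
    y = np.asarray(series, dtype=float)
    p = max(lags)
    X = np.column_stack([y[p - k : len(y) - k] for k in lags])
    return X, y[p:]

# Placeholder for the 154 monthly prices (synthetic, NOT the XM data).
rng = np.random.default_rng(2)
prices = np.exp(4.0 + 0.1 * rng.normal(size=154))
log_prices = np.log(prices)

N_TRAIN = 130                      # first 130 observations used for estimation
lags = [1, 2, 3]                   # assumed lag set, for illustration only
X, y = make_lagged(log_prices, lags)

n_fit = N_TRAIN - max(lags)        # targets that fall inside the training window
X_train, y_train = X[:n_fit], y[:n_fit]
X_test, y_test = X[n_fit:], y[n_fit:]   # last 24 observations
y_test_1yr = y_test[:12]                # one-year horizon
```

Because the first max(lags) observations have no complete set of lagged inputs, the number of usable training targets is 130 minus the largest lag.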
C. Empirical results
For the series studied in this paper we estimate the models presented in Table I, with forecast
horizons of one and two years, i.e. 12 and 24 months respectively. The goodness of fit of the
models was measured with the sum of squared errors (SSE), both in training and in prediction
(validation); the results are presented in Table I.
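For reference, the SSE used as the goodness-of-fit measure is simply the sum of squared differences between observed and predicted values. A minimal sketch with made-up numbers (not results from Table I):

```python
import numpy as np

def sse(actual, predicted):
    """Sum of squared errors between observed and predicted values."""
    a = np.asarray(actual, dtype=float)
    p = np.asarray(predicted, dtype=float)
    return float(np.sum((a - p) ** 2))

example_sse = sse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])  # only the last point differs, by 2
```

Since the SSE is not normalized by the number of observations, values for the 12-month and 24-month horizons are not directly comparable with each other or with the training SSE.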
To evaluate the predictive power of the CasCor networks with respect to other models, they are
compared with an MLP and, for illustration, with an autoregressive integrated moving average
(ARIMA) model. The MLP model was estimated for different sets of lags, and the models with the
lowest error were selected. The MLP architecture consists of an input layer with one neuron for each
of the lags considered, a hidden layer with 5 neurons (the same number as in the CASCOR models),
and one output layer. The ARIMA model was obtained with the auto.arima() function implemented in
the R package forecast of Hyndman and Khandakar (2008), which searches for the best ARIMA model
for a univariate time series; the model found was ARIMA(0,1,0)(2,0,2), and its forecast results are
also presented in Table I. Furthermore, it is worth stressing that all regularized CASCOR models
achieve a lower error than the corresponding network without regularization.
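For readers unfamiliar with the notation, the ARIMA(0,1,0)(2,0,2) specification reported above can be written out with the backshift operator B. The seasonal period s is not stated in the text; s = 12 is assumed here because the data are monthly. Any constant or drift term is omitted, since none is reported.

```latex
% Seasonal ARIMA(0,1,0)(2,0,2)_s, with s = 12 assumed for monthly data.
% y_t: log price, B: backshift operator, \varepsilon_t: white noise.
\left(1 - \Phi_1 B^{s} - \Phi_2 B^{2s}\right)\,(1 - B)\, y_t
  = \left(1 + \Theta_1 B^{s} + \Theta_2 B^{2s}\right)\varepsilon_t
```

The factor (1 - B) is the single non-seasonal difference, and the model has no non-seasonal autoregressive or moving average terms; all remaining dynamics come from the two seasonal autoregressive and two seasonal moving average coefficients.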
The results show that, among the models with three lags, CASCOR-WE+RR-1 attains the lowest error
in training and in the two-year forecast, while CASCOR-RR-1 yields the lowest one-year forecast
error; however, the one-year forecast error of CASCOR-WE+RR-1 is only 4% greater than the lowest
among the three-lag models. In addition, all three-lag CASCOR models generalize better than MLP-1.
The models CASCOR-WD-1 and CASCOR-WE+RR-1 are also better than the ARIMA model in both training
and forecasting, while the rest are superior only in prediction. When the number of lags is
increased to six and thirteen, the CASCOR models remain better than the respective MLP, and
likewise when compared to ARIMA.
With 6 lags, the models CASCOR-WE+RR-2 and CASCOR-RR-2 remain the best in training and in the
one-year forecast, respectively, but now CASCOR-RR-2 is also the best in the two-year forecast. The
training error of CASCOR-RR-2 exceeds that of CASCOR-WE+RR-2 by 33%, while the forecast error of
CASCOR-WE+RR-2 exceeds that of CASCOR-RR-2 by 8.42% and 8.13% at one and two years respectively.
The difference between the two models is wider in training, so in this case CASCOR-WE+RR-2 might
be the more appropriate model. Furthermore, the difference of CASCOR-WE+RR-2 relative to
CASCOR-WD+RR-2 is -2.34%, 2.11% and 3.13% in training and in the one- and two-year forecasts
respectively, so CASCOR-WD+RR-2 also emerges as a suitable model for the series. It is worth
emphasizing that the models regularized in both layers reach a lower training error than the others,
but the lowest forecast error is attained by the model regularized only in the output layer.
With 13 lags, the model CASCOR-WD+RR-3 is the best of all. It is regularized between the input and
hidden layers with weight decay, and between the hidden and output layers with ridge regression,
i.e. overfitting is controlled throughout this model. In contrast, the errors of the model CASCOR-3,
which has no regularization strategy, are noticeably larger than those achieved by CASCOR-WD+RR-3.
In general, for this series, the fully regularized CASCOR models (regularized between the input and
hidden layers and between the hidden and output layers) achieve lower errors than most models
regularized with only one technique. Fully regularized models are also more appropriate for
forecasting, given that they largely control the causes of overfitting.
[4] CONCLUSIONS
Forecasting the mean monthly prices of the dispatch contracts in the wholesale electricity market
of Colombia is a complex task because of changes in the annual cyclical pattern and several changes
in the long-term trend. We used the first 130 observations to estimate the parameters of the models,
while the remaining observations were used to evaluate predictive capability. The results indicate
that fully regularized CASCOR networks predict more accurately than the MLP, the ARIMA model, and
the CASCOR networks without regularization. The proposed procedure makes it possible to find models
with better generalization than other proposals in the literature.
REFERENCES