Forecast of the Mean Monthly Prices of the Dispatch Contracts in
Wholesale Electricity Market of Colombia Using Cascade
Correlation Neural Networks
Paola Sánchez, Fernán Villa, Juan Velásquez
Abstract—Forecasting of the electricity prices in liberalized
and deregulated markets has been considered a difficult task
due to the number and complexity of the factors that influence
them. Traditional neural network models can represent those
complexities; however, they are often criticized for their lack
of statistical identifiability. The Cascade Correlation neural
network model has been used to address this problem. Although
Cascade Correlation networks can be the best of the traditional
neural network models, they can suffer from overfitting. To
control this problem, this paper proposes some regularization
strategies (weight decay, weight elimination and ridge
regression) to forecast the mean monthly prices of the dispatch
contracts in the wholesale electricity market of Colombia. We
compare the obtained forecasts with those of a multilayer
perceptron and an ARIMA model. The results show that the
regularized cascade correlation network captures the intrinsic
dynamics of the time series better than the other traditional
models, and that it gives the most accurate forecast for a
horizon of twelve months ahead.
Keywords—Time series forecast, cascade correlation.
In the last decade, the electricity industry has experienced
significant changes towards deregulation and competition
with the aim of improving economic efficiency. In many
places, these changes have culminated in the appearance of a
wholesale electricity market. In Colombia, the Residential
Public Services Law and the Electricity Law led to the
restructuring of the electricity sector towards a new scheme of
free competition. In this new context, the actual operation of
the generating units depends on the decentralized decisions of
generation firms whose goal is to maximize their own
profits. All firms compete to provide generation services at a
price set by the market through two basic mechanisms:
bilateral contracts between agents, and energy trading in the
spot market.
Forecasting the spot price is a particularly complex problem
due to the number and complexity of the factors that
influence it [1], such as the physical characteristics of the
generation system (electricity cannot be stored and its
transportation requires transmission lines), the influence of
the business decisions of the individual market agents, and
regulation. In general, the time series of spot prices exhibit
these complexities through their features, including: pronounced
daily, weekly, monthly and other seasonal cycles; time-varying
volatility; strong variations from year to year and
season to season; long-term structural dynamics; leverage
effects and asymmetric response of volatility to positive and
negative values; outliers; high-order correlations; structural
changes; local trends; and mean reversion. Furthermore, prices
depend on the conditions of the generating units in the short
run and on investment in capacity and demand growth in the long
term, and the determinants of risk differ in the short,
medium and long term.
Paola Sánchez is with Systems School, National University of Colombia.
Av 80 No. 65 - 223 Bl M8, Medellin – Colombia (Phone: 057-4-4255370;
[email protected])
Fernán Villa is with Systems School, National University of Colombia.
Av 80 No. 65 - 223 Bl M8, Medellin – Colombia (Phone: 057-4-4255370;
[email protected])
Juan Velásquez is with Systems School, National University of Colombia.
Av 80 No. 65 - 223 Bl M8, Medellin – Colombia (Phone: 057-4-4255370;
[email protected])
Given the complexity of the dynamics of spot prices, the
difficulty of forecasting them and the risk involved, contracts
are a mechanism of risk mitigation. First, they prevent the
buyer from being exposed to price volatility in the spot market
and to the exceptionally high prices that occur in the presence
of extreme hydrological events; second, they stabilize the
earnings of the seller and protect it from exceptionally low
prices. In the Colombian electricity market there are two types
of contracts: pay-to-contracted and pay-to-demanded. The
pay-to-contracted contract specifies that the buyer agrees to pay
for all the electricity contracted, whether it was consumed or
not. In pay-to-demanded, the buyer pays only for the energy
actually consumed.
The accuracy of spot price forecasts is critical for
producers, consumers and retailers. They must set up
bids for the spot market in the short term, define contract
policies in the medium term and define their expansion plans
in the long term [2]. For these reasons, all the decisions that
each market player must take are strongly affected by price
forecasts [3]. Consequently, it is necessary to develop
high-performance forecast models for these series. In this paper
we analyze the forecast of the mean monthly prices of the
dispatch contracts in the wholesale electricity market of
Colombia.
Neural networks have often been used for modeling
complex time series; in the electricity market in particular,
their use has been reported in [1], [4] and [5]. However, the
estimation of the parameters of traditional neural network
models, such as the multilayer perceptron (MLP), has
been characterized as a particularly difficult problem. The
lack of statistical identifiability of the model is one of the
aspects that hinder its specification: the optimal parameters
are not unique for a given specification of the model (inputs
or lags, hidden neurons, etc.). The Cascade Correlation
(CASCOR) artificial neural network [6] presents interesting
conceptual advantages in relation to the statistical
identifiability problem of the MLP. CASCOR follows a
network-growing, or constructive learning, scheme in which there
is no need to know a priori the number of required hidden
neurons, so learning can be faster and the network can
generalize better than an MLP [7].
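One well-known source of this non-uniqueness is the sign-flip symmetry of tanh hidden units: negating every weight entering a hidden neuron, together with its output weight, leaves the network output unchanged. The following minimal sketch illustrates the idea. The tiny network, its weights and the function name `mlp_output` are illustrative assumptions and are not taken from the paper.

```python
import math

def mlp_output(x, hidden_weights, hidden_biases, output_weights, output_bias):
    """Single-hidden-layer MLP with tanh hidden units and a linear output."""
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(ws, x)) + b)
        for ws, b in zip(hidden_weights, hidden_biases)
    ]
    return output_bias + sum(v * h for v, h in zip(output_weights, hidden))

# Hypothetical parameters: 2 inputs (lags), 2 hidden neurons.
W = [[0.5, -1.2], [0.8, 0.3]]
b = [0.1, -0.4]
v = [1.5, -0.7]
c = 0.2

# Flip the sign of every weight entering and leaving hidden neuron 0.
W_flipped = [[-w for w in W[0]], W[1]]
b_flipped = [-b[0], b[1]]
v_flipped = [-v[0], v[1]]

x = [0.3, -0.9]
y_original = mlp_output(x, W, b, v, c)
y_flipped = mlp_output(x, W_flipped, b_flipped, v_flipped, c)
```

Two different parameter vectors produce the same output for every input, so a training criterion based only on the output cannot distinguish them. Permuting the hidden neurons gives further equivalent solutions.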
Although CASCOR networks have advantages regarding the
statistical identifiability problem of the traditional multilayer
perceptron and have proven robust enough to model
complex data sets, they can suffer from overfitting. To control
this problem, in this paper we propose the use of some
regularization strategies: weight decay, weight elimination and
ridge regression.
The main purpose of this paper is to forecast the
mean monthly prices of the dispatch contracts in the wholesale
electricity market of Colombia using CASCOR networks,
and to compare the results with those of ARIMA and MLP
models, in order to determine the best prediction model for
the studied series. The results show the efficiency and
practicality of the proposed method.
The originality and importance of this paper rest on
the following aspects:
• Although there is extensive experience in
forecasting electricity prices in short-term markets
[1], there are no references in the literature on
forecasting contract prices with CASCOR networks.
• There are few experiences in the literature comparing
the performance of CASCOR networks with other
models in forecasting real-world series.
• It helps to promote the use of CASCOR for
forecasting electricity price series, increasing the
number of tools available.
This article is organized as follows: Section II gives a
general description of the CASCOR models for time series
forecasting. Section III presents the proposed time series
CASCOR procedure with regularization strategies, its
application to the forecast of the mean monthly prices of
the dispatch contracts in the wholesale electricity market of
Colombia, and the comparison of the results with those of
ARIMA and MLP models. Conclusions are drawn in the final
section.
The artificial neural network known as Cascade
Correlation (CASCOR), proposed in [6], follows a
network-growing, or constructive learning, scheme; i.e., it
starts with a minimal network without hidden layers and
then builds a multilayered structure by adding one hidden
neuron at a time. Each new hidden neuron receives a synaptic
connection from each input and from each hidden neuron that
precedes it. After the new hidden neuron is added, its input
synaptic weights are frozen, while the output weights are
trained repeatedly. This process continues until the network
reaches a satisfactory performance. Figure 1 shows the schematic
of a CASCOR network: the boxes at the intersections of the
lines indicate the weights (parameters wp,h) that are frozen
once a unit has been added to the hidden layer, and the crosses
indicate the weights that are modified after inserting the
new unit.
Thus, the CASCOR network combines two basic ideas.
The first is the cascade architecture, in which hidden
neurons are added one at a time and are not changed after being
added. The second is incremental, or constructive,
learning, which concerns how the new hidden neurons are
created: for each new hidden neuron, the algorithm
maximizes the correlation between the output of the new hidden
neuron and the residual error of the network; i.e., hidden
neurons are added to reduce the error until the performance is
satisfactory.
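The growth procedure described above can be sketched in a few lines of NumPy. This is a simplified illustration under stated assumptions, not the authors' implementation: it trains a single tanh candidate per step by plain gradient ascent on the covariance with the residual (the original algorithm in [6] uses a pool of candidates), it fits the output weights by ordinary least squares, and all function names and the synthetic data are hypothetical.

```python
import numpy as np

def fit_output(features, y):
    """Least-squares output weights for the current feature matrix."""
    weights, *_ = np.linalg.lstsq(features, y, rcond=None)
    return weights

def train_candidate(features, residual, steps=300, lr=0.5, seed=0):
    """Gradient ascent on |covariance| between a tanh unit and the residual."""
    rng = np.random.default_rng(seed)
    w = rng.normal(scale=0.1, size=features.shape[1])
    e = residual - residual.mean()
    for _ in range(steps):
        v = np.tanh(features @ w)
        cov = np.sum((v - v.mean()) * e)
        # Because e has zero mean, d(cov)/dw = sum_i e_i * (1 - v_i^2) * x_i.
        grad = features.T @ (e * (1.0 - v ** 2))
        w += lr * np.sign(cov) * grad / len(e)
    return w

def add_unit(features, w_hidden):
    """Append the output of a hidden unit as a new feature column."""
    return np.column_stack([features, np.tanh(features @ w_hidden)])

def cascor_fit(X, y, n_hidden=3):
    features = np.column_stack([np.ones(len(X)), X])  # bias + inputs
    w_out = fit_output(features, y)
    frozen = []  # input weights of hidden units; never updated again
    sse_history = [float(np.sum((y - features @ w_out) ** 2))]
    for k in range(n_hidden):
        residual = y - features @ w_out
        w_hidden = train_candidate(features, residual, seed=k)
        frozen.append(w_hidden)
        # The new unit sees the bias, the inputs and all earlier hidden units.
        features = add_unit(features, w_hidden)
        w_out = fit_output(features, y)  # only output weights are retrained
        sse_history.append(float(np.sum((y - features @ w_out) ** 2)))
    return frozen, w_out, sse_history

def cascor_predict(X, frozen, w_out):
    features = np.column_stack([np.ones(len(X)), X])
    for w_hidden in frozen:
        features = add_unit(features, w_hidden)
    return features @ w_out

# Synthetic regression data, used only to exercise the sketch.
rng = np.random.default_rng(42)
X_demo = rng.uniform(-2, 2, size=(80, 2))
y_demo = np.sin(X_demo[:, 0]) + 0.5 * X_demo[:, 1] ** 2
frozen_demo, w_out_demo, sse_demo = cascor_fit(X_demo, y_demo)
```

In this sketch the training error cannot increase when a unit is added, because the least-squares fit is recomputed over a feature set that contains the previous one. That same property is what makes unregularized growth prone to overfitting.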
However, there are some criticisms of CASCOR
networks, especially regarding overfitting problems
[8], as discussed in the next section. Given the improvements of
a CASCOR network over an MLP, CASCOR networks could
theoretically fit non-linear regression functions
better than an MLP. The general regression problem
has already been addressed in the literature, but the problem
of modeling and forecasting time series is more complex
than the regression problem, since it must take into account
the order of the data and the statistical properties that this
ordering induces. CASCOR would therefore be
expected to forecast time series with greater accuracy
than an MLP. However, this hypothesis has not
been verified in the literature and is tested
experimentally in this paper.
Fig. 1. Scheme of CASCOR Network [6]
A. CasCor Regularization
Although cascade correlation neural networks (CasCor)
can be better than multilayer perceptrons, they can suffer
from overfitting. To control this problem, we propose the use of
some regularization strategies: weight decay, weight
elimination and ridge regression.
Weight decay was proposed by Hinton (1989) [9] and
weight elimination by Weigend et al. (1991) [10]; these
strategies are described in Palit and Popovic (2005) [11]. Ridge
regression was proposed by Hoerl and Kennard (1970) [12];
its main idea is to control the bias-variance trade-off (for
more details, see [13]). Ridge regression can
reduce the variance of the weights, minimize the effect of
outliers and reduce the validation error of the network.
For the forecast, we then consider the following five
regularization schemes:
• WE, CasCor network regularized with weight
elimination.
• WD, CasCor network regularized with weight
decay.
• RR, CasCor network regularized with ridge
regression.
• WE+RR, CasCor network regularized with weight
elimination and ridge regression.
• WD+RR, CasCor network regularized with weight
decay and ridge regression.
For weight decay, we set lambda to 0.0001. For weight
elimination, we use the same lambda value and set w0 to 100.
Additionally, to estimate the CasCor model parameters we use
the ConRprop optimization algorithm, which was proposed by
Villa et al. (2009) and is described in [14].
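The three regularizers can be written compactly. The sketch below uses the textbook forms of the weight decay and weight elimination penalties and the closed-form ridge solution for a linear output layer. The exact scaling used by the authors (for example an extra factor of 1/2) and the way ridge regression enters their training loop are not stated in the text, so those details are assumptions, as are the function names.

```python
import numpy as np

LAMBDA = 1e-4   # lambda value reported in the text
W0 = 100.0      # w0 value reported for weight elimination

def weight_decay_penalty(weights, lam=LAMBDA):
    """lam * sum(w^2); some texts include an extra factor of 1/2."""
    w = np.asarray(weights, dtype=float)
    return lam * np.sum(w ** 2)

def weight_elimination_penalty(weights, lam=LAMBDA, w0=W0):
    """lam * sum((w/w0)^2 / (1 + (w/w0)^2)); each term saturates at lam."""
    r = (np.asarray(weights, dtype=float) / w0) ** 2
    return lam * np.sum(r / (1.0 + r))

def ridge_output_weights(hidden, y, lam=LAMBDA):
    """Closed-form ridge solution (H'H + lam*I)^-1 H'y for a linear output layer."""
    H = np.asarray(hidden, dtype=float)
    A = H.T @ H + lam * np.eye(H.shape[1])
    return np.linalg.solve(A, H.T @ np.asarray(y, dtype=float))
```

Weight decay penalizes large weights quadratically without bound, whereas each weight elimination term levels off once |w| is much larger than w0, so it mainly pushes small weights towards zero. In a combined scheme such as WD+RR, one penalty would be added to the error used to train the hidden-unit weights and the ridge solution would replace the plain least-squares fit of the output weights.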
B. Data analysis
In this study we use the natural logarithm of the mean
monthly prices of the dispatch contracts in the wholesale
electricity market of Colombia, in $/kWh, between
January 1997 (1997:01) and October 2009 (2009:10). This
data series is available in the Neon system of the company
XM Compañía de Expertos en Mercados S.A. E.S.P.
Fig. 2 shows that this series features a long-term upward
trend from 1997:01 to the first half of 2003; during the
same interval there is evidence of an annual cyclical component
of variable amplitude, possibly explained by the
winter-summer cycle. The largest amplitude of the periodic
component coincides with the “El Niño” phenomenon that
occurred between 1997 and 1998; this cyclical component,
although with a less marked amplitude, remains until
early 2004. From 2003 there is a slight downward trend
that ends sometime in the first half of 2006. At that point
there is evidence of a structural change in the series, both in
its trend and in its cyclical component: on the one hand, the
series recovers the growth levels that characterized the years
2000, 2001 and 2002; on the other, a seasonal
cycle of annual period reappears, whose highest level coincides
with the summer season.
The series consists of 154 observations, of which the first 130
(1997:01 to 2007:10) were used to estimate the model
parameters. Table I shows the models for time series
forecasting. To test the generalization ability of the models,
we use two different forecast horizons: the first consists
of 12 observations (from 2007:11 to 2008:10), i.e. one year, and
the second of 24 observations (from 2007:11 to 2009:10), i.e.
two years.
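The sample split and the construction of lagged inputs can be sketched as follows. The month labels are derived from the dates given in the text. The series values are placeholders rather than the real price data, and the helper names and the use of consecutive lags are assumptions made for illustration.

```python
def month_labels(start_year, start_month, n):
    """Return n consecutive labels in the 'YYYY:MM' style used in the text."""
    labels = []
    year, month = start_year, start_month
    for _ in range(n):
        labels.append(f"{year}:{month:02d}")
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return labels

def make_lagged(series, lags):
    """Pair each target y[t] with the inputs (y[t-1], ..., y[t-lags])."""
    inputs, targets = [], []
    for t in range(lags, len(series)):
        inputs.append([series[t - k] for k in range(1, lags + 1)])
        targets.append(series[t])
    return inputs, targets

N_TOTAL, N_TRAIN = 154, 130
labels = month_labels(1997, 1, N_TOTAL)
train_labels = labels[:N_TRAIN]
test_1y = labels[N_TRAIN:N_TRAIN + 12]   # one-year horizon
test_2y = labels[N_TRAIN:N_TRAIN + 24]   # two-year horizon

# Placeholder values standing in for the log-price series (not real data).
series = [float(i) for i in range(N_TOTAL)]
X_train, y_train = make_lagged(series[:N_TRAIN], lags=3)
```

With 3 lags the 130 training observations yield 127 input-target pairs, since the first three observations only serve as inputs.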
C. Empirical results
For the series studied in this paper we estimate the models
presented in Table I, with forecast horizons of one and two
years, i.e. 12 and 24 months respectively. The goodness of fit
of the models was measured using the sum of squared errors (SSE)
both in training and in prediction (validation); the results are
presented in Table I.
Fig. 2. One-step-ahead forecasting of the mean monthly prices of the dispatch contracts in the wholesale electricity market of Colombia using a
CasCor model.
TABLE I. Models, lags and sum of squared errors (SSE) in training, 1-year forecast and 2-year forecast (table values not recoverable from the source).
To evaluate the predictive power of the CasCor networks
relative to other models, they are compared with an MLP and,
for illustration, with an autoregressive integrated moving
average (ARIMA) model. The MLP model
was estimated for different sets of lags, and the models with
the lowest error were selected. The MLP architecture consists of
an input layer with one neuron for each of the lags considered,
a hidden layer with 5 neurons (the same number as in the
CASCOR models), and an output layer. The ARIMA
model is obtained with the auto.arima() function implemented in
the R package forecast of Hyndman and Khandakar (2008) [15],
which searches for the best ARIMA model for a univariate time
series; the model found was ARIMA(0,1,0)(2,0,2), and its
forecast results are also presented in Table I.
Furthermore, it should be stressed that every regularized
CASCOR model achieves a lower error than the corresponding
network without regularization.
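Written out with the backshift operator B, a seasonal ARIMA(0,1,0)(2,0,2) model has the form below. The seasonal period s is not stated in the text; s = 12 is assumed here because the data are monthly. The symbols follow the usual Box-Jenkins notation, with the moving-average polynomial written with plus signs as in R, and are not taken from the paper.

```latex
% y_t: log of the mean monthly contract price; \varepsilon_t: white noise
% s = 12 is an assumption (monthly data)
\left(1 - \Phi_1 B^{s} - \Phi_2 B^{2s}\right)\,(1 - B)\, y_t
  = \left(1 + \Theta_1 B^{s} + \Theta_2 B^{2s}\right)\varepsilon_t
```

The factor (1 - B) is the single non-seasonal difference (d = 1). There are no non-seasonal autoregressive or moving-average terms, and the seasonal part has two autoregressive and two moving-average coefficients.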
The results show that, among the models with three lags,
CASCOR-WE+RR-1 attains the lowest error in
training and in the two-year forecast, while CASCOR-RR-1
yields the lowest one-year forecast error; however, the one-year
forecast error of CASCOR-WE+RR-1 is only 4%
greater than the lowest among the three-lag models. In addition,
all the three-lag CASCOR models generalize better than
MLP-1. Moreover, the models CASCOR-WD-1 and
CASCOR-WE+RR-1 are better than the ARIMA model in both training
and forecasting, while the rest are superior only in prediction.
When the number of lags is increased to six and thirteen, the
CASCOR models remain better than the respective MLP,
and also when compared with ARIMA.
With 6 lags, the models CASCOR-WE+RR-2 and
CASCOR-RR-2 remain the best in training and in the one-year
forecast, respectively, but now CASCOR-RR-2 also gives the
best two-year forecast. The difference of CASCOR-RR-2 relative
to CASCOR-WE+RR-2 in training is 33%, while that of
CASCOR-WE+RR-2 relative to CASCOR-RR-2 in forecasting is
8.42% and 8.13% at one and two years, respectively. The
difference between the two models is larger in training, so
in this case CASCOR-WE+RR-2 might be the more appropriate
model. Furthermore, the difference of CASCOR-WE+RR-2 relative
to CASCOR-WD+RR-2 is -2.34%, 2.11% and 3.13% in
training and in the one- and two-year forecasts, respectively,
so CASCOR-WD+RR-2 also emerges as a suitable model
for the series. It should be emphasized that the models
regularized in both layers reach a lower training error than the
others, but the lowest forecast error is attained by the model
regularized only in the output layer.
With 13 lags, the model CASCOR-WD+RR-3 is
the best of all; it is regularized between the input and
hidden layers with weight decay and between the hidden and
output layers with ridge regression, i.e. overfitting is
controlled in both parts of the model. The errors of the model
CASCOR-3, which has no regularization strategy, are noticeably
larger than those achieved by CASCOR-WD+RR-3.
In general, for this series, the fully regularized CASCOR
models (regularized between the input and hidden layers and
between the hidden and output layers) achieve lower errors than
most models regularized with only one technique. Fully
regularized models are also more appropriate for forecasting,
given that they largely control the causes of overfitting.
Forecasting the mean monthly prices of the dispatch
contracts in the wholesale electricity market of Colombia is
a complex task due to changes in the annual cyclical
pattern, as well as several changes in the long-term
trend. We used the first 130 observations to estimate the model
parameters, while the remainder were used to evaluate
the predictive capability. The results indicate that fully
regularized CASCOR networks predict more accurately than the
MLP, the ARIMA model and the CASCOR networks without
regularization. The proposed procedure allows finding models
with better generalization than other proposals in the
literature.
[1] J. D. Velásquez, I. Dyner, and R. C. Souza, "¿Por qué es tan difícil
obtener buenos pronósticos de los precios de la electricidad en
mercados competitivos?," Cuadernos de Administración, no. 20,
pp. 259-282, 2007.
[2] A. Conejo, J. Contreras, R. Espínola, and M. Plazas, "Forecasting
electricity prices for a day-ahead pool-based electric energy market,"
International Journal of Forecasting, no. 21, pp. 435-462, 2005.
[3] Y. Hong and C. Lee, "A neuro-fuzzy price forecasting approach in
deregulated electricity markets," Electric Power Systems Research,
no. 73, pp. 151-157, 2005.
[4] R. Gareta, A. Gil, A. Monzón, and L.M. Romeo, "Las redes
neuronales como herramienta para predecir el precio de la energía
eléctrica," Energía: Ingeniería energética y medioambiental, vol. 30,
no. 180, pp. 67-72, 2004, ISSN 0210-2056.
[5] H.S. Hippert, C.E. Pedreira, and R.C. Souza, "Neural networks for
short-term load forecasting: a review and evaluation," IEEE
Transactions on Power Systems, vol. 16, no. 1, pp. 44-55, Feb. 2001.
[6] Scott E. Fahlman and Christian Lebiere, "The Cascade-Correlation
Learning Architecture," Advances in Neural Information Processing
Systems, vol. 2, pp. 524-532, 1990.
[7] Fernán A. Villa, Juan D. Velásquez, and Reinaldo C. Souza, "Una
aproximación a la regularización de redes cascada-correlación para la
predicción de series de tiempo," Investigación Operacional, no. 28,
pp. 151-161, 2008.
[8] H.S. Hippert, D.W. Bunn, and R.C. Souza, "Large neural networks
for electricity load forecasting: Are they overfitted?," International
Journal of Forecasting, vol. 21, no. 3, pp. 425-434, 2005.
[9] G.E. Hinton, "Connectionist learning procedures," Artificial
Intelligence, no. 40, pp. 185-243, 1989.
[10] Andreas S. Weigend, David E. Rumelhart, and Bernardo A.
Huberman, "Generalization by weight-elimination with application to
forecasting," in Advances in Neural Information Processing Systems,
R. P. Lippmann, J. E. Moody, and D. S. Touretzky, Eds. San Mateo,
CA, USA: Morgan Kaufmann Publishers Inc., 1991, vol. 3,
pp. 875-882, ISBN 1-55860-184-8.
[11] Ajoy K. Palit and Dobrivoje Popovic, Computational Intelligence in
Time Series Forecasting. London: Springer, 2005.
[12] A. E. Hoerl and R. W. Kennard, "Ridge Regression: Biased
Estimation for Nonorthogonal Problems," Technometrics, vol. 12,
no. 1, pp. 55-67, 1970.
[13] Donald W. Marquardt and Ronald D. Snee, "Ridge regression in
practice," The American Statistician, vol. 29, no. 1, pp. 3-20, Feb.
1975.
[14] Fernán A. Villa, Juan D. Velásquez, and Patricia Jaramillo,
"Conrprop: un algoritmo para la optimización de funciones no
lineales con restricciones," Revista Facultad de Ingeniería
Universidad de Antioquia, no. 50, pp. 188-194, 2009.
[15] R.J. Hyndman and Y. Khandakar, "Automatic time series forecasting:
The forecast package for R," Journal of Statistical Software, vol. 26,
no. 3, 2008.