Regression Applying on Forecast of Time Series and Its Solution on Web Pages
— Take the Application on Economy Prediction in Fujian as an Example
DING Yuechao
Computer Engineering College
Jimei University
Xiamen 361021, China
[email protected]
SUN Yang
Computer Engineering College
Jimei University
Xiamen 361021, China
[email protected]
Abstract- Regression analysis is a statistical method for
determining the dependency between two or more variables. In
general, the data analyzed by regression are not time series.
This paper summarizes how well shifting regression and trend
regression have forecast the Fujian economy since they were put
forward in 2005 and applied to time series prediction.
Meanwhile, a web-based implementation of forecast-type
regression is introduced, and the software is available on the
Internet for free use. As long as the user provides the data to
be analyzed and follows the operating steps, the future value of
any time series variable can be predicted. Several economic
indicators of Fujian for the following two years are forecast;
all of them remain to be verified against the actual economic
results.
Keywords- forecast; time series; regression; web page
I. INTRODUCTION
Forecasting means predicting what will happen in the future
based on accurate statistical data, by combining scientific
methods with the perspective of history, actual conditions
and objective laws. The data generated by economic
development are time series data, and the future values of a
time series can be predicted.
Time series data change with time and are hence a function
of time. Time series analysis aims to discover the pattern of
an indicator's changes over time. In general, time series
smoothing methods are applied to forecast a single indicator.
They include the moving average, exponential smoothing,
differenced exponential smoothing, adaptive filtering, the
linear model, the polynomial model, the exponential curve
model, the modified exponential curve model, the growth curve
model and seasonal variation [1] [2].
However, time series data are the combined result of
multiple factors. With simple regression, neither the linear
nor the curved extrapolation of a single factor can accurately
predict the new value. With multivariate regression [3], the
value of the dependent variable can be obtained only when the
independent variables of the same period are known. In
general, the values of the independent variables and the
dependent variable are produced simultaneously, so there is no
need to predict the dependent variable's value: it is already
known.
Meanwhile, the order of the sample values is not considered
in a regression model. In other words, mathematics offers no
method of sequential regression. It is therefore necessary to
improve ordinary multivariate regression analysis to make it
applicable to prediction or trend analysis.
In 2005, we put forward a new method called “Multi-factor
Forward Shifting Regression” and published the paper
“A New Method—Forward Shifting Regression and Its
Application in Prediction of Per Capita GDP in Fujian” [4].
The macroeconomic indicators of Fujian were predicted. In
another paper, “Some Statistic Characteristics of Economy in
Fujian Province and its Developing Trend” [5], ten economic
indicators of Fujian for the following two years were forecast
and some statistical characteristics were analyzed. The latter
was reprinted in “Statistics and actuary”, an authoritative
Chinese periodical in Renmin University's series of reprinted
papers and journal materials. When the economic data came out
over the following two years, the forecasts proved
satisfactory. For Fujian in 2006 (predicted value outside the
brackets, actual value inside the brackets, similarly
hereinafter), the Gross output Value of Agriculture, Forestry,
Animal husbandry and Fishery (GVAFAF) is 1535 (1450) 100M
yuan; the Added Value of Industry (AVI) is 3471 (3312) 100M
yuan; the total Retail Sales of Consumer Goods (RSCG) is 2597
(2704) 100M yuan; the Freight Ton-Kilometers (FTK) is 1927
(1904) 100M t-km; the Passenger-Kilometers (PK) is 540 (525)
100M p-km; the per Capita Annual Income of Rural Households
(CAIRH) is 4869 (4835) yuan; the Per Capita GDP (PCGDP) is
21432 (21385) yuan. The forecasts made two years ahead also
gave satisfactory results. For Fujian in 2007, the GVAFAF is
1713 (1692) 100M yuan; the AVI is 3986 (4018) 100M yuan; the
RSCG is 2993 (3188) 100M yuan; the FTK is 2240 (2084) 100M
t-km; the PK is 621 (588) 100M p-km; the CAIRH is 5504 (5467)
yuan; the PCGDP is 25346 (25908) yuan.
In 2008, the shifting regression method was improved and a
new method, “Multi-factor Trend Regression”, was put forward.
In the new method, the forecasts for the next period obtained
from the last several periods are weighted. In recent years,
both “Forward Shifting Regression” and “Multi-factor Trend
Regression” have given satisfactory results when applied to a
dozen province-level areas in China.
II. FORECAST METHOD APPLYING REGRESSION ANALYSIS
A. Multivariate linear regression
When there are multiple independent variables, the linear
regression is a multivariate linear regression. Suppose
$x_1, x_2, \ldots, x_p$ are measurable or controllable
variables. If variable $y$ has a linear relation with
$x_1, x_2, \ldots, x_p$, then $n$ groups of data can be
collected from $n$ experiments:
$(y_i, x_{i1}, x_{i2}, \ldots, x_{ip})$, $i = 1, 2, \ldots, n$.
The multiple regression analysis model can be represented
as [3]
$$
\begin{aligned}
y_1 &= b_0 + b_1 x_{11} + b_2 x_{12} + \cdots + b_p x_{1p} + \varepsilon_1 \\
y_2 &= b_0 + b_1 x_{21} + b_2 x_{22} + \cdots + b_p x_{2p} + \varepsilon_2 \\
&\;\;\vdots \\
y_n &= b_0 + b_1 x_{n1} + b_2 x_{n2} + \cdots + b_p x_{np} + \varepsilon_n
\end{aligned}
$$
where $b_0, b_1, b_2, \ldots, b_p$ are $p+1$ undetermined
parameters, that is, the regression coefficients.
$\varepsilon_i$ denotes the influence of random factors on
$y_i$ during the tests, and it is usually ignored. By solving
the linear equations, the values of the parameters
$b_0, b_1, b_2, \ldots, b_p$ can be obtained, and the
$p$-variable regression equation is shown below:
$$
y = b_0 + b_1 x_1 + b_2 x_2 + \cdots + b_p x_p
$$
The goal of establishing a regression equation is to use it
for forecasting and control. In reality, the relationship
between the random variable $y$ and $x_1, x_2, \ldots, x_p$
cannot be predetermined. The linear regression model is only
an assumption made before the regression equation is solved,
so it needs to be tested statistically.
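The fit described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses NumPy's least-squares solver rather than the authors' software, the function name `fit_linear_regression` is hypothetical, and the data are synthetic values generated from a known equation so that the recovered coefficients can be checked.

```python
import numpy as np

def fit_linear_regression(X, y):
    """Return [b0, b1, ..., bp] that minimize the squared residuals."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Prepend a column of ones so that b0 is estimated as the intercept.
    design = np.column_stack([np.ones(len(X)), X])
    coeffs, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coeffs

# Synthetic data (not from the paper): y = 1 + 2*x1 + 3*x2, no noise.
X_demo = np.array([[1, 2], [2, 1], [3, 4], [4, 3], [5, 5]])
y_demo = 1 + 2 * X_demo[:, 0] + 3 * X_demo[:, 1]
b_demo = fit_linear_regression(X_demo, y_demo)
print(b_demo)  # approximately [1. 2. 3.]
```

With noisy real data the coefficients would only approximate the underlying relation, which is why the statistical test mentioned above is still needed.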
Regression analysis does not consider the order of time.
The value of the dependent variable is obtained merely from
the linear expression of the influencing factors. Furthermore,
some dependent variables, such as economic indicators, can be
computed only when the values of the other variables in the
same period are known. This is therefore not forecasting in
the strict sense: a true forecast gives the values before the
actual events occur.
B. Shifting regression
Based on the view that the values of the indicators at one
time are the foundation of the next value of the indicator to
be predicted, the dependent variable at a later time is a
function of all the independent variables at the previous
time, and the regression model can be written as [4]
$$
\begin{aligned}
y_2 &= b_0 + b_1 x_{11} + b_2 x_{12} + \cdots + b_p x_{1p} + \varepsilon_1 \\
y_3 &= b_0 + b_1 x_{21} + b_2 x_{22} + \cdots + b_p x_{2p} + \varepsilon_2 \\
&\;\;\vdots \\
y_n &= b_0 + b_1 x_{m1} + b_2 x_{m2} + \cdots + b_p x_{mp} + \varepsilon_m
\end{aligned}
$$
where $m = n - 1$. The other parameters are the same as in the
regression model above. This model has one line fewer than the
previous one, which means that one fewer sample enters the
calculation; this has little effect on the solution when a
large number of samples is available. The independent
variables' values of the last sample are not used in fitting
the model, but they can be put into the regression equation to
predict the dependent variable's value in the coming period.
This forecast method is called “Multi-factor Forward Shifting
Regression”.
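A minimal Python sketch of the one-period shift may help make the pairing of samples concrete. It assumes ordinary least squares via NumPy (the paper's software uses stepwise selection, which is not reproduced here), the function name `forward_shifting_forecast` is hypothetical, and the data are synthetic, built so that the next period's y depends exactly on the current period's x values.

```python
import numpy as np

def forward_shifting_forecast(X, y):
    """Fit y[t+1] on X[t], then forecast the period after the last sample."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Rows 1..n-1 of X are paired with y_2..y_n (m = n - 1 equations).
    design = np.column_stack([np.ones(len(X) - 1), X[:-1]])
    coeffs, *_ = np.linalg.lstsq(design, y[1:], rcond=None)
    # The last sample was left out of the fit; it predicts y_{n+1}.
    forecast = float(np.concatenate([[1.0], X[-1]]) @ coeffs)
    return coeffs, forecast

# Synthetic data (not from the paper): y[t+1] = 5 + 2*x1[t] + 0.5*x2[t].
X_demo = np.array([[1, 4], [2, 3], [3, 7], [4, 5],
                   [5, 9], [6, 8], [7, 12], [8, 10]], dtype=float)
y_demo = np.concatenate([[0.0], 5 + 2 * X_demo[:-1, 0] + 0.5 * X_demo[:-1, 1]])
coeffs_demo, forecast_demo = forward_shifting_forecast(X_demo, y_demo)
print(forecast_demo)  # approximately 26.0 = 5 + 2*8 + 0.5*10
```

The first value of `y_demo` is a placeholder: with a one-period shift, the first observed y never appears as a target.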
By analogy with shifting 1 period forward, the dependent
variable can be shifted 2 or 3 periods forward. If the
equation is built with a shift of 2 periods, the last but one
sample (produced in period $n-1$) can be put into the equation
to predict the dependent variable's value $y_{n+1}$ in period
$n+1$. If the equation is built with a shift of 3 periods, the
last but two sample (produced in period $n-2$) can likewise be
put into the equation to predict $y_{n+1}$. As shifting 1, 2
and 3 periods forward reflects the developing trend of the
multiple time series, this new method may be called
“Multi-factor Trend Regression”. Forecasting $y_{n+1}$ in this
way produces three predicted values, whose average can serve
as the final predicted value. A weighted average can also be
applied: the predicted values from the models shifting the
dependent variable 1, 2 and 3 periods forward may be given
weights of 0.5, 0.3 and 0.2 respectively, because the sample
from the nearest period has the greatest impact on the value
to be predicted.
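The combination of the three shifted models can be sketched as follows. This is an illustration only: the helper names `shifted_forecast` and `trend_regression_forecast` are hypothetical, ordinary least squares stands in for whatever variable selection the authors' software performs, and the data are a synthetic linear trend chosen so that all three models should agree.

```python
import numpy as np

def shifted_forecast(X, y, shift):
    """Fit y[t+shift] on X[t]; forecast y_{n+1} from the sample of period n+1-shift."""
    n = len(y)
    design = np.column_stack([np.ones(n - shift), X[:n - shift]])
    coeffs, *_ = np.linalg.lstsq(design, y[shift:], rcond=None)
    # 0-based row n - shift is the sample produced in period n + 1 - shift.
    return float(np.concatenate([[1.0], X[n - shift]]) @ coeffs)

def trend_regression_forecast(X, y, weights=(0.5, 0.3, 0.2)):
    """Weighted average of the forecasts from shifts of 1, 2 and 3 periods."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    forecasts = [shifted_forecast(X, y, s) for s in (1, 2, 3)]
    combined = sum(w * f for w, f in zip(weights, forecasts))
    return forecasts, combined

# Synthetic data (not from the paper): x_t = t and y_t = 10 + 3*t, t = 1..10.
X_demo = np.arange(1, 11).reshape(-1, 1)
y_demo = 10 + 3 * X_demo[:, 0]
forecasts_demo, combined_demo = trend_regression_forecast(X_demo, y_demo)
print(forecasts_demo, combined_demo)  # each approximately 43 = 10 + 3*11
```

Because the weights sum to 1, the combined value stays on the same scale as the individual forecasts; on real data the three forecasts differ and the weighting favors the most recent sample.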
III. THE REALIZATION OF SHIFTING REGRESSION ON WEB
Over the past several years, the new methods we put forward
were implemented as desktop programs. Because scholars have
been requesting the software, we designed two web-based
programs [6] with JavaEE technology and put them on the web,
one of which is “shifting regression” [7]. Open the website
http://210.34.136.253:8088/forecast/ and follow the
instructions to process your data and get the forecast result.
The flow chart is shown in Fig. 1.
Figure 1. Flow chart of shifting regression on web
While the program runs on the web, the interface for
selecting the dependent variable and the independent variables
is shown in Fig. 2, which also displays the data. You may
select any variables and samples (records). When selecting the
method, you have three choices: ordinary stepwise regression,
forward shifting regression and trend regression. Ordinary
regression can only give the functional relationship between
the dependent and independent variables and cannot predict
time series. Both of the other two methods can predict the
dependent variable's value in the new period. The data of
Fujian province during 1983-2009 [8] [9] are processed. The
dependent variable $y$ is the per capita GDP. Eleven
independent variables $x_1, x_2, \ldots, x_{11}$ are selected:
GVAFAF, AVI, RSCG, CPI, FTK, PK, FHS (Freight Handled at
Seaports), TGR (Total Government Revenue), TIFA (Total
Investment in Fixed Assets), CAIRH and NGR (Natural Growth
Rate). All the data from 1983 to 2009 are selected.
Figure 2. The interface for selecting the dependent variable, independent variables and samples for Fujian data
According to the number of samples and variables and the
estimated number of variables to be selected into the model,
the F-statistic thresholds for the significance tests of
entering and removing a variable are determined when running
the regression. Here, F1 and F2 are both set to 4.5. Next, the
regression coefficients for shifting the dependent variable 1
period forward were obtained; they are shown in Table I.
TABLE I. COEFFICIENTS OF REGRESSION EQUATION
Coefficient  b0       b1  b2     b3  b4  b5    b6    b7  b8     b9  b10  b11
Value        2700.61  0   2.519  0   0   3.23  7.52  0   5.396  0   0    -199.49
The regression equation is obtained and shown below:
$$
y = 2700.61 + 2.519 x_2 + 3.23 x_5 + 7.52 x_6 + 5.396 x_8 - 199.49 x_{11}
$$
That is:
PCGDP of next year = 2700.61 + 2.519 × AVI of previous year
+ 3.23 × FTK of previous year + 7.52 × PK of previous year
+ 5.396 × TGR of previous year - 199.49 × NGR of previous year
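As a small illustration, the fitted equation can be written as a Python function. The coefficients are those of the equation above; the function name, the dictionary layout and the input values in the demo are assumptions made for this sketch. The demo inputs are placeholders, not Fujian data, since the 2009 indicator values are not listed in this paper.

```python
# Nonzero coefficients of the fitted equation, keyed by indicator name.
COEFFS = {"AVI": 2.519, "FTK": 3.23, "PK": 7.52, "TGR": 5.396, "NGR": -199.49}
INTERCEPT = 2700.61

def predict_pcgdp_next_year(indicators):
    """Apply the fitted equation to one year's indicator values."""
    return INTERCEPT + sum(COEFFS[name] * indicators[name] for name in COEFFS)

# Placeholder inputs (not real data): all indicators zero gives the intercept.
zero_inputs = {name: 0.0 for name in COEFFS}
baseline = predict_pcgdp_next_year(zero_inputs)

# Raising AVI by 100 (100M yuan) raises the forecast by 2.519 * 100 yuan.
delta = predict_pcgdp_next_year({**zero_inputs, "AVI": 100.0}) - baseline
print(baseline, delta)
```

Feeding in the actual indicator values of one year gives the per capita GDP forecast for the following year.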
According to the equation, the software substitutes the
sample data of 2009, and the PCGDP of the next year (2010)
turns out to be 35495 yuan. The predicted PCGDP from 1984 to
2009 is shown in Table II. The errors in the first decade are
comparatively high; in particular, they are as high as about
50% in the first two years. From 1994 to 2009 the errors are
small: none exceeds 3.41% and most are about 1%. The results
are quite good.
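The error percentages in Table II appear consistent with the relative error |predicted - real| / real × 100. The sketch below checks this on three rows of the table (1985, 2007 and 2009); the function name is hypothetical, and the formula is an assumption inferred from the tabulated numbers rather than stated in the paper. A few other rows differ in the last digit, presumably because the table was computed from unrounded values.

```python
def relative_error_percent(real, predicted):
    """Absolute forecast error as a percentage of the real value."""
    return abs(predicted - real) / real * 100

# (real, predicted) PCGDP in yuan, taken from Table II.
rows = {1985: (737, 348), 2007: (25908, 25361), 2009: (33051, 33589)}
errors = {year: round(relative_error_percent(real, pred), 2)
          for year, (real, pred) in rows.items()}
print(errors)  # {1985: 52.78, 2007: 2.11, 2009: 1.63}
```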
TABLE II. COMPARISON OF ESTIMATED VALUES AND REAL VALUES
Year   Real value (yuan)   Predicted value (yuan)   Error %
1984   591                 342                      42.19
1985   737                 348                      52.78
1986   809                 965                      19.27
1987   999                 1065                     6.57
1988   1349                1181                     12.43
1989   1589                1629                     2.53
1990   1763                1820                     3.21
1991   2041                2051                     0.49
1992   2557                3072                     20.13
1993   3556                3956                     11.24
1994   5193                5194                     0.02
1995   6526                6688                     2.48
1996   7646                7557                     1.17
1997   8775                8532                     2.77
1998   9603                9276                     3.41
1999   10323               10251                    0.70
2000   11194               11127                    0.59
2001   11892               11865                    0.23
2002   12938               13126                    1.46
2003   14333               14295                    0.26
2004   16469               16558                    0.54
2005   18646               18861                    1.15
2006   21385               21606                    1.03
2007   25908               25361                    2.11
2008   30123               29682                    1.46
2009   33051               33589                    1.63
IV. THE FORECAST OF MACROECONOMIC INDICATORS IN THE FOLLOWING TWO YEARS
There are twelve indicators in the data of Fujian province.
In fact, any of these indicators can be the dependent
variable, with the others serving as independent variables to
forecast the dependent variable's value in 2010. If we do this
for all the variables, the estimated values of all the
indicators can be obtained. This paper selects the “Forward
Shifting Regression” method to forecast the values in 2010.
NGR does not vary much from year to year, so it is not
discussed for now.
Putting the predicted values for 2010 directly into the
equation gives the predicted values for 2011. Alternatively,
if the predicted values for 2010 are added to the data table
and the equation is rebuilt, the values for 2011 can also be
forecast.
In this paper, the values of the economic indicators of
Fujian in 2010 and 2011 are predicted and shown in Table III.
The values for 2011 are predicted by the “Multi-factor Trend
Regression” method.
TABLE III. THE FORECAST OF ECONOMIC INDICATORS OF FUJIAN PROVINCE IN 2010 AND 2011
Indicator  Unit         2010    2011
GVAFAF     100M yuan    2150    2228
AVI        100M yuan    5423    5730
RSCG       100M yuan    4831    5279
CPI        %            97      99
FTK        100M t-km    2743    2829
PK         100M p-km    697     681
FHS        10000 tons   28681   31810
TGR        100M yuan    2073    2317
TIFA       100M yuan    7539    8942
CAIRH      yuan         6963    7464
PCGDP      yuan         35495   39322
V. CONCLUSIONS
“Multi-factor Forward Shifting Regression” and “Multi-factor
Trend Regression” have been implemented as web application
software and put on the Internet. As the forecasts of economic
indicators in Fujian and other provinces show, the methods
have proved effective over the past years. Here, the economic
indicators for the next two years are predicted, to be
verified against the actual economic results. The methods
improve on previous time series forecasting methods, which
only extrapolate a series from itself without taking multiple
factors (variables) into consideration. Besides, they overcome
the weakness of forecasting by general regression analysis,
which relies on independent variables from the same period. An
economic indicator is a function of interrelated economic
factors. The indicators (independent variables) in one period
may affect the indicator (dependent variable) to be predicted
in the next period, which is the basis of the new methods. The
new methods bring a new approach to economic forecasting.
REFERENCES
[1] QIAN Zhongwei, LI Shengda. Quantitative Method of Economic
Forecast. Chongqing Press: Chongqing, 1994, pp. 135-192.
[2] Bruce L. Bowerman, Richard T. O’Connell. Forecasting and Time
Series (3rd Ed.). Brooks/Cole and China Machine Press, 2003,
pp. 291-427.
[3] James M Lattin, J Douglas Carroll, Paul E Green. Analyzing
Multivariate Data. Brooks/Cole and China Machine Press, 2003,
pp. 38-80.
[4] SHEN Jun, DING Yuechao, “A New Method of Forward Regression
Analysis and Its Application in Prediction of Per Capita GDP in
Fujian Province”, Journal of Jimei University (Nature Science),
2006, vol. 11, no. 4, pp. 375-380.
[5] SHEN Jun, DING Yuechao, “Some Statistic Characteristics of
Economy Development in Fujian and its Developing Trend”, Journal
of Jimei University (Philosophy and Social Science), 2006,
vol. 9, no. 4, pp. 38-43.
[6] DING Yuechao, “Announcement on Sharing Two Softwares”,
EB/OL, http://210.34.136.253:8488/contribute.htm, 2010-04-18.
[7] DING Yuechao, “Forecast-type Regression (Shifting Regression)
software”, http://210.34.136.253:8088/forecast/, 2010-04-18.
[8] Bureau of Statistics of Fujian, Fujian Statistics Yearbook 2009.
China Statistics Press, Beijing, 2009.
[9] Bureau of Statistics of Fujian, “Statistical Bulletin 2009 of
Fujian National Economy and Society Development”, EB/OL,
http://www.fujian.gov.cn/zwgk/tjxx/tjgb/201003/t20100306_197570.htm,
2009-02-25.