
Contemporary Logistics 13 (2013) 1838-739X
Contents lists available at SEI
Contemporary Logistics
journal homepage: www.seiofbluemountain.com

Research on an Optimal Regression Model Based on the Least-absolute Criteria and Its Application in Regional Economic Forecasting

Jianhua XIAO
School of Economics and Management, Wuyi University, Jiangmen 529020, Guangdong, P.R. China

KEYWORDS: Statistical learning theory; Least-absolute criteria; Multiple regression; Economic forecasting

ABSTRACT: Drawing on results from statistical learning theory, this paper proposes an optimal regression model based on the least-absolute criteria (the LaOR model), which aims to overcome the weak generalization ability of most existing regression functions. Compared with earlier regression models, the LaOR model takes both the regression error and the confidence interval into account, so the expected risk of the regression can be reduced effectively. Finally, the LaOR model is applied to short-term forecasting of regional economic development, taking Jiangmen, Guangdong as an example, and the results are acceptable.

© ST. PLUM-BLOSSOM PRESS PTY LTD

1 Introduction

In existing parameter estimation for regression models, whether linear or nonlinear, researchers tend to hold to two things: first, the least squares criterion; second, the principle of empirical risk minimization. In fact, the study of the least absolute criteria began nearly 40 years earlier than that of the least squares criterion. However, because the least absolute criteria pose a nondifferentiable problem, their computation is much more difficult than that of least squares, which is why their development has been relatively slow [1]. Although the theory of the least squares criterion is well developed and has achieved satisfactory results in some regression problems, it suffers from two very obvious shortcomings.
Firstly, when the sample data are relatively few and contain outliers, a regression model fitted by the least squares criterion predicts unknown samples poorly. The reason is that, compared with normal data, the deviation of abnormal data is much larger, and its squared value larger still; to hold down the sum of squares, the regression model has to "accommodate" these outliers, which ultimately magnifies their influence and distorts the model [2]. Secondly, the least squares criterion works well only on the premise that the random error in the regression model follows a normal distribution. This premise cannot always be satisfied, especially in certain quantitative-economics problems whose errors follow heavy-tailed distributions. It has already been proved that in these circumstances the statistical performance of the least absolute criteria is better than that of the least squares criterion [3]. Taking prediction as an example, the purpose of regression modeling is to build a mathematical model that is as faithful to reality as possible by learning from known data samples, and ultimately to predict future values. The standard for judging a regression model is therefore its generalization ability, that is, the value of its expected risk.

Corresponding author. E-mail: [email protected]
English edition copyright © ST. PLUM-BLOSSOM PRESS PTY LTD. DOI: 10.5503/J.CL.2013.13.005

Unfortunately, the theory of statistical learning proposed by V. Vapnik and others points out [4] that minimal empirical risk does not guarantee minimal expected risk, and hence cannot guarantee the generalization ability of the regression function. The discussion above demonstrates that existing parameter estimation methods for regression models are deficient.
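The asymmetry between the two criteria is easy to see numerically (an illustrative sketch, not from the paper): squaring a large residual inflates its contribution to the total far more than taking its absolute value does.

```python
# A residual of 10 contributes 100 to a squared-error sum but only 10 to
# an absolute-error sum, so least squares must "accommodate" the outlier
# much more strongly than least absolute deviations.
residuals = [0.5, -0.3, 0.4, 10.0]   # the last value plays the outlier
sum_squared = sum(r * r for r in residuals)
sum_absolute = sum(abs(r) for r in residuals)
print(sum_squared, sum_absolute)
# the outlier supplies 100 of ~100.5 in the squared sum,
# but only 10 of ~11.2 in the absolute sum
```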
In order to solve these problems, this paper combines research findings from statistical learning theory and proposes an optimal regression model based on the least absolute criteria (referred to as the LaOR model).

2 Optimal Regression Model Based on the Least Absolute Criteria

2.1 Least absolute criteria

Taking multiple linear regression as an example, consider the $n$ given learning samples

$$(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n), \quad x_i \in \mathbb{R}^d,\ y_i \in \mathbb{R},\ i = 1, 2, \ldots, n \quad (1)$$

The purpose of linear regression is to find the regression function

$$f(x) = \langle w, x \rangle + b \quad (2)$$

where $\langle w, x \rangle$ is the inner product of $w$ and $x$, and the parameters minimize the error under some criterion:

$$\min R_{emp}(f) = \sum_{i=1}^{n} g\bigl(y_i, f(x_i)\bigr) \quad (3)$$

where $g$ is a loss function. Until now, the most commonly used criterion has been the least squares criterion:

$$\min R_{emp}(f) = \sum_{i=1}^{n} \bigl(y_i - f(x_i)\bigr)^2 \quad (4)$$

It is easy to see that formula (4) accumulates the squares of the errors and thus amplifies the relative contribution of each sample point's error to the regression function. Once there are outliers in the sample data, reducing the total error inevitably pulls the regression function toward the outliers. To reduce the interference of abnormal sample points, a natural idea is to use the absolute value of the error instead of its square:

$$\min R_{emp}(f) = \sum_{i=1}^{n} \bigl| y_i - f(x_i) \bigr| \quad (5)$$

This is known as the least absolute criteria.

2.2 Least absolute criteria under the kernel method

Formulas (4) and (5) have something in common: both accumulate the errors (or squared errors) at the existing sample points, a quantity called the empirical risk; the corresponding learning principle is called empirical risk minimization. The true goal of regression modeling, however, is to minimize the expected risk, which under the least absolute criterion is

$$\min R(f) = \int \bigl| y - f(x) \bigr| \, dF(x, y) \quad (6)$$

where the joint distribution function $F(x, y)$ is unknown.
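To see the contrast between formulas (4) and (5) in practice, here is a minimal sketch (not from the paper; assumes NumPy and SciPy are available) that fits a line to data containing one injected outlier under both criteria. The nondifferentiable least-absolute objective is handled by recasting it as a linear program with slack variables $t_i \ge |y_i - \langle w, x_i \rangle - b|$.

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
n = 20
x = np.linspace(0.0, 10.0, n)
y = 2.0 * x + 1.0 + rng.normal(0.0, 0.2, n)   # true line: y = 2x + 1
y[5] += 15.0                                   # inject a single outlier

X = np.column_stack([x, np.ones(n)])           # design matrix [x, 1]

# Least squares criterion, formula (4): closed-form solution
w_ls, *_ = np.linalg.lstsq(X, y, rcond=None)

# Least absolute criteria, formula (5): pose as a linear program
# minimizing sum(t_i) subject to -t_i <= y_i - (X w)_i <= t_i
c = np.concatenate([np.zeros(2), np.ones(n)])
A = np.block([[X, -np.eye(n)], [-X, -np.eye(n)]])
b = np.concatenate([y, -y])
res = linprog(c, A_ub=A, b_ub=b,
              bounds=[(None, None)] * 2 + [(0, None)] * n)
w_la = res.x[:2]

print("least squares  slope, intercept:", w_ls)   # dragged by the outlier
print("least absolute slope, intercept:", w_la)   # stays near (2, 1)
```

The least-squares fit is visibly pulled toward the outlier, while the least-absolute fit stays close to the true line, exactly as Section 1 argues.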
Based on the research of statistical learning theory, in order to minimize $R(f)$ we must consider both the empirical risk and the confidence interval; this is structural risk minimization. Combining the research results of Smola and others [5], under the least absolute criterion the parameters of the regression function (2) should, to minimize the structural risk, satisfy

$$\min \Phi(w, \xi, \xi^*) = \frac{1}{2} \langle w, w \rangle + C \sum_{i=1}^{l} (\xi_i + \xi_i^*) \quad (7)$$

$$\text{s.t.} \quad y_i - \langle w, x_i \rangle - b \le \xi_i, \quad -\bigl(y_i - \langle w, x_i \rangle - b\bigr) \le \xi_i^*, \quad \xi_i, \xi_i^* \ge 0 \quad (8)$$

where $\xi_i, \xi_i^*$ are the fitting errors of each learning sample under the least absolute criteria, and the parameter $C$ trades off the empirical risk against the confidence interval. Define the Lagrange function

$$L = \frac{1}{2} \langle w, w \rangle + C \sum_{i=1}^{l} (\xi_i + \xi_i^*) - \sum_{i=1}^{l} \alpha_i \bigl(\xi_i - y_i + \langle w, x_i \rangle + b\bigr) - \sum_{i=1}^{l} \alpha_i^* \bigl(\xi_i^* + y_i - \langle w, x_i \rangle - b\bigr) - \sum_{i=1}^{l} (\eta_i \xi_i + \eta_i^* \xi_i^*) \quad (9)$$

At the saddle point, $L$ is minimized with respect to $w, b, \xi_i, \xi_i^*$; setting the corresponding partial derivatives to zero gives

$$\partial_b L = \sum_{i=1}^{l} (\alpha_i^* - \alpha_i) = 0 \quad (10)$$

$$\partial_w L = w - \sum_{i=1}^{l} (\alpha_i - \alpha_i^*) x_i = 0 \quad (11)$$

$$\partial_{\xi_i} L = C - \alpha_i - \eta_i = 0 \quad (12)$$

$$\partial_{\xi_i^*} L = C - \alpha_i^* - \eta_i^* = 0 \quad (13)$$

Substituting formulas (10)-(13) into formula (9), we obtain the optimization problem

$$\max \; -\frac{1}{2} \sum_{i,j=1}^{l} (\alpha_i - \alpha_i^*)(\alpha_j - \alpha_j^*) \langle x_i, x_j \rangle + \sum_{i=1}^{l} y_i (\alpha_i - \alpha_i^*) \quad (14)$$

$$\text{s.t.} \quad \sum_{i=1}^{l} (\alpha_i - \alpha_i^*) = 0, \quad \alpha_i, \alpha_i^* \in [0, C]$$

This is the optimal regression model based on the least absolute criteria (the LaOR model). We call this model optimal because the modeling process takes both the fitting error and the confidence interval into account, thereby ensuring generalization ability. The $\alpha_i, \alpha_i^*$ can be solved from formula (14). In fact, by the KKT conditions only a portion of the $\alpha_i, \alpha_i^*$ are nonzero, and the corresponding training samples are called support vectors.
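To make the dual problem (14) concrete, here is a small numerical sketch (an illustration, not the author's code; assumes NumPy and SciPy). Writing $\beta_i = \alpha_i - \alpha_i^*$, which ranges over $[-C, C]$ since at the optimum at most one of the pair is nonzero, the dual becomes a quadratic program that a general-purpose solver can handle at toy sizes.

```python
import numpy as np
from scipy.optimize import minimize

# Toy 1-D data lying exactly on y = 2x + 1 (illustrative values only)
rng = np.random.default_rng(1)
x = rng.uniform(0.0, 10.0, 15)
y = 2.0 * x + 1.0
C = 5.0                      # same C as the experiment in Section 3

K = np.outer(x, x)           # linear kernel <x_i, x_j>

# Negative of the dual objective of (14), in beta_i = alpha_i - alpha_i*
def neg_dual(beta):
    return 0.5 * beta @ K @ beta - y @ beta

res = minimize(neg_dual, np.zeros_like(y), method="SLSQP",
               bounds=[(-C, C)] * y.size,
               constraints={"type": "eq", "fun": lambda b: b.sum()})
beta = res.x

# From (11): w = sum_i beta_i x_i; samples with beta_i != 0 are the
# support vectors
w = beta @ x
b = float(np.mean(y - w * x))   # residuals are ~0 on this noiseless toy

def f(x_new):
    """Support-vector expansion f(x) = sum_i beta_i <x_i, x> + b."""
    return float(beta @ (x * x_new) + b)

print("recovered slope:", w)    # close to the true slope 2
print("f(4.0):", f(4.0))        # close to 2*4 + 1 = 9
```

Dedicated SMO-style solvers are used in practice; the point here is only that the program defined by (14) recovers the generating line on clean data.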
Further, from formula (11),

$$w = \sum_{i=1}^{l} (\alpha_i - \alpha_i^*) x_i \quad (15)$$

Finally, the regression equation is obtained from formulas (2) and (15):

$$f(x) = \sum_{i=1}^{l} (\alpha_i - \alpha_i^*) \langle x_i, x \rangle + b \quad (16)$$

The parameter $b$ above can be calculated as follows [6]:

$$b = -\frac{1}{2} \langle w, x_r + x_s \rangle \quad (17)$$

where $x_r$ and $x_s$ each belong to one of the two different types of support vectors.

3 The Application of the LaOR Model in Multiple Regression: A Case Study of Economic Prediction in Jiangmen

According to qualitative analysis and calculation of economic development in Jiangmen, Guangdong Province [7], the short-term prediction of year $t$'s economic growth is closely related to the following eight indicators of year $t-1$: Jiangmen's GDP, Guangdong's GDP, total export value of foreign trade, financial expenditure, total retail sales of social consumer goods, investment in fixed assets, fixed asset investment, and the actual utilization of foreign capital. Considering the dynamic nature of the economic system, and in order to reflect the importance of recent data, the sample data are first processed with a weighted geometric method, giving larger weights to the samples closest to the forecast year. To illustrate the validity of the LaOR model, the author used the data from 1985 to 1997 as learning samples and the data from 1998 to 2002 as test samples, applied the LaOR model of formula (14) with $C = 5$, and obtained the results shown in Figure 1 and Table 1.

The solid line in Figure 1 is the actual value of GDP for each year; "*" marks the LaOR model's prediction for the corresponding year.
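The paper does not spell out its weighted geometric preprocessing. A common scheme consistent with the description above (purely an assumption, including the decay ratio) is geometric decay, so that learning samples nearer the forecast year receive larger weights:

```python
import numpy as np

# Geometric weighting: the most recent learning year gets weight 1, and
# each step further back multiplies the weight by a ratio r < 1
# (r = 0.9 is an assumed value, not taken from the paper)
years = np.arange(1985, 1998)        # learning-sample years from Section 3
r = 0.9
weights = r ** (years.max() - years)
weights /= weights.sum()             # normalize the weights to sum to 1

print(dict(zip(years.tolist(), np.round(weights, 4).tolist())))
```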
Figure 1 Comparison between the predicted values based on the LaOR model and the actual values

Table 1 Comparison between forecast GDP and actual GDP

Year    Predicted GDP    Actual GDP    Error (%)
2004    4 928 100        4 818 800      2.2692
2005    5 127 100        5 146 900     -0.3859
2006    5 529 200        5 675 100     -2.5718
2007    6 232 800        6 151 600      1.3196
2008    6 584 200        6 608 200     -0.3635

From Figure 1 and Table 1 it can be seen that the forecast error is less than 3%. Considering the complexity of the actual economic system, the forecasting results obtained by the LaOR model are good; indeed, the literature suggests that the forecast errors of similar studies are often more than 5%.

4 Conclusion

Statistical learning theory provides a new research avenue for modeling complex economic data. Based on the characteristics of regional economic development, the author has proposed an optimal regression model by combining statistical learning theory with the least absolute criteria. As the model is built on the basis of statistical learning theory, it has better generalization ability.

Acknowledgement: The project received financial support from the Natural Science Foundation of Guangdong Province (S2011010006103).

References
[1] XIE Kaigui, SONG Qiankun, ZHOU Jiaqi. The Research Based on Least-absolute Criteria of Linear Regression Model [J]. Journal of System Simulation, 2008, 14(2): 99-102 (in Chinese)
[2] DONG Jian, XIE Kaigui. The Study Based on Least-Absolute Criteria of Nonlinear Regression Model [J]. Journal of Chongqing Teachers College (Natural Science Edition), 2009, 18(4): 71-74 (in Chinese)
[3] CHEN Xiru. Linear Regression Based on Least-absolute Criteria [J]. Application of Statistics and Management, 1989, 8(5): 48-55 (in Chinese)
[4] V. Vapnik. The Nature of Statistical Learning Theory [M]. New York: Springer-Verlag, 1995
[5] Smola A J, Schölkopf B. A Tutorial on Support Vector Regression [R]. NeuroCOLT TR NC-TR-98-030.
Royal Holloway College, University of London, UK, 1998
[6] S. Gunn. Support Vector Machines for Classification and Regression [R]. Technical Report, University of Southampton, 2003
[7] LEE Yunmeng, XIAO Jianhua. The Short-term GDP Forecasting Model Based on Support Vector Machines [J]. Inner Mongolia University, 2004, 35(4): 438-441 (in Chinese)