Application of Projection Pursuit Classification Model (PPC) Based on Particle Swarm Optimization Algorithm (PSO) in Water Quality Evaluation ¹

WANG Bai, ZHANG Zhongxue *
Water and Civil Engineering College, Northeast Agricultural University, Harbin 150030, PRC
[email protected]

Abstract: Based on the national groundwater quality standard, the dimension-reducing technique of the Projection Pursuit Classification model (PPC) was applied to water quality evaluation. Because the Particle Swarm Optimization algorithm (PSO) is conceptually simple and converges quickly, PSO was used to optimize the projection direction so that the high-dimensional indices are converted into a low-dimensional projection, and the projection function value was used to appraise water quality. The results showed that the model gives an appropriate assessment of water quality, handles the complexity of multiple factors, and can be applied to the comprehensive evaluation of other environmental problems.

Key words: water quality; projection pursuit; particle swarm optimization algorithm

1. Introduction

Integrated evaluation of water quality establishes a mathematical model from measured water quality values and uses it to judge the overall quality level of a water body, providing a basis for decisions in water management and water pollution prevention. The appraisal results obtained from the single indices of an actual water sample are often inconsistent with one another, so the water quality level cannot be read directly from the water quality standard table. For this reason the Grey Clustering Method, Fuzzy Comprehensive Evaluation, neural networks (NN) and other methods were proposed one after another. The results of these models are all discrete, semi-quantitative water quality levels, so their resolution between levels is low.
The water quality indices of an actual water body are continuous real numbers. Under the appraisal methods in common use at present, samples assigned to the same level can therefore still differ markedly in their index values, which makes the results inconvenient for guiding water quality management. In addition, there is as yet no established way to judge the water quality appraisal standard itself. Current studies on integrated water quality appraisal still focus on how to combine multiple indices into a single index, because a comprehensive appraisal is possible only in a one-dimensional space. The Projection Pursuit Classification model (PPC) based on the Particle Swarm Optimization algorithm (PSO) is therefore put forward as a new method of water quality evaluation.

¹ The paper is supported by Heilongjiang provincial key research program GB06B106 and the program for innovative research teams of Northeast Agricultural University.
² First author: Wang Bai (1980-), male, master's degree student, researching agricultural water saving and sustainable water resources utilization.
³ * Corresponding author: Zhang Zhongxue, E-mail [email protected]

2. Projection Pursuit Model (PP)

2.1 Brief Introduction of PP model

Projection Pursuit is a method for analysing high-dimensional data that can be used for both exploratory and confirmatory analysis. Friedman and Tukey (1974) imitated the way experienced data analysts cluster and sort data, using a new index that combines the overall spread of the data with its local density. The characteristics of the PP method can be summarized as follows:

(1) Many data sets in natural science do not follow a normal distribution, or little prior experience about the data is available, so the structure and characteristics must be found from the data themselves. The PP method can overcome the serious difficulties caused by high-dimensional data.
Because its analysis is carried out in a low-dimensional subspace, the data points are dense enough in a 1- to 3-dimensional projection space, and this is sufficient to find the structure or characteristics of the data in the projection space.
(2) The PP method can remove the interference of variables that have little or no connection with the structure and characteristics of the data.
(3) The PP method offers a way to solve high-dimensional problems with one-dimensional statistical methods. It projects the high-dimensional data onto a one-dimensional subspace, analyses the projected data, and finds the better projection by comparing the analysis results of different one-dimensional projections.
(4) Like other nonparametric methods, the PP method can solve some nonlinear problems. Although PP is based on linear projection of the data, what it searches for is nonlinear structure, so it can handle nonlinear problems such as multivariate nonlinear regression.

2.2 Steps of PPC modeling [2][3][4]

Modeling with the Projection Pursuit Classification model (PPC) includes the following steps.

Step one: normalization of the sample appraisal indices. Let the sample set be $\{x^*(i,j) \mid i = 1 \sim n,\ j = 1 \sim p\}$, where $x^*(i,j)$ is the value of index $j$ for sample $i$, and $n$ and $p$ are the number of samples (sample size) and the number of indices respectively. To eliminate the dimensions of the indices and unify their ranges, the values are normalized with the following formulas.

For an index where larger is better:

$$x(i,j) = \frac{x^*(i,j) - x_{\min}(j)}{x_{\max}(j) - x_{\min}(j)}$$

For an index where smaller is better:

$$x(i,j) = \frac{x_{\max}(j) - x^*(i,j)}{x_{\max}(j) - x_{\min}(j)}$$

where $x_{\max}(j)$ and $x_{\min}(j)$ are the maximum and minimum values of index $j$ respectively, and $x(i,j)$ is the normalized index value.
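The two normalization formulas above can be sketched in a few lines of NumPy. The sketch below is illustrative only: the function name `normalize` and the array layout are assumptions, and the input values are the class limits and measured values that the paper reports later in Tab. 1 and Tab. 2. All nine indices are treated as smaller-is-better, and the class I limit of 50 mg/l for chloride and sulfate is taken to be the value implied by the normalized results in Tab. 3.

```python
import numpy as np

# Class limits I-IV (Tab. 2). Columns: chroma, turbidity, iron, manganese,
# chloride, fluoride, sulfate, total hardness, nitrate.
# Assumption: class I chloride and sulfate limits are 50 mg/l, the value
# implied by the normalized results in Tab. 3.
standards = np.array([
    [5, 3, 0.1, 0.05, 50, 1.0, 50, 150, 2],
    [5, 3, 0.2, 0.05, 150, 1.0, 150, 300, 5],
    [15, 3, 0.3, 0.10, 250, 1.0, 250, 450, 20],
    [25, 10, 1.5, 1.00, 350, 2.0, 350, 550, 30],
], dtype=float)

# Measured values (Tab. 1): QianJin, ChuangYe, HongWei, QianShao farms.
measured = np.array([
    [20, 5, 0.6, 0.4, 19.8, 0, 21.6, 98, 23.7],
    [20, 5, 0.5, 0.3, 35.0, 0, 11.7, 98, 25.7],
    [20, 5, 0.5, 0.4, 27.6, 0, 20.5, 98, 25.5],
    [20, 5, 0.5, 0.3, 25.0, 0, 20.0, 98, 28.9],
], dtype=float)


def normalize(raw, larger_is_better):
    """Min-max normalize each column; flip columns where smaller is better."""
    raw = np.asarray(raw, dtype=float)
    lo, hi = raw.min(axis=0), raw.max(axis=0)
    span = hi - lo  # assumes every index varies across samples (span > 0)
    up = (raw - lo) / span    # larger-is-better formula
    down = (hi - raw) / span  # smaller-is-better formula
    return np.where(larger_is_better, up, down)


# Samples 1-4 are the class limits, samples 5-8 the measurements.
raw = np.vstack([standards, measured])
x = normalize(raw, larger_is_better=np.zeros(raw.shape[1], dtype=bool))
print(np.round(x, 5))

# Tiny larger-is-better illustration with made-up values.
toy = normalize([[1.0], [3.0], [2.0]], larger_is_better=[True])
```

With these inputs the class IV row normalizes to all zeros and the class I chloride entry to about 0.9085, which agrees with the normalized values listed in Tab. 3.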
Step two: constructing the projection index function $Q(a)$. In the PP method the $p$-dimensional data $\{x^*(i,j) \mid j = 1 \sim p\}$, after normalization to $x(i,j)$, are combined into a one-dimensional projection value $z(i)$ along the projection direction $a = \{a(1), a(2), a(3), \cdots, a(p)\}$:

$$z(i) = \sum_{j=1}^{p} a(j)\, x(i,j), \quad i = 1 \sim n \tag{1}$$

The classification can then be made from the one-dimensional scatter of $\{z(i) \mid i = 1 \sim n\}$. In formula (1), $a$ is a unit-length vector. When constructing the projection index, the projection values $z(i)$ are required to be distributed so that the projection points are locally as dense as possible, ideally clustered into several groups, while the groups are spread as widely as possible overall. The projection index function can therefore be expressed as

$$Q(a) = S_z D_z$$

where $S_z$ is the standard deviation of the projection values $z(i)$ and $D_z$ is their local density, that is:

$$S_z = \sqrt{\frac{\sum_{i=1}^{n} \left( z(i) - E(z) \right)^2}{n - 1}} \tag{2}$$

$$D_z = \sum_{i=1}^{n} \sum_{j=1}^{n} \left( R - r(i,j) \right) \cdot u\left( R - r(i,j) \right) \tag{3}$$

where $E(z)$ is the mean of the sequence $\{z(i) \mid i = 1 \sim n\}$, and $R$ is the window radius of the local density. $R$ should be chosen so that the average number of projection points inside the window is not too small, which avoids a large moving-average deviation, while not growing too quickly as $n$ increases. $R$ can be determined by experiment and is commonly taken as $0.1 S_z$. $r(i,j) = |z(i) - z(j)|$ is the distance between samples, and $u(t)$ is the unit step function, equal to 1 when $t \ge 0$ and 0 when $t < 0$.

Step three: optimizing the projection index function. Once the sample values of every index are given, the projection index function $Q(a)$ varies only with the projection direction $a$. Different projection directions reflect different structural characteristics of the data, and the best projection direction is the one that best reveals some characteristic structure of the high-dimensional data.
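For a fixed set of projection values, the standard deviation $S_z$ and the local density $D_z$ defined above can be evaluated directly. The following sketch is a hedged illustration, not the authors' Matlab program: the function name `projection_index` and the `radius_factor` parameter are assumptions, with $R = 0.1 S_z$ as suggested in the text. It is applied to the eight projection values that the paper reports in Section 4.

```python
import numpy as np


def projection_index(z, radius_factor=0.1):
    """Return Q = S_z * D_z for a 1-D array of projection values z."""
    z = np.asarray(z, dtype=float)
    s_z = z.std(ddof=1)                     # S_z with n - 1 in the denominator
    R = radius_factor * s_z                 # window radius, R = 0.1 * S_z
    r = np.abs(z[:, None] - z[None, :])     # r(i, j) = |z(i) - z(j)|
    u = (R - r >= 0).astype(float)          # unit step u(R - r(i, j))
    d_z = np.sum((R - r) * u)               # local density D_z
    return s_z * d_z


# Projection values of samples 1-8 as reported in Section 4.
z_reported = [2.7498, 2.3100, 1.5774, 0.0, 2.3095, 2.3319, 2.3101, 2.3100]
q = projection_index(z_reported)
print(round(q, 3))
```

Evaluated on the reported projection values, the index comes out at about 1.938, in line with the best projection value of 1.9381 quoted in Section 4. The double sum includes the terms with $i = j$, which is what the formula for $D_z$ states literally.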
The best projection direction can therefore be estimated by maximizing the projection index function.

Objective function:

$$\max:\ Q(a) = S_z \cdot D_z \tag{4}$$

Constraint:

$$\text{s.t.}:\ \sum_{j=1}^{p} a^2(j) = 1 \tag{5}$$

This is a complex nonlinear optimization problem with $\{a(j) \mid j = 1 \sim p\}$ as the optimization variables, and it is hard to handle with traditional optimization methods. The real-coded accelerating genetic algorithm (RAGA), which simulates natural selection and the exchange of information between chromosomes within a population, has been applied to this high-dimensional global optimization problem [3]; in this paper PSO (Section 3) is used for the optimization instead.

Step four: classification (ranking). Substituting the best projection direction $a^*$ obtained in Step three into formula (1) gives the projection value $z^*(i)$ of each sample point. When $z^*(i)$ is compared with $z^*(j)$, the closer the two values are, the more likely the two samples belong to the same class. The grade order of the samples follows the order of $z^*(i)$: sorting $z^*(i)$ from large to small orders the samples from best to worst.

3. Particle Swarm Optimization algorithm (PSO)

3.1 Introduction of PSO

PSO is similar to other evolutionary algorithms in that it moves the individuals of a population towards better regions according to their fitness to the environment. The difference is that it does not apply evolutionary operators to the individuals. Instead, it treats each individual as a particle without mass or volume that flies through the search space at a certain velocity, and it adjusts that velocity according to the combined flying experience of the individual and the swarm. During the whole optimization process, the fitness value of each particle is determined by the value of the chosen objective function.
Each particle holds the following information: its current position; the best position it has found itself (Pi), which serves as the flying experience of the particle; and the best position found by all particles in the swarm (Pg, the best of all Pi), which can be regarded as the flying experience shared by the swarm. The velocity of each particle is therefore influenced by the movement history of both the particle itself and the swarm. Letting these historical best positions shape the current direction and speed coordinates the movement of the individual particle with that of the swarm.

3.2 Steps of PSO

(1) Initialization. Set the acceleration constants $c_1$ and $c_2$ and the maximum number of generations $T_{\max}$, and let the current generation number be $t = 1$. Generate $m$ particles $x_1, x_2, \ldots, x_m$ randomly in the defined space $R^n$ to form the initial swarm $X(1)$, and generate the initial displacements $v_1, v_2, \ldots, v_m$ randomly to form the displacement matrix $V(1)$.
(2) Evaluate the swarm $X(t)$ by calculating the fitness value of each particle in the solution space.
(3) Compare the current fitness value of each particle with that of its individual best Pi. If the current value is better than Pi, set Pi to the current value and set the position of Pi to the current position of the particle.
(4) Compare the fitness value of each particle with the swarm best Pg. If the current value is better than Pg, set Pg to the current value of the particle.
(5) Update the displacement direction and step length of the particles according to formulas (6) and (7) to generate the new swarm $X(t+1)$:

$$v_{id}^{t+1} = w\, v_{id}^{t} + c_1 r_1 \left( p_{id}^{t} - x_{id}^{t} \right) + c_2 r_2 \left( p_{gd}^{t} - x_{id}^{t} \right) \tag{6}$$

$$x_{id}^{t+1} = x_{id}^{t} + v_{id}^{t+1} \tag{7}$$

where, in an $n$-dimensional search space, the swarm $X = \{x_1, \ldots, x_i, \ldots, x_m\}$ is composed of $m$ particles; the position of particle $i$ is $x_i = (x_{i1}, x_{i2}, \ldots, x_{in})^T$, its velocity is $v_i = (v_{i1}, v_{i2}, \ldots, v_{in})^T$, and its individual extremum is $p_i = (p_{i1}, p_{i2}, \ldots, p_{in})^T$; the global extremum of the swarm is $p_g = (p_{g1}, p_{g2}, \ldots, p_{gn})^T$. Here $i = 1, 2, \ldots, m$ and $d = 1, 2, \ldots, n$; $m$ is the swarm size, $t$ is the current generation number, $w$ is the inertia weight, $r_1$ and $r_2$ are random numbers distributed in [0,1], and $c_1$ and $c_2$ are the acceleration constants.
(6) Check the termination condition. If the result is satisfactory, the optimization is finished; otherwise set $t = t + 1$ and return to (2). The optimization ends when it reaches the maximum generation $T_{\max}$ or when the fitness value is less than the preset precision $\varepsilon$.

4. Application of PPC model based on PSO in appraising underground water quality

The water quality results of the measuring points in the farms of QianJin, ChuangYe, HongWei and QianShao are shown in Table 1.

Tab. 1 Measuring results of underground water

| Measuring item | 22nd team in QianJin farm | 22nd team in ChuangYe farm | 29th team in HongWei farm | 18th team in QianShao farm |
|---|---|---|---|---|
| Chroma | 20 | 20 | 20 | 20 |
| Turbidity | 5 | 5 | 5 | 5 |
| Iron (mg/l) | 0.6 | 0.5 | 0.5 | 0.5 |
| Manganese (mg/l) | 0.4 | 0.3 | 0.4 | 0.3 |
| Chloride (mg/l) | 19.8 | 35 | 27.6 | 25 |
| Fluoride (mg/l) | 0 | 0 | 0 | 0 |
| Sulfate (mg/l) | 21.6 | 11.7 | 20.5 | 20 |
| Total hardness (mg/l) | 98 | 98 | 98 | 98 |
| Nitrate (mg/l) | 23.7 | 25.7 | 25.5 | 28.9 |

Tab. 2 Appraising standard of underground water

| Water quality level | I | II | III | IV |
|---|---|---|---|---|
| Chroma | 5 | 5 | 15 | 25 |
| Turbidity | 3 | 3 | 3 | 10 |
| Iron (mg/l) | 0.1 | 0.2 | 0.3 | 1.5 |
| Manganese (mg/l) | 0.05 | 0.05 | 0.1 | 1.0 |
| Chloride (mg/l) | 50 | 150 | 250 | 350 |
| Fluoride (mg/l) | 1.0 | 1.0 | 1.0 | 2.0 |
| Sulfate (mg/l) | 50 | 150 | 250 | 350 |
| Total hardness (mg/l) | 150 | 300 | 450 | 550 |
| Nitrate (mg/l) | 2 | 5 | 20 | 30 |

To eliminate the effect of different dimensions, the data in Table 1 and Table 2 were normalized. The results are shown in Table 3.

Tab. 3 The normalizing results

| Appraising indices | Chroma | Turbidity | Iron | Manganese | Chloride | Fluoride | Sulfate | Total hardness | Nitrate |
|---|---|---|---|---|---|---|---|---|---|
| Sample 1 | 1 | 1 | 1 | 1 | 0.90854 | 0.5 | 0.88679 | 0.88496 | 1 |
| Sample 2 | 1 | 1 | 0.92857 | 1 | 0.60569 | 0.5 | 0.59119 | 0.5531 | 0.89286 |
| Sample 3 | 0.5 | 1 | 0.85714 | 0.94737 | 0.30285 | 0.5 | 0.2956 | 0.22124 | 0.35714 |
| Sample 4 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Sample 5 | 0.25 | 0.71429 | 0.64286 | 0.63158 | 1 | 1 | 0.97074 | 1 | 0.225 |
| Sample 6 | 0.25 | 0.71429 | 0.71429 | 0.73684 | 0.95397 | 1 | 1 | 1 | 0.15357 |
| Sample 7 | 0.25 | 0.71429 | 0.71429 | 0.63158 | 0.97638 | 1 | 0.97399 | 1 | 0.16071 |
| Sample 8 | 0.25 | 0.71429 | 0.71429 | 0.73684 | 0.98425 | 1 | 0.97547 | 1 | 0.039286 |

Notes: Samples 1, 2, 3 and 4 are the standard water quality values for classes I, II, III and IV of water; samples 5, 6, 7 and 8 are the values observed in QianJin farm, ChuangYe farm, HongWei farm and QianShao farm respectively.

When PPC based on PSO is applied to water quality appraisal, the standard water quality values and the measured results (high-dimensional data) are first projected onto a one-dimensional subspace to establish the PPC model by means of PSO. Through repeated calculation the best projection direction and the best projection value are then found, and the order of the water quality values is obtained from the best projection values. The program was written in Matlab with $n = 9$, $m = 50$, $c_1 = 2.8$, $c_2 = 1.3$, $w = 0.5$ and $v_{\max} = 0.5$. After 20 rounds of calculation, the best projection value 1.9381 and the best projection direction $a^*$ = (0.3853 0.3262 0.3180 0.2013 0.4318 0.3382 0.4546 0.3903 0.2091) were obtained. Substituting $a^*$ into formula (1) gives the integrated appraising projection values $z^*(i)$ = (2.7498 2.3100 1.5774 0 2.3095 2.3319 2.3101 2.3100).

Arranging $z^*(i)$ from large to small gives the order of the samples from good to bad, viz. sample 1 > sample 6 > sample 7 > sample 2 = sample 8 > sample 5 > sample 3 > sample 4. Because samples 1, 2, 3 and 4 are the appraising standard values, it can be concluded that the water quality of samples 6, 7 and 8 belongs to level II (sample 6 > sample 7 > sample 8) and that sample 5 belongs to level III. These results show that PPC based on PSO can be used not only to obtain the absolute water quality level of each sample, but also to distinguish the water quality of different samples within the same level, and the appraising results agree well with the actual water quality.
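The procedure described in this section can be sketched end to end: the PSO update rule of Section 3.2 searches for a projection direction that maximizes the projection index on the normalized data of Tab. 3. This is a minimal sketch under stated assumptions, not a reconstruction of the authors' Matlab program. The function names, the random seed, the 100 generations, the clipping of positions to positive values, and the rescaling of each candidate direction to unit length are all choices made here for illustration; only $m$, $c_1$, $c_2$, $w$ and $v_{\max}$ come from the text.

```python
import numpy as np

# Normalized data of Tab. 3: rows are samples 1-8, columns the nine indices.
X = np.array([
    [1, 1, 1, 1, 0.90854, 0.5, 0.88679, 0.88496, 1],
    [1, 1, 0.92857, 1, 0.60569, 0.5, 0.59119, 0.5531, 0.89286],
    [0.5, 1, 0.85714, 0.94737, 0.30285, 0.5, 0.2956, 0.22124, 0.35714],
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0.25, 0.71429, 0.64286, 0.63158, 1, 1, 0.97074, 1, 0.225],
    [0.25, 0.71429, 0.71429, 0.73684, 0.95397, 1, 1, 1, 0.15357],
    [0.25, 0.71429, 0.71429, 0.63158, 0.97638, 1, 0.97399, 1, 0.16071],
    [0.25, 0.71429, 0.71429, 0.73684, 0.98425, 1, 0.97547, 1, 0.039286],
])


def projection_index(a, X):
    """Q(a) = S_z * D_z; a is rescaled to unit length first (assumption)."""
    a = a / np.linalg.norm(a)
    z = X @ a
    s_z = z.std(ddof=1)
    R = 0.1 * s_z
    gap = R - np.abs(z[:, None] - z[None, :])   # R - r(i, j)
    return s_z * np.sum(gap[gap >= 0])          # unit step keeps gap >= 0


def pso_maximize(f, dim, m=50, c1=2.8, c2=1.3, w=0.5, v_max=0.5,
                 t_max=100, seed=0):
    """Basic PSO; t_max, seed and the position bounds are assumptions."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(0.05, 1.0, size=(m, dim))       # initial positions
    v = rng.uniform(-v_max, v_max, size=(m, dim))   # initial velocities
    p = x.copy()                                    # individual bests Pi
    p_val = np.array([f(xi) for xi in x])
    g = p[p_val.argmax()].copy()                    # swarm best Pg
    for _ in range(t_max):
        r1, r2 = rng.random((m, dim)), rng.random((m, dim))
        v = w * v + c1 * r1 * (p - x) + c2 * r2 * (g - x)   # velocity update
        v = np.clip(v, -v_max, v_max)
        x = np.clip(x + v, 1e-6, 1.0)                       # position update
        val = np.array([f(xi) for xi in x])
        better = val > p_val
        p[better], p_val[better] = x[better], val[better]
        g = p[p_val.argmax()].copy()
    return g / np.linalg.norm(g), p_val.max()


a_best, q_best = pso_maximize(lambda a: projection_index(a, X), dim=X.shape[1])
z_best = X @ a_best
ranking = np.argsort(-z_best) + 1   # sample numbers from best to worst
print(np.round(a_best, 4), round(q_best, 4), ranking)
```

Because PSO is stochastic, the direction found by this sketch will generally differ from the $a^*$ reported above. Keeping all components of $a$ positive is a design choice: with the data of Tab. 3 it guarantees that the class standards keep their order I, II, III, IV on the projection axis, so the measured samples can be placed between them.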
5. Conclusions

The PPC model based on PSO was used to appraise underground water quality in this paper. The model yields not only the overall ranking of each sample; the optimized projection direction also reflects the importance of each appraising index to the overall judgment of the samples. The model converges quickly and its appraising effect is good.

References

[1] Jin Juliang, Wei Yiming, Ding Jing. Projection pursuit model of integrated water. Acta Scientiae Circumstantiae, 2001, 21(4): 431-433.
[2] Fu Qiang. System analyses and integrated appraising of agriculture water and soil resource. China WaterPower Press, 2005: 192-195.
[3] Zhang Jun, Liang Chuan, Zhao Xiejing, Liao Yong. Application of PPC Model Based on RAGA in the Water-Saving Irrigation Scheme Selection. China Rural Water and Hydropower, 2006, 12: 13-15.
[4] Ye Hao, Qian Jiazhong, Huang Xichuan, Li Youlong, Dong Hongxin. Application of projection pursuit model to evaluation of groundwater quality. Hydrogeology and Engineering Geology, 2005, (5): 9-10.
[5] Xu Cheng. Design Study of control system based on particle swarm optimization algorithm. Master's degree thesis, Chekiang University, 2006: 10-15.
[6] Xia Fei. Study on locating mobile station in CDMA based on particle swarm optimization algorithm. Master's degree thesis, WuHan University, 2005: 23-26.
[7] Gao Shang, Yang Jingyu. Swarm capacity arithmetic and application. China Rural Water and Hydropower, 2006: 6-10.