Application of Projection Pursuit Classification Model (PPC) based
on Particle Swarm Optimization Algorithm (PSO) in Water
Quality Evaluation
WANG Bai, ZHANG Zhongxue*
Water and Civil Engineering College Northeast Agricultural University, Harbin 150030, PRC
[email protected]
Abstract: Based on the national groundwater quality standard, the Projection Pursuit Classification model (PPC), a technique for reducing multi-dimensional data to a low dimension, was applied to water quality evaluation. Because the Particle Swarm Optimization algorithm (PSO) is conceptually simple and converges quickly, PSO was used to optimize the projection direction that maps the high-dimensional indices into a low-dimensional space, and the projection value was then used to appraise water quality. The results showed that the model gives an appropriate assessment of water quality, handles the complexity of multiple factors, and can be applied to the integrated evaluation of other environmental problems.
Key words: water quality; projection pursuit; particle swarm optimization algorithm
1. Introduction
Integrated evaluation of water quality establishes a mathematical model from measured water quality values and uses it to judge the overall quality level of a water body, providing a foundation for decisions in water management and water pollution prevention. Appraisal results obtained from the single indices of real water samples are often incompatible with one another, so the water quality level cannot be appraised by applying the water quality standard table directly. Methods such as gray clustering, fuzzy integrated appraisal and neural networks (NN) were therefore proposed one after another. The results of these models are discrete, semi-quantitative water quality levels, so their resolution between levels is low. The water quality indices of real water are continuous real numbers; that is, with the appraisal methods in common use at present, samples assigned to the same level can still differ markedly in their index values, which makes the results inconvenient for guiding water quality management. In addition, there is still no reliable way to judge the water quality appraisal standard.
At present, studies on integrated water quality appraisal still focus on how to combine multiple indices into a single index, because water quality can be appraised as a whole only in a one-dimensional space. The Projection Pursuit Classification model (PPC) based on the Particle Swarm Optimization algorithm (PSO) is therefore put forward as a new method of water quality evaluation.
2. Projection Pursuit Model (PP)
2.1 Brief Introduction of PP model
Projection pursuit (PP) is a method for analyzing high-dimensional data that can be used for both exploratory and confirmatory analysis. Friedman and Tukey (1974) imitated the way experienced data analysts work, clustering and sorting data by means of a new index that combines the overall spread of the projected data with its local density. The characteristics of the PP method can be summarized as follows:
(1) Many data sets in the natural sciences do not follow a normal distribution, or little prior experience about the data is available, so the structure and features must be found from the data themselves. The PP method overcomes the serious difficulty caused by high dimensionality: because the analysis is carried out in a low-dimensional subspace, the data points are dense enough in a 1- to 3-dimensional projection space for the structure or features of the data to be found there.
(2) The PP method can remove the interference of variables that have little or no connection with the structure and features of the data.
(3) The PP method makes it possible to solve high-dimensional problems with one-dimensional statistical methods. It projects the high-dimensional data onto a one-dimensional subspace, analyzes the projected data, and finds the better projection by comparing the results obtained for different one-dimensional projections.
(4) Like other nonparametric methods, the PP method can solve some nonlinear problems. Although PP is based on linear projections of the data, the structure it searches for is nonlinear, so it can handle nonlinear problems such as multivariate nonlinear regression.

Notes:
1. This paper is supported by Heilongjiang provincial key research program GB06B106 and the program for innovative research teams of Northeast Agricultural University.
2. First author: Wang Bai (1980-), male, master's degree student, researching agricultural water saving and sustainable water resources utilization.
3. Corresponding author: Zhang Zhongxue, E-mail [email protected]
2.2 Steps of PPC modeling [2][3][4]
Modeling with the Projection Pursuit Classification model (PPC) includes the following steps:
Step one: normalization of the sample appraisal indices. Suppose the sample set of index values is $\{x^*(i,j) \mid i = 1 \sim n,\ j = 1 \sim p\}$, where $x^*(i,j)$ is the value of index $j$ for sample $i$, and $n$ and $p$ are the number of samples (sample size) and the number of indices respectively. To eliminate the dimensions of the indices and unify their ranges of variation, the values are normalized with the following formulas.
For an index for which larger is better:
$$x(i,j) = \frac{x^*(i,j) - x_{\min}(j)}{x_{\max}(j) - x_{\min}(j)}$$
For an index for which smaller is better:
$$x(i,j) = \frac{x_{\max}(j) - x^*(i,j)}{x_{\max}(j) - x_{\min}(j)}$$
where $x_{\max}(j)$ and $x_{\min}(j)$ are the maximum and minimum values of index $j$ respectively, and $x(i,j)$ is the normalized value of index $j$ for sample $i$.
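The two normalization formulas of step one can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `normalize`, the `larger_is_better` flags, the treatment of a constant index, and the small example matrix are hypothetical and do not come from the paper.

```python
def normalize(raw, larger_is_better):
    """Min-max normalize every index (column) of `raw` to the range [0, 1].

    raw              -- list of n samples, each a list of p index values x*(i, j)
    larger_is_better -- list of p booleans, one flag per index
    """
    n, p = len(raw), len(raw[0])
    out = [[0.0] * p for _ in range(n)]
    for j in range(p):
        column = [row[j] for row in raw]
        x_min, x_max = min(column), max(column)
        span = x_max - x_min
        for i in range(n):
            if span == 0:
                # Assumption: an index that never varies carries no information.
                out[i][j] = 0.0
            elif larger_is_better[j]:
                out[i][j] = (raw[i][j] - x_min) / span
            else:
                out[i][j] = (x_max - raw[i][j]) / span
    return out


# Hypothetical example: three samples, two indices, smaller is better for both.
raw = [[5.0, 0.1], [15.0, 0.3], [25.0, 1.5]]
x = normalize(raw, [False, False])
```

With smaller-is-better indices, the sample holding the minimum of a column is mapped to 1 and the sample holding the maximum is mapped to 0.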
Step two: constructing the projection index function $Q(a)$. In the PP method, the normalized $p$-dimensional data $\{x(i,j) \mid j = 1 \sim p\}$ are combined into the one-dimensional projection value $z(i)$ along the projection direction $a = \{a(1), a(2), a(3), \ldots, a(p)\}$:
$$z(i) = \sum_{j=1}^{p} a(j)\, x(i,j), \quad i = 1 \sim n \tag{①}$$
The classification can then be made from the one-dimensional scatter plot of $\{z(i) \mid i = 1 \sim n\}$. In formula ①, $a$ is a unit-length vector. The projection values $z(i)$ are required to be distributed as follows: locally, the projection points should be as dense as possible, ideally clustering into several groups; globally, the groups of projection points should be spread as far apart as possible. The projection index function can therefore be expressed as
$$Q(a) = S_z D_z$$
where $S_z$ is the standard deviation of the projection values $z(i)$ and $D_z$ is their local density, that is:
$$S_z = \sqrt{\frac{\sum_{i=1}^{n} \left( z(i) - E(z) \right)^2}{n-1}} \tag{②}$$
$$D_z = \sum_{i=1}^{n} \sum_{j=1}^{n} \left( R - r(i,j) \right) \cdot u\left( R - r(i,j) \right) \tag{③}$$
where $E(z)$ is the mean of the sequence $\{z(i) \mid i = 1 \sim n\}$ and $R$ is the window radius of the local density. $R$ must be chosen so that the average number of projection points inside the window is not too small, which avoids a large moving-average deviation, and so that it does not grow too quickly as $n$ increases. $R$ can be determined by experiment and is commonly taken as $0.1 S_z$. $r(i,j) = |z(i) - z(j)|$ is the distance between samples, and $u(t)$ is the unit step function, equal to 1 when $t \ge 0$ and 0 when $t < 0$.
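As a hedged illustration of formulas ① to ③, the sketch below computes the projection values and the projection index for a given direction. The function names, the `radius_factor` parameter (set to the commonly used 0.1) and the two-sample example are assumptions made for illustration only.

```python
import math


def projection_values(x, a):
    """Formula 1: z(i) = sum over j of a(j) * x(i, j)."""
    return [sum(a_j * x_ij for a_j, x_ij in zip(a, row)) for row in x]


def projection_index(x, a, radius_factor=0.1):
    """Q(a) = S_z * D_z with window radius R = radius_factor * S_z."""
    z = projection_values(x, a)
    n = len(z)
    mean = sum(z) / n
    # Formula 2: sample standard deviation of the projection values.
    s_z = math.sqrt(sum((z_i - mean) ** 2 for z_i in z) / (n - 1))
    radius = radius_factor * s_z
    # Formula 3: local density; the unit step u(t) keeps only pairs with r <= R.
    d_z = 0.0
    for i in range(n):
        for j in range(n):
            gap = radius - abs(z[i] - z[j])
            if gap >= 0:
                d_z += gap
    return s_z * d_z


# Hypothetical example: two samples, two indices, direction of unit length.
x = [[0.0, 0.0], [1.0, 1.0]]
a = [1 / math.sqrt(2), 1 / math.sqrt(2)]
q = projection_index(x, a)
```

In this example only the pairs with i = j fall inside the window, so D_z reduces to 2R and Q is about 0.2. The double sum follows formula ③ literally and therefore includes the i = j terms.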
Step three: optimizing the projection index function. For a given sample set of index values, the projection index function $Q(a)$ varies only with the projection direction $a$. Different projection directions reflect different structural features of the data, and the best projection direction is the one most likely to reveal a characteristic structure of the high-dimensional data. The best projection direction can therefore be estimated by maximizing the projection index function.
Objective function:
$$\max\ Q(a) = S_z \cdot D_z \tag{④}$$
Constraint:
$$\text{s.t.}\ \sum_{j=1}^{p} a^2(j) = 1 \tag{⑤}$$
This is a complex nonlinear optimization problem with $\{a(j) \mid j = 1 \sim p\}$ as the optimization variables, and it is hard to solve with traditional optimization methods. The real-coded accelerating genetic algorithm (RAGA), which simulates natural selection and the exchange of information between chromosomes within a population, has been applied to this kind of high-dimensional global optimization problem; in this paper the PSO algorithm described in Section 3 is used instead.
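One simple way to keep a candidate direction on the unit sphere required by constraint ⑤ is to rescale it before evaluating Q(a). The paper does not say how the constraint was enforced, so the helper below (the name `to_unit_length` is hypothetical) is an assumption rather than the authors' procedure.

```python
import math


def to_unit_length(a):
    """Rescale a direction so that the sum of a(j)^2 equals 1 (constraint 5)."""
    norm = math.sqrt(sum(a_j * a_j for a_j in a))
    if norm == 0:
        raise ValueError("the zero vector has no direction")
    return [a_j / norm for a_j in a]


# Hypothetical example: (3, 4) has length 5, so it is scaled to (0.6, 0.8).
a = to_unit_length([3.0, 4.0])
```

An optimizer can then search over unconstrained vectors and evaluate the projection index on the rescaled direction.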
Step four: classification (ranking). Substituting the best projection direction $a^*$ obtained in step three into formula ① gives the projection value $z^*(i)$ of each sample point. Comparing $z^*(i)$ with $z^*(j)$, the closer the two values are, the more likely the two samples belong to the same class. The grade order of the samples follows the order of $z^*(i)$: sorting $z^*(i)$ from largest to smallest orders the samples from best to worst.
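The ordering in step four amounts to sorting the sample indices by projection value. A minimal sketch, with a hypothetical function name and made-up projection values:

```python
def rank_samples(z):
    """Return sample indices ordered from best (largest z) to worst."""
    return sorted(range(len(z)), key=lambda i: z[i], reverse=True)


# Hypothetical projection values for three samples.
z = [0.8, 2.1, 1.5]
order = rank_samples(z)
```

Here `order` lists the second sample first because it has the largest projection value.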
3. Particle Swarm Optimization algorithm (PSO)
3.1 Introduction of PSO
Like other evolutionary algorithms, PSO moves the individuals of a population toward good regions according to their fitness to the environment. The difference is that PSO does not apply evolutionary operators to the individuals. Instead, it treats each individual as a particle without mass or volume that flies through the search space at a certain velocity, and it adjusts that velocity according to the combined flying experience of the individual and of the whole population.
Throughout the optimization, the fitness of each particle is determined by the value of the chosen objective function. Each particle holds the following information: its current position; the best position it has found itself (Pi), which serves as the particle's own flying experience; and the best position found by all particles in the swarm (Pg, the best of all Pi), which can be regarded as the flying experience shared among the particles. The velocity of each particle is therefore influenced by the movement history of both the particle and the swarm, and its current direction and speed are drawn toward these best historical positions, which coordinates the movement of the individual particle with that of the swarm.
3.2 Steps of PSO
(1) Initialization. Set the acceleration constants $c_1$ and $c_2$ and the maximum number of generations $T_{max}$, and let the current generation number be $t = 1$. Generate $m$ particles $x_1, x_2, \ldots, x_m$ randomly in the defined space $R^n$ to form the initial population $X(1)$, and generate the initial displacements $v_1, v_2, \ldots, v_m$ of the particles randomly to form the displacement matrix $V(1)$.
(2) Evaluate the population $X(t)$: calculate the fitness of each particle in the solution space.
(3) Compare the current fitness of each particle with its individual best Pi. If the current value is better than Pi, set Pi to the current value and record the current position of the particle as its best position.
(4) Compare the current fitness of each particle with the population best Pg. If the current value is better than Pg, set Pg to the current value of that particle.
(5) Update the displacement direction and step length of the particles according to formulas ⑥ and ⑦ to generate the new population $X(t+1)$:
$$v_{id}^{t+1} = w v_{id}^{t} + c_1 r_1 \left( p_{id}^{t} - x_{id}^{t} \right) + c_2 r_2 \left( p_{gd}^{t} - x_{id}^{t} \right) \tag{⑥}$$
$$x_{id}^{t+1} = x_{id}^{t} + v_{id}^{t+1} \tag{⑦}$$
where, in an $n$-dimensional search space, the population $X = \{x_1, \ldots, x_i, \ldots, x_m\}$ is composed of $m$ particles; the position of particle $i$ is $x_i = (x_{i1}, x_{i2}, \ldots, x_{in})^T$ and its velocity is $v_i = (v_{i1}, v_{i2}, \ldots, v_{in})^T$; the individual extremum is $p_i = (p_{i1}, p_{i2}, \ldots, p_{in})^T$ and the global extremum of the population is $p_g = (p_{g1}, p_{g2}, \ldots, p_{gn})^T$; $i = 1, 2, \ldots, m$; $d = 1, 2, \ldots, n$; $m$ is the population size, $t$ is the current generation number, $w$ is the inertia weight, $r_1$ and $r_2$ are random numbers distributed in [0, 1], and $c_1$ and $c_2$ are the acceleration constants.
(6) Check the termination condition. If the result is satisfactory, the optimization is finished; otherwise set $t = t + 1$ and return to step (2). The optimization ends when the maximum number of generations $T_{max}$ is reached or when the fitness value is less than the preset precision $\varepsilon$.
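Steps (1) to (6) can be sketched as a small pure-Python routine. This is an illustrative sketch, not the authors' Matlab program: the function name `pso_maximize`, the default parameter values, the velocity clamp `v_max`, the fixed random seed, and the two-dimensional test function are all assumptions, and only the maximum-generation stopping rule is implemented.

```python
import random


def pso_maximize(fitness, dim, bounds, m=30, t_max=200,
                 w=0.5, c1=1.5, c2=1.5, v_max=0.5, seed=0):
    """Maximize `fitness` over `dim` variables with a basic particle swarm."""
    rng = random.Random(seed)
    lo, hi = bounds
    # Step (1): random initial positions and displacements.
    x = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(m)]
    v = [[rng.uniform(-v_max, v_max) for _ in range(dim)] for _ in range(m)]
    p_best = [x_i[:] for x_i in x]
    p_val = [float("-inf")] * m
    g_best, g_val = x[0][:], float("-inf")
    for _ in range(t_max):  # Step (6): stop after t_max generations.
        # Steps (2)-(4): evaluate, then update individual and global bests.
        for i in range(m):
            val = fitness(x[i])
            if val > p_val[i]:
                p_best[i], p_val[i] = x[i][:], val
            if val > g_val:
                g_best, g_val = x[i][:], val
        # Step (5): formulas 6 and 7, with an assumed clamp on the velocity.
        for i in range(m):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel = (w * v[i][d]
                       + c1 * r1 * (p_best[i][d] - x[i][d])
                       + c2 * r2 * (g_best[d] - x[i][d]))
                v[i][d] = max(-v_max, min(v_max, vel))
                x[i][d] += v[i][d]
    return g_best, g_val


def toy_fitness(p):
    """Hypothetical test function with its maximum (zero) at (1, -2)."""
    return -((p[0] - 1.0) ** 2 + (p[1] + 2.0) ** 2)


best, best_val = pso_maximize(toy_fitness, dim=2, bounds=(-5.0, 5.0))
```

For the PPC model, `fitness` would be the projection index Q(a) evaluated on a direction rescaled to unit length, and `dim` would equal the number of indices p.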
4. Application of PPC model based on PSO in appraising underground water quality
The water quality measured at the monitoring points on the QianJin, ChuangYe, HongWei and QianShao farms is shown in Table 1.
Tab.1 Measuring results of underground water
| Measuring item | 22nd team in QianJin farm | 22nd team in ChuangYe farm | 29th team in HongWei farm | 18th team in QianShao farm |
|---|---|---|---|---|
| Chroma | 20 | 20 | 20 | 20 |
| Turbidity | 5 | 5 | 5 | 5 |
| Iron (mg/l) | 0.6 | 0.5 | 0.5 | 0.5 |
| Manganese (mg/l) | 0.4 | 0.3 | 0.4 | 0.3 |
| Chloride (mg/l) | 19.8 | 35 | 27.6 | 25 |
| Fluoride (mg/l) | 0 | 0 | 0 | 0 |
| Sulfate (mg/l) | 21.6 | 11.7 | 20.5 | 20 |
| Total hardness (mg/l) | 98 | 98 | 98 | 98 |
| Nitrate (mg/l) | 23.7 | 25.7 | 25.5 | 28.9 |
Tab.2 Appraising standard of underground water
| Water quality level | I | II | III | IV |
|---|---|---|---|---|
| Chroma | 5 | 5 | 15 | 25 |
| Turbidity | 3 | 3 | 3 | 10 |
| Iron (mg/l) | 0.1 | 0.2 | 0.3 | 1.5 |
| Manganese (mg/l) | 0.05 | 0.05 | 0.1 | 1.0 |
| Chloride (mg/l) | 50 | 150 | 250 | 350 |
| Fluoride (mg/l) | 1.0 | 1.0 | 1.0 | 2.0 |
| Sulfate (mg/l) | 50 | 150 | 250 | 350 |
| Total hardness (mg/l) | 150 | 300 | 450 | 550 |
| Nitrate (mg/l) | 2 | 5 | 20 | 30 |
To eliminate the effect of the different dimensions, the data in Table 1 and Table 2 were normalized. The results are shown in Table 3.
Tab. 3 The normalizing results
| Appraising indices | Chroma | Turbidity | Iron | Manganese | Chloride | Fluoride | Sulfate | Total hardness | Nitrate |
|---|---|---|---|---|---|---|---|---|---|
| Sample 1 | 1 | 1 | 1 | 1 | 0.90854 | 0.5 | 0.88679 | 0.88496 | 1 |
| Sample 2 | 1 | 1 | 0.92857 | 1 | 0.60569 | 0.5 | 0.59119 | 0.5531 | 0.89286 |
| Sample 3 | 0.5 | 1 | 0.85714 | 0.94737 | 0.30285 | 0.5 | 0.2956 | 0.22124 | 0.35714 |
| Sample 4 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Sample 5 | 0.25 | 0.71429 | 0.64286 | 0.63158 | 1 | 1 | 0.97074 | 1 | 0.225 |
| Sample 6 | 0.25 | 0.71429 | 0.71429 | 0.73684 | 0.95397 | 1 | 1 | 1 | 0.15357 |
| Sample 7 | 0.25 | 0.71429 | 0.71429 | 0.63158 | 0.97638 | 1 | 0.97399 | 1 | 0.16071 |
| Sample 8 | 0.25 | 0.71429 | 0.71429 | 0.73684 | 0.98425 | 1 | 0.97547 | 1 | 0.039286 |
Notes: Samples 1, 2, 3 and 4 are the standard water quality values for levels I, II, III and IV; samples 5, 6, 7 and 8 are the values observed in QianJin farm, ChuangYe farm, HongWei farm and QianShao farm respectively.
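The normalized values in Table 3 can be checked with a short script. The sketch below stacks the four standard levels of Table 2 on top of the four measured samples of Table 1 and applies the smaller-is-better formula to every index. Two points are assumptions: that all nine indices are treated as smaller-is-better, and that the level I chloride and sulfate limits are 50 mg/l, which is the value implied by the normalized entries in Table 3. The names used are illustrative.

```python
# Columns: chroma, turbidity, iron, manganese, chloride, fluoride,
# sulfate, total hardness, nitrate.
STANDARDS = [  # Table 2, levels I to IV (samples 1 to 4)
    [5, 3, 0.1, 0.05, 50, 1.0, 50, 150, 2],
    [5, 3, 0.2, 0.05, 150, 1.0, 150, 300, 5],
    [15, 3, 0.3, 0.1, 250, 1.0, 250, 450, 20],
    [25, 10, 1.5, 1.0, 350, 2.0, 350, 550, 30],
]
MEASURED = [  # Table 1, QianJin, ChuangYe, HongWei, QianShao (samples 5 to 8)
    [20, 5, 0.6, 0.4, 19.8, 0, 21.6, 98, 23.7],
    [20, 5, 0.5, 0.3, 35, 0, 11.7, 98, 25.7],
    [20, 5, 0.5, 0.4, 27.6, 0, 20.5, 98, 25.5],
    [20, 5, 0.5, 0.3, 25, 0, 20, 98, 28.9],
]


def normalize_smaller_is_better(raw):
    """x(i, j) = (x_max(j) - x*(i, j)) / (x_max(j) - x_min(j)) for every index."""
    p = len(raw[0])
    x_min = [min(row[j] for row in raw) for j in range(p)]
    x_max = [max(row[j] for row in raw) for j in range(p)]
    return [[(x_max[j] - row[j]) / (x_max[j] - x_min[j]) for j in range(p)]
            for row in raw]


x = normalize_smaller_is_better(STANDARDS + MEASURED)
for number, row in enumerate(x, start=1):
    print("Sample", number, [round(value, 5) for value in row])
```

Under these assumptions the printed rows agree with Table 3 to the precision shown there.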
When PPC based on PSO is applied to water quality appraisal, the standard water quality values and the measured results (high-dimensional data) are first projected onto a one-dimensional subspace, and PSO is used to establish the PPC model. Through repeated calculation, the best projection direction and the best projection values are found, and the order of water quality is obtained from the best projection values.
The program was written in Matlab with n = 9, m = 50, c1 = 2.8, c2 = 1.3, w = 0.5 and vmax = 0.5. After 20 calculations, the best projection index value 1.9381 and the best projection direction a* = (0.3853, 0.3262, 0.3180, 0.2013, 0.4318, 0.3382, 0.4546, 0.3903, 0.2091) were obtained. Substituting a* into formula ① gives the integrated appraisal projection values z*(i) = (2.7498, 2.3100, 1.5774, 0, 2.3095, 2.3319, 2.3101, 2.3100). Sorting z*(i) from largest to smallest gives the order of the samples from best to worst: sample 1 > sample 6 > sample 7 > sample 2 = sample 8 > sample 5 > sample 3 > sample 4. Because samples 1, 2, 3 and 4 are the appraisal standard values, it can be concluded that the water quality of samples 6, 7 and 8 belongs to level II (sample 6 > sample 7 > sample 8) and that of sample 5 belongs to level III.
These results show that PPC based on PSO can be used not only to obtain the absolute water quality level of each sample but also to distinguish the water quality of different samples within the same level, and the appraisal results agree well with the actual water quality.
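The reported projection values can be reproduced from Table 3 and the published direction a*. In the sketch below, the grading rule (a sample receives the best level whose standard projection value it reaches, after rounding to four decimals as in the paper) is an assumption inferred from the stated conclusion, not a rule given explicitly by the authors; the variable and function names are illustrative.

```python
A_STAR = [0.3853, 0.3262, 0.3180, 0.2013, 0.4318, 0.3382, 0.4546, 0.3903, 0.2091]
TABLE_3 = [
    [1, 1, 1, 1, 0.90854, 0.5, 0.88679, 0.88496, 1],
    [1, 1, 0.92857, 1, 0.60569, 0.5, 0.59119, 0.5531, 0.89286],
    [0.5, 1, 0.85714, 0.94737, 0.30285, 0.5, 0.2956, 0.22124, 0.35714],
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0.25, 0.71429, 0.64286, 0.63158, 1, 1, 0.97074, 1, 0.225],
    [0.25, 0.71429, 0.71429, 0.73684, 0.95397, 1, 1, 1, 0.15357],
    [0.25, 0.71429, 0.71429, 0.63158, 0.97638, 1, 0.97399, 1, 0.16071],
    [0.25, 0.71429, 0.71429, 0.73684, 0.98425, 1, 0.97547, 1, 0.039286],
]
LEVELS = ["I", "II", "III", "IV"]

# Formula 1, rounded to four decimals as reported in the paper.
z = [round(sum(a_j * x_ij for a_j, x_ij in zip(A_STAR, row)), 4)
     for row in TABLE_3]

# Sample indices (0-based) from best to worst; ties keep their original order.
order = sorted(range(len(z)), key=lambda i: z[i], reverse=True)


def grade(z_sample, z_standards):
    """Assumed rule: best level whose standard projection value is reached."""
    for level, z_standard in zip(LEVELS, z_standards):
        if z_sample >= z_standard:
            return level
    return LEVELS[-1]


# Samples 1-4 are the standards; samples 5-8 are the measured points.
grades = {number: grade(z[number - 1], z[:4]) for number in range(5, 9)}
print(z)
print([index + 1 for index in order])
print(grades)
```

With this rule the script assigns level III to sample 5 and level II to samples 6, 7 and 8, which matches the conclusion stated above.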
5. Conclusions
The PPC model based on PSO was used in this paper to appraise underground water quality. The model yields not only the integrated ranking of the samples; the optimized projection direction also reflects the importance of each appraisal index to the overall judgment of the samples. The model converges quickly and the appraisal results are good.
References
[1] Jin Juliang, Wei Yiming, Ding Jing, Projection pursuit model of integrated water, Acta Scientiae Circumstantiae, 2001, 21(4): 431-433.
[2] Fu Qiang, System analyses and integrated appraising of agriculture water and soil resource, China WaterPower Press, 2005: 192-195.
[3] Zhang Jun, Liang Chuan, Zhao Xiejing, Liao Yong, Application of PPC Model Based on RAGA in the Water-Saving Irrigation Scheme Selection, China rural water and hydropower, 2006, 12: 13-15.
[4] Ye Hao, Qian Jiazhong, Huang Xichuan, Li Youlong, Dong Hongxin, Application of projection pursuit model to evaluation of groundwater quality, Hydrogeology and Engineering Geology, 2005, (5): 9-10.
[5] Xu Cheng, Design Study of control system based on particle swarm optimization algorithm, Master's Degree paper, Chekiang University, 2006: 10-15.
[6] Xia Fei, Study on locating mobile station in CDMA based on particle swarm optimization algorithm, Master's Degree paper, WuHan University, 2005: 23-26.
[7] Gao Shang, Yang Jingyu, Swarm capacity arithmetic and application, China rural water and hydropower, 2006: 6-10.