Clustering Analyses of Paddy Field Soil Based on Self-organizing
Feature Map Net
LI Bin1, QIE Zhihong1*
1 Department of Water Conservancy Engineering, Agricultural University of Hebei, Baoding 071001,
Hebei, China
* Corresponding author: [email protected]
Abstract
The Self-organizing Feature Map (SOFM) network was applied to the clustering analysis of paddy field
soil. A SOFM network model was established, then trained and tested on example data. The results
show that the SOFM network performs well in clustering paddy field soil, gives high prediction
precision and is easy to run. The method is therefore an effective approach to the clustering
analysis of paddy field soil.
Key Words: Self-organizing Feature Map (SOFM); Clustering analysis; System clustering; Fuzzy
clustering
1. Introduction
Soil classification groups soils against a defined standard, according to the laws of soil formation
and development and to the natural properties of the soil. It allows soil to be understood
scientifically, differentiated systematically and used rationally. Soil classification is also a mark
of the level of soil science, the foundation of soil survey and mapping, the basis for adapting
agricultural practice to local conditions and for extending agricultural techniques, and the medium
for exchanging soil information at home and abroad. Clustering analysis of soil is therefore very
necessary. Clustering analysis of paddy field soil starts from the attributes of the samples
themselves: mathematical methods are used to compare the samples directly, the closeness of the
relationship between samples is determined quantitatively from a chosen similarity or dissimilarity
index, and the samples are clustered according to that closeness. Common methods include system
clustering, dynamic clustering and fuzzy clustering. System clustering is at present widely used at
home and abroad. It first treats each of the N samples as a class of its own, then finds the two
"closest" classes and merges them into one, leaving N-1 classes. This process is repeated, reducing
the number of classes each time, until only two classes remain. The method is feasible, but the work
is tedious. Fuzzy clustering analysis has given soil classification a scientific method and is more
widely applied, but when the data are calibrated, different people may use different methods, so the
fuzzy similarity matrices they establish differ and the classification results differ in part. This
paper introduces a neural network with an automatic clustering function, the Self-organizing Feature
Map (SOFM), applies it to the example data in literature [2] to analyse paddy field soil, and
compares the result with that of fuzzy clustering.
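The merge procedure of system clustering described above can be sketched in a few lines of Python. This is only an illustration: the paper does not say how the distance between two classes is measured, so the sketch assumes the distance between their closest members (single linkage), and the function name system_clustering and the toy points are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two samples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def system_clustering(samples, n_classes=2):
    """Start with one class per sample, then repeatedly merge the two
    closest classes until n_classes remain.

    Assumption: the distance between two classes is the distance between
    their closest members (single linkage).
    """
    clusters = [[i] for i in range(len(samples))]
    while len(clusters) > n_classes:
        best = None  # (distance, index of class a, index of class b)
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = min(
                    euclidean(samples[i], samples[j])
                    for i in clusters[a]
                    for j in clusters[b]
                )
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]  # merge class b into class a
        del clusters[b]
    return clusters


# Toy data, not taken from the paper.
points = [(0.0, 0.0), (1.0, 0.0), (10.0, 10.0), (11.0, 10.0), (0.0, 2.0)]
print(system_clustering(points, n_classes=2))
```

Every merge rescans all pairs of classes, which is the tedium the text refers to when the number of samples grows.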
2. The theory of the SOFM network
The SOFM network [1] is also called the Kohonen feature map. It was proposed by the Finnish scholar
Kohonen in 1981. The SOFM neural network model was inspired by modeling studies of biological
systems. A small region in the visual area of the human brain responds to a given stimulus from the
external environment. Based on this feature of the human brain, Kohonen built the SOFM network model
to simulate the feedback behaviour of the brain's visual cells. He held that neighboring cells in a
neural network interact and compete with each other and finally adapt themselves to the external
environment, becoming special detectors that are capable of sensing different information and have
strong resistance to interference. This unsupervised neural network can learn to detect regularities
and correlations in its input and adapt its future responses to that input. The SOFM network learns
to classify input vectors according to how they are grouped in the input space, using the
competitive learning rule. During training, the SOFM network learns both the distribution and the
topology of the input vectors. The result of training is that a neuron and its neighbors become
sensitive to one class.
2.1 Self-organizing Feature Map model [1]
The SOFM network is composed of an input layer and a competitive layer, as shown in Figure 1. The
input layer consists of N neurons arranged as a one-dimensional vector; the number of input nodes
equals the dimension of the input space. The competitive layer has m by n neurons arranged in a
plane, and it is also the output layer. After training, different nodes in the output layer of the
SOFM network represent different classes. In this network, each element of the input vector P is
connected to each neuron in the competitive layer through the weight matrix W. The weight vector of
each neuron describes how close the neuron is to the input vector: the more similar they are, the
smaller the distance between them and the more easily the neuron wins the competition.
Figure 1 The Self-organizing Feature Map
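As a minimal sketch of the competition described in this section, the neuron whose weight vector is closest to the input wins. The function name find_winner, the toy weights and the row-major numbering of the grid are assumptions made for illustration and are not taken from the paper.

```python
import math


def distance(p, w):
    """Euclidean distance between an input vector and a weight vector."""
    return math.sqrt(sum((pi - wi) ** 2 for pi, wi in zip(p, w)))


def find_winner(p, weights):
    """Return the index of the competitive-layer neuron closest to p."""
    distances = [distance(p, w) for w in weights]
    return distances.index(min(distances))


# Toy 2 x 2 competitive layer with 3 inputs; one weight vector per neuron.
n_cols = 2
weights = [
    [0.1, 0.1, 0.1],
    [0.9, 0.1, 0.1],
    [0.1, 0.9, 0.1],
    [0.9, 0.9, 0.9],
]
p = [0.8, 0.2, 0.1]

winner = find_winner(p, weights)
row, col = divmod(winner, n_cols)  # assumed row-major numbering of the plane
print(winner, (row, col))  # prints: 1 (0, 1)
```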
2.2 The learning rule and working steps of SOFM
The Kohonen learning rule develops from the Instar rule. For an Instar model whose output is 0 or 1,
the weight matrix is modified only when the output is 1, which gives the Kohonen learning rule:
$$\Delta W_{ij} = lr \cdot (P_j - W_{ij}) \qquad (1)$$
where $lr$ is the learning rate.
Suppose that the input vectors of the network are $p^k = (p_1^k, p_2^k, \dots, p_N^k)$,
$k = 1, 2, \dots, q$. The output vector of the competitive layer is
$A_j = (A_{j1}, A_{j2}, \dots, A_{jm})$, $j = 1, 2, \dots, m$. Here $p^k$ is a continuous vector and
$A_j$ is a numerical value. The weight vector that connects the input neurons to output neuron $j$ is
$W_j = (w_{j1}, w_{j2}, \dots, w_{jN})$, $i = 1, 2, \dots, N$; $j = 1, 2, \dots, M$.
The working steps of the network are as follows:
(1) Initialization. Each connection weight { Wij } is assigned a random value in the range from 0.0
to 1.0, and the learning rate η(t) and the neighborhood Ng(t) are given the initial values
η(0) (0 < η(0) < 1) and Ng(0).
(2) Feed the network with an input vector $p^k$ and normalize it with formula (2).
$$\bar{p}^k = \frac{p^k}{\|p^k\|} = \frac{(p_1^k, p_2^k, \dots, p_N^k)}{\left[(p_1^k)^2 + (p_2^k)^2 + \dots + (p_N^k)^2\right]^{1/2}} \qquad (2)$$
(3) Normalize each connection weight vector with formula (3), then compute the distance $d_j$ between
the input vector and each connection weight vector with formula (4).
$$\bar{W}_j = \frac{W_j}{\|W_j\|} = \frac{(w_{j1}, w_{j2}, \dots, w_{jN})}{\left[(w_{j1})^2 + (w_{j2})^2 + \dots + (w_{jN})^2\right]^{1/2}} \qquad (3)$$
$$d_j = \left[\sum_{i=1}^{N} \left(\bar{p}_i^k - \bar{w}_{ji}\right)^2\right]^{1/2}, \quad j = 1, 2, \dots, M \qquad (4)$$
(4) Find the winning neuron g, the one with the minimum distance dg = min [ d j ], j = 1, 2, … , M,
to the input vector p.
(5) Adjust the connection weights with formula (5). The weights adjusted are those that connect the
input neurons to all the neurons of the neighborhood in the competitive layer.
$$w_{ji}(t+1) = w_{ji}(t) + \eta(t)\left(\bar{p}_i^k - w_{ji}(t)\right), \quad j \in N_g(t),\ j = 1, 2, \dots, M,\ 0 < \eta(t) < 1 \qquad (5)$$
where η(t) is the learning rate at time t.
(6) Feed the network with a new learning vector and return to step (2), until all the vectors have
been presented to the network and the network converges.
(7) Update the learning rate η(t) and the neighborhood Ng(t) with formulas (6) and (7).
$$\eta(t) = \eta(0)\left(1 - \frac{t}{T}\right) \qquad (6)$$
$$N_g(t) = \mathrm{INT}\left[N_g(0)\left(1 - \frac{t}{T}\right)\right] \qquad (7)$$
where t is the number of learning passes completed, T is the total number of learning passes, and
INT[x] denotes taking the integer part of x.
(8) Set t = t + 1 and go to step (2), until t = T.
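The working steps above can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in the paper (which relies on the MATLAB toolbox of reference [1]): the function names train_sofm and classify, the fixed random seed, the row-major numbering of the competitive layer and the square neighborhood (all neurons within Ng(t) rows and columns of the winner) are all assumptions, and the toy data are not from Table 1.

```python
import math
import random


def normalize(v):
    """Scale a vector to unit length, as in formulas (2) and (3)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def distance(p, w):
    """Euclidean distance, as in formula (4)."""
    return math.sqrt(sum((pi - wi) ** 2 for pi, wi in zip(p, w)))


def train_sofm(samples, rows, cols, total_steps, eta0=0.5, ng0=1, seed=0):
    """Train a rows x cols competitive layer and return its weight vectors."""
    rng = random.Random(seed)
    n_inputs = len(samples[0])
    # Step (1): random initial weights in [0, 1).
    weights = [[rng.random() for _ in range(n_inputs)] for _ in range(rows * cols)]
    for t in range(total_steps):  # step (8): repeat until t = T
        # Step (7): linearly decaying learning rate and neighborhood radius.
        eta = eta0 * (1 - t / total_steps)
        radius = int(ng0 * (1 - t / total_steps))
        for sample in samples:  # step (6): present every learning vector
            p = normalize(sample)  # step (2)
            weights = [normalize(w) for w in weights]  # step (3)
            dists = [distance(p, w) for w in weights]
            g = dists.index(min(dists))  # step (4): winning neuron
            g_row, g_col = divmod(g, cols)
            for j, w in enumerate(weights):  # step (5): update the neighborhood
                j_row, j_col = divmod(j, cols)
                # Assumed square neighborhood around the winner.
                if max(abs(j_row - g_row), abs(j_col - g_col)) <= radius:
                    weights[j] = [wi + eta * (pi - wi) for wi, pi in zip(w, p)]
    return weights


def classify(sample, weights):
    """Return the index of the neuron that wins for the given sample."""
    p = normalize(sample)
    dists = [distance(p, normalize(w)) for w in weights]
    return dists.index(min(dists))


# Toy data: two groups of vectors pointing in different directions.
toy = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.9]]
trained = train_sofm(toy, rows=1, cols=2, total_steps=20)
print([classify(s, trained) for s in toy])
```

Samples that end up with the same winning neuron are treated as one class, which is how a table of winning neurons can be read as a clustering result.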
3. The application of the SOFM network in clustering analyses of paddy field soil
To analyse paddy field soil with the SOFM network, the soil parameters must first be measured and
the network structure selected.
3.1 Measurement of the soil parameters [2]
The samples were taken in some provinces of south China. A soil shear instrument was used to measure
the strength of the soil; its outer diameter is 210, its inside diameter is 82 and its height is
100. Many factors influence the soil. Considering the combined action of these factors, five
parameters were selected in this article: sand content, clay content, water content, cohesion and
internal friction angle. The clustering analysis of paddy field soil is carried out on these five
parameters. The measurement results are shown in Table 1.
3.2 SOFM model structure for clustering analysis of paddy field soil
Using the principles in section 2, a SOFM network model for classifying soil was established. The
input layer of the model has 5 neurons and the competitive layer has a 6×4 structure. Because the
number of training steps affects the clustering performance of the network, the number of training
steps was set to 10 and to 5000, and the classification performance was observed in each case.
3.3 Analysis of the clustering result
To confirm the reliability of soil analysis by the SOFM network method, the data of the 16 samples
in Table 1 (5 parameters per sample) were normalized with the formula
$$\frac{X_i - X_{\min}}{X_{\max} - X_{\min}}$$
and the normalized data were used as the training samples (also called the input vectors) to train
the network. The training output for 10 and 5000 training steps is shown in Table 2. With 10 training
steps, the samples are divided into 5 classes, as listed in Table 3.
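The normalization formula can be applied to the Table 1 data as in the sketch below. The paper does not state whether X_min and X_max are taken per parameter or over the whole table; the sketch assumes per parameter (column-wise), and the names TABLE_1 and min_max_normalize are hypothetical.

```python
# Table 1 values, one row per sample (1 to 16), in the column order:
# sand content (%), clay content (%), water content (%),
# cohesion (kPa), internal friction angle (degrees).
TABLE_1 = [
    [3.0, 50.3, 34.7, 7.55, 2.8],
    [1.0, 68.3, 58.2, 8.63, 15.0],
    [2.0, 59.6, 59.9, 7.26, 7.9],
    [7.1, 32.1, 52.3, 4.71, 18.2],
    [24.9, 45.5, 61.2, 10.99, 0.2],
    [19.9, 28.8, 55.1, 7.07, 17.6],
    [12.8, 32.6, 61.2, 2.61, 12.9],
    [10.4, 40.6, 47.5, 3.00, 26.5],
    [15.9, 35.7, 63.7, 2.65, 14.9],
    [12.8, 47.6, 46.2, 3.60, 24.3],
    [9.4, 53.8, 56.7, 4.62, 31.1],
    [1.1, 60.1, 35.2, 6.76, 15.6],
    [4.9, 35.1, 58.1, 4.54, 19.5],
    [6.5, 39.1, 48.3, 7.41, 10.4],
    [0.1, 51.2, 55.3, 5.26, 12.8],
    [46.0, 30.0, 35.0, 5.26, 24.1],
]


def min_max_normalize(rows):
    """Rescale every column to [0, 1] with (X - Xmin) / (Xmax - Xmin).

    Assumption: Xmin and Xmax are taken separately for each parameter.
    A constant column is mapped to 0.0 to avoid dividing by zero.
    """
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [
        [(x - lo) / (hi - lo) if hi > lo else 0.0
         for x, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


normalized = min_max_normalize(TABLE_1)
print([round(x, 3) for x in normalized[0]])  # normalized parameters of sample 1
```

Scaling each parameter separately keeps a parameter with a wide numeric range, such as clay content, from dominating the distance in formula (4).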
From the clustering result we can see that the first class is mainly the paddy field soil from
Guizhou to Jiangsu in the Yangtse Valley; the second is the paddy field soil of Sichuan and Hunan in
the Yangtse Valley; the third is the paddy field soil of Jiangsu and Shanghai in the middle and lower
reaches of the Yangtse River; the fourth is the paddy field soil of Yunnan; the fifth is the paddy
field soil of Sichuan, Guizhou and Hainan.

Table 1 Soil sample source and the parameters for fuzzy clustering relating to soil strength
Number  Sample source        Sand content  Clay content  Water content  Cohesion  Internal friction
                             (%)           (%)           (%)            (kPa)     angle (°)
1       Wujin, Jiangsu       3.0           50.3          34.7           7.55      2.8
2       Guanyun, Jiangsu     1.0           68.3          58.2           8.63      15.0
3       Wuhan, Hubei         2.0           59.6          59.9           7.26      7.9
4       Yueyang, Hunan       7.1           32.1          52.3           4.71      18.2
5       Kunming, Yunnan      24.9          45.5          61.2           10.99     0.2
6       Chengdu, Sichuan     19.9          28.8          55.1           7.07      17.6
7       Guanghan, Sichuan    12.8          32.6          61.2           2.61      12.9
8       Jianyang, Sichuan    10.4          40.6          47.5           3.00      26.5
9       Baxian, Sichuan      15.9          35.7          63.7           2.65      14.9
10      Kaili, Guizhou       12.8          47.6          46.2           3.60      24.3
11      Chaohu, Anhui        9.4           53.8          56.7           4.62      31.1
12      Guiyang, Guizhou     1.1           60.1          35.2           6.76      15.6
13      Wuhu, Anhui          4.9           35.1          58.1           4.54      19.5
14      Songjiang, Shanghai  6.5           39.1          48.3           7.41      10.4
15      Chongming, Shanghai  0.1           51.2          55.3           5.26      12.8
16      Haikou, Hainan       46.0          30.0          35.0           5.26      24.1

Table 2 Table of training result
Training step  Training result (samples 1 to 16, in order)
10             3 5 5 4 1 4 4 24 4 24 5 5 5 3 5 24
5000           1 14 13 11 19 24 21 5 22 5 4 1 16 8 15 12

Table 3 The result table of classifying sample
Category number  Sample number
1                2 3 12 13 15
2                4 6 7 9
3                1 14
4                5
5                8 10 11 16
In literature [2], classes 1, 2 and 3 are merged into a single class, whereas the SOFM network
divides them into three classes. All three are paddy field soils of the Yangtse Valley and are
similar to one another, which shows that the SOFM network has the merit of high classification
accuracy. In literature [2], samples 8, 10 and 11 were classified into the same class, the paddy
field soils of Sichuan and Guizhou, and sample 16, the soil of Hainan, forms a class on its own.
Compared with classes 1, 2 and 3, the fourth and fifth classes are soils of scattered distribution.
The clustering result shows that similar soils are divided into the same class and that they belong
to the same river basin.
4. Conclusions
In this paper, a Self-organizing Feature Map (SOFM) network is applied to the analysis of paddy field
soil. Once the relevant data are provided, the network trains itself through self-learning and
competition and outputs the classification of similar samples by itself, showing good
self-organization and robustness.
References
[1] MATLAB 6.5 Assistant Analysis and Design of Neural Networks. Compiled by Feisi Technology
Product Development Research Center. Beijing: Electronic Industries Publishing Company, 2003.1
[2] Changying Ji, Zhixiong Lu … Fuzzy clustering of paddy soils relating to soft layer and soil
strength. Journal of Nanjing Agricultural University, 2000, 23(1): 101-104
[3] Ying Li, Zhihong Qie … Stabilization Analysis of Side-slope Based on Self-organizing Map Neural
Net. ISDA 2006: 37-41