Clustering Analyses of Paddy Field Soil Based on Self-organizing Feature Map Net

LI Bin1, QIE Zhihong1*
1 Department of Water Conservancy Engineering, Agriculture University of Hebei, Baoding 071001, Hebei, China
* Corresponding author: [email protected]

Abstract
A Self-organizing Feature Map (SOFM) network was applied to the clustering analysis of paddy field soil. A SOFM network model was established, then trained and tested on example data. The results show that the SOFM network performs well in the clustering analysis of paddy field soil, gives high precision, and is easy to run. The method is therefore an effective approach to the clustering analysis of paddy field soil.

Key Words
Self-organizing Feature Map (SOFM); Clustering analysis; System clustering; Fuzzy clustering

1. Introduction
Soil classification groups soils by a defined standard, following the laws of soil formation and development and the natural properties of the soil. It is the means by which soil is understood scientifically, differentiated systematically, and used rationally. Soil classification also marks the level of development of soil science. It is the foundation of soil survey and mapping, the basis for adapting agricultural practice to local conditions and for extending agricultural techniques, and the medium for exchanging soil information at home and abroad. Clustering analysis of soil is therefore necessary.
Clustering analysis of paddy field soil starts from the attributes of the samples themselves. Mathematical methods are used to determine quantitatively how closely the samples are related, according to a chosen similarity or dissimilarity measure, and the samples are then clustered by that degree of relatedness. Common methods include system clustering, dynamic clustering, and fuzzy clustering.

System clustering is widely used at present, both in China and abroad. It first treats each of the N samples as a class of its own, then finds the two closest classes and merges them into one, which leaves N − 1 classes. The process is repeated, reducing the number of classes each time, until only two classes remain. The method is feasible, but the work is tedious. Fuzzy clustering analysis provides a scientific method for soil classification and has found wider application. During fuzzy clustering analysis, however, different people may calibrate the data with different methods, so the fuzzy similarity matrices they establish differ, and the final classifications differ in part.

This paper introduces a neural network with an automatic clustering function, the Self-organizing Feature Map (SOFM). The example data in reference [2] are used to analyze paddy field soil, and the result is compared with that of fuzzy clustering.

2. The theory of the SOFM network
The SOFM network [1] is also called the Kohonen feature map. It was proposed by the Finnish scholar Kohonen in 1981. The SOFM neural network model was inspired by modeling studies of biological systems.
In the visual cortex of the human brain, a small region responds to stimulation from the external environment. Based on this feature, Kohonen built the SOFM network model to simulate the feedback behavior of the brain's visual cells. He proposed that neighboring cells in a neural network interact and compete with each other and finally adapt themselves to the external environment, becoming specialized detectors that respond to different information. Such a network has strong resistance to interference. This unsupervised neural network can learn to detect regularities and correlations in its input and adapt its future responses to that input. The SOFM network learns to classify input vectors according to how they are grouped in the input space, using the competitive learning rule. During training, the SOFM network learns both the distribution and the topology of the input vectors. As a result of training, a neuron and its neighbors become sensitive to one class of input.

2.1 Self-Organizing Feature Map model [1]
The SOFM network is composed of an input layer and a competitive layer, as shown in Figure 1. The input layer consists of N neurons arranged as a one-dimensional sequence; the number of input nodes equals the dimension of the input space. The competitive layer, which is also the output layer, has m × n neurons arranged in a plane. After training, different nodes in the output layer represent different classes. Each element of the input vector P is connected to each neuron in the competitive layer through the weight matrix W. The weight vector of each neuron describes how close that neuron is to the input vector: the more similar they are, the smaller the distance between them, and the more easily the neuron wins the competition.
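The competition described above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the 2 × 2 competitive layer, the 3-dimensional weight values, the input vector, and the helper names `find_winner` and `grid_position` are all hypothetical and do not come from the paper.

```python
import math


def find_winner(p, weights):
    """Return (index, distance) of the competitive neuron closest to p.

    p       -- input vector of length N
    weights -- list of weight vectors, one per competitive neuron
    """
    best_index, best_distance = -1, math.inf
    for j, w in enumerate(weights):
        # Euclidean distance between the input and the neuron's weight vector.
        d = math.sqrt(sum((pi - wi) ** 2 for pi, wi in zip(p, w)))
        if d < best_distance:
            best_index, best_distance = j, d
    return best_index, best_distance


def grid_position(index, n_cols):
    """Map a flat neuron index to (row, column) on an m-by-n competitive layer."""
    return divmod(index, n_cols)


# Hypothetical 2-by-2 competitive layer with 3-dimensional inputs.
weights = [
    [0.1, 0.2, 0.3],
    [0.9, 0.8, 0.7],
    [0.5, 0.5, 0.5],
    [0.0, 1.0, 0.0],
]
winner, distance = find_winner([0.85, 0.8, 0.75], weights)
print(winner, grid_position(winner, n_cols=2))
```

The neuron whose weight vector lies nearest to the input wins, which is the sense in which "more similar" means "smaller distance" in the description above.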
Figure 1. The Self-organizing Feature Map

2.2 The learning rule and working steps of SOFM
The Kohonen learning rule is derived from the instar rule. In the instar model the output is 0 or 1, and the weights are modified only when the output is 1. This gives the Kohonen learning rule:

$$\Delta W_{ij} = lr \cdot (P_j - W_{ij}) \tag{1}$$

where $lr$ is the learning rate.

Suppose that the input vectors of the network are $p_k = (p_1^k, p_2^k, \ldots, p_N^k)$, $k = 1, 2, \ldots, q$, and that the output vector of the competitive layer is $A_j = (A_{j1}, A_{j2}, \ldots, A_{jm})$, $j = 1, 2, \ldots, m$. Here $p_k$ takes continuous values and $A_j$ takes numerical values. The weight vector that connects the input neurons to output neuron $j$ is $W_j = (w_{j1}, w_{j2}, \ldots, w_{jN})$, with $i = 1, 2, \ldots, N$ and $j = 1, 2, \ldots, M$.

The working steps of the network are as follows:

(1) Initialization. Each linkage weight $w_{ji}$ is assigned a random value in the range 0.0 to 1.0. The learning rate $\eta(t)$ and the neighborhood $N_g(t)$ are given the initial values $\eta(0)$, with $0 < \eta(0) < 1$, and $N_g(0)$.

(2) Feed the network with an input vector $p_k$ and normalize it with formula (2).

$$\bar{p}_k = \frac{p_k}{\lVert p_k \rVert} = \frac{(p_1^k, p_2^k, \ldots, p_N^k)}{\left[ (p_1^k)^2 + (p_2^k)^2 + \cdots + (p_N^k)^2 \right]^{1/2}} \tag{2}$$

(3) Normalize the linkage weight vectors with formula (3), then compute the distance $d_j$ between the input vector and each linkage weight vector with formula (4).

$$\bar{W}_j = \frac{W_j}{\lVert W_j \rVert} = \frac{(w_{j1}, w_{j2}, \ldots, w_{jN})}{\left[ (w_{j1})^2 + (w_{j2})^2 + \cdots + (w_{jN})^2 \right]^{1/2}} \tag{3}$$

$$d_j = \left[ \sum_{i=1}^{N} \left( \bar{p}_i^k - \bar{w}_{ji} \right)^2 \right]^{1/2}, \quad j = 1, 2, \ldots, M \tag{4}$$

(4) Find the winning neuron $g$, the one with the minimum distance to the input vector: $d_g = \min_j [d_j]$, $j = 1, 2, \ldots, M$.

(5) Adjust the linkage weights with formula (5). The adjusted weights are those that connect the input neurons to all the neurons in the neighborhood of the winner in the competitive layer.

$$w_{ji}(t+1) = \bar{w}_{ji}(t) + \eta(t) \left[ \bar{p}_i^k - \bar{w}_{ji}(t) \right], \quad j \in N_g(t), \quad j = 1, 2, \ldots, M, \quad 0 < \eta(t) < 1 \tag{5}$$

where $\eta(t)$ is the learning rate at time $t$.
(6) Feed the network with a new learning vector and return to step (2), until all the vectors have been input to the network.

(7) Update the learning rate $\eta(t)$ and the neighborhood $N_g(t)$ with formulas (6) and (7).

$$\eta(t) = \eta(0) \left( 1 - \frac{t}{T} \right) \tag{6}$$

$$N_g(t) = \mathrm{INT}\left[ N_g(0) \left( 1 - \frac{t}{T} \right) \right] \tag{7}$$

where $t$ is the number of learning iterations completed, $T$ is the total number of learning iterations, and $\mathrm{INT}[x]$ denotes taking the integer part of $x$.

(8) Set $t = t + 1$ and go to step (2), until $t = T$.

3. The application of the SOFM network in clustering analyses of paddy field soil
To analyze paddy field soil with the SOFM network, the soil parameters must first be measured and the network structure selected.

3.1 Measurement of the soil parameters [2]
The samples were taken in several provinces of southern China. A soil shear instrument was used to measure the soil strength; its outer diameter is 210, its inner diameter is 82, and its height is 100. Many factors influence the soil. Considering their combined action, five parameters were selected in this paper: sand content, clay content, water content, cohesion, and internal friction angle. The clustering analysis of paddy field soil is based on these five parameters. The measurement results are shown in Table 1.

3.2 SOFM model structure for clustering analysis of paddy field soil
Using the principle in Section 2, a SOFM network model for classifying soil was established. The input layer of the model has 5 neurons, and the competitive layer has a 6×4 structure. Because the number of training steps affects the clustering performance of the network, the network was trained for 10 steps and for 5000 steps, and the classification performance was observed in each case.
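The working steps of Section 2.2, applied to the 5-input, 6×4 structure of Section 3.2, can be sketched in Python as follows. This is a minimal sketch and not the authors' implementation. The initial learning rate `eta0`, the initial radius `radius0`, the square grid neighbourhood, the random seed, and the two sample vectors are assumptions made for illustration; the paper does not state these values.

```python
import math
import random


def normalize(v):
    """Scale v to unit Euclidean length, as in formulas (2) and (3)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def learning_rate(t, total_steps, eta0):
    """Formula (6): linear decay of the learning rate."""
    return eta0 * (1 - t / total_steps)


def neighbourhood(t, total_steps, radius0):
    """Formula (7): linear decay of the neighbourhood radius, truncated."""
    return int(radius0 * (1 - t / total_steps))


def winner(p, weights):
    """Step (4): index of the neuron nearest to p.

    Minimizing the squared distance selects the same neuron as formula (4).
    """
    return min(range(len(weights)),
               key=lambda j: sum((a - b) ** 2 for a, b in zip(p, weights[j])))


def train_sofm(samples, rows=6, cols=4, total_steps=10,
               eta0=0.5, radius0=2, seed=0):
    """Train a rows-by-cols SOFM; eta0, radius0 and seed are assumed values."""
    rng = random.Random(seed)
    dim = len(samples[0])
    # Step (1): random initial weights in [0, 1).
    weights = [[rng.random() for _ in range(dim)] for _ in range(rows * cols)]
    for t in range(total_steps):                       # step (8)
        eta = learning_rate(t, total_steps, eta0)      # step (7)
        radius = neighbourhood(t, total_steps, radius0)
        for p in samples:                              # step (6)
            p_bar = normalize(p)                       # step (2)
            weights = [normalize(w) for w in weights]  # step (3)
            g_row, g_col = divmod(winner(p_bar, weights), cols)
            for j, w in enumerate(weights):            # step (5)
                row, col = divmod(j, cols)
                # Assumed neighbourhood: square window around the winner.
                if max(abs(row - g_row), abs(col - g_col)) <= radius:
                    weights[j] = [wi + eta * (pi - wi)
                                  for wi, pi in zip(w, p_bar)]
    return weights


# Two hypothetical 5-parameter samples, already scaled to [0, 1].
samples = [[0.1, 0.9, 0.2, 0.8, 0.3], [0.9, 0.1, 0.8, 0.2, 0.7]]
weights = train_sofm(samples, total_steps=10)
unit_weights = [normalize(w) for w in weights]
labels = [winner(normalize(p), unit_weights) for p in samples]
print(labels)
```

Each label is the index of a winning neuron in the 24-neuron competitive layer. Samples that share a label fall into the same class.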
3.3 Analysis of the clustering result
To confirm the reliability of soil analysis by the SOFM network method, the data of the 16 samples in Table 1 (5 parameters per sample) were normalized with the formula $(X_i - X_{\min}) / (X_{\max} - X_{\min})$. The normalized data were used as the training samples (also called the input vectors) to train the network. The training results for 10 and 5000 training steps are shown in Table 2.

Table 1  Soil sample source and the parameters for fuzzy clustering relating to soil strength

Number  Sample source        Sand content (%)  Clay content (%)  Water content (%)  Cohesion (kPa)  Internal friction angle (°)
1       Wujin, Jiangsu        3.0              50.3              34.7                7.55            2.8
2       Guanyun, Jiangsu      1.0              68.3              58.2                8.63           15.0
3       Wuhan, Hubei          2.0              59.6              59.9                7.26            7.9
4       Yueyang, Hunan        7.1              32.1              52.3                4.71           18.2
5       Kunming, Yunnan      24.9              45.5              61.2               10.99            0.2
6       Chengdu, Sichuan     19.9              28.8              55.1                7.07           17.6
7       Guanghan, Sichuan    12.8              32.6              61.2                2.61           12.9
8       Jianyang, Sichuan    10.4              40.6              47.5                3.00           26.5
9       Baxian, Sichuan      15.9              35.7              63.7                2.65           14.9
10      Kaili, Guizhou       12.8              47.6              46.2                3.60           24.3
11      Chaohu, Anhui         9.4              53.8              56.7                4.62           31.1
12      Guiyang, Guizhou      1.1              60.1              35.2                6.76           15.6
13      Wuhu, Anhui           4.9              35.1              58.1                4.54           19.5
14      Songjiang, Shanghai   6.5              39.1              48.3                7.41           10.4
15      Chongming, Shanghai   0.1              51.2              55.3                5.26           12.8
16      Haikou, Hainan       46.0              30.0              35.0                5.26           24.1

Table 2  Training results (winning neuron for samples 1 to 16, in order)

Training step  Training result
10             3  5  5  4  1  4  4  24  4  24  5  5  5  3  5  24
5000           1  14  13  11  19  24  21  5  22  5  4  1  16  8  15  12

With 10 training steps, the samples are divided into 5 classes, as listed in Table 3.

Table 3  Classification of the samples

Category number  Sample number
1                2  3  12  13  15
2                4  6  7  9
3                1  14
4                5
5                8  10  11  16

The clustering result shows the following. The first class is mainly the paddy field soil from Guizhou to Jiangsu in the Yangtse Valley. The second is the paddy field soil of Sichuan and Hunan in the Yangtse Valley. The third is the paddy field soil of Jiangsu and Shanghai in the middle and lower reaches of the Yangtse River. The fourth is the paddy field soil of Yunnan. The fifth is the paddy field soil of Sichuan, Guizhou, and Hainan.

In reference [2], classes 1, 2, and 3 are merged into a single class, whereas the SOFM network divides them into three. All three are paddy field soils of the Yangtse Valley and are similar to one another, which shows the high classification accuracy of the SOFM network. In reference [2], samples 8, 10, and 11 were classified into one class; they are paddy field soils of Sichuan and Guizhou. Sample 16, the soil of Hainan, forms a class by itself. Compared with classes 1, 2, and 3, the fourth and fifth classes are soils with a scattered distribution. The clustering result shows that similar soils are assigned to the same class and that they belong to the same river basin.
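How the winning-neuron indices of Table 2 become classes can be shown with a short Python sketch. The two lists are copied from Table 2 as printed; the helper name `group_by_winner` is hypothetical. Samples that share a winning neuron are placed in the same class.

```python
from collections import defaultdict

# Winning neuron of samples 1..16, copied from Table 2 as printed.
winners_step_10 = [3, 5, 5, 4, 1, 4, 4, 24, 4, 24, 5, 5, 5, 3, 5, 24]
winners_step_5000 = [1, 14, 13, 11, 19, 24, 21, 5, 22, 5, 4, 1, 16, 8, 15, 12]


def group_by_winner(winners):
    """Map each winning-neuron index to the list of 1-based sample numbers."""
    groups = defaultdict(list)
    for sample_number, neuron in enumerate(winners, start=1):
        groups[neuron].append(sample_number)
    return dict(groups)


groups_10 = group_by_winner(winners_step_10)
groups_5000 = group_by_winner(winners_step_5000)

for neuron in sorted(groups_10):
    print(f"neuron {neuron:2d}: samples {groups_10[neuron]}")
print("classes after 10 steps:", len(groups_10))
print("classes after 5000 steps:", len(groups_5000))
```

Grouping the 10-step row gives five classes, in line with the text. Read literally, that row places sample 11 with neuron 5, while Table 3 lists sample 11 in category 5; the sketch only reports what the Table 2 row implies. The 5000-step row yields a much finer partition, with most samples winning a neuron of their own.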
4 Conclusions
In this paper, a Self-organizing Feature Map (SOFM) network is applied to the analysis of paddy field soil. Given the relevant data, the network classifies similar samples by itself through self-training, self-learning, and competition, and it shows good self-organization and robustness.

References
[1] Feisi technology product development research center (compiler). MATLAB 6.5 Assistant Analysis and Design of Neural Networks. Beijing: Electronic Industries publishing company, 2003.1.
[2] Changying Ji, Zhixiong Lu … Fuzzy clustering of paddy soils relating to soft layer and soil strength. Journal of Nanjing Agricultural University, 2000, 23(1): 101-104.
[3] Ying Li, Zhihong Qie … Stabilization Analysis of Side-slope Based on Self-organizing Map Neural Net. ISDA 2006: 37-41.