Advances in Environmental Biology, 8(1) January 2014, Pages: 185-189
AENSI Journals
Advances in Environmental Biology
Journal home page: http://www.aensiweb.com/aeb.html

Using Self-Organizing Map (SOM) algorithm for Analysts’ equities clustering

1 Nasim Osouli, 2 Esmaeil Khoshbakht, 3 Zinat Ansari, 4 Mahdi Kazemi

1 Hafez Institute of Higher Education, Shiraz, Iran.
2,3,4 Department of Accounting, Shiraz Branch, Islamic Azad University, Shiraz, Iran.

ARTICLE INFO
Article history:
Received 19 November 2013
Received in revised form 18 December 2013
Accepted 29 December 2013
Available online 4 March 2014

Key words: Self-Organizing Map (SOM), equities classification

ABSTRACT
The Self-Organizing Map (SOM) is a vector quantization method that places prototype vectors on a regular low-dimensional grid in an ordered fashion. This makes the SOM a powerful visualization tool. In this article, the SOM was used for equities classification. The results show that the SOM algorithm is a classification method that can be applied in different fields, including equities classification.

© 2014 AENSI Publisher All rights reserved.

To Cite This Article: Nasim Osouli, Esmaeil Khoshbakht, Zinat Ansari, Mahdi Kazemi., Using Self-Organizing Map (SOM) algorithm for Analysts’ equities clustering. Adv. Environ. Biol., 8(1), 185-189, 2014

INTRODUCTION

The SOM is a very popular artificial neural network algorithm based on competitive, unsupervised learning. It is primarily used for dimensionality reduction and for visualizing nonlinear relations in multidimensional data. The SOM projects high-dimensional data onto a lower dimension, typically 2D, while preserving the relationships among the input data, which makes it a data-mining tool of choice [1-3]. This nonlinear projection produces a 2D pattern map that is useful for analyzing and discovering patterns in the input space.
The ability of neural networks to discover nonlinear relationships in input data makes them well suited to modeling dynamic systems such as the stock market. Neural networks, with their ability to derive meaning from complicated or imprecise data, can be used to extract patterns and detect trends or correlations that are too complex to be noticed by either humans or other computer techniques.

At this point, we believe it is necessary to clarify the difference between the terms clustering and feature clustering. Clustering, in its traditional sense, is the unsupervised classification of observations (patterns, data items or feature vectors) into groups (clusters). Each of these observations has a set of attributes (features) that characterize it and define the dimensionality of the data space. Emerging data mining applications [4-7] place special requirements on clustering techniques, such as the ability to handle high dimensionality, assimilation of cluster descriptions, and usability. In high-dimensional data, an object typically has from dozens to hundreds of attributes, and the domains of those attributes are large. Feature clustering, on the other hand, can be defined as a technique that clusters the features characterizing the observations, forming groups of features that are to some degree dependent on each other and/or correlated.

This study uses different classification methods to find the best solutions and to create a framework for forecasting equities.

Objective:
The objective of this study is to present a model for forecasting equities on the Tehran Stock Exchange from 2005 to 2012 using the Self-Organizing Map (SOM).

MATERIAL AND METHOD

The Self-Organizing Map (SOM) is a clustering method that can be regarded as an unsupervised variant of the Artificial Neural Network (ANN). It trains the network with competitive learning: the nodes compete among themselves to display the strongest activation for a given input.
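The competitive step described above can be sketched in a few lines of Python. This is only an illustration, not the authors' implementation: the function names `squared_distance` and `best_matching_unit`, the 2x2 grid, and the toy weight values are all assumptions made for the example. Each node holds a weight vector, and the node whose weights lie closest to the input (smallest squared Euclidean distance) wins the competition.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def best_matching_unit(weights, x):
    """Return the grid position (row, col) of the node closest to input x.

    weights: dict mapping (row, col) -> weight vector (list of floats).
    x: input vector with the same length as each weight vector.
    """
    return min(weights, key=lambda pos: squared_distance(weights[pos], x))


# Hypothetical toy map: a 2x2 grid of nodes with 2 features each.
toy_weights = {
    (0, 0): [0.1, 0.2],
    (0, 1): [0.9, 0.1],
    (1, 0): [0.2, 0.8],
    (1, 1): [0.7, 0.9],
}

# The input is nearest to node (1, 1), so that node wins the competition.
winner = best_matching_unit(toy_weights, [0.8, 1.0])
print(winner)  # (1, 1)
```

Storing the map as a dictionary keyed by grid position keeps the sketch short; a real application with many nodes and features would more likely use an array library.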
Corresponding Author: Zinat Ansari, Department of Accounting, Shiraz Branch, Islamic Azad University, Shiraz, Iran. Tel: +98-917-1048553; E-mail: [email protected]

Fig. 1: A sample of Self-Organizing Map (SOM)

The SOM algorithm is as follows:
• Select output layer network topology
  – Initialize current neighborhood distance, D(0), to a positive value
• Initialize weights from inputs to outputs to small random values
• Let t = 1
• While computational bounds are not exceeded do
  1) Select an input sample $i_1$
  2) Compute the square of the Euclidean distance of $i_1$ from the weight vectors ($w_j$) associated with each output node
  3) Select the output node j* whose weight vector has the minimum value from step 2)
  4) Update the weights of all nodes within a topological distance given by D(t) from j*, using the weight update rule:
     $w_j(t+1) = w_j(t) + \eta(t)\,(i_1 - w_j(t))$
  5) Increment t
End while
The learning rate generally decreases with time:
     $0 < \eta(t) \le \eta(t-1) \le 1$
From Mehrotra et al. (1997), p. 189

U-matrix (Unified distance matrix):
The U-matrix representation of the Self-Organizing Map visualizes the distances between the neurons. The distance between adjacent neurons is calculated and shown as a coloring between the adjacent nodes. A dark coloring between neurons corresponds to a large distance and thus a gap between the codebook values in the input space. A light coloring between neurons signifies that the codebook vectors are close to each other in the input space. Light areas can be thought of as clusters and dark areas as cluster separators. This is a helpful presentation when one tries to find clusters in the input data without any a priori information about the clusters.
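The training steps and the U-matrix described above can be put together in a short Python sketch. It rests on several assumptions that are not taken from the paper: the function names (`train_som`, `u_matrix`), a linearly decaying learning rate and neighborhood distance, Chebyshev distance on the grid as the topological distance, a U-matrix value defined as the mean distance to the 4-connected grid neighbors, and the toy data. It is meant to show the shape of the procedure, not to reproduce the study's results.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train_som(data, rows, cols, n_iter=500, eta0=0.5, d0=None, seed=0):
    """Train a small rectangular SOM; returns dict (row, col) -> weight vector."""
    rng = random.Random(seed)
    dim = len(data[0])
    if d0 is None:
        d0 = max(rows, cols) / 2.0  # assumed initial neighborhood distance D(0)
    # Initialize weights to small random values.
    weights = {(r, c): [rng.uniform(0.0, 0.1) for _ in range(dim)]
               for r in range(rows) for c in range(cols)}
    for t in range(1, n_iter + 1):
        frac = 1.0 - (t - 1) / n_iter  # assumed linear decay, stays above 0
        eta = eta0 * frac              # learning rate decreases with time
        radius = d0 * frac             # neighborhood distance D(t) shrinks too
        x = rng.choice(data)           # 1) select an input sample
        # 2)-3) the node with minimum squared Euclidean distance wins
        br, bc = min(weights, key=lambda p: sq_dist(weights[p], x))
        # 4) move every node within grid distance D(t) of the winner toward x
        for (r, c), w in weights.items():
            if max(abs(r - br), abs(c - bc)) <= radius:
                for k in range(dim):
                    w[k] += eta * (x[k] - w[k])
    return weights


def u_matrix(weights, rows, cols):
    """Mean Euclidean distance from each node to its 4-connected grid neighbors."""
    umat = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            nbrs = [(r + dr, c + dc)
                    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1))
                    if (r + dr, c + dc) in weights]
            total = sum(sq_dist(weights[(r, c)], weights[n]) ** 0.5 for n in nbrs)
            umat[r][c] = total / max(len(nbrs), 1)  # guard for a 1x1 map
    return umat


# Hypothetical toy data: two well-separated groups of 2-feature observations.
toy = [[0.0, 0.1], [0.1, 0.0], [0.05, 0.05],
       [0.9, 1.0], [1.0, 0.9], [0.95, 0.95]]
som = train_som(toy, rows=1, cols=4)
umat = u_matrix(som, rows=1, cols=4)
for row in umat:
    print([round(v, 3) for v in row])
```

In the printed U-matrix, larger values mark nodes that sit far from their grid neighbors in the input space (the "dark" separators), while smaller values mark nodes inside a cluster (the "light" areas).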
Variables:
These are the variables needed, with vectors in bold:
$s$ is the current iteration,
$\lambda$ is the iteration limit,
$t$ is the index of the target input data vector in the input data set $\mathbf{D}$,
$\mathbf{D}(t)$ is a target input data vector,
$v$ is the index of the node in the map,
$\mathbf{W}_v$ is the current weight vector of node $v$,
$u$ is the index of the best matching unit (BMU) in the map,
$\theta(u, v, s)$ is a restraint due to distance from the BMU, usually called the neighborhood function, and
$\alpha(s)$ is a learning restraint due to iteration progress.

RESULTS AND DISCUSSION

The SOM has proven useful in many applications and is one of the most popular neural network models. It belongs to the category of competitive learning networks and is based on unsupervised learning, which means that no human intervention is needed during learning and that little needs to be known about the characteristics of the input data. The SOM can therefore be used to cluster data without knowing the class memberships of the input data. In this study, the Self-Organizing Map algorithm was used for equities clustering, based on the 28 factors shown below.

Cash and inventories of bank
Short term investments
Accounts and notes received of commercial
Other accounts and notes received of commercial
Stock of during construction
Stock of commodity made
Stock of parts and Spare parts
Other inventories
Stock of materials and goods
Orders and prepayments
Prepayment
Long-Term Investments
Intangible Assets
Other Assets
Investments and Other Assets
Property and equipment
Commercial accounts and notes payables
Other accounts and notes payables
Prereceipts
Provision for tax
Financial credit
Stock Dividend Payable
Long-term financial liabilities
Serve termination benefits of employee
Non-current liabilities
Capital
Accumulated gains and losses
Deposit
Legal reserve

Fig. 2: SOM topology
Fig. 3: SOM neighbor weight distances
Fig. 4: Weights for 28 factors
Fig. 6: SOM weight position

Conclusions:
In this paper, the SOM has been briefly introduced. The SOM is an excellent tool for visualizing high-dimensional data. In this article, the SOM was used for equities classification. The results show that the SOM algorithm is a classification method that can be applied in different fields, including equities classification.

REFERENCES

[1] Flexer, A., 1999. On the use of self-organizing maps for clustering and visualization. In Principles of Data Mining and Knowledge Discovery, pp: 80-88.
[2] Himberg, J., J. Ahola, E. Alhoniemi, J. Vesanto and O. Simula, 2001. The self-organizing map as a tool in knowledge engineering.
[3] Vesanto, J., J. Himberg, E. Alhoniemi and J. Parhankangas, 1999. Self-organizing map in Matlab: the SOM Toolbox. Proceedings of the Matlab DSP Conference 1999, Espoo, Finland, November 16-17, pp: 35-40.
[4] Barraquand, J., D. Martineau, 1995. Numerical valuation of high dimensional multivariate American securities. Journal of Financial and Quantitative Analysis, 30(03): 383-405.
[5] Clark, R., H.W. Ressom, A. Wang, J. Xuan, M. Liu, E. Gehan, Y. Wang, 2008. The properties of high-dimensional data spaces: implications for exploring gene and protein expression data. Nature Reviews Cancer, 8: 37-49.
[6] Kumar, V., 2001. Data Mining for Scientific and Engineering Applications.
[7] Kohonen, T., 2001. Self-Organizing Maps. Springer-Verlag New York, Inc., Secaucus, NJ, USA.