Download Advances in Environmental Biology Nasim Osouli,

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts
no text concepts found
Transcript
Advances in Environmental Biology, 8(1) January 2014, Pages: 185-189
AENSI Journals
Advances in Environmental Biology
Journal home page: http://www.aensiweb.com/aeb.html
Using Self-Organizing Map (SOM) algorithm for Analysts’ equities clustering
1
Nasim Osouli, 2Esmaeil Khoshbakht, 3Zinat Ansari, 4Mahdi Kazemi
1
Hafez Institute of Higher Education, Shiraz, Iran.
Department of Accounting, Shiraz Branch, Islamic Azad University, Shiraz, Iran.
2,3,4
ARTICLE INFO
Article history:
Received 19 November 2013
Received in revised form 18
December 2013
Accepted 29 December 2013
Available online 4 March 2014
ABSTRACT
The Self-Organizing Map (SOM) is a vector quantization method which places the
prototype vectors on a regular low-dimensional grid in an ordered fashion. This makes
the SOM a powerful visualization tool. In this article, the SOM was used for equities
classification. The result show that SOM algorithm is one of methods for classification
that use in different science such as equities classification.
Key words:
Self-Organizing Map (SOM),
equities classification
© 2014 AENSI Publisher All rights reserved.
To Cite This Article: Nasim Osouli, Esmaeil Khoshbakht, Zinat Ansari, Mahdi Kazemi., Using Self-Organizing Map (SOM) algorithm for
Analysts’ equities clustering. Adv. Environ. Biol., 8(1), 185-189, 2014
INTRODUCTION
The SOM is a very popular artificial neural network algorithm based on competitive and unsupervised
learning, and is primarily used for the visualization of nonlinear relations of multidimensional data and
dimensionality reduction. The SOM is able to project high-dimensional data in a lower dimension, typically 2D,
while preserving the relationships among the input data, thus electing it as a data-mining tool of choice [1-3].
This non-linear projection produces a 2D pattern map that can be useful in analyzing and discovering patterns in
the input space. The ability of neural networks to discover nonlinear relationships in input data makes them
ideal for modeling dynamic systems as the stock market. Neural networks, with their remarkable ability to
derive meaning from complicated or imprecise data, can be used to extract patterns and detect trends or
correlations that are too complex to be noticed by either humans or other computer techniques. At this point, we
believe it is necessary to clarify the differences between the terms clustering and feature clustering: Clustering,
in its traditional sense, the unsupervised classification of observations (patterns, data items or feature vectors)
into groups (clusters). Each of these observations has a set of attributes (features) that characterizes it and define
the dimensionality of the data-space. Emerging data mining applications [4-7] place special requirements on
clustering techniques, such as the ability to handle high dimensionality, assimilation of cluster descriptions and
usability.
Regarding the clustering of high dimensional data, an object typically has from dozens to hundreds of
attributes in which the domains of the attributes are large. Feature clustering; on the other hand, can be defined
as a technique to cluster the features that characterize the observations, forming groups of features that are
somewhat dependent of each other and/or correlated. This study uses different classification methods to find the
best solutions and also to create the framework for predicting forecast equities.
Objective:
The objective of this study is to present a model in order to forecast equities in Tehran Stock Exchange
from 2005 to 2012 using Self-Organizing Map (SOM).
MATERIAL AND METHOD
Self-Organizing Map (SOM) is a clustering method considered as an unsupervised variation of the Artificial
Neural Network (ANN). It uses competitive learning techniques to train the network (nodes compete among
themselves to display the strongest activation to a given data).
Corresponding Author: Zinat Ansari, Department of accounting, Shiraz Branch Islamic Azad University, Shiraz, Iran.
Tel: +98-917-1048553; E-mail: [email protected]
186
Zinat Ansari et al, 2014
Advances in Environmental Biology, 8(1) January 2014, Pages: 185-189
Fig.1: A sample of Self-Organizing Map (SOM)
The SOM algorithm show in the following:
• Select output layer network topology
– Initialize current neighborhood distance, D(0), to a positive value
• Initialize weights from inputs to outputs to small random values
• Let t = 1
• While computational bounds are not exceeded do
1) Select an input sample i1
2) Compute the square of the Euclidean distance of i1
from weight vectors (wj) associated with each output node
3) Select output node j* that has weight vector with minimum value from
step 2)
4) Update weights to all nodes within a topological distance given by D(t)
from j*, using the weight update rule:
5) Increment t
End while Learning rate generally decreases with time:
From Mehotra et al. (1997), p. 189
U-matrix (Unified distance matrix):
U-matrix representation of the Self-Organizing Map visualizes the distances between the neurons. The
distance between the adjacent neurons is calculated and presented with different colorings between the adjacent
nodes. A dark coloring between the neurons corresponds to a large distance and thus a gap between the
codebook values in the input space. A light coloring between the neurons signifies that the codebook vectors are
close to each other in the input space. Light areas can be thought as clusters and dark areas as cluster separators.
This can be a helpful presentation when one tries to find clusters in the input data without having any a priori
information about the clusters.
Variables:
These are the variables needed, with vectors in bold: is the current iteration, is the iteration limit, t is
the index of the target input data vector in the input data set D, D(t) is a target input data vector, v is the index of
the node in the map, Wv is the current weight vector of node v, u is the index of the best matching unit (BMU) in
the map,
is a restraint due to distance from BMU, usually called the neighborhood function, and,
is a learning restraint due to iteration progress.
187
Zinat Ansari et al, 2014
Advances in Environmental Biology, 8(1) January 2014, Pages: 185-189
RESULTS AND DISCUSSION
The SOM has been proven useful in many applications One of the most popular neural network models. It
belongs to the category of competitive learning networks. Based on unsupervised learning, which means that no
human intervention is needed during the learning and that little needs to be known about the characteristics of
the input data. Use the SOM for clustering data without knowing the class memberships of the input data. In the
study used Self-Organizing Map algorithm for equities clustering. In the study used 28 factors that shown in
following.
Cash and inventories of bank
Short term investments
Accounts and notes received of commercial
Other accounts and notes received of commercial
Stock of during construction
Stock of commodity made
Stock of parts and Spare parts
Other inventories
Stock of materials and goods
Orders and prepayments
Prepayment
Long-Term Investments
Intangible Assets
Other Assets
Investments and Other Assets
Property and equipment
Commercial accounts and notes payables
Other accounts and notes payables
Prereceipts
Provision for tax
Financial credit
Stock Dividend Payable
Long-term financial liabilities
Serve termination benefits of employee
Non-current liabilities
Capital
Accumulated gains and losses
Deposit Legal reserve
188
Zinat Ansari et al, 2014
Advances in Environmental Biology, 8(1) January 2014, Pages: 185-189
Fig. 2: SOM topology
Fig. 3: SOM neighbor weight distances
Fig. 4: Weights for 28 factors
189
Zinat Ansari et al, 2014
Advances in Environmental Biology, 8(1) January 2014, Pages: 185-189
Fig. 6: SOM weight position
Conclusions:
In this paper, the SOM has been shortly introduced. The SOM is an excellent tool in the visualization of
high dimensional data. In this article, the SOM was used for equities classification. The result show that SOM
algorithm is one of methods for classification that use in different science such as equities classification.
REFERENCES
[1]
[2]
[3]
[4]
[5]
[6]
[7]
Flexer, A., 1999. On the use of self-organizing maps for clustering and visualization. In Principles of Data
Mining and Knowledge Discovery, pp: 80-88.
Himberg, J., J. Ahola, E. Alhoniemi, J. Vesanto and O. Simula, 2001. The self-organizing map as a tool in
knowledge engineering.
Vesanto, J., Johan Himberg, Esa Alhoniemi and Juha Parhankangas, 1999. Self-organizing map in Matlab:
the SOM Toolbox. Proceedings of the Matlab DSP conference 1999, Espoo, Finland, NOVEMBER 16-17,
PP: 35-40.
Barraquand, J., D. Martineau, 1995. Numerical valuation of high dimensional multivariate American
securities. Journal of Financial and Quantitative Analysis, 30(03): 383-405.
Clark, R., H.W. Ressom, A. Wang, J. Xuan, M. Liu, E. Gehan, Y. Wang, 2008. The properties of highdimensional data spaces: implications for exploring gene and protein expression data. Nature Reviews
Cancer, 8: 37-49.
Kumar, V., 2001. Data Mining for Scientific and Engineering Applications.
Kohonen, T., 2001. Self-Organizing Maps. Springer-Verlag New York, Inc., Secaucus, NJ, USA.