Higher Order Neural Network Group Models for Financial Simulation*
Ming Zhang, Jing Chun Zhang
John Fulcher
Department of Computing &
Information Systems
University of Western Sydney, Macarthur
Campbelltown, NSW 2560, Australia
Email: [email protected]
[email protected]
School of Information Technology &
Computer Science
University of Wollongong
Wollongong, NSW 2522 Australia
Email: [email protected]
* This work was performed while the authors were the recipients of a grant from Fujitsu Research
Laboratories, Japan.
ABSTRACT
Real world financial data is often discontinuous and non-smooth. If we attempt to use
neural networks to simulate such data, then accuracy becomes a problem; neural
network group models perform this task much better. Both Polynomial Higher Order
Neural network Group (PHONG) and Trigonometric polynomial Higher Order Neural
network Group (THONG) models are developed. These HONG models are open box,
convergent models capable of approximating any kind of piecewise continuous function,
to any degree of accuracy. Moreover, they are capable of handling higher frequency,
higher order nonlinear and discontinuous data. Results obtained using a Higher Order
Neural network Group financial simulator are presented, which confirm that HONG
group models converge without difficulty, and are considerably more accurate than
conventional neural network models (more specifically, around twice as good for
prediction, and a factor of four improvement in the case of simulation).
1. Introduction
Traditional statistical approaches to the modelling and prediction of financial systems
have met with only limited success (Burns, 1986; Peters, 1991). Azema-Barac & Refenes
(1997) cite the limitations of statistical modelling approaches (such as linear regression,
autoregression, Box-Jenkins AutoRegressive/Moving Average and the like) as:
(i) only a limited number of determinants of any given asset price are analyzed at
any one time,
(ii) the relationship(s) between asset prices and their determinants vary over time,
and
(iii) many of the rules which govern asset prices are qualitative, or at best fuzzy
(some would argue even chaotic – see Peters, op. cit.).
As a result, researchers have turned to alternative approaches. In this context, Artificial
Neural Networks – ANNs – have received a lot of attention in recent times. This is
evidenced by the increase in the number and frequency of specialist conferences (e.g.
IEEE/IAFE Computational Intelligence for Financial Engineering), special journal issues
(e.g. Volume 8 of Intl. J. Neural Systems on Data Mining in Finance), and indeed
specialist journals (e.g. J. Computational Intelligence in Finance), as well as specialist
books (e.g. Trippi & Turbon, 1993; Azoff, 1994; Vemuri & Rogers, 1994; Pham & Liu,
1995). A good summary of the application of ANNs to financial applications is presented
in Azema-Barac & Refenes (op.cit.).
Such interest could well be motivated by the inductive nature of ANNs - in other words
their ability to infer complex non-linear relationships between input and output variables
(relationships which may be too complex to model by conventional means).
Not surprisingly, a lot of the attention has focused on MLPs (Chakraborty et al., 1992;
Azoff, op.cit.). Many other models have been investigated however, including:
(i) Recurrent (Pham & Liu, 1992 [viz. Elman Net]; Tenti, 1995),
(ii) Probabilistic (Zaknich & Attikiouzel, 1991; Tan et al., 1995),
(iii) Radial Basis Functions (Wedding & Cios, 1996),
(iv) Cascade Correlation (Ensley & Nelson, 1992),
(v) CounterPropagation (Schumann & Lohrbach, 1993),
(vi) General Regression Neural Net (Chen, 1992),
(vii) Trigonometric ANNs (Megson, 1993),
(viii) Group Method of Data Handling [Adaline] (Pham & Liu, 1995),
(ix) Modular ANNs (Kimoto et al., 1990),
(x) NeuroFuzzy (Wong et al., 1991; Hobbs & Bourbakis, 1995).
Conventional ANN models are incapable of handling discontinuities
$\{f(x) \neq \lim_{\Delta x \to 0} f(x + \Delta x)\}$ in the input training data.
They suffer from two further limitations:
(i) they do not always perform well because of the complexity (higher frequency
and higher order nonlinearity) of the economic data being simulated, and
(ii) the neural networks function as “black boxes”, and thus are unable to provide
explanations for their behaviour.
This latter characteristic is seen as a disadvantage by users, who prefer to be presented
with a rationale for the simulation being generated.
In an effort to overcome these limitations, interest has recently been expressed in using
Higher Order Neural Network – HONN - models for economic data simulation (Hornik
1991; Hu 1992; Karayiannis 1993; Redding 1993). Such models are able to provide
information concerning the basis of the economic data they are simulating, and hence can
be considered as ‘open box’ rather than ‘black box’ solutions. Furthermore, HONN
models are also capable of simulating higher frequency and higher order nonlinear data,
thus producing superior economic data simulations, compared with those derived from
ANN-based models.
Financial experts often use polynomials or linear combinations of trigonometric functions
for modelling and reasoning. If we are able to develop HONN models capable of
simulating these functions, then users would be able to view them as “open boxes”, and
thus be more readily inclined to accept their results.
This was the motivation therefore for developing the Polynomial HONN model for
economic data simulation (Zhang et al., 1995). Zhang & Fulcher (1996a) extended this
idea to Group PHONN models for financial data simulation, while Zhang, Zhang &
Fulcher (1997a, 1997b) developed trigonometric PHONNG models for financial
prediction.
The first problem we need to address is to devise a neural network structure that will not
only act as an open box to simulate modeling and reasoning functions, but which will
also facilitate learning algorithm convergence. Now polynomial and trigonometric
polynomial functions are continuous by their very nature. If the financial data being
analyzed vary in a continuous and smooth fashion with respect to time, then such
functions can be effective. However, in the real world such variation is more likely to be
discontinuous and non-smooth. Thus if we use continuous and smooth approximation
functions then accuracy will obviously be a problem. We subsequently demonstrate how
it is possible to simulate discontinuous functions, to any degree of accuracy, using neural
network group theory, even at points of discontinuity.
2. Neural Network Groups
The neural network hierarchical model devised by Willcox (1991) consists of binary-state
neurons grouped into clusters, and can be analyzed using a Renormalization Group
approach. Zhang & Scofield (1994) used neural network groups in the development of a
rainfall estimation expert system. Zhang & Fulcher (1996b) developed GAT - a neural
network Group based Adaptive Tolerance tree - to recognize human faces. A neural
network group model for rainfall estimation was subsequently developed by Zhang,
Fulcher, & Scofield (1997). These results, together with Naimark’s (1982) earlier theory
of group representations and Yang’s (1990) work with neuron groups, can be used as the
basis for Higher Order Neural Network Group models, which we shall now proceed to
develop.
2.1 Artificial Neural Network (ANN) Groups
The artificial neural network set (Zhang, Fulcher, & Scofield, 1997) is a set in which
every element is an ANN. The generalized artificial neural network set, $\mathbf{NN}$, is the
union of the product set $\mathbf{NN}^{*}$ and the additive set $\mathbf{NN}^{+}$. A nonempty set $N$ is called a
neural network group if $N \subset \mathbf{NN}$ (the generalized neural network set), and the sum
$n_i + n_j$ or product $n_i \cdot n_j$ is defined for every two elements $n_i, n_j \in N$.
2.2 Neural Network Group Features
Hornik (1991) proved the following general result:
“Whenever the activation function is continuous, bounded and nonconstant, then for an
arbitrary compact subset $X \subset \mathbb{R}^n$, standard multilayer feedforward networks can
approximate any continuous function on X arbitrarily well with respect to uniform distance,
provided that sufficiently many hidden units are available”.
A more general result was proved by Leshno (1993):
“A standard multilayer feedforward network with a locally bounded piecewise continuous
activation function can approximate any continuous function to any degree of accuracy if
and only if the network's activation function is not a polynomial”.
Zhang, Fulcher, & Scofield (op.cit.) used an inductive proof to show that a similar
characteristic exists for artificial neural network groups.
3. HONN Group Models
Higher Order Neural Network models use trigonometric, linear, multiply, power and
other neuron functions based on the polynomial form:
$$z = \sum_{k_1,k_2=0}^{n} a_{k_1 k_2}\,[f_1(a^{x}_{k_1 k_2}\,x)]^{k_1}\,[f_2(a^{y}_{k_1 k_2}\,y)]^{k_2}
   = \sum_{k_1,k_2=0}^{n} (a^{o}_{k_1 k_2})\,\{a^{hx}_{k_1 k_2}[f_1(a^{x}_{k_1 k_2}x)]^{k_1}\}\,\{a^{hy}_{k_1 k_2}[f_2(a^{y}_{k_1 k_2}y)]^{k_2}\} \quad [3.1]$$

where:
$a_{k_1 k_2} = (a^{o}_{k_1 k_2})(a^{hx}_{k_1 k_2})(a^{hy}_{k_1 k_2})$
Output Layer Weights: $(a^{o}_{k_1 k_2})$
Second Hidden Layer Weights: $(a^{x}_{k_1 k_2})$ and $(a^{y}_{k_1 k_2})$
First Hidden Layer Weights: $(a^{hx}_{k_1 k_2})$ and $(a^{hy}_{k_1 k_2})$
Choosing a different function $f_i$ results in a different higher order neural network model.
Let $f_1 = f_2 = 1$; then

$$z = \sum_{k_1,k_2=0}^{n} a_{k_1 k_2}\,[f_1(a^{x}_{k_1 k_2}\,x)]^{k_1}\,[f_2(a^{y}_{k_1 k_2}\,y)]^{k_2}
   = \sum_{k_1,k_2=0}^{n} (a^{o}_{k_1 k_2})\,\{a^{hx}_{k_1 k_2}(a^{x}_{k_1 k_2}x)^{k_1}\}\,\{a^{hy}_{k_1 k_2}(a^{y}_{k_1 k_2}y)^{k_2}\} \quad [3.2]$$
This is the Polynomial Higher Order Neural Network (PHONN) model.
Let $f_1 = \sin(x)$, $f_2 = \cos(y)$; then

$$z = \sum_{k_1,k_2=0}^{n} a_{k_1 k_2}\,[f_1(a^{x}_{k_1 k_2}\,x)]^{k_1}\,[f_2(a^{y}_{k_1 k_2}\,y)]^{k_2}
   = \sum_{k_1,k_2=0}^{n} (a^{o}_{k_1 k_2})\,\{a^{hx}_{k_1 k_2}[\sin(a^{x}_{k_1 k_2}x)]^{k_1}\}\,\{a^{hy}_{k_1 k_2}[\cos(a^{y}_{k_1 k_2}y)]^{k_2}\} \quad [3.3]$$
This is the Trigonometric polynomial Higher Order Neural Network (THONN) model.
Let $f_1 = 1/(1+e^{-x})$, $f_2 = 1/(1+e^{-y})$; then

$$z = \sum_{k_1,k_2=0}^{n} a_{k_1 k_2}\,[f_1(a^{x}_{k_1 k_2}\,x)]^{k_1}\,[f_2(a^{y}_{k_1 k_2}\,y)]^{k_2}
   = \sum_{k_1,k_2=0}^{n} (a^{o}_{k_1 k_2})\,\{a^{hx}_{k_1 k_2}[1/(1+e^{-a^{x}_{k_1 k_2}x})]^{k_1}\}\,\{a^{hy}_{k_1 k_2}[1/(1+e^{-a^{y}_{k_1 k_2}y})]^{k_2}\} \quad [3.4]$$
This is the Sigmoid polynomial Higher Order Neural Network (SHONN) model.
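For illustration, equations [3.1]-[3.4] share a single evaluation procedure and differ only in the choice of f1 and f2. The following minimal Python sketch (not part of the C-language simulator described later; all variable names are illustrative) evaluates the double sum directly:

    import math

    def honn_output(x, y, a_o, a_x, a_y, a_hx, a_hy, f1, f2, n):
        # Equation [3.1]: z = sum over k1,k2 = 0..n of
        #   (a_o[k1][k2]) * {a_hx[k1][k2] * [f1(a_x[k1][k2]*x)]^k1}
        #                 * {a_hy[k1][k2] * [f2(a_y[k1][k2]*y)]^k2}
        z = 0.0
        for k1 in range(n + 1):
            for k2 in range(n + 1):
                hx = a_hx[k1][k2] * f1(a_x[k1][k2] * x) ** k1
                hy = a_hy[k1][k2] * f2(a_y[k1][k2] * y) ** k2
                z += a_o[k1][k2] * hx * hy
        return z

    # PHONN (eq. 3.2): f1 = f2 = identity; THONN (eq. 3.3): f1 = sin, f2 = cos;
    # SHONN (eq. 3.4): f1 = f2 = sigmoid.
    identity = lambda t: t
    sigmoid = lambda t: 1.0 / (1.0 + math.exp(-t))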
Higher Order Neural network Group (HONG) is one kind of neural network group, in
which each element is a higher order neural network, such as PHONN or THONN.
Based on Section 2.1, we have:
$$\mathrm{HONG} \subset N \quad [3.5]$$

where: $\mathrm{HONG} = \{\mathrm{PHONN}, \mathrm{THONN}, \mathrm{SHONN}, \ldots\}$
In the following section, we describe two different HONG models.
4. Polynomial Higher Order Neural Network Group Model
4.1 PHONN Model#2
PHONN Model#2 uses both logarithmic and sigmoid neurons (earlier models used only
linear or polynomial neurons – viz. Models#0 & #1, respectively):
$$z = \sum_{k_1,k_2=0}^{n} (a^{o}_{k_1 k_2})[(a^{x}_{k_1 k_2})x]^{k_1}[(a^{y}_{k_1 k_2})y]^{k_2}
   = \sum_{k_1,k_2=0}^{n} a_{k_1 k_2}\,x^{k_1} y^{k_2}
   = \ln[z'/(1-z')] \quad [4.1]$$

where $a_{k_1 k_2} = (a^{o}_{k_1 k_2})[(a^{x}_{k_1 k_2})^{k_1}][(a^{y}_{k_1 k_2})^{k_2}]$ and $z' = 1/(1 + e^{-z})$
PHONN Model#2 is shown in Figure 1. It contains two layers of weights which can be
trained – those in the input layer connecting (1, y & x), and those in the output layer
connected to the sigmoid neuron. Note that some of the neurons in the first layer are
nonlinear. This model is more accurate than the earlier models, and because it uses both
logarithmic and sigmoid neurons, a threshold is not necessary in the output layer.
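As a minimal sketch of the forward pass implied by equation [4.1] (illustrative Python, not the simulator's C code; the weight layout is an assumption), the polynomial sum is passed through the sigmoid output neuron, and the logarithm neuron recovers z from z':

    import math

    def phonn2_forward(x, y, a_o, a_x, a_y, n):
        # Polynomial sum of equation [4.1]
        z = sum(a_o[k1][k2] * (a_x[k1][k2] * x) ** k1 * (a_y[k1][k2] * y) ** k2
                for k1 in range(n + 1) for k2 in range(n + 1))
        z_prime = 1.0 / (1.0 + math.exp(-z))               # sigmoid neuron
        z_recovered = math.log(z_prime / (1.0 - z_prime))  # logarithm neuron, ln[z'/(1-z')]
        return z_prime, z_recovered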
Figure 1 - Polynomial Higher Order Neural Network Model#2 (network diagram; legend: sigmoid, logarithm, linear, power and multiply neurons; unity-valued fixed weights and trainable weights $a_{k_1 k_2}$)
4.2 PHONG Model
The PHONG model is a PHONN model#2 Group. It is a Piecewise Function Group of
Polynomial Higher Order Neural Networks, and is defined as follows:
$$Z = \{z_1, z_2, z_3, \ldots, z_i, z_{i+1}, z_{i+2}, \ldots\} \quad [4.2]$$

where

$z_i = \ln[z_i'/(1 - z_i')]$, $z_i \in K_i \subset \mathbb{R}^n$ ($K_i$ is a compact set), $z_i' = 1/(1 + e^{-z_i})$

$$z_i = \sum_{k_1,k_2=0}^{n} (a^{o}_{i k_1 k_2})[(a^{x}_{i k_1 k_2})x]^{k_1}[(a^{y}_{i k_1 k_2})y]^{k_2}
     = \sum_{k_1,k_2=0}^{n} a_{i k_1 k_2}\,x^{k_1} y^{k_2}$$

$a_{i k_1 k_2} = (a^{o}_{i k_1 k_2})[(a^{x}_{i k_1 k_2})^{k_1}][(a^{y}_{i k_1 k_2})^{k_2}]$
In the PHONG Model (Piecewise Function Group), group addition is defined as the
piecewise function:
$$Z = \begin{cases}
z_1, & z_1 \in K_1 \\
z_2, & z_2 \in K_2 \\
\vdots & \\
z_i, & z_i \in K_i \\
z_{i+1}, & z_{i+1} \in K_{i+1} \\
\vdots &
\end{cases}$$

where: $z_i \in K_i \subset \mathbb{R}^n$, $K_i$ is a compact set
The PHONG Model (Figure 2) is an open and convergent model which can approximate
any kind of piecewise continuous function to any degree of accuracy, even at
discontinuous points (or regions).
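The piecewise group addition above can be read operationally: each member network is responsible for one compact set $K_i$, and the group output is taken from whichever member's set contains the current point. The Python sketch below is one such reading (the $K_i$ are interpreted here as regions over which the members are tested; member networks and region tests are placeholders, not the simulator's implementation):

    def phong_output(x, y, members, regions):
        # members: list of trained PHONN Model#2 evaluation functions z_i(x, y)
        # regions: parallel list of predicates deciding whether (x, y) belongs to K_i
        for z_i, in_region in zip(members, regions):
            if in_region(x, y):
                return z_i(x, y)
        raise ValueError("input lies outside every region K_i")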
Figure 2 - Polynomial Higher Order Neural Network Group Model (diagram: the group output Z is formed from PHONN Model#2 members Z1, Z2, ..., Zi, Zi+1, ..., each with inputs 1, x, y)
5. Trigonometric Polynomial Higher Order Neural Network Groups
5.1 Trigonometric Polynomial Higher Order Neural Network Model #1B
We developed the following Trigonometric polynomial Higher Order Neural Network
(THONN):
$$\mathrm{thonn}(x,y) = \sum_{j=1}^{m}\sum_{i=1}^{n}\left(a_i \sin^{j}(ix)\, b_i \cos^{j}(iy) + c_i \sin^{j}(i(x+y)) + d_i \cos^{j}(i(x+y))\right) + a \quad [5.1]$$

where $a_i$, $b_i$, $c_i$, $d_i$ are the weights of the higher order trigonometric neural
network, and $j$ is the power order ($j = 1, 2, \ldots, m$).
When j is sufficiently large, this model is able to simulate higher order nonlinear data.
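As a concrete reading of equation [5.1], the following illustrative Python sketch evaluates the sum directly (the constant term written a in [5.1] is denoted a0 here, and the indexing convention is ours):

    import math

    def thonn(x, y, a0, a, b, c, d, m, n):
        # Equation [5.1]: sum over j = 1..m (power order) and i = 1..n of
        #   a_i*sin^j(i*x)*b_i*cos^j(i*y) + c_i*sin^j(i*(x+y)) + d_i*cos^j(i*(x+y)),
        # plus the constant term a0.  a, b, c, d are lists indexed 1..n (element 0 unused).
        total = a0
        for j in range(1, m + 1):
            for i in range(1, n + 1):
                total += (a[i] * math.sin(i * x) ** j * b[i] * math.cos(i * y) ** j
                          + c[i] * math.sin(i * (x + y)) ** j
                          + d[i] * math.cos(i * (x + y)) ** j)
        return total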
THONN model#1b is shown in Figure 3.
Figure 3 - Trigonometric Polynomial Higher Order Neural Network THONN Model#1B (network diagram; legend: linear neurons, cos(x)/sin(y) neurons, cos²(x)/sin²(y) neurons, multiply neurons; fixed unity weights and trainable weights $a_{k_1 k_2}$)
5.2 Trigonometric Polynomial Higher Order Neural network Group (THONG) Model
In order to handle discontinuities in the input training data, the trigonometric polynomial
higher order neural network Group - THONG - model has also been developed.
A similar model to that of Figure 2 results, in which every element is a trigonometric
polynomial higher order neural network - THONN (Zhang & Fulcher, 1996a).
The domain of the THONN inputs is the n-dimensional real space $\mathbb{R}^n$. Likewise, the
THONN outputs belong to the m-dimensional real space $\mathbb{R}^m$. The neural network
function $f$ constitutes a mapping from the inputs of THONN to its outputs (equation 5.2):

$$\mathrm{THONN} \in \mathrm{THONG} \quad [5.2]$$

where: $\mathrm{THONN} = f: \mathbb{R}^n \to \mathbb{R}^m$
Based on the inference of Zhang, Fulcher & Scofield (1997), each such neural network
group can approximate any kind of piecewise continuous function, and to any degree of
accuracy. Hence, THONG is also able to simulate discontinuous data.
6. Higher Order Neural Network Group Financial Simulation System
The above concepts have been incorporated into a Higher Order Neural network Group -
HONG - financial simulation system. This system comprises two parts, one being a
Polynomial Higher Order Neural Network Simulator - PHONNSim, and the other a
Trigonometric polynomial Higher Order Neural Network Simulator - THONNSim.
This HONG financial simulation system was written in the C language, runs under
X-Windows on a SUN workstation, and incorporates a user-friendly Graphical User
Interface. Any step, data or calculation can be reviewed and modified dynamically in
different windows. At the top of the THONG simulator main window, there are three
pull-down menus, namely Data, Translators and Neural Network (Figure 4).
Each of these offers several options; selecting an option creates another window for
further processing. For instance, once we have selected some data via the Data menu,
two options are presented for data loading and graphical display.
Figure 4 – Main Window of THONNsim
Data is automatically loaded when the Load option is selected. Alternatively, the
Display option displays the data not only in graphical form, but also transformed (i.e.
rotation, elevation, grids, smooth, influence, and so on). The Translators menu is
used to convert the selected raw data into network form, while the Neural Network
menu is used to convert the data into a nominated model (an example of which appears in
Figure 5). These two menus allow the user to select different models and data, in order to
generate and compare results.
Figure 5 – Network Model Sub-Window
All the steps mentioned above can be performed simply using a mouse. Hence, changing
data, network models and comparing results can be achieved easily and efficiently.
There are more than twelve windows and sub-windows in this visual system; both the
system mode and its operation can be viewed dynamically, in terms of:
• input/output data,
• neural network models,
• coefficients/parameters, and so on.

Figure 6 – Load data File Sub-Window
The system operates as a general neural network system, and includes the following
functions:
• load a data file (Figure 6),
• load a neural network model,
• generate a definition file (Figure 7),
• write a definition file,
• save report,
• save coefficients (Figure 8), etc.

Figure 7 – Generate Definition File Sub-Window
The 'System mode' windows allow the user to view - in real time - how the neural
network model learns from the input training data (i.e. how it extracts the weight values).
When the system is running, the following system mode windows can be opened
simultaneously from within the main window:
• 'Display Data (Training, Evolved, Superimposed, and Difference)',
• 'Show/Set Parameters',
• 'Network Model (including all weights)' and
• 'Coefficients'.

Figure 8 – Coefficients Sub-Window
Thus every aspect of the system’s operation is able to be viewed graphically.
A particularly useful feature of the system is that one is able to view the mode, modify it,
or alternatively change other parameters in real time. For example, when the user chooses
the 'Display Data' window to view the input training data file, they can change the
graph's format for the most suitable type of display (i.e. modify the graph's rotation,
elevation, grids, smoothing, and influence).
During data processing, the 'Display Data' window offers four different modes to display
the results, which can be changed in real time, namely: 'Training', 'Evolved',
'Superimposed', and 'Difference' (using the same format as set for the input data), as
indicated in Figure 9.
• 'Training' displays the data set used to train the network.
• 'Evolved' displays the data set produced by the network (and is unavailable if a network definition file has not been loaded).
• 'Superimposed' displays both the training and the evolved data sets together (so they can be directly compared in the one graph).
• 'Difference' displays the difference between the 'Training' and the 'Evolved' data sets.

Figure 9 – Graph Sub-Window
The 'Rotation' command changes the angle of rotation at which the wire frame mesh is
projected onto the screen. This allows the user to 'fly' around the wire frame surface. The
default value is 30 degrees, but is adjustable from 0° to 355° in increments of 5°, with
wrap-around from 360° to 0° (this value can be simply adjusted with either the up/down
buttons or by entering a number directly).
'Elevation' changes the angle of elevation at which the wire frame mesh is projected onto
the screen. This allows the user to 'fly' either above or below the wire-frame surface
(usage is similar to Rotation).
The 'Grids' command changes the number of wires in the wire frame mesh. It is
adjustable from 6 to 30, using either the up/down buttons, or by directly entering a
number. Low grid numbers allow fast display, but with decreased resolution; higher
numbers provide a more accurate rendition of the surface, but at the cost of increased
display time.
If the user is not satisfied with the results and wants a better outcome (a higher degree of
model accuracy), they can stop the processing, and set new values for the model
parameters, such as learning rate, momentum, error threshold, and random seed. The
neural network model can be easily changed as well.
As is usual with neural network software, the operating procedure is as follows:
Step 1. Data pre-processing (encoding)
Step 2. Load & view data
Step 3. Choose & load neural network model
Step 4. Show/Set the network parameters
Step 5. Run the program
Step 6. Check the results:
    if satisfactory then go to Step 7
    else go to Step 3
Step 7. Save & export the results.
Step 8. Data decoding (post-processing)
In these eight steps, there are two basic requirements that must be satisfied before the
PHONG/THONG program is able to start running: the user must have loaded both some
input training data and a network.
7. Preliminary Testing of the PHONG/THONG Simulator
7.1 Comparison of PHONG/THONG with QwikNet
QwikNet is a commercially available financial software package. It is a powerful, yet
easy-to-use, ANN simulator available via the Internet (http://www.kagi.com/cjensen/).
QwikNet implements several powerful methods to train and test a standard feedforward
neural network in a graphical environment.
QwikNet allows the user to specify the training algorithm, from amongst Back-Propagation Online, Randomize Patterns, Back-Propagation Batch, Delta Bar Delta,
RPROP and QUICKPROP (in some cases the choice of the training method can have a
substantial effect on the speed and accuracy of training). The best choice is dependent on
the problem at hand, with trial-and-error usually being needed to determine the best
method.
QwikNet is suited to most problems from the simplest to the most demanding, and has
been used in many disciplines including science, engineering, financial prediction, sport,
artificial intelligence and medicine.
7.2 Network Configuration
The following QwikNet demonstration data files were used in this comparison:
IRIS4-1.TRN, IRIS4-3.TRN, SINCOS.TRN, XOR.TRN, XOR4SIG.TRN, SP500.TRN,
SPIRALS.TRN and SECURITY.TRN. 'Back-propagation' was used to train both THONG
and QwikNet, with a 'Learning Rate' of 0.1, a single 'Hidden Layer' and 'Maximum #
Epochs' set to 100,000 (similar findings apply for PHONG).
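For reference, these comparison settings can be collected in a small configuration snippet (the dictionary form below is ours; in the original study the values were entered through the THONG and QwikNet graphical interfaces):

    # Settings used for the THONG-vs-QwikNet comparison (Section 7.2)
    TRAINING_CONFIG = {
        "algorithm": "back-propagation",
        "learning_rate": 0.1,
        "hidden_layers": 1,
        "max_epochs": 100000,
        "training_files": [
            "IRIS4-1.TRN", "IRIS4-3.TRN", "SINCOS.TRN", "XOR.TRN",
            "XOR4SIG.TRN", "SP500.TRN", "SPIRALS.TRN", "SECURITY.TRN",
        ],
    }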
7.3 Input Training Data Files
7.3.1 IRIS4-1.TRN and IRIS4-3.TRN Input Data Files
Both 'Iris4-1.trn' and ‘Iris4-3.trn’ comprise essentially the same input data, however the
former has one output, while the latter has three. The data consists of 150 samples taken
from 3 different types of Iris plants. Each sample contains four features of the plant and
the task is to determine the type of Iris plant based on these features (50 samples from
each plant).
Figure 10 - Features of the Input Training Data Files: Iris4-1.TRN (top) & Iris4-3.TRN (bottom) (plots of Input_1 to Input_4 and Output against input/output number)
Figure 10 shows the main characteristics of these input data files - namely high frequency
components and discontinuities.
7.3.2 SINCOS.TRN Input Data File
This input training data is linear and continuous. The first output is sin (x) and the second
cos(x) (Figure 11). The objective is to learn sin(x) and cos(x), given x.
Figure 11 – Input Training File – SINCOS.TRN (plot of Input, Output_1 and Output_2 against input/output number)
7.3.3 XOR.TRN and XOR4SIG.TRN Input Data Files
'XOR.TRN' is the standard 2-bit exclusive OR problem. It has two inputs and one output,
as indicated in Table 1.
Input#1    Input#2    Output
0.000000   0.000000   0.100000
0.000000   1.000000   0.900000
1.000000   0.000000   0.900000
1.000000   1.000000   0.100000

Table 1 - Input Training Data File - XOR.TRN
'XOR4SIG.TRN' is a four-input version of this same problem (note that 0.1-0.9 targets are
easier to reach in practice than 0-1).
Input#1    Input#2    Input#3    Input#4    Output
0.000000   0.000000   0.000000   0.000000   0.100000
0.000000   0.000000   0.000000   1.000000   0.900000
0.000000   0.000000   1.000000   0.000000   0.900000
0.000000   0.000000   1.000000   1.000000   0.100000
0.000000   1.000000   0.000000   0.000000   0.900000
0.000000   1.000000   0.000000   1.000000   0.100000
0.000000   1.000000   1.000000   0.000000   0.100000
0.000000   1.000000   1.000000   1.000000   0.900000
1.000000   0.000000   0.000000   0.000000   0.900000
1.000000   0.000000   0.000000   1.000000   0.100000
1.000000   0.000000   1.000000   0.000000   0.100000
1.000000   0.000000   1.000000   1.000000   0.900000
1.000000   1.000000   0.000000   0.000000   0.100000
1.000000   1.000000   0.000000   1.000000   0.900000
1.000000   1.000000   1.000000   0.000000   0.900000
1.000000   1.000000   1.000000   1.000000   0.100000

Table 2 - Input Training Data File XOR4SIG.TRN
7.3.4 SP500.TRN Input Data File
The 'SP500.TRN' data file comprises 28 inputs and 3 outputs. The input data consists of
market features as well as the value of the S&P index from previous months. The outputs
are the S&P index 1, 3 and 6 months into the future. In other words, the task of
'SP500.TRN' is to try to predict the S&P 500 index.
Figure 12 - Input Training Data File SP500 Input/Output (plot of representative inputs Input_27, Input_28 and Output_1 against input/output number)
Rather than list all 31 columns and 300 rows here, Figure 12 shows a couple of
representative inputs (and output) only. The single THONN model is not able to handle
such a high input dimensionality. Accordingly, the SP500.TRN data file needs to be
preprocessed prior to being presented to THONN.
7.3.5 SPIRALS.TRN Input Data File
'SPIRALS.TRN' consists of two inputs, x and y, and one output, z. The task here is to learn
to discriminate between two sets of training points which lie on two intertwined spirals in
the x-y plane, and which coil three times around the origin. This is recognized as being a
difficult task for backpropagation networks (and their derivatives). Problems such as this,
where inputs are points on the 2-D plane, are interesting because we can display the 2-D
'receptive field' of any unit in the network, as indicated in Figure 13.
7.3.6 SECURITY.TRN Input Data
The 'SECURITY.TRN' data file contains data collected from simulations of electric power
systems. The inputs are power system features (voltages, currents, generator outputs and
the like), while the output is the corresponding security classification. The latter is a
measure of the stress on the system, and thus is an indication of the likelihood of a
blackout occurring.
The objective here is to train the neural network to predict the security rating of a power
system given its current operating status. This can be performed by traditional methods
(i.e. simulation) but is computationally very demanding. Neural networks offer a much
faster solution. Once again the input dimensionality needs to be reduced so it fits into the
THONG program.
Figure 13 - Input Training Data File - SPIRALS.TRN (plot of Input 1 and Input 2 against input number)
7.4 Comparative Analysis Results
Table 3 reveals that THONG provides much better simulation accuracy (typically two to
three times smaller total RMS error compared with QwikNet).
Both the IRIS and Security input data (* in Table 3) exhibit discontinuities and are
non-smooth. Under such conditions, the THONG models result in less error than
conventional neural network models.
The other experimental results of Table 3 indicate that the THONG (& PHONG) models
are also able to simulate linear, Boolean and continuous functions, at least as well as
conventional ANN models.
Data file     THONG      QwikNet   QwikNet - THONG
*IRIS4-1      0.029      0.055     +0.026
*IRIS4-3      0.028      0.071     +0.043
SINCOS        0.011      0.038     +0.027
XOR           1.62E-14   0.078     +0.077
XOR4SIG       1.62E-14   0.037     +0.036
SP500         0.02       0.014     -0.006
SPIRALS       0.031      0.395     +0.364
*SECURITY     0.018      0.045     +0.027

Table 3 - Summary of Comparative Analysis Experiment (total RMS error)
8. Results
The HONG financial simulation system described above has been tested in earnest using
economic data extracted from the Reserve Bank of Australia Bulletin (all training data used
in this study were real world data). Training of the various ANN models described earlier is
a time consuming and computationally intensive task. We have been able to verify that the
neural network models of Sections 4 and 5 are capable of firstly interpolating
discontinuous and non-smooth economic data, while at the same time exhibiting
satisfactory learning algorithm convergence.
Each experiment below is divided into two parts - simulation and prediction. In the
former, all of the available data is used to simulate the real world system; in the latter,
half of the available data is used for training and half for testing, with the previous two
months' data used to predict a third consecutive month. On all occasions, slightly better
performance was obtained with simulation rather than prediction.
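The error figures reported below (Tables 4-6) are absolute percentage errors per month, together with their average. The exact formula is not spelled out in the paper; the following Python sketch assumes |Error|% = 100 * |actual - simulated| / actual:

    def percent_errors(actual, simulated):
        # Assumed metric: |Error|% = 100 * |actual - simulated| / actual per month,
        # plus the average of these values over the months considered.
        errors = [100.0 * abs(a - s) / a for a, s in zip(actual, simulated)]
        return errors, sum(errors) / len(errors)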
With each experiment, comparisons are made between the basic higher order ANN model
and its ANN Group counterpart. Invariably the latter yielded superior performance
(typically twice as good). The first two experiments pertain to THONG (Australian All
Banks Liabilities & Assets; Japanese Real Gross Domestic Product), while the third and
fourth to PHONG (Australian All Banks Debit Cards; Australian All Ordinaries {i.e.
Stock Market} Index).
8.1 Australian All Banks Liabilities and Assets
8.1.1 Simulation
The THONN and THONG models were used to simulate the 1995 Australian All Banks
Liabilities and Assets (the simulation data came from the Reserve Bank of Australia
Bulletin, August 1996, pp. s3-s4). Simulation results showed that the THONN model
converged without difficulty to less than 11% error, while the THONG (higher order
neural network group) model exhibited just under 1.2% error (Table 4 and Figure 14).
Month/Year   Total Liabilities   THONN |Error|%   THONG |Error|%
01/95        32150               9.12             3.7
02/95        33225               12.44            4.3
03/95        34369               3.78             2.4
04/95        35482               3.75             1.17
05/95        35665               12.56            0.26
06/95        36664               11.56            0.06
07/95        36763               10.43            0.0001
08/95        35380               6.78             0.0025
09/95        33397               32.26            0.0040
10/95        34332               5.83             0.0016
Average |Error|                  10.85%           1.191%

Table 4 - All Banks Liabilities and Assets ($Million)
Reserve Bank of Australia Bulletin (August 1996, pp. s3-s4)
Figure 14 - All Banks Liabilities and Assets ($Million): THONN vs THONG |Error|% by month, Jan-95 to Oct-95
8.1.2 Prediction
The THONN and THONG models were also used to predict the 1995 Australian All
Banks Liabilities and Assets. Data from 01/95 through 08/95 were used for training, and
09/95 and 10/95 used for testing. Prediction results showed that the average error of the
THONN model was 27.7 %, while for THONG it was just under 19.4% (Table 5 and
Figure 15).
Month/Year   Total Liabilities   THONN |Error|%   THONG |Error|%   Case
01/95        32150               9.12             3.7              Training
02/95        33225               12.44            4.3              Training
03/95        34369               3.78             2.4              Training
04/95        35482               3.75             1.17             Training
05/95        35665               12.56            0.26             Training
06/95        36664               11.56            0.06             Training
07/95        36763               10.43            0.0001           Training
08/95        35380               6.78             0.0025           Training
09/95        33397               39.61            34.67            Testing
10/95        34332               15.78            4.11             Testing
Average Testing |Error|          27.70%           19.39%

Table 5 - All Banks Liabilities and Assets ($Million)
Reserve Bank of Australia Bulletin (August 1996, pp. s3-s4)
Figure 15 - All Banks Liabilities and Assets ($Millions): THONN vs THONG |Error|% by month, Jan-95 to Oct-95
8.2 Japanese Real Gross Domestic Product
8.2.1 Simulation
The THONN and THONG models were next used to simulate Japanese Economic
Statistics (Real Gross Domestic Product, relative to the 1990 level). Training data were
again taken from the Reserve Bank of Australia Bulletin (August 1996, page s64).
Simulation results showed that the THONN model converged without difficulty to
around 9% error, and the THONG model to a little over half this value (5.37%).
8.2.2 Prediction
The THONN and THONG models were also used to predict the Japanese Real Gross
Domestic Product for 09/1995 and 12/1995, using the data from 12/93 through 06/95 for
training. The results again show that the THONG average error was less than half that for
THONN (5.58% and 12.69%, respectively).
8.3 Australian All Banks Debit Cards
8.3.1 Simulation
The next series of tests were conducted using the PHONN and PHONG models. Firstly
they were used to simulate the 1995-1996 All Banks Debit Cards in Australia. The
simulation data came from the Reserve Bank of Australia Bulletin (August 1996, page
s19). The results show that the PHONN model converged without difficulty to around
9.5% error, while the PHONG (higher order neural network group) model error was only
1.77% (Figure 16).
Figure 16 - All Banks Debit Card ($Million): PHONN vs PHONG |Error|% by month, Aug-95 to Jun-96
8.3.2 Prediction
The PHONN and PHONG models were also used to predict 05/96 and 06/96 Australian
All Banks Debit Cards, using data from 08/95 through 04/96 for training. Only
marginally better performance was observed for the PHONG model on this occasion
however – 6.37%, as compared with 6.78% for PHONN (Table 6).
Month/Year   Transactions   PHONN |Error|%   PHONG |Error|%   Case
08/95        1278           11.76            1.68             Training
09/95        1277           27.79            4.90             Training
10/95        1327           4.65             1.64             Training
11/95        1384           9.72             5.47             Training
12/95        1670           24.49            1.05             Training
01/96        1466           1.00             0.03             Training
02/96        1351           3.03             0.24             Training
03/96        1425           5.21             0.28             Training
04/96        1481           14.11            0.71             Training
05/96        1521           0.40             5.01             Testing
06/96        1416           16.16            7.73             Testing
Average Testing |Error|     6.78%            6.37%

Table 6 - All Banks Debit Card ($Million)
Reserve Bank of Australia Bulletin (August 1996, page s19)
8.4 Australian All Ordinaries Index
8.4.1 Simulation
The PHONN and PHONG models were next used to simulate the 1995-1996 Australian
All Ordinaries Index. The simulation data once again was taken from the Reserve Bank
of Australia Bulletin (August 1996, page s47). Simulation results showed that the
PHONN model converged without difficulty to around 5.6% error, and the PHONG
model to less than 1.4%.
8.4.2 Prediction
The PHONN and PHONG models were also used to predict the 05/96 and 06/96
Australian All Ordinaries Index, using data from 09/95 through 04/96 for training.
Prediction results showed that the PHONN and PHONG model errors averaged around
7.3% and 4.4%, respectively.
9. Conclusion
The definitions, basic models and characteristics of higher order neural network groups
have been presented. It has been demonstrated that higher-order ANNs can achieve
approximately 9% error on financial simulation, and around 14% on prediction. It has
been further demonstrated that HONN Groups can further reduce this error, by nearly a
factor of four for simulation (and around half for financial prediction) – Table 7.
Having defined and proven the viability of HONN groups, there remains considerable
scope for further characterization and development of appropriate algorithms. Research
into hybrid neural network group models also represents a promising avenue for future
investigation. Neural network group-based models are seen as holding considerable
promise for both the understanding and development of complex systems generally (in
other words beyond the immediate sphere of interest of the present paper – financial
markets).
Model / Data set                                        Simulation HONN   Simulation HONG   Prediction HONN   Prediction HONG
Trigonometric - All Banks Liabilities & Assets (Aust)   11%               1.2%              27.7%             19.39%
Trigonometric - Real GDP (Japan)                        9%                5.37%             12.69%            5.58%
Polynomial - All Banks Debit Cards (Australia)          9.5%              1.77%             6.78%             6.37%
Polynomial - All Ordinaries Index (Aust)                5.6%              1.4%              7.3%              4.4%
AVERAGE                                                 8.77%             2.44%             13.62%            8.94%

Table 7 – Results Summary
References:
M. Azema-Barac & A. Refenes (1997) “Neural Networks for Financial Applications” in
E. Fiesler & R. Beale (eds) Handbook of Neural Computation Oxford University Press
(G6.3:1-7).
E. Azoff (1994) Neural Network Time Series Forecasting of Financial Markets Wiley.
T. Burns (1986) “The Interpretation and Use of Economic Predictions” Proc. Royal Soc.
A, pp.103-125.
K. Chakraborty, K. Mehrotra, C. Mohan & S. Ranka (1992) “Forecasting the Behaviour
of Multivariate Time Series Using Neural Networks” Neural Networks, 5, pp.961-970.
C. Chen (1992) “Neural Networks for Financial Market Prediction” Proc. Intl. Joint
Conf. Neural Networks, 3, pp.1199-1202.
D. Ensley & D. Nelson (1992) “Extrapolation of Mackey-Glass Data Using Cascade
Correlation” Simulation, 58 (5), pp.333-339.
N. Hobbs & N. Bourbakis (1995) “A NeuroFuzzy Arbitrage Simulator for Stock
Investing”, Proc. IEEE/IAFE Conf. Computational Intelligence for Financial
Engineering, New York, pp.160-177.
K. Hornik (1991) “Approximation Capabilities of Multilayer Feedforward Networks”,
Neural Networks, 4, pp.251-257.
S. Hu & P. Yan (1992) “Level-by-Level Learning for Artificial Neural Groups”, ACTA
Electronica SINICA, 20 (10), pp.39-43.
N. Karayiannis & A. Venetsanopoulos (1993) Artificial Neural Networks: Learning
Algorithms, Performance Evaluation and Applications, Kluwer (Chapter 7).
T. Kimoto, K. Asakawa, M. Yoda & M. Takeoka (1990) “Stock Market Prediction
System with Modular Neural Networks” Proc. Intl. Joint Conf. Neural Networks, San
Diego, 1, pp.1-6.
M. Leshno, V. Lin, A. Pinkus & S. Schoken (1993) “Multilayer Feedforward Networks
with a Non-Polynomial Activation Can Approximate Any Function” Neural Networks 6,
pp.861-867.
G. Megson (1993) “Systematic Construction of Trigonometric Neural Networks”,
Technical Report No. 416, University of Newcastle Upon Tyne.
M. Naimark & A. Stern (1982) Theory of Group Representations, Springer.
E. Peters (1991) Chaos and Order in the Capital Markets Wiley.
D. Pham & X. Liu (1992) “Dynamic System Modelling Using Partially Recurrent Neural
Networks” J. Systems Engineering, 2 (2), pp.90-97.
D. Pham & X. Liu (1995) Neural Networks for Identification, Prediction and Control
Springer.
N. Redding, A. Kowalczyk & T. Downs (1993) “Constructive High-Order Network
Algorithm that is Polynomial Time”, Neural Networks, 6, pp.997-1010.
M. Schumann & T. Lohrbach (1993) “Comparing Artificial Neural Networks with
Statistical Methods within the Field of Stock Market Prediction” Proc. 26th Hawaii Intl.
Conf. System Science¸ 4, pp.597-606.
D. Tan, D. Prokhorov & D. Wunsch (1995) “Conservative Thirty Calendar Day Stock
Prediction Using a Probabilistic Neural Network”, Proc. IEEE/IAFE 1995 Conf.
Computational Intelligence for Financial Engineering, New York, pp.113-117.
P. Tenti (1995) “Forecasting Currency Futures Using Recurrent Neural Networks”, Proc.
3rd Intl. Conf. Artificial Intelligence Applications on Wall Street, New York.
R. Trippi & E. Turbon (eds) (1993) Neural Networks in Finance and Investing Probus.
V. Vemuri & R. Rogers (1994) Artificial Neural Networks: Forecasting Time Series
IEEE Computer Society Press.
D. Wedding & K. Cios (1996) “Time Series Forecasting by Combining RBF Networks,
Certainty Factors, and the Box-Jenkins Model” Neurocomputing, 10 (2), pp.149-168.
C. Willcox (1991) “Understanding Hierarchical Neural Network Behaviour: a
Renormalization Group Approach”, J. Physics A, 24, pp.2655-2664.
F. Wong, P. Wang & T. Goh (1991) “Fuzzy Neural Systems for Decision Making” Proc.
Intl. Joint Conf. Neural Networks, 2, pp.1625-1637.
X. Yang (1990) “Detection and Classification of Neural Signals and Identification of
Neural Networks (synaptic connectivity)”, Dissertation Abstracts International - B,
50/12, 5761.
A. Zaknich, C. deSilva & Y. Attikiouzel (1991) “A Modified Probabilistic Neural
Network (PNN) for Nonlinear Time Series Analysis” Proc. Intl. Joint Conf. Neural
Networks, pp.1530-1535.
J. Zhang, M. Zhang, & J. Fulcher (1997a) “Financial Prediction Using Higher Order
Trigonometric Polynomial Neural Network Group Models”, Proc. IEEE Intl. Conf.
Neural Networks, Houston, June 8-12, pp.2231-2234.
J. Zhang, M. Zhang, & J. Fulcher (1997b) “Financial Simulation System Using Higher
Order Trigonometric Polynomial Neural Network Models”, Proc. IEEE/IAFE Conf.
Computational Intelligence for Financial Engineering, New York, March 23-25, pp.189-194.
M. Zhang & R. Scofield (1994) “Artificial Neural Network Techniques for Estimating
Heavy Convective Rainfall and Recognition of Cloud Mergers From Satellite Data”, Intl.
J. Remote Sensing, 15 (16), pp.3241-3262.
M. Zhang, S. Murugesan, & M. Sadeghi (1995) “Polynomial Higher Order Neural
Network For Economic Data Simulation”, Proc. Intl. Conf. Neural Information
Processing, Beijing, pp.493-496.
M. Zhang & J. Fulcher (1996a) “Neural Network Group Models For Financial Data
Simulation” Proc. World Congress Neural Networks, San Diego, pp. 910-913.
M. Zhang & J. Fulcher, (1996b) “Face Recognition Using Artificial Neural Network
Group-Based Adaptive Tolerance (GAT) Trees”, IEEE Trans. Neural Networks, 7 (3),
pp.555-567.
M. Zhang, J. Fulcher & R. Scofield (1997) “Rainfall Estimation Using Artificial Neural
Network Groups”, Neurocomputing, 16 (2), pp.97-115.