Experimental investigation of the performance of the optical two-layer neural network
Nikolay N. Evtihiev, Rostislav S. Starikov, Boris N. Onyky, Vadim V. Perepelitsa, Igor B. Scherbakov
Moscow State Engineering Physics Institute (Technical University), Department of Quantum Electronics,
Kashirskoe shosse 31, Moscow, 115409, Russia
ABSTRACT
The paper presents the results of training a two-layer (64x8) neural network (TLNN), the results of
experiments on the tolerance of its weight matrices to noise, and the results of a hardware implementation.
The imperfection of the optics stays within the margin requirements of the TLNN model.
1. INTRODUCTION
Different types of neural networks (NN) are now under careful investigation. Building working models of
NNs, including optical models, is most useful and promising [1]. One of the most interesting is the two-layer NN (TLNN),
which can easily be built on the basis of the proposed optical vector-matrix multiplier (OVMM). The OVMM has extremely high
information processing capabilities. Computation in such a TLNN is performed sequentially in the OVMM. Acoustooptic devices
with a large time-bandwidth product are the fastest spatial light modulators currently available. As the base devices for
the OVMM we propose a high-speed, high-dynamic-range spatial light modulator, namely the multichannel multifrequency
acoustooptic Bragg cell (MAOM), together with a laser array and a CCD (charge-coupled device) array, in the well-known
OVMM arrangement.
2. NEURAL NETWORKS MATHEMATICAL MODELS SUITABLE FOR OPTOELECTRONIC
IMPLEMENTATION
Neural network mathematical models are determined by three characteristics: (1) the model of the neurones used
in the network, (2) the architecture of the interconnections in the network, and (3) the algorithm for determining the
network parameters.
The main operations performed by a neuron are weighted addition and non-linear transformation. In the
simplest case a non-linear transformation is applied to the weighted sum and the result is transmitted
to the neuron's output. The neuron performs no other operation. It is possible to prove that a network
consisting of such simple neurones can solve any problem for which a solution algorithm exists. The values of the synaptic
weights differ from neurone to neurone. The non-linear transformations may differ or be the same for all the
neurones of the network. Neural networks with the same transformation for all neurones are called homogeneous.
Because optoelectronic devices are most effective at vector-matrix multiplication, it is very useful
to put the main computational load on them and to simplify the non-linear transformations
as much as possible. For this reason homogeneous neural networks can be assumed to be the most suitable for
optoelectronic realization.
0-8194-1778-5/94/$6.00
SPIE Vol. 2430 Optical Neural Networks (1994) / 189
In theory neural networks can have an arbitrary interconnection architecture. From the viewpoint of hardware
implementation, however, one architecture is the most promising: the multilayer neural network with sequential
interconnections. As has been proved [2], such networks can solve any image recognition task. The number of
interconnections in such networks is smaller than in fully connected ones, which reduces the complexity of the hardware
realization.
Thus, multilayer homogeneous neural networks with sequential connections are the most suitable for
optoelectronic realization (see Fig. 1).
Fig. 1. Multi-layer neural network.
3. TLNN. NOISE TOLERANCE
A TLNN simulation program was designed. The simulated NN contains 64 neurones in the first layer and
8 neurones in the second. The NN was trained with gradient and stochastic algorithms. The trained network
is able to recognize binary images of all 26 letters of the alphabet. Each image is binary and consists of 8x8 pixels.
After the contrasting procedure the weight coefficients took values from {-1; 0; 1}.
The performance of the homogeneous TLNN is defined by the following parameters: (1) synaptic weights of the first-layer
neurones, (2) input data of the TLNN, (3) biases of the first-layer neurones, (4) summation outputs of the first-layer
neurones, (5) outputs of the first-layer neurones, (6) synaptic weights of the second-layer neurones, (7) biases of the
second-layer neurones, (8) summation outputs of the second-layer neurones. Parameters 1, 3, 6 and 7 are defined as a
result of learning. The values of the other parameters are defined while the TLNN operates.
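The eight parameter groups listed above can be located in a minimal forward-pass sketch. The hard-threshold non-linearity `step`, the function names and the toy sizes are assumptions made for illustration; the paper does not state which non-linear transformation was used.

```python
def step(x):
    """Hard-threshold non-linearity (an assumed choice): 1 if x > 0, else 0."""
    return 1 if x > 0 else 0


def layer(weights, biases, inputs):
    """Return (summation outputs, neurone outputs) for one homogeneous layer."""
    sums = [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]
    return sums, [step(s) for s in sums]


def tlnn_forward(w1, b1, w2, b2, image):
    """Two-layer forward pass; comments refer to parameters 1 to 8 in the text."""
    # w1 = parameter 1, image = parameter 2, b1 = parameter 3
    sums1, out1 = layer(w1, b1, image)   # parameters 4 and 5
    # w2 = parameter 6, b2 = parameter 7
    sums2, out2 = layer(w2, b2, out1)    # parameter 8, then the thresholded result
    return sums1, out1, sums2, out2


# Toy network (4 inputs, 2 and 1 neurones) with weights from {-1, 0, 1}.
w1 = [[1, -1, 0, 1], [0, 1, 1, -1]]
b1 = [0, -1]
w2 = [[1, -1]]
b2 = [0]
print(tlnn_forward(w1, b1, w2, b2, [1, 0, 1, 1]))
```

For the network in the text, `w1` would have 64 rows of 64 weights and `w2` would have 8 rows of 64 weights.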
In a hardware implementation of the TLNN each of the eight parameters could take a value different
from the calculated one, because the fabrication technology of optoelectronic NNs does not allow the
calculated parameter values to be represented exactly. The aim of the statistical noising experiments was to
determine the tolerable deviation of the parameter values from the calculated ones. The criterion was the ability to
recognize all the learned images.
The modelling program allows noising experiments to be carried out with a noise ratio (NR) step of 5%. The term
NR is illustrated by the following:
$$p_0 = p \left(1 + \frac{s\,(2\,\mathrm{RND} - 1)}{100}\right) \qquad (1)$$
where $p$ is the initial value of the parameter, $p_0$ is the value of the parameter after noising, $s$ is the NR in percent,
and RND is a uniformly distributed random value, 0 < RND < 1.
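Equation (1) translates directly into code. The sketch below is only an illustration: the function name `add_noise` and the optional `rnd` argument (used to make the result reproducible) are not from the paper, and Python's `random.random()` draws from [0, 1) rather than the open interval stated above.

```python
import random


def add_noise(p, s, rnd=None):
    """Apply equation (1): p0 = p * (1 + s * (2 * RND - 1) / 100).

    p   -- initial parameter value
    s   -- noise ratio (NR) in percent
    rnd -- optional fixed value of RND; drawn uniformly if omitted
    """
    if rnd is None:
        rnd = random.random()
    return p * (1 + s * (2 * rnd - 1) / 100)


# With s = 10 the noised value stays within 10% of the original one.
print(add_noise(2.0, 10, rnd=0.0), add_noise(2.0, 10, rnd=0.5))
```

Note that for s above 100% the factor in brackets can become negative, so the noised parameter can change sign.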
The modelling yields the mean number of misrecognized images and the mean number of unrecognized images.
Each mean was taken over 100 iterations. The results of the experiments are presented in the table.
No   NN's parameters to be changed                                Allowed % of deviation
1    Synaptic weights of first-layer neurones                     35%
2    Input data of TLNN                                           10%
3    Bias of first-layer neurones                                 105%
4    Sums outputs of first-layer neurones                         80%
5    Outputs of first-layer neurones                              25%
6    Synaptic weights of second-layer neurones                    30%
7    Bias of second-layer neurones                                130%
8    Sums outputs of second-layer neurones                        85%
9    Synaptic weights of first- and second-layer neurones         15%
     (NR of all other parameters 5%)
10   All parameters of TLNN                                       10%
The most important information is contained in rows 9 and 10 of the table. The TLNN retains the ability to
recognize all 26 letters of the alphabet correctly in two cases:
1. The NR of the synaptic weights of the first- and second-layer neurones is less than 15% and the NR of all other
parameters is less than 5%.
2. The NR of all the parameters is less than 10%.
This shows that a fabrication technology for optoelectronic NNs that represents all calculated parameters
in hardware with an accuracy better than 10% could be used for the hardware realization of the TLNN under consideration.
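The noising experiment behind the table can be sketched as a simple Monte Carlo sweep: raise the NR in 5% steps, noise the parameters 100 times per step, and stop at the first step where recognition fails. The function `tolerated_nr`, the predicate `recognizes_all` and the one-neurone toy criterion are hypothetical stand-ins for the authors' modelling program, which is not described in detail.

```python
import random


def add_noise(p, s, rng):
    """Equation (1) with RND drawn from the supplied generator."""
    return p * (1 + s * (2 * rng.random() - 1) / 100)


def tolerated_nr(params, recognizes_all, trials=100, step=5, max_nr=200, seed=0):
    """Largest NR (in percent) at which every noised trial still passes."""
    rng = random.Random(seed)
    tolerated = 0
    for nr in range(step, max_nr + 1, step):
        for _ in range(trials):
            noisy = [add_noise(p, nr, rng) for p in params]
            if not recognizes_all(noisy):
                return tolerated          # first failure ends the sweep
        tolerated = nr
    return tolerated


# Toy criterion (assumed): one neurone with two unit weights must still
# fire on the input (1, 1) against a threshold of 1.5.
weights = [1.0, 1.0]
print(tolerated_nr(weights, lambda w: w[0] + w[1] > 1.5))
```

For the TLNN itself, `params` would hold the noised parameter group from the table and `recognizes_all` would run the network on all 26 letter images.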
Fig. 2 presents the plot of the noise tolerance of the TLNN when all its parameters are noised. It is important to note
that the number of correctly recognized images decreases linearly with the NR of the TLNN's parameters. At NR = 60% the
TLNN still recognizes 50% of the images correctly. Computer systems of previous generations show a different,
avalanche-like dependence: 100% recognition at weak noise and 0% recognition once the noise exceeds
a certain level.
Fig. 2. Noise toleration of the TLNN (vertical axis: N; horizontal axis: NR).
The computer experiment shows that the TLNN has significant noise resistance. This makes the TLNN
attractive for on-board implementations.
4. TLNN. HARDWARE
Optical realization of the TLNN is an example of the effective application of the MAOM. The main part of the
calculations in this case is vector-matrix multiplication: A*B = C. The proposed OVMM architecture (fig. 3) allows
this to be performed in one step. Traditionally the DMAC (digital multiplication via analogue convolution) algorithm [3]
is used to obtain high accuracy in optical calculations.
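The idea of DMAC can be shown for two scalars: the bit sequences of the operands are convolved, giving digits that may exceed 1 (the mixed representation), and the product is recovered by weighting each digit with its power of two. The sketch below is a generic illustration with assumed function names, not the authors' implementation.

```python
def to_bits(value, width):
    """Bits of a non-negative integer, least significant bit first."""
    return [(value >> k) & 1 for k in range(width)]


def convolve(a, b):
    """Discrete linear convolution of two digit sequences."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def mixed_to_int(digits):
    """Convert mixed-representation digits (LSB first) to an integer."""
    return sum(d << k for k, d in enumerate(digits))


def dmac_multiply(a, b, width=8):
    """Multiply two non-negative integers by convolving their bit sequences."""
    return mixed_to_int(convolve(to_bits(a, width), to_bits(b, width)))


# 3 x 3: bits (1, 1) convolved with (1, 1) give the mixed digits (1, 2, 1).
print(convolve(to_bits(3, 2), to_bits(3, 2)), dmac_multiply(13, 11))
```

The convolution itself needs no carries, which is why it maps onto an analogue optical operation; the carries are resolved only in the final conversion.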
Fig. 3. Vector-matrix multiplier with time integration.
The algorithm we propose is similar to DMAC, with one significant difference: it uses the frequency coding
available in the MAOM. Owing to this the OVMM can multiply (convolve) 2D and 3D data flows and
obtain the whole vector-matrix product in one step, at a speed close to the frame rate of the MAOM. In this case the
vector is input into the laser array (LA) as a 2D array and the matrix is input into the MAOM as a 3D array.
In the proposed architecture vector A has 8 components ($a_i$) and each component is represented by an 8-bit binary
number ($a_{ik}$ = 1, 0), i.e. by a byte: k = 1 is the most significant bit (MSB), k = 8 is the least significant bit (LSB).
Matrix B has dimension 8x8 and each element $b_{ij}$ is also represented by a byte ($b_{ijs}$, s = 1..8): s = 1 is the MSB,
s = 8 is the LSB. The elements of the 3D array $b_{ijs}$ (i.e. matrix B) are input into the MAOM: i is the pulse position
in the channel aperture at the moment of exposure, j is the channel number, s is the number of the acoustic wave
frequency ($f_s$). When the MAOM aperture is filled the LA emits eight light pulses (exposures), according to the number
of bits in each vector element. Each vector component is applied to its corresponding "window", as shown in fig. 3.
The LSBs of the components are transmitted first, the MSBs last. Radiation from each laser uniformly illuminates the
corresponding "window" of every channel. The optical system transfers the light diffracted in the MAOM onto a set of
eight linear CCD arrays. Each linear CCD contains eight points (according to the number of frequencies). The CCD operates
in shift-and-add mode. The shift velocity corresponds to the bit loading speed of the lasers. The charge accumulated in
a point of a linear CCD during the first exposure is shifted to the next point of this CCD for the second exposure.
Each linear CCD represents a component of the output vector C
in a mixed representation. The supporting electronic control system (CS) is responsible for converting
this data from the mixed to hex format, for thresholding, for synchronization, etc.
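The time-integrating scheme described above can be modelled in a few lines. In this sketch each exposure corresponds to one bit of the vector components (LSB first), each CCD point sees one bit of the matrix elements (MSB at point 0), and the register is shifted by one point between exposures. The register length of 2 x width - 1 cells, the function names and the data layout are modelling assumptions, not details taken from the device.

```python
def ccd_shift_and_add(exposures, length):
    """Accumulate exposures in a register shifted by one point between them."""
    register = [0] * length
    for k, light in enumerate(exposures):
        if k:                                    # shift before every exposure but the first
            register = [0] + register[:-1]
        for p, intensity in enumerate(light):
            register[p] += intensity
    return register


def ovmm_time_integration(a, B, width=8):
    """Compute c[j] = sum_i a[i] * B[i][j] for non-negative width-bit integers."""
    top = 2 * width - 2                          # weight exponent of register cell 0
    result = []
    for j in range(len(B[0])):                   # one linear CCD per output component
        exposures = []
        for k in range(width):                   # vector bits, LSB first
            light = [sum(((a[i] >> k) & 1) * ((B[i][j] >> (width - 1 - p)) & 1)
                         for i in range(len(a)))
                     for p in range(width)]      # point p sees matrix bit, MSB at p = 0
            exposures.append(light)
        register = ccd_shift_and_add(exposures, 2 * width - 1)
        # Mixed representation (MSB first) converted to an ordinary integer.
        result.append(sum(charge << (top - r) for r, charge in enumerate(register)))
    return result


print(ovmm_time_integration([1, 2, 3], [[1, 2], [3, 4], [5, 6]]))
```

The register contents before the final conversion are the mixed-representation digits, which is the form the control system receives from each linear CCD.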
To check the feasibility of completing DMAC in the way described, a breadboard model performing
multiplication of 8-digit numbers was assembled. The rate of bit-by-bit input was limited only by the maximum
transfer frequency achievable for the given CCD array. A typical result signal of DMAC multiplication with
multiplexing in the MAOM, registered directly at the CCD output, is presented in fig. 4. We take the limiting
parameters of a TeO2 MAOM as follows: 100 "windows", 8 frequencies, a laser pulse repetition period of about 10
ns and a CCD array transfer frequency of about 120 MHz. The productivity of the MAOM for calculations with a "constant"
matrix is then:
$$P = \frac{(2MN - M)\,\Delta F}{2L}, \qquad (2)$$
where M, N are the dimensions, L is the number of bits and $\Delta F$ is the MAOM transmission bandwidth. In the general
case such a multiplier is L times slower than an analog one using the same input elements (LA and MAOM). Moreover,
a considerable drawback affecting the efficiency of the system is the need to use a CCD to complete DMAC with
time integration. Existing CCDs have comparatively low detection ability. As shown in [4], at the present stage this
calls into question the competitiveness of such a device not only in comparison with analog AO processors but also
with existing electronic means. The architectures of digital vector-matrix AO multipliers with space integration
presented below seem more promising.
Fig. 4. Example of DMAC multiplication with implementation of multiplexing: a) laser pulses (0 1 1 0 1 1 1),
b) multiplexing signal in acoustooptic cell (0 1 1 1 1 0 0), c) CCD output signal in a mixed representation
(0 1 1 2 3 2 3 3 3 3 2 1 0 0).
The processor architecture based on the multifrequency MAOM and realizing DMAC with space integration is
presented in fig. 5. The architecture (similar to SAOBIC [5] and its analogues) consists of a source of collimated light
(SCL), two MAOMs (MAOM1 has N channels, MAOM2 has L channels), the projecting (OS1) and summing (OS2) optical
systems, which also perform optical filtering, and an M-element photodetector array (PDA). The processor
multiplies an MxN matrix by an N-vector using L-bit representation of numbers. The elements of vector b are
loaded in parallel into the first MAOM (each into its own channel) in serial code. The elements of matrix $a_{ij}$ are loaded
in the following order: the elements of the matrix with index j are applied in multiplexed form, the i-th element at
the i-th frequency, and occupy the position in the channel of the second MAOM that corresponds to the projection of the
j-th channel of the first MAOM. Only the l-th bits of the matrix elements are applied in the l-th channel of the second
MAOM. The modulator materials and the scale of the projecting optical system OS1 are chosen so that the ratio
of the transit time of the optical image created by the first MAOM to the time the sound takes to pass one pixel of
the second MAOM equals the word length L. The whole multiplication process requires 2xL steps of the first ("fast")
MAOM. The optical system OS2 performs the DMAC summation and the summation of the i-th components on the
i-th photodetector. The subsequent processing of the resulting vector elements consists of analog-to-digital conversion
of the triangular mixed-code pulses arriving from the photodiodes after amplification, followed by shift-and-add
summation of the conversion results. The postprocessing as a whole requires M ADCs (analog-to-digital converters)
and M shift-adders.
Fig. 5. OVMM architecture with space integration (elements shown: SCL, MAOM1, OS1, MAOM2, OS2, PDA).
The limiting productivity for continuous calculations with a "constant" matrix can reach:
$$P = \frac{(2MN - M)\,\Delta F}{2} \qquad (3)$$
where M, N are the dimensions, L is the number of bits and $\Delta F$ is the bandwidth of the second MAOM. The given
architecture, after appropriate modification of the postprocessing devices, can be used for operations with mixed-sign
arrays represented in two's complement binary code [6]. Registration of the convolution results then begins only after
the first modulator has been filled and requires only L "fast" steps. The light source is synchronized so that light
reaches the modulators only in the interval between two moments: the moment when the channels of the first MAOM are
completely filled by the elements of vector b and the moment when the last bits of the elements of b leave the channel.
The postprocessing device in this case requires N dividers and N shift-adders.
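One way to see why a binary code for signed numbers is compatible with convolution-based multiplication is the sketch below. It sign-extends both L-bit operands to 2L bits in two's complement, convolves the bit sequences, and keeps the result modulo 2^(2L). This is a generic arithmetic illustration under those stated assumptions; it is not claimed to be the exact scheme of [6] or of the processor described here.

```python
def to_twos_complement_bits(value, width):
    """Two's complement bits of a signed integer, least significant bit first."""
    unsigned = value % (1 << width)              # maps negatives to 2**width + value
    return [(unsigned >> k) & 1 for k in range(width)]


def convolve(a, b):
    """Discrete linear convolution of two digit sequences."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def signed_dmac_multiply(a, b, word=8):
    """Multiply two signed word-bit integers via convolution of their bits."""
    width = 2 * word                             # sign-extend operands to 2L bits
    digits = convolve(to_twos_complement_bits(a, width),
                      to_twos_complement_bits(b, width))
    # Digits above position width-1 only add multiples of 2**width, so drop them.
    raw = sum(d << k for k, d in enumerate(digits[:width])) % (1 << width)
    return raw - (1 << width) if raw >= 1 << (width - 1) else raw


print(signed_dmac_multiply(-5, 7), signed_dmac_multiply(-3, -4))
```

The price of this approach is a doubled word length in the convolution, which matters for the dynamic range of the detector.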
The maximal dimension of the vectors and matrices that can be processed in such a system is, on the one hand, limited
by the maximum number of resolvable outputs of the MAOM2 channels and, on the other hand, determines the dynamic
range (DR) of the system required at a given word length (in turn, the DR can provide discrimination of no more
than 250-260 signal levels); so at L = 8 the dimension of processable square matrices can be about 30.
In the general case an n-fold increase of the word length reduces the maximal dimension of processable
arrays n times and lowers the efficiency accordingly. An increase of the dimension of the processable matrix
(with a simultaneous increase of the efficiency) is possible through an m-fold increase of the time aperture of MAOM1
and an m-fold increase of the number of PDAs and of MAOM2 channels.
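The figure of about 30 at L = 8 is consistent with a simple counting argument, sketched here as an assumption about the reasoning rather than a derivation given in the paper: in the mixed representation of an N-term dot product of L-bit numbers the largest digit is N x L, and it has to fit within the number of distinguishable signal levels.

```python
def max_mixed_digit(n_terms, word):
    """Largest digit in the mixed representation of an n-term dot product.

    The worst case is all bits set: convolving two all-ones sequences of
    length `word` peaks at `word`, and n_terms such products are summed.
    """
    digits = [0] * (2 * word - 1)
    for i in range(word):
        for j in range(word):
            digits[i + j] += 1
    return n_terms * max(digits)


def max_dimension(levels, word):
    """Largest N whose worst-case digit fits into `levels` levels (zero included)."""
    return (levels - 1) // word


print(max_mixed_digit(30, 8), max_dimension(250, 8), max_dimension(260, 8))
```

Doubling the word length halves the admissible dimension in this model, which matches the n-fold reduction stated in the text.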
In a system multiplying a matrix of dimension (m x M) x N by a vector of dimension N at word
length L, the channels of the "fast" MAOM1 are divided into m parts, each providing L resolvable outputs, and projected
onto the corresponding L channels of MAOM2 (with m x L channels in total). The loading of the vector elements into
MAOM1 and of the matrix elements into each of the m groups of MAOM2 channels (with appropriate delays), the registration
of results from each of the m groups of the PDA, and the clocking of the light sources when processing mixed-sign arrays
are carried out as in the architecture of fig. 5. A complete matrix-vector multiplication requires (m + 1) x L steps of
the first MAOM; however, the next multiplication can begin after 2 x L steps. The speed limit is:
$$P = m\,\frac{(2MN - M)\,\Delta F}{2} \qquad (4)$$
where $\Delta F$ is the bandwidth of the second MAOM.
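The three productivity estimates (2), (3) and (4) share the factor 2MN - M, which is the number of operations in an M x N matrix-vector product (MN multiplications plus M(N - 1) additions). The sketch below compares them, reading the last factor in each formula as the MAOM bandwidth named in the text; the function names and the 50 MHz bandwidth are hypothetical values chosen only for illustration.

```python
def operations(m_rows, n_cols):
    """Multiplications plus additions in an M x N matrix-vector product."""
    return 2 * m_rows * n_cols - m_rows


def throughput_time_integration(m_rows, n_cols, word, bandwidth_hz):
    """Estimate (2): operations * bandwidth / (2 * L)."""
    return operations(m_rows, n_cols) * bandwidth_hz / (2 * word)


def throughput_space_integration(m_rows, n_cols, bandwidth_hz, groups=1):
    """Estimates (3) and (4): groups * operations * bandwidth / 2."""
    return groups * operations(m_rows, n_cols) * bandwidth_hz / 2


M, N, L = 8, 8, 8
BANDWIDTH_HZ = 50e6          # hypothetical value, not taken from the paper
p2 = throughput_time_integration(M, N, L, BANDWIDTH_HZ)
p3 = throughput_space_integration(M, N, BANDWIDTH_HZ)
p4 = throughput_space_integration(M, N, BANDWIDTH_HZ, groups=4)
print(p2, p3, p4, p3 / p2)
```

Under this reading the space-integration estimate exceeds the time-integration one by exactly the word length L, and dividing MAOM1 into m groups scales it by m.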
The efficiency of such a multiplier can be increased by replacing the first MAOM with a matrix of light
sources that provides a similar "movement" of the elements of vector b. The efficiency is expected to increase through
a considerable reduction of the required light and UHF power. However, because of the imperfection of modern
light source matrix technology, the practical realization of such a system seems rather difficult.
A considerable drawback of systems with space integration is the relative complexity and high speed required
of the read-out and postprocessing devices (in the general case L times higher than in analog devices that use such a
MAOM for matrix input).
Estimates of the throughput of various acoustooptic vector-matrix multipliers with TeO2 modulators as matrix input
devices are presented in fig. 6 and show the potentially high competitiveness of acoustooptics in comparison with electronics.
Fig. 6. Variations of throughput (vertical axis: MOPS): 1 - TRW TMC2208 8-bit multiplier [4], 2 - analog multiplier
with updatable matrix [4], 3 - multiplier with time integration, 4 - analog multiplier with constant matrix [4],
5 - space integration multiplier with constant matrix (with LiNbO3 "fast" MAOM).
5. CONCLUSION
Thus, at present, vector-matrix multiplications in neural networks can be realized on the basis of acoustooptic
processors. The potential speed and efficiency of such processors compare favorably with existing electronic means
for large dimensions of the processed arrays.
The initially low accuracy of such calculations can be increased at the cost of a more complex representation
of the processed values, strict requirements on the adjustment of the system as a whole and a more complex
postprocessing system. The configuration of the system (architecture, accuracy of representation of values, interface and
postprocessing systems) is defined by the particular statement of the task.
6. REFERENCES
1. E. Barnard and D. Casasent, "New Optical Neural System Architectures and Applications," Proc. SPIE, Vol. 963,
pp. 537-545, 1988.
2. B. Muller, J. Reinhardt, Neural Networks, Springer-Verlag, 1990.
3. P. R. Beaudet, A. P. Goutzoulis et al., Appl. Optics, Vol. 25, No. 18, p. 3097, 1986.
4. C. K. Gary, "Comparison of optics and electronics for the calculation of matrix-vector products," Proc. SPIE, Vol.
1704.
5. P. S. Guilfoyle, "Systolic acousto-optic binary convolver," Opt. Eng., Vol. 23, pp. 20-25, Jan./Feb. 1984.
6. R. P. Bocker, S. P. Clyton, K. Bromley, "Electro-optical matrix multiplication using 2's complement arithmetic
for improved accuracy," Appl. Optics, Vol. 22, No. 13, pp. 2019-2021, July 1983.