Experimental investigation of the performance of the optical two-layer neural network

Nikolay N. Evtihiev, Rostislav S. Starikov, Boris N. Onyky, Vadim V. Perepelitsa, Igor B. Scherbakov

Moscow State Engineering Physics Institute (Technical University), Department of Quantum Electronics, Kashirskoe shosse 31, Moscow, 115409, Russia

ABSTRACT

The paper presents the results of training a two-layer (64x8) neural network (TLNN), the results of tests of its tolerance to noise added to the weight matrices, and the results of a hardware implementation. The imperfections of the optics fall within the margins required by the TLNN model.

1. INTRODUCTION

Different types of neural networks (NN) are now under careful investigation. Building working models of NN, including optical models, is most useful and promising [1]. One of the most interesting is the two-layer NN (TLNN), which can easily be built on the basis of the proposed optical vector-matrix multiplier (OVMM). The OVMM has extremely high information processing capabilities. Computing in such a TLNN is performed sequentially in the OVMM. Acoustooptic devices with a large time-bandwidth product are the fastest spatial light modulators currently available. As the base devices for the OVMM we propose a high-speed, high-dynamic-range spatial light modulator, namely the multichannel multifrequency acoustooptic Bragg cell (MAOM), together with a laser array and a CCD (charge-coupled device) array in the well-known OVMM arrangement.

2. NEURAL NETWORKS MATHEMATICAL MODELS SUITABLE FOR OPTOELECTRONIC IMPLEMENTATION

Neural network mathematical models are determined by three characteristics: (1) the models of the neurones used in the network, (2) the architecture of the interconnections in the network, (3) the algorithm that determines the neural network parameters. The main operations performed by a neurone are the weighted addition and the non-linear transformation.
In the simplest case a non-linear transformation is applied to the weighted sum and the result is passed to the neurone's output. The neurone performs no other operation. It can be proved that a network consisting of such simple neurones can solve any problem for which a solution algorithm exists. The values of the synaptic weights differ from neurone to neurone. The non-linear transformations may differ or may be the same for all the neurones of the network. Neural networks with the same transformation for all the neurones are called homogeneous. Because optoelectronic devices are most effective at vector-matrix multiplication, it is very useful to put the main computational load on them and to simplify the non-linear transformations as much as possible. It is therefore reasonable to assume that homogeneous neural networks are the most suitable for optoelectronic realization.

0-8194-1778-5/94/$6.00 SPIE Vol. 2430 Optical Neural Networks (1994) / 189

In theory neural networks can have an arbitrary interconnection architecture. From the viewpoint of hardware implementation, however, one architecture is the most promising: the multilayer neural network with sequential interconnections. As has been proved [2], such networks can solve any image recognition task. The number of interconnections in such networks is smaller than in completely connected ones, which reduces the complexity of the hardware realization. Thus, multilayer homogeneous neural networks with sequential connections are the most suitable for optoelectronic realization (see Fig. 1).

Fig. 1. Multi-layer neural network.

3. TLNN. NOISE TOLERANCE

A TLNN simulation program was designed. The simulated NN contains 64 neurones in the first layer and 8 neurones in the second one. The NN was trained with gradient and stochastic algorithms.
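The neurone model of section 2 and the 64x8 layout of the simulated TLNN can be sketched as follows. This is a minimal illustration, not the authors' simulation program: the hard-threshold non-linearity, the random weights drawn from {-1, 0, 1}, the zero biases and all function names are assumptions made only to exercise the shapes.

```python
import random


def threshold(x):
    """Hard-threshold non-linearity (an assumption; the paper names none)."""
    return 1 if x > 0 else 0


def layer(inputs, weights, biases, activation=threshold):
    """One homogeneous layer: weighted sum plus bias, followed by the
    same non-linear transformation for every neurone."""
    outputs = []
    for w_row, b in zip(weights, biases):
        s = sum(w * x for w, x in zip(w_row, inputs)) + b
        outputs.append(activation(s))
    return outputs


def tlnn_forward(pixels, w1, b1, w2, b2):
    """Two layers connected sequentially: 64 inputs -> 64 neurones -> 8 neurones."""
    hidden = layer(pixels, w1, b1)
    return layer(hidden, w2, b2)


# Hypothetical, untrained parameters used only to show the dimensions.
rng = random.Random(0)
w1 = [[rng.choice((-1, 0, 1)) for _ in range(64)] for _ in range(64)]
b1 = [0.0] * 64
w2 = [[rng.choice((-1, 0, 1)) for _ in range(64)] for _ in range(8)]
b2 = [0.0] * 8
image = [rng.randint(0, 1) for _ in range(64)]  # an 8x8 binary image, flattened
code = tlnn_forward(image, w1, b1, w2, b2)
```

Each call to `layer` is one vector-matrix multiplication followed by an element-wise threshold, which is the split of work the text assigns to the optics and to the electronics respectively.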
The trained network recognizes binary images of all 26 letters of the alphabet. Each image consists of 8x8 pixels. Binary images were used for the experiment. After the contrasting procedure the weight coefficients take values from {-1; 0; 1}.

The performance of the homogeneous TLNN is defined by the following parameters:
(1) synaptic weights of first-layer neurones,
(2) input data of the TLNN,
(3) bias of first-layer neurones,
(4) summation outputs of first-layer neurones,
(5) outputs of first-layer neurones,
(6) synaptic weights of second-layer neurones,
(7) bias of second-layer neurones,
(8) summation outputs of second-layer neurones.

Training determines parameters 1, 3, 6 and 7. The values of the other parameters are determined while the TLNN is operating. In a hardware implementation of the TLNN each of the eight parameters may take a value that differs from the calculated one, because the fabrication technology of optoelectronic NN does not allow the calculated parameter values to be represented exactly. The aim of the statistical noise experiments was to determine the tolerable deviation of the parameter values from the calculated ones. The criterion was the ability to recognize all the learned images.

The modelling program allows noise experiments to be carried out with a noise ratio (NR) step of 5%. The term NR is illustrated by the following:

$$p_0 = p \left(1 + \frac{s\,(2\,\mathrm{RND} - 1)}{100}\right) \qquad (1)$$

where p is the initial value of the parameter, p_0 is the value of the parameter after noise is added, s is the NR in percent, and RND is a uniformly distributed random value, 0 < RND < 1. The modelling yields the mean number of misrecognized images and the mean number of unrecognized images. Each mean was taken over 100 iterations. The results of the experiments are presented in the table.
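Equation (1) can be sketched in a few lines. This is an illustrative reading of the formula rather than the authors' modelling program; the function names are hypothetical, and Python's `random()` returns values in [0, 1), which differs from the stated 0 < RND < 1 only at the single point 0.

```python
import random


def add_noise(p, s, rng=random):
    """Equation (1): p0 = p * (1 + s * (2*RND - 1) / 100).

    p   -- calculated parameter value
    s   -- noise ratio (NR) in percent
    rng -- any object with a random() method returning a uniform value
    """
    rnd = rng.random()
    return p * (1 + s * (2 * rnd - 1) / 100)


def noisy_copy(params, s, rng=random):
    """Apply independent noise of ratio s to every parameter in a list."""
    return [add_noise(p, s, rng) for p in params]
```

With s = 10, for example, every parameter stays within 10% of its calculated value, since the factor 2 x RND - 1 lies between -1 and 1.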
No  NN's parameters to be changed                              Allowed % of deviation
1   Synaptic weights of first-layer neurones                   35%
2   Input data of TLNN                                         10%
3   Bias of first-layer neurones                               105%
4   Summation outputs of first-layer neurones                  80%
5   Outputs of first-layer neurones                            25%
6   Synaptic weights of second-layer neurones                  30%
7   Bias of second-layer neurones                              130%
8   Summation outputs of second-layer neurones                 85%
9   Synaptic weights of first- and second-layer neurones
    (NR of all other parameters 5%)                            15%
10  All parameters of TLNN                                     10%

The most important information is contained in rows 9 and 10 of the table. The TLNN retains the ability to recognize all 26 letters of the alphabet correctly in two cases:
1. The NR of the synaptic weights of the first- and second-layer neurones is less than 15% and the NR of all other parameters is less than 5%.
2. The NR of all the parameters is less than 10%.

This shows that any optoelectronic NN fabrication technology that represents all calculated parameters in hardware with an accuracy better than 10% can be used for a hardware realization of the TLNN under consideration.

Fig. 2 plots the noise tolerance for all the parameters of the TLNN. It is important to note that the number of correctly recognized images decreases linearly as the NR of the TLNN's parameters increases. At NR = 60% the TLNN still correctly recognizes 50% of the images. Computer systems of previous generations show a different, avalanche-like dependence: 100% recognition at weak noise and 0% recognition once the noise exceeds a certain level.

Fig. 2. Noise tolerance of the TLNN (number of correctly recognized images N versus NR).

The computer experiment shows that the TLNN has significant noise resistance. This makes the TLNN attractive for on-board implementations.

4. TLNN.
HARDWARE

Optical realization of the TLNN is an example of the effective application of the MAOM. The main part of the calculations in this case is vector-matrix multiplication: A*B = C. The proposed OVMM architecture (fig. 3) performs this in one step. Traditionally the DMAC (digital multiplication via analogue convolution) [3] algorithm is used to obtain high accuracy in optical calculations.

Fig. 3. Vector-matrix multiplier with time integration.

The algorithm we propose is similar to DMAC, with one significant difference: it uses the frequency coding available in the MAOM. As a result the OVMM can multiply (convolve) 2D and 3D data flows and obtain the whole vector-matrix product in one step, at a speed close to the frame rate of the MAOM. In this case the vector is input into the laser array (LA) as a 2D array and the matrix is input into the MAOM as a 3D array.

In the proposed architecture vector A has 8 components (a_i), and each component is represented by an 8-bit binary number (bits a_ik, each 1 or 0), i.e. by a byte: k = 1 is the most significant bit (MSB), k = 8 is the least significant bit (LSB). The dimension of matrix B is 8x8 and each element b_ij is also represented by a byte (b_ijs, s = 1..8): s = 1 is the MSB, s = 8 is the LSB. The elements of the 3D array b_ijs (i.e. matrix B) are input into the MAOM: i is the pulse position in the channel aperture at the moment of exposure, j is the channel number, s is the number of the acoustic wave frequency (f_s).

When the MAOM aperture is filled, the LA emits eight light pulses (exposures), one for each bit of a vector element. Each vector component is applied to a corresponding "window", as shown in fig. 3. The LSBs of the components are transmitted first, the MSBs last. Radiation from each laser uniformly illuminates the corresponding "window" of every channel. The optical system transfers the light diffracted in the MAOM onto a set of eight linear CCD arrays.
Each linear CCD contains eight points, one for each frequency. The CCD operates in the shift-and-add mode. The shift velocity corresponds to the bit loading speed of the lasers. The charge accumulated in a point of a linear CCD during the first exposure is shifted to the next point of that CCD for the second exposure. Each linear CCD represents a component of the output vector C in a mixed representation. The supporting electronic control system (CS) is responsible for converting this data from the mixed to hex format, for thresholding, for synchronization, etc.

To check the feasibility of performing DMAC in this way, a breadboard model that multiplies 8-digit numbers was assembled. The bit-by-bit input rate was limited only by the maximum transfer frequency achievable with the given CCD array. A typical DMAC multiplication signal obtained with multiplexing in the MAOM and registered directly at the CCD output is presented in fig. 4.

With the limiting parameters of a TeO2 MAOM, namely 100 "windows" and 8 frequencies, the laser pulse repetition period is about 10 ns and the CCD array transfer frequency is about 120 MHz. The productivity of the MAOM in calculations with a "constant" matrix is:

$$P = \frac{(2MN - M)\,\Delta F}{2L} \qquad (2)$$

where M and N are the dimensions, L is the number of bits and \Delta F is the passband of the MAOM.

In the general case such a multiplier is L times slower than an analog one that uses the same input elements (LA and MAOM). Moreover, a considerable drawback that affects the efficiency of the system is the need to use a CCD to complete DMAC with time integration. Currently available CCDs have a comparatively low detection capability. As has been shown [4], at the present stage this calls into question the competitiveness of such a device not only against analog AO processors but also against existing electronic means.
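The shift-and-add accumulation and the conversion from the mixed representation described for the time-integrating multiplier can be sketched numerically. This is a purely arithmetic model under stated assumptions: it ignores the optics, the frequency multiplexing and the detector noise, and every function name is hypothetical. It shows only that convolving the bit sequences of two numbers gives mixed-binary digits whose weighted sum is the product.

```python
def to_bits_lsb_first(value, n_bits=8):
    """Binary digits of value, least significant bit first."""
    return [(value >> k) & 1 for k in range(n_bits)]


def dmac_mixed(a, b, n_bits=8):
    """Convolve the bit sequences of a and b.

    The result is a list of 2*n_bits - 1 mixed-binary digits (LSB first);
    each digit may exceed 1, as in the CCD output signal.
    """
    a_bits = to_bits_lsb_first(a, n_bits)
    b_bits = to_bits_lsb_first(b, n_bits)
    mixed = [0] * (2 * n_bits - 1)
    for i, a_bit in enumerate(a_bits):      # one exposure per vector bit
        for j, b_bit in enumerate(b_bits):  # one matrix bit per position
            mixed[i + j] += a_bit * b_bit   # charge added after i shifts
    return mixed


def mixed_to_int(mixed):
    """Convert mixed-binary digits (LSB first) to an ordinary integer."""
    return sum(digit << k for k, digit in enumerate(mixed))


def dmac_vector_matrix(a_vec, b_mat, n_bits=8):
    """C = A*B with c_j = sum_i a_i * b_ij, accumulated in mixed form."""
    result = []
    for j in range(len(b_mat[0])):
        acc = [0] * (2 * n_bits - 1)
        for i, a in enumerate(a_vec):
            for k, digit in enumerate(dmac_mixed(a, b_mat[i][j], n_bits)):
                acc[k] += digit
        result.append(mixed_to_int(acc))
    return result
```

For instance, `dmac_mixed(3, 3, n_bits=2)` gives the mixed digits `[1, 2, 1]`, and `mixed_to_int` turns them into 1 + 2x2 + 1x4 = 9.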
The architectures of digital vector-matrix AO multipliers with space integration presented below seem more promising.

Fig. 4. Example of DMAC multiplication with multiplexing: a) laser pulses, b) multiplexing signal in the acoustooptic cell, c) CCD output signal in a mixed representation.

A processor architecture based on the multifrequency MAOM and performing DMAC with space integration is presented in fig. 5. The architecture (similar to SAOBIC [5] and its analogues) consists of a source of collimated light (SCL), two MAOMs (MAOM1 has N channels, MAOM2 has L channels), the projecting (OS1) and the summing (OS2) optical systems, which also perform optical filtering, and an M-element photodetector array (PDA). The processor multiplies an MxN matrix by an N-vector using an L-bit representation of numbers.

The elements of vector b are loaded in parallel into the first MAOM (each into its own channel) in serial code. The elements of the matrix a_ij are loaded in the following order: the elements with index j are applied in multiplexed form, the i-th element at the i-th frequency, and occupy the position in the channel of the second MAOM that corresponds to the projection of the j-th channel of the first MAOM. Only the l-th bits of the matrix elements are applied to the l-th channel of the second MAOM. The modulator materials and the scale of the projecting optical system OS1 are chosen so that the ratio of the transit time of the optical image created by the first MAOM to the time sound takes to cross a pixel of the second MAOM equals the word length L. The whole multiplication process requires 2xL steps of the first ("fast") MAOM. The optical system OS2 performs the DMAC summation and the summation of the i-th components on the i-th photodetector.
The subsequent processing of the resulting vector elements consists of analog-to-digital conversion of the triangular mixed-code pulses that arrive from the photodiodes after amplification, followed by summation with shift of the conversion results. The postprocessing as a whole requires M ADCs (analog-to-digital converters) and M shift-adders.

Fig. 5. OVMM architecture with space integration (SCL, MAOM1, OS1, MAOM2, OS2, PDA).

The limiting productivity in continuous calculations with a "constant" matrix can reach:

$$P = \frac{(2MN - M)\,\Delta F}{2} \qquad (3)$$

where M and N are the dimensions, L is the number of bits and \Delta F is the passband of the second MAOM.

After a corresponding alteration of the postprocessing devices, the given architecture can be used for operations with mixed-sign arrays represented in two's complement code [6]. Registration of the convolution results begins only after the first modulator has been filled and requires only L "fast" steps. The light source is synchronized so that light reaches the modulators only during the interval between two moments: the moment when the channels of the first MAOM are completely filled by the elements of vector b and the moment when the last bits of the elements of b leave the channel. The postprocessing device in this case requires N dividers and N shift-adders.

The maximum dimension of the vectors and matrices that can be processed in such a system is, on the one hand, limited by the maximum number of resolvable outputs of the MAOM2 channels and, on the other hand, determines the dynamic range (DR) the system needs at a given word length (in turn, the DR allows discrimination of no more than 250-260 levels of the received signals); so at L = 8 the dimension of processed square matrices can be about 30. In the general case, an n-fold increase in the word length reduces the maximum dimension of the processable arrays n times and lowers the efficiency accordingly. The dimension of the processable matrix can be increased (with a simultaneous increase in efficiency) by an m-fold increase in the time aperture of MAOM1 and an m-fold increase in the number of PDAs and of channels of MAOM2.

In a system that multiplies a matrix of dimension m x M x N by a vector of dimension N at word length L, the channels of the "fast" MAOM1 are divided into m parts, each providing L discriminated outputs, and are projected onto the appropriate L channels of MAOM2 (whose total number of channels is m x L). The loading of the vector elements into MAOM1 and of the matrix elements into each of the m groups of MAOM2 channels (with appropriate delays), the registration of results from each of the m groups of the PDA, and the clocking of the light sources when mixed-sign arrays are processed are all performed as in the architecture of fig. 5. A complete matrix-vector multiplication requires (m + 1) x L steps of the first MAOM, but the next multiplication can begin after 2 x L steps. The speed limit will be:

$$P = \frac{m\,(2MN - M)\,\Delta F}{2} \qquad (4)$$

The efficiency of such a multiplier can be increased by replacing the first MAOM with matrices of light sources that provide a similar "movement" of the elements of vector b. The efficiency is expected to increase through a considerable reduction of the required light and UHF power. However, because of the imperfection of modern light-source matrix technology, the practical realization of such a system seems rather difficult.

A considerable drawback of systems with space integration is the relative complexity and high speed required of the read-out and postprocessing devices (in the general case L times higher than in analog devices that use the same MAOM for matrix input). Estimates of the throughput of various acoustooptic vector-matrix multipliers with TeO2 modulators as matrix input devices are presented in fig. 6 and show the potentially high competitiveness of acoustooptics in comparison with electronics.
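The dynamic-range estimate given earlier in this section (about 250-260 distinguishable levels, L = 8, square matrices of dimension about 30) can be reproduced with a short calculation. The sketch assumes that the largest mixed-code digit for an N-element product of L-bit words is N x L, which is the all-ones worst case; this is one plausible reading of the estimate, not a derivation taken from the paper, and the function names are hypothetical.

```python
def peak_mixed_digit(n, word_length):
    """Largest mixed-code digit when n products of all-ones words
    of the given word length are summed on one detector."""
    ones = [1] * word_length
    mixed = [0] * (2 * word_length - 1)
    for i in range(word_length):
        for j in range(word_length):
            mixed[i + j] += ones[i] * ones[j]
    return n * max(mixed)  # the central digit equals word_length


def max_dimension(levels, word_length):
    """Largest n whose peak digit n * word_length fits within `levels`
    (the zero level is ignored in this rough estimate)."""
    return levels // word_length


# 250-260 levels at L = 8 give a dimension of 31-32, i.e. about 30;
# doubling the word length to 16 roughly halves it.
estimates = [max_dimension(levels, 8) for levels in (250, 260)]
halved = [max_dimension(levels, 16) for levels in (250, 260)]
```

Under this assumption the inverse proportionality between word length and maximum dimension stated in the text follows directly from the product N x L being bounded by the number of levels.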
Fig. 6. Variations of throughput (MOPS versus matrix dimension): 1 - TRW TMC2208 8-bit multiplier [4], 2 - analog multiplier with updatable matrix [4], 3 - multiplier with time integration, 4 - analog multiplier with constant matrix [4], 5 - space integration multiplier with constant matrix (with LiNbO3 "fast" MAOM).

5. CONCLUSION

Thus, at present, vector-matrix multiplications in neural networks can be realized on the basis of acoustooptic processors. The potential speed and efficiency of such processors compare favorably with existing electronic means for large dimensions of the processed arrays. The initially low accuracy of such calculations can be increased at the cost of a more complex representation of the processed values, stringent requirements on the alignment of the system as a whole, and a more complex postprocessing system. The configuration of the system (architecture, accuracy of representation of values, interface and postprocessing systems) is defined by the particular statement of the task.

6. REFERENCES

1. E. Barnard and D. Casasent, "New Optical Neural System Architectures and Applications," Proc. SPIE, Vol. 963, pp. 537-545, 1988.
2. B. Muller, J. Reinhardt, Neural Networks, Springer-Verlag, 1990.
3. P. R. Beaudet, A. P. Goutzoulis et al., Appl. Optics, Vol. 25, No. 18, p. 3097, 1986.
4. C. K. Gary, "Comparison of optics and electronics for the calculation of matrix-vector products," Proc. SPIE, Vol. 1704.
5. P. S. Guilfoyle, "Systolic acousto-optic binary convolver," Opt. Eng., Vol. 23, pp. 20-25, Jan./Feb. 1984.
6. R. P. Bocker, S. P. Clayton, K. Bromley, "Electro-optical matrix multiplication using 2's complement arithmetic for improved accuracy," Appl. Optics, Vol. 22, No. 13, pp. 2019-2021, July 1983.