Majlesi Journal of Electrical Engineering, Vol. 6, No. 2, June 2012

FPGA Implementation of Character Recognition Using Spiking Neural Network

Ensieh Iranmehr (1), Bijan Vosughi Vahdat (2), Mohammad Mahdi Faraji (3)
1- Sharif University of Technology, Department of Electrical Engineering, Tehran, Iran. Email: [email protected]
2- Sharif University of Technology, Department of Electrical Engineering, Tehran, Iran. Email: [email protected]
3- Sharif University of Technology, Department of Electrical Engineering, Tehran, Iran. Email: [email protected]

ABSTRACT: Character recognition is useful in many fields of engineering. Motivated by the remarkable visual ability of humans, this paper describes a simple biologically inspired model based on a Spiking Neural Network (SNN) for recognizing characters. Two datasets are used: MNIST for recognizing English characters and the Bani Nick Pardazesh dataset for recognizing Persian characters. The proposed network is a two-layer structure consisting of Integrate-and-Fire (IF) and active dendrite neurons. The first layer of the network is trained with a proposed algorithm based on k-means, and the second layer with a modified algorithm based on Spike Time Dependent Plasticity (STDP). The structure is designed so that it can be implemented properly on a Field Programmable Gate Array (FPGA). Implementation results demonstrate that the model occupies few resources and is very fast in character recognition applications. Finally, the proposed neural structure has been evaluated on test data; simulation results indicate a high character recognition accuracy.

KEYWORDS: Character Recognition; Spiking Neural Network; STDP; k-means; FPGA Implementation

1. INTRODUCTION

In recent years, Optical Character Recognition (OCR) technology has been applied throughout the entire spectrum of industries, revolutionizing the document management process. OCR turns scanned documents into more than just image files: they become fully searchable documents whose text content is recognized by computers, so that relevant information can be extracted and entered automatically. The uses of OCR vary across different fields. One widely known application is in banking, where OCR is used to process checks without human involvement. There are also many applications in intelligent transportation systems for recognizing license plate characters [1]. A License Plate Recognition (LPR) system can be used in traffic control management to recognize vehicles that commit traffic violations, such as entering a restricted area without permission, running a red light, or breaking the speed limit.

The human ability to recognize handwritten characters is remarkable, so it is attractive to mimic the biological structure used to recognize characters. Biological neurons use pulses, or spikes, to encode information, and information is stored in the timing of spikes. Thus, in this paper, a spiking neural network model is used to recognize characters. There have been a few studies using spiking neuron models to recognize characters. For example, Gupta et al. [2] recognized characters using a two-layer SNN that was able to correctly identify noisy printed characters. Bhuiyan et al.
[3] compared implementations of a two-layer spiking neural network for recognizing printed characters, in terms of computation time on two multicore processors, using the Izhikevich and the Hodgkin-Huxley neuron models. Kulkarni et al. [4] analyzed the performance and hardware cost of a nearest-neighbor classifier and of an SNN for a handwritten character recognition application.

This paper implements a designed structure for character recognition on an FPGA based on a spiking neural network, with STDP used for training. The idea is that when the stimulus (in the form of a constant current) is presented as input to the IF neurons, the output takes the form of a series of spikes. Perrett et al. [5] showed that visual information must essentially propagate in a feed-forward fashion, with most neurons only having time to fire one spike. Rank-order coding (ROC) is a simple way of using the order in which neurons spike as a code: the exact latency at which a neuron fires is not critical, and only the rank order of the neurons matters [6, 7]. In this paper, ROC is used for presenting an image, which allows ultra-fast categorization of a static image.

The next section describes the structure of the proposed network for recognizing characters, which is based on a spiking neural network. Section 3 presents the proposed training algorithm, Section 4 the results of the FPGA implementation, and Section 5 the simulation results of the proposed algorithm. Finally, Section 6 concludes the paper.

2. THE STRUCTURE OF THE PROPOSED NETWORK

The proposed network is a two-layer, fully connected SNN consisting of IF [8] and active dendrite neurons, shown in Fig. 1. The input of the network is an image showing a character; this image is given to the first layer of the proposed SNN. The output of each IF neuron of this layer is a spike train generated using the weights of the first layer. A register bank between the two layers is used for pipelining, meaning that the first and second layers of the SNN are able to work simultaneously: at each time step, while the spikes of the first-layer neurons are being computed, the spikes of the previous time step are held in the register bank, from which the second layer computes its output spikes. To identify the class of a character, the second-layer neuron that generates a spike earlier than the others indicates the recognized class of the input character. The structure is designed so that it can be implemented properly on an FPGA. Each layer of the designed network is described in detail in the following.
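Before the hardware details, a minimal behavioral model helps fix the computation: first-layer IF neurons accumulate the overlap between the binary input image and their binary weights, second-layer IF neurons accumulate the signed weights of spikes arriving through the pipeline register, and the first second-layer spike decides the class. The Python sketch below is only an illustration under assumed thresholds, refractory period, and array layout; it is not the authors' Verilog implementation.

```python
import numpy as np

def classify(image, W1, W2, vth1, vth2, t_ref=5, max_steps=200):
    """image: flat 0/1 pixel vector; W1: (N1, pixels) binary weights;
    W2: (C, N1) signed weights. Returns the recognized class, i.e. the
    index of the first second-layer neuron to fire (-1 if none fires)."""
    v1 = np.zeros(len(W1))                    # first-layer membrane voltages
    tr = np.zeros(len(W1), dtype=int)         # refractory counters
    v2 = np.zeros(len(W2))                    # second-layer membrane voltages
    s1_prev = np.zeros(len(W1), dtype=bool)   # register bank between the layers
    for _ in range(max_steps):
        # first layer: constant input current = overlap of image and weights
        # (the 240 AND gates plus an adder tree in the hardware version)
        active = tr == 0
        v1[active] += (W1[active] & image).sum(axis=1)
        s1 = (v1 > vth1) & active             # spike where threshold is crossed
        v1[s1] = 0.0                          # reset voltage after a spike
        tr[s1] = t_ref                        # start the refractory period
        tr[~active] -= 1
        # second layer: sum the weights of first-layer neurons that spiked on
        # the previous step (held in the pipeline register); no refractory here
        v2 += W2 @ s1_prev.astype(float)
        if (v2 > vth2).any():                 # first output spike decides
            return int(np.argmax(v2))
        s1_prev = s1
    return -1
```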
2.1. Configuration of the First Layer of the Proposed Network

In this subsection, the first layer of the spiking neural network structure is presented. Fig. 2 shows the logical architecture of the first-layer IF neuron updating core of the proposed network. A state machine is used for updating the inside voltage and generating the output spike of each first-layer neuron. If an input image of 20 × 12 = 240 pixels is given to the SNN, 240 weights are needed for each first-layer IF neuron. For simplicity, these weights are binary, so applying them to the binary input image only requires AND gates: 240 AND gates compare the input image with the weights. It is worth mentioning that the input weights are computed offline in MATLAB.

Fig. 1. Structure of the proposed neural network

Fig. 2. Logical architecture of the first-layer IF neuron updating core (240-bit input weights, 16-bit neuron inside voltage, 6-bit refractory time, 1-bit output spike)

As shown in Fig. 2, the IF neuron adds the outputs of all the AND gates to its inside voltage and updates the voltage register. If the voltage exceeds Vth, a spike is generated and the inside voltage is set to zero; one bit in memory holds the output spike. The memory reserves 16 bits for the neuron inside voltage and 6 bits for the neuron refractory time. When the inside voltage exceeds Vth, the refractory time is set to T_ref; otherwise, if the refractory time is not zero it is decremented at each step, and if it is zero it does not change. The logical structure of Fig. 2 updates the state of a single neuron in one clock period, so one clock period is needed for every neuron in the first layer. For instance, if the first layer consists of 36 neurons, updating one time step of the first-layer neuron states takes 36 clocks.

2.2. Configuration of the Second Layer of the Proposed Network

In this subsection, the second layer of the spiking neural network structure is presented. Fig. 3 shows the logical architecture of the second-layer IF neuron updating core; again, a state machine updates the inside voltage and generates the output spike of each second-layer neuron. In contrast to the first layer, where a single updating core is used, an individual updating core is utilized for each second-layer neuron in order to speed up the update of the second layer. Also unlike the first layer, whose weights are 0 or 1, the weights of the second layer are 16-bit two's complement numbers; they are computed offline and stored in memory. When the first-layer IF neurons of the proposed SNN generate spikes, the weights of the corresponding connections of each second-layer IF neuron are summed together. If the neuron inside voltage exceeds Vth, a spike is generated and the inside voltage is set to zero. Refractory time is not implemented in this layer, because once the first spike is generated in the second layer, the output class of the input character is determined and the neuron states no longer matter.

Fig. 3. Logical architecture of the second-layer IF neuron updating core (16-bit input weights, 32-bit neuron inside voltage, 1-bit output spike)

2.3. Output Class Identifier Layer

As mentioned before, the neuron that generates a spike first determines the class of the input image. The logical architecture for detecting the first received spike is shown in Fig. 4: as soon as one of the output class lines changes to 1, none of the output class lines can change any more. This function is implemented with OR gates and multiplexers; when one input of the OR gate becomes 1, the second input of each multiplexer is selected, creating a feedback loop that freezes the values. One advantage of this structure over traditional neural networks is its simplicity in finding the output class: a traditional network usually needs to compare all output values in order to find the maximum, whereas in the proposed neural structure the first spike indicates the output class, which can be implemented more easily on an FPGA.

Fig. 4. Logical architecture for detecting the first received spike
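A compact software analogue of this first-spike latch (the OR gate plus multiplexer feedback of Fig. 4) might look as follows; the function name and the list representation are illustrative assumptions, not the authors' RTL.

```python
def first_spike_latch(latched, spikes):
    """One time step of the Fig. 4 latch. latched: class bits frozen so far;
    spikes: current second-layer spike bits. Call once per time step."""
    if any(latched):          # OR gate: a class line already went to 1
        return latched        # multiplexers feed the old bits back: frozen
    return list(spikes)       # otherwise pass the new spike bits through

# usage: latched = [0] * n_classes, then each step:
#   latched = first_spike_latch(latched, second_layer_spikes)
```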
3. PROPOSED TRAINING ALGORITHM

In this section, a training algorithm is presented for the proposed two-layer SNN. The training algorithm of each layer is described in detail in the following.

3.1. Training Algorithm of the First Layer

In this subsection, the training algorithm of the first layer is presented. The weights of the first layer are computed by means of the k-means method [9]. Each neuron has to detect a specific pattern of the input image. In order to detect more pattern types, each class of data is divided into clusters; each cluster center then stands for one specific pattern of the input image, so each class of data can be represented through its own cluster centers. For example, with 4 clusters per class and 9 classes of training data, there are 4 × 9 = 36 neurons in the first layer, and the weights of every neuron are obtained from one cluster center. After computing the center of each cluster by means of k-means, in order to establish equality between the classes, the M strongest pixels of each cluster center are given weight 1 and the others weight 0, where a stronger pixel means one with higher intensity.

Algorithm 1: Pseudo-code for training the first layer of the SNN
Initialization:
  C ← number of output classes
  k ← number of clusters per class
  N ← number of first-layer neurons = k × C
  M ← number of 1 weights for each first-layer neuron
For each class c of the data:
  (a) Cluster the training data with label c into k different clusters using the k-means method
  (b) For each of the k cluster centers, select its M strongest pixels, set the corresponding weights to 1 and the others to 0
  (c) Assign these weights to the corresponding neurons

Algorithm 1 presents the modified training algorithm of the first layer based on the k-means method; a software sketch of this procedure is given after Fig. 5. In the algorithm, W_ki is the weight connecting the k-th pixel of the input character to the i-th neuron of the first layer. Two datasets are used in this paper: the extracted Persian characters provided by the Bani Nick Pardazesh Company [10], where each character is 12 × 20 = 240 pixels, and the MNIST dataset, where each character is 28 × 28 = 784 pixels. The MNIST dataset consists of 10 classes with 60000 training and 6000 test samples; from the Bani Nick Pardazesh dataset, 9 classes with 10400 training and 2600 test samples are considered. Fig. 5 shows the weights obtained for various values of M on the two datasets: the right side shows the computed weights for MNIST, while the computed weights for the Bani Nick dataset are shown on the left side.

Fig. 5. Cluster centers for the two datasets and for different values of M
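The following is a hedged Python sketch of Algorithm 1 using scikit-learn's KMeans (the paper computes the weights offline in MATLAB); the function name and the default values of k and M are assumptions for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans

def train_first_layer(X, y, k=4, M=40):
    """X: (n_samples, n_pixels) flattened character images; y: class labels.
    Returns a binary weight matrix of shape (k * n_classes, n_pixels)."""
    weights = []
    for c in np.unique(y):                     # one k-means run per class
        centers = KMeans(n_clusters=k, n_init=10).fit(X[y == c]).cluster_centers_
        for center in centers:                 # binarize each cluster center
            w = np.zeros_like(center)
            w[np.argsort(center)[-M:]] = 1     # keep only the M strongest pixels
            weights.append(w)                  # same count of 1s for every neuron
    return np.array(weights, dtype=np.uint8)
```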
3.2. Training Algorithm of the Second Layer

In this subsection, the training algorithm of the second layer is presented. The weights of the second layer are computed by means of an STDP-based method. As mentioned before, the binary weights of the first layer are computed so that each neuron tries to find one specific character pattern: if the input character is more similar to one neuron's weights than to the others', that neuron will generate earlier and more spikes. Moreover, the first spike at the output represents the detected class of the input character. The second layer's duty is therefore to establish the connection between the corresponding first-layer neurons and the output classes.

Algorithm 2: Pseudo-code for training the second layer of the SNN
Initialization:
  α ← learning rate
  W ← initial random positive values
For every training character:
  (a) Initialize all neurons to the zero state
  (b) Put the training character at the input of the SNN
  (c) Run the SNN until the first spike is generated in the second layer
  (d) If the generated spike is in the correct class, do nothing; otherwise update the weights of the wrong neuron as W_ij(new) = W_ij(old) - ΔW_ij and the weights of the correct neuron as W_ij(new) = W_ij(old) + ΔW_ij, where
      ΔW_ij = α / (T_j^teacher_spike - T_i^spike)
  (e) Repeat this loop until W converges

Algorithm 2 presents the modified training algorithm of the second layer based on the STDP algorithm [11]; it is sketched in code after Fig. 6. In the algorithm, W_ij is the weight connecting the i-th neuron of the first layer to the j-th neuron of the second layer, T_j^teacher_spike stands for the time of the teacher spike of the j-th second-layer neuron, which is taken to be one time step before the wrong spike, and T_i^spike represents the last spike time of the i-th first-layer neuron.

Fig. 6 illustrates the proposed training algorithm with a simple example. Assume there are 2 classes, each represented by one second-layer neuron: if the first neuron spikes earlier than the second, the input belongs to class 1, and if the second neuron spikes earlier, the input belongs to class 2. Suppose that, after running the SNN, the second neuron wrongly generates a spike sooner than the first. In this situation the weights have to be updated. The teacher spike time is set one time step before the time of the wrongly generated spike. The closer a previous-layer neuron's spike is to the teacher spike, the larger the magnitude of the weight change ΔW; ΔW is taken positive for connections to the correct neuron, in order to increase the corresponding weights, and negative for connections to the wrong neuron, in order to decrease them.

Fig. 6. A sample showing the proposed training algorithm of the second layer based on STDP (weight changes of ±1 to ±4 depending on spike timing)
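The update rule of Algorithm 2 can be written compactly as follows. This Python sketch uses assumed data structures (the paper trains the weights offline), and the clamping of the time difference to at least one step is an added guard against division by zero, not something stated in the paper.

```python
import numpy as np

def stdp_update(W2, last_spike_t, wrong, correct, t_wrong, alpha=1.0):
    """W2: (C, N1) second-layer weights; last_spike_t: (N1,) last spike time
    of each first-layer neuron (np.inf if it never fired); wrong/correct:
    indices of the wrongly firing and the correct output neurons; t_wrong:
    time of the wrongly generated output spike."""
    t_teacher = t_wrong - 1                         # teacher spike one step earlier
    fired = np.isfinite(last_spike_t)
    dt = np.maximum(t_teacher - last_spike_t, 1.0)  # assumed guard, see above
    dW = np.where(fired, alpha / dt, 0.0)           # closer spikes: larger |dW|
    W2[wrong] -= dW                                 # depress the wrong neuron
    W2[correct] += dW                               # potentiate the correct neuron
    return W2
```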
4. IMPLEMENTATION ON FPGA

In this paper, in order to evaluate the proposed algorithm, the Verilog code was synthesized and implemented on a Xilinx Spartan-3 XC3S200, a Spartan-3 XC3S400, and a Virtex-7 XC7V330T. With the Spartan-3 XC3S400 a character is recognized in 37 µs, and with the Virtex-7 XC7V330T in 15 µs. Table 1 lists the Spartan-3 and Virtex-7 resources needed for the proposed SNN structure; these implementation results are for a 20 × 12 input image, 36 first-layer neurons, and 9 second-layer neurons. The main advantage of the FPGA implementation is the high speed of character recognition: with the Spartan-3 XC3S400, 27000 characters can be recognized per second.

Table 1. Resources needed for the FPGA implementation

  FPGA chip                      Spartan-3 XC3S200   Spartan-3 XC3S400   Virtex-7 XC7V330T
  Occupied slices                45 %                21 %                1 %
  Occupied block RAM             5.6 %               4.1 %               1 %
  Max clock                      90 MHz              100 MHz             258 MHz
  Time to classify a character   42 µs               37 µs               15 µs

5. SIMULATION RESULTS

In this section, the presented algorithm is simulated in MATLAB; several simulated examples demonstrate the accuracy and efficiency of the proposed method. Fig. 7 shows the inside voltage of 4 of the 36 first-layer neurons, together with their generated spikes, for an input of class 1. As mentioned before, the IF neurons generate spikes when their inside voltages exceed the threshold voltage, which can be seen in this figure.

Fig. 7. A sample of first-layer voltages and the corresponding generated spikes for 4 neurons

Fig. 8 shows the inside voltage of 3 of the 9 second-layer neurons and their generated spikes. Since the input image belongs to class 1, the proposed algorithm requires the first neuron of the second layer to fire earlier than the other second-layer neurons, and this is indeed what happens, so the class of the input image is determined correctly.

Fig. 8. A sample of second-layer voltages and the corresponding generated spikes

Simulations on both datasets were run for different numbers of first-layer neurons; because the two datasets have different numbers of classes, different numbers of first-layer neurons are considered. For instance, with C = 9 classes and k = 4 clusters per class, N = 36 first-layer neurons are used, and with C = 10 classes and k = 8, N = 80 neurons. The mean error percentage of character recognition for different numbers of first-layer neurons is presented in Table 2. As seen from Table 2, the best result for the Bani Nick Pardazesh dataset is achieved with k = 16 and the best result for MNIST with k = 64; the corresponding accuracies are high, about 98.79 % for the Bani Nick Pardazesh dataset and 95.17 % for MNIST.

Table 2. Mean error percentage on both datasets

  Clusters per   Bani Nick Pardazesh   MNIST            Required time to recognize
  class          mean error (%)        mean error (%)   a character (µs)
  k = 1          8.76                  88.19            11
  k = 2          5.15                  17.98            19
  k = 4          1.82                  13.38            37
  k = 8          1.53                  10.23            73
  k = 16         1.21                  8.36             145
  k = 32         1.25                  6.80             289
  k = 64         1.24                  4.83             577
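As a rough consistency check of these timing figures (an inference, not a calculation from the paper), assume the first layer updates serially at one neuron per clock at the 100 MHz of Table 1, with the second layer hidden behind the pipeline register. The recognition times in Table 2 then correspond to roughly 100 simulated time steps per character, and the 37 µs figure matches the quoted throughput of about 27000 characters per second:

```python
f_clk = 100e6                     # Spartan-3 XC3S400 max clock from Table 1
for n_neurons, t_char_us in [(36, 37), (144, 145)]:  # Table 2 rows k = 4, k = 16
    steps = t_char_us * 1e-6 * f_clk / n_neurons     # clocks / (neurons per step)
    print(f"N={n_neurons}: ~{steps:.0f} time steps per character")  # ~103, ~101
print(f"throughput at 37 us/char: ~{1 / 37e-6:.0f} chars per second")  # ~27027
```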
Table 3 compares the accuracy and recognition time of the proposed SNN with k = 8 and k = 16 against another paper [12] in which Persian license plate characters have been recognized. Since there are 8 characters on a Persian license plate, the time needed for the recognition of a plate is 8 times the time required for the recognition of one character. As can be seen from Table 3, the time to recognize a license plate using the proposed SNN is much less than with the other methods, while the accuracy is comparable.

Table 3. Comparison of the accuracy and speed of the proposed method with others

  Algorithm                  Accuracy (%)   Recognition time (ms)
  Reference [4]              90.7           -
  MLP-SVM                    94.76          92.5
  PNN-SVM                    95.11          55.7
  ML-SVM                     95.54          96.8
  Proposed SNN with k = 8    91.4           0.59
  Proposed SNN with k = 16   95.2           1.18

6. CONCLUSION

In this paper, a two-layer spiking neural network has been proposed to identify characters. The structure of the SNN is designed to be fully compatible with FPGA constraints: a state machine updates the inside voltage of each first- and second-layer IF neuron and generates spikes, and one second-layer neuron is assigned to each class of characters. An algorithm based on k-means has been used to train the first layer of the network, and a proposed algorithm based on STDP to train the second layer. The whole structure is designed so that it can be implemented efficiently on an FPGA, and implementation results show that it fits comfortably on commercial FPGAs. By means of a parallel, pipelined implementation on the FPGA, fast recognition of about 37 µs per character is achieved. Finally, the proposed system has been evaluated by simulation in MATLAB; the simulation results on the two datasets demonstrate the feasibility of the proposed algorithm and a high accuracy of about 98 percent.

ACKNOWLEDGMENT

All the experiments and ideas of this research work were developed in the Artificial Creature Lab of the Electrical Engineering Department, Sharif University of Technology.

REFERENCES

[1] C. N. E. Anagnostopoulos, I. E. Anagnostopoulos, V. Loumos, and E. A. Kayafas, "License Plate Recognition Algorithm for Intelligent Transportation System Applications," IEEE Transactions on Intelligent Transportation Systems, vol. 7, pp. 377-391, 2006.
[2] A. Gupta and L. N. Long, "Character recognition using spiking neural networks," Proc. of Intl. Joint Conference on Neural Networks, pp. 53-58, 12-17 Aug. 2007.
[3] M. A. Bhuiyan, R. Jalasutram, and T. M. Taha, "Character recognition with two spiking neural network models on multicore architecture," Proc. Computational Intelligence for Multimedia Signal and Vision Processing '09, pp. 29-34, Nashville, 2009.
[4] Sh. R. Kulkarni and M. Sh. Baghini, "Spiking Neural Network based ASIC for Character Recognition," Ninth International Conference on Natural Computation (ICNC), 2013.
[5] D. I. Perrett, E. T. Rolls, and W. Caan, "Visual neurons responsive to faces in the monkey temporal cortex," Experimental Brain Research, 47(3), pp. 329-342, 1982.
[6] S. J. Thorpe and J. Gautrais, "Rapid visual processing using spike asynchrony," Advances in Neural Information Processing, 1997.
[7] S. J. Thorpe and J. Gautrais, "Rank Order Coding: A new coding scheme for rapid processing in neural networks," in J. Bower (Ed.), Computational Neuroscience: Trends in Research 1998, Plenum Press, New York, pp. 113-118, 1998.
[8] L. F. Abbott, "Lapicque's introduction of the integrate-and-fire model neuron (1907)," Brain Research Bulletin, 50(5/6), pp. 303-304, 1999.
[9] J. B. MacQueen, "Some Methods for Classification and Analysis of Multivariate Observations," Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability, vol. 1, University of California Press, pp. 281-297, 1967.
[10] E. Ghahnavieh, M. Enayati, and A.
Raie, "Introducing a large dataset of Persian license plate characters," Journal of Electronic Imaging, 23(2), 023015, 2014.
[11] M. M. Taylor, "The Problem of Stimulus Structure in the Behavioural Theory of Perception," South African Journal of Psychology, 3, pp. 23-45, 1973.
[12] E. Ghahnavieh, A. A. Shahraki, and A. Raie, "Enhancing the License Plates Character Recognition Methods by Means of SVM," ICEE 2014, pp. 220-225.