International Journal of Science, Engineering and Technology Research (IJSETR)
Volume 3, Issue 6, May 2014
Comparison of Handwriting characters Accuracy using
Different Feature Extraction Methods
Thin Nu Nu Lwin, Thandar Soe
Department of Information Technology, Mandalay Technological University
[email protected]
Abstract – Feature extraction techniques are important in character recognition because they can improve recognition efficiency compared with pixel-based approaches. This study investigates novel feature extraction techniques for representing handwritten characters. Three feature extraction methods (DCT, DWT and Gradient) are compared in terms of their effectiveness for recognizing handwritten characters. The system starts by acquiring an image containing characters. The characters then pass through several phases, namely binarization, noise filtering, normalization and feature extraction, before recognition. A multilayer neural network is used for the recognition phase, and the feed-forward back-propagation algorithm is applied to train the network. The purpose of this paper is to compare the feature extraction methods in terms of recognition accuracy and training time.
Keywords – Handwritten English characters (A-Z), DCT, DWT, Gradient, Multilayer neural network, Feed-forward back propagation, Recognition accuracy
I. INTRODUCTION
Handwriting recognition has been one of the active and challenging research areas in the field of image processing. Text recognition, the ability of a computer to distinguish characters and words, can be divided into the recognition of printed characters and the recognition of handwritten characters. Printed characters have one style and size for any given font. Handwritten characters, however, have styles and sizes that vary both for the same writer and between different writers.
The field of handwriting recognition is divided into off-line and on-line recognition. The on-line approach uses a tracking device, e.g., a tablet digitizer or pen device, to collect the time, position and action of the writing strokes. The sequence of writing positions is stored in temporal order. The off-line approach uses a light-sensitive device, e.g., a scanner or a digital camera, to read a written document. Off-line data acquisition and recognition are the focus of this study.
The common processes of off-line handwriting recognition systems are preprocessing, feature extraction and recognition. The feature extraction process extracts the relevant information, known as feature vectors, which can be used to identify the input image in the recognition step. The recognition process uses these features to find the class most compatible with the input. The main objective of feature extraction is to reduce the data dimensionality by extracting the most important features from the character image [1]. The present study aims to compare the performance of the DCT, DWT and Gradient methods for handwritten English characters.
The DCT is a popular signal transformation method that makes use of cosine functions of different frequencies. The DCT transform coding method compresses image data by representing the original signal with a small number of transform coefficients. It exploits the fact that, for typical images, a large amount of signal energy is concentrated in a small number of coefficients. The goal of DCT transform coding is to minimize the number of retained transform coefficients while keeping distortion at an acceptable level. For that reason, the DCT has become the most widely used transform coding technique.
The Wavelet Transform is a powerful technique for representing data at different scales and frequencies. A discrete wavelet transform maps a time-domain signal into the time-frequency domain, and the resulting values are called wavelet coefficients. The Discrete Wavelet Transform (DWT) is based on sub-band coding, which yields a fast computation of the Wavelet Transform. It is easy to implement and reduces the computation time and resources required.
The gradient feature provides higher resolution in both the magnitude and the angle of the directional strokes, which leads to an improvement in the character recognition rate. The gradient feature represents the local characteristics of a character image. The features extracted from handwritten characters are the directions of pixels with respect to their neighboring pixels. This approach increases the information content and gives a better recognition rate with reduced recognition time.
The purpose of this work is to ascertain how effectively each feature extraction technique captures useful information and hence yields more accurate recognition results. The remainder of the paper is organized as follows: Section II briefly reviews prior work on feature extraction for handwritten characters. The proposed system components are given in Section III. Section IV briefly explains the materials and methods used in the current system. The results of our experiment are described in Section V, and conclusions are given in Section VI.
II. RELATED WORKS
Handwriting recognition is one of the oldest and most challenging problems in computer-related research. The challenge is to implement computer systems that can read like humans. Many researchers have worked on off-line handwritten character recognition.
Lawgali [2] compared the effectiveness of the Discrete Cosine Transform (DCT) and the Discrete Wavelet Transform (DWT) in capturing discriminative features of Arabic handwritten characters. A new database containing 5600 characters covering all shapes of Arabic handwritten characters was also developed. DCT and DWT were used for feature extraction, and the coefficients of both techniques were fed to an ANN for classification of the characters. That experiment demonstrated that feature extraction by DCT gives a higher recognition rate than DWT.
Olarik Surinta et al. [3] proposed a novel feature extraction technique called the hotspot technique for representing handwritten characters and digits. In the hotspot technique, the distance values between the closest black pixels and the hotspots in each direction are used as the representation of a character. The hotspot technique was applied to three data sets: Thai handwritten characters (65 classes), Bangla numerals (10 classes) and MNIST (10 classes). The data sets were then classified by the k-Nearest Neighbors algorithm, using the Euclidean distance as the function for computing distances between data points. In that study, the classification rates obtained from the hotspot, mark direction and direction of chain code techniques were compared. The results showed that the hotspot technique provides the highest average classification rate.
Dayashankar Singh et al. [4] presented a new feature extraction technique that calculates only twelve directional feature inputs based on the gradients. A total of 500 handwritten samples, including handwritten Hindi characters, English characters and some special characters, were used in that experiment. The features extracted from handwritten characters are the directions of pixels with respect to their neighboring pixels. These inputs are given to a back-propagation neural network with one hidden layer and one output layer. The experimental results showed that the new approach provides better results than other techniques in terms of recognition accuracy, training time and classification time.
Wunsch and Laine [5] proposed wavelet descriptors for the recognition of handwritten characters. Their experimental results showed that wavelet descriptors are an efficient representation. In that paper, they proposed a new feature extraction method based on the two-dimensional discrete wavelet transform for off-line recognition of unconstrained handwritten numerals, using back-propagation neural networks as the classifier. To verify the performance of the proposed approach, 1500 handwritten numerals written by 30 persons were collected as the database; 750 numerals were used as the training set and the other 750 as the testing set. The recognition rates on the training set and the testing set were 99.1% and 96.8%, respectively. The experimental results show that the proposed method is a simple and efficient representation for unconstrained handwritten numeral recognition that requires less image preprocessing.
Kumar [6] compared the performance of five feature extraction methods on handwritten Devanagari characters. The features covered are Kirsch directional edges, distance transform, chain code, gradient and directional distance distribution. That experiment found that, with SVM classifiers, Kirsch directional edges perform worst and gradient performs best. With a multilayer perceptron (MLP), the performance of gradient and directional distance distribution is almost the same. The chain-code-based feature performs better than Kirsch directional edges and distance transform.
Amir Mowlaei et al. (2002) [7] presented a feature extraction method using the wavelet transform for Farsi/Arabic characters and numerals. The DWT is used to produce the wavelet coefficients, and the Haar wavelet is used during feature extraction. The experiment was done using 480 samples per digit and 190 samples per character. Both sample groups were divided into training and test sets, and both achieved high recognition rates, between 91% and 99%.
From the above literature survey, it is clear that feature extraction is an integral part of any recognition system and that the selection of feature extraction techniques is an important step in achieving higher recognition accuracy. In this paper, DCT, DWT and Gradient are used and compared for feature extraction of all the shapes of handwritten English characters, with a neural network in the recognition stage.
III. SYSTEM COMPONENTS
The entire system can be divided into four parts:
A. Image Acquisition
B. Preprocessing
C. Feature extraction
D. Recognition
Figure 1. Block diagram of the recognition system (Start; Image Acquisition; Preprocessing; Feature extraction by DCT, DWT and Gradient; one Neural Network per feature type, trained with the training database; Save recognition accuracy; Comparison of recognition accuracy; End)
IV. MATERIALS AND METHODS
The steps of the proposed comparison algorithms based on
DCT, DWT and Gradient are described in Fig. 1.
A. Image acquisition
An image is supplied to the system as input. This image should have a specific format, for example, PNG. The image can be acquired through a scanner, a digital camera or another digital input device. The scanner is the most commonly used device because it introduces less noise during the imaging process than other devices. The input images are scanned at a resolution of 300 dpi (dots per inch) and stored as gray scale images, as shown in Figure 2.
Figure 2. Input Image
B. Preprocessing
Before the image is given to the recognition system, it
needs to be brought in a format that is standard and
acceptable to the neural network as input. Various blocks of
the preprocessing step are as follows:
(i).Binarization: In image binarization, the gray scale text image is converted into a binary image, with each pixel taking a value of 0 or 1 depending on the threshold value of the image. The technique most commonly employed for determining the threshold involves analyzing the histogram of gray scale levels in the digitized image. The scanned image and its binarized output are shown in Figure 3.
Figure 3. Input Image and Binary Image of “B”
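The paper does not name the histogram-based rule used to choose the threshold. As a minimal sketch, assuming Otsu's method (a common histogram-based choice, not confirmed by the source) and an 8-bit gray scale NumPy array as input, binarization could look like this. The function names `otsu_threshold` and `binarize` are illustrative, and the output polarity (bright pixels become 1) is also an assumption.

```python
import numpy as np


def otsu_threshold(gray):
    """Pick the gray level that maximizes the between-class variance (Otsu)."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    prob = hist / hist.sum()
    omega = np.cumsum(prob)                    # probability of class "<= t"
    mu = np.cumsum(prob * np.arange(256))      # cumulative mean up to level t
    mu_total = mu[-1]
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = (mu_total * omega - mu) ** 2 / (omega * (1.0 - omega))
    # Levels where one class is empty give 0/0; treat them as zero variance.
    sigma_b = np.nan_to_num(sigma_b, nan=0.0, posinf=0.0, neginf=0.0)
    return int(np.argmax(sigma_b))


def binarize(gray):
    """Return a 0/1 image: 1 where the pixel is brighter than the threshold.

    Assumed polarity: for dark ink on light paper, ink becomes 0.
    """
    threshold = otsu_threshold(gray)
    return (gray > threshold).astype(np.uint8)
```

Inverting the comparison gives the opposite polarity if the later stages expect character pixels to be 1.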
(ii).Noise removal: Noise removal means reducing the noise in an image. In off-line recognition, the noise may come from the writing style or from the optical device that captures the image. The presence of noise can reduce the efficiency of the character recognition system, so it should be eliminated from the image as far as possible to avoid confusion during recognition. Median filtering is used in this paper. An example of a noisy image and its filtered image is shown in Figure 4.
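The median filtering step can be sketched as follows. The window size is not given in the paper, so the 3×3 window and the edge-replicating border handling are assumptions, and `median_filter3` is a hypothetical helper name.

```python
import numpy as np


def median_filter3(img):
    """Replace each pixel by the median of its 3x3 neighborhood.

    Borders are handled by replicating the edge pixels (an assumption).
    """
    height, width = img.shape
    padded = np.pad(img, 1, mode="edge")
    # Stack the nine shifted copies of the image, one per window position.
    windows = np.stack(
        [padded[di:di + height, dj:dj + width]
         for di in range(3) for dj in range(3)],
        axis=0,
    )
    return np.median(windows, axis=0).astype(img.dtype)
```

On a binary image this removes isolated specks, because a single stray pixel is always outvoted by its eight neighbors.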
C. Feature Extraction
In printed and handwritten text, the features capture the
information extracted from the characters. This information
is passed onto the matcher to assist in the classification
process. In this research, DCT, DWT and Gradient are
adopted to extract the features of the characters. DCT, DWT
and Gradient are widely used in the field of digital signal
processing applications.
(i).Discrete Cosine Transform (DCT)
The discrete cosine transform (DCT) is a technique for converting a signal into elementary frequency components. The DCT technique includes three steps: transformation, quantization and encoding.
The transformation step converts the image data into its elementary frequency components. Quantization is the process of reducing the number of possible values of a quantity, thereby reducing the number of bits needed to represent it. Entropy encoding is a technique for representing the quantized data as compactly as possible.
First, the image is divided into 8*8 blocks. The transform clusters the lowest frequency components in the upper left corner and the highest frequency components in the bottom right corner of the coefficient array.
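The two-dimensional DCT of an M×N block f(m, n) is

$$F(u, v) = \alpha(u)\,\alpha(v) \sum_{m=0}^{M-1} \sum_{n=0}^{N-1} f(m, n) \cos\left[\frac{(2m+1)u\pi}{2M}\right] \cos\left[\frac{(2n+1)v\pi}{2N}\right]$$

with the normalization factors

$$\alpha(u) = \begin{cases} \frac{1}{\sqrt{M}}, & u = 0 \\ \sqrt{\frac{2}{M}}, & 1 \le u \le M-1 \end{cases} \qquad \alpha(v) = \begin{cases} \frac{1}{\sqrt{N}}, & v = 0 \\ \sqrt{\frac{2}{N}}, & 1 \le v \le N-1 \end{cases}$$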
Figure 4. Noisy image and noise-removed image of “B”
(iii).Normalization: Normalization is used to standardize the size of the characters within the image. The size of handwritten characters varies from person to person, and even for the same person from time to time.
Therefore, the characters should be scaled to a standardized matrix to make the recognition process independent of the writing size and to obtain better recognition accuracy. In this paper, each character is normalized to a size of 32*32 pixels. An example of size normalization is illustrated in Figure 5.
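Size normalization to 32*32 pixels can be sketched as below. The paper does not say how the scaling is done, so cropping to the bounding box of the character and nearest-neighbor resampling are assumptions, as is the convention that character pixels are nonzero. `normalize_size` is a hypothetical name.

```python
import numpy as np


def normalize_size(binary, size=32):
    """Crop the character to its bounding box and rescale it to size x size.

    Assumes character pixels are nonzero; uses nearest-neighbor sampling.
    """
    rows = np.flatnonzero(binary.any(axis=1))
    cols = np.flatnonzero(binary.any(axis=0))
    if rows.size == 0:
        # Blank input: nothing to crop, return an empty canvas.
        return np.zeros((size, size), dtype=binary.dtype)
    crop = binary[rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1]
    # Map each output row/column back to a source row/column.
    row_idx = np.arange(size) * crop.shape[0] // size
    col_idx = np.arange(size) * crop.shape[1] // size
    return crop[np.ix_(row_idx, col_idx)]
```

Cropping first makes the result independent of where the character sits in the scanned image, which is the stated goal of this step.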
In the DCT definition, f(m, n) is the pixel value at coordinate position (m, n) in the image, and F(u, v) is the DCT-domain representation of f(m, n), where u and v represent the vertical and horizontal frequencies.
The DCT coefficients are then quantized. After quantization, all of the quantized coefficients are read out in a zigzag fashion and stored in a vector sequence, as shown in Figure 7. In image coding, these coefficients would be encoded for transmission; here they are used as the features of the character image.
Figure 7. Zigzag Sequence
Figure 5. Normalization of the character “B”
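The DCT feature extraction described in this subsection can be sketched in plain NumPy by building the cosine basis directly from the α(u), α(v) normalization factors and reading the coefficients out in zigzag order. This is a sketch under assumptions: the quantization step is omitted, the block is assumed square, and the number of retained coefficients `k` is a free parameter, since the paper does not state how its 16 DCT inputs are selected. The names `dct_matrix`, `dct2`, `zigzag_indices` and `dct_features` are illustrative.

```python
import numpy as np


def dct_matrix(size):
    """Rows are the 1-D basis vectors alpha(u) * cos((2m + 1) u pi / (2 size))."""
    u = np.arange(size).reshape(-1, 1)
    m = np.arange(size).reshape(1, -1)
    basis = np.cos((2 * m + 1) * u * np.pi / (2 * size))
    alpha = np.full((size, 1), np.sqrt(2.0 / size))
    alpha[0, 0] = np.sqrt(1.0 / size)
    return alpha * basis


def dct2(block):
    """2-D DCT of an M x N block: transform the rows, then the columns."""
    rows, cols = block.shape
    return dct_matrix(rows) @ block.astype(float) @ dct_matrix(cols).T


def zigzag_indices(n):
    """(row, col) pairs of an n x n array in zigzag order, low frequencies first."""
    cells = [(i, j) for i in range(n) for j in range(n)]
    # Walk the anti-diagonals; the direction alternates with the diagonal parity.
    return sorted(cells, key=lambda p: (p[0] + p[1],
                                        p[0] if (p[0] + p[1]) % 2 else -p[0]))


def dct_features(block, k=16):
    """First k zigzag-ordered DCT coefficients of a square block (k is assumed)."""
    coeffs = dct2(block)
    order = zigzag_indices(block.shape[0])[:k]
    return np.array([coeffs[i, j] for i, j in order])
```

Because the zigzag order starts in the upper left corner, truncating the vector keeps the low-frequency coefficients, where most of the signal energy is concentrated.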
(ii).Discrete Wavelet Transform (DWT)
The Wavelet Transform (WT) is a way to represent a signal in time-frequency form. The DWT provides a more detailed picture of the signal being analyzed. The DWT applies a low-pass filter (LPF) and a high-pass filter (HPF) to decompose the image along its rows and columns. The result of each filter is down-sampled by two. Each of the sub-signals is then again high-pass and low-pass filtered, and the result is again down-sampled by two.
At each decomposition level, the DWT separates an image into one low-frequency sub-band (LL) and three high-frequency sub-bands (LH, HL, HH). LL is the approximation sub-band, LH is the horizontal detail sub-band, HL is the vertical detail sub-band and HH is the diagonal detail sub-band.
The low-frequency coefficients of the LL sub-band are close to the original image and retain most of its information. Therefore, these coefficients are used as the features of the character image.
Figure 8. DWT decomposition at one level
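One decomposition level can be sketched as follows. The paper does not say which wavelet it uses, so the Haar wavelet is assumed here for simplicity, and the image is assumed to have even height and width. `haar_dwt2` is a hypothetical name, and how the 12 DWT input features are derived from the LL sub-band is not specified in the paper, so that step is left out.

```python
import numpy as np


def haar_dwt2(img):
    """One-level 2-D Haar DWT; returns the (LL, LH, HL, HH) sub-bands.

    Assumes even height and width. Each filter output is down-sampled by two.
    """
    x = img.astype(float)
    scale = np.sqrt(2.0)
    # Filter along each row (horizontal direction) and down-sample columns.
    low = (x[:, 0::2] + x[:, 1::2]) / scale
    high = (x[:, 0::2] - x[:, 1::2]) / scale
    # Filter along each column (vertical direction) and down-sample rows.
    ll = (low[0::2, :] + low[1::2, :]) / scale     # approximation
    lh = (low[0::2, :] - low[1::2, :]) / scale     # horizontal detail
    hl = (high[0::2, :] + high[1::2, :]) / scale   # vertical detail
    hh = (high[0::2, :] - high[1::2, :]) / scale   # diagonal detail
    return ll, lh, hl, hh
```

For a 32*32 normalized character, the LL sub-band is a 16*16 approximation; applying the function again to LL gives the next decomposition level.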
(iii).Gradient
The gradient measures the magnitude and direction of the greatest change in intensity in a small neighborhood of each pixel. Gradients are computed by means of the Sobel operator. The Sobel templates are used to compute the horizontal (X) and vertical (Y) components of the gradient. The templates are shown in Figure 6.
Figure 6. Sobel operator templates (horizontal template and vertical template)
The two gradient components at location (i, j) are calculated from these templates.
D. Recognition
Template matching, structural analysis and neural
networks have traditionally been popular classification
methods for character recognition, but neural networks are
increasingly proving to offer better and more reliable
accuracy for handwriting recognition [8].
Architecture: The most popular neural network architecture used in English character recognition has three layers: an input layer, a hidden layer and an output layer. Figure 9 depicts an example of the architecture of a 3-layer neural network. The input layer is fed with the features of the characters, so the number of nodes in this layer depends on the number of input features. The last layer is called the output layer, and the number of its nodes is based on the number of desired outputs. The hidden layer lies between the input and output layers. The training set consists of 208 characters from different writers.
In this system, the number of input features is 16 for DCT, 12 for DWT and 18 for Gradient. In all three cases, the hidden layer has 400 neurons and the output layer has 26 neurons. The number of nodes in the hidden layer governs the variance of the samples that can be accurately and correctly recognized by the network. If the network has trouble learning, neurons can be added to this layer.
Training phase: Neural networks are commonly trained so that a particular input leads to a specific target output. The network is adjusted, based on a comparison of the output and the target, until the network output matches the target. The system applies the feed-forward back-propagation neural network algorithm.
The back-propagation algorithm consists of three stages. The first is the forward stage, which propagates the inputs from the input layer through the hidden layer to the output layer to produce the outputs. The second is the backward stage, which calculates the associated error and propagates it back from the output layer through the hidden layer to the input layer. The third stage is the adjustment of the weights and biases.
The backward stage is similar to the forward stage, except that error values are propagated back through the network to determine how the weights are to be changed during training. During training, each input pattern has an associated target pattern. After training, application of the network involves only the computations of the feed-forward stage.
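The three stages of back propagation can be sketched for a network with one hidden layer. This is a minimal illustration, not the authors' implementation: the sigmoid activation, the mean squared error loss, plain batch gradient descent, the weight initialization and the learning rate are all assumptions, and the class name `MLP` is hypothetical. Only the layer sizes (for example 18 inputs, 400 hidden neurons, 26 outputs) come from the paper.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


class MLP:
    """Input layer -> one hidden layer -> output layer, trained by back propagation."""

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = np.random.default_rng(seed)
        self.w1 = rng.normal(0.0, 0.1, (n_in, n_hidden))
        self.b1 = np.zeros(n_hidden)
        self.w2 = rng.normal(0.0, 0.1, (n_hidden, n_out))
        self.b2 = np.zeros(n_out)

    def forward(self, x):
        # Stage 1: propagate the inputs through the hidden layer to the output.
        self.h = sigmoid(x @ self.w1 + self.b1)
        self.y = sigmoid(self.h @ self.w2 + self.b2)
        return self.y

    def train_step(self, x, target, lr=0.1):
        y = self.forward(x)
        # Stage 2: compute the output error and propagate it back.
        delta_out = (y - target) * y * (1.0 - y)
        delta_hid = (delta_out @ self.w2.T) * self.h * (1.0 - self.h)
        # Stage 3: adjust the weights and biases.
        n = x.shape[0]
        self.w2 -= lr * (self.h.T @ delta_out) / n
        self.b2 -= lr * delta_out.mean(axis=0)
        self.w1 -= lr * (x.T @ delta_hid) / n
        self.b1 -= lr * delta_hid.mean(axis=0)
        return float(np.mean((y - target) ** 2))
```

With one-hot targets for the 26 letters, the recognized character after training is the index of the largest output, e.g. `np.argmax(net.forward(x), axis=1)`, which uses only the forward stage.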
For the gradient feature, the two Sobel components at location (i, j) are:

$$G_x(i, j) = f(i-1, j+1) + 2f(i, j+1) + f(i+1, j+1) - f(i-1, j-1) - 2f(i, j-1) - f(i+1, j-1)$$

$$G_y(i, j) = f(i-1, j-1) + 2f(i-1, j) + f(i-1, j+1) - f(i+1, j-1) - 2f(i+1, j) - f(i+1, j+1)$$

The gradient strength and direction are calculated as:

$$G(i, j) = \sqrt{G_x(i, j)^2 + G_y(i, j)^2}$$

$$\theta(i, j) = \tan^{-1}\left(\frac{G_x(i, j)}{G_y(i, j)}\right)$$

After computing the gradient of each pixel of the character, the gradient directions are mapped onto 18 direction values, with an angle span of 10 degrees between any two adjacent direction values.
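The gradient feature can be sketched as follows, using the Sobel differences for Gx and Gy and 18 direction bins of 10 degrees each. Several details are assumptions rather than statements from the paper: the border is handled by edge replication, the angle is computed with `arctan2` and folded onto [0°, 180°) to avoid division by zero, and the 18 features are taken to be a magnitude-weighted, normalized direction histogram. The function names are illustrative.

```python
import numpy as np


def sobel_gradient(img):
    """Return (Gx, Gy) for every pixel, with edge-replicated borders (assumed)."""
    f = np.pad(img.astype(float), 1, mode="edge")
    # Right column minus left column of each 3x3 neighborhood.
    gx = (f[:-2, 2:] + 2 * f[1:-1, 2:] + f[2:, 2:]) \
        - (f[:-2, :-2] + 2 * f[1:-1, :-2] + f[2:, :-2])
    # Top row minus bottom row of each 3x3 neighborhood.
    gy = (f[:-2, :-2] + 2 * f[:-2, 1:-1] + f[:-2, 2:]) \
        - (f[2:, :-2] + 2 * f[2:, 1:-1] + f[2:, 2:])
    return gx, gy


def direction_histogram(img, bins=18):
    """Magnitude-weighted histogram over `bins` directions spanning 180 degrees."""
    gx, gy = sobel_gradient(img)
    magnitude = np.hypot(gx, gy)
    # arctan2(gx, gy) matches tan^-1(Gx / Gy) and stays defined when Gy is 0.
    theta = np.degrees(np.arctan2(gx, gy)) % 180.0
    index = np.minimum((theta // (180.0 / bins)).astype(int), bins - 1)
    hist = np.bincount(index.ravel(), weights=magnitude.ravel(), minlength=bins)
    total = hist.sum()
    return hist / total if total > 0 else hist
```

Weighting by magnitude means flat regions, where the direction is undefined, contribute nothing to the feature vector.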
Figure 9. Example of the architecture of a 3-layer neural network
V. EXPERIMENTAL RESULTS
Experiments were carried out using 286 isolated English characters from 11 independent writers. These characters were divided into two data sets: 208 characters for training and 78 characters for testing. The comparison of DCT, DWT and Gradient in terms of recognition accuracy is summarized in Table 1.
TABLE 1
RECOGNITION ACCURACY BY USING DIFFERENT FEATURE EXTRACTION TECHNIQUES
Feature method | Train images | Test images | Known images | Unknown images | Accuracy
DCT            | 208          | 78          | 68           | 10             | 87.3%
DWT            | 208          | 78          | 65           | 13             | 83.3%
Gradient       | 208          | 78          | 70           | 8              | 89.7%
On the test data set, DCT achieved 87.3% correct readings (68 of 78 test images recognized, 10 not recognized), DWT achieved 83.3% (65 recognized, 13 not recognized) and Gradient achieved 89.7% (70 recognized, 8 not recognized). The results show that feature extraction based on Gradient yields a higher recognition rate than the other two methods. DCT achieved a slightly higher recognition rate than DWT.
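The accuracy figures follow from the counts in Table 1 as recognized test images divided by total test images. The short sketch below recomputes them; the dictionary layout and the function name `accuracy_percent` are illustrative. Computed this way, DWT and Gradient match the table to one decimal place, while DCT comes out at about 87.2%, marginally below the reported 87.3%.

```python
# (recognized, total) test images per feature method, taken from Table 1.
results = {
    "DCT": (68, 78),
    "DWT": (65, 78),
    "Gradient": (70, 78),
}


def accuracy_percent(recognized, total):
    """Recognition accuracy as a percentage of the test set."""
    return 100.0 * recognized / total


for method, (recognized, total) in results.items():
    print(f"{method}: {accuracy_percent(recognized, total):.1f}%")
```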
VI. CONCLUSION
The handwriting of different persons differs, which makes handwritten characters difficult to recognize. This paper has compared three feature extraction techniques (DCT, DWT and Gradient) for handwritten English characters. All three techniques have been used with an ANN for classification of the characters. The recognition rates of the DCT, DWT and Gradient techniques are 87.3%, 83.3% and 89.7%, respectively. The results demonstrate that feature extraction by Gradient gives the highest recognition rate for handwritten English characters. A possible reason is that the ability of Gradient to compress the data of the image makes it more efficient for pattern recognition applications.
REFERENCES
[1] Lauer, F., Suen, C.Y., and Bloch, G. (2007). A trainable feature extractor for handwritten digit recognition. Pattern Recognition, 40(6): 1816-1824.
[2] Lawgali A., “Handwritten Arabic Character Recognition: Which Feature Extraction Methods?”, School of Computing, Engineering and Information Sciences, Northumbria University, Newcastle upon Tyne, UK.
[3] Olarik Surinta, Lambert Schomaker and Marco Wiering, “Handwritten Character Classification using the Hotspot Feature Extraction Technique”, Department of Artificial Intelligence, University of Groningen, Nijenborgh 9, Groningen, The Netherlands.
[4] Dayashankar Singh, Sanjay Kr. Singh, Dr. (Mrs.) Maitreyee Dutta, “Hand written character recognition using twelve directional feature input and neural network”, International Journal of Computer Applications (0975 – 8887), Volume 1 – No. 3, 2010.
[5] Wunsch and Laine, “Handwritten Script Recognition using DCT and Wavelet Features at Block Level”, G. G. Rajput of Computer Science, Gulbarga University, Gulbarga-585106, Karnataka, India.
[6] Kumar, Singh, “Performance comparison of features on Devanagari hand print dataset”, International Journal Recent Trends, vol. 1, no. 2, pp. 33-37, 2009.
[7] Amir Mowlaei et al. (2002), “Handwritten Arabic Character Recognition: Which Feature Extraction Method?”, A. Lawgali, A. Bouridane, M. Angelova, Z. Ghassemlooy, School of Computing, Engineering and Information Sciences, Northumbria University, Newcastle upon Tyne, UK.
All Rights Reserved © 2014 IJSETR