Artificial intelligence neural computing and regression methods for
shelf life prediction of kalakand
Sumit Goyal1 and Gyanendra Kumar Goyal2
Dairy Technology Division, National Dairy Research Institute, Karnal-132001 (Haryana), India
Abstract
Kalakand or Qalaqand is a popular Indian sweet made from solidified, sweetened milk
and cottage cheese. It owes its origin to the milk-rich Braj area of western Uttar Pradesh.
It is a very popular sweetmeat in North and East India, including Jharkhand, Orissa and
Bengal, and is reputed for its exquisite taste. The term qand in qalaqand is derived from
the Arabic language and means sweets. Artificial neural networks have been developed
as generalizations of mathematical models of biological nervous systems. Cascade
Computing, Generalized Regression and Multiple Linear Regression artificial
intelligence models were developed for shelf life prediction of Kalakand. The results of
the models were evaluated with three prediction performance measures, viz. Mean
Square Error (MSE), Root Mean Square Error (RMSE) and coefficient of determination
(R2). The best result for the cascade computing model with a single hidden layer was
MSE: 0.000592818; RMSE: 0.024347850; R2: 0.992884381; with a double hidden layer
the best result was MSE: 0.000592818; RMSE: 0.02434785; R2: 0.992884381. The best
result for generalized regression was MSE: 0.001152787; RMSE: 0.033952711; R2:
0.986166561, and for multiple linear regression MSE: 0.000144005; RMSE: 0.0120002;
R2: 0.998271839. From the investigation, it can be concluded that the multiple linear
regression model performed better than the cascade and generalized regression models
in predicting the shelf life of buffalo milk Kalakand stored at 6°C.
Keywords: Artificial intelligence, Cascade, Generalized Regression, Multiple
Linear Regression, Kalakand
Present address: 1Senior Research Fellow (Corresponding author: [email protected]), 2Emeritus
Scientist, 1, 2 National Dairy Research Institute, Karnal-132001 (India).
1. Introduction
Kalakand or Qalaqand is a popular Indian sweet made from solidified, sweetened milk
and cottage cheese. It owes its origin to the milk-rich Braj area of western Uttar Pradesh.
It is a very popular sweetmeat in North and East India, including Jharkhand, Orissa and
Bengal, and is reputed for its exquisite taste. The term qand in qalaqand is derived from
the Arabic language and means sweets [1]. The human brain provides proof of the
existence of massive neural networks that can succeed at those cognitive, perceptual,
and control tasks in which humans are successful. The brain is capable of
computationally demanding perceptual acts (e.g. recognition of faces, speech) and
control activities (e.g. body movements and body functions). The advantage of the brain
is its effective use of massive parallelism, the highly parallel computing structure, and
the imprecise information-processing capability. The human brain is a collection of more
than 10 billion interconnected neurons. Treelike networks of nerve fibers called
dendrites are connected to the cell body or soma, where the cell nucleus is located.
Extending from the cell body is a single long fiber called the axon, which eventually
branches into strands and substrands that are connected to other neurons through
synaptic terminals or synapses. The transmission of signals from one neuron to another
at synapses is a complex chemical process in which specific transmitter substances are
released from the sending end of the junction. The effect is to raise or lower the
electrical potential inside the body of the receiving cell. If the potential reaches a
threshold, a pulse is sent down the axon and the cell is ‘fired’. Artificial neural networks
(ANN) have been developed as generalizations of mathematical models of biological
nervous systems. A first wave of interest in neural networks (also known as
connectionist models or parallel distributed processing) emerged after the introduction
of simplified neurons by McCulloch and Pitts (1943). The basic processing elements of
neural networks are called artificial neurons, or simply neurons or nodes. In a simplified
mathematical model of the neuron, the effects of the synapses are represented by
connection weights that modulate the effect of the associated input signals, and the
nonlinear characteristic exhibited by neurons is represented by a transfer function. The
neuron impulse is then computed as the weighted sum of the input signals, transformed
by the transfer function. The learning capability of an artificial neuron is achieved by
adjusting the weights in accordance with the chosen learning algorithm. A neural network
has to be configured such that the application of a set of inputs produces the desired set
of outputs. Various methods to set the strengths of the connections exist. One way is to
set the weights explicitly, using a priori knowledge. Another way is to train the neural
network by feeding it teaching patterns and letting it change its weights according to
some learning rule. The learning situations in neural networks may be classified into
three distinct sorts. These are supervised learning, unsupervised learning, and
reinforcement learning. In supervised learning, an input vector is presented at the inputs
together with a set of desired responses, one for each node, at the output layer. A
forward pass is done, and the errors or discrepancies between the desired and actual
response for each node in the output layer are found. These are then used to determine
weight changes in the net according to the prevailing learning rule. The term supervised
originates from the fact that the desired signals on individual output nodes are provided
by an external teacher [5].
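The weighted-sum-plus-transfer-function neuron described above can be sketched in a few lines. This is an illustrative Python re-implementation (the paper itself used MATLAB), with arbitrary example weights:

```python
import math

def neuron(inputs, weights, bias):
    """One artificial neuron: weighted sum of inputs plus bias,
    passed through a sigmoid transfer function."""
    s = bias + sum(w * x for w, x in zip(weights, inputs))
    return 1.0 / (1.0 + math.exp(-s))

# Arbitrary example: two inputs, two connection weights, one bias
y = neuron([0.5, -1.0], [0.8, 0.3], 0.1)
print(round(y, 4))  # → 0.5498
```

Training amounts to adjusting the `weights` and `bias` values so that the neuron's outputs move toward the desired responses.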
1.1 Cascade Computing (CC) Model
The ideas behind the cascade-correlation architecture are as follows. The first is to
build up the cascade architecture by adding new neurons together with their
connections to all the inputs as well as to the previous hidden neurons. This
configuration is not changed at the following layers. The second idea is to train only
the newly created neuron by fitting its weights so as to minimize the residual error of
the network. New neurons are added to the network as long as its performance
improves. The common cascade-correlation technique assumes that all m
variables x1, …, xm characterizing the training data are relevant to the classification
problem. At the beginning, a cascade network with m inputs and one output neuron
starts to learn without hidden neurons. The output neuron is connected to every input
by weights w1, …, wm adjustable during learning. The output y of the neurons in the
network is given by the standard sigmoid function f. Then new neurons are added
to the network one by one. Each new neuron is connected to all m inputs as well as to
all the previous hidden neurons. Each time only the output neuron is trained. For
training, any algorithm suitable for training a single neuron can be used. When training
a new neuron, the algorithm adjusts its weights so as to reduce the residual error of the
network. The algorithm adds and then trains new neurons while the residual error
decreases. The advantages of cascade neural networks are well known. First, no
structure of the network is predefined; that is, the network is automatically built up
from the training data. Second, the cascade network learns fast because each of its
neurons is trained independently of the others. However, a disadvantage is that
cascade networks can be over-fitted in the presence of noisy features [2].
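As a rough illustration of the cascade topology only (not the paper's MATLAB implementation, and with arbitrary weights), a forward pass in which each hidden neuron sees the original inputs plus the outputs of all previously added hidden neurons might look like this:

```python
import math

def sigmoid(s):
    return 1.0 / (1.0 + math.exp(-s))

def cascade_forward(x, hidden_weights, output_weights):
    """Forward pass through a cascade network: each hidden neuron receives
    the m inputs plus the outputs of all previously added hidden neurons,
    and the output neuron receives everything."""
    activations = list(x)                    # start with the m inputs
    for w in hidden_weights:                 # neurons in the order they were added
        s = sum(wi * ai for wi, ai in zip(w, activations))
        activations.append(sigmoid(s))
    return sigmoid(sum(wi * ai for wi, ai in zip(output_weights, activations)))

# Two inputs, two hidden neurons added one by one: the first hidden neuron
# has 2 weights, the second has 3 (2 inputs + 1 earlier hidden neuron).
y = cascade_forward([1.0, 0.5],
                    hidden_weights=[[0.4, -0.2], [0.1, 0.3, 0.5]],
                    output_weights=[0.2, -0.1, 0.6, 0.7])
print(round(y, 3))
```

In the real algorithm only the most recently added neuron's weights (and the output weights) are trained at each step; the earlier hidden weights stay frozen.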
1.2 Generalized Regression (GR) Model
Generalized regression models are a kind of radial basis network used for function
approximation. Syntax: net = newgrnn(P, T, spread)
newgrnn(P, T, spread) takes three inputs:
P: R-by-Q matrix of Q input vectors
T: S-by-Q matrix of Q target class vectors
spread: spread of the radial basis functions (default = 1.0)
and returns a new generalized regression model. To fit data very closely, use a spread
smaller than the typical distance between input vectors. To fit the data more smoothly,
use a larger spread; the larger the spread, the smoother the function approximation.
newgrnn creates a two-layer neural network. The first layer has radbas neurons and
calculates weighted inputs with dist and net input with netprod. The second layer
has purelin neurons, calculates weighted input with normprod, and net inputs with
netsum. Only the first layer has biases. newgrnn sets the first-layer weights to P',
and the first-layer biases are all set to 0.8326/spread, resulting in radial basis
functions that cross 0.5 at weighted inputs of +/− spread. The second-layer weights
W2 are set to T [3].
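The behaviour newgrnn describes can be sketched outside MATLAB as a normalized, Gaussian-weighted average of the targets. This Python sketch mimics the 0.8326/spread bias rule and is illustrative only, not the toolbox code:

```python
import math

def grnn_predict(x, P, T, spread=1.0):
    """Normalized Gaussian-weighted average of the targets T, mimicking
    newgrnn's bias rule b = 0.8326/spread, so that radbas(b*dist)
    crosses 0.5 when dist equals the spread."""
    b = 0.8326 / spread
    weights = []
    for p in P:
        dist = math.sqrt(sum((xi - pi) ** 2 for xi, pi in zip(x, p)))
        weights.append(math.exp(-(b * dist) ** 2))  # radbas(b * dist)
    return sum(w * t for w, t in zip(weights, T)) / sum(weights)

# With a small spread, the prediction at a stored input vector is
# dominated by that vector's own target.
P = [[0.0], [1.0], [2.0]]
T = [0.0, 1.0, 4.0]
print(round(grnn_predict([1.0], P, T, 0.2), 3))  # → 1.0
```

Increasing `spread` makes neighbouring targets contribute more, which is the smoothing effect described above.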
1.3 Multiple Linear Regression (MLR)
Regression reveals the average relationship between two variables and makes it
possible to predict the yield. In mathematics Y is called a function of X, but in statistics
this is termed regression, which describes the relationship. Hence, regression is the
study of the functional relationship between two variables, of which one is dependent (Y)
and the other independent (X). Regression analysis provides an estimate of the values
of the dependent variable from the values of the independent variable. This estimation
procedure is called the regression line. Regression analysis also gives a measure of the
error. With the help of the regression coefficients we can find the value of the correlation
coefficient. Multiple regression analysis gives the best linear prediction equation
involving several independent variables. It also helps in finding the subset that gives the
best predicted values of Y. The multiple regression equation describes the average
relationship between the dependent and independent variables, which is used to predict
the dependent variable. If Y depends partly on X1 and partly on X2, then the population
regression equation is written as
YR = α + β1X1 + β2X2,    Eq. (1)
β1 measures the average change in Y when X1 increases by 1 unit, X2 remaining
unchanged; it is called the partial regression coefficient of Y on X1. Likewise, β2 is the
partial regression coefficient of Y on X2, which measures the average change in Y when
X2 increases by 1 unit, X1 remaining unchanged. Thus the regression model is
Y = α + β1X1 + β2X2 + ε,    Eq. (2)
where ε ~ N(0, σ²) [4].
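A regression of this form can be fitted by least squares. The following NumPy sketch uses made-up data (not the Kalakand dataset) constructed from known coefficients, and recovers them:

```python
import numpy as np

# Made-up data: Y depends linearly on two regressors X1 and X2.
X1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
X2 = np.array([2.0, 1.0, 4.0, 3.0, 5.0])
Y = 1.5 + 2.0 * X1 - 0.5 * X2  # alpha = 1.5, beta1 = 2.0, beta2 = -0.5

# Design matrix with an intercept column; least squares recovers the
# intercept and the two partial regression coefficients of Eq. (2).
A = np.column_stack([np.ones_like(X1), X1, X2])
alpha, beta1, beta2 = np.linalg.lstsq(A, Y, rcond=None)[0]
print(round(alpha, 3), round(beta1, 3), round(beta2, 3))  # → 1.5 2.0 -0.5
```

Because the synthetic data contain no noise term, the fitted coefficients match the generating ones exactly (up to floating-point precision).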
ANN has been successfully applied for predicting food quality [6], odor change of
muscadine grape [7] and fruit ripening [8]. There has been no research to date on
predicting the shelf life of Kalakand.
2. Materials and Methods
Fig.1 Design of neural network
The input parameters for the models were tyrosine, moisture, free fatty acids,
titratable acidity and peroxide value, and sensory score was the output parameter, as
displayed in Fig.1. Kalakand datasets were developed at National Dairy Research
Institute, Karnal (India). The dataset consisted of 60 live observations. The dataset
was divided into two subsets: 48 observations (80% of the data) were used for training
the network and 12 observations (20% of the data) for testing it. CC, GR and MLR
models were developed and compared with each other for shelf life prediction of
Kalakand. The network was trained with up to 500 epochs, and the number of neurons
in the single and double hidden layers was varied from 1 to 30. Different combinations
were tried and tested, as there is no predefined rule for achieving good results other
than trial and error. As the number of neurons increased, so did the training time. Two
problems were kept in mind while training the network: overfitting and underfitting.
Overfitting means the number of neurons should not be too large, as the network then
fits the training data too closely and generalizes poorly; underfitting means the number
of neurons should not be too small, as the network then cannot be properly trained.
Hence, a balance must be maintained while training the neural network. The Neural
Network Toolbox under MATLAB 7.0 software was used for development of the artificial
intelligence computing models. Different algorithms were tried, i.e., Fletcher-Powell
Conjugate Gradient (traincgf), Levenberg-Marquardt (trainlm), BFGS Quasi-Newton
(trainbfg), Bayesian Regularization (trainbr), Resilient Backpropagation (trainrp),
Scaled Conjugate Gradient (trainscg), Conjugate Gradient with Powell/Beale Restarts
(traincgb), Polak-Ribiére Conjugate Gradient (traincgp), One Step Secant (trainoss)
and Variable Learning Rate Backpropagation (traingdx). Bayesian regularization gave
good results, therefore it was selected as the training algorithm for the neural network.
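The 80/20 partition described above can be sketched as follows; the data here are synthetic placeholders, not the actual Kalakand observations:

```python
import random

# Synthetic stand-ins for the 60 observations: five input values
# (tyrosine, moisture, free fatty acids, titratable acidity, peroxide
# value) and one sensory score each.
random.seed(42)
data = [([random.random() for _ in range(5)], random.random()) for _ in range(60)]

random.shuffle(data)
split = int(0.8 * len(data))  # 80% of 60 observations = 48
train, test = data[:split], data[split:]
print(len(train), len(test))  # → 48 12
```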
Performance measures for prediction

MSE = Σ(i=1 to N) [(Qexp − Qcal) / n]²    Eq. (3)

RMSE = √[ (1/n) Σ(i=1 to N) ((Qexp − Qcal) / Qexp)² ]    Eq. (4)

R² = 1 − Σ(i=1 to N) [(Qexp − Qcal)² / Qexp²]    Eq. (5)

where Qexp = observed value; Qcal = predicted value; n = number of observations in the dataset.
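The three measures, implemented as printed in Eqs. (3)-(5) and applied to illustrative values (not the study's observations), can be sketched as:

```python
import math

def mse(q_exp, q_cal):
    """Eq. (3): sum over observations of ((Qexp - Qcal) / n) squared."""
    n = len(q_exp)
    return sum(((e - c) / n) ** 2 for e, c in zip(q_exp, q_cal))

def rmse(q_exp, q_cal):
    """Eq. (4): square root of the mean squared relative error."""
    n = len(q_exp)
    return math.sqrt(sum(((e - c) / e) ** 2 for e, c in zip(q_exp, q_cal)) / n)

def r_squared(q_exp, q_cal):
    """Eq. (5): one minus the sum of squared relative errors."""
    return 1 - sum(((e - c) ** 2) / e ** 2 for e, c in zip(q_exp, q_cal))

observed = [7.9, 7.5, 7.1, 6.8]   # illustrative sensory scores
predicted = [7.8, 7.6, 7.0, 6.9]
print(round(rmse(observed, predicted), 4))  # → 0.0137
```

Note that Eqs. (4) and (5) use relative (observation-normalized) errors, so R² approaches 1 as the predicted values approach the observed ones.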
3. Results and Discussion
Table 1. Results of experiments for CC model with single hidden layer

Neurons in Hidden Layer   MSE           RMSE          R2
3                         0.009680652   0.098390304   0.883087967
4                         0.000592818   0.024347850   0.992884381
5                         0.009290401   0.096386725   0.887656189
7                         0.001279354   0.03576806    0.984634092
10                        0.001661461   0.040761019   0.980062472
12                        0.003712751   0.060932351   0.955446983
15                        0.001024186   0.032002906   0.987699195
17                        0.001324841   0.036398359   0.984101914
20                        0.002385685   0.048843471   0.971371784
25                        0.00284096    0.053300653   0.965908485
30                        0.003515725   0.059293547   0.940706453
Table 2. Results of experiments for CC model with two hidden layers

Neurons in Hidden Layers   MSE           RMSE          R2
2:2                        0.000592818   0.02434785    0.992884381
3:3                        0.001401085   0.03743107    0.983171243
5:5                        0.012070402   0.10986538    0.854086061
6:6                        0.008092429   0.089957932   0.902073957
7:7                        0.008489083   0.092136219   0.897210554
8:8                        0.004769402   0.069060856   0.942595717
10:10                      0.003249304   0.05700267    0.960897045
12:12                      0.001467108   0.038302848   0.982368529
15:15                      0.001705251   0.041294684   0.979507173
18:18                      0.000994657   0.031538184   0.988058022
20:20                      0.00098877    0.031444715   0.988125331
Table 3. Results of experiments for GR model

Spread Constant   MSE           RMSE          R2
2                 0.001152787   0.033952711   0.986166561
3                 0.002836909   0.053262638   0.965957096
4                 0.005046995   0.071042204   0.939436063
5                 0.006850000   0.082764727   0.917799999
6                 0.008135249   0.090195614   0.902377015
7                 0.009032628   0.095040137   0.891608467
8                 0.009667907   0.098325516   0.883985114
9                 0.010129284   0.100644342   0.878448597
10                0.010471581   0.102330745   0.874341023
25                0.011789805   0.108580870   0.858522335
40                0.011952980   0.109329685   0.856564239
Table 4. Result of regression model

Regression Model   MSE           RMSE        R2
MLR                0.000144005   0.0120002   0.998271839
Fig.2 Graphical representation of actual and predicted sensory score for CC model with one hidden layer
Fig.3 Graphical representation of actual and predicted sensory score for CC model with two hidden layers
Fig.4 Graphical representation of actual and predicted sensory score for GR model
Fig.5 Graphical representation of actual and predicted sensory score for MLR model
Numerous experiments were carried out with the CC, GR and MLR models. Different
combinations were tried, tested and compared with each other, as represented in
Tables 1, 2, 3 and 4, respectively. The CC model with a single hidden layer having four
neurons gave the best outcome (MSE: 0.000592818; RMSE: 0.024347850; R2:
0.992884381). The CC model with two hidden layers having two neurons in the first
layer and two neurons in the second layer gave its best result as MSE: 0.000592818;
RMSE: 0.02434785; R2: 0.992884381. A GR model was also developed; the best
results given by this model, with a spread constant of 2, are MSE: 0.001152787; RMSE:
0.033952711; R2: 0.986166561. A statistical MLR model was developed to compare the
performance of the artificial intelligence neural computing models; it displayed the finest
results (MSE: 0.000144005; RMSE: 0.0120002; R2: 0.998271839), as represented in
Table 5.
Table 5. Best results of the different models

Model                          Best Results
CC single hidden layer model   MSE: 0.000592818; RMSE: 0.024347850; R2: 0.992884381
CC two hidden layer model      MSE: 0.000592818; RMSE: 0.02434785; R2: 0.992884381
GR model                       MSE: 0.001152787; RMSE: 0.033952711; R2: 0.986166561
MLR model                      MSE: 0.000144005; RMSE: 0.0120002; R2: 0.998271839
Fig.6 Displaying regression equations
Hence, the MLR model was selected for predicting the shelf life of Kalakand by building
regression equations based on sensory scores. The constant came out as 8.516, the
regression coefficient as -0.041, and R2 was found to be 99 percent, as represented in
Fig.6. Solving the equation gave 8.25 as the output, which was subtracted from the
actual shelf life of the product, i.e., 40 days. Hence the predicted shelf life was found to
be 31.75 days.
4. Conclusion
The possibility of an artificial intelligence neural network and statistical computing
approach was investigated to predict the shelf life of Kalakand. Cascade neural
networks with single as well as double hidden layers were constructed along with
generalized regression models, and a statistical model of multiple linear regression was
also developed. From the results it can be concluded that the statistical computing
model of multiple linear regression is superior to the cascade and generalized
regression models in predicting the shelf life of buffalo milk Kalakand stored at 6°C.
5. References
[1] En.wikipedia.org website, accessed on 23.5.2011:
"http://en.wikipedia.org/wiki/Kalakand"
[2] Arxiv website, accessed on 24.5.2011:
"http://arxiv.org/ftp/cs/papers/0504/0504067.pdf"
[3] Mathworks website, accessed on 25.5.2011:
"http://www.mathworks.com/help/toolbox/nnet/ref/newgrnn.html"
[4] S.B. Agarwal, Manual on Statistical Methods for Agriculture and Animal Sciences.
National Dairy Research Institute (Deemed University), Karnal 132 001 (Haryana),
India, 2010.
[5] Softcomputing website, accessed on 25.5.2011:
"http://www.softcomputing.net/ann_chapter.pdf"
[6] G. Xie, R. Xiong, Use of hyperbolic and neural network models in modeling quality
changes of dry peas in long time cooking. Journal of Food Engineering, 41: 151-162,
1999.
[7] O. Tokusolu, M.O. Balaban, Evaluation of odor and color changes of muscadine
grape stored at different temperatures by electronic nose and computer vision.
Abstract 39B-18 presented at the IFT 2000 annual meeting, Dallas, USA, 2000.
[8] T. Morimoto, W. Purwanto, J. Suzuki, T. Aono, Y. Hashimoto, Identification of heat
treatment effect on fruit ripening and its optimization. In: Mathematical and Control
Applications in Agriculture and Horticulture, Munack, A. & Tantau, H.J. (Eds.),
Oxford: Pergamon Press, 267-272, 1997.