International Journal of Software Engineering and Its Applications
Vol.8, No.5 (2014), pp.177-188
http://dx.doi.org/10.14257/ijseia.2014.8.5.14
Replace Missing Values with EM algorithm based on GMM and
Naïve Bayesian
Xi-Yu Zhou1 and Joon S. Lim2*
1 I.T. College, Gachon University, Seongnam, South Korea
2 I.T. College, Gachon University, Seongnam, South Korea
1 [email protected], 2 [email protected]
Abstract
In data mining applications, experimental datasets contain various kinds of missing values. Leaving missing values untreated, or treating them inappropriately, is very likely to cause many warnings or errors. Moreover, many classification algorithms are very sensitive to missing values. For these reasons, handling missing values is an important phase in many classification and data mining tasks. This paper introduces the traditional EM algorithm and its disadvantages, and proposes a new method for imputing missing values based on the EM algorithm that uses Naive Bayesian classification to improve accuracy. We conclude by classifying the seeds dataset and the vertebral column dataset and comparing the results to those obtained with two other ways of handling missing values: the traditional EM algorithm and non-substitution. The experimental results show that the proposed algorithm stably improves the classification accuracy of large datasets that contain many missing values.
Keywords: missing values, EM algorithm, GMM, Naive Bayesian
1. Introduction
Missing values arise in many situations in which no value is stored for some variable in an experiment or observation [1]. In real-life data, stored values are frequently missing as a result of unexpected mistakes, most often because they are lost or because they are independent of the recorded conditions [2]. Although missing values are a common occurrence, they can nonetheless have a significant effect on how data are processed and on the results derived from them. First, the data mining program loses a considerable amount of useful information. Second, the system shows more uncertainty in its results, and it becomes difficult to guarantee determinate output [3]. Third, missing values have a high probability of confusing the data mining process, leading to uncertain output. Fourth, missing values frequently degrade operating performance and introduce mistakes into the mining model [4]. In addition, some classification algorithms, such as the backpropagation neural network, the K-nearest neighbor algorithm, and C4.5 decision trees, are very sensitive to missing values. If a dataset contains many missing values and we use one of these algorithms to classify it, we are very likely to obtain low classification accuracy [21]. Accordingly, handling missing values is an important preprocessing step for most data classification and data mining tasks [5]. Inappropriate imputation of missing values can produce serious errors or false results.
* Corresponding Author
Generally, methods for dealing with missing values can be divided into three classes: i) delete the missing values; ii) impute the missing values with estimated values; and iii) ignore the missing values [7]. Among these, deleting missing values is the easiest. However, when the rate of missing values in each attribute is high, this method performs poorly [8]. Ignoring missing values causes similar issues. We therefore prefer methods that impute the missing values.
There are many methods for doing so, such as approximation [6], stochastic regression, and neural network methods. Among these approaches, the EM (expectation-maximization) algorithm can reliably find, through its stable maximization steps, optimal values with which to impute the missing values [9]. However, the EM algorithm converges quite slowly and easily falls into local optima. If we give the EM algorithm fixed initial values, we can increase its convergence speed and stability, and at the same time reduce the deviation caused by marginal values. Together, these give the EM algorithm better performance. The improved EM algorithm is based on Naive Bayesian and is therefore named the NB-EM algorithm; it uses the classification result to substitute for otherwise-random initial values. Below, we describe both the traditional EM algorithm and the NB-EM algorithm.
1.1. Traditional EM Algorithm
The EM algorithm is a popular method of iterative refinement [10]. In each iterative step, it
has an Expectation Step and a Maximization Step [11], where the Expectation Step estimates
the missing values and the Maximization Step updates the model parameters. The basis of the
algorithm is to first estimate the missing value’s initial values and obtain the values of the
model parameters, and then to iteratively repeat the Expectation Step and Maximization Step,
while updating the estimated values, until the function reaches convergence. In more detail:
(1) Randomly choose K samples as the center of each class.
(2) Repeat Expectation Step and Maximization Step to improve the accuracy, until the function
reaches convergence.
a. Expectation Step: use the probability P(X_i \in C_k) to classify each sample into some Class k.
P(X_i \in C_k) = P(C_k \mid X_i) = \frac{P(C_k)\, P(X_i \mid C_k)}{P(X_i)} = \frac{P(C_k)\, P(X_i \mid C_k)}{\sum_{j=1}^{K} P(C_j)\, P(X_i \mid C_j)}    (1)
In this equation, P(X_i \in C_k) is the probability that Sample i belongs to Class k, which is used in classification to keep the classification result accurate to some degree.
b. Maximization Step: use the estimated values obtained in the Expectation Step to re-estimate the model parameters.
m_k = \frac{1}{n} \sum_{i=1}^{n} \frac{X_i\, P(X_i \in C_k)}{\sum_{j=1}^{K} P(X_i \in C_j)}    (2)

In this equation, m_k is the model parameter of Class k.
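To make the iteration concrete, the following is a minimal MATLAB sketch of the traditional EM loop described above. The function and variable names are ours, it assumes one shared variance per class and MATLAB's implicit expansion (R2016b or later), and it is an illustration of the idea rather than the code used in Section 1.3.

% Minimal sketch of the traditional EM loop of Section 1.1 (illustrative only).
function centers = traditional_em(X, K, maxIter)
    n = size(X, 1);
    centers = X(randperm(n, K), :);          % (1) randomly choose K samples as centers
    sigma2  = var(X(:)) * ones(1, K);        % assumed shared variance per class
    prior   = ones(1, K) / K;                % P(C_k)
    for it = 1:maxIter                       % (2) repeat E and M steps
        resp = zeros(n, K);
        for k = 1:K                          % Expectation Step: P(X_i in C_k), eq. (1)
            d = sum((X - centers(k, :)).^2, 2);
            resp(:, k) = prior(k) * exp(-d / (2 * sigma2(k)));
        end
        resp = resp ./ sum(resp, 2);
        for k = 1:K                          % Maximization Step: re-estimate m_k, eq. (2)
            centers(k, :) = sum(resp(:, k) .* X, 1) / sum(resp(:, k));
            prior(k) = mean(resp(:, k));
        end
    end
end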
1.2. EM algorithm with Naive Bayesian
The traditional EM algorithm randomly chooses samples as the center of each class, which easily affects the clustering result. In other words, the disadvantage of the traditional EM algorithm is that it depends too strongly on the selection of the initial center of each class. Further, marginal values have a high probability of affecting the entire algorithm, thereby decreasing the accuracy of the imputed values. Because of these problems, if we can fix the initial center of each class, we decrease the dependence on the initial centers. This paper proposes an improved EM algorithm based on Naive Bayesian, which we call the NB-EM algorithm. In this method, we use Naive Bayesian to classify the dataset, and then use the classification result to substitute for the randomly selected center of each class before repeating the Expectation Step and Maximization Step. As a result, the NB-EM algorithm clusters and reaches convergence more quickly, while also effectively avoiding the influence of marginal values and obtaining more accurate values with which to replace the missing values. The algorithm works as follows:
[Figure 1 flowchart: the dataset is input to the Naive Bayesian classification process; the classification result is output and used to fix the initial values; the Expectation Step and Maximization Step are then repeated until the function converges and the optimal values are output.]
Figure 1. Process of NB-EM Algorithm
(1) Use the Naive Bayesian algorithm to classify the dataset. We can use the Naive Bayesian classifier in Weka, a collection of machine learning tools developed at the University of Waikato, to obtain the classification results [12]. Weka is an open platform for data mining that gathers a large number of machine learning algorithms for data mining tasks, including data preprocessing, classification, regression, clustering, association rules, and interactive visualization. Users who want to add their own data mining algorithms can refer to the Weka interface documentation; integrating a new algorithm, or even a new visualization method, into Weka is not very difficult.
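As an alternative to the Weka classifier, the same initial classification can be produced directly in MATLAB. The following is a minimal sketch assuming the Statistics and Machine Learning Toolbox is available; the file name and variable names are hypothetical.

% Sketch: obtain the initial Naive Bayesian classification in MATLAB
% (alternative to Weka; assumes the Statistics and Machine Learning Toolbox).
data = readmatrix('seeds_mcar.csv');   % hypothetical file with attributes and class labels
X = data(:, 1:end-1);                  % attribute columns
y = data(:, end);                      % class labels
nbModel   = fitcnb(X, y);              % train the Naive Bayesian classifier
initClass = predict(nbModel, X);       % classification result used to fix the initial centers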
(2) The most important phase of the NB-EM algorithm is that we use a fixed center for each class to replace the randomly chosen initial centers. To achieve this, there are four parts, as shown in Figure 2 and sketched below. The first step is to count the number of samples in each class and divide that count by the total number of samples to obtain the density of each class. The second step is to calculate the Gaussian Mixture Model of each class. The third step is to use the density and the Gaussian Mixture Model of each class to calculate the probability of each class. The final step is to use the probability of each class to fix the range, after which we can fix the center of each class.
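A rough MATLAB sketch of the counting and density computation is shown below, with the fixed center approximated by the per-attribute mean of each class (our simplification of the range-fixing step). The variable names are ours; X is assumed to be the attribute matrix and initClass the Naive Bayesian labels obtained in step (1).

% Sketch: fix the per-class density and the initial center of each class.
classes = unique(initClass);
K = numel(classes);
n = size(X, 1);
density = zeros(1, K);
U0 = zeros(K, size(X, 2));                     % fixed initial center of each class
for k = 1:K
    idx = (initClass == classes(k));
    density(k) = sum(idx) / n;                 % number in class k divided by total samples
    U0(k, :) = mean(X(idx, :), 'omitnan');     % per-attribute mean of class k
end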
[Figure 2 flowchart: from the input dataset, count the samples in Class A and Class B; compute the density of each class, D(A) and D(B); build the Gaussian Mixture Model of each class, G(A) and G(B); compute the probability of each class, P(A) and P(B); fix the range of each class; and finally fix the centers of Class A and Class B.]
Figure 2. Process of Fixing Center of Each Class
(3) Use the classification result from (1) in place of the random initial classes, and repeat
Expectation Step and Maximization Step to obtain the optimal values and update the model
parameters. To solve high-dimensional problems, we can combine the GMM (Gaussian
Mixture Model) with the EM algorithm [13] to transform the high-dimensional model into a
low-dimensional model.
a. Expectation Step: We use the average values and deviation to obtain the Gaussian
distribution density function [14], which is used to describe the value distribution.
\phi(y \mid \theta_k) = \frac{1}{\sqrt{2\pi}\,\sigma_k} \exp\left( -\frac{(y - \mu_k)^2}{2\sigma_k^2} \right)    (3)
At the same time, we want to obtain the classification to assist in replacing the missing values. Here, \hat{\gamma}_{jk} denotes the probability that Sample j belongs to Class k.
\hat{\gamma}_{jk} = \frac{\alpha_k\, \phi(y_j \mid \theta_k)}{\sum_{k=1}^{K} \alpha_k\, \phi(y_j \mid \theta_k)}, \quad j = 1, 2, \dots, N;\ k = 1, 2, \dots, K    (4)
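For a single attribute, equations (3) and (4) can be computed compactly. The following is our own vectorized MATLAB sketch (not the loop-based code of Section 1.3), where y is a column of attribute values and alpha, mu, sigma are the current mixture parameters as row vectors; it relies on implicit expansion (R2016b or later).

% Sketch: Gaussian densities (eq. 3) and responsibilities (eq. 4) for one attribute.
N = numel(y);
K = numel(alpha);
phi = zeros(N, K);
for k = 1:K
    phi(:, k) = exp(-(y - mu(k)).^2 ./ (2 * sigma(k)^2)) ./ (sqrt(2*pi) * sigma(k));
end
gammaHat = (phi .* alpha) ./ sum(phi .* alpha, 2);   % each row sums to 1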
b. Maximization Step: The main task in this step is to update the expectation of each attribute (\hat{\mu}_k), which is used to impute the missing values, and the coefficient of the distribution density function (\hat{\alpha}_k), which describes the probability of each category.
\hat{\mu}_k = \frac{\sum_{j=1}^{N} \hat{\gamma}_{jk}\, y_j}{\sum_{j=1}^{N} \hat{\gamma}_{jk}}    (5)

\hat{\alpha}_k = \frac{\sum_{j=1}^{N} \hat{\gamma}_{jk}}{N}    (6)
1.3. Code Implementation
Here, we implement the NB-EM algorithm in MATLAB [15]. We chose MATLAB because most MATLAB functions accept matrices and apply themselves to each value, and both the traditional EM algorithm and the NB-EM algorithm use the covariance matrix in a way that corresponds directly to equations (3)-(6). In addition, MATLAB offers four main benefits: 1) efficient numerical and symbolic computation that frees users from complex mathematical analysis; MATLAB collects a large number of computational algorithms reflecting recent research in scientific and engineering computing, and these functions have been optimized and made fault-tolerant, so for identical computational requirements, programming in MATLAB greatly reduces effort and time; the function sets range from the simplest and most basic operations to complex functions such as matrix eigenvectors and the fast Fourier transform; 2) complete graphics facilities that let programs and computing results be visualized; the development environment allows users to manage multiple files and graphics windows easily, the language supports nested functions and conditional breakpoints, and inputs and outputs can be connected directly to Excel and HDF5; 3) a user-friendly interface and a language close to natural mathematical notation that is easy for scholars to learn and master; the newer MATLAB language resembles the popular C++ language in its grammar but is simpler and better suited to writing mathematical expressions, which also makes it accessible to technical personnel outside computer science, and its strong portability and scalability are an important reason MATLAB is used in many fields of scientific and engineering computing; 4) a rich set of application toolboxes (such as the Signal Processing Toolbox and the Communications Toolbox) that provide many convenient and practical processing tools; with the MATLAB Compiler and the C/C++ math and graphics libraries, a MATLAB program can be converted automatically into standalone C or C++ code that does not depend on MATLAB, which also allows users to write C or C++ programs that interact with MATLAB.
The program is detailed below, beginning with the variable definitions in Table 1.
Table 1. Variable Definition
Variable        Description
GDD(i, j)       Gaussian distribution density value of Sample i under Class j
m(:, :, j)      Attribute covariance matrix of Class j
P(i, j)         Probability that Sample i belongs to Class j
density(j)      Coefficient of the distribution density function of Class j
X(i, j)         Sample i's attribute j
pi              π
temp1, temp2    Intermediate variables
U(j, l)         Expectation of attribute l in Class j
a. Expectation Step:
To obtain the Gaussian distribution density function corresponding to equation (3), we use the covariance matrix in place of the deviation [16], because the covariance matrix describes the relationships between different attributes better than the deviation does [17] and conveniently handles the high-dimensional problem. A Gaussian mixture model quantifies the data with Gaussian probability density functions, decomposing the data into several component models, each formed from a Gaussian probability density function. In the modeling process, we initialize some parameters of the Gaussian mixture model, such as the variances, means, and weights, and then use these parameters to model the data.
for i = 1 to n
do
for j = 1 to k
do
GDD(i, j) = 1 / (sqrt(2 * pi) * m(:, :, j)) * exp(-(X(i, :) - U(j, :)).^2 / (2 * m(:, :, j).^2));
j++;
end
i++;
end
Obtaining the probability that Sample i belongs to Class j corresponds to equation (4) and contains two parts: the first part obtains the density and the Gaussian distribution density values; the second part uses the result of the first part to obtain the probability.
for i=1 to n
do
for j=1 to k
do
temp1 = temp1 + density(j)*GDD(i,j);
j++;
end
p(i) = temp1;
temp1 = 0;
i++;
end
for i = 1 to n
do
for j = 1 to k
do
P(i,j) = (density(j)*GDD(i,j)) / p(i);
j++;
end
i++;
end
b. Maximization Step:
Through updating the expectation and coefficient, obtain a new Gaussian distribution density model
for the next iteration, inputting the updated expectation and coefficient values in the next Expectation
Step. Repeat the Expectation Step and Maximization Step until the whole program achieves
convergence; that is, when the change in the expectation and coefficient values is sufficiently small.
Updating the expectation of each attribute corresponds to equation (5). When the function achieves convergence, the expectation of each attribute is the optimal value for imputing the missing values. Therefore, we must store U(j, l).
for j=1 to k
do
for l=1 to m
do
for i=1 to n
do
temp2 = temp2 + P(i , j)*X(i , l);
temp3 = temp3 + P(i , j);
i++;
end
U(j , l) = temp2/temp3;
temp2 = 0;
temp3 = 0;
l++;
end
j++;
end
Updating the coefficient of the Gaussian mixture model corresponds to equation (6). Because the Expectation Step changes the clustering result and the corresponding GMM at the same time, we need to update the coefficient of the Gaussian Mixture Model before the program repeats.
for j=1 to k
do
for i=1 to n
do
temp4 = temp4 + P(i , j);
i++;
end
density(j) = temp4*(1/n);
temp4 = 0;
j++;
end
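The pseudocode above leaves the stopping test implicit. A minimal sketch of the convergence check described in the text is shown below; the tolerance and the *_prev variable names are ours.

% Sketch: stop when the change in the expectations and coefficients is small enough.
tol = 1e-6;                                      % illustrative tolerance
deltaU       = max(abs(U(:) - U_prev(:)));
deltaDensity = max(abs(density - density_prev));
if deltaU < tol && deltaDensity < tol
    converged = true;                            % keep U(j, l) as the optimal values
else
    U_prev = U;
    density_prev = density;                      % continue with the next iteration
end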
2. Data Implementation and Classification Results
2.1. Data Implementation
In this experiment, we selected two datasets, both of which were downloaded from the UCI
machine learning website. The first dataset describes kernels belonging to two different varieties of wheat, Kama and Rosa, with 70 randomly selected samples of each variety [18]. The second dataset describes vertebral columns divided into two categories: Normal (100 patients) and
Abnormal (210 patients) [19]. Details about these two datasets are shown in Table 2 and Table
3.
Table 2. Seed’s Attribute Information
Attribute   Description
A           Area
P           Perimeter
C           Compactness, C = 4*π*A/P^2
Length      Length of kernel
Width       Width of kernel
a           Asymmetry coefficient
l           Length of kernel groove
Table 3. Column’s Attribute Information
Attribute   Description
PI          Pelvic incidence
PT          Pelvic tilt
LLA         Lumbar lordosis angle
SS          Sacral slope
PR          Pelvic radius
GS          Grade of spondylolisthesis
To make the results more evident, we used the MCAR (missing completely at random) mechanism to remove values until up to 30% of the values were missing, and compared the results of both the traditional EM algorithm and our NB-EM algorithm against the datasets prior to the values being removed.
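A rough MATLAB sketch of this masking step is shown below; the rate and variable names are ours, X denotes the complete data matrix, and this is an illustration rather than the exact procedure used in the experiments.

% Sketch: remove values completely at random (MCAR) at a 30% rate.
rate = 0.30;
mask = rand(size(X)) < rate;      % each cell is removed independently of the data
Xmcar = X;
Xmcar(mask) = NaN;                % NaN marks a missing value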
Applying both the traditional EM algorithm and the NB-EM algorithm to the MCAR Seeds dataset, we obtain the two sets of optimal estimates shown in Table 4.
Table 4. Seeds’ Optimal Estimation with EM and NB-EM
                   EM                             NB-EM
Attribute    Class A        Class B        Class A        Class B
Length       4.10955965     3.8655073      4.071186742    3.894977581
A            10.8269840     11.5229968     10.64784974    11.66546843
Width        2.73642423     2.782240497    3.331424247    2.138882163
P            10.7887897     9.87305443     9.329911901    11.2472513
a            2.548285411    1.93333857     2.74458494     1.689669352
C            0.58293985     0.62747953     0.608751284    0.602245701
l            3.786276617    3.176935235    3.432135692    3.530836093
Applying both the traditional EM algorithm and the NB-EM algorithm to the MCAR Columns dataset, we obtain the two sets of optimal estimates shown in Table 5.
Table 5. Columns’ Optimal Estimation with EM and NB-EM
                   EM                             NB-EM
Attribute    Class A        Class B        Class A        Class B
PI           38.97077617    43.94582274    38.99196927    43.90693893
PT           11.66092049    12.97999265    11.52869673    13.18351406
LLA          33.61154508    46.15201578    33.30749357    46.60853731
SS           26.20333843    34.79580486    26.17940252    34.82255991
PR           87.24249788    82.3805435     87.26586852    82.35016052
GS           0.640833259    42.22666898    0.653738189    42.15643263
We then use these two tables to substitute the missing values in MCAR datasets, obtaining
two pairs of updated datasets. Subsequently, we input the updated datasets into Weka and
classify them.
2.2. Replacing the Missing Values
When the whole function achieves convergence, there are two outputs: the first is the clustering result, and the second is the set of optimal values used to replace the missing values, as shown in Figure 3 below. First, we search the clustering table to determine which class the sample belongs to. Second, we extract the corresponding attribute's optimal value to replace the missing value in this sample. For example, if the second attribute of a sample is missing, we use the second optimal value of its class to replace it. After this phase, we obtain a completed dataset without missing values.
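A rough MATLAB sketch of this replacement step is shown below; it is our own illustration, assuming clusterIdx holds the class index of each sample from the clustering result, Xmcar the dataset with missing values, and U the optimal values produced by NB-EM.

% Sketch: replace each missing entry with the optimal value of the sample's class.
Ximputed = Xmcar;
for i = 1:size(Xmcar, 1)
    for l = 1:size(Xmcar, 2)
        if isnan(Xmcar(i, l))
            Ximputed(i, l) = U(clusterIdx(i), l);   % optimal value of attribute l in that class
        end
    end
end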
[Figure 3 flowchart: a sample with missing values is input; the clustering results are searched to determine whether the sample belongs to Class A or Class B; the optimal values of that class are then used to replace the missing values.]
Figure 3. Process of Replacing Missing Values
2.3. Classification Results
Table 6 and Table 7 show the results of the different methods of handling the missing values, using the Multilayer Perceptron as the classifier in Weka [20]. The accuracy rate indicates which method has the better effect.
Table 6. Classification Results of Seeds
Dataset                       Correctly Classified Instances
Original Dataset              79.2857%
Dataset with EM algorithm     81.4286%
Dataset with NB-EM            88.5714%
Table 7. Classification Results of Column
Dataset                       Correctly Classified Instances
Original Dataset              69.6774%
Dataset with EM algorithm     73.2258%
Dataset with NB-EM            78.0645%
In both tables, the first row is the result of classifying the MCAR dataset without any preprocessing, using only the Multilayer Perceptron classifier (Original Dataset). The second row is the result of using the traditional EM algorithm to substitute the missing values and then classifying with the Multilayer Perceptron (Dataset with EM algorithm). The third row is the result of using the NB-EM algorithm to substitute the missing values and then classifying with the Multilayer Perceptron (Dataset with NB-EM algorithm).
3. Experimental Results
In this paper, we studied a new method, the NB-EM algorithm, for handling missing values when preparing datasets for data discrimination and mining applications. Its performance is compared with the traditional EM method and with the non-substitution approach on datasets containing randomly missing attribute values, so we can easily determine which method is most effective. Compared with the traditional EM algorithm, the NB-EM algorithm has a higher accuracy rate, which suggests that the NB-EM algorithm handles missing values better in practice.
The NB-EM algorithm fixes the initial values to ensure that the whole program avoids local optima and the influence of marginal values, and by repeating the Expectation Step and the Maximization Step it continuously approximates the optimal values. Therefore, the NB-EM algorithm can achieve a better result.
The application of these results to data mining and knowledge discovery could not only help improve the selection of a method for handling missing values during the data preprocessing phases of different data structures, but also produce a more reliable and efficient decision-making process given the uncertainty and incompleteness present in data collections.
Acknowledgment
This research was supported by Basic Science Research Program through the National
Research Foundation of Korea (NRF) funded by the Ministry of Education, Science and
Technology (2012R1A1A2044134).
References
[1] W. Vach, "Missing values: statistical theory and computational practice", Computational Statistics, Edited by P. Dirschedl and R. Ostermann, Heidelberg, (1994), pp. 345-354.
[2] J. W. Grzymala-Busse, "Rough set approach to incomplete data", Lecture Notes in Artificial Intelligence, vol. 3070, (2004), pp. 50-55.
[3] K. Lakshminarayan, S. A. Harp and T. Samad, "Imputation of missing data in industrial databases", Applied Intelligence, vol. 11, (1999), pp. 259-275.
[4] W. Vach, "Missing values: statistical theory and computational practice", Computational Statistics, Edited by P. Dirschedl and R. Ostermann, Heidelberg: Physica-Verlag, (1994), pp. 345-354.
[5] H. Akaike, "A new look at the statistical identification model", IEEE Trans on Automat Control, vol. 19, (1974), pp. 716-723.
[6] B. G. Lindsay, "Mixture Models: Theory, Geometry and Applications", NSF-CBMS Regional Conference Series in Probability and Statistics, Institute of Mathematical Statistics, California, vol. 5, (1995).
[7] X. Huang and Q. Zhu, "A pseudo-nearest-neighbor approach for missing data recovery on Gaussian random data sets", Pattern Recognition Letters, vol. 23, (2002), pp. 1613-1622.
[8] J. W. Grzymala-Busse and M. Hu, "A comparison of several approaches to missing attribute values in data mining", Rough Sets and Current Trends in Computing, Lecture Notes in Computer Science, vol. 2005, (2001), pp. 378-385.
[9] G. J. McLachlan and T. Krishnan, Editors, "The EM Algorithm and Extensions", Wiley-Interscience Publishers, New York, (2007).
[10] J. MacQueen, "Some methods for classification and analysis of multivariate observations", Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, University of California Press, vol. 1, (1967), pp. 281-297.
[11] R. S. Pilla and B. G. Lindsay, "Alternative EM methods for nonparametric finite mixture models", Biometrika, vol. 88, (2001), pp. 535-550.
[12] A. P. Dempster, N. M. Laird and D. B. Rubin, "Maximum likelihood from incomplete data via the EM algorithm", Journal of the Royal Statistical Society, vol. 39, (1977), pp. 1-38.
[13] C. Elkan, "Boosting and Naive Bayesian learning", Technical Report No. CS97-557, (1997) September.
[14] J. A. Bilmes, "A gentle tutorial of the EM algorithm and its application to parameter estimation for Gaussian mixture and hidden Markov models", International Computer Science Institute, vol. 4, (1998), pp. 510-523.
[15] C. Liu and D. X. Sun, "Acceleration of EM algorithm for mixture models using ECME", ASA Proceedings of The Stat. Comp. Session, (1997), pp. 109-114.
[16] C. R. Houck, J. A. Joines and M. G. Kay, "A Genetic Algorithm for Function Optimization: A Matlab Implementation", National Science Foundation grant, North Carolina State University, Report no.: NCSU_IE_TR_95_09, (1995).
[17] G. Celeux, S. Chretien, F. Forbes and A. Mkhadri, "A component-wise EM algorithm for mixtures", Journal of Computational and Graphical Statistics, vol. 10, (2001), pp. 699-712.
[18] D. Böhning, "Computer-Assisted Analysis of Mixtures and Applications: Meta-Analysis, Disease Mapping and Others", Technometrics, vol. 42, (2000), pp. 442-442.
[19] UCI Repository of Machine Learning. http://archive.ics.uci.edu/ml/datasets/seeds.
[20] UCI Repository of Machine Learning. http://archive.ics.uci.edu/ml/datasets/Vertebral+Column.
[21] B. W. Porter, R. Bareiss and R. C. Holte, “Concept learning and heuristic classification in weak-theory
domains”, Artificial Intelligence, vol. 45, (1990), pp. 229-263.
[22] X. Zhou and J. S. Lim, “EM algorithm with GMM and Naive Bayesian to Implement Missing Values”,
Proceedings of April 17th 2014 Jeju Island, Korea, Workshop 2014, Jeju Island, Korea, (2014) April 15-19.
Authors
Joon S. Lim received his B.S. degree in computer science from Inha University, Korea, his M.S. degree from The University of Alabama at Birmingham, and his Ph.D. degree from Louisiana State University, Baton Rouge, Louisiana, in 1986, 1989, and 1994, respectively. He is
currently a professor in the department of computer software at Gachon
University, Korea. His research focuses on neuro-fuzzy systems,
bio-medical prediction systems, and human-centered systems. He has
authored three textbooks on Artificial Intelligence Programming (Green
Press, 2000), Javaquest (Green Press, 2003), and C# Quest (Green Press,
2006).
Xi-Yu Zhou received his B.S. in computer science from Ludong University, China, in 2013. He is currently a master's student in computer science in the department of computer software at Gachon University, Korea. His research focuses on neuro-fuzzy systems, biomedical prediction systems, and signal processing.