International Journal of Software Engineering and Its Applications Vol.8, No.5 (2014), pp. 177-188
http://dx.doi.org/10.14257/ijseia.2014.8.5.14

Replacing Missing Values with an EM Algorithm Based on GMM and Naive Bayesian

Xi-Yu Zhou1 and Joon S. Lim2*
1, 2 I.T. College, Gachon University, Seongnam, South Korea
1 [email protected], 2 [email protected]

Abstract

In data mining applications, experimental datasets contain various kinds of missing values. Leaving missing values untreated, or treating them inappropriately, is likely to cause many warnings or errors, and many classification algorithms are very sensitive to missing values. Handling missing values is therefore an important phase in many classification and data mining tasks. This paper reviews the traditional EM algorithm and its disadvantages, and proposes a new method for imputing missing values based on the EM algorithm, which uses Naive Bayesian classification to improve accuracy. We conclude by classifying the seeds dataset and the vertebral column dataset and comparing the results with those obtained by two other missing-value handling methods: the traditional EM algorithm and non-substitution. The experimental results show a stable algorithm that improves classification accuracy on large datasets containing many missing values.

Keywords: missing values, EM algorithm, GMM, Naive Bayesian

1. Introduction

Missing values arise in many situations in which no value is recorded for some variable in an experiment or observation [1]. In real-life data, stored values are frequently missing as a result of unexpected mistakes, most often because they are lost or were never recorded [2]. Although missing values are a common occurrence, they can nonetheless have a significant effect on the processing of data and the results derived from it.
First, the data mining program loses a considerable amount of useful information. Second, the system shows greater uncertainty in its results, and determinate conclusions are difficult to ensure [3]. Third, missing values are likely to confuse the data mining process, leading to uncertain output. Fourth, missing values frequently degrade operating performance and produce mistakes in the mining model [4]. In addition, some classification algorithms, such as backpropagation neural networks, the K-nearest-neighbor algorithm, and C4.5 decision trees, are very sensitive to missing values; if a dataset contains many missing values, classifying it with one of these algorithms is likely to yield low accuracy [21]. Accordingly, handling missing values is an important step in the preprocessing phase of most data classification and data mining tasks [5]. Inappropriate treatment of missing values can produce serious errors or false results.

* Corresponding Author

ISSN: 1738-9984 IJSEIA
Copyright ⓒ 2014 SERSC

Generally, methods for dealing with missing values can be divided into three classes: i) delete the missing values; ii) replace the missing values with estimated values; and iii) ignore the missing values [7]. Among these, deleting missing values is the easiest; however, when the rate of missing values in each attribute is high, this method performs poorly [8]. Ignoring missing values causes similar issues. We therefore prefer methods that impute the missing values, of which there are many, such as approximation [6], stochastic regression, and neural network methods.
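To make the three classes of methods concrete, here is a minimal Python sketch of classes (i) and (ii), with mean imputation standing in as the simplest possible estimator; this is illustrative only, and is not the EM-based method this paper develops:

```python
def delete_incomplete(rows):
    """Class (i): listwise deletion drops any sample with a missing value (None)."""
    return [r for r in rows if None not in r]

def mean_impute(rows):
    """Class (ii), simplest form: replace each missing value with its
    attribute's mean over the observed values. Illustrative only; the
    paper replaces this estimator with an EM-derived optimal value."""
    cols = list(zip(*rows))
    means = [sum(v for v in c if v is not None) / sum(v is not None for v in c)
             for c in cols]
    return [[means[j] if v is None else v for j, v in enumerate(r)] for r in rows]
```

Deletion shrinks the dataset (unacceptable at high missing rates, as noted above), while imputation preserves every sample at the cost of estimation error.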
Among these approaches, the EM (expectation-maximization) algorithm can reliably use stable expectation and maximization steps to find optimal values with which to impute the missing values [9]. However, the EM algorithm converges quite slowly and easily falls into local optima. If we give the EM algorithm fixed initial values, we can increase its speed of convergence and its stability, and at the same time overcome the deviation caused by marginal values; together, these give the EM algorithm better performance. Because the improved EM algorithm is based on Naive Bayesian classification, we name it the NB-EM algorithm: it uses the result of a classification to substitute for the otherwise-random initial values. Below, we describe both the traditional EM algorithm and the NB-EM algorithm.

1.1. Traditional EM Algorithm

The EM algorithm is a popular method of iterative refinement [10]. Each iteration has an Expectation Step and a Maximization Step [11]: the Expectation Step estimates the missing values and the Maximization Step updates the model parameters. The algorithm first estimates initial values for the missing values and obtains the model parameters, then iteratively repeats the Expectation Step and Maximization Step, updating the estimates, until the function converges. In more detail:

(1) Randomly choose K samples as the center of each class.

(2) Repeat the Expectation Step and Maximization Step to improve the accuracy, until the function converges.

a. Expectation Step: use the probability P(X_i ∈ C_k) to assign each sample to some class k:

    P(X_i ∈ C_k) = P(C_k | X_i) = P(C_k) P(X_i | C_k) / P(X_i)
                 = P(C_k) P(X_i | C_k) / Σ_{k=1}^{K} P(C_k) P(X_i | C_k)    (1)

In this equation, P(X_i ∈ C_k) is the probability that sample i belongs to class k, which can be used in classification to ensure that, to some degree, the classification result is accurate.

b.
Maximization Step: use the estimates obtained above to re-estimate the model parameter m_k:

    m_k = (1/n) Σ_{i=1}^{n} [ X_i · P(X_i ∈ C_k) / Σ_j P(X_i ∈ C_j) ]    (2)

In this equation, m_k is the model parameter.

1.2. EM Algorithm with Naive Bayesian

The traditional EM algorithm randomly chooses samples as the center of each class, which easily affects the clustering result; in other words, its disadvantage is that it depends too heavily on the selection of the initial center of each class. Further, marginal values are likely to affect the entire algorithm, thereby decreasing the accuracy of the imputed values. Because of these problems, if we can fix the initial center of each class, we decrease the dependence on the initial centers. This paper proposes an improved EM algorithm based on Naive Bayesian, which we call the NB-EM algorithm. In this method, we use Naive Bayesian classification on the dataset to obtain a result, and then use the classification result to substitute for the randomly selected center of each class before repeating the Expectation Step and Maximization Step. As a result, the NB-EM algorithm can cluster and converge more quickly, while also effectively avoiding the influence of marginal values and obtaining more accurate values to substitute for the missing values. The algorithm works as follows:

[Figure 1. Process of the NB-EM Algorithm: input the dataset into the Naive Bayesian classifier; get the classification result; fix the initial values; repeat the Expectation Step and Maximization Step until the function converges; output the optimal values.]

(1) Use the Naive Bayesian algorithm to classify the dataset.
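The Expectation Step's soft assignment in equation (1) is Bayes' rule followed by normalization over the classes. A minimal Python sketch, illustrative only, with `likelihood` standing in for whatever per-class model supplies P(X_i | C_k):

```python
import math

def e_step(X, priors, likelihood):
    """Soft-assign each sample to a class via Bayes' rule, as in equation (1).

    X          -- list of samples
    priors     -- list of P(C_k), one entry per class
    likelihood -- function (x, k) -> P(x | C_k), any per-class density model
    Returns a list of rows; row i holds P(X_i in C_k) for each class k.
    """
    resp = []
    for x in X:
        joint = [priors[k] * likelihood(x, k) for k in range(len(priors))]
        evidence = sum(joint)                       # P(X_i), the denominator
        resp.append([j / evidence for j in joint])  # rows sum to 1
    return resp
```

Because each row is normalized by the evidence term, the probabilities over all K classes always sum to one, which is what makes the later weighted parameter updates well defined.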
We can use the Naive Bayesian classifier in Weka, a collection of machine learning tools developed at the University of Waikato, to obtain the classification results [12]. Weka is an open platform for data mining that collects a large number of machine learning algorithms for data mining tasks, including data preprocessing, classification, regression, clustering, association rules, and visualization in an interactive interface. Users who want to run their own data mining algorithms can consult the Weka interface documentation; integrating one's own algorithm into Weka, or even adding one's own visualization method, is not very difficult.

(2) The most important phase of the NB-EM algorithm is using the fixed center of each class to replace the random initial centers. This is achieved in four parts, as shown in Figure 2. The first step is to count the number of samples in each class, then divide that count by the total number of samples to obtain the density of each class. The second step is to calculate the Gaussian Mixture Model of each class. The third step is to use the density and Gaussian Mixture Model of each class to calculate the probability of each class. The final step is to use the probability of each class to fix its range, from which we can fix the center of each class.

[Figure 2 flowchart: input the dataset; count the samples of Class A and Class B; compute the densities D(A) and D(B); compute the Gaussian Mixture Models G(A) and G(B); compute the probabilities P(A) and P(B); fix the range of each class; fix the centers of Class A and Class B.]

Figure 2.
Process of Fixing the Center of Each Class

(3) Use the classification result from (1) in place of the random initial classes, and repeat the Expectation Step and Maximization Step to obtain the optimal values and update the model parameters. To handle high-dimensional problems, we can combine the GMM (Gaussian Mixture Model) with the EM algorithm [13] to transform the high-dimensional model into a low-dimensional model.

a. Expectation Step: We use the mean and the deviation to obtain the Gaussian distribution density function [14], which describes the distribution of the values:

    φ(y | θ_k) = (1 / (√(2π) σ_k)) exp( −(y − µ_k)² / (2σ_k²) )    (3)

At the same time, we want to obtain the classification to assist in replacing the missing values. Here, γ̂_jk describes the probability that sample j belongs to class k:

    γ̂_jk = α_k φ(y_j | θ_k) / Σ_{k=1}^{K} α_k φ(y_j | θ_k),   j = 1, 2, ..., N; k = 1, 2, ..., K    (4)

b. Maximization Step: The main task in this step is to update the expectation of each attribute (µ̂_k), which is used to impute the missing values, and the coefficient of the distribution density function (α̂_k), which describes the probability of each category:

    µ̂_k = Σ_{j=1}^{N} γ̂_jk y_j / Σ_{j=1}^{N} γ̂_jk    (5)

    α̂_k = ( Σ_{j=1}^{N} γ̂_jk ) / N    (6)

1.3. Code Implementation

Here, we implement the NB-EM algorithm in MATLAB [15]. We chose MATLAB because most MATLAB functions accept matrices and apply themselves to each value, and both the traditional EM algorithm and the NB-EM algorithm use the covariance matrix in a way corresponding to equations (3)-(6). Besides this, MATLAB offers four main benefits: 1) efficient numerical and symbolic computation capabilities that free users from complex mathematical analysis; MATLAB contains a large collection of calculation algorithms.
Its arithmetic functions reflect recent research results in scientific and engineering computing, and the functions have been through extensive optimization and fault-tolerance work. For identical computational requirements, programming in MATLAB can greatly reduce effort and time. The MATLAB function sets range from the simplest, most basic functions to complex ones such as matrix eigenvectors and the fast Fourier transform; 2) complete graphics facilities that make programs and computational results visual. The development environment lets users control multiple files and graphics windows easily; on the programming side, MATLAB supports nested functions and conditional breakpoints; for input and output, MATLAB can connect directly to Excel and HDF5; 3) a user-friendly interface and a language close to natural mathematical expression that is easy to learn and master. The newer MATLAB language is modeled on the popular C++ language and its grammar is very similar, but it is simpler and better suited to scientific and technical personnel writing mathematical expressions, which also makes it accessible to non-computer professionals. The portability and scalability of the language are very strong, which is an important reason MATLAB is used in many fields of scientific and engineering computing; 4) function-rich application toolboxes (such as the Signal Processing Toolbox and Communications Toolbox) that provide users with a large number of convenient and practical processing tools.
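Before turning to the MATLAB listing, the initialization that distinguishes NB-EM, fixing each class's initial center and density from a prior Naive Bayesian classification as in Figure 2, can be sketched language-neutrally. This Python sketch assumes a hypothetical `nb_labels`, one predicted class index per sample, such as Weka's NaiveBayes classifier would produce:

```python
def nb_initial_centers(X, nb_labels, K):
    """Fix the EM initial centers from a prior Naive Bayesian classification
    (step (2) / Figure 2) instead of choosing them at random.

    X         -- list of samples (each a list of attribute values)
    nb_labels -- hypothetical: one predicted class index per sample
    K         -- number of classes
    Returns (densities, centers): each class's relative frequency
    count/total, and its per-attribute mean used as the fixed center.
    """
    d = len(X[0])
    sums = [[0.0] * d for _ in range(K)]
    counts = [0] * K
    for x, k in zip(X, nb_labels):
        counts[k] += 1
        for j, v in enumerate(x):
            sums[k][j] += v
    densities = [counts[k] / len(X) for k in range(K)]            # D(A), D(B), ...
    centers = [[s / counts[k] for s in sums[k]] for k in range(K)]
    return densities, centers
```

Starting EM from these fixed, classification-derived centers is what removes the dependence on random initialization described in Section 1.2.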
The newer versions of MATLAB can use the MATLAB Compiler together with the C/C++ math and graphics libraries to automatically convert a MATLAB program into standalone C or C++ code that is independent of MATLAB, which also allows users to write C or C++ programs that interact with MATLAB. The program is detailed below, including the table of variable definitions (Table 1).

Table 1. Variable Definitions

    Variable        Description
    GDD(i, j)       Gaussian distribution density function
    m(:, :, j)      Attribute covariance matrix
    P(i, j)         Probability that sample i belongs to class j
    density(j)      Coefficient of the distribution density function
    X(i, j)         Sample i's attribute j
    pi              π
    temp1, temp2    Intermediate variables
    U(j, l)         Expectation of attribute l in class j

a. Expectation Step: To obtain the Gaussian distribution density function corresponding to equation (3), we use the covariance matrix in place of the deviation [16], because the covariance matrix describes the relationship between different attributes better than the deviation does [17] and conveniently handles the high-dimensional problem. A Gaussian model quantifies things with a Gaussian probability density function, decomposing them into a number of models formed from Gaussian probability density functions. In the modeling process, we initialize the parameters of the Gaussian mixture model, such as the variance, mean, and weights, and use the parameters obtained by the modeling as the data require.

    for i = 1 to n do
        for j = 1 to k do
            GDD(i, j) = 1 / (sqrt(2 * pi) * m(:, :, j)) * exp(-(X(i, :) - U(:, j))^2 / (2 * m(:, :, j)^2));
            j++;
        end
        i++;
    end

Obtaining the probability that sample i belongs to class j corresponds to equation (4) and contains two parts: the first part obtains the density and the Gaussian Mixture Model.
The second part obtains the probability from the result of the first part.

    for i = 1 to n do
        for j = 1 to k do
            temp1 = temp1 + density(j) * GDD(i, j);
            j++;
        end
        p(i) = temp1;
        temp1 = 0;
        i++;
    end

    for i = 1 to n do
        for j = 1 to k do
            P(i, j) = (density(j) * GDD(i, j)) / p(i);
            j++;
        end
        i++;
    end

b. Maximization Step: By updating the expectation and coefficient, obtain a new Gaussian distribution density model for the next iteration, feeding the updated expectation and coefficient values into the next Expectation Step. Repeat the Expectation Step and Maximization Step until the whole program converges, that is, until the change in the expectation and coefficient values is sufficiently small. Updating the expectation of each attribute corresponds to equation (5). When the function converges, the expectation of each attribute is the optimal value with which to impute the missing values; therefore, we must store U(j, l).

    for j = 1 to k do
        for l = 1 to m do
            for i = 1 to n do
                temp2 = temp2 + P(i, j) * X(i, l);
                temp3 = temp3 + P(i, j);
                i++;
            end
            U(j, l) = temp2 / temp3;
            temp2 = 0;
            temp3 = 0;
            l++;
        end
        j++;
    end

Updating the coefficient of the Gaussian mixture model corresponds to equation (6). Because the Expectation Step changes the clustering result and the corresponding GMM at the same time, we need to update the coefficient of the Gaussian Mixture Model before the whole program repeats.

    for j = 1 to k do
        for i = 1 to n do
            temp4 = temp4 + P(i, j);
            i++;
        end
        density(j) = temp4 * (1 / n);
        temp4 = 0;
        j++;
    end

2. Data Imputation and Classification Results

2.1. Data Imputation

In this experiment, we selected two datasets, both of which were downloaded from the UCI machine learning repository.
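For reference, the Expectation and Maximization loops of Section 1.3 can be condensed into a single function for the one-dimensional case of equations (3)-(6). The following Python sketch is an illustration of those update formulas, not the paper's MATLAB implementation:

```python
import math

def gmm_em_step(y, mu, sigma, alpha):
    """One EM iteration on a one-dimensional Gaussian mixture,
    following equations (3)-(6).

    y     -- list of observed values
    mu    -- per-component means
    sigma -- per-component standard deviations
    alpha -- per-component mixing coefficients
    Returns the updated (means, coefficients).
    """
    K, N = len(mu), len(y)

    def phi(x, k):
        # Gaussian density of equation (3)
        return (math.exp(-(x - mu[k]) ** 2 / (2 * sigma[k] ** 2))
                / (math.sqrt(2 * math.pi) * sigma[k]))

    # E-step: responsibilities gamma[j][k] of equation (4)
    gamma = []
    for x in y:
        w = [alpha[k] * phi(x, k) for k in range(K)]
        s = sum(w)
        gamma.append([v / s for v in w])

    # M-step: means via equation (5), coefficients via equation (6)
    new_mu = [sum(gamma[j][k] * y[j] for j in range(N))
              / sum(gamma[j][k] for j in range(N)) for k in range(K)]
    new_alpha = [sum(gamma[j][k] for j in range(N)) / N for k in range(K)]
    return new_mu, new_alpha
```

Iterating this step until the means and coefficients stop changing yields the converged expectations that the paper stores as the optimal imputation values.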
The first dataset describes kernels belonging to two different varieties of wheat, Kama and Rosa, with 70 randomly selected samples each [18]. The second dataset describes vertebral columns divided into two categories: Normal (100 patients) and Abnormal (210 patients) [19]. Details of these two datasets are shown in Table 2 and Table 3.

Table 2. Seeds' Attribute Information

    Attribute   Description
    A           Area
    P           Perimeter
    C           Compactness, C = 4*π*A/P^2
    Length      Length of kernel
    Width       Width of kernel
    a           Asymmetry coefficient
    l           Length of kernel groove

Table 3. Column's Attribute Information

    Attribute   Description
    PI          Pelvic incidence
    PT          Pelvic tilt
    LLA         Lumbar lordosis angle
    SS          Sacral slope
    PR          Pelvic radius
    GS          Grade of spondylolisthesis

To make the results more visible, we used the MCAR (missing completely at random) mechanism to raise the rate of missing values to 30%, and compared the results of both the traditional EM algorithm and our NB-EM algorithm against the datasets prior to the values being removed. Applying both the traditional EM algorithm and the NB-EM algorithm to the MCAR seeds dataset, we obtain the two sets of optimal estimates shown in Table 4.

Table 4. Seeds' Optimal Estimates with EM and NB-EM

                EM                          NB-EM
    Attribute   Class A       Class B       Class A       Class B
    Length      4.10955965    3.8655073     4.071186742   3.894977581
    A           10.8269840    11.5229968    10.64784974   11.66546843
    Width       2.73642423    2.782240497   3.331424247   2.138882163
    P           10.7887897    9.87305443    9.329911901   11.2472513
    a           2.548285411   1.93333857    2.74458494    1.689669352
    C           0.58293985    0.62747953    0.608751284   0.602245701
    l           3.786276617   3.176935235   3.432135692   3.530836093

Applying both the traditional EM algorithm and the NB-EM algorithm to the MCAR columns dataset, we obtain the two sets of optimal estimates shown in Table 5.

Table 5.
Columns' Optimal Estimates with EM and NB-EM

                EM                          NB-EM
    Attribute   Class A       Class B       Class A       Class B
    PI          38.97077617   43.94582274   38.99196927   43.90693893
    PT          11.66092049   12.97999265   11.52869673   13.18351406
    LLA         33.61154508   46.15201578   33.30749357   46.60853731
    SS          26.20333843   34.79580486   26.17940252   34.82255991
    PR          87.24249788   82.3805435    87.26586852   82.35016052
    GS          0.640833259   42.22666898   0.653738189   42.15643263

We then use these two tables to substitute for the missing values in the MCAR datasets, obtaining two pairs of updated datasets. We subsequently input the updated datasets into Weka and classify them.

2.2. Imputing the Missing Values

When the whole function converges, there are two outputs: first, the clustering result, and second, the optimal values, which are used to replace the missing values, as in Figure 3 below. First, we search the clustering result to find which class the sample belongs to. Second, we extract the corresponding attribute's optimal value to replace each missing value in that sample. For example, if the second attribute of a sample is missing, we use the second optimal value of the sample's class to replace it. After this phase, we obtain an imputed dataset without missing values.

[Figure 3. Process of Replacing Missing Values: input the sample with missing values; search the clustering results to determine whether the sample belongs to Class A or Class B; replace the missing values with the optimal values of that class.]

2.3. Classification Results

Table 6 and Table 7 show the results of the different methods of imputing the missing values, using the Multilayer Perceptron classifier in Weka [20]. The accuracy rate shows which method has the better effect.

Table 6.
Classification Results of Seeds Dataset

                                   Correctly Classified Instances
    Original Dataset               79.2857%
    Dataset with EM algorithm      81.4286%
    Dataset with NB-EM             88.5714%

Table 7. Classification Results of Column Dataset

                                   Correctly Classified Instances
    Original Dataset               69.6774%
    Dataset with EM algorithm      73.2258%
    Dataset with NB-EM             78.0645%

In both tables, the first row is the result of classifying the MCAR dataset with the Multilayer Perceptron and no other processing (Original Dataset). The second row is the result of using the traditional EM algorithm to substitute for the missing values and then classifying with the Multilayer Perceptron (Dataset with EM algorithm). The third row is the result of using the NB-EM algorithm to substitute for the missing values and then classifying with the Multilayer Perceptron (Dataset with NB-EM algorithm).

3. Experimental Results

In this paper, we studied a new method, the NB-EM algorithm, for handling missing values when preparing datasets for data discrimination and mining applications. The performance of this method was compared with the traditional EM method and the non-substitution approach on datasets containing randomly missing attribute values, so we can easily determine which method is most effective. Compared with the traditional EM algorithm, the NB-EM algorithm has a higher accuracy rate, which suggests that the NB-EM algorithm handles missing values better in practice. The NB-EM algorithm fixes the initial values so that the whole program avoids local optima and the influence of marginal values, and by repeating the Expectation Step and the Maximization Step it continuously approaches the optimal values. The NB-EM algorithm can therefore achieve a better result.
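The experimental procedure of Sections 2.1 and 2.2, removing values completely at random and then filling each gap with its class's optimal estimate, can be summarized in a short Python sketch. This is illustrative; `optimal` plays the role of the estimates in Tables 4 and 5, and the class assignment comes from the clustering result:

```python
import random

def make_mcar(rows, rate=0.30, seed=0):
    """Delete attribute values completely at random (MCAR) at the given
    rate, marking them None, as in the paper's 30%-missing experiments."""
    rng = random.Random(seed)  # fixed seed for reproducibility
    return [[None if rng.random() < rate else v for v in row] for row in rows]

def impute(sample, assigned_class, optimal):
    """Figure 3's replacement step: fill each missing attribute with the
    optimal estimate of the class the sample was clustered into.
    optimal[k][j] is class k's estimate for attribute j (cf. Tables 4-5)."""
    return [optimal[assigned_class][j] if v is None else v
            for j, v in enumerate(sample)]
```

Classification accuracy is then measured on the imputed dataset and compared against the MCAR dataset classified without any substitution.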
The application of these results to data mining and knowledge discovery could not only help improve the selection of a method for handling missing values during the data preprocessing phase for different data structures, but also produce a more reliable and efficient decision-making process given the uncertainty and incompleteness of the data collections at hand.

Acknowledgment

This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education, Science and Technology (2012R1A1A2044134).

References

[1] W. Vach, "Missing values: statistical theory and computational practice", Computational Statistics, edited by P. Dirschedl and R. Ostermann, Heidelberg, (1994), pp. 345-354.
[2] J. W. Grzymala-Busse, "Rough set approach to incomplete data", Lecture Notes in Artificial Intelligence, vol. 3070, (2004), pp. 50-55.
[3] K. Lakshminarayan, S. A. Harp and T. Samad, "Imputation of missing data in industrial databases", Applied Intelligence, vol. 11, (1999), pp. 259-275.
[4] W. Vach, "Missing values: statistical theory and computational practise", Computational Statistics, edited by P. Dirschedl and R. Ostermann, Heidelberg: Physica-Verlag, (1994), pp. 345-354.
[5] H. Akaike, "A new look at the statistical identification model", IEEE Trans. on Automatic Control, vol. 19, (1974), pp. 716-723.
[6] B. G. Lindsay, "Mixture Models: Theory, Geometry and Applications", NSF-CBMS Regional Conference Series in Probability and Statistics, Institute of Mathematical Statistics, California, vol. 5, (1995).
[7] X. Huang and Q. Zhu, "A pseudo-nearest-neighbor approach for missing data recovery on Gaussian random data sets", Pattern Recognition Letters, vol. 23, (2002), pp. 1613-1622.
[8] J. W. Grzymala-Busse and M.
Hu, "A comparison of several approaches to missing attribute values in data mining", Rough Sets and Current Trends in Computing, Lecture Notes in Computer Science, vol. 2005, (2001), pp. 378-385.
[9] G. J. McLachlan and T. Krishnan, "The EM Algorithm and Extensions", Wiley-Interscience, New York, (2007).
[10] J. MacQueen, "Some methods for classification and analysis of multivariate observations", Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, University of California Press, vol. 1, (1967), pp. 281-297.
[11] R. S. Pilla and B. G. Lindsay, "Alternative EM methods for nonparametric finite mixture models", Biometrika, vol. 88, (2001), pp. 535-550.
[12] A. P. Dempster, N. M. Laird and D. B. Rubin, "Maximum likelihood from incomplete data via the EM algorithm", Journal of the Royal Statistical Society, vol. 39, (1977), pp. 1-38.
[13] C. Elkan, "Boosting and Naive Bayesian learning", Technical Report No. CS97-557, (1997) September.
[14] J. A. Bilmes, "A gentle tutorial of the EM algorithm and its application to parameter estimation for Gaussian mixture and hidden Markov models", International Computer Science Institute, vol. 4, (1998), pp. 510-523.
[15] C. Liu and D. X. Sun, "Acceleration of EM algorithm for mixture models using ECME", ASA Proceedings of the Stat. Comp. Session, (1997), pp. 109-114.
[16] C. R. Houck, J. A. Joines and M. G. Kay, "A Genetic Algorithm for Function Optimization: A MATLAB Implementation", North Carolina State University, Report no. NCSU-IE-TR-95-09, (1995).
[17] G. Celeux, S. Chrétien, F. Forbes and A. Mkhadri, "A component-wise EM algorithm for mixtures", Journal of Computational and Graphical Statistics, vol. 10, (2001), pp. 699-712.
[18] D. Böhning, "Computer-Assisted Analysis of Mixtures and Applications: Meta-Analysis, Disease Mapping and Others", Technometrics, vol. 42, (2000), pp. 442-442.
[19] UCI Repository of Machine Learning.
http://archive.ics.uci.edu/ml/datasets/seeds.
[20] UCI Repository of Machine Learning. http://archive.ics.uci.edu/ml/datasets/Vertebral+Column.
[21] B. W. Porter, R. Bareiss and R. C. Holte, "Concept learning and heuristic classification in weak-theory domains", Artificial Intelligence, vol. 45, (1990), pp. 229-263.
[22] X. Zhou and J. S. Lim, "EM algorithm with GMM and Naive Bayesian to Implement Missing Values", Proceedings of Workshop 2014, Jeju Island, Korea, (2014) April 15-19.

Authors

Joon S. Lim received his B.S. and M.S. degrees in computer science from Inha University, Korea, and The University of Alabama at Birmingham, and his Ph.D. degree from Louisiana State University, Baton Rouge, Louisiana, in 1986, 1989, and 1994, respectively. He is currently a professor in the Department of Computer Software at Gachon University, Korea. His research focuses on neuro-fuzzy systems, biomedical prediction systems, and human-centered systems. He has authored three textbooks: Artificial Intelligence Programming (Green Press, 2000), Javaquest (Green Press, 2003), and C# Quest (Green Press, 2006).

Xi-Yu Zhou received his B.S. in computer science from Ludong University, China, in 2013. He is currently a master's student in computer science in the Department of Computer Software at Gachon University, Korea. His research focuses on neuro-fuzzy systems, biomedical prediction systems, and signal processing.