318 Vol 04, Special Issue01; 2013 http://ijpaper.com/ PUBLICATIONS OF PROBLEMS & APPLICATION IN ENGINEERING RESEARCH - PAPER CSEA2012 ISSN: 2230-8547; e-ISSN: 2230-8555

CLASSIFICATION AND CLUSTERING MEDICAL DATASETS BY USING ARTIFICIAL NEURAL NETWORK MODELS

B.V.S DHEERAJ REDDY1, MOUNIKA BOOREDDY2
1 Department of Computer Science and Engineering, Sastra University
2 Department of Information and Communication Technology, Sastra University
[email protected], [email protected]

1. INTRODUCTION
An Artificial Neural Network (ANN) is an information-processing paradigm inspired by the way the human brain processes information. ANNs are collections of mathematical models that represent some of the observed properties of biological nervous systems and draw on the analogies of adaptive biological learning. The key element of an ANN is its topology: a large number of highly interconnected processing elements (nodes) tied together by weighted connections (links). Learning in biological systems involves adjustments to the synaptic connections between neurons, and the same is true of ANNs. Learning typically occurs by example, through training on a set of input/output data (patterns), during which the training algorithm adjusts the link weights. The link weights store the knowledge necessary to solve specific problems.
Neural networks provide a suite of non-linear algorithms for feature extraction (using hidden layers) and classification (e.g. multilayer perceptrons). Their main characteristics are the ability to learn complex non-linear input-output relationships, the use of training procedures, and the capacity to adapt themselves to the data. The power of ANNs resides in their capacity to form decision regions of any shape. The main goal of research in the field of artificial neural networks is to understand and emulate the working principles of biological neural systems.
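The idea that knowledge is stored in link weights adjusted from examples can be illustrated with a single processing element. The sketch below is only illustrative: it uses the classic perceptron learning rule on the logical AND patterns, neither of which comes from this paper, and the function names are hypothetical.

```python
def neuron_output(weights, bias, inputs):
    """Weighted sum of the inputs followed by a hard threshold."""
    total = bias + sum(w * x for w, x in zip(weights, inputs))
    return 1 if total > 0 else 0


def train_neuron(patterns, learning_rate=0.1, epochs=50):
    """Adjust the link weights from input/output examples (perceptron rule)."""
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in patterns:
            # The error drives the weight change; no change when the output is right.
            error = target - neuron_output(weights, bias, inputs)
            weights = [w + learning_rate * error * x
                       for w, x in zip(weights, inputs)]
            bias += learning_rate * error
    return weights, bias


# Illustrative training patterns (logical AND), not taken from the paper.
AND_PATTERNS = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_neuron(AND_PATTERNS)
predictions = [neuron_output(weights, bias, x) for x, _ in AND_PATTERNS]
```

After training, everything the element has learned about the patterns is held in `weights` and `bias`.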
Some of the benefits of neural networks are as follows:
• Ability to process massive input data in parallel, owing to a parallel architecture
• Simulation of diffuse medical reasoning
• Higher performance when compared with statistical approaches
• Self-organizing ability and learning capability
• Easy knowledge base updating

2. NEURAL NETWORK IN MEDICAL FIELD
Neural networks are known to produce highly accurate results in practical applications. They have been successfully applied to a variety of real-world classification tasks in industry, business and science. They have also been applied to various areas of medicine, such as diagnostic aids, biochemical analysis, image analysis and drug development, and they are used in the analysis of medical images from a variety of imaging modalities. Earlier works in Clinical Diagnosis, Image Analysis and Signal Analysis are presented in the following sections.

2.1 Clinical Diagnosis
A research group at University Hospital, Lund, Sweden tested whether neural networks trained to detect acute myocardial infarction could lower the diagnostic error rate. They trained the network using ECG measurements from 1120 patients who had suffered a heart attack and 10,452 healthy persons with no history of heart attack. The performance of the neural networks was then compared with that of a widely used ECG interpretation program and that of an experienced cardiologist. An Entropy Maximization Network (EMN) has been applied to the prediction of metastases in breast cancer patients [1]. The authors used the EMN to construct discrete models that predict the occurrence of axillary lymph node metastases in breast cancer patients, based on characteristics of the primary tumor alone. An artificial neural network has been used to predict the occurrence of coronary artery disease; the serum lipid profile and clinical events of 162 patients over a period of 10 years served as input data to the network [2].
In [3], the authors carried out a study to investigate the effectiveness of radial basis function networks as an alternative data-driven diagnostic technique for myocardial infarction. The study included clinical data from 500 cases. In [4], a Bayesian posterior probability distribution is used for input selection in a neural network designed to assist inexperienced gynecologists in the pre-operative discrimination between benign and malignant ovarian tumors. Serum electrophoresis is used as a standard laboratory medical test for the diagnosis of several pathological conditions such as liver cirrhosis or nephritic syndrome. A multilayer perceptron trained using the Back-Propagation learning algorithm and a Radial Basis Function network were used to implement an effective diagnostic aid system [5].

2.2 Image Analysis
In [6], the authors presented examples of filtering, segmentation and edge detection techniques using cellular neural networks to improve resolution in brain tomographies and to improve global frequency correction for the detection of microcalcifications in mammograms. In [7], the authors trained different neural networks to recognize regions of interest (ROIs) corresponding to specific organs within electrical impedance tomography (EIT) images of the thorax. The network allows automatic selection of optimal pixels based on the number of images, over a sample period, in which each pixel is classified as belonging to a particular organ. In [8], the authors compared neural networks (cascade correlation) and fuzzy clustering techniques for the segmentation of MRI images of the brain.
Both approaches were applied to intelligent diagnosis. In [9], the authors implemented a self-organizing network, the multilayer adaptive resonance architecture (MARA), for the segmentation of CT images of the heart. Similarly, [10] implemented a two-layer neural network for the segmentation of CT images of the abdomen.

2.3 Signal Analysis
A knowledge-based neural network (KBANN) is implemented for the classification of phosphorus (31P) magnetic resonance spectra (MRS) from normal and cancerous breast tissues [11]. In [12], the authors reported the results from the application of tools for synthesizing, optimizing and analyzing neural networks to an Electrocardiogram (ECG) Patient Monitoring task. A neural network was synthesized from a rule-based classifier and optimized over a set of normal and abnormal heart beats. In [13], the purpose of the study was to identify and characterize clusters in a heterogeneous breast cancer computer-aided diagnosis database. Identification of subgroups within the database could help elucidate clinical trends and facilitate future model building. A self-organizing map (SOM) was used to identify clusters in a large (2258 cases), heterogeneous computer-aided diagnosis database based on mammographic findings (BI-RADS™) and patient age. An analysis of neural networks as ECG analyzers also shows that they are capable of dealing with the ambiguous nature of the ECG signal [14]: Silipo and Marchesi use static and recurrent neural network (RNN) architectures for classification tasks in ECG analysis for arrhythmia, myocardial ischemia and chronic alterations.

3. PROPOSED MODEL
The aim of this paper is to present the application of ANNs in medical diagnosis with three different datasets: breast cancer, heart disease and diabetes. These data sets are obtained from the UCI ML repository, http://www.ics.uci.edu.
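As a rough illustration of preparing such data for a network, the sketch below parses comma-separated records laid out like the breast cancer data described in Section 3.3.1 (a case ID, nine cytology values from 1 to 10, and a class of 2 or 4). The three sample rows are made up for illustration and the name `parse_record` is hypothetical. Skipping records that contain a `?` marker and scaling each value linearly to the range 0 to 1 are assumptions, not steps stated in the paper.

```python
def parse_record(line):
    """Turn one comma-separated record into (features, label), or None if incomplete."""
    fields = line.strip().split(",")
    if len(fields) != 11 or "?" in fields:
        return None                      # assumed policy: skip incomplete records
    values = [int(f) for f in fields]
    # Drop the case ID and scale the nine cytology values from 1..10 to 0..1.
    features = [(v - 1) / 9.0 for v in values[1:10]]
    label = 1 if values[10] == 4 else 0  # 4 = malignant, 2 = benign
    return features, label


# Made-up rows in the assumed layout: ID, nine values in 1..10, class (2 or 4).
SAMPLE_ROWS = [
    "1001,5,1,1,1,2,1,3,1,1,2",
    "1002,8,10,10,8,7,10,9,7,1,4",
    "1003,6,8,8,1,3,?,3,7,1,2",
]
parsed = (parse_record(row) for row in SAMPLE_ROWS)
records = [r for r in parsed if r is not None]
```

Each entry of `records` pairs a nine-element input vector with a 0/1 target, which is the form a network with nine input nodes and one output node would consume.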
The two important techniques, classification and clustering, are applied to these three medical datasets by constructing ANN models.

3.1 Neural Networks Training
The phases involved in solving classification and clustering problems with neural network techniques are designing, training and testing. The three aspects involved in the construction of neural networks are:
1. Structure: The architecture and topology of the neural network.
2. Encoding: The method of changing weights (training).
3. Recall: The method and capacity to retrieve information.
The structure relates to how many layers a network should contain and what their functions are in relation to input, output, or feature extraction. Encoding refers to the paradigm used for determining and changing the weights on the connections between neurons. The performance of the network can be analyzed by using recall.

3.2 Classification on Medical Datasets
Medical datasets are rich with hidden information that can be used for making intelligent medical diagnoses. Classification and prediction are two forms of data analysis that can be used to extract models describing important data classes or to predict future data trends. Whereas classification predicts categorical labels, prediction models continuous-valued functions. Data classification is a two-step process. In the first step, a model is built describing a predetermined set of data classes or concepts. The model is constructed by analyzing data samples described by attributes. Each sample is assumed to belong to a predefined class, as determined by one of the attributes, called the class label attribute. In the context of classification, the samples analyzed to build the model are referred to as training samples and are randomly selected from the sample population. Classification is often referred to as supervised learning because the classes are determined before examining the data. Classification algorithms require that the classes be defined based on data attribute values.
They often describe these classes by looking at the characteristics of data already known to belong to the classes. Pattern recognition is a type of classification where an input pattern is classified into one of several classes based on its similarity to these predefined classes. In the second step, the model is used for classification. First, the predictive accuracy of the model (or classifier) is estimated. The holdout method is a simple technique that uses a test set of class-labeled samples. These samples are kept separate from the training samples. The accuracy of a model on a given test set is the percentage of test set samples that are correctly classified by the model. For each test sample, the known class label is compared with the learned model's class prediction for that sample.

3.3 Feed Forward Neural Networks with Back-Propagation
The invention of the Back-Propagation algorithm has played a large part in the resurgence of interest in ANNs. Back-Propagation is a systematic method for training multilayer ANNs, and the feed-forward network is a very popular neural network model. The Back-Propagation learning algorithm consists of two passes: a forward pass and a backward pass. In the forward pass, an input vector is applied to the network and the output of each neuron in the output layer is calculated. During this pass the synaptic weights of the network are all fixed. In the backward pass, the synaptic weights are all adjusted in accordance with an error-correction rule: the actual output is subtracted from the desired output to produce an error, which is then propagated backward through the network, against the direction of the synaptic connections. Momentum and variable learning rates are considered to improve the Back-Propagation algorithm.
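The two passes and the momentum term can be sketched as follows. This is a minimal illustration, not the authors' implementation: the sigmoid activation, a single hidden layer of four units, batch weight updates, the XOR toy patterns and all hyperparameter values are assumptions chosen for brevity.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, w_hidden, w_out):
    """Forward pass: the weights stay fixed while activations are computed."""
    xb = list(x) + [1.0]                                  # inputs plus bias input
    hidden = [sigmoid(sum(w * v for w, v in zip(row, xb))) for row in w_hidden]
    hb = hidden + [1.0]                                   # hidden outputs plus bias
    output = sigmoid(sum(w * v for w, v in zip(w_out, hb)))
    return xb, hb, output


def train(patterns, n_hidden=4, lr=0.5, momentum=0.8, epochs=3000, seed=0):
    rng = random.Random(seed)
    n_in = len(patterns[0][0])
    w_hidden = [[rng.uniform(-1, 1) for _ in range(n_in + 1)] for _ in range(n_hidden)]
    w_out = [rng.uniform(-1, 1) for _ in range(n_hidden + 1)]
    dw_hidden = [[0.0] * (n_in + 1) for _ in range(n_hidden)]  # previous updates
    dw_out = [0.0] * (n_hidden + 1)
    errors = []
    for _ in range(epochs):
        g_hidden = [[0.0] * (n_in + 1) for _ in range(n_hidden)]
        g_out = [0.0] * (n_hidden + 1)
        total = 0.0
        for x, t in patterns:
            xb, hb, o = forward(x, w_hidden, w_out)
            total += 0.5 * (t - o) ** 2
            # Backward pass: output delta first, then propagate to the hidden units.
            delta_o = (o - t) * o * (1.0 - o)
            for j in range(n_hidden + 1):
                g_out[j] += delta_o * hb[j]
            for j in range(n_hidden):
                delta_h = delta_o * w_out[j] * hb[j] * (1.0 - hb[j])
                for i in range(n_in + 1):
                    g_hidden[j][i] += delta_h * xb[i]
        errors.append(total)
        # Momentum: blend the new gradient step with the previous weight update.
        for j in range(n_hidden + 1):
            dw_out[j] = momentum * dw_out[j] - lr * g_out[j]
            w_out[j] += dw_out[j]
        for j in range(n_hidden):
            for i in range(n_in + 1):
                dw_hidden[j][i] = momentum * dw_hidden[j][i] - lr * g_hidden[j][i]
                w_hidden[j][i] += dw_hidden[j][i]
    return w_hidden, w_out, errors


# Toy patterns (XOR), used only to exercise the code.
XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
w_hidden, w_out, errors = train(XOR)
```

Comparing `errors[0]` with `errors[-1]` shows how far training reduced the summed squared error. The same `forward` function can score held-out samples to obtain a holdout accuracy estimate.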
Momentum allows the network to respond not only to the local gradient but also to recent trends in the error surface. With momentum, the network can ignore small features in the error surface.

3.3.1 Classification of Breast Cancer Data Set
The Wisconsin Breast Cancer dataset was initially created to conduct experiments intended to prove the usefulness of automating fine needle aspiration cytological diagnosis. It contains 699 instances of cytological analysis of fine needle aspirates from breast tumors. Each case comprises 11 attributes: a case ID, cytology data (normalized, with values in the range 1-10) and a benign/malignant attribute. The attribute information is given in Table 1. The values are normalized in the form of zeros and ones.

Table 1. Breast Cancer Data Set Attributes
No.   Attribute                      Domain
1.    Sample code number             Id number
2.    Clump thickness                1-10
3.    Uniformity of cell size        1-10
4.    Uniformity of cell shape       1-10
5.    Marginal adhesion              1-10
6.    Single epithelial cell size    1-10
7.    Bare nuclei                    1-10
8.    Bland chromatin                1-10
9.    Normal nucleoli                1-10
10.   Mitoses                        1-10
11.   Class                          2, 4 (2 for benign, 4 for malignant)

3.3.2 Heart Disease Data Set
The heart disease data set concerns the diagnosis of whether or not a person has heart disease. It contains 414 instances, 13 attributes and a class attribute. A class value of 0 indicates a normal person, a value of 1 indicates first stroke, a value of 2 indicates second stroke, and a value of 3 indicates end of life. The attribute description of this data set is given in Table 2.

Table 2. Heart Disease Data Set Attributes
S.No.  Attribute  Description                                              Range
1.     Age        Age in years                                             Continuous
2.     Sex        1 = male; 0 = female                                     0,1
3.     Cp         Value 1: typical angina                                  1,2,3,4
                  Value 2: atypical angina
                  Value 3: non-anginal pain
                  Value 4: asymptomatic
4.     Trestbps   Resting blood pressure (in mm Hg)                        Continuous
5.     Chol       Serum cholesterol in mg/dl                               Continuous
6.     Fbs        Fasting blood sugar > 120 mg/dl (1 = true; 0 = false)    0,1
7.     Restecg    Resting electrocardiographic results                     0,1,2
                  Value 0: normal
                  Value 1: having ST-T wave abnormality (T wave
                  inversions and/or ST elevation or depression
                  of > 0.05 mV)
                  Value 2: showing probable or definite left
                  ventricular hypertrophy by Estes' criteria
8.     Thalach    Maximum heart rate achieved                              Continuous
9.     Exang      Exercise induced angina (1 = yes; 0 = no)                0,1
10.    Oldpeak    ST depression induced by exercise relative to rest       Continuous
11.    Slope      The slope of the peak exercise ST segment                1,2,3
                  Value 1: up sloping
                  Value 2: flat
                  Value 3: down sloping
12.    Ca         Number of major vessels (0-3) colored by fluoroscopy     Continuous
13.    Thal       Normal, fixed defect, reversible defect                  3,6,7
14.    Class      0 for normal person, 1 for first stroke,                 0,1,2,3
                  2 for second stroke, 3 for end of life

3.3.3 Diabetes Data Set
All patients in this data set are females at least 21 years old and of Pima Indian heritage. The diabetes data set concerns the diagnosis of whether or not a person is diabetic. It contains 768 instances, 8 attributes and a class attribute. A class value of 0 indicates a non-diabetic person and a value of 1 indicates a diabetic person. The attribute description of this data set is given in Table 3.

Table 3. Diabetes Data Set Attributes
S.No.  Attribute
1.     Number of times pregnant
2.     Plasma glucose concentration at 2 hours in an oral glucose tolerance test
3.     Diastolic blood pressure (mm Hg)
4.     Triceps skin fold thickness (mm)
5.     2-Hour serum insulin (mu U/ml)
6.     Body mass index (weight in kg/(height in m)^2)
7.     Diabetes pedigree function
8.     Age (years)
9.     Class variable (0 or 1)

3.3.4 Experimental Results from Classification
The classification accuracy is higher for the multilayer network than for the single-layer network because the hidden-layer neurons act as feature extractors. The experimental results for the three data sets are given in Table 4.

Table 4. Results of Classification Experiments on Three Data Sets
                                                  Accuracy of Classification
Dataset         Attributes  Instances  Classes    Single layer   Multi layer
Breast Cancer   9           699        2          72%            80.3%
Heart Disease   13          414        4          70%            81.4%
Diabetes        8           768        2          69.4%          78.2%

4. CLUSTERING OF MEDICAL DATA SETS
Clustering is a multivariate analysis technique widely adopted in medical diagnosis studies and pattern recognition. By examining the underlying structure of a dataset, cluster analysis aims to classify data into separate groups according to their characteristics. The clustering is performed such that samples held within a cluster are as similar as possible, and those found in different clusters are as dissimilar as possible. In machine learning, clustering is an example of unsupervised learning. Unlike classification, clustering and unsupervised learning do not rely on predefined classes and class-labeled training examples.
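To make the idea of grouping without class labels concrete, the sketch below trains a tiny one-dimensional Kohonen self-organizing map, the model this paper adopts for clustering (Section 4.1). It is a minimal illustration under assumptions not taken from the paper: two map units, evenly spaced initial weights, a linearly decaying learning rate and neighbourhood width, and six made-up two-feature samples.

```python
import math


def winner(units, sample):
    """Index of the unit whose weight vector is closest to the sample."""
    dists = [sum((u - x) ** 2 for u, x in zip(unit, sample)) for unit in units]
    return dists.index(min(dists))


def train_som(samples, n_units=2, epochs=50, lr0=0.5, sigma0=1.0):
    """Train a one-dimensional map and return the unit weight vectors."""
    dim = len(samples[0])
    lows = [min(s[d] for s in samples) for d in range(dim)]
    highs = [max(s[d] for s in samples) for d in range(dim)]
    # Assumed initialisation: units spaced evenly between per-feature min and max.
    units = [[lows[d] + (highs[d] - lows[d]) * k / max(1, n_units - 1)
              for d in range(dim)] for k in range(n_units)]
    for epoch in range(epochs):
        frac = 1.0 - epoch / epochs
        lr = lr0 * frac                       # learning rate shrinks over time
        sigma = max(0.05, sigma0 * frac)      # so does the neighbourhood width
        for s in samples:
            bmu = winner(units, s)
            for k, unit in enumerate(units):
                # Units near the winner on the map are pulled toward the sample too.
                h = math.exp(-((k - bmu) ** 2) / (2.0 * sigma ** 2))
                for d in range(dim):
                    unit[d] += lr * h * (s[d] - unit[d])
    return units


# Made-up samples forming two loose groups; no class labels are supplied.
SAMPLES = [(0.0, 0.1), (0.1, 0.0), (0.2, 0.1),
           (0.9, 1.0), (1.0, 0.9), (0.8, 1.0)]
units = train_som(SAMPLES)
clusters = [winner(units, s) for s in SAMPLES]
```

Only the input vectors are passed to `train_som`; the cluster index of each sample is discovered from the data, and no class labels take part in training.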
For this reason, clustering is a form of learning by observation rather than learning by examples. One of the important applications of neural networks is the clustering of medical data for clinical diagnosis. In this paper, the neural network model used for clustering is the Kohonen Self-Organizing Map.

4.1 Self-Organizing Maps
Dimensionality reduction concomitant with preservation of topological information is common in normal human subconscious information processing. We routinely compress information by extracting relevant facts and thereby develop reduced representations of impinging information while retaining essential knowledge. A good example is biological vision, where three-dimensional visual images are routinely mapped onto a two-dimensional retina and information is preserved in a way that permits perfect visualization of a three-dimensional world. The self-organizing feature map is a neural network model based on Kohonen's discovery that topological information prevalent in high-dimensional input data can be transformed onto a one- or two-dimensional layer of neurons.

4.3 Experiment Results from Cluster Analysis
Experiments are conducted on the three medical datasets mentioned above, using the self-organizing neural network model for cluster analysis. The instances in each dataset are divided into training vectors and test vectors, and the results are shown in the following tables. In each table, the efficiency is the percentage of test vectors that are recognized.

Table 5: Results of the Cluster Analysis for Breast Cancer Data Set. Total instances taken are 699.
Training Vectors   Test Vectors   Time (sec)   Recognized Vectors   Efficiency
139                560            0.156        450                  80.35%
279                419            0.129        352                  84.0%
419                279            0.099        240                  86.02%
560                139            0.046        126                  90.6%

Table 6: Results of the Cluster Analysis for Heart Disease Data Set. Total instances taken are 414.
Training Vectors   Test Vectors   Time (sec)   Recognized Vectors   Efficiency
90                 324            0.133        261                  80.5%
180                294            0.104        245                  83.3%
262                152            0.082        135                  88.8%
341                73             0.034        67                   91.7%

Table 7: Results of the Cluster Analysis for Diabetes Data Set. Total instances taken are 768.
Training Vectors   Test Vectors   Time (sec)   Recognized Vectors   Efficiency
155                613            0.152        500                  81.5%
307                460            0.117        400                  86.95%
460                307            0.092        275                  89.57%
614                155            0.059        140                  90.32%

CONCLUSION
This study was carried out to develop a system for performing classification and clustering tasks on three medical data sets (Breast Cancer, Heart Disease and Diabetes) using neural network techniques for diagnostic purposes. Two types of neural network models are considered in this paper: the Feed Forward Neural Network (FFNN) with Back-Propagation and the Self-Organizing Map. Experiments were conducted with single-layer and multilayer FFNNs, and the results show that classification accuracy is higher for the multilayer network than for the single-layer network. An unsupervised self-organizing network was designed to perform cluster analysis on the three medical data sets. Cluster experiments were performed with different percentages of the samples used in the training process. The experimental results show that performance improves when more sample data is used in the training process.

References
[1] Choong P. L., DeSeilva C.J.S., "Breast Cancer Prognosis using EMN Architecture", Proceedings of IEEE International Conference on Neural Networks, June, 1994.
[2] Lapuerta P., Azen S.P., and LaBree L., "Use of Neural Networks in Predicting the Risk of Coronary Artery Disease", Computers and Biomedical Research, 28, 1995, pp. 38-52.
[3] Fraser H., Pugh R., Kennedy R., Ross P., and Harrison R., "A comparison of Backpropagation and Radial Basis Functions, in the Diagnosis of Myocardial Infarction", In Ifeachor E., and Rosen K. (Eds.), International Conference on Neural Networks and Expert Systems in Medicine and Healthcare, 1994, pp. 76-84.
[4] Verrelst H., Vandewalle J., and De Moor B., "Bayesian Input Selection for Neural Network Classifiers", In Ifeachor E., Sperduti A., and Starita A. (Eds.), Third International Conference on Neural Networks and Expert Systems in Medicine and Healthcare, 1998, pp. 125-132. World Scientific.
[5] Costa A., Cabestany J., Moreno J., and Calvet M., "Neuroserum: An Artificial Neural Net Based Diagnostic Aid Tool for Serum Electrophoresis", In Ifeachor E., Sperduti A., and Starita A. (Eds.), Third International Conference on Neural Networks and Expert Systems in Medicine and Healthcare, 1998, pp. 34-43. World Scientific.
[6] Aizenberg I., Aizenberg N., Hiltner J., "Cellular neural networks and computational intelligence in medical image processing", Image and Vision Computing, 19(4), 2001, pp. 177-183.
[7] Miller A., Blott B., and Hames T., "Review of Neural Network Applications in Medical Imaging and Signal Processing", Medical and Biological Engineering and Computing, 30(5), 1992, pp. 449-464.
[8] Hall L., Bensaid A., Clarke L., Velthuizen R., Silbiger M., and Bezdek J., "A Comparison of Neural Network and Fuzzy Clustering Techniques in Segmenting Magnetic Resonance Images of the Brain", IEEE Transactions on Neural Networks, 3(5), 1992, pp. 672-682.
[9] Rajapakse J., and Acharya R., "Medical Image Segmentation with MARA", In International Joint Conference on Neural Networks, Vol. 2, 1990, pp. 965-972.
[10] Daschlein R., Waschulzik T., Brauer W., "Computer Aided Analysis of Lung Parenchyma Lesions in Standard Chest Radiography", In Ifeachor E., and Rosen K. (Eds.), International Conference on Neural Networks and Expert Systems in Medicine and Healthcare, 1994, pp. 174-180.
[11] Sarle W. S., "Neural Networks and Statistical Models", Proceedings of the Nineteenth Annual SAS Users Group International Conference, April, 1994.
[12] Waltrus R.L., Towell G., and Glassman M.S., "Synthesize, Optimize, Analyze, Repeat (SOAR): Application of Neural Network Tools to ECG Patient Monitoring".
[13] Mia K. Markey, Joseph Y. Lo, Georgia D. Tourassi, Carey E. Floyd Jr., "Self-organizing map for cluster analysis of a breast cancer database", Artificial Intelligence in Medicine, Vol. 27, 2003, pp. 113-127.
[14] Silipo R., and Marchesi C., "Artificial Neural Networks for automatic ECG analysis", IEEE Transactions on Signal Processing, Vol. 46, n. 5, 1998, pp. 1417-1425.

2010-2013 - IJPAPER Indexing in Process - EMBASE, EmCARE, Electronics & Communication Abstracts, SCIRUS, SPARC, GOOGLE Database, EBSCO, NewJour, Worldcat, DOAJ, and other major databases etc.,