Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
M. Chandamona et al., Available Online http://www.ijncse.com ISSN Online: 2395-7018 Special issue ,March, 1-9 AN IMPROVED ANALYSIS OF DATA MINING TECHNIQUES ON MEDICAL DATA Mrs.M.Chandamona M.Sc.,M.Phil, Assistant Professor & Head of the Department BCA G.T.N Arts College, Karur Road Dindigul Dr.PonPeriasamy,M.Sc.,MCA.,M.Phil., Associate Professor Nehru Memorial College, Puthanampatti, Trichy 1.INTRODUCTION Abstract In the last decade there has been increasing usage of data mining techniques on medical data for locating helpful trends or patterns that are utilized in identification and higher cognitive process. data mining techniques like cluster, classification, regression, association rule mining, CART (Classification and Regression Tree) area unit wide utilized in attention domain. data mining algorithms, once appropriately used, are capable of rising the standard of prediction, identification and un wellness classification. the most focus of this paper is to research data mining techniques needed for medical data mining particularly to find locally frequent diseases like heart ailments, lung cancer, breast cancer and so on. we tend to measure the data mining techniques for finding locally frequent patterns in terms of value, performance, speed and accuracy. we have a tendency to additionally compare data mining techniques with typical strategies. Keywords—Data mining, frequent patterns, data mining techniques, medical data mining Data mining is the method of excavation data for locating latent patterns which may be translated into valuable info. data mining usage witnessed unprecedented growth within the previous couple of years. currently the quality of data mining techniques has been accomplished in health care domain. This realization is within the wake of explosion of complicated medical data. Medical data mining will exploit the hidden patterns present in voluminous medical data which otherwise is left undiscovered. data mining techniques that are applied to medical data embrace association rule mining for locating frequent patterns, prediction, classification and agglomeration. historically data processing techniques were utilized in numerous domains. However, it's introduced comparatively late into the health care domain. however, as on nowadays lot of analysis is found within the literature. This has LED to the development of intelligent systems and decision support systems in health care domain for correct diagnosing of diseases, predicting the severity of varied diseases, and remote health observation. particularly the data mining techniques are a lot of helpful in predicting heart diseases, lung cancer, and breast cancer and so on. The data mining techniques that are applied to medical data embrace Apriori and FPGrowth [1], [2], Int J Nano Corr Sci and Engg 3(3) (2016) 85-90 Editors: Dr S Rajendran, Dr M Pandiarajan, and A. Daniel Das 85 M. Chandamona et al., [3], [4], [5], [6], [7], and [8], unattended neural networks [9][10], linear genetic programming [9], Association rule mining [11], [12], Bayesian Ying yang [13], decision tree algorithms like ID3, C4.5, C5, and CART [14], [15], [16], [17], [18], [19], [20], outlier prediction technique [21], Fuzzy cluster analysis [22], classification algorithmic rule [17], [23], [24], Bayesian Network algorithmic rule [14], [25], Naive Bayesian [26], combination of K-means, Self-Organizing Map (SOM) and Naïve Bayes [27], statistic technique [28], [29], combination of SVM, ANN and ID3 [16], agglomeration and classification [30],SVM [16], [31], FCM [29],k-NN [24], and Bayesian Network [14]. This paper provides the outline of these techniques in terms of the matter they solve or their utility in medical data processing or the tools that are enforced over them and so on. The remainder of the paper is structured as follows. Section II reviews literature bearing on data mining and applications of data mining techniques in health care domain. Section III provides the outline of medical data mining techniques. Section IV presents the importance of discovering locally frequent patterns or trends or diseases in medical knowledge. Section V concludes the paper. RELATED WORK Data mining could be a process of analyzing voluminous data in numerous views so as to motivate trends or patterns that cause business intelligence [32]. data mining plays a vital role in IT because it discovers data from historical data of varied domains. for example data mining may be wont to mine medical data as health care domain produces large amount of information concerning patients, diseases, diagnosis, medication and so on. By applying data mining techniques in health care domain, the directors will improve the QoS (Quality of Service) by discovering latent probably helpful trends needed by diagnosing [33]. data mining is helpful in medical applications like medications, medical tests, prediction of surgical procedures, and discovery of relationships between pathological data and clinical data [34]. Apriori and FPGrowth are the most widely used frequent pattern mining algorithms [35]. These 2 algorithms and algorithms supported them are studied in [2], [3], [4], [5], [6], [7], and [8]. These 2 algorithms are utilized in medical data mining. Goodwin et al. [36] applied data mining techniques for birth outcomes. Evans et al. [37] expressed that hereditary syndromes will be detected automatically using data mining techniques. Doron Shalvi and St. Nicholas DeClaris, [10] mentioned medical data mining through unsupervised neural networks besides a technique for data visual image. They additionally stressed the necessity for preprocessing before medical data mining. within the year 2000 Krzysztof J. Cior [38], biotechnology academician, known the requirement for data mining methods to mine medical multimedia system content. Tsumoto [39] known issues in medical data mining. the problems include missing values, data storage with reference to temporal information and ambiguous data, completely different medical coding systems being used in Hospital Info Systems (HIS). Brameier and Banzhaf [9] explored and analyzed 2 programming models like neural networks, and linier genetic programming for medical data mining. Abidi and Hoe [40] projected and enforced a symbolic rule extraction work table for generating emerging rule-sets. Abidi et al. [41] explored the usage of rule-sets as results of information mining for building rule-based professional systems. Olukunle and Ehikioya [11] proposed an algorithmic rule for extracting association rules from medical image data. The association rule mining discovers frequently occurring things within the given dataset. wedge and Xu [13] proposed a classification methodology supported Bayesian Ying rule (BYY) that may be a 3 stratified model. They applied this model to classify liver disease through automatic discovery of medical trends. Brunie et al. [42] proposed architecture for mining genomedical data in heterogeneous and grid-based distributed infrastructures. Mahmud Khan et al. [15] targeted on call tree data mining algorithmic rule for medical image analysis. particularly they studied on lung cancer diagnosing through classification of x-ray images. Podgorelec et al. [21] given an outlier prediction methodology for rising performance of classification as a part of medical data mining. Wang et al. [22] applied fuzzy cluster analysis for medical images. They used decision tree rule to classify diagnostic procedure into traditional and abnormal cases. Cheng et al. [17] applied classification algorithmic rule to diagnose cardio vascular diseases. For classification effectiveness they targeted on 2 feature extraction techniques particularly automatic feature choice and professional judgment. Seng et al. [43] introduced web based data mining for the applying of telemedicine. Ghannad-Rezaie et al. [44] conferred an approach to integrate PSO rule mining strategies and classifier on patient dataset. They used Particle Swarm optimisation technique moreover. The results discovered that, their approach is capable of performing arts surgery candidate selection method effectively in brain disease. bethel et al. [12] developed an association rule learner that is predicated on the standards collected from past breast cancer patients. The rule learner is employed during a tool by name “Clinical Trial Assignment Professional System”. Xue et al. [25] proposed and applied Int J Nano Corr Sci and Engg 3(3) (2016) 85-90 Editors: Dr S Rajendran, Dr M Pandiarajan, and A. Daniel Das 86 M. Chandamona et al., Bayesian Network algorithmic rule for diagnosing of an disorder called Coronary cardiopathy (CHD). Abraham et al. [26] projected discrimination techniques to enhance the accuracy of classification of medical data victimisation Naive Bayesian classifier algorithmic rule. Hassan and Verma [27] projected a hybrid approach for classification of medical data which mixes K-means, Self-Organizing Map (SOM) and Naïve Thomas Bayes with NN based mostly classifier. Tsumoto [45] studied multi-stage diagnosing victimization experts’ diagnostic rules and diagnostic taxonomy. They targeted on automatic grouping of medical data extracted from clinical database. Berlingerio et al. [28] studied Time Annotated Sequences (TAS) rule for mining medical knowledge with temporal dimensions. The extracted patterns exhibited the attribute relationships in time domain that helps in correct identification. Xing et al. [16] developed data mining techniques for predicting the chance of survival of CHD patients. to realize this they combined 3 prediction models like SVM (Support Vector Machine), Artificial Neural Networks (ANN), and call trees victimisation C4.5 or ID3, CART and C5. Abe et al. [46] projected an integrated time-series data mining environment for mining large amount of medical data for extracting additional valuable rule-sets. Jiquan et al. [47] projected a framework referred to as term-mapping to mix multiple medical data sources for data processing. Barnathan et al. [30] conferred a framework for cluster, classification and similarity search of biomedical images or 2d and 3D in nature. Shusaku et al. [48] projected multi-scale matching and clump technique on medical knowledge. Their results unconcealed that their technique is capable of grouping liver disease data supported temporal covariance of choline esterase, albumin and living substance. Hai Wang, and Shouhong Wang [49] studied on the role of doctors in medical data mining. medical experts will provide expert recommendation that may be used as input in medical data mining. Abdullah et al. [1] applied apriori rule for medical data mining. They extracted frequent item sets by analyzing associations between treatments and diagnosis. Saraee et al. [18] applied data mining techniques to medical data referring to military with reference to death rate in kids because of accidents. They used CART algorithmic rule to get a decision tree. Balakrishnan and Narayanaswamy [31] conferred feature choice victimisation SVM for classifying diabetes databases. medicine and health effects are well-mined by Froelich and Wakulicz-Deja [29] victimization adaptive FCM (Fuzzy cognitive Maps). Their work has LED to improved call support and designing in health care domain. known as Knowledge Discover Question Language for preparing questions that are used to discover knowledge from medical data. They explored ways and means for intelligent medical data mining. SUMMARY OF TECHNIQUES FOR MEDICAL DATA MINING Data mining techniques have shown significant improvement in medical industry in terms of prediction and decision making with respect to various diseases like cancer, cardio vascular abnormalities, diabetes, and others. Table 1 summary the medical data mining, its areas of application and the utility of the techniques. CONCLUSION In this paper we tend to survey numerous data mining techniques that are utilized for medical data mining. data mining techniques have higher utility in medical data mining as there's voluminous data during this industry. Attributable to the rising of medical data, it's become indispensable to use data mining techniques to assist decision support and predication systems within the field of health care. The medical mining yields required business intelligence to support well informed diagnosing and decisions. This paper has provided the outline of data mining techniques used for medical data mining besides the diseases they classified. It conjointly throws light into the importance of locally frequent patterns and therefore the mining techniques used for the purpose. Int J Nano Corr Sci and Engg 3(3) (2016) 85-90 Editors: Dr S Rajendran, Dr M Pandiarajan, and A. Daniel Das 87 M. Chandamona et al., TABLE 1 – Summary of medical data mining techniques REFERENCES TECHNIQUES UTILITY [1],[2],[3],[4],[5],[6],[7],an d [8] Appriori and FP Growth Associating rule mining for finding frequent item sets(diseases) in medical databases [9],[10] Neural Networks Extracting patterns, detecting trends [9] Genetic Algorithm Classification of medical data [11],[12] Association Rule Mining Finding frequent Patterns [13] Bayesian Ying Yang Classification DISEASE Diabetic Diseases Liver Diseases (BYY) [10],[14],[15],[16],[17],[18] , Decision Tree Algorithms such as ID3,C4.5,C5,and CART Decision Support [21] Outlier Prediction Technique For improving classification accuracy [22] Fuzzy Cluster analysis Analyzing medical images [17],[23],[24] Classification algorithm Disease Classification Cardio Vascular Diseases [14],[25] Bayesian Network algorihm Modeling and analysis of medical data. Coronary Heart Disease [26] Naive Bayesian Improving Classification accuracy Coronary Heart Disease [28],[29] Combined use of K-means SOM and Naive Bayes Medical diagnosis [16] Combination of SVM,ANN and ID3 Accurate Classification of medical data. [30] Clustering and Classification Clustering and Classification Of biomedical databases [16],[31] SVM Disease Classification [29] Fuzzy Cognitive Maps Drugs and Health effects classification [24] k-NN Classification of Diseases [19] Diabetes Diabetes , Cancer Int J Nano Corr Sci and Engg 3(3) (2016) 85-90 Editors: Dr S Rajendran, Dr M Pandiarajan, and A. Daniel Das 88 M. Chandamona et al., REFERENCES [1] [2] [3] [4] [5] [6] [7] [8] [9] [10] [11] [12] [13] [14] [15] [16] [17] [18] [19] Sarojini Balakrishnan (n.d). SVM Ranking with Backward Search for Feature Selection in Type II Diabetes Databases. IEEE. p1-6. Arun K Pujari “Data Mining Techniques”, Edition 2001. M. Ilayaraja Department of Computer Science & Engineering Alagappa University Karaikudi, India [email protected]. (2013). Mining Medical Data to Identify Frequent Diseases using Apriori Algorithm. IEEE. 0 (0), p1-6. J. C. Prather, D. F. Lobach, L. K. Goodwin, J. W.Hales , M. L. Hage¸ W. Edward Hammond, “MedicalData Mining: Knowledge Discovery in a Clinical DataWarehouse”, 1997. HAI-BING MA, JIN ZHANG, YlNG-JIE FAN, YUN-FA W. (2004). MINING FREQUENT PATTERNS BASED ON IS+TREE. IEEE. 0 (0), P1208-1213. Goodwin L, Prather J, Schlitz K, Iannacchione My Hammond W, Grzymala J, DataMining Issues for Impproved Birth Outcomes, Biomed. Science Instrum, 34, 1997, pp. 291-296. Evans S, Lemon S, Deters C, Fusaro R and Lynch H, Automated Detection of hereditary Syndromes Using Data Mining, Computers and Biomedical Research 30, 1997, pp. 337-348. Krzysztof J. Cior , Medical Data Mining and Knowledge Discovery. (n.d). From the guest Editor. IEEE. p1-2 Shusaku Tsumoto (n.d). Problems with Mining Medical Data. IEEE. p1-2. Syed Sibte Raza Abidi Kok Meng (n.d). Symbolic Exposition of Medical Data-Sets: A Data Mining Workbench to Inductively Derive Data-Defining Symbolic Rules. p1-6. S. S. R. Abidi, K. M. Hoe, A. Goh, “Analyzing data clusters: A rough set approach to extract cluster definingsymbolic rules, Fisher, Hand, Hoffman, Adams (Eds.) Lecture Notes in Computer Science: Advances inIntelligent Data Analysis, 4th Intl. Symposium, IDA-01. Springer Verlag: Berlin, 2001. Lionel Brunie, Maryvonne Miquel, Jean-Marc Pierson, and Anne Tchounikine, “Information grids: managing and mining semantic data in a grid infrastructure; open issues and application to geno-medical data. 2003, 14th International workshop on Database and Expert Systems Applications. Wong Kok Seng, Rosli Bin Besar, Fazly Salleh Abas trosli, (n.d). Collaborative Support for Medical Data Mining in Telemedicine. IEEE. p1-6. M. Ghannad-Rezaie, H. Soltanain-Zadeh, M.-R. Siadat, K.V. Elisevich. (2006). Medical Data Mining using Particle Swarm Optimization for Temporal Lobe Epilepsy. IEEE. p18. Shusaku Tsumoto , (n.d). Problems with Mining Medical Data. IEEE. p1-2. Hidenao Abe AND Hideto Yokoi (n.d). Developing an Integrated Time-Series Data Mining Environment for Medical Data Mining. IEEE. p1-6. Liu Jiquan Deng Wenliang Xudong Lu (n.d). Liu Jiquan Deng Wenliang Xudong Lu Huilong Duan College of Biomedical Engineering & Instrument Science Zhejiang University Hangzhou 310027, China [email protected]. IEEE. p1-4. Shusaku Tsumoto (n.d). Problems with Mining Medical Data. IEEE. p1-2. Hai Wang, Shouhong Wang 1. (n.d). Medical Knowledge Acquisition through Data Mining. IEEE. 0 (0), p1-4. [20] Gaurav N. Pradhan AND B. Prabhakaran (n.d). ASSOCIATION RULE MINING IN MULTIPLE, MULTIDIMENSIONAL TIME SERIES MEDICAL DATA. IEEE. p1-4. [21] Oliver Hogl, Michael Müller (2001). On Supporting Medical Quality with Intelligent Data Mining. IEEE. p1-10. [22] Umair Abdullah (2008). Analysis of Effectiveness of Apriori Algorithm in Medical Billing Data Mining1. IEEE.p1-5. [23] Cong-Rui Ji and Zhi-Hong Deng. (n.d). Mining Frequent Ordered Patterns without Candidate Generation. IEEE. 0 (0), P1-5. [24] Hai-Tao He and Shi-Ling Zhang. (2007). A New method for Incremental Updating Frequent patterns mining. IEEE. 0 (0), p1-4. [25] Carson Kai-Sang Leung∗ Christopher L. Carmichael and Boyu Hao. (2007). Efficient Mining of Frequent Patterns from Uncertain Data. IEEE. 0 (0), p489-494. [26] Shariq Bashir, Zahid Halim, A. Rauf Baig. (2008). Mining Fault Tolerant Frequent Patterns using Pattern Growth Approach. IEEE. 0 (0), p172-179. [27] Sunil Joshi and Dr. R. C. Jain. (2010). A Dynamic Approach for Frequent Pattern Mining Using Transposition of Database. IEEE. 0 (0), p498-501. [28] Thanh-Trung Nguyen. (2010). An Improved Algorithm for Frequent Patterns Mining Problem. IEEE. 0 (0), p503-507. [29] Xiaoyong Lin and Qunxiong Zhu. (2010). Share-Inherit: A novel approach for mining frequent patterns. IEEE. 0 (0), p2712-2717. [30] Markus Brameier and Wolfgang Banzhaf. (2001). A Comparison of Linear Genetic Programming and Neural Networks in Medical Data Mining. IEEE.p1-10. [31] Doron Shalvi and Nicholas DeClaris., (n.d). An Unsupervised Neural Network Approach to Medical Data Mining Techniques. IEEE. 0 (0), p1-6. [32] Adepele Olukunle and Sylvanus Ehikioya, (n.d). A Fast Algorithm for Mining Association Rules in Medical Image Data. IEEE. p1-7. [33] Cindy L. Bethel and Lawrence O. Hall and Dmitry Goldgof (n.d). Mining for Implications in Medical Data. IEEE. p1-4. [34] Jeong-Yon Shim, Lei Xu (n.d). MEDICAL DATA MINING MODEL FOR ORIENTAL MEDICINE VIA BYY BINARY INDEPENDENT FACTOR ANALYSIS. IEEE. p1-4. [35] Jenn-Lung Su, Guo-Zhen Wu, I-Pin Chao (2001). THE APPROACH OF DATA MINING METHODS FOR MEDICAL DATABASE. IEEE. p1-3. [36] Safwan Mahmud Khan Md. Rafiqul Islam Morshed U. (n.d). Medical Image Classification Using an Efficient Data Mining Technique. IEEE, p1-6. [37] Yanwei Xing, Jie Wang and Zhihong Zhao (2007). Combination data mining methods with new medical data to predicting outcome of Coronary Heart Disease. IEEE. p1-5. [38] Tsang-Hsiang Cheng, Chih-Ping Wei, Vincent S. Tseng (n.d). Feature Selection for Medical Data Mining: Comparisons of Expert Judgment and Automatic Approaches . IEEE. p1-6. [39] Mohammad Saraee, George Koundourakis, Babis Theodoulidis. (n.d). EASYMINER: DATA MINING IN MEDICAL DATABASES. IEEE. p1-3. [40] SAM CHAO, FAI WONG, “AN INCREMENTAL DECISION TREE LEARNING METHODOLOGYREGARDING ATTRIBUTES IN MEDICAL DATA MINING”. Proceedings of the Eighth Int J Nano Corr Sci and Engg 3(3) (2016) 85-90 Editors: Dr S Rajendran, Dr M Pandiarajan, and A. Daniel Das 89 M. Chandamona et al., [41] [42] [43] [44] [45] International Conference on Machine Learning and Cybernetics, Baoding, 12-15 July 2009. My Chau Tu AND Dongil Shin (2009). A Comparative Study of Medical Data Classification Methods Based on Decision Tree and Bagging Algorithms. IEEE. p1-5. Vili Podgorelec, Marjan HerikoMaribor, (n.d). Improving Mining of Medical Data by Outliers Prediction. IEEE. p1-6. Shuyan Wang Mingquan Zhou Guohua Geng (n.d). Application of Fuzzy Cluster Analysis for Medical Image Data Mining. IEEE. p1-6. Asha Gowda Karegowda M.A.Jayaram (2009). Cascading GA & CFS for Feature Subset selection in Medical Data Mining. IEEE. p1-4. Graduate Institute of Applied Information Sciences (2009). MEDICAL DATA MINING USING BGA AND RGA FOR WEIGHTING OF FEATURES IN FUZZY K-NN CLASSIFICATION. IEEE. p1-6. [46] Weimin Xue, Yanan Sun, Yuchang Lu (n.d). Research and Application of Data Mining in Traditional Chinese Medical Clinic Diagnosis. IEEE.p1-4. [47] Ranjit Abraham, Jay B.Simha, Iyengar (n.d). A comparative analysis of discretization methods for Medical Datamining with Naïve Bayesian classifier. IEEE. p1-2. [48] Syed Zahid Hassan and Brijesh Verma,(n.d). A Hybrid Data Mining Approach for Knowledge Extraction and Classification in Medical Databases. IEEE. p1-6. [49] Michele Berlingerio (n.d). Mining Clinical Data with a Temporal Dimension: a Case Study. IEEE. p1-8. [50] Wojciech Froelich, Alicja Wakulicz-Deja (2009). Mining Temporal Medical Data Using Adaptive Fuzzy Cognitive Maps. IEEE. P1-8. [51] Michael Barnathan, Jingjing Zhang, Vasileios (n.d). A WEBACCESSIBLE FRAMEWORK FOR THE AUTOMATED STORAGE AND TEXTURE ANALYSIS OF BIOMEDICAL IMAGES. IEEE. p1-3. Int J Nano Corr Sci and Engg 3(3) (2016) 85-90 Editors: Dr S Rajendran, Dr M Pandiarajan, and A. Daniel Das 90