International Academic Journal of Science and Engineering Vol. 3, No. 5, 2014, pp. 11-22.
International Academic Journal of Science and Engineering
ISSN 2454-3896
www.iaiest.com
International Academic Institute for Science and Technology

Predicting Diabetes Symptoms by Means of Data Mining Techniques: Study Conducted in Kermanshah – Iran

Ehsan Pashaee (a, b), Abdullah ChaleChale (c)

a MSc Student of computers, Department of computers, College of Engineering, Kermanshah Branch, Islamic Azad University, Kermanshah, Iran.
b Department of computers, College of Engineering, Kermanshah Science and Research Branch, Islamic Azad University, Kermanshah, Iran.
c Assistant Professor of computers Department, Department of computers, College of Engineering, Kermanshah Branch, Islamic Azad University, Kermanshah, Iran.

Abstract
In the last 10 years, the prevalence of diabetes has doubled worldwide. The disease affects 200 million people today, and this number increases by 6% each year. In this study, we analyzed the relationship between the observed symptoms of diabetic patients and their personal characteristics, such as glucose level, age, weight and specific complications. The data were collected from one of Kermanshah's hospitals and cover 538 diabetic patients. We used the NaiveBayes, BayesNet, J48, Random Forest, Random Tree and lazy.IBk algorithms, together with association rules, to predict symptoms of diabetes. Diabetes is a gateway disease for heart, coronary and kidney diseases. Our results showed that patients under treatment who had their blood sugar and HbA1C checked regularly, never missed a dose of their medication and took up exercise improved their condition significantly.

Keywords: Diabetes, Data mining, Disease Symptoms
Introduction:
In the twenty years after 1950, when computers were first used for data recording and analysis, the amount of recorded data doubled; with the progress of technology, the amount of data recorded in databases now doubles every two years. The amount of recorded data is therefore increasing rapidly (Thearling, 2014). Most datasets are huge and of little use on their own; lately, however, data mining techniques have been gaining popularity in the medical field. Medical environments are rich in information and in need of knowledge, so data mining techniques are suitable tools for knowledge discovery. The rapid growth of medical databases in developed countries motivates medical researchers to use these techniques to extract knowledge from them. Kermanshah University of Medical Sciences currently uses data mining algorithms to extract previously unknown patterns from huge amounts of data (Parsaei and Salehi, 2015). By building a relational database model, the university was able to design a decision support system that facilitates the decision making process for doctors.

Diabetes mellitus, generally known as diabetes, is a metabolic disease in which either the body produces too little insulin or its insulin does not function properly (insulin resistance); in both cases the amount of glucose in the blood rises. Insulin is a hormone produced by the beta cells of the pancreas, and its main function is to regulate the amount of glucose in the blood. The pancreas is a large gland located behind the stomach. High levels of glucose in the blood over a long period of time might lead to heart disease, kidney and eye diseases, and might even affect the nervous system negatively (Kermanshah Medical Science University, 2015).

Theoretical basis of research:

1. Data Mining
Data mining is known as the most important function of data warehouses.
Data mining is the analysis of data in order to extract probable trends and previously unknown, interesting patterns from them. The data mining process uses complex mathematical and statistical algorithms to turn data into knowledge for the organization (Nourouzi & Taefie Hamrah, 2012). Today, data mining techniques are mostly used in banks, industrial centers, large factories, treatment centers, hospitals, educational and research centers, intelligent marketing and so on. Data mining is a bridge between statistics, computer science, artificial intelligence, machine learning, pattern discovery and the visual representation of data (Parsaei et al. 2016). It is a complex process for identifying previously unknown and interesting patterns and models in huge amounts of data, in a way that is understandable to the human mind. Data mining is not a product to purchase but rather a scientific field and a process that needs to be implemented (Parsaei et al. 2016).

Prediction is the main goal of data mining. To be more precise, “The main goal of data mining is to analyze a dataset in order to extract new, useful, interesting and understandable patterns that are impossible to identify with conventional processes from them.” Data mining involves various algorithms and techniques, such as classification, clustering, regression analysis, artificial intelligence, decision trees and genetic algorithms, which are used in knowledge discovery processes. The following section describes each of these techniques briefly in order to clarify their functions (Nourouzi & Taefie Hamrah, 2012).

1.1 Data Mining Techniques
Data mining techniques consist of a set of different techniques and tools that are used for reaching different goals.
These techniques and models are based on artificial intelligence, machine learning, statistics, neural networks, pattern discovery, knowledge-based systems, knowledge extraction, data retrieval, high-speed computing and visual data representation. The role of data mining is particularly pronounced in disease diagnosis and health care improvement when dealing with huge amounts of data (Wasan et al. 2006). The main function of the algorithms used in the data mining process is to discover and present the model that most closely fits the characteristics of the targeted data. These models can be either predictive or descriptive.

Predictive models are suited to diagnosing specific diseases. Sometimes treating a disease involves not only observing the patient's condition and history but also considering the treatment results of other patients with the same disease. In these methods, future values are predicted from the past and current values of the patient's data. The function of descriptive models is to identify patterns in a dataset. Classification, regression and time series analysis are amongst the most popular predictive methods, while clustering, association rules and visualization are considered the most popular descriptive methods (Wasan et al. 2006). The following paragraphs describe these methods briefly.

1.1.1 Classification
Classification assigns a data sample to one of a set of predefined classes. The method involves creating a set of classification rules based on the characteristics of the dataset. These rules can then be used to classify new data samples (Tahmasebi 2010). For example, classification rules learned from the observed symptoms of well-known diseases can help to diagnose new cases. Medical diagnosis is therefore the main application of classification rules.

1.1.2 Regression
Regression analysis uses a function in order to estimate the value of a variable.
This method estimates the value of an output variable based on the values of the input variables. For example, regression analysis can estimate children's height based on their age and weight.

1.1.3 Time Series Analysis
In this method, the value of a characteristic is measured at equal, regular time intervals. For example, a patient's characteristics might be recorded at hourly or daily checkups. The method can be used for predicting future values or for determining the degree of similarity between time intervals.

1.1.4 Association Rules
A close association (either negative or positive) is often observed between certain data items. Association rules describe associations between seemingly unrelated data items. For example, one set of disease symptoms might appear after another set of symptoms, or there might be a connection between patients' genetic and observable characteristics.

1.1.5 Visualization
Data visualization methods are useful tools for discovering patterns in medical datasets (Wasan et al. 2006). By showing data in graphs, tables and charts, this method facilitates their classification and clustering. It therefore improves data-based decision making, making it faster, more precise and less demanding in cognitive effort. For example, this method can reveal the subset of patients with abnormally high blood pressure in a dataset of heart disease patients, so that other data mining techniques can be applied to this subset for further knowledge extraction.

1.1.6 Clustering
Clustering partitions data items or objects into subclasses (clusters) in such a way that every cluster consists of similar items that behave as a group.
Clustering closely resembles classification, with the difference that in classification the classes are predefined, whereas in clustering the groups are formed freely and without supervision. For example, a set of new diseases can be put into several groups based on their similar symptoms, so that the common symptoms can be used to describe the diseases of each group.

1.2 Data Mining Strategies
The overall goal of data mining is to learn something from the data. Data mining strategies therefore consist of supervised and unsupervised learning methods. Supervised learning methods are used when the values of the variable to be predicted are known for the training data. Building a model that predicts erroneous insurance claims for a healthcare institution is an example of a supervised learning strategy. In this strategy, the models and characteristics are known, and the aim is to predict data and discover information. The unsupervised strategy, on the other hand, focuses on data whose target values are unknown. For example, the characteristics and error models of insurance claims would be unknown in an unsupervised learning strategy, but the patterns and clusters produced by data mining would lead to new discoveries (Obenshain 2004).

Research Background

2. Data Mining Applications in Healthcare Sector
For the healthcare industry, the challenging question has always been the same: “How could healthcare organizations reduce their costs without sacrificing their service quality, and still stay competitive?” Finding the right answer to this question might transform the industry dramatically. Quality improvement in the healthcare industry might be better defined through the driving forces that have a great impact on it. Healthcare data are amongst those driving forces; in other words, data are the heart of every patient-based quality improvement program (Rogers and Joyner, 1997).
In our modern age, which is dubbed the information age, data are the greatest assets of healthcare organizations; as a matter of fact, data collection, recording and analysis are the key factors for the success of those organizations (Moghaddassi et al. 2012). However, collecting a large amount of data can be wasteful unless the data are used in a way that turns them into a financial resource for the organization. To reach this objective, organizations use data mining techniques to turn these potentially valuable data into strategic information (Rogers and Joyner, 1997). Data mining gives these organizations the possibility to discover relationships, trends and previously unknown patterns; this leads to knowledge extraction, which helps organizations to deal with their obvious and hidden challenges. The healthcare industry produces large amounts of data continuously, and those who deal with these data are fully aware of the huge gap between collecting data and analyzing them. The relatively young field of data mining can therefore help this industry to analyze its data effectively, in order to advance medical research and scientific decision making in disease diagnosis and treatment (Moghaddassi et al. 2012). Data mining slowly but effectively solves several problems and discovers knowledge in the healthcare sector.

3. Examples of Data Mining Applications in Healthcare Sector

3.1 Data Mining Applications in Non-invasive Diagnosis
Some laboratory and diagnostic procedures are invasive, costly and rather painful. Pap smear tests, which are screening procedures for cervical cancer, are an example of such costly and painful procedures. Thangavel et al. analyzed a group of cervical cancer patients by means of clustering algorithms and found new preventive results (Moghaddassi et al. 2012).
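To make the idea of clustering patients (Section 1.1.6) concrete, the following is a minimal sketch of k-means, one common clustering algorithm. It is not the method used by Thangavel et al., whose algorithm is not specified here. The function name kmeans, the two features (age and fasting glucose) and all patient values are hypothetical and chosen only for illustration.

```python
import math

def kmeans(points, centroids, iterations=10):
    """Plain k-means: assign each point to its nearest centroid, then move
    each centroid to the mean of the points assigned to it."""
    labels = []
    for _ in range(iterations):
        # Assignment step: index of the nearest centroid for every point.
        labels = [
            min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: recompute each centroid as the mean of its members.
        new_centroids = []
        for c in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if not members:
                # An empty cluster keeps its previous centroid.
                new_centroids.append(centroids[c])
                continue
            new_centroids.append(
                tuple(sum(dim) / len(members) for dim in zip(*members))
            )
        if new_centroids == centroids:
            break  # converged: centroids no longer move
        centroids = new_centroids
    return labels, centroids

# Hypothetical patients: (age in years, fasting glucose in mg/dl). Made-up values.
patients = [(25, 90), (30, 95), (28, 100), (62, 180), (70, 200), (66, 190)]
labels, centers = kmeans(patients, centroids=[patients[0], patients[3]])
print(labels)   # cluster index of each patient
print(centers)  # final cluster centers
```

No class labels are supplied to the function; the two groups emerge only from the distances between records, which is the sense in which clustering is unsupervised. In practice the features would usually be rescaled first so that no single attribute dominates the distance.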
3.2 Applications of Data Mining in Determining the Type of Treatment
Applying data mining techniques to medical data has had crucial results. These techniques can determine the type of treatment for certain ailments and consequently save many lives. For example, doctors at Shahid Hashemi Nejad Hospital in Tehran use data mining techniques to treat patients with urethral stones. The hospital follows a tree algorithm, based on which doctors choose the most effective treatment. The accuracy of this tree algorithm was measured at 77%, which is higher than that of conventional algorithms and doctors' estimations.

3.3 Applications of Data Mining in Identifying Medication Side Effects
Some drugs that are thought to be harmless prove otherwise in the long run. The Food and Drug Administration of the United States (FDA) applies data mining techniques to its database in order to extract knowledge about the side effects of various medications. MGPS is the name of the algorithm used by this agency. This algorithm, which has a 67% success rate, identifies medication side effects 5 years earlier than conventional methods.

3.4 Applications of Data Mining in Electronic Health Records
Several studies have shown the effectiveness of data mining techniques in identifying interesting patterns in electronic health records. Since computerized health records contain huge amounts of diagnosis, treatment, lab and medication data, they are considered potential sources of valuable knowledge. Discovering useful information in huge datasets manually is not impossible, but it is rather difficult; data mining techniques are therefore the best solution for this particular challenge.
3.5 Applications of Data Mining in Hospital Rankings
The ranking of hospitals and health programs is based on data reported by service providers; standard reporting is thus a crucial necessity for a meaningful comparison. Data mining techniques are amongst the methods used for standardizing such reports. For example, if the ICD (International Statistical Classification of Diseases and Related Health Problems) codes on patients' health records are combined with data mining techniques such as clustering or association rules, this might result in reports that are consistent with the real rates of death and disease and with the other quality criteria used in hospital rankings (Moghaddassi et al. 2012).

3.6 Applications of Data Mining in Utilization of Healthcare Services
It is through data mining techniques that hospitals identify key variables for predicting service utilization, improve quality outcomes, predict the future behavior of patients and improve their treatment services. Data mining and modeling techniques are also good tools for identifying patients whose condition might be of high risk. By providing information to service providers, data mining techniques help them to identify high-risk patients so that they can take the necessary measures to improve their condition. These techniques can help to design adequate interventions, which might reduce the number of hospital admissions. For example, predictive modeling techniques are used in diabetes management, where they increase the quality and reduce the costs of care for people who suffer from this disease (Moghaddassi et al. 2012).

4. Diabetes
The prevalence of diabetes is between 4% and 12% worldwide. Iran's Statistics Center of Disease Control estimates the rate in Iran to be 8.5%. Due to the prevalence of obesity and inactivity, this figure is expected to rise to 37.5% by the year 2025.
Diabetes is a metabolic disease in which either the body produces too little insulin or its insulin does not function properly (insulin resistance); in both cases the amount of glucose in the blood rises. Diabetes, or hyperglycemia (high blood sugar), is one of the main diseases of the body's endocrine glands. Every year, more people are diagnosed with diabetes and have to deal with its consequences. In this condition, glucose in the blood rises for various reasons. Regular blood work to measure glucose levels in the blood is therefore of utmost importance, both for diagnosing the disease and for checking the effectiveness of its treatment. Different laboratory methods and certain criteria are used for this purpose. Insulin is a hormone produced by the beta cells of the pancreas, and its main function is to regulate the amount of glucose in the blood. The pancreas is a large gland located behind the stomach. High levels of glucose in the blood over a long period of time might lead to heart and coronary disease, kidney and eye diseases, and might even affect the nervous system negatively (Soltani & Khalili 2010).

4.1 Type 1 Diabetes
Type 1 diabetes, which is more common in children and young people, is a chronic condition in which the body produces little or no insulin. As summarized in Table 1, treatment consists of insulin therapy together with healthy eating and regular exercise. Even though the disease is more common in children and young people, every age group can fall victim to it. Type 1 diabetes is an autoimmune disease (one in which the body's immune system attacks and destroys healthy body cells) for which there is no absolute cure. Before the discovery of insulin in 1921, people diagnosed with this disease would die shortly after their diagnosis. The discovery of insulin changed everything and saved many lives (Qolamali Zadeh, 2010).
4.2 Type 2 Diabetes
Type 2 diabetes affects the way the body uses insulin. Insulin helps the cells of the body to take in glucose to be used for energy. Unlike type 1 diabetes, in which the body stops producing insulin, type 2 diabetes is a non-insulin-dependent form of the disease in which there is a mismatch between the body's increasing need for insulin and its ability to produce it. The main causes of the increasing need for insulin are extra weight and inactivity. The most effective treatments therefore involve healthy eating and regular exercise. Certain medications can also stimulate insulin production in the body or improve its effectiveness (Qolamali Zadeh, 2010). Table 1 compares the two types of diabetes (Soltani & Khalili 2010).

Table 1: A comparison of type 1 and type 2 diabetes

Characteristic | Type 1 | Type 2
Other names | Insulin-dependent diabetes, adolescence diabetes | Non-insulin-dependent diabetes
Share of diabetic patients | 5-10% | 90%
Age of occurrence | Generally younger than 30, older than 12-14 | Generally older than 40
Family history | Generally does not exist | Generally exists
Obesity | Not prevalent | Prevalent (60-90%)
History of ketoacidosis | Often exists | Rarely exists
Clinical manifestations | Moderate to severe symptoms including increased thirst and appetite, frequent urination, fatigue, weight loss, ketoacidosis | Mild frequent urination, fatigue (often diagnosed accidentally)
Treatment | Insulin therapy, healthy eating, regular exercise | Insulin therapy, healthy eating, regular exercise, medication
Risk Factors for Developing Type 2 Diabetes
- Being overweight (BMI ≥ 25 kg/m²)
- Family history
- Inactivity
- Pre-diabetes (a condition in which the glucose level is higher than normal but not high enough to be classified as diabetes)
- History of polycystic ovarian syndrome
- Conditions that provoke insulin resistance (obesity)
- High blood pressure
- History of heart and coronary disease

Figure 1 summarizes the above factors (Soltani & Khalili, 2010).

[Figure 1. Labels in the figure: environmental factors (age, obesity, inactivity); genetic factors; insulin resistance; increased insulin production; role of the pancreas's beta cells; other complications and diseases; heart and vessel disease; high blood pressure; stroke; lipid disorders; impaired glucose tolerance, impaired fasting glucose, type 2 diabetes.]

4.3 Diabetes and Pregnancy
Gestational diabetes, also known as diabetes during pregnancy, is a form of hyperglycemia (high blood sugar) which only affects pregnant women. Between 3 and 8 out of every 100 pregnant women in the US are diagnosed with this condition. Diabetes itself is a condition in which there is too much glucose in the blood. Although the body uses glucose for energy, too much of it can lead to serious health problems. In a woman diagnosed with gestational diabetes, high blood sugar might be harmful to the fetus (Soltani & Khalili, 2010).

5. Complications of Diabetes

High Blood Pressure (Hypertension): 70% of diabetic patients have a blood pressure higher than 130/80 mmHg. This condition does not generally affect those who are diagnosed with type 1 diabetes and show no symptoms of diabetic nephropathy. However, developing diabetic nephropathy and microalbuminuria will increase blood pressure in those patients (Soltani & Khalili, 2010).

Diabetic Nephropathy: Diabetic nephropathy is a condition characterized by nephrotic syndrome.
In fact, diabetic nephropathy is damage to the kidneys caused by diabetes. Diabetic nephropathy is the number one cause of death in patients diagnosed with type 1 diabetes, and it is becoming more common in people with type 2 diabetes as well (Soltani & Khalili, 2010).

Heart and Coronary Diseases: Heart and coronary diseases are the main cause of death for people diagnosed with type 2 diabetes (approximately 50%).

Lipid Disorder: Lipid disorder is a condition in which there are high levels of fat-like substances in the blood. It is therefore necessary for diabetic patients to monitor their lipid profile regularly. PBS criteria for the lipid profile are:
- TC < 200 mg/dl
- TG < 150 mg/dl
- LDL < 100 mg/dl

Retinopathy: Retinopathy is usually diagnosed 3 years after the diabetes diagnosis. 90% of type 1 diabetic patients have developed this condition 15 years after their diabetes diagnosis.

Autonomic Neuropathy: Autonomic neuropathy refers to a group of symptoms that occur when there is damage to the nerves that control involuntary body functions. These symptoms include abdominal bloating, nausea, urinary problems, impotence in men, postural drops in blood pressure, heart rate problems, diarrhea and uncontrolled bowel emptying.

Peripheral Neuropathy: Peripheral neuropathy affects 60-70% of diabetic patients. Its symptoms include burning pain, a tingling sensation, numbness, weakness and slowed sensory nerve conduction (Soltani & Khalili 2010).

Research methodology

6. Chosen Characteristics for Prediction and Diagnosis
The following characteristics are amongst the most important factors for diagnosis and prediction. They are available in patients' health records, and we analyzed them in our study:
- Gender: male or female.
- Age: older people are more likely to develop the disease.
- Weight range: obesity is one of the main risk factors for diabetes.
- Duration of hospitalization: the number of days a patient remains hospitalized.
- Complications and other diseases: for example, a patient might check in to a hospital with an injured foot (ulcers or infections) which later turns out to be a complication of diabetes. In general, diseases are classified into the following 9 groups: blood circulatory, respiratory, digestive, diabetes, injuries, musculoskeletal, genitourinary, neoplasm (cancer) and other (see Table 2).
- HbA1c test: the hemoglobin A1C (HbA1c) test measures glycated hemoglobin in the blood and shows the average blood sugar level over the last 2 or 3 months.
- Blood glucose test: determines the amount of glucose in the blood.
- Medications: the medications prescribed by a doctor.
- Recurrent hospitalization: has the patient been hospitalized repeatedly?
- City of residence: does the patient live in Kermanshah or elsewhere?
- Improvements/Changes: how does the patient respond to medication?

Table 2: Classification of early signs and symptoms of hospitalized patients

Group name | Description | ICD-9 codes
Blood Circulatory | Diseases of the Circulatory System | 390–459, 785
Respiratory | Diseases of the Respiratory System | 460–519, 786
Digestive | Diseases of the Digestive System | 520–579, 787
Diabetes | Hyperglycemia | 250.xx
Injuries | Injury and Poisoning | 800–999
Musculoskeletal | Diseases of the Musculoskeletal System and Connective Tissue | 710–739
Genitourinary | Diseases of the Genitourinary System | 580–629, 788
Neoplasm | Neoplasm (Cancer) | 140–239
Other | Symptoms, Signs and Ill-defined Conditions | 780, 781, 784, 790–799
Other | Endocrine, Nutritional and Metabolic Diseases and Immunity Disorders | 240–279, without 250
Other | Diseases of the Skin and Subcutaneous Tissue | 680–709, 782
Other | Infectious and Parasitic Diseases | 001–139
Other | Mental Disorders | 290–319
Other | External Causes of Injury and Supplemental Classification | E–V
Other | Diseases of the Blood and Blood Forming Organs | 280–289
Other | Diseases of the Nervous System | 320–359
Other | Complications of Pregnancy, Childbirth and the Puerperium | 630–679
Other | Diseases of the Sense Organs | 360–389
Other | Congenital Anomalies | 740–759

Research Findings

7. Quantitative Algorithm Analysis for Diabetes
We applied different data mining techniques in order to rank the algorithms for diabetes prediction. Table 3 compares the algorithms; the best are those with the highest share of correct predictions and the lowest mean squared error (MSE).

Table 3: Quantitative algorithm analysis for diabetes disease

Algorithm name | Correctly classified instances | Incorrectly classified instances
NaiveBayes | 80.292% | 19.708%
BayesNet | 80.8394% | 19.1606%
trees.J48 | 91.2409% | 8.7591%
RandomForest | 99.635% | 0.365%
RandomTree | 99.0876% | 0.9124%
lazy.IBk, K = 2 | 88.3212% | 11.6788%

After running the classification algorithms and extracting their results, we chose trees.RandomForest as the best algorithm. It has the highest percentage of correct predictions of all the data mining algorithms compared.

7.1 Algorithm of Association Rules
We used the Apriori algorithm to obtain association rules. Table 4 summarizes the strongest associations determined by these rules.

Table 4: The results of the association rules algorithm for diabetes disease

Complications/Symptoms | Weight | Age | Gender
Blood Circulatory, Diabetes and Other Groups | 75 to 100 kg | 51 to 91 years old | Male
Blood Circulatory and Other Groups | 50 to 75 kg | 81 to 91 years old | Female

Conclusions:
According to our results, women older than 80 who weigh more than 75 kg and men older than 70 who weigh more than 75 kg are at high risk of developing blood circulatory and diabetic disorders.
We found that this age group would usually seek the help of a general practitioner to treat their complications and symptoms; the highest rate of insulin use was also found in this age group. To be more specific, we can conclude that men are at higher risk of developing diabetes than women. We determined the main causes of diabetes to be either inactivity due to old age or extra weight. Finally, we conclude that failure to seek treatment for diabetes might worsen the condition and may lead to heart, coronary or kidney diseases.

References:
Tahmasebi, H. (2010). An Evaluation of Classification Techniques Efficiency for Medical Data. 14th National Conference of Data Mining.
Parsaei, M. R., Taheri, R., & Javidan, R. (2016). Perusing The Effect of Discretization of Data on Accuracy of Predicting Naïve Bayes Algorithm. Journal of Current Research in Science, (1), 457-462.
Parsaei, M. R., & Salehi, M. (2015, November). E-mail spam detection based on part of speech tagging. In 2015 2nd International Conference on Knowledge-Based Engineering and Innovation (KBEI) (pp. 1010-1013). IEEE.
Parsaei, M. R., Javidan, R., & Sobouti, M. J. (2016). Optimization of Fuzzy Rules for Online Fraud Detection with the Use of Developed Genetic Algorithm and Fuzzy Operators. Asian Journal of Information Technology, 15(11), 1856-1864.
Nourouzi, F., & Taefie Hamrah, A. (2012). A Closer Look at Data Mining and Neural Network Analysis.
Soltani, R., & Khalili, H. (2010). Medication Therapy for Diseases of the Endocrine Glands and Women. First Edition, Tehran, Arjmand Publications.
Rogers, G., & Joyner, E. (1997). Mining your data for health care quality improvement. In SAS User Group International Conference (pp. 641-647).
Thearling. An Introduction to Data Mining, Research Page. [Online]. Available: http://www.thearling.com/text/dmwhite/dmwhite.htm [Accessed: 24 Jul. 2014]
Moghaddassi, H., Hoseini, A., Asadi, F., & Jahanbakhsh, M. (2012). Application of Data Mining. Director General 9, no. 2: 304.
Wasan, S. K., Bhatnagar, V., & Kaur, H. (2006). The impact of data mining techniques on medical diagnostics. Data Science Journal, 5, 119-126.
Obenshain, M. K. (2004). Application of data mining techniques to healthcare data. Infection Control, 25(08), 690-695.
What is Diabetes? [Online]. Available: http://mardomsalari.com/template1/Article.aspx?AID=5960#40692. [Accessed: 22 Jun. 2014]
What is Diabetes, Medical Sciences University of Kermanshah. [Online]. Available: http://kermanshah.selection.behdasht.gov.ir/index.aspx?siteid=297&pageid=51778. [Accessed: 11 Nov. 2015]