MINING FOR KNOWLEDGE TO BUILD DECISION SUPPORT SYSTEM FOR DIAGNOSIS AND TREATMENT OF TINNITUS
Pamela L. Thompson & Zbigniew W. Ras
University of North Carolina at Charlotte, College of Computing and Informatics
10/21/2011
Research partially supported by the Project ME913 of the Ministry of Education, Youth, and Sports of the Czech Republic

Topics
Introduction
Methodology
◦ Domain Knowledge
◦ Data Collection
◦ Data Preparation
New Feature Construction
Advanced Clustering Techniques for Temporal Feature Extraction
Mining the Data: Unclustered and Clustered Data
Action Rules
Contributions
Future Research
Questions

Introduction
Neil Young, Barbra Streisand, Pete Townshend, William Shatner, David Letterman, Paul Shaffer, Steve Martin, Ronald Reagan, Neve Campbell, Jeff Beck, Burt Reynolds, Sting, Eric Clapton, Thomas Edison, Peter Jennings, Dwight D. Eisenhower, Cher, Phil Collins, Vincent Van Gogh, Ludwig Van Beethoven, Charles Darwin, . . .

OUR APPROACH: We are interested in the application of data mining and action rule discovery to the TRT patient databases.
THE RESEARCH QUESTION: Can data mining and action rule discovery help us understand the relationships among the treatment factors, measurements and patient emotions, in order to better understand tinnitus treatment and gain new knowledge for predicting treatment success?
THE KNOWLEDGE GAINED will provide the design foundations of a decision support system that helps improve the effectiveness of tinnitus treatment with TRT.
CONTRIBUTIONS:
1) A new knowledge discovery approach that can be used to build a decision support system for tinnitus treatment
2) New temporal, emotional and text features related to tinnitus evaluation and treatment, along with an evaluation of their contribution to learning the tinnitus problem
3) A new clustering approach for grouping similar visit sequences of tinnitus patients
4) The first application of Action Rule Discovery to the tinnitus problem, including the application of LISp-Miner and a new frequent-sets-based action rule generator (MARDs)
5) The first application and evaluation of new emotion-centered temporal features integrated with the emotion-valence plane used in music emotion classification research

Methodology: Domain Knowledge
TRT includes
DIAGNOSIS
◦ Preliminary medical examination
◦ Completion of initial interview questionnaire
◦ Audiological testing
TREATMENT
◦ Counseling
◦ Sound Habituation Therapy
◦ Exposure to a different stimulus to reduce emotional reaction
◦ Visit questionnaire (THI)
◦ Secondary questionnaire (TFI) in the new dataset
◦ Instrument tracking (instruments can be table top or in ear, from different manufacturers)
◦ Continued audiological tests

Tinnitus Retraining Therapy
◦ Neurophysiological Model
◦ Focuses on physiological aspect of nervous system function
TRT “cures” tinnitus by
◦ Working with the association between
  Limbic nervous system (fear, thirst, hunger, joy, happiness)
  Autonomic nervous system (breathing, heart rate)
◦ Involvement of limbic nervous system worsens tinnitus symptoms

[Figure: Development of a Vicious Cycle. Diagram labels: Auditory and Other Cortical Areas, Perception and Evaluation (Consciousness, Memory, Attention); Auditory Subconscious, Detection/Processing; Auditory Periphery, Source; Limbic System, Emotions; Autonomic Nervous System, Reactions]

Methodology: Database Features
Original Dataset
◦ 555 patients
◦ Relational
◦ 11 tables
New Dataset
◦ 758 patients
◦ Relational
◦ Secondary questionnaire (TFI) answers are added to the new dataset
TFI - Tinnitus Functional Index

Initial Interview form provides basis for Patient/Doctor Treatment Category 0 to 4 (stored in Questionnaires tables)
0 – low tinnitus only: counseling
1 – high tinnitus: sound generators set at mixing point
2 – high tinnitus w/hearing loss (subjective): hearing aid
3 – hyperacusis: sound generators set above threshold of hearing
4 – persistent hyperacusis: sound generators set at the threshold; very slow increase of sound level
The category varies as treatment progresses and is stored as C (first) and CC (last).

Tinnitus Handicap Inventory
◦ Questionnaire, forms Neumann-Q Table
◦ Function, Emotion, Catastrophic Scores
◦ Total Score (sum)
◦ THI severity by total score:
  0 to 16: slight
  18 to 36: mild
  38 to 56: moderate
  58 to 76: severe
  78 to 100: catastrophic

Tinnitus Functional Index
◦ In the new dataset but only for some patients
◦ Cognitive and emotional questions
◦ Scale of 0 to 10 and some %
◦ Includes questions related to
  Anxious/worried
  Bothered/upset
  Depressed

Methodology: ETL
Audiological Features
◦ Standard Deviation of Audiological Testing related to LDLs
◦ LDL (Loudness Discomfort Level) - measure of decreased sound tolerance as indicated by
  Hyperacusis (discomfort to sound)
  Misophonia (dislike of sound)
  Phonophobia (fear of sound)

THI - Tinnitus Handicap Inventory
Discretization of attributes
◦ Mainly based on domain/expert knowledge
◦ T score is discretized in order to form the decision attribute: a (good) to e (bad), and other variations

Data Preparation for Mining
Work with:
1) Missing values (sparse data)
2) Problems with primary keys
3) Temporal information: related to visits, needs to be tied to PATIENT for some mining operations

Data Transformation – ORIGINAL DATABASE
◦ Flattened file in original database: one tuple per patient with additional features added
  ◦ Patterns
  ◦ Text
  ◦ Statistical
  ◦ Temporal
  ◦ Decision Feature – discretized THI total score
◦ Clustered patient databases (by similar visit patterns) with new additional features: coefficients, angles
Data Transformation – NEW DATABASE
◦ Clustered patient records (by similar visit patterns)
◦ Boolean decision features plus TFI [Tinnitus Functional Index] features added (features in new dataset)

New Feature Construction: Categorical Features
Feature Development for Categorical Data
◦ Treatment Category and Instruments
◦ MFP – Most Frequent Pattern (Value)
◦ FP/LP – First Pattern, Last Pattern (Value)
◦ Used for:
  Instrument
  Treatment Category
  Tinnitus Problem

New Feature Construction: Text Features
Text Mining
◦ Text fields
  Demographic, Miscellaneous, Medication tables
  Categories may show cause of tinnitus for patient: Stress, Noise, Medical
New Boolean Features Stress, Noise, and Medical, Based on Text Mining of Terms
  Stress: stress, depression, emotion, work, marriage, wedding
  Noise: accident, noise, concert, loud, music, shooting, blast
  Medical: surgery, infection, medicine, depression, hospital

Methodology: ETL
Statistical
◦ From Audiological Features over visits
  Standard Deviation
  Average

New Feature Construction: Temporal Features
Temporal Feature Development and Extraction
◦ Extract features that describe the situation of the patient based on the behavior of attributes over time
◦ Temporal patterns may express the treatment process better than static features
◦ New temporal features: sound level centroid, sound level spread, recovery rate

New Temporal Features
◦ Sound Level Centroid
  T - total number of visits per patient (3 in this example)
  V - sound level feature (e.g. LDL measurement) measured at each visit, with values V(1), V(2), V(3)
\[ C = \frac{\tfrac{1}{3} V(1) + \tfrac{2}{3} V(2) + \tfrac{3}{3} V(3)}{V(1) + V(2) + V(3)} \]

◦ Sound Level Spread
\[ S = \sqrt{\frac{V(1)\left(\tfrac{1}{3} - C\right)^2 + V(2)\left(\tfrac{2}{3} - C\right)^2 + V(3)\left(\tfrac{3}{3} - C\right)^2}{V(1) + V(2) + V(3)}} \]
  C - sound level centroid; V - sound level feature; T - number of visits.

◦ Recovery Rate
\[ R = \frac{V_0 - V_k}{T_k - T_0}, \qquad V_k = \min\{V_i\},\ i \in [0, N] \]
  V = Total Score from THI
  V_0 = first score (should be less than V_k)
  V_k is the best (minimum) score in the vector
  T_k is the date of the best score

New Feature Construction: Decision Feature
Creation of 8 new decision attributes based on different discretizations of the Total Score from the Tinnitus Handicap Inventory.
Total Score Difference | Discretization (score a represents the highest T Score in all cases)
TSa | a = {s: s > 0}, b = {0}, c = {s: s < 0}
TSb | a = {s: s > 30}, b = {s: 10 < s ≤ 30}, c = {s: -10 < s ≤ 10}, d = {s: -40 < s ≤ -10}, e = remaining scores
TSc | a = {s: s > 28}, b = {s: 0 < s ≤ 28}, c = {s: -1 < s ≤ 0}, d = {s: -15 < s ≤ -1}, e = remaining scores
TSd | a = {s: s > 40}, b = {s: 10 < s ≤ 40}, c = {s: -10 < s ≤ 10}, d = {s: -40 < s ≤ -10}, e = remaining scores
TSe | a = {s: s > 50}, b = {s: 0 < s ≤ 50}, c = {s: -50 < s ≤ 0}, d = remaining scores
TSf | a = {s: s > 28}, b = {s: 0 < s ≤ 28}, c = {s: -12 < s ≤ 0}, d = remaining scores
TSg | a = {s: s > 80}, b = {s: 60 < s ≤ 80}, c = {s: 40 < s ≤ 60}, d = {s: 20 < s ≤ 40}, e = {s: 0 < s ≤ 20}, f = {s: -20 < s ≤ 0}, g = {s: -40 < s ≤ -20}, h = {s: -60 < s ≤ -40}, i = remaining scores
TSh | a = {s: s > 10}, b = {s: -10 ≤ s ≤ 10}, c = remaining scores

Data Mining: Unclustered Data
Initial Experiments and Results
◦ WEKA
◦ J48 (C4.5 Decision Tree Learner)
◦ Experiment 1: 253 patients, 126 attributes. Investigate treatment factors and recovery
◦ Experiment 2: 229 patients, 16 attributes. Investigate audiological features and recovery

In Search for Optimal Classifiers
◦ WEKA
◦ J48 (C4.5 Decision Tree Learner)
◦ Random Forest
◦ Multilayer Perceptron

Initial Experiments and Results
Experiment #1:
◦ (Category of treatment = C1) ∧ (R50 > 12.5) ∧ (R3 <= 15) ==> improvement is neutral
  The support of the rule is 10; the accuracy is 90.9%. If the treatment category chosen by the patient is C1, the R50 parameter is above 12.5, and the average of R3 is less than or equal to 15, then the recovery is neutral.
◦ (Category of treatment = C2) ==> good
  The support of the rule is 44; the accuracy is 74.6%. If the category of treatment chosen by the patient is C2, then the improvement is good.
◦ (Category of treatment = C3) ∧ (Model = BTE) ==> good
  The support of the rule is 17; the accuracy is 100.0%.
Experiment #2:
◦ 40 > Lr50 > 19 ==> somehow has tinnitus all of the time
  The support of the rule is 27; the accuracy is 100.0%. If Lr50 is in the range of 19 to 40, the patient somehow has tinnitus all the time, although the tinnitus may not be a major problem.
"From Mining Tinnitus Database to Tinnitus Decision-Support System, Initial Study", P. Thompson, X. Zhang, W. Jiang, Z.W. Ras, in the Proceedings of IEEE/WIC/ACM International Conference on Intelligent Agent Technology (IAT 2007), IEEE Computer Society, San Jose, Calif., 2007, 203-206

Additional Experiments and Results
◦ Seven more experiments using 8 new decision attributes
  253 patients, variations of 126 attributes
  Goal of exploring treatment factors and recovery using discretized total score
  WEKA
◦ J48, Random Forest, Multilayer Perceptron

◦ Seven Experiments:
1) Original data with Standard Deviations and Averages from Audiological features
2) Original data with Standard Deviations, Averages, Sound level centroid and sound level spread (Sound) only
3) Original data with Standard Deviations, Averages, and Text
4) Original data with Standard Deviations, Averages, Text and Sound
5) Original data with Text
6) Original data with Sound
7) Original data with Sound, Text, and Recovery Rate

Best Results
Table 5: WEKA Results, Classifier Tree for J48
Original Data with Sound Level Centroid, Sound Level Spread, Recovery Rate
Decision Feature: TSa
Precision | Recall | F-Measure
.751 | .806 | .776
Tree:
Recovery Rate <= -0.4: c (40.48/19.04)
Recovery Rate > -0.4: a (212.52/26.4)

[Figure: Top classification results for all 8 decision variables (TSa to TSh) with Sound Level Centroid, Sound Level Spread, and Recovery Rate. Bar chart of Precision, Recall, and F-measure (0 to 0.9) for J48, Random Forest (RF), and Multilayer Perceptron (MP)]

SUMMARY: Mining unclustered data from the Original Database
The addition of new temporal features improves the confidence of the original classification.
Of particular interest are the new Sound and Recovery Rate features, which have value for Decision Support System (DSS) implementation.
WEKA J48 appears to be the best classifier.
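The Sound and Recovery Rate features highlighted in this summary can be sketched in a few lines. This is a minimal illustration of the centroid, spread, and recovery rate definitions given earlier, generalized from 3 visits to T visits. The function names, the use of days as the time unit, and returning 0.0 when the first score is already the best one are assumptions made for the sketch, not details taken from the study.

```python
import math
from datetime import date


def sound_level_centroid(values):
    """C = sum_i (i/T) * V(i) / sum_i V(i), with visits numbered 1..T."""
    t = len(values)
    total = sum(values)
    return sum((i / t) * v for i, v in enumerate(values, start=1)) / total


def sound_level_spread(values):
    """S = sqrt( sum_i V(i) * (i/T - C)^2 / sum_i V(i) )."""
    t = len(values)
    total = sum(values)
    c = sound_level_centroid(values)
    weighted = sum(v * (i / t - c) ** 2 for i, v in enumerate(values, start=1))
    return math.sqrt(weighted / total)


def recovery_rate(scores, dates):
    """R = (V_0 - V_k) / (T_k - T_0), where V_k is the minimum THI total score.

    Assumptions of this sketch: time is measured in days, and the rate is 0.0
    when the best score is the first one (no elapsed time to divide by).
    """
    k = min(range(len(scores)), key=scores.__getitem__)
    days = (dates[k] - dates[0]).days
    if k == 0 or days == 0:
        return 0.0
    return (scores[0] - scores[k]) / days


if __name__ == "__main__":
    ldl = [70.0, 80.0, 90.0]          # hypothetical LDL values over 3 visits
    thi = [50, 30, 40]                # hypothetical THI total scores
    visits = [date(2011, 1, 1), date(2011, 1, 31), date(2011, 3, 2)]
    print(sound_level_centroid(ldl), sound_level_spread(ldl))
    print(recovery_rate(thi, visits))
```

A centroid closer to 1 means the sound level mass sits in later visits, and the spread measures how concentrated it is around that point.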
Data Mining: Clustered Data
Continuing the Search for Optimal Classifiers
◦ Transformation to Visit Structure
◦ Creating Cluster-Driven Databases for Mining
◦ Adding New Features

Clustering for the purpose of Temporal Feature Extraction
Data Selection, Temporal Feature Extraction, Classification Rules, Action Rules

Clustering Techniques for Temporal Feature Extraction
If we have two patients denoted by p and q, then the visits of patient p are represented by a vector v_p = [v_1, v_2, …, v_n] and the visits of patient q by a vector v_q = [w_1, w_2, …, w_m]. If n ≤ m, then the distance ρ(p,q) between p and q, and the distance ρ(q,p) between q and p, is defined as
\[ \rho(q,p) = \rho(p,q) = \frac{\sum_{i=1}^{n} \left| v_i - w_{J(i)} \right|}{n} \]
where [w_{J(1)}, w_{J(2)}, …, w_{J(n)}] is the subsequence of [w_1, w_2, …, w_m] for which the sum of the distances is minimal over all n-element subsequences of [w_1, w_2, …, w_m]. By |v_i – w_{J(i)}| we mean the absolute value of v_i – w_{J(i)}.

Ultimate goal of constructing tolerance classes: to identify the right groups of patients for which useful temporal features can be built and used to extend the original (or current) database.
This leads to the construction of a collection of databases Dp, where p is a patient and Dp is a database representing the patients identified by the tolerance class generated by p.
Two groups of databases, for three-visit and four-visit sets, were constructed.
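The visit-sequence distance defined above can be sketched with a small dynamic program. This is an illustrative sketch, not the authors' implementation: the function name `visit_distance` is hypothetical, visits are assumed to be single numeric values, and "subsequence" is assumed to preserve visit order.

```python
def visit_distance(v, w):
    """Distance between two visit vectors.

    The shorter vector (length n) is aligned with the order-preserving
    n-element subsequence of the longer one that minimizes the sum of
    absolute differences; the minimal sum is then divided by n.
    """
    if len(v) > len(w):
        v, w = w, v  # the definition is symmetric; make v the shorter vector
    n, m = len(v), len(w)
    if n == 0:
        raise ValueError("each patient needs at least one visit")

    inf = float("inf")
    # prev[j]: minimal cost of matching the first i-1 visits of v
    # to a subsequence of the first j visits of w
    prev = [0.0] * (m + 1)
    for i in range(1, n + 1):
        cur = [inf] * (m + 1)
        for j in range(i, m + 1):
            skip = cur[j - 1]                                  # w[j-1] unused
            take = prev[j - 1] + abs(v[i - 1] - w[j - 1])      # v[i-1] ~ w[j-1]
            cur[j] = min(skip, take)
        prev = cur
    return prev[m] / n


if __name__ == "__main__":
    # Hypothetical score sequences for two patients
    print(visit_distance([1, 4], [1, 5, 2]))
```

The dynamic program avoids enumerating every n-element subsequence explicitly, which matters once patients have more than a handful of visits.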
Coefficients and Angles
Feature Construction for Dp, where p is a patient with 4 visits

Quadratic Equation Based New Features

Clustering Process
Resulted in two classes of viable datasets for mining:
1) Three visits datasets (14 total)
2) Four visits datasets (5 total)

Attributes | Values of Attributes | Type
Type | Instrument Type | Text
Total Visits | Total Number of Visits | Numeric
Model | Instrument Model | Text
Last_P | Last Patient Type | Text
Instrument | Instrument Name | Text
First_P | First Patient Type | Text
CC | Category of Treatment chosen by Doctor | Text
C | Category of Treatment chosen by Patient | Text
T Difference | Difference in T Score | Numeric
Coefficients | 3 coefficients for 3 visits datasets; 4 coefficients for 4 visits datasets | Numeric
Angles | 3 angles corresponding to visits 1-2, 1-3, and 2-3 (for 3 visits datasets); 6 angles corresponding to visits 1-2, 1-3, 1-4, 2-3, 2-4, and 3-4 (for 4 visits datasets) | Numeric
Sound Features | Sound Level Centroid, Sound Level Spread | Numeric
Recovery Rate | Recovery Rate | Numeric
Text | Stress, Noise, Medical | Boolean
Decision Feature | One of the eight discretized total scores |

In order to test the classifiers with the clustered data, WEKA with J48, Random Forest, and Multilayer Perceptron (Neural Network) was used on the following:
1) Datasets with standard deviations and averages
2) Datasets with coefficients and text
3) Datasets with coefficients and angles
4) Datasets with coefficients only
5) Datasets with angles only
6) Datasets with angles and text
7) Datasets with angles, coefficients and text
WEKA test with angles, coefficients and text data
File: base_angle_coef_noise_4_d3_[E04-015]_j48.txt
Experiment classifier: J48
precision = 0.884

Summary
Previously, the best classification result on the unclustered data came from the original Tinnitus dataset with decision feature TSa and the Sound Level Centroid, Sound Level Spread, and Recovery Rate features, as described earlier.
The clustering and the new coefficient and angle features improve the classification, because the data grouping presents a more homogeneous dataset.
Results are encouraging on the sample datasets
◦ Top precision is .884
◦ This is an improvement over the precision of .751 obtained with J48 on the original dataset with the Sound Level Centroid, Sound Level Spread and Recovery Rate features

Action Rules
THE COLLECTION OF DATABASES Dp (v=4) was extended with the following features:
Features A1 to A3 and T1 to T3 for patient q with an even number of visits:
\[ A_1(q) = \frac{a_{J(n)/2} - a_{J(0)}}{w_{J(n)/2} - w_{J(0)}}, \qquad T_1(q) = a_{J(n)/2} - a_{J(0)} \]
\[ A_2(q) = \frac{a_{J(n)} - a_{J(n)/2}}{w_{J(n)} - w_{J(n)/2}}, \qquad T_2(q) = a_{J(n)} - a_{J(n)/2} \]
\[ A_3(q) = \frac{a_{J(n)} - a_{J(0)}}{w_{J(n)} - w_{J(0)}}, \qquad T_3(q) = a_{J(n)} - a_{J(0)} \]
For an odd number of visits, we add A4/T4 for the difference between visit 1 and visit 0, A5/T5 for visit 1 to the last visit, and A6/T6 and A7/T7, which are similar to A1/T1 and A2/T2.
Action Rule Discovery using features A1-A7 and T1-T7
For Decision: Total Score of Emotion, Function, and Catastrophic Neumann-Q Results
Decision: Whether or not patient symptoms improved

Summary of Research: Action Rules
Action rules show promise for discovering new and interesting rules for a tinnitus DSS. Further refinement is needed on the decision variable and on linking the study to emotions.
Paper presented at IEEE GrC 2010 in San Jose, California: “From Tinnitus Data to Action Rules and Tinnitus Treatment” (Zhang, Thompson, Ras, Jastreboff), August 2010.
Action Rules with LISp-Miner and MARDs, New Database

New Database Characteristics
Tinnitus Functional Index and Emotion Features
• In the second dataset, representing the new database from Dr. Jastreboff (161 visit tuples for 75 unique patients)
• Most questions use an 11-point scale, 0 to 10
• THI also administered
• Mapped to new emotion features E1-E4
• Improvement features added (+ or -)
• Visit based
  • More than 1 visit necessary, last visit removed

New Feature Construction: TFI and Emotions
New Features Based on the TFI and emotions
Table 2: Tinnitus Functional Index (scale of 0 to 10)
Question | Category of Question | E-V Scale
Q1 % of time aware | Awareness |
Q2 loud | HEARING |
Q3 in control | E11 | E1
Q4 % of time annoyed | Annoyance |
Q5 cope | E11 | E1
Q6 ignore | E21 | E2
Q7 concentrate | THINKING CONCENTRATION |
Q8 think clearly | THINKING CONCENTRATION |
Q9 focus attention | THINKING CONCENTRATION |
Q10 fall/stay asleep | E33 | E3
Q11 as much sleep | E33 | E3
Q12 sleeping deeply | E33 | E3
Q13 hear clearly | HEARING |
Q14 understand people | HEARING |
Q15 follow conversation | HEARING |
Q16 quiet, resting activities | E41 | E4
Q17 relax | E43 | E4
Q18 peace and quiet | E42 | E4
Q19 social activities | SOCIAL |
Q20 enjoyment of life | E11 | E1
Q21 relationships | SOCIAL |
Q22 work on other tasks | SOCIAL |
Q23 anxious, worried | E23 | E2
Q24 bothered, upset | E22 | E2
Q25 depressed | E31 | E3
The sum of values represents E1 Energetic Positive, E2 Energetic Negative, E3 Calm Negative, E4 Calm Positive.

Thayer’s Emotion Valence Plane with emotions E1, E2, E3, E4. It is a common scale in the music domain for measuring emotions along two axes: arousal (high/low) and valence (positive/negative).
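As a rough illustration of how the emotion features E1 to E4 could be derived from TFI answers, the sketch below sums the answers of the questions that Table 2 assigns to each quadrant. The question-to-quadrant assignment is read from the flattened table and should be treated as an assumption, as should the names `EMOTION_QUESTIONS` and `emotion_scores`; whether any answers were rescaled or reversed before summing is not stated in the slides.

```python
# Question numbers per emotion quadrant, as read from Table 2 (assumed mapping).
EMOTION_QUESTIONS = {
    "E1": (3, 5, 20),         # Energetic Positive
    "E2": (6, 23, 24),        # Energetic Negative
    "E3": (10, 11, 12, 25),   # Calm Negative
    "E4": (16, 17, 18),       # Calm Positive
}


def emotion_scores(answers):
    """Sum the TFI answers (0 to 10) that belong to each emotion quadrant.

    answers: dict mapping question number (1..25) to the patient's answer.
    """
    return {
        emotion: sum(answers[q] for q in questions)
        for emotion, questions in EMOTION_QUESTIONS.items()
    }


if __name__ == "__main__":
    # Hypothetical visit: every question answered with 5
    visit = {q: 5 for q in range(1, 26)}
    print(emotion_scores(visit))
```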
Expert knowledge was used to map TFI values to Thayer’s plane.

New Feature Construction: Decision Features showing change over time
New Decision Features
◦ Features for Action Rule mining related to change over time (visits)
◦ Boolean features, + or –, indicating whether a feature such as Total Score is improving or getting worse
  Calculated from the score on the next visit
  Stored as + or – on the visit-related tuple

Action Rules
LISp-Miner
LISp-Miner [http://lispminer.vse.cz/] is an advanced system of software modules developed to implement classification and action rule discovery algorithms on data sets.
The 4ft-Miner procedure is used to discover new action rules in the tinnitus datasets with respect to new patients (those completing the Tinnitus Functional Index).

[Figure: LISp-Miner Ac4ft quantifier, four-fold table M with frequencies a, b, c, d]

ACTION RULES: EXPERIMENT AND RESULTS
Attributes
Abbreviation | Characteristics | List of attributes
Initial state
BASIC | Patient’s basic characteristics | ProblemTHL, Misophonia, Sc_T
TRT | Patient’s initial state – questions from TRT | H_Sv, H_An, H_EL, H_pr, Hl_ pr, Aw%T, An%T, Tch, T_Sv, T_An, T_EL
QQQ | Patient’s initial state – Tinnitus Function Index | Q1, …, Q25
E_SCORE | Patient’s initial state – emotion score | E1_SCORE_TFI, E2_SCORE_TFI, E3_SCORE_TFI, E4_SCORE_TFI
Treatment
TRTM | Treatment | Instrument, Trtmt_Cat_Patient, Trtmt_Cat_Dr
Results of treatment
IMPR_TRT | Improvements in attributes related to the TRT | Impr_in_H_Sv, Impr_in_H_An, Impr_in_H_EL, Impr_in_H_pr, Impr_in_Hl_ pr, Impr_in_Aw%T, Impr_in_An%T, Impr_in_Tch, Impr_in_T_Sv, Impr_in_T_An, Impr_in_T_EL
CHG_E | Changes in emotional score | CHG_IN_E1, CHG_IN_E2, CHG_IN_E3, CHG_IN_E4, CHG_IN_Q1, …

Mining Tasks of Interest
Task | Antecedent (stable) | Antecedent (flexible) | Succedent (stable) | Succedent (flexible)
Test | E1_Score | Instrument | not used | An%T
T_01 | BASIC | TRTM | not used | IMPR_TRT
T_02 | BASIC | TRTM | not used | CHG_E
T_03 | BASIC, TRT | TRTM | not used | IMPR_TRT
T_04 | BASIC, TRT | TRTM | not used | CHG_E
T_05 | BASIC, QQQ | TRTM | not used | IMPR_TRT
T_06 | BASIC | E_SCORE | not used | IMPR_TRT

Domain Knowledge for LISp-Miner

Rules using LISp

Analysis:
Before confidence: 9/(9+0)
After confidence: 9/(9+20)
Low confidence, but the result shows promise.
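The before and after confidences in the analysis follow from the four-fold table frequencies, where a counts objects satisfying both antecedent and succedent and b counts objects satisfying the antecedent only. The short sketch below works through that arithmetic; the function name `confidence` is hypothetical, and the frequencies are the ones quoted on the slide.

```python
def confidence(a, b):
    """Confidence of a rule from four-fold table frequencies: a / (a + b)."""
    return a / (a + b)


before = confidence(9, 0)    # 9 / (9 + 0)
after = confidence(9, 20)    # 9 / (9 + 20)

if __name__ == "__main__":
    print(f"before: {before:.3f}, after: {after:.3f}")
```

The before state holds for all 9 matching patients, while the after state holds for 9 of 29, about 31%, which is why the slide calls the confidence low.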