Symbolic and Machine Learning Methods for Patient Discharge Summaries Encoding
Julia Medori
CENTAL (Centre for Natural Language Processing), Université catholique de Louvain (Belgium)
CENTAL Seminar, 17/12/2010

Overview
• Problem outline
• System structure
  – Extraction
  – Encoding
• Extraction module
• Encoding module
  – Machine learning methods
    • Experiments for feature selection
    • Results
  – Symbolic methods description
    • Method 1: Morphological Analysis (MA)
    • Method 2: Extended lexical patterns (ELP)
    • Methods combination
    • Results
• Conclusions

Introduction
• Aim: build a (semi-)automated system for ICD-9-CM encoding
• Collaboration CENTAL/Saint-Luc – Université catholique de Louvain (Belgium)
  – CENTAL: Centre for Natural Language Processing
  – Saint-Luc hospital:
    • a team of 10 coders processes medical records: extraction of medical acts and diagnoses -> ICD-9-CM codes
    • 85,000 patient stays encoded each year
    • manual encoding

Data
• International Classification of Diseases, 9th Revision, Clinical Modification (ICD-9-CM)
  – Hierarchy:
    • First 3 digits -> general category: 1,135 categories
    • Digits 4 and 5 -> specific diagnosis: 15,688 codes
• Example:
  Code   Label
  001    Cholera
  0010   Cholera due to Vibrio cholerae
  0011   Cholera due to Vibrio cholerae el tor
  0019   Cholera, unspecified

Objectives
• Design a coding aid:
  – a tool that suggests the most likely codes to be assigned to a patient's medical record.
• Why not a fully automated system?
  – Main source of information: the patient discharge summary (PDS)
    • PDS: a letter addressed to the patient's GP, with no standard structure
  – 15-20% of the codes are inferred from other sources in the patient's medical record (often scanned documents).
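The hierarchy from the Data slide (first three digits = category, digits 4 and 5 = specific diagnosis) can be illustrated with a minimal Python sketch. The function names `icd9_category` and `group_by_category` are hypothetical, and the sketch assumes that the three-character rule applies to every code, which the slides state only for numeric codes.

```python
from collections import defaultdict


def icd9_category(code):
    """Return the general category of an ICD-9-CM code.

    Assumption: the category is the first three characters of the code,
    as stated on the Data slide. Dots are dropped so that "001.1" and
    "0011" are treated alike.
    """
    compact = code.replace(".", "").strip()
    if len(compact) < 3:
        raise ValueError("an ICD-9-CM code has at least three characters")
    return compact[:3]


def group_by_category(codes):
    """Group codes under their three-character category, keeping input order."""
    groups = defaultdict(list)
    for code in codes:
        groups[icd9_category(code)].append(code)
    return dict(groups)


# The cholera example from the slide: all four codes share category 001.
cholera_codes = ["001", "0010", "0011", "0019"]
grouped = group_by_category(cholera_codes)
```

Grouping by category is what the symbolic methods later in the talk evaluate against, since they encode at the three-digit level.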
System structure
[Figure: system structure diagram. Stages: Preprocessing, Extraction, Coding. Labels: PDS; context analysis + tagging; dictionaries and linguistic structures; morphological processing; matching lists (ICD9CM + inclusions); machine learning module; code modification according to context and stats; PDS + ordered list of codes; manual checking.]

Structure outline
• 2 steps:
  – Extraction
    • Develop an extraction system able to extract the information necessary to the encoding task:
      – diagnoses, procedures, locations, dates, allergies, aggravating factors, etc.
      => Reading help tool.
  – Encoding
    • Extracted information => codes, through a combination of statistical and symbolic methods.

Extraction
• Develop specialized linguistic resources
  – Specialized dictionaries
    • Diagnoses and procedures <= ICD-9-CM + UMLS
    • Medications
    • Anatomy
  – Linguistic structure description
    • Diagnosis context (present, absent, probable, etc.)
    • Allergies and smoking
    • Dates
    • Weight and height

Example of linguistic structure graph
Fracture de l'épaule (shoulder fracture) => <MALINDET>Fracture de l'<ANAT>épaule</ANAT></MALINDET>

Extraction result
[Figure not reproduced]

Machine Learning
• Encoding = categorization problem
  – Features = extracted phrases?
  – Classes = codes
• Baseline method: Naive Bayes
  – Tool: Weka
• Corpus:
  – 13,635 PDS from Digestive Surgery
    • 90% training set / 10% test set (1,364 PDS)
    • Average number of codes per PDS: 6.2
• Trained 1 classifier per code occurring more than 5 times in the corpus:
  – 775 codes -> 775 classifiers
  – Limitation: 5% rare codes
  – Attributes: kept only those co-occurring at least twice with the codes.
• Measures: precision and recall according to the probability returned by the Naive Bayes test.
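The two filtering rules on the Machine Learning slide (one classifier per code occurring more than 5 times, attributes kept only if they co-occur at least twice with a code) can be sketched as follows. This is an illustration, not the Weka setup used in the talk: the function name `select_codes_and_attributes`, the per-code reading of the co-occurrence rule, and the toy documents are all assumptions.

```python
from collections import Counter


def select_codes_and_attributes(documents, min_code_count=6, min_cooccurrence=2):
    """Pick the codes to train on and the attributes kept for each of them.

    documents: list of (attributes, codes) pairs, each an iterable of strings.
    min_code_count=6 encodes "occurring more than 5 times".
    Assumption: co-occurrence is counted per (code, attribute) pair.
    """
    docs = [(set(attributes), set(codes)) for attributes, codes in documents]

    # Count in how many documents each code appears.
    code_counts = Counter(code for _, codes in docs for code in codes)
    kept_codes = {code for code, n in code_counts.items() if n >= min_code_count}

    # Count how often each attribute appears together with each kept code.
    cooccurrence = Counter()
    for attributes, codes in docs:
        for code in codes & kept_codes:
            for attribute in attributes:
                cooccurrence[(code, attribute)] += 1

    kept_attributes = {code: set() for code in kept_codes}
    for (code, attribute), n in cooccurrence.items():
        if n >= min_cooccurrence:
            kept_attributes[code].add(attribute)
    return kept_codes, kept_attributes


# Toy data (hypothetical): stemmed attributes paired with assigned codes.
toy_documents = (
    [({"appendic", "acut"}, {"540"})] * 6
    + [({"appendic", "perfor"}, {"540"})]
    + [({"rare", "diseas"}, {"999"})]
)
codes, attributes = select_codes_and_attributes(toy_documents)
```

On the toy data, code 999 is dropped as too rare and the attribute "perfor" is dropped for code 540 because it co-occurs with it only once.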
Experiments
• A series of experiments was conducted in which the attributes were variants of the extracted diagnoses and procedures after stemming.
• Variants tested:
  – Original word order kept or not.
    • Ex.: excisional biopsy bile duct
    • or: bile biopsy duct excisional
  – Details such as location, date, context included or not.
    • Excisional biopsy
  – Each word of the extracted phrase is a feature.
    • Excisional
    • Biopsy
    • Bile
    • Duct
  – Words and morphemes (together) composing the extracted phrase.
    • Bile biopsy excision excisional duct
  – Words and morphemes (separately) composing the extracted phrase.
    • Excisional biopsy bile duct
    • Excision biopsy bile duct
  – Values were 0 or 1, depending on whether the attribute was in the text.
  – Values were the frequency of the attribute in the text.

Results
3 best results when thresholding the list of results where the probability returned by Naive Bayes = 1

  Features                                                                        Recall   Precision   Average number of suggested codes
  Extracted phrases + details + same word order + 0/1 as values                   68.7     73.2        7.87
  Extracted phrases + details + alphabetical order + 0/1 as values                59.1     75.7        6.49
  Words and morphemes (together) + details + alphabetical order + 0/1 as values   68.5     74.2        7.54

Discussion
• Limitations of the machine learning method:
  – 5% rare codes: not enough data to build a classifier for these codes
  – The need for annotated data means that these methods cannot cope with changes in the classification
• In these cases, we need to use symbolic methods.

Kevers, Laurent and Medori, Julia. Symbolic classification methods for patient discharge summaries encoding into ICD. In: Advances in Natural Language Processing, 7th International Conference on NLP, IceTAL 2010, Reykjavik, August 16-18, 2010, Lecture Notes in Artificial Intelligence, 2010, pp.
197-208.

Objective
• Automatic encoding of PDS according to categories (first 3 digits)
• Use of symbolic methods
  – No need for annotated data
  – Can assign rare codes (27% used 5 times or less)
• Principle:
  – Make use of the nomenclature
  – Enrich it with other resources in French from UMLS (Unified Medical Language System)

Corpus
• 19,692 patient discharge summaries (PDS) in French
• General Internal Medicine
• 150,116 codes (137,336 categories)
• 6,029 distinct codes (895 categories)
• Average = 7.6 codes/document (7 categories)

Method 1 (MA) – General Principle
• Based on the rich morphology of medical language
  – Ex. Bronchoscopy: Fibroscopie bronchique = bronchoscopie par fibre optique
• 2-step process:
  – Extract phrases or terms describing the diagnoses or procedures to be encoded
  – Encoding: match these terms to the right code.

Method 1 (MA) – Encoding
• Bags of words: words – stop words + morphemes + meaning
[Figure: an ICD-9-CM label and a PDS phrase are each turned into a bag of words, and the two bags are compared with a similarity score.]
  Fibroscopie bronchique -> fibroscopie, bronchique, fibro- (fibre), -scopie, bronch- (bronche), -ique
  Bronchoscopie par fibre optique -> bronchoscopie, par, fibre, optique, bronch- (bronche), -scopie

Method 1 (MA) – Results

                   Recall   Precision   F-measure   Nb. classes
  Best Recall      46.13    14.70       21.10       20
  Best F-measure   34.52    27.34       28.00       8.6

Method 2 (ELP) – General principle
• Developed by L. Kevers, as designed for the Stratego project on parliamentary documents.
• Symbolic method with less manual work
• Uses existing « terminological » resources
  – ICD-9-CM + UMLS
• Two-step process:
1. Automatic transformation of existing terminological resources into an extraction resource (only once)
2. Use of the extraction resource on documents for term extraction and classification (for each document)

Method 2 (ELP) – Build extraction resource (1)
For each ICD-9-CM term (= a class), the automatic processing involves:
• Gather synonyms (UMLS)
  – « dengue » → « dengues », « dengue fever », « infection by the dengue virus »
• Parse complex compound expressions
  – « Infectious and parasitic diseases » → « Infectious disease », « Parasitic disease »
• Transform the initial term into an Extended Lexical Pattern (ELP)
  – Stopwords: → « infection <TOKEN> dengue virus »
  – Stemming: → « infect <TOKEN> dengue virus »
  – Allow insertions: → « infect <I> <TOKEN> <I> dengue <I> virus »
• Add negative context patterns
• Build the main transducer for text annotation

Method 2 (ELP) – Transducer & output
• Transducer for class '061'
  [Figure not reproduced]
• Output of the main transducer for a document:
  Zona [[053]] extremement douloureux [[729]] gastroscopie [[Z44]] acide [[E96]] anemie normochrome normocytaire [[285]] sequellaires apicales droite (tuberculose [[137]] intestin grele [[Z45]] tuberculose [[V12]] oesophagite moderee aspecifique [[947]] infection a mycobacterie [[031]] fond de oeil [[Z16]] pas de [[-]] atteinte du nerf [[957]] zona [[053]] hyperthyroidie [[242]] goitre [[706]] goitre [[240]]

Method 2 (ELP) – Class assignment (2)
For a text to classify, analyse the main transducer output:
• When a negative context is found, the phrase is skipped
• Each recognized phrase has one (or more) related codes
• Compute a weight for each phrase based on
  – its frequency
  – whether it is a multi-word expression (frequency*2) or not
• Compute a weight for each code by summing the weights obtained for its phrases
• Result: ordered list of codes (possibly thresholded)

Method 2 (ELP) – Results

                   Recall   Precision   F-measure   Nb. of classes
  Best Recall      52.74    20.69       27.37       19.6
  Best F-measure   37.97    30.30       29.43       9.8

Combination of methods 1 & 2
• Merge the lists from methods 1 and 2 in one of four ways:
  – Mix1: Threshold(M.1 union M.2)
  – Mix2: Threshold(M.1 inter M.2)
  – Mix3: Threshold(M.1) union Threshold(M.2)
  – Mix4: Threshold(M.1) inter Threshold(M.2)
• The weight of each method can be balanced
  – Example: 0.4*M.1 union 0.6*M.2

Evaluation of symbolic methods combination

                                                        Recall (R)   Precision (P)   F-measure (F1)   Nb. classes   Threshold   α/1-α
  Mix1: Threshold(Method1 union Method2)
    Best R                                              60.21        13.20           20.86            30.5          No          Any
    Best F1                                             37.13        33.12           31.64            8.1           Yes         0.3/0.7
  Mix2: Threshold(Method1 inter Method2)
    Best R                                              38.66        29.28           30.52            9.1           No          Any
    Best F1                                             34.73        34.55           31.50            7             Yes         0.3/0.7
  Mix3: Threshold(Method1) union Threshold(Method2)
    Best F1                                             43.28        20.59           27.90            14.7          Yes         N/A
  Mix4: Threshold(Method1) inter Threshold(Method2)
    Best F1                                             24.07        37.95           29.46            4.4           Yes         N/A

Conclusions
• Results have to be put into perspective:
  – Inter-annotator agreement is around 70%
  – 15 to 20% of the codes cannot be inferred from the PDS
  – Machine learning methods performed well.
  – Symbolic methods:
    • MA method, based on the extraction module: 66% of the useful information is extracted.
    • ELP method performs better when built from short unambiguous phrases. ICD-9-CM code descriptions are more complex.
• Future work:
  – Give more weight to information contained in important parts of the PDS (introduction, conclusion…)
  – Evaluate the actual help given to human coders
  – Combine with learning algorithms
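As a closing illustration of the four combination schemes (Mix1 to Mix4) evaluated earlier, here is a minimal Python sketch. The slides do not say how the two score lists are merged or how the threshold is chosen, so the weighted sum, the fixed cut-off, and the names `combine` and `threshold` are assumptions; scores are also assumed to be on a comparable scale.

```python
def combine(scores_m1, scores_m2, alpha=0.4, mode="union"):
    """Merge two {code: score} dictionaries into one.

    Assumption: the merged score is alpha*M.1 + (1 - alpha)*M.2, with a
    missing code counting as 0. mode="union" keeps codes proposed by either
    method, mode="inter" only codes proposed by both.
    """
    if mode == "union":
        codes = scores_m1.keys() | scores_m2.keys()
    elif mode == "inter":
        codes = scores_m1.keys() & scores_m2.keys()
    else:
        raise ValueError("mode must be 'union' or 'inter'")
    return {
        code: alpha * scores_m1.get(code, 0.0) + (1 - alpha) * scores_m2.get(code, 0.0)
        for code in codes
    }


def threshold(scores, min_score):
    """Return the codes scoring at least min_score, best first (ties by code)."""
    kept = {code: s for code, s in scores.items() if s >= min_score}
    return sorted(kept, key=lambda code: (-kept[code], code))


# Toy scores (hypothetical) for one document.
m1 = {"053": 1.0, "242": 0.5}
m2 = {"053": 1.0, "240": 0.5}

mix1 = threshold(combine(m1, m2, alpha=0.5, mode="union"), 0.5)
mix2 = threshold(combine(m1, m2, alpha=0.5, mode="inter"), 0.5)
mix3 = sorted(set(threshold(m1, 0.5)) | set(threshold(m2, 0.5)))
mix4 = sorted(set(threshold(m1, 0.5)) & set(threshold(m2, 0.5)))
```

Mix1 and Mix2 threshold after merging, so the weight α matters; Mix3 and Mix4 threshold each list first, which is consistent with the α column reading N/A for them in the evaluation table.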