Applications of Machine Learning to Medical Informatics
Daniela S. Raicu, PhD, Assistant Professor
Email: [email protected]
Lab URL: http://facweb.cs.depaul.edu/research/vc/
CSC578, Fall 2006

Intelligent Multimedia Processing & Medical Imaging Labs
• Faculty:
  – GM. Besana, L. Dettori, J. Furst, G. Gordon, S. Jost, D. Raicu
• CTI Students:
  – J. Cisneros, M. Doran, W. Horsthemke, B. Malinga, R. Susomboon, E. Varutbangkul, S.G. Valencia, J. Zhang
• NSF REU Students (2006):
  – A. Bashir, T. Disney, S. Greenblum, J. Hasemann, M.O. Krucoff, M. Lam, M. Pham, A. Rogers, S. Talbot
• IMP Collaborators & Funding Agencies
  – National Science Foundation (NSF) - Research Experience for Undergraduates (REU)
  – Northwestern University - Department of Radiology, Imaging Informatics Section
  – University of Chicago - Medical Physics Department
  – Argonne National Laboratory - Biochip Technology Center
  – DePaul University, College of Law

Outline
Part I: Introduction to Medical Informatics
• Medical Informatics
• Medical Imaging
• Imaging Modalities
• Basic Concepts in Image Processing
Part II: Current Research Problems in Medical Informatics
• Segmentation of soft tissues
• Classification of pure patches
• Visualization of pure patches
• Content-based Image Retrieval and Annotation
• Image Content-Driven Ontology for Chest interpretation

What is Medical Informatics?
Simplistic definition: Medical informatics is the application of computers, communications and information technology and systems to all fields of medicine: medical care, medical education and medical research.
(MF Collen, MEDINFO '80, Tokyo)

What is Medical Informatics?
Medical Informatics is the branch of science concerned with the use of computers and communication technology to acquire, store, analyze, communicate, and display medical information and knowledge to facilitate understanding and improve the accuracy, timeliness, and reliability of decision-making.
(Warner, Sorenson and Bouhaddou, Knowledge Engineering in Health Informatics, 1997)

Subdomains of Medical Informatics (by Wikipedia)
• imaging informatics
• clinical informatics
• nursing informatics
• consumer health informatics
• public health informatics
• dental informatics
• clinical research informatics
• bioinformatics
• pharmacy informatics

What is Medical Imaging?
The study of medical imaging is concerned with the interaction of all forms of radiation with tissue and the development of appropriate technology to extract clinically useful information (usually displayed in an image format) from observation of this technology.
Sources of Images:
• Structural/anatomical information (CT, MRI, US): within each elemental volume, tissue-differentiating properties are measured.
• Information about function (PET, SPECT, fMRI).

Examples of Medical Images
• X-ray image of the hand
• Computed Tomography (CT): image of a plane through the liver and stomach
• Functional Magnetic Resonance Imaging (fMRI) of the brain
• Single Photon Computed Tomography of the heart
• Fluorescence Microscopy: image of living tissue culture cells
• Ultrasound image of a woman's abdomen

What is a Medical Image?
[Figure labels: pixel, slice thickness]

DICOM standard in Medical Imaging
DICOM: "Digital Imaging and Communications in Medicine"
The DICOM Standard makes it possible to obtain the pixel data for each produced image and to associate specific information with it: name of the patient, type of examination, hospital, date of examination, type of acquisition, etc.
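As a rough illustration of how a DICOM file keeps descriptive fields next to the pixel data, the sketch below walks a byte string encoded as explicit-VR little-endian data elements (tag, value representation, length, value). The function names `make_element`, `read_elements` and `parse_dicom_bytes` are hypothetical helpers written for this example, and the sample bytes are synthetic: they carry only the 128-byte preamble, the "DICM" marker and four elements, so they are not a complete standard-conforming file. Implicit VR, sequences of undefined length and compressed pixel data are deliberately not handled.

```python
import struct

# VRs that, in explicit-VR encoding, use 2 reserved bytes and a 4-byte length.
LONG_VRS = {b"OB", b"OW", b"OF", b"SQ", b"UT", b"UN"}


def make_element(group, element, vr, value):
    """Encode one explicit-VR little-endian data element (even-length value assumed)."""
    head = struct.pack("<HH", group, element) + vr
    if vr in LONG_VRS:
        return head + b"\x00\x00" + struct.pack("<I", len(value)) + value
    return head + struct.pack("<H", len(value)) + value


def read_elements(data, offset=0):
    """Yield (group, element, vr, value) tuples from explicit-VR little-endian bytes."""
    while offset + 8 <= len(data):
        group, element = struct.unpack_from("<HH", data, offset)
        vr = data[offset + 4:offset + 6]
        if vr in LONG_VRS:
            if offset + 12 > len(data):
                break  # truncated element header
            (length,) = struct.unpack_from("<I", data, offset + 8)
            offset += 12
        else:
            (length,) = struct.unpack_from("<H", data, offset + 6)
            offset += 8
        yield group, element, vr.decode("ascii"), data[offset:offset + length]
        offset += length


def parse_dicom_bytes(data):
    """Check the preamble and 'DICM' marker, then map (group, element) -> (vr, value)."""
    if len(data) < 132 or data[128:132] != b"DICM":
        raise ValueError("missing DICM marker after the 128-byte preamble")
    return {(g, e): (vr, v) for g, e, vr, v in read_elements(data, 132)}


# Synthetic sample: study date, modality, patient name, and a 2-pixel image.
sample = (
    bytes(128) + b"DICM"
    + make_element(0x0008, 0x0020, b"DA", b"20061101")   # StudyDate
    + make_element(0x0008, 0x0060, b"CS", b"CT")         # Modality
    + make_element(0x0010, 0x0010, b"PN", b"DOE^JANE")   # PatientName
    + make_element(0x7FE0, 0x0010, b"OW", struct.pack("<2H", 100, 200))  # PixelData
)
header = parse_dicom_bytes(sample)
pixels = struct.unpack("<2H", header[(0x7FE0, 0x0010)][1])
```

In practice a dedicated DICOM library would be used instead of hand parsing; the point here is only that patient, examination and acquisition fields travel in the same file as the pixel values.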

DICOM Header:
[Figure: DICOM header]

Computer Aided Diagnosis
• Computer-Aided Diagnosis (CAD) is a diagnosis made by a radiologist when the output of computerized image analysis methods has been incorporated into his or her medical decision-making process.
• CAD may be interpreted broadly to incorporate both
  – the detection task: finding the abnormality, and
  – the classification task: the likelihood that the abnormality represents a malignancy

Motivation for CAD systems
The amount of image data acquired during a CT scan is becoming overwhelming for human vision, and the overload of image data for interpretation may result in oversight errors.
Computer-Aided Diagnosis for:
• Breast Cancer
• Lung Cancer
  – A thoracic CT scan generates about 240 section images for radiologists to interpret.
• Colon Cancer
  – CT colonography (virtual colonoscopy) is being examined as a potential screening device (400-700 images)

CAD for Breast Cancer
A mammogram is an X-ray of breast tissue used as a screening tool to search for cancer when there are no symptoms. A mammogram detects lumps, changes in breast tissue or calcifications when they are too small to be found in a physical exam.
• Abnormal tissue shows up as dense white on mammograms.
• The left scan shows a normal breast while the right one shows malignant calcifications.

CAD for Lung Cancer
• Identification of lung nodules in thoracic CT scans; the identification is complicated by the blood vessels
• Once a nodule has been detected, it may be quantitatively analyzed as follows:
  – The classification of the nodule as benign or malignant
  – The evaluation of the temporal change in the nodule size.

CAD for Colon Cancer
• Virtual colonoscopy (CT colonography) is a minimally invasive imaging technique that combines volumetrically acquired helical CT data with advanced graphical software to create two- and three-dimensional views of the colon.
[Figure: Three-dimensional endoluminal view of the colon showing the appearance of normal haustral folds and a small rounded polyp.]

Role of Image Analysis & Machine Learning for CAD
• An overall scheme for computer-aided diagnosis systems:
  – Images: Breast Images, Thoracic Images
  – Organ Segmentation: Breast Boundary, Lungs, Colon
  – Lesion / Abnormality Segmentation: Nodule, Polyps
  – Feature Extraction: Texture, Shape, Geometrical properties
  – Classification: Malignant, Benign
  – Evaluation & Interpretation

Texture Classification of Tissues in CT Chest/Abdomen
A. Pixel-level Classification:
- tissue segmentation
- context-sensitive tools for radiology reporting
[Diagram: Pixel Level Texture Extraction → [d1, d2, …, dk] → Pixel Level Classification → tissue_label → Organ Segmentation]

Pixel-level Texture Extraction
• Consider the texture around the pixel of interest.
• Capture texture characteristics based on an estimate of the joint conditional probability of pixel pair occurrences Pij(d,θ).
  – Pij denotes the normalized co-occurrence matrix specified by the displacement vector (d) and angle (θ).
[Figure label: Neighborhood of a pixel]

Haralick Texture Features

Examples of Texture Images
Texture images: original image, energy and cluster tendency, respectively.
M. Kalinin, D. S. Raicu, J. D. Furst, D. S. Channin, "A Classification Approach for Anatomical Regions Segmentation", The IEEE International Conference on Image Processing (ICIP), Genoa, Italy, September 11-14, 2005.

Texture Classification of Tissues in CT Chest/Abdomen
Example of Liver Segmentation: (J.D. Furst, R. Susomboon, and D.S.
Raicu, "Single Organ Segmentation Filters for Multiple Organ Segmentation", IEEE 2006 International Conference of the Engineering in Medicine and Biology Society (EMBS'06))
[Figure panels: Original Image; Initial Seed at 90%; Split & Merge at 85%; Split & Merge at 80%; Region growing at 70%; Region growing at 60%; Segmentation Result]

Texture Classification of Tissues in CT Chest/Abdomen
B. Patch-level Classification:
- creation of an electronic handbook of normal tissues in CT scans including visual and quantitative samples, and tools to annotate, browse and retrieve samples.
Patch Samples - Ground truth: tissue names (liver, kidney, fat, muscle, trabecular bone, …)

Texture Classification of Tissues in CT Chest/Abdomen
B. Patch-level Classification (cont.):
- Texture quantification

Texture Classification of Tissues in CT Chest/Abdomen
B. Patch-level Classification (cont.):
- Supervised learning (classification) of the mappings between texture features and type of pure patch
Example rule: IF F13 < .2 and F16 > .8 THEN LIVER (p = .95)

Evaluation & Interpretation
• Sensitivity: the ratio between true positives and all actual positives (true positives + false negatives)
• Specificity: the ratio between true negatives and all actual negatives (true negatives + false positives)
• Receiver Operating Characteristic (ROC)
A true positive is an abnormality classified as malignant when it is actually malignant.
A true negative is an abnormality classified as benign when it is actually benign.

Evaluation & Interpretation
• Receiver Operating Characteristic (ROC) curves for distinction between benign and malignant nodules on high-resolution CT.
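The evaluation measures above can be sketched in a few lines. This is a minimal illustration on made-up labels and scores (1 = malignant, 0 = benign), not data from the studies cited in these slides; the helper names `confusion_counts`, `sensitivity`, `specificity` and `roc_points` are chosen for this example only.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels where truthy means malignant."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    tn = sum(1 for t, p in zip(y_true, y_pred) if not t and not p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    return tp, tn, fp, fn


def sensitivity(tp, fn):
    """True positives over all actually malignant cases."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """True negatives over all actually benign cases."""
    return tn / (tn + fp)


def roc_points(y_true, scores):
    """(false positive rate, true positive rate) pairs, one per score threshold."""
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        y_pred = [s >= threshold for s in scores]
        tp, tn, fp, fn = confusion_counts(y_true, y_pred)
        points.append((fp / (fp + tn), tp / (tp + fn)))
    return points


# Hypothetical ground truth and classifier scores for eight nodules.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1, 0.6]

# Classify as malignant when the score is at least 0.5.
tp, tn, fp, fn = confusion_counts(y_true, [s >= 0.5 for s in scores])
sens = sensitivity(tp, fn)
spec = specificity(tn, fp)
curve = roc_points(y_true, scores)
```

Lowering the threshold raises sensitivity at the cost of specificity; the ROC curve plots that trade-off over all thresholds.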

Texture Classification of Tissues in CT Chest/Abdomen
Organ-, patch-, and pixel-level classification of spinal cord, liver, heart, kidneys and spleen using decision trees:

Organ           Organ Level             Pure Patch Level        Pixel Level (9 x 9)     Pixel Level (13 x 13)
                Sensitivity Specificity Sensitivity Specificity Sensitivity Specificity Sensitivity Specificity
Backbone (44)   100.0%      97.6%       97.7%       99.3%       100.0%      96.3%       100.0%      99.2%
Liver (259)     73.8%       95.9%       91.9%       97.9%       100.0%      99.0%       100.0%      98.4%
Heart (77)      73.6%       97.2%       79.2%       98.3%       81.1%       99.5%       66.7%       100.0%
Kidney (225)    86.2%       97.8%       91.6%       97.1%       78.9%       98.0%       96.6%       93.0%
Spleen (98)     70.5%       95.1%       65.3%       98.5%       94.4%       95.5%       100.0%      97.6%

• D. Xu, J. Lee, D.S. Raicu, J.D. Furst, D. Channin, "Texture Classification of Normal Tissues in Computed Tomography", The 2005 Annual Meeting of the Society for Computer Applications in Radiology, Florida.
• D. Channin, D. S. Raicu, J. D. Furst, D. H. Xu, L. Lilly, C. Limpsangsri, "Classification of Tissues in Computed Tomography using Decision Trees", Poster and Demo, The 90th Scientific Assembly and Annual Meeting of Radiology Society of North America (RSNA04), Chicago, 2004.

Texture Classification of Tissues in CT Chest/Abdomen
Diagram of a Classification System (elements):
• CT images
• Patient demographics & Radiologist Annotations
• Image & Textual Feature Extraction
• Low-level features: texture
• High-level features: diagnosis, tissue labels
• Classification & Association Techniques
• Classification, Segmentation & Annotation
• Interpretation & evaluation: sensitivity, specificity

Classification models: challenges
(a) Optimal selection of an adequate set of textural features is a challenge, especially with the limited data we often have to deal with in clinical problems.
Consequently, the effectiveness of any classification system will always be conditional on two things:
(i) how well the selected features describe the tissues
(ii) how well the study group reflects the overall target patient population for the corresponding diagnosis

Classification models: challenges
(b) how other types of information can be incorporated into the classification models:
- metadata
- image features from other imaging modalities (need for image fusion)
(c) how stable and general the classification models are

Content-based medical image retrieval (CBMS) systems
- Definition of Content-based Image Retrieval: Content-based image retrieval is a technique for retrieving images on the basis of automatically derived image features such as texture and shape.
- Applications of Content-based Image Retrieval:
• Teaching
• Research
• Diagnosis
• PACS and Electronic Patient Records

Diagram of a CBIR
[Diagram elements: Image Database; Feature Extraction; Image Features [D1, D2, …, Dn]; Query Image; Similarity Retrieval; Query Results; User Evaluation; Feedback Algorithm]
http://viper.unige.ch/~muellerh/demoCLEFmed/index.php

CBIR as a Diagnosis Aid
An image retrieval system can help when the diagnosis depends strongly on direct visual properties of images in the context of evidence-based medicine or case-based reasoning.

CBIR as a Teaching Tool
An image retrieval system will allow students/teachers to browse available data themselves in an easy and straightforward fashion by clicking on “show me similar images”.
Advantages:
- stimulate self-learning and a comparison of similar cases
- find optimal cases for teaching
Teaching files:
• Casimage: http://www.casimage.com
• myPACS: http://www.mypacs.net

CBIR as a Research Tool
Image retrieval systems can be used:
• to complement text-based retrieval methods
• for visual knowledge management, whereby the images and associated textual data can be analyzed together
• for multimedia data mining, to learn the unknown links between visual features and diagnosis or other patient information
• for quality control, to find images that might have been misclassified

CBIR as a tool for lookup and reference in CT chest/abdomen
• Case Study: lung nodules retrieval
  – Lung Imaging Database Resource for Imaging Research
    http://imaging.cancer.gov/programsandresources/InformationSystems/LIDC/page7
  – 29 cases, 5,756 DICOM images/slices, 1,143 nodule images
  – 4 radiologists annotated the images using 9 nodule characteristics: calcification, internal structure, lobulation, malignancy, margin, sphericity, spiculation, subtlety, and texture
• Goals:
  – Retrieve nodules based on image features: Texture, Shape, and Size
  – Find the correlations between the image features and the radiologists’ annotations

CBIR as a tool for lung nodule lookup and reference

Choose a nodule

Choose an image feature & a similarity measure
M. Lam, T. Disney, M. Pham, D. Raicu, J.
Furst, “Content-Based Image Retrieval for Pulmonary Computed Tomography Nodule Images”, SPIE Medical Imaging Conference, San Diego, CA, February 2007

Retrieved Images

CBIR systems: challenges
• Type of features
  – image features:
    - texture features: statistical, structural, model and filter-based
    - shape features
  – textual features (such as physician annotations)
• Similarity measures
  – point-based and distribution-based metrics
• Retrieval performance:
  – precision and recall
  – clinical evaluation

Image features and physician annotations correlations

Malignancy regression model
Characteristics: Calcification, InternalStructure, Lobulation, Malignancy, Margin, Sphericity, Spiculation, Subtlety, Texture
Adj_R2 = 0.990, F-value = 963.560, p-value = 0.000

Term                 Regression Coefficients   p-value
(Constant)           5.377275                  1.64E-54
gabormean_1_2        -0.02069                  7.80E-07
MinIntensityBG       0.003819                  3.30E-82
Energy               -28.5314                  3.31E-12
gabormean_0_1        -0.00315                  5.80E-14
IntesityDifference   0.000272                  0.003609
inverseVariance      6.317133                  3.41E-05
gabormean_1_1        0.009743                  0.000259
gabormean_2_1        -0.00667                  5.79E-05
Correlation          -0.39183                  5.67E-05
clusterTendency      5.16E-06                  0.000131
ConvexPerimeter      -0.00291                  0.023032

Estimated Malignancy = 5.377275 - 0.02069 gabormean_1_2 + 0.003819 MinIntensityBG - 28.5314 energy - 0.00315 gabormean_0_1 + 0.000272 IntesityDifference + 6.317133 inverseVariance + 0.009743 gabormean_1_1 - 0.00667 gabormean_2_1 - 0.39183 correlation + 5.16E-06 clusterTendency - 0.00291 ConvexPerimeter

Multiple Regression Models
E. Varutbangkul, J. G. Cisneros, D. S. Raicu, J. D. Furst, D. S. Channin, S. G.
Armato III, "Semantics and Image Content Integration for Pulmonary Nodule Interpretation in Thoracic Computed Tomography", SPIE Medical Imaging Conference, San Diego, CA, February 2007

Characteristics      Entire dataset (1106)   At least 2 radiologists agreed   At least 3 radiologists agreed
Calcification        0.397                   0.578 (884)                      0.645 (644)
Internal Structure   0.417                   - (855)                          - (659)
Lobulation           0.282                   0.559 (448)                      0.877 (137)
Malignancy           0.310                   0.641 (489)                      0.990 (107)
Margin               0.403                   0.376 (519)                      - (245)
Sphericity           0.239                   0.481 (575)                      0.682 (207)
Spiculation          0.320                   0.563 (621)                      0.840 (228)
Subtlety             0.301                   0.282 (659)                      0.491 (360)
Texture              0.181                   0.473 (736)                      0.843 (437)

Image-Driven Ontologies for CT chest interpretation
• Texture definition: nodule internal texture, e.g., nonsolid, part solid, or solid texture
• Margin definition: how well defined the margin of the nodule is (poorly or sharply)
• Circularity definition: the ‘area of the nodule’ divided by the ‘area of a circle with the same convex perimeter as the nodule’
• Solidity definition: the proportion of the pixels in the convex hull that are also in the region (Area/ConvexArea)

Image-based Ontology: challenges
• Identify the type of features and their values for certain physician annotations
  Example: What is the “Gaborness” image representation for a particular annotation?
• Build a CT-RADS visual atlas to standardize chest reporting

Ideal CAD Workstation?
It will have the human abilities
• to transfer acquired knowledge to new tasks,
• to adapt to the diagnostic problem,
• to choose image features that are relevant to the clinical task and to analyze the image,
• to offer diagnostic suggestions, and, finally,
• to justify the suggestions on the basis of available reference data.
That CAD system will be a true partner to the diagnostic radiologist.

Questions?
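
Supplementary sketch: the circularity and solidity definitions from the image-driven ontology slide can be written out as formulas. A circle with perimeter P has area P²/(4π), so circularity is 4π·Area/P² with P the convex perimeter, and solidity is Area/ConvexArea. The slides define both on pixel regions; the version below is an assumption that works on a polygon outline instead, and the helper names (`polygon_area`, `polygon_perimeter`, `convex_hull`, `circularity`, `solidity`) are hypothetical, introduced only for this example.

```python
import math


def polygon_area(pts):
    """Shoelace formula for a simple polygon given as ordered (x, y) vertices."""
    n = len(pts)
    total = 0.0
    for i in range(n):
        x1, y1 = pts[i]
        x2, y2 = pts[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def polygon_perimeter(pts):
    """Sum of edge lengths of a closed polygon."""
    n = len(pts)
    return sum(
        math.hypot(pts[(i + 1) % n][0] - pts[i][0], pts[(i + 1) % n][1] - pts[i][1])
        for i in range(n)
    )


def convex_hull(pts):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(pts))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def circularity(area, convex_perimeter):
    """Region area divided by the area of a circle with the same convex perimeter."""
    return area / (convex_perimeter ** 2 / (4.0 * math.pi))


def solidity(area, convex_area):
    """Fraction of the convex hull that is covered by the region."""
    return area / convex_area


# Hypothetical L-shaped outline: three unit squares inside a 2 x 2 box.
l_shape = [(0, 0), (2, 0), (2, 1), (1, 1), (1, 2), (0, 2)]
hull = convex_hull(l_shape)
area = polygon_area(l_shape)
l_solidity = solidity(area, polygon_area(hull))
l_circularity = circularity(area, polygon_perimeter(hull))
```

A perfect disc scores 1 on both measures; concave or elongated outlines score lower, which is what makes these features candidates for describing annotations such as sphericity or lobulation.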