Using Probabilistic Logic and the Principle of Maximum Entropy for the Analysis of Clinical Brain Tumor Data

Julian Varghese1, Christoph Beierle1, Nico Potyka1, Gabriele Kern-Isberner2
1 FernUniversität in Hagen, Germany
2 Technische Universität Dortmund, Germany
CBMS 2013

Abstract
Dealing with the uncertainty that is inherently present in any medical domain is one of the major challenges when designing a medical decision support system. We demonstrate how probabilistic logic can be used to design medical knowledge bases, using the analysis of clinical brain tumor data as an example. We use ME CoRe, a system implementing probabilistic conditional logic, to create a knowledge base BT that contains medical knowledge originating from statistical data as well as from medical experts. Any incomplete or unspecified knowledge is completed by ME CoRe in an information-theoretically optimal way by employing the principle of maximum entropy. BT is evaluated with respect to a series of queries regarding diagnosis and prognosis, using a real documented patient case.

1 Introduction

Knowledge-based systems encode human expert knowledge in a computer-readable way. They thereby aim at simulating human decision-making processes and at providing decision support for experts in the field. A particular challenge for medical expert systems is that they invariably have to deal with uncertain knowledge. This aspect was already addressed in MYCIN, one of the first and most popular medical expert systems, which can help physicians to establish the proper diagnosis and therapy for patients with infectious disease problems [15]. To deal with uncertain knowledge, its rules carry additional uncertainty factors. However, MYCIN's uncertainty factors are a heuristic approach that lacks a well-founded theory explaining their semantics. Another prominent example of a medical knowledge-based system is Pathfinder [4], which assists pathologists in the diagnosis of diseases in lymph nodes. It uses Bayesian networks for knowledge representation, which are based on a comprehensive probabilistic theory [7]. However, a disadvantage from a knowledge engineering perspective is that Bayesian networks are based on complete conditional probability distributions (CPDs), while often only parts of the CPDs are known. E.g., if we know that a certain disease D causes a certain symptom S with probability x, we have P(S|D) = x. However, to obtain a complete CPD, we also have to define P(S|¬D), i.e., the conditional probability of the symptom given that the disease is not present. Whereas the former probability can often be obtained from statistical data or can be estimated by a physician, the latter probability is usually harder to obtain.

The principle of maximum entropy (ME principle) [11, 6] offers an alternative way to develop a probability-based expert system. Conditionals like P(S|D) = x can be regarded as constraints on a probability distribution. Among all probability distributions satisfying all given constraints, the ME principle chooses the unique probability distribution P* that is the most unbiased one. Hence, the incomplete information represented by the given conditionals is completed to a full probability distribution over the domain of interest in an information-theoretically optimal way [11, 6], and P* can be used to answer arbitrary probabilistic-logical queries. SPIRIT [14] and ME CoRe [3] are software systems providing implementations of this framework.

Here, we report on a case study showing how probabilistic logic and the principle of maximum entropy can be used to model and reason about knowledge in a medical domain [17]. We look at the analysis of clinical brain tumor data and represent some core knowledge from this area in a knowledge base BT, using ME CoRe for constructing BT and illustrating ME CoRe's capabilities of modelling uncertain and incomplete information.
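The disease/symptom example from the introduction can be worked through numerically. The following is a minimal sketch, not the algorithm used by ME CoRe or SPIRIT: it assumes two binary variables D and S, a single conditional (S|D)[x] with the illustrative value x = 0.8, and a plain gradient descent on the dual of the entropy maximization problem. All names in the code are hypothetical.

```python
import math
from itertools import product

X = 0.8  # illustrative value for P(S|D); an assumption, not taken from the paper

# Possible worlds: every truth assignment (d, s) to disease D and symptom S.
WORLDS = list(product([True, False], repeat=2))

def feature(world):
    """Constraint function of the conditional (S|D)[X].

    Its expected value is P(D and S) - X * P(D), which is zero
    exactly when P(S|D) = X.
    """
    d, s = world
    if not d:
        return 0.0
    return 1.0 - X if s else -X

def max_entropy(worlds, f, steps=5000, rate=0.5):
    """Return the maximum entropy distribution satisfying E[f] = 0.

    The solution has the form p(w) ~ exp(lam * f(w)); lam is found by
    gradient descent on the convex dual log Z(lam), whose derivative is E[f].
    """
    lam = 0.0
    for _ in range(steps):
        weights = [math.exp(lam * f(w)) for w in worlds]
        z = sum(weights)
        expected_f = sum(wt * f(w) for wt, w in zip(weights, worlds)) / z
        lam -= rate * expected_f
    weights = [math.exp(lam * f(w)) for w in worlds]
    z = sum(weights)
    return {w: wt / z for w, wt in zip(worlds, weights)}

def conditional(p, target, given):
    """P(target | given) for predicates over worlds."""
    joint = sum(pr for w, pr in p.items() if given(w) and target(w))
    evidence = sum(pr for w, pr in p.items() if given(w))
    return joint / evidence

p_star = max_entropy(WORLDS, feature)
p_s_given_d = conditional(p_star, lambda w: w[1], lambda w: w[0])
p_s_given_not_d = conditional(p_star, lambda w: w[1], lambda w: not w[0])
print(round(p_s_given_d, 4), round(p_s_given_not_d, 4))
```

In this toy setting the given conditional is reproduced (P(S|D) is 0.8 up to numerical tolerance), while the unspecified probability P(S|¬D) comes out as 0.5, i.e., the distribution stays uncommitted where nothing was specified.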
Due to lack of space, we refer to [1] for details of ME CoRe and its underlying methodology.

978-1-4799-1053-3/13/$31.00 © 2013 IEEE

2 Brain Tumors

In this paper, we use the term brain tumor to refer to intracranial tumors, i.e., neoplasms localized in the brain or its meningeal tissues. A growing brain tumor causes two major clinical and neurophysiological problems. One is the local infiltration of tumor tissue, which destroys adjacent brain tissue. The other is the increase in global intracranial pressure, which leads to widespread brain damage. This is because the cranium can be regarded as a rigid box: after birth the cranial fontanels start to ossify, leaving the brain with very few pressure-releasing openings.

Clinical Relevance
The prevalence of brain tumors is about 50 per 100,000 in the Central European region [13]. The incidence is about 1 per 10,000 per year. There are two age peaks, one between 40 and 70 years of age and another in childhood. Notably, in childhood, brain tumors are the second most common tumor entity after leukemia. While adult patients mostly suffer from gliomas, meningiomas and metastases of other primary tumors, children mostly suffer from medulloblastomas, cerebellar astrocytomas and ependymomas [2]. The guiding symptoms are neurological deficits; the brain tumor itself is usually confirmed by medical imaging (CT or MRI scans). A histopathological tissue analysis confirms the diagnosis and determines the exact classification type and the grading of the brain tumor. Depending on the exact tumor type, the treatment consists of surgical removal and/or chemotherapy. In rare cases of very small and probably benign tumors, surgery can be avoided if repeated medical imaging over the following months shows no malignant potential.

Classification
The international WHO classification [8] lists all known types of brain tumors according to the origin of the tumor tissue, which is determined by histopathology. Subtypes refine the tumor types on the basis of cell analysis, for example by determining the tumor grading from morphological aspects of the neoplastic cells. Cell nucleus polymorphisms, atypical mitosis, vessel proliferation and necrosis are morphological features that indicate the malignancy of the tumor type. On this basis, a tumor is assigned one of four grades: Grade 1 – benign, Grade 2 – semi-benign, Grade 3 – semi-malignant, Grade 4 – malignant. The grading represents the malignancy potential of the tumor tissue independently of its current infiltration size.

3 Modelling Brain Tumor Knowledge

Variables
BT uses the nine propositional variables listed in Table 1 together with their possible values. The variable diagnosis indicates the diagnosis. Its domain corresponds to the most common brain tumor types like gliomas and meningiomas [12]. As the probability of certain values, such as the type of the tumor, depends on the age of the patient, the variable age classifies patients into three groups. The first group contains patients aged twenty or younger, the second group contains patients aged between twenty and eighty, and the third group contains patients aged eighty or older.

Table 1. Variables and their values
diagnosis – pilocytic-astrocytoma, diffuse-astrocytoma, anaplastic-astrocytoma, glioblastoma, oligodendroglioma, ependymoma, meningeoma, medulloblastoma, cranialnerve-tumor, metastatic-tumor, other
age – le20, 20to80, ge80
warningSymptoms – true, false
malignancy – 1, 2, 3, 4, other
icpSymptoms – true, false
ASA – 1, 2, 3, 4
therapy – conservative, surgical, none
complication – 1, 2, 3
prognosis – very good, good, intermediate, poor, very poor
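
The variable domains of Table 1 can be written down as a small data structure, which also makes the size of the underlying probability space explicit. This is only an illustrative sketch: the dictionary layout and the helper age_group are assumptions and not part of ME CoRe; the boundary handling in age_group follows the group descriptions above (twenty or younger, eighty or older).

```python
from math import prod

# Domains copied from Table 1; the dictionary layout itself is an assumption.
DOMAINS = {
    "diagnosis": [
        "pilocytic-astrocytoma", "diffuse-astrocytoma", "anaplastic-astrocytoma",
        "glioblastoma", "oligodendroglioma", "ependymoma", "meningeoma",
        "medulloblastoma", "cranialnerve-tumor", "metastatic-tumor", "other",
    ],
    "age": ["le20", "20to80", "ge80"],
    "warningSymptoms": [True, False],
    "malignancy": ["1", "2", "3", "4", "other"],
    "icpSymptoms": [True, False],
    "ASA": [1, 2, 3, 4],
    "therapy": ["conservative", "surgical", "none"],
    "complication": [1, 2, 3],
    "prognosis": ["very good", "good", "intermediate", "poor", "very poor"],
}

def age_group(years):
    """Map an age in years to one of the three values of the variable age."""
    if years <= 20:
        return "le20"
    if years >= 80:
        return "ge80"
    return "20to80"

# Number of possible worlds, i.e. complete assignments to all nine variables.
NUM_WORLDS = prod(len(values) for values in DOMAINS.values())
print(NUM_WORLDS)  # 118800
```

A full probability distribution over these nine variables thus has 118,800 entries, far more than can be specified by hand, which is why only a comparatively small set of conditionals is stated explicitly and the rest is left to the ME principle.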

The variable warningSymptoms indicates whether certain warning symptoms, like perceptual disturbances or unusual pain in the head, can be observed. Given the results of magnetic resonance tomography (MRT), the variable malignancy corresponds to the assumed malignancy of the tumor with respect to the WHO grading system [9]. A higher index corresponds to a higher malignancy. The variable icpSymptoms indicates whether the MRT results show signs of intracranial pressure (ICP). The preoperative physical fitness of patients is evaluated by the ASA (American Society of Anesthesiologists) physical status classification system, represented by the variable ASA. It is associated with perioperative risks: the higher the value, the higher the risk. We consider only the first four states, as treatment of a brain tumor is of low priority for higher values. The variable therapy denotes a possible therapy. We distinguish conservative therapy without surgery, surgery, and no therapy at all. Possible complications during an inpatient stay are expressed by the variable complication. We distinguish three degrees, where higher values correspond to more serious complications:
1 – no complications or minor, completely reversible complications like temporary pain after surgery;
2 – medium or heavy complications with uncertain reversibility like neurological or other functional disorders;
3 – life-threatening complications like serious internal bleeding or neurological deficits at the risk of brain death.
Finally, prognosis indicates the expected health of the patient after the inpatient stay.

Rules
The ME framework and its support by ME CoRe allow the definition of flexible conditional rules that can express statistical facts as well as subjective expert knowledge. Table 2 shows some empirical frequencies of certain brain tumor types collected from different published sources [2, 5, 10, 16].
By defining our knowledge base appropriately, all given probabilities can be integrated into our knowledge state.

Table 2. Empirical brain tumor frequencies
diagnosis                   Adults        Children
glioma                      50%           48%
- glioblastoma              15%           unspecified
- pilocytic-astrocytoma     unspecified   35%
- diffuse-astrocytoma       10%           unspecified
- anaplastic-astrocytoma    10%           unspecified
- oligodendroglioma         10%           unspecified
- ependymoma                4%            8%
meningeoma                  20%           unspecified
medulloblastoma             7%            25%
cranialnerve-tumor          7%            unspecified
metastatic-tumor            10%           unspecified
other                       unspecified   unspecified

E.g., for adults the diagnosis meningeoma has a relative frequency of 20%. We can model this empirical observation in ME CoRe by the conditional

    (diagnosis = meningeoma | !(age = le20))[0.2]    (1)

where ! is negation. As gliomas appear very frequently, they are further specialized in BT. Table 2 shows that for adults gliomas have a relative frequency of 50%. Using (diagnosis = glioma) as an abbreviation for the expression (diagnosis in {pilocytic-astrocytoma, diffuse-astrocytoma, anaplastic-astrocytoma, glioblastoma, oligodendroglioma, ependymoma}), we express the relative frequency of gliomas in ME CoRe as:

    (diagnosis = glioma | !(age = le20))[0.5]    (2)

Analogously, every other probability explicitly given in Table 2 induces a corresponding conditional. Additionally, BT contains probabilistic facts like (age = le20)[0.15], reflecting the age distribution in Germany in the year 2009. Note, however, that some frequencies are missing in Table 2. The missing knowledge is completed in an information-theoretically optimal way by employing the ME principle, i.e., by being as unbiased as possible with respect to each diagnosis with unspecified probability.

Besides available statistical data, another important knowledge source is the clinical expert knowledge of a physician.
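
To make the completion of the unspecified entries of Table 2 concrete, the following sketch considers each column in isolation. Under that simplifying assumption, maximizing entropy amounts to spreading the unassigned probability mass uniformly over the unspecified diagnoses, separately within the glioma group and outside it. This is not what ME CoRe computes for BT, where all other conditionals of the knowledge base also influence the result; the function and variable names are hypothetical.

```python
GLIOMAS = [
    "pilocytic-astrocytoma", "diffuse-astrocytoma", "anaplastic-astrocytoma",
    "glioblastoma", "oligodendroglioma", "ependymoma",
]
NON_GLIOMAS = [
    "meningeoma", "medulloblastoma", "cranialnerve-tumor",
    "metastatic-tumor", "other",
]

# Specified entries of Table 2; "unspecified" entries are simply left out.
ADULTS = {
    "glioma": 0.50, "glioblastoma": 0.15, "diffuse-astrocytoma": 0.10,
    "anaplastic-astrocytoma": 0.10, "oligodendroglioma": 0.10,
    "ependymoma": 0.04, "meningeoma": 0.20, "medulloblastoma": 0.07,
    "cranialnerve-tumor": 0.07, "metastatic-tumor": 0.10,
}
CHILDREN = {
    "glioma": 0.48, "pilocytic-astrocytoma": 0.35, "ependymoma": 0.08,
    "medulloblastoma": 0.25,
}

def complete_column(column):
    """Fill the unspecified diagnoses of one Table 2 column.

    Within the glioma block (total mass column["glioma"]) and the
    non-glioma block (the remaining mass), the probability not yet
    assigned is shared equally among the unspecified diagnoses.
    """
    completed = {}
    blocks = [(GLIOMAS, column["glioma"]), (NON_GLIOMAS, 1.0 - column["glioma"])]
    for members, block_total in blocks:
        unknown = [d for d in members if d not in column]
        leftover = block_total - sum(column[d] for d in members if d in column)
        for d in members:
            completed[d] = column[d] if d in column else leftover / len(unknown)
    return completed

adults = complete_column(ADULTS)
children = complete_column(CHILDREN)
for d in GLIOMAS + NON_GLIOMAS:
    print(f"{d:24s} {adults[d]:7.4f} {children[d]:7.4f}")
```

For the children column, for instance, the four unspecified glioma subtypes share the remaining 5% of the glioma mass (1.25% each), and the four unspecified non-glioma diagnoses share the remaining 27% (6.75% each). In the adults column the two unspecified entries are already determined by the other values (1% for pilocytic-astrocytoma and 6% for other).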
For example, for adults, Table 2 tells us that the most frequently appearing malignant tumor type is glioblastoma, but no information is provided about its probability given specific symptoms. An experienced physician working with brain tumor patients might state the following conditionals, expressing their expert beliefs about the probability of a glioblastoma given various observations (where gliob abbreviates diagnosis = glioblastoma):

    (gliob | !(age = le20) ∧ warningSymptoms)[0.2]      (3)
    (gliob | !(age = le20) ∧ icpSymptoms)[0.2]          (4)
    (gliob | !(age = le20) ∧ (malignancy = 4))[0.4]     (5)
    (gliob | !(age = le20) ∧ (malignancy = 3))[0.1]     (6)
    (gliob | !(age = le20) ∧ (malignancy = 2))[0.05]    (7)
    (gliob | !(age = le20) ∧ (malignancy = 1))[0.01]    (8)

Taking into account only Table 2, the probability of glioblastoma for adults is 15%. Therefore, given the respective preconditions, rules (3) - (5) increase the probability, whereas rules (6) - (8) decrease it. ME CoRe allows the smooth and easy integration of statistical knowledge and such expert knowledge, since rules like (3) - (8) can be added directly to BT. Altogether, BT contains 107 conditionals, and its overall semantics is well-defined by the ME principle.

4 Working with BT

We illustrate the use of BT with ME CoRe on the basis of a real documented patient case. Our patient is 80 years old and fully conscious. He shows signs of amnesic aphasia, complaining that on some occasions he is unable to remember simple words. He noticed a weakness in elevating his right foot and diffuse tingling sensations in the right lower limb. The physical neurological examination confirmed this and furthermore showed dilated, fixed pupils on both sides. There were no cranial nerve disorders or further physical neuropathological findings. The MRT reveals an irregularly formed mass with inhomogeneous and cystic contrast enhancement localized in the temporal lobe.
The mass is surrounded by a perifocal edema, causing a constriction of the left lateral ventricle and a midline shift. Preexisting illnesses: arterial hypertension.

Thus, for our patient the variable warningSymptoms must be set to true, and we can query the probability of a diagnosis as follows:

    BT.query((diagnosis | (age = ge80) ∧ warningSymptoms)).    (9)

    (diagnosis = other)[0.0111011134]
    (diagnosis = pilocytic-astrocytoma)[0.04996003353]
    (diagnosis = diffuse-astrocytoma)[0.09828440105]
    (diagnosis = anaplastic-astrocytoma)[0.10608281855]
    (diagnosis = glioblastoma)[0.22388719219]
    (diagnosis = oligodendroglioma)[0.08626747218]
    (diagnosis = ependymoma)[0.03960979843]
    (diagnosis = meningeoma)[0.15623784532]
    (diagnosis = medulloblastoma)[0.06524103534]
    (diagnosis = cranialnerve-tumor)[0.05725318378]
    (diagnosis = metastatic-tumor)[0.10607510624]

These results are reasonable from a medical point of view. In particular, given the observed symptoms, glioblastoma is indeed the most probable diagnosis from a neurosurgeon's perspective.

Subsequent MRT results for the patient provided evidence for a malignant tumor and for intracranial pressure. When working with ME CoRe, this can be taken into account by adding (malignancy = 4) and icpSymptoms as additional premises to query (9), yielding:

    BT.query((diagnosis | (age = ge80) ∧ warningSymptoms ∧ (malignancy = 4) ∧ icpSymptoms)).    (10)

Query (10) results in modified probabilities for diagnosis. In particular, with (diagnosis = glioblastoma)[0.5562571052], glioblastoma now clearly dominates, with an increased probability of more than 55%, which is again very plausible from a medical expert's point of view.

As the symptoms indicated a malignant tumor, it was surgically removed. Neuropathological results confirmed the diagnosis glioblastoma. After the operation, the amnesic aphasia worsened. The remaining neurological deficiencies were unaffected.
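
Since a query returns one probability per value of diagnosis, its answer can be post-processed directly. The sketch below takes the values reported for query (9), checks that they form a probability distribution, and derives the probability of the abbreviation (diagnosis = glioma) by summing over the six glioma subtypes. The dictionary is a hand-made transcription of the output above, not an ME CoRe data structure.

```python
# Answer to query (9), transcribed from the reported ME CoRe output.
QUERY_9 = {
    "other": 0.0111011134,
    "pilocytic-astrocytoma": 0.04996003353,
    "diffuse-astrocytoma": 0.09828440105,
    "anaplastic-astrocytoma": 0.10608281855,
    "glioblastoma": 0.22388719219,
    "oligodendroglioma": 0.08626747218,
    "ependymoma": 0.03960979843,
    "meningeoma": 0.15623784532,
    "medulloblastoma": 0.06524103534,
    "cranialnerve-tumor": 0.05725318378,
    "metastatic-tumor": 0.10607510624,
}

# Values abbreviated by (diagnosis = glioma) in the knowledge base.
GLIOMAS = [
    "pilocytic-astrocytoma", "diffuse-astrocytoma", "anaplastic-astrocytoma",
    "glioblastoma", "oligodendroglioma", "ependymoma",
]

total = sum(QUERY_9.values())                   # should be 1 up to rounding
p_glioma = sum(QUERY_9[d] for d in GLIOMAS)     # P(glioma | evidence of query (9))
most_probable = max(QUERY_9, key=QUERY_9.get)   # single most likely diagnosis

print(f"total = {total:.6f}, P(glioma) = {p_glioma:.4f}, top = {most_probable}")
```

For the evidence of query (9), the glioma subtypes together account for roughly 60% of the probability mass, with glioblastoma as the single most probable diagnosis.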
The new evidence can be expressed in ME CoRe using (diagnosis = glioblastoma), expressing the diagnostic findings, together with (therapy = surgical) and (complication = 2). We can ask ME CoRe for the probability of post-operative complications when taking this evidence into account with the query

    BT.query((complication | (diagnosis = glioblastoma) ∧ (age = ge80) ∧ warningSymptoms ∧ (malignancy = 4) ∧ icpSymptoms ∧ (therapy = surgical))).

for which we obtain the following results:

    (complication = 1)[0.00113586771]    (11)
    (complication = 2)[0.32888273171]    (12)
    (complication = 3)[0.66998140057]    (13)

While complications of grade 2 or 3 are rare in general, the evidence caused ME CoRe to raise the probabilities for these types of complications considerably. For the given patient, there was indeed a complication of grade 2, for which ME CoRe determined a probability of 33%. From a clinical perspective, the probabilities for complication computed by ME CoRe are an adequate warning. From a medical point of view, however, the probability for (complication = 3) is too pessimistic: in comparable patient-risk constellations, life-threatening complications are frequent but occur in fewer than 50% of cases. This indicates that a more fine-grained modelling of complications would be adequate. Further types of queries for BT are reported in [17]. Asking ME CoRe for the expected health of the patient after the inpatient stay returned a prognosis that is very realistic from a medical point of view [17].

5 Conclusion

From a knowledge representation point of view, dealing with the uncertainty that is inherently present in any medical domain is one of the major challenges when designing a medical decision support system. Combining logic with well-established concepts of probability theory provides a framework for modelling uncertain knowledge and for reasoning about it. In this paper, we reported on a case study in which we used probabilistic logic and the principle of maximum entropy in a medical domain.
We showed how to model clinical brain tumor data and medical expert knowledge from this area with probabilistic conditionals. The employed principle of maximum entropy completes any missing or unknown probabilities to a full probability distribution in an information-theoretically optimal way. Our current work includes extending BT by a more fine-grained modelling, involving more variables and rules, and further evaluating BT with respect to additional real world examples.

References

[1] C. Beierle, M. Finthammer, N. Potyka, J. Varghese, and G. Kern-Isberner. A case study on the application of probabilistic conditional modelling and reasoning to clinical patient data in neurosurgery. In Proc. ECSQARU 2013. Springer, 2013. (to appear).
[2] H. Bruch and O. Trentz. Berchthold Chirurgie, 6. Auflage. Elsevier GmbH, 2008.
[3] M. Finthammer, C. Beierle, B. Berger, and G. Kern-Isberner. Probabilistic reasoning at optimum entropy with the ME CORE system. In Proc. FLAIRS'09. AAAI Press, Menlo Park, Ca., 2009.
[4] D. E. Heckerman, E. J. Horvitz, and B. N. Nathwani. Toward normative expert systems: Part I. The Pathfinder project. Methods of Information in Medicine, 31(2):90–105, June 1992.
[5] N. Hosten and T. Liebig. Computertomografie von Kopf und Wirbelsaeule. Georg Thieme Verlag, 2007.
[6] G. Kern-Isberner. Characterizing the principle of minimum cross-entropy within a conditional-logical framework. Artificial Intelligence, 98:169–208, 1998.
[7] D. Koller and N. Friedman. Probabilistic Graphical Models: Principles and Techniques. MIT Press, 2009.
[8] D. Louis, H. Ohgaki, O. Wiestler, W. Cavenee, P. Burger, A. Jouvet, B. Scheithauer, and P. Kleihues. The 2007 WHO Classification of Tumours of the Central Nervous System. Acta Neuropathologica, 114(2):97–109, Aug. 2007.
[9] D. N. Louis, H. Ohgaki, O. D. Wiestler, W. K. Cavenee, P. C. Burger, A. Jouvet, B. W. Scheithauer, and P. Kleihues. The 2007 WHO classification of tumours of the central nervous system. Acta Neuropathologica, 114(2):97–109, 2007.
[10] M. Mueller. Chirurgie fuer Studium und Praxis, 9. Auflage. Medizinische Vlgs- u. Inform.-Dienste, 2007.
[11] J. Paris and A. Vencovska. In defence of the maximum entropy inference process. International Journal of Approximate Reasoning, 17(1):77–103, 1997.
[12] B. J. Park, H. K. Kim, B. Sade, and J. H. Lee. Epidemiology. In J. H. Lee, editor, Meningiomas: Diagnosis, Treatment, and Outcome, page 11. Springer, 2009.
[13] K. Poeck and W. Hacke. Neurologie. Springer DE, 2006.
[14] W. Rödder, E. Reucher, and F. Kulmann. Features of the expert-system-shell SPIRIT. Logic Journal of the IGPL, 14(3):483–500, 2006.
[15] E. H. Shortliffe. Computer-Based Medical Consultations: MYCIN. Elsevier, 1976.
[16] H.-J. Steiger and R. H.J. Manual Neurochirurgie. Ecomed Medizin, 2006.
[17] J. Varghese. Using probabilistic logic for the analysis and evaluation of clinical patient data in neurosurgery. B.Sc. Thesis, FernUniversität in Hagen, 2012. (in German).