W07 The Discovery Challenge on Thrombosis Data
Data Provider: Katsuhiko TAKABAYASHI, MD
Chiba University Hospital, Japan

Anti-Phospholipid Antibody Syndrome (APS)
Anti-cardiolipin antibodies (aCL)
Lupus anticoagulant (LAC)
Induces thrombotic events (such as AMI, stroke, deep venous thrombosis, miscarriage, pulmonary hypertension, etc.)
Sometimes positive in other collagen diseases (lupus, Sjoegren syndrome)

[Figure: coagulation cascade, showing TF, factors V, VII, VIII, IX, X, XI, XII, XIII and their activated forms, Protein C, PL, Ca++ and Mg++, prothrombin and thrombin, fibrinogen and fibrin]

[Figure: collagen disease, with SLE, APS, and RA]

Collagen Diseases
Autoimmune disease
Rheumatic disease
Connective tissue disease
Autoantibodies
APS

Thrombosis
Vessel stasis by blood clots
Myocardial infarction, stroke, etc.

Thrombosis
Who in APS? When?
Which laboratory data, as well as anti-cardiolipin antibodies, are related to thrombosis?

The Goal of this trial (1)
Assessment of validity of each study
Whether a data mining technique can correctly pick out the key factors (aCL, LAC, PT, APTT) already known to be related to thrombosis from the many variables we provided.

The Goal of this trial (2)
The results to expect
1) To identify high-risk patients who have no history of thrombosis so far.
2) To predict the time of thrombosis, or to detect changes in some variables in the course of thrombosis, from the series of temporal data.

Evaluation of the results
From the current medical point of view:
Common sense results (positive control)
Probable results
Possible results
Unclear results, difficult to evaluate
Nonsense results (negative control)
We cannot judge what we do not know!

A study most of whose results agree poorly with current knowledge: domain researchers cannot believe the rest of its unclear results!
A study most of whose results agree well with current knowledge: domain researchers still cannot say that its other unclear results are also true.
Assessment in domain field

Medical Data Set
The medical data set here covers 1241 patients with collagen diseases; 7 basic laboratory items for aCL from 806 cases were provided.
The temporal laboratory data comprise 41 items in 57,543 tests in total over 17 years.
Seventy-six cases had some thrombotic event in their clinical course.

Evaluations from medical aspects
Coursac I et al.
The bridge theory; genetic programming
It can predict patients' health state from spe-exams and lab-exams in 99.28% of cases.
CNS lupus is related to the anti-DNA Ab level and IgM-type aCL.
aCL IgM and anti-DNA Ab levels are each independently related to future thrombosis.

Evaluations from medical aspects
Boulicaut et al.
δ-strong classification rules
Many rules had 100% confidence, but most of them were not useful.
One rule: aCL > 2.4, aCL IgM in the range 1.9 to 2.7, and KCT (-) implies SLE.
Another rule: sex is M and ANA is 0 implies Behcet.
We would like to look at the other rules not written here to find attractive ones.

Evaluations from medical aspects
Jensen S et al.
CRISP (cross-industry standard process)
LAC, ANA, U-pro, centromere-type, SSA, SSB, RNP, SM, and Scl-70 were strong contributors to predicting the presence of thrombosis.
This suggests other possibilities of thrombosis without aCL antibodies.

Evaluations from medical aspects
Jensen S et al.
Sequential analysis of the temporal data did not show interesting results.
It might be difficult to predict the time of thrombosis.
One possibility is that the data were modified by treatment or prophylaxis.

Bias
By physicians: modification of treatment; selection of the cases and laboratory data

Evaluations from medical aspects
Werner J and Fogarty T
Genetic programming determined a discriminant function that separates occurrences of thrombosis with very few false negatives.
However, ..... is it possible to translate its meaning so that we can understand it? Weighting?
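The remark above that many rules reached 100% confidence yet were not useful can be illustrated with a small sketch. It computes support (how many patients a rule matches) and confidence (the fraction of matched patients with the predicted diagnosis) for one rule. The records, the field names `sex`, `ana`, and `diagnosis`, and the helper `rule_stats` are all hypothetical and are not taken from the challenge data or from any participant's method.

```python
from typing import Callable, Dict, List, Tuple

Record = Dict[str, object]


def rule_stats(
    records: List[Record],
    antecedent: Callable[[Record], bool],
    consequent: Callable[[Record], bool],
) -> Tuple[int, float]:
    """Return (support, confidence) for the rule 'antecedent implies consequent'.

    support    = number of records matching the antecedent
    confidence = fraction of those records that also match the consequent
    """
    matched = [r for r in records if antecedent(r)]
    if not matched:
        return 0, 0.0
    hits = sum(1 for r in matched if consequent(r))
    return len(matched), hits / len(matched)


# Invented toy records, for illustration only.
patients: List[Record] = [
    {"sex": "M", "ana": 0, "diagnosis": "Behcet"},
    {"sex": "F", "ana": 0, "diagnosis": "RA"},
    {"sex": "F", "ana": 64, "diagnosis": "SLE"},
    {"sex": "F", "ana": 256, "diagnosis": "SLE"},
    {"sex": "F", "ana": 16, "diagnosis": "RA"},
]

support, confidence = rule_stats(
    patients,
    antecedent=lambda r: r["sex"] == "M" and r["ana"] == 0,
    consequent=lambda r: r["diagnosis"] == "Behcet",
)
print(f"support={support} of {len(patients)}, confidence={confidence:.0%}")
```

In this toy set the rule matches a single patient, so its 100% confidence says very little. Reporting support next to confidence is one plausible way to help domain researchers set aside such rules quickly.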
When the Results Go Beyond the Experts' Knowledge
Complicated relations might be difficult to explain.
Even for drugs, relations among three items have not been tried.
Results that come through a black box might be ignored by the experts simply because the experts cannot understand them!

Evaluations from medical aspects
Zytkow J and Gupta S
SQL; cross contingency classification
Reasonable results, as with InfoZoom.
ANA pattern analysis
Patients with severe attacks are more likely to have other attacks.
Thrombosis is related to the level of aCLs.
Alveolar hemorrhage and CNS attacks are not associated with milder attacks.

Evaluations from medical aspects
Beilken and Spenke (InfoZoom): thanks to the user-friendly interface, their test results are easy to understand. They could choose the reasonable and interesting rules.
Levin: WizWhy produced 7356 rules. Complicated rules are difficult to comment on because of their complexity.
Taylor: missing data in the temporal data disturbed the analysis. Only common sense findings were selected.

To obtain the good results efficiently (1)
Cleaning of data
Preprocessing of the data by domain researchers who are concerned with the database is essential to minimize noise.
Definition, classification, adjustment, etc.
Recognition of modification by treatment or prophylaxis.
Indication of how to treat missing data.

To obtain the good results efficiently (2)
Introduction of the domain knowledge
To include as much medical knowledge as possible with the data set from the beginning.
To cooperate with domain researchers to obtain domain knowledge during data mining.
Causal Relation
Misjudgment of temporal meaning
Backward and non-objective relationships
Bacteria invade; pneumonia occurs.
Bacteria have invaded; pneumonia occurs.
Bacteria will invade; pneumonia occurs.

To obtain the good results efficiently (3)
Cooperation with domain researchers
An interactive technique will avoid users' discontent with a black box and help steer the analysis in the right direction.
A hypothetico-deductive method will be easily accepted by physicians.

Causal Relation
Misjudgment of temporal meaning
Backward and non-objective relationships
It rains; the road is wet.
It rained; the road is wet.
It will rain; the road is wet.

Data mining
Retrospective approach: not arranged, much noise.
Data: a more genuine and adequate data set must be prepared. Terms, definitions, and background must be introduced beforehand.
Rules: complicated rules (relations among more than 3 items) found by this analysis can be neither explained nor proved true from a medical approach. (There are no clinical trials covering three drugs.)

Bias
By physicians: modification of treatment; selection of the cases and laboratory data
By accident: change of the disease before and after the events (thrombosis)