Preliminary Results of Data Mining in Epidemiology

Jeanette Ann Auer, Dr. John Zeleznikow, Dr. Graeme K Hart M.B., B.S., FANZCA, FFICANZCA. La Trobe University, Bundoora, Victoria, Australia.

Summary

Patient data collected as part of on-going research into outcomes in Australian Intensive Care Units (ICUs) forms a domain rich in complex relationships between variables. This project, and on-going work, is aimed at understanding Knowledge Discovery in Databases (KDD) when applied to the investigation of patient mortality. This paper describes some of the gross class characteristics of the ANZICS database and compares Receiver Operating Characteristic (ROC) curve analyses of patient mortality calculated by the Acute Physiology and Chronic Health Evaluation (APACHE II) score (Knaus, Draper et al. 1985) when applied to differing hospital types.

Keywords: knowledge discovery, data mining, severity of illness, intensive care unit, ROC analysis.

The ANZICS Database

The Australian and New Zealand Intensive Care Society (ANZICS) has developed a comprehensive database of adult Intensive Care Units (ICUs) covering the period from 1992. The database contains information regarding more than 200,000 patient admissions to 110 ICUs in the six states and two territories of Australia and a further 8 ICUs in New Zealand. In the ANZICS database an admission is defined as a single admission to an ICU; thus multiple admissions to an ICU within a single hospitalization episode are recorded separately, each with its own outcome. The hospital types with which the ICUs are associated are notionally referred to as metropolitan, tertiary, rural or regional, and private.

The Australian hospital system is made up of two broad categories of hospital: those referred to as public hospitals and those referred to as private. Public hospitals are funded predominantly from public funds and usually have long waiting lists for elective surgery; emergency admissions, however, are handled as quickly as resources allow. Private hospitals, on the other hand, are funded by a mix of government funds and other private funding, including charges and/or associations with the major medical insurance groups. The rural or regional hospitals are those located outside the major metropolitan centers. The tertiary hospitals are the major hospitals within the capital cities; these often also function as teaching hospitals. The metropolitan hospitals are the other, smaller hospitals within the bounds of the capital cities. The problem with notional hospital types is that they are not mutually exclusive categories, and examples have been seen in the database of a single hospital being coded as either private or rural.

The current report concerns a subset of the total ANZICS database; some 95,000 records are considered as part of this discussion. The complete set comprises more than 200,000 records and is currently being preprocessed for knowledge discovery. Summaries of selected variables characteristic of the ANZICS database are shown below.

Gross characteristics of data in the ANZICS database

One of the main features of the ANZICS database is that the contributing hospitals represent a range of hospital types, both public and private, and at differing distances from the major metropolitan centers. What the domain experts expect is that there will be some measurable differences in outcomes, particularly between the public and private hospitals.
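Before turning to those summaries, the following is a minimal sketch of how such per-hospital-type summaries might be computed, assuming a hypothetical flat extract of the admissions table. The file name and the columns hospital_type, elective_surgery, died and apache2 are illustrative stand-ins, not the actual ANZICS field names.

```python
import pandas as pd

# Hypothetical flat extract of the ANZICS admissions table; real field names differ.
admissions = pd.read_csv("anzics_admissions.csv")

# Number of admissions per notional hospital type (cf. Figure 1d).
print(admissions["hospital_type"].value_counts())

# Percentage of elective surgery admissions per hospital type (cf. Figure 1a),
# assuming elective_surgery is coded as 0/1 or True/False.
print(admissions.groupby("hospital_type")["elective_surgery"].mean().mul(100).round(1))

# Crude percentage mortality at each APACHE II score (cf. Figure 1c).
print(admissions.groupby("apache2")["died"].mean().mul(100).round(1))
```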
One of the drawbacks of data mining, though, is that patterns cannot be found if the supporting data is not present in the database. Here the problem is that some would like to fix on apparent differences between public and private hospitals without a complete understanding of the underlying factors, some of which may not even be captured in the database. For example, Figure 1a shows that 50% of admissions to an ICU in a private hospital result from elective surgery; the breakdown and type are currently under consideration. What is not represented in the database, however, are the socio-economic factors that impact the way in which patients are admitted to hospitals, and to which hospitals they are admitted.

[Figure 1. Selected attributes of the ANZICS database. A) Percentage of elective surgery admissions by hospital type; B) age distribution of admissions; C) APACHE II score versus percentage survival; D) number of admissions to each hospital type.]

Figure 1b shows the distribution of the ages of patients admitted to ICUs in Australia. Australia is beginning to show the effects of an aging population and this is reflected in the ICU admissions. As expected, as the APACHE II score increases the chance of survival decreases (Figure 1c). Figure 1d indicates the number of admissions to ANZICS database hospitals between 1992 and 1998. The revised database currently being preprocessed contains records from 1992 to 1999, the additional year's data accounting for another 100,000 records as further ICUs join the ANZICS group.

Concepts involved in Data

The ANZICS database exhibits the classic features associated with general databases used in KDD. The database is large, with over 200,000 records. The database contains over 130 fields, some of which are core data, data recorded about a particular patient, and others that are derived according to some formulation, for example the APACHE II score. The types of values available for data mining include:

1. Binary values, where the actual field takes the value true or false. Whilst this is the easiest type of value to handle, some understanding of the domain is necessary, particularly when interpreting null values.

2. Categorical data values, such as the APACHE II diagnosis categories, e.g. non-operative respiratory infection, non-operative respiratory neoplasm, other respiratory. A full list can be found in (Knaus, Draper et al. 1985). Values in these fields are members of a finite set of available choices.

3. Continuous data, such as heart rate. Values in these fields can take any value, usually between two nominated points.

4. Variables which, for the purpose of this research, can be considered "point accurate variables".
These are variables for which it is reasonable to expect that any trained person taking the measurement will obtain the same value, within the set of allowable values, at the same point in time; for example, heart rate should be considered a point accurate data value.

5. Variables which are hierarchically fuzzy. For example, the diagnosis category can be shown to be hierarchically fuzzy using the example of an admission diagnosis of organ failure = respiratory system. Diagnosis itself is a fuzzy variable, in that two admitting surgeons, staff officers, etc. will not necessarily nominate the same disease category. The idea of hierarchical fuzziness encompasses those diagnoses that fall within a hierarchy themselves, e.g. respiratory infection ISA respiratory system failure, respiratory neoplasm ISA respiratory system failure. Both diagnoses may be more precise than simply respiratory system failure, but the more tightly defined diagnosis may not be any more accurate, or of any greater prognostic value, than the higher classification.

The differing aims of prognostication and stratification

Probability of mortality measures are generally based on some form of severity score or stratification index; examples include APACHE II (Knaus, Draper et al. 1985), APACHE III (Knaus, Wagner et al. 1991), MPM II (Lemeshow, Teres et al. 1993), ODIN (Fagon, Chastre et al. 1993), SAPS II (Le Gall, Lemeshow et al. 1993), MODS (Marshall, Cook et al. 1995), and LODS (Le Gall, Klar et al. 1996). While using mortality as the predictive outcome, each of these models seeks to define a severity index or score that characterizes a particular patient. That score is then used to calculate the probability of mortality for the patient. Myles et al. (1994) stress that outcome studies designed to predict individual risks, as opposed to group risks, need to be based on very large database populations. The current models all suffer from problems when attempting to give an accurate point estimate of the probability of mortality for any given patient (Wong and Knaus 1991); this is not to say that they are without value. The fact that they are based on severity indices allows them to be used as a metric for the controlled evaluation of care outcomes and the assessment of new therapeutic measures (Krischer 1976; Knaus, Zimmerman et al. 1981). A severity of illness index can be defined as a scale that assesses the probability of morbidity (Krischer 1976; Krischer 1979). Higher accuracy of prognostication should follow from the increasing size of patient databases. It is at this boundary that KDD meets medical research.

What is Knowledge Discovery in Databases?

Research, or knowledge discovery, in any form is an iterative and interactive process. KDD is the formalization of the research process as applied to the extraction of knowledge from databases. Interest in this field of research has been growing steadily since the first conference on KDD in 1989. One of the central reasons for the increased interest in KDD is the change in the views held concerning data analysis. Decker and Focardi (1995) identify three factors that have changed the view of data analysis, leading to the expansion of interest in KDD and DM. The first factor relates to the increased availability of high-powered computing at low cost. Current personal computers have as much, if not more, computing power than the mainframes of the 1960s and 1970s.
Hand in hand with this is the falling cost of data storage, allowing the accumulation of large volumes of data. The second factor is the explosive rate at which data is accumulating, in areas ranging from the human genome project (Miyano 1995) to business (Nazem and Shin 1999) and science (Fayyad 1997). Of note, medical research is a relative newcomer to the domains normally considered under the umbrella of KDD, although an allusion to medical knowledge discovery is found in Miller et al. (1993) when they discuss the growth in patient record databases. Tafeit and Reibnegger (1999), when discussing the growth in neural network use in the medical domain, show that approximately 100 articles were published in 1991 in journals normally abstracted by MEDLINE; a search of INSPEC for the same period shows more than 3000 articles. A further search of INSPEC found only 163 articles combining "medical" and "data mining", of which 100 were published in 1999. Much of the data accumulation is a result of the increased reliance on automated data gathering devices (Decker and Focardi 1995). The final factor is the new set of methods developed by members of the artificial intelligence (AI) community, statisticians and physicists. These methods are computationally expensive and rely on the computing power now available to be performed. While Decker and Focardi (1995) identify three factors that have affected the data analysis panorama, the key factor is the availability of increased computational power, both in terms of data processing and data storage.

The specific definition of KDD is the non-trivial extraction of previously unknown and potentially useful information from data (Frawley, Piatetsky-Shapiro et al. 1991; Agrawal, Imielinski et al. 1993; Fayyad, Piatetsky-Shapiro et al. 1996). KDD is generally represented by one of three overall models: the system model (Matheus, Chan et al. 1993), the KDD environment model (Brachman and Anand 1996), and the process model (Fayyad 1996; Fayyad, Piatetsky-Shapiro et al. 1996). A number of authors regard KDD in terms of the overall process involved in knowledge discovery (Fayyad 1996; Fayyad, Piatetsky-Shapiro et al. 1996; Mannila 1996; Simoudis 1996). A common theme of all process models is that KDD is an iterative process; the level of interaction in the process is one of the points of difference between the various authors. The core of each of the definitions remains the DM, or pattern extraction, phase. The process of discovering patterns should therefore contain the following steps.

1. The user needs an understanding of the domain. Domain knowledge is a prerequisite for extracting useful knowledge from data, and the KDD user must have some understanding of the application area before valuable information can be obtained. If, on the other hand, very good human experts exist, it may be difficult for semi-automatic tools to derive any novel information.

2. The data set must be prepared for KDD. Preparation of the data set involves the selection of data sources, the integration of heterogeneous data, cleaning the data, assessing noise levels, and dealing with missing values. According to Mannila (1996), this step can take up to 80% of the time associated with the KDD process. This is in contrast with (Fayyad 1996; Fayyad, Piatetsky-Shapiro et al. 1996), where it is stated that a significant portion of the KDD effort can go into the proper formulation of the problem. This is not to say that Mannila (1996) is incorrect; rather, this difference is a result of the emphasis by Fayyad et al. (1996) on breaking the process down into discrete steps.
3. The KDD system should discover any useful and interesting patterns. This is the phase commonly referred to as data mining.

4. Discovered patterns must be post-processed. This step is the same as the interpretation phase of Fayyad et al. (1996).

5. The results must be put to use.

One of the most detailed descriptions of KDD under the process model was proposed by Fayyad et al. (Fayyad, Piatetsky-Shapiro et al. 1996), who defined KDD as an iterative and interactive process consisting of nine steps, outlined below.

1. Learning the application domain. This step involves acquiring relevant prior knowledge of the domain and the goal of the particular application of KDD to the domain. Sufficient domain knowledge is required, as current KDD techniques are still reliant on considerable user input. Future developments in KDD should see the emergence of semi-intelligent engines that will free the user from having to fully guide the KDD process.

2. Creating the target dataset. In KDD, the creation of the target dataset does not always mean creation of a new data set; more often this step involves focusing on which subset of variables or data samples from an existing database is appropriate for the current KDD application.

3. Data cleaning and preparation. This step covers those areas associated with preparing the data for knowledge discovery (a sketch of this kind of preparation applied to the ANZICS data appears at the end of this section) and includes:
3.1. removing outliers and noise if appropriate, or collecting the information necessary to account for outliers and noise,
3.2. developing strategies to account for missing data, time sequences and known changes within the data, and
3.3. dealing with database management system (DBMS) issues such as field types and mappings to missing, unknown or dependent values.

4. Data reduction and projection. Depending on the goal of the current KDD process, one must find the useful features of the dataset and, using transformation or reduction methods, minimize the number of variables under consideration.

5. Choosing the function of data mining. Choose broadly the purpose of the current KDD exercise: is it to classify the dataset into discrete subsets, summarize the dataset, perform regression analysis on the data, or find clusters within the data? The function of data mining is further expanded in the sixth step as part of choosing the specific algorithms.

6. Choosing the data mining algorithm(s). The data mining algorithms consist largely of a varying mix of three specific components, described below.
6.1. The model component consists of those parameters that are to be determined from the data. There are two factors to the model: the function of the model and the representational form of the model. The function of the model describes which of a group of broad classifiers, such as classification or clustering, the model will represent. The representational form of the model describes how the model will be represented.
6.2. The preference criterion component is a measure of preference for one model over another. The criterion, often a goodness-of-fit function, seeks to avoid problems associated with overfitting or over-generalization.
6.3. The search method consists of two components: parameter search and model search. The parameter search algorithm must search for the parameters that will optimize the preference criterion given the data and a fixed model representation. The model search occurs as a loop over the parameter search method; the model representation is changed to allow a family of models to be considered. For each model representation, the parameter search method is instantiated to evaluate the quality of the model.

7. The actual application of the data mining algorithms to the dataset. In this phase the actual exercise of data mining is performed.

8. Interpretation of the discovered patterns in the context of the KDD exercise. Interpretation involves aspects of post-processing of patterns and, if necessary, revisiting the DM phases to expand or refine the discovered patterns.

9. Putting the newly discovered knowledge to use, whether this use is: incorporation of the knowledge back into the discovery system; taking some action as a result of the discovered pattern; reporting and documenting the patterns; or validating the pattern in the light of previous knowledge, which includes investigating any contradiction (Fayyad, Piatetsky-Shapiro et al. 1996).

The actions associated with data mining have historically been referred to by a number of names, including knowledge extraction, data dredging, data analysis (Chen, Han et al. 1996), pattern analysis, information harvesting (Piatetsky-Shapiro 1994), information discovery, data archeology and data pattern processing (Fayyad, Piatetsky-Shapiro et al. 1996). All of these phrases imply the extraction of patterns of information from data, though there are negative connotations associated with some, especially data dredging. DM as seen today can be characterized as a set of methods for finding logical or mathematical descriptions of regularities in any database (Decker and Focardi 1995). Practical limitations on this are the assumptions that the generalized model of the database can be approximated through some relatively simple computational model and that there is sufficient data in the database to facilitate the generalization (Decker and Focardi 1995). Thus, the existence of a database is an essential feature of DM. A database in its simplest form is a collection of related data, where each point of data is some fact describing an attribute of a real world entity of interest to the database (Vickery 1997). Implicit in the idea of a database are the following properties.

1. The database represents some portion of the real world.

2. The data is organized in some logically coherent manner, with some inherent meaning in the structure of the data; a random collection of data is not a database.

3. The database has a specific purpose and audience. The data that populates the database relates to the purpose of the database.

The motivation for the ANZICS database has been the standardization of data collection by ICUs in Australia and New Zealand, with the strongest motive being a desire to compare outcomes from different treatment regimes, either within the one unit or between a number of units. Additionally, there is a desire to increase the sum of knowledge regarding specific diseases, outcomes and work practices. To this end the database is populated with records pertaining to patient outcomes when admitted to an ICU.
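To make the data preparation and reduction steps concrete, the sketch below applies the kind of target selection, missing value handling, outlier screening and simple projection described in steps 2-4 to the same hypothetical flat extract used earlier. The field names, cut-offs and imputation choices are assumptions for illustration only, not the ANZICS schema or the preprocessing rules actually used in this project.

```python
import pandas as pd

# Hypothetical extract of the ANZICS admissions table; real field names and coding differ.
raw = pd.read_csv("anzics_admissions.csv")

# Step 2: create the target dataset by focusing on a subset of variables.
cols = ["hospital_type", "age", "apache2", "heart_rate", "elective_surgery", "died"]
target = raw[cols].copy()

# Step 3: cleaning and preparation.
target = target.dropna(subset=["died"])                 # the outcome of interest must be known
implausible = (target["heart_rate"] <= 0) | (target["heart_rate"] > 350)
target.loc[implausible, "heart_rate"] = float("nan")    # treat impossible readings as missing
target["heart_rate"] = target["heart_rate"].fillna(target["heart_rate"].median())

# Step 4: a simple projection, binning age into the five-year groups of Figure 1b.
target["age_group"] = pd.cut(target["age"], bins=range(15, 95, 5), right=False)

print(target.describe(include="all"))
```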
The General Statistics of Prognostication

Four statistics are traditionally used to describe the utility of a diagnostic test (Myles, Williams et al. 1994): sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV). Sensitivity is a measure of the true positive rate: the proportion of positive samples correctly identified as positive by the test. Specificity, on the other hand, is a measure of the true negative rate and shows the proportion of negative samples correctly identified as negative. The positive predictive value is the proportion of true outcomes among those predicted to be true, and similarly the negative predictive value is the proportion of negative events among those predicted to be negative (Myles, Williams et al. 1994). The aim of any model is to minimize the false negative rate (type II errors): those samples predicted to have a positive outcome that actually have a negative outcome, for example predicting survival of a patient who subsequently dies. The problem is that the negative results are generally rare events, requiring extremely large databases to minimize the likelihood of type II errors (Myles, Williams et al. 1994). These statistics can also be used to describe the accuracy of prognostic systems. Sensitivity and specificity are particularly important, as they are the base statistics for receiver operating characteristic (ROC) curve analysis (Miller, Langefeld et al. 1993).

ROC curves are increasingly being used to judge the discriminating ability of various statistical methods (Hanley and McNeil 1982). One of the underlying concepts of the ROC curve that makes it very applicable to prognostic studies is that, given the available data, an operator is forced to choose between two alternative decisions (Hanley and McNeil 1982). For a given confidence threshold the operator attempts to classify the pattern as one of the outcome states; probability distributions can then be derived for the various truth states (Swaving, Houwelingen et al. 1996). In the case of prognostic scores, the operator is the scoring system itself and the confidence threshold is the probability of the specified outcome according to the scoring system. Numerous examples of ROC curve analysis in the medical prognostic literature can be cited, including APACHE (Knaus, Zimmerman et al. 1981), APACHE II (Knaus, Draper et al. 1985), ODIN (Fagon, Chastre et al. 1993), SAPS II (Le Gall, Lemeshow et al. 1993) and the Logistic Organ Dysfunction System (LODS) (Le Gall, Klar et al. 1996), whereas a recent search on INSPEC showed only three examples of ROC curve analysis applied to domains other than the medical domain. The non-medical citations are acoustics (Pflug, Ioup et al. 1992), LADAR imaging (Keller, Gader et al. 1998) and object recognition (Shin, Goldgof et al. 1999).

As an initial investigation, ROC curves were plotted for each of the four hospital types; the result is shown in Figure 2. Further research is indicated in an attempt to clarify whether the apparent differences between the curves are significant and, if so, why. The probability of mortality equation of the APACHE II score (Knaus, Draper et al. 1985) is adjusted for admission category, so in theory other factors must be affecting the outcomes between the hospital groups.

[Figure 2. ROC curves of APACHE II scores for differing hospital types (tertiary, metropolitan, rural, private): true positive rate (sensitivity) versus false positive rate (1 - specificity).]
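These definitions translate directly into a short computation. The sketch below shows one way the four statistics and an empirical ROC curve could be derived from predicted probabilities of death (for example, APACHE II risk of death) and observed outcomes. The data generated here are synthetic placeholders, not ANZICS records; per hospital type curves would simply repeat the calculation on each subgroup.

```python
import numpy as np

def confusion_stats(prob_death, died, threshold=0.5):
    """Sensitivity, specificity, PPV and NPV at a single probability-of-death threshold."""
    predicted_death = prob_death >= threshold
    tp = np.sum(predicted_death & died)
    tn = np.sum(~predicted_death & ~died)
    fp = np.sum(predicted_death & ~died)
    fn = np.sum(~predicted_death & died)
    return {
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
    }

def roc_points(prob_death, died, thresholds=np.linspace(0.01, 0.99, 99)):
    """Empirical ROC curve as (false positive rate, sensitivity) pairs over thresholds."""
    return [
        (1 - s["specificity"], s["sensitivity"])
        for s in (confusion_stats(prob_death, died, t) for t in thresholds)
    ]

# Synthetic illustration only: random risks with outcomes loosely consistent with them.
rng = np.random.default_rng(0)
prob_death = rng.uniform(0.0, 1.0, 2000)
died = rng.uniform(0.0, 1.0, 2000) < prob_death
print(confusion_stats(prob_death, died))
print(roc_points(prob_death, died)[:3])
```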
The following section leads into a discussion of work in progress.

Why does APACHE lend itself to modeling with neural network tools?

Clinical decisions, like most decision-making processes, are rarely made on the basis of a single factor. In medicine it is rare for a single symptom or measurement to be pathognomonic for a single disease (Cross, Harrison et al. 1995). Clinicians regularly perform multidimensional analysis, applying previously observed patterns to the pattern under current observation. Of interest in a KDD sense is whether the multidimensional analysis performed by clinicians can be modeled using the tools of KDD. To this end a series of neural networks is currently being trained.

The APACHE II severity index (Knaus, Draper et al. 1985) is a widely used, well understood and well validated means of stratifying patients in ICU on the basis of the degree of physiological derangement from the norm (Wong and Young 1999). The transformations applied by the APACHE II model itself can be shown to be non-linear, disjoint transformations of patient data. For any of the physiological characteristics essential to the APACHE II score, the transformation is non-linear. Consider the transformation applied to rectal temperature: if the mid-point of each range is plotted against the APACHE II value for that range, the plot is non-linear (see Figure 3), even when the high abnormal ranges are represented as negative APACHE II values. In general the APACHE II values form the ordered set {4, 3, 2, 1, 0, 1, 2, 3, 4}, with ranges of values for each of the physiological attributes mapped into this set.

[Figure 3. APACHE II value against the mid-point of the rectal temperature ranges, with the high abnormal ranges represented as negative values.]

The ranges for each transformation do not necessarily abut each other, leading to a disjoint transformation sequence. For example, the transformations for rectal temperature are shown in Table 1.

Table 1. Transformation of rectal temperature (°C) to APACHE II points (high abnormal range on the left, low abnormal range on the right; no range carries +2 on the high abnormal side).

Points:     +4    +3        +2   +1         0         +1        +2        +3        +4
Temp (°C):  >41   39-40.9   -    38.5-38.9  36-38.4   34-35.9   32-33.9   30-31.9   <=29.9

The size of the transformation step from 38.9°C to 39°C is not the same as the step between any other ranges, so we have a disjoint transformation with all the added problems of boundary classifiers. The fact that the APACHE II score is a non-linear, disjoint transformation of the underlying data, combined with the continuous nature of the underlying variables, indicates that it would be very difficult to learn this score using techniques poorly suited to handling this type of data. Further work is planned to investigate the differences in the ROC curves and the applicability of a neural network model to the ANZICS database.
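The temperature mapping in Table 1 can be made concrete with a small function. The sketch below is illustrative only: it encodes just the rectal temperature bands of Table 1, not the full APACHE II scoring algorithm, and the function name is invented for this example.

```python
def apache2_temperature_points(temp_c: float) -> int:
    """APACHE II points for rectal temperature (degrees Celsius), following Table 1.

    The mapping is non-linear (points do not track temperature) and disjoint:
    crossing 38.9 -> 39.0 jumps from +1 to +3, since no +2 band exists on the
    high abnormal side.
    """
    if temp_c > 41.0:             # high abnormal extreme
        return 4
    if temp_c <= 29.9:            # low abnormal extreme
        return 4
    bands = [
        (39.0, 40.9, 3),
        (38.5, 38.9, 1),
        (36.0, 38.4, 0),
        (34.0, 35.9, 1),
        (32.0, 33.9, 2),
        (30.0, 31.9, 3),
    ]
    for low, high, points in bands:
        if low <= temp_c <= high:
            return points
    # Values such as 38.95 fall between the published bands: the ranges do not
    # abut, which is the disjointness discussed in the text.
    raise ValueError(f"temperature {temp_c} falls between the coded ranges")

print(apache2_temperature_points(38.9), apache2_temperature_points(39.0))  # 1 3
```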
References

Agrawal, R., T. Imielinski, et al. (1993). “Database mining: a performance perspective.” IEEE Transactions on Knowledge and Data Engineering 5(6): 914-25.

Brachman, R. J. and T. Anand (1996). The Process of Knowledge Discovery in Databases. Advances in Knowledge Discovery and Data Mining. U. M. Fayyad, G. Piatetsky-Shapiro, P. Smyth and R. Uthurusamy. Menlo Park, Calif., AAAI/MIT Press: 37-57.

Chen, M.-S., J. Han, et al. (1996). “Data mining: an overview from a database perspective.” IEEE Transactions on Knowledge and Data Engineering 8(6): 866-83.

Cross, S. S., R. F. Harrison, et al. (1995). “Introduction to neural networks.” The Lancet 346: 1075-1079.

Decker, K. M. and S. Focardi (1995). Technology Overview: A Report on Data Mining. Manno, Swiss Scientific Computing Centre.

Fagon, J. Y., J. Chastre, et al. (1993). “Characterization of Intensive Care Unit Patients Using a Model Based on the Presence or Absence of Organ Dysfunctions and/or Infections: The ODIN Model.” Intensive Care Medicine 19: 137-144.

Fayyad, U., G. Piatetsky-Shapiro, et al. (1996). “From data mining to knowledge discovery in databases.” AI Magazine 17(3): 37-54.

Fayyad, U. M. (1996). “Data mining and knowledge discovery: making sense out of data.” IEEE Expert 11(5): 20-5.

Fayyad, U. M. (1997). Data Mining and Knowledge Discovery in Databases: Implications for Scientific Databases. Proceedings, Ninth International Conference on Scientific and Statistical Database Management, Olympia, Washington, IEEE.

Fayyad, U. M., G. Piatetsky-Shapiro, et al. (1996). From Data Mining to Knowledge Discovery: An Overview. Advances in Knowledge Discovery and Data Mining. U. M. Fayyad, G. Piatetsky-Shapiro, P. Smyth and R. Uthurusamy. Menlo Park, Calif., AAAI Press: 1-34.

Frawley, W. J., G. Piatetsky-Shapiro, et al. (1991). Knowledge Discovery in Databases: An Overview. Knowledge Discovery in Databases. G. Piatetsky-Shapiro and W. J. Frawley. Cambridge, Mass., AAAI/MIT Press: 1-27.

Hanley, J. A. and B. J. McNeil (1982). “The Meaning and Use of the Area under a Receiver Operating Characteristic (ROC) Curve.” Diagnostic Radiology 143: 29-36.

Keller, J. M., P. Gader, et al. (1998). A fuzzy logic automatic target detection system for LADAR range images. IEEE International Conference on Fuzzy Systems Proceedings, IEEE World Congress on Computational Intelligence, Anchorage, AK, USA, IEEE.

Knaus, W. A., E. A. Draper, et al. (1985). “APACHE II: a severity of disease classification system.” Critical Care Medicine 13(10): 818-29.

Knaus, W. A., D. P. Wagner, et al. (1991). “The APACHE III prognostic system. Risk prediction of hospital mortality for critically ill hospitalized adults.” Chest 100(6): 1619-36.

Knaus, W. A., J. E. Zimmerman, et al. (1981). “APACHE - acute physiology and chronic health evaluation: a physiologically based classification system.” Critical Care Medicine 9(8): 591-597.

Krischer, J. P. (1976). “Indexes of Severity: Underlying Concepts.” Health Services Research 11(1): 45-53.

Krischer, J. P. (1979). “Indexes of Severity: Conceptual Development.” Health Services Research 14(1): 56-68.

Le Gall, J. R., J. Klar, et al. (1996). “The Logistic Organ Dysfunction system. A new way to assess organ dysfunction in the intensive care unit. ICU Scoring Group.” JAMA 276(10): 802-10.

Le Gall, J. R., S. Lemeshow, et al. (1993). “A new Simplified Acute Physiology Score (SAPS II) based on a European/North American multicenter study [published erratum appears in JAMA 1994 May 4; 271(17): 1321].” JAMA 270(24): 2957-63.

Lemeshow, S., D. Teres, et al. (1993). “Mortality Probability Models (MPM II) based on an international cohort of intensive care unit patients.” JAMA 270(20): 2478-2486.

Mannila, H. (1996). Data mining: machine learning, statistics, and databases. Proceedings, Eighth International Conference on Scientific and Statistical Database Management, Los Alamitos, CA, IEEE Computer Society Press.

Marshall, J. C., D. J. Cook, et al. (1995). “Multiple Organ Dysfunction Score - a Reliable Descriptor of a Complex Clinical Outcome.” Critical Care Medicine 23(10): 1638-1652.
Matheus, C. J., P. K. Chan, et al. (1993). “Systems for knowledge discovery in databases.” IEEE Transactions on Knowledge and Data Engineering 5(6): 903-13.

Miller, M. E., C. D. Langefeld, et al. (1993). “Validation of probabilistic predictions.” Medical Decision Making 13(1): 49-58.

Miyano, S. (1995). “Learning theory toward genome informatics.” IEICE Transactions on Information and Systems E78-D(5): 560-7.

Myles, P. S., N. J. Williams, et al. (1994). “Predicting Outcome in Anaesthesia: Understanding Statistical Methods.” Anaesthesia and Intensive Care 22(4): 447-453.

Nazem, S. M. and B. Shin (1999). “Data mining: new arsenal for strategic decision-making.” Journal of Database Management 10(1): 39-42.

Pflug, L. A., G. E. Ioup, et al. (1992). “Detection of oscillatory and impulsive transients using higher-order correlations and spectra.” Journal of the Acoustical Society of America 91(5): 2763-76.

Piatetsky-Shapiro, G. (1994). An overview of knowledge discovery in databases: recent progress and challenges. Rough Sets, Fuzzy Sets and Knowledge Discovery. Proceedings of the International Workshop on Rough Sets and Knowledge Discovery (RSKD'93), Banff, Alberta, Canada, Springer-Verlag.

Shin, M. C., D. Goldgof, et al. (1999). Comparison of edge detectors using an object recognition task. 1999 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Fort Collins, CO, USA, IEEE.

Simoudis, E. (1996). “Reality Check for Data Mining.” IEEE Expert (Oct.): 26-33.

Swaving, M., H. V. Houwelingen, et al. (1996). “Statistical Comparison of ROC Curves from Multiple Readers.” Medical Decision Making 16(2): 143-152.

Tafeit, E. and G. Reibnegger (1999). “Artificial neural networks in laboratory medicine and medical outcome prediction.” Clinical Chemistry and Laboratory Medicine 37(9): 845-853.

Vickery, B. (1997). “Knowledge discovery from databases: an introductory review.” Journal of Documentation 53(2): 107-22.

Wong, D. T. and W. A. Knaus (1991). “Predicting outcome in critical care: the current status of the APACHE prognostic scoring system.” Canadian Journal of Anaesthesia 38(3): 374-83.

Wong, L. S. S. and J. D. Young (1999). “A comparison of ICU mortality prediction using the APACHE II scoring system and artificial neural networks.” Anaesthesia 54(11): 1048-1054.