Preliminary Results of Data Mining in Epidemiology.
Jeanette Ann Auer, Dr. John Zeleznikow, Dr. Graeme K Hart M.B., B.S., FANZCA, FFICANZCA.
La Trobe University, Bundoora, Victoria, Australia.
Summary
Patient data collected as part of on-going research into outcomes in Australian Intensive Care Units
(ICUs) forms a domain rich in complex relationships between variables. This project, and the on-going work,
is aimed at understanding Knowledge Discovery in Databases (KDD) as applied to the investigation
of patient mortality. This paper describes some of the gross class characteristics of the ANZICS
database and compares Receiver Operating Characteristic (ROC) curve analyses of
patient mortality calculated by the Acute Physiology and Chronic Health Evaluation (APACHE II)
score (Knaus, Draper et al. 1985) across differing hospital types.
Key words: knowledge discovery, data mining, severity of illness, intensive care unit, ROC analysis.
The ANZICS Database.
The Australian and New Zealand Intensive Care Society (ANZICS) has developed a
comprehensive database of adult Intensive Care Units (ICUs) covering the period from 1992. The
database contains information regarding more than 200,000 patient admissions to 110 ICUs in the
six states and two territories of Australia and a further 8 ICUs in New Zealand. In the ANZICS
database an admission is defined as a single admission to an ICU; thus multiple admissions to an ICU
within a single hospitalization episode are recorded separately, each with the outcome of that ICU
admission.
The hospital types with which the ICUs are associated are notionally referred to as metropolitan,
tertiary, rural or regional, and private. The Australian hospital system is made up of two broad
categories of hospital: public and private. Public
hospitals are funded predominantly from public funds and usually have long waiting lists for
elective surgery; emergency admissions, however, are handled as quickly as resources allow.
Private hospitals, on the other hand, are funded by a mix of government funds and other private funding,
including charges and/or associations with the major medical insurance groups. The rural or regional
hospitals are those located outside the major metropolitan centers. The tertiary hospitals are the
major hospitals within the capital cities; these often also function as teaching hospitals. The
metropolitan hospitals are the other, smaller hospitals within the bounds of the capital cities. The
problem with notional hospital types is that they are not mutually exclusive categories; examples
have been seen in the database of a single hospital being coded as either private or rural.
The current report concerns a subset of the total ANZICS database; some 95,000 records are
considered as part of this discussion. The complete set comprises more than 200,000 records and is
currently being preprocessed for knowledge discovery. Summaries of selected variables characteristic
of the ANZICS database are shown below.
Gross characteristics of data in the ANZICS database
One of the main features of the ANZICS database is that the contributing hospitals represent a range of
hospital types, both public and private, and at differing distances from the major metropolitan
centers. The domain experts expect that there will be some measurable differences in outcomes,
particularly between the public and private hospitals. One of the drawbacks of data mining, though, is
that patterns cannot be found if the supporting data is not present in the database. Here the problem is
that some would like to fix on apparent differences between public and private hospitals without a
complete understanding of underlying factors, some of which may not even be captured in the
database. For example, Figure 1a shows that 50% of admissions to an ICU in a private hospital
result from elective surgery; the breakdown and type are currently under consideration, but what is not
represented in the database are the socio-economic factors that affect the way in which
patients are admitted to hospitals, and to which hospitals they are admitted.
[Figure 1 appears here as four panels: A) Percentage Elective Surgery (ELECTSX yes/no) by hospital type; B) Age Distribution of Admissions (number of ICU admissions by five-year age group, 15-19 yrs to 85+ yrs, for each hospital type); C) APACHE II Score versus percentage died/survived; D) Number of Admissions to each Hospital Type (Metropolitan, Private, Rural/Regnl, Tertiary/Ref).]
Figure 1 Selected Attributes of the ANZICS database.
Figure 1b shows the distribution of the ages of patients admitted to ICUs in Australia. Australia is
beginning to show the effects of an aging population, and this is reflected in the ICU admissions. As
expected, as the APACHE II score increases the chance of survival decreases (Figure 1c). Figure 1d
indicates the number of admissions to ANZICS database hospitals between 1992 and 1998. The revised
database currently being preprocessed contains records from 1992 to 1999, the additional year’s data
accounting for another 100,000 records as further ICUs join the ANZICS group.
Concepts involved in the data
The ANZICS database exhibits the classic features associated with general databases used in KDD.
The database is large with over 200,000 records. The database contains over 130 fields, some of which
are core data, data recorded about a particular patient, and others that are derived according to some
formulation, for example the APACHE II score. The types of values available for data mining include:
1. Binary values, where the actual field takes the value true or false. Whilst this is the easiest type of
value to handle, some understanding of the domain is necessary, particularly when interpreting
null values.
2. Categorical data values, such as the APACHE II diagnosis categories, e.g. non-op respiratory
infection, non-operative respiratory neoplasm, other respiratory. A full list can be found in
(Knaus, Draper et al. 1985). Values in these fields are members of a finite set of available choices.
3. Continuous data, such as heart rate. Values in these fields can take any value, usually between two
nominated points.
4. Variables which for the purpose of this research can be considered "point accurate variables".
These variables take values within the set of allowable values such that it is reasonable to
expect that any trained person taking the measurement will obtain the same result at the same point in
time; for example, heart rate should be considered a point accurate data value.
5. Variables which are hierarchically fuzzy. For example, diagnosis category can be shown to be
hierarchically fuzzy using an admission diagnosis of organ failure equals respiratory
system. Diagnosis itself is a fuzzy variable: two admitting surgeons, staff officers, etc. will
not necessarily nominate the same disease category. The idea of hierarchical fuzziness
encompasses those diagnoses that fall within a hierarchy themselves, e.g. respiratory infection ISA
respiratory system failure, respiratory neoplasm ISA respiratory system failure. Both diagnoses
may be more precise than simply respiratory system failure, but the more tightly defined diagnosis
may not be any more accurate or of greater prognostic value than the higher classification (a toy
encoding of such a hierarchy is sketched after this list).
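To make the notion of hierarchical fuzziness concrete, the sketch below encodes a toy ISA hierarchy of diagnosis categories and tests whether two nominated diagnoses agree at some level of the hierarchy. The category names and helper functions are illustrative assumptions, not the actual ANZICS or APACHE II coding scheme.

```python
# A minimal sketch of hierarchically fuzzy diagnosis categories.
# The hierarchy and function names are illustrative assumptions,
# not the actual ANZICS/APACHE II coding scheme.

# Each diagnosis maps to its parent category (ISA relationship).
ISA = {
    "respiratory infection": "respiratory system failure",
    "respiratory neoplasm": "respiratory system failure",
    "respiratory system failure": "organ failure",
}

def ancestors(diagnosis):
    """Return the diagnosis and all of its ISA ancestors."""
    chain = [diagnosis]
    while diagnosis in ISA:
        diagnosis = ISA[diagnosis]
        chain.append(diagnosis)
    return chain

def agree_at_some_level(dx_a, dx_b):
    """Two clinicians' diagnoses 'agree' if they share any ancestor."""
    return bool(set(ancestors(dx_a)) & set(ancestors(dx_b)))

# Two surgeons nominate different leaves of the same subtree:
print(agree_at_some_level("respiratory infection", "respiratory neoplasm"))  # True
```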
The differing aims of prognostication and stratification.
Probability of mortality measures are generally based on some form of severity score or stratification
index; examples include APACHE II (Knaus, Draper et al. 1985), APACHE III (Knaus, Wagner et al.
1991), MPM II (Lemeshow, Teres et al. 1993), ODIN (Fagon, Chastre et al. 1993), SAPS II (Le Gall,
Lemeshow et al. 1993), MODS (Marshall, Cook et al. 1995), and LODS (Le Gall, Klar et al. 1996).
While using mortality as the predicted outcome, each of these models seeks to
define a severity index or score that characterizes a particular patient. That score is then used to calculate
the probability of mortality for the patient. Myles et al. (1994) stress that outcome studies designed to predict
individual risks, as opposed to group risks, need to be based on very large database populations.
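As a concrete illustration of how a severity score is turned into a point estimate of mortality risk, the sketch below applies the logistic form of the APACHE II risk equation published by Knaus et al. (1985). The coefficients (intercept -3.517, 0.146 per APACHE II point, 0.603 for emergency post-operative admission) are quoted from memory of the published equation and should be verified against the original paper; the diagnostic category weight is left as a caller-supplied parameter.

```python
import math

def apache2_mortality_risk(score, category_weight, emergency_postop=False):
    """Point estimate of hospital mortality from an APACHE II score,
    using the logistic form of the risk equation in Knaus et al. (1985).
    Coefficients are quoted from memory and should be verified against
    the original paper before any real use."""
    logit = -3.517 + 0.146 * score + category_weight
    if emergency_postop:
        logit += 0.603
    return 1.0 / (1.0 + math.exp(-logit))

# e.g. a score of 25 with a neutral category weight:
print(round(apache2_mortality_risk(25, category_weight=0.0), 3))
```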
The current models all suffer from problems when attempting to give an accurate point estimate of the
probability of mortality for any given patient (Wong and Knaus 1991); this is not to say that they are
without value. The fact that they are based on severity indices allows them to be used as a metric for
the controlled evaluation of care outcomes and the assessment of new therapeutic measures (Krischer
1976; Knaus, Zimmerman et al. 1981). A severity of illness index can be defined as a scale that
assesses the probability of morbidity (Krischer 1976; Krischer 1979). Higher degrees of accuracy for
prognostication should follow from the increasing size of patient databases. It is at this boundary
that KDD meets medical research.
What is Knowledge Discovery in Databases?
Research, or knowledge discovery, in any form is an iterative and interactive process. KDD is the
formalization of the research process as applied to the extraction of knowledge from databases. Interest
in this field of research has been growing steadily since the first conference on KDD in 1989. One of
the central reasons for the increased interest in KDD is a change in the views held concerning data
analysis.
Decker and Focardi (1995) identify three factors that have changed the view
of data analysis, leading to the expansion of interest in KDD and DM. The first factor is the
increased availability of high-powered computing at low cost. Current personal computers have as
much, if not more, computing power than the mainframes of the 1960s and 1970s. Hand in hand with
this is the falling cost of data storage, allowing the accumulation of large volumes of data. The second
factor is the explosive rate at which data is accumulating, in areas ranging from the human genome project
(Miyano 1995) to business (Nazem and Shin 1999) and science (Fayyad 1997).
Of note, medical research is a relative newcomer to the domains normally considered under the umbrella
of KDD, although an allusion to medical knowledge discovery is found in Miller et al. (1993) when
they discuss the growth in patient record databases. Tafeit and Reibnegger (1999), discussing the
growth in neural network (NN) use in the medical domain, show that
approximately 100 articles were published in 1991 in journals normally abstracted by MEDLINE. A
search in INSPEC for the same period shows 3000+ articles. A further search of INSPEC found only
163 articles with "medical" and "data mining"; of these, 100 were published in 1999.
Much of the data accumulation is a result of the increased reliance on automated data gathering devices
(Decker and Focardi 1995). The final factor is the new set of methods developed by members of the
artificial intelligence (AI) community, statisticians and physicists. These methods are computationally
expensive and rely on the computing power now available. While
Decker and Focardi (1995) identify three factors that have affected the data analysis panorama, the key
factor is the availability of increased computational power, both in terms of data processing and data
storage.
The specific definition of KDD is the non-trivial extraction of previously unknown and potentially
useful information from data (Frawley, Piatetsky-Shapiro et al. 1991; Agrawal, Imielinski et al. 1993;
Fayyad, Piatetsky-Shapiro et al. 1996). KDD is generally
represented by one of three overall models: the system model (Matheus, Chan et al. 1993), the KDD
environment model (Brachman and Anand 1996), and the process model (Fayyad, Piatetsky-Shapiro et
al. 1996; Fayyad 1996).
A number of authors regard KDD in terms of the overall process involved in knowledge discovery
(Fayyad, Piatetsky-Shapiro et al. 1996; Fayyad 1996; Mannila
1996; Simoudis 1996). A common theme to all process models is that KDD is an iterative process; the
level of interaction in the process is one of the points of difference between the various authors. The
core of each of the definitions remains the DM, or pattern extraction, phase.
The process of discovering patterns should therefore contain the following steps.
1. The user needs an understanding of the domain. Domain knowledge is a prerequisite for
extracting useful knowledge from data. The KDD user must have some understanding of the
application area before valuable information can be obtained. If, on the other hand, very good
human experts exist, it may be difficult for semiautomatic tools to derive any novel information.
2. The data set must be prepared for KDD. Preparation of the data set involves the selection of data
sources, the integration of heterogeneous data, cleaning the data, assessing noise levels, and
dealing with missing values. According to Mannila (1996), this step can take up to 80% of the
time associated with the KDD process. This is in contrast with Fayyad et al. (1996), where it is
stated that a significant portion of the KDD application can go into the proper formulation of the
problem. This is not to say that Mannila (1996) is incorrect; rather, this difference is a result of the
emphasis by Fayyad et al. (1996) on breaking the process down into discrete steps.
3. The KDD system should discover any useful and interesting patterns. This is the phase commonly
referred to as data mining.
4. Discovered patterns must be postprocessed. This step is the same as the interpretation phase of
Fayyad et al. (1996).
5. The results must be put to use.
One of the most detailed descriptions of KDD under the process model was proposed by Fayyad et
al. (1996). KDD was defined as an iterative and interactive
process consisting of nine steps, outlined below.
1. Learning the application domain. This step involves acquiring relevant prior knowledge of the
domain and the goal of the particular application of KDD to the domain. Sufficient domain
knowledge is required, as current KDD techniques are still reliant on considerable user input.
Future developments in KDD should see the emergence of semi-intelligent engines that will free
the user from having to fully guide the KDD process.
2. Creating the target dataset. In KDD, the creation of the target dataset does not always mean
creation of a new data set; more often this step involves focusing on which subset of variables or
data samples from an existing database is appropriate for the current KDD application.
3. Data cleaning and preparation. This step covers those areas associated with preparing the data for
knowledge discovery and includes:
3.1. removing outliers and noise if appropriate, or collecting information necessary to account for
outliers and noise;
3.2. developing strategies to account for missing data, time sequences and known changes within
the data; and
3.3. dealing with database management system (DBMS) issues such as field types, and mappings to
missing, unknown or dependent values.
4. Data reduction and projection. Depending on the goal of the current KDD process, one must find
the useful features of the dataset and, using transformation or reduction methods, minimize the
number of variables under consideration.
5. Choosing the function of data mining. Choose broadly the purpose of the current KDD exercise.
Is it to classify the dataset into discrete subsets, summarize the dataset, perform regression analysis
on the data, or find clusters within the data? The function of data mining is further expanded in the
sixth step as part of choosing the specific algorithms.
6. Choosing the data mining algorithm(s). The data mining algorithms consist largely of a varying
mix of three specific components, described below.
6.1. The model component consists of those parameters that are to be determined from the data.
There are two factors to the model: the function of the model, and the representational form
of the model. The function of the model describes which of a group of broad classifiers, such
as classification or clustering, the model will represent. The representational form of the
model describes how the model will be represented.
6.2. The preference criteria component is a measure of preference for one model over another.
The criterion, often a goodness-of-fit function, seeks to avoid problems associated with
overfitting or overgeneralization.
6.3. The search method consists of two components: parameter search and model search. The
parameter search algorithm must search for the parameters that will optimize the preference
criteria given the data and a fixed model representation. The model search occurs as a loop
over the parameter search methods; the model representation is changed to allow a family of
models to be considered. For each model representation, the parameter search method is
instantiated to evaluate the quality of the model (see the sketch following this list).
7. The actual application of the data mining algorithms to the dataset. In this phase the actual
exercise of data mining is performed.
8. Discovered patterns need to be interpreted in the context of the KDD exercise. Interpretation
involves aspects of post-processing of patterns and, if necessary, revisiting the DM phases to
expand or refine the discovered patterns.
9. The newly discovered knowledge needs to be put to use, whether this use is: incorporation of the
knowledge back into the discovery system; taking some action as a result of the discovered pattern;
reporting and documenting the patterns; or validating the pattern in light of previous knowledge,
which includes investigating any contradiction (Fayyad, Piatetsky-Shapiro et al. 1996).
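The model search / parameter search loop of step 6.3 can be made concrete with a short sketch. Below, two scikit-learn model families stand in for the family of model representations, cross-validated accuracy stands in for the preference criterion, and the toy dataset is an assumption; none of this is tied to the ANZICS data.

```python
# A minimal sketch of the model search / parameter search loop of
# step 6.3. Dataset, model families and grids are toy assumptions.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import GridSearchCV

X, y = make_classification(n_samples=500, random_state=0)

# Model search: loop over a family of model representations.
families = [
    (LogisticRegression(max_iter=1000), {"C": [0.1, 1.0, 10.0]}),
    (DecisionTreeClassifier(random_state=0), {"max_depth": [2, 4, 8]}),
]

best = None
for model, grid in families:
    # Parameter search: optimize the preference criterion (here
    # cross-validated accuracy) for a fixed model representation.
    search = GridSearchCV(model, grid, cv=5).fit(X, y)
    if best is None or search.best_score_ > best.best_score_:
        best = search

print(type(best.best_estimator_).__name__, round(best.best_score_, 3))
```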
The actions associated with data mining have historically been referred to by a number of names,
including knowledge extraction, data dredging, data analysis (Chen, Han et al. 1996), pattern analysis,
information harvesting (Piatetsky-Shapiro 1994), information discovery, data archeology and data
pattern processing (Fayyad, Piatetsky-Shapiro et al. 1996). All of these phrases imply the extraction
of patterns of information from data, though there are negative connotations associated with some,
especially data dredging. DM as seen today can be characterized as a set of methods for finding logical
or mathematical descriptions of regularities in any database (Decker and Focardi 1995). Practical
limitations on this are the assumptions that the generalized model of the database can be approximated
through some relatively simple computational model and that there is sufficient data in the database to
facilitate the generalization (Decker and Focardi 1995). Thus, the existence of a database is an
essential feature of DM.
A database in its simplest form is a collection of related data where each point of data is some fact
describing the attributes of a real world entity of interest to the database (Vickery 1997). Implicit
in the idea of a database are the following properties.
1. The database represents some portion of the real world.
2. The data is organized in some logically coherent manner with some inherent meaning in the
structure of the data; a random collection of data is not a database.
3. The database has a specific purpose and audience. The data that populates the database relates to
the purpose of the database.
The motivation for the ANZICS database has been the standardization of data collection by ICUs in Australia and
New Zealand, with the strongest motive being a desire to compare outcomes from different treatment
regimes, either within one unit or between a number of units. Additionally, there is a desire to
increase the sum of knowledge regarding specific diseases, outcomes and work practices. To this end
the database is populated with records pertaining to patient outcomes when admitted to an ICU.
The General Statistics of Prognostication
Four statistics are traditionally used to describe the utility of a diagnostic test (Myles, Williams et al.
1994): sensitivity, specificity, positive predictive value (PPV) and negative
predictive value (NPV). Sensitivity is a measure of the true positive rate: the proportion of
positive samples correctly identified as positive by the test. Specificity, on the other hand, is a measure of the
true negative rate and shows the proportion of negative samples correctly identified as negative. The positive
predictive value is the proportion of true outcomes among those predicted to be true, and similarly the negative
predictive value is the proportion of negative events among those predicted to be negative (Myles,
Williams et al. 1994). A minimal computation of all four statistics is sketched below.
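The sketch below computes the four statistics from the counts of a 2x2 confusion table; the function name and example counts are ours, for illustration only.

```python
def diagnostic_statistics(tp, fp, tn, fn):
    """Sensitivity, specificity, PPV and NPV from confusion-table counts.
    tp/fp/tn/fn: true/false positives and negatives."""
    return {
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
    }

# e.g. a test that found 90 of 100 positives and 80 of 100 negatives:
print(diagnostic_statistics(tp=90, fp=20, tn=80, fn=10))
```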
The aim of any model is to minimize the false negative rate (type II errors): those samples predicted to
have a positive outcome that actually have a negative outcome, for example predicting survival of a
patient who subsequently dies. The problem is that the negative results are generally rare events,
requiring extremely large databases to minimize the likelihood of type II errors (Myles, Williams et al.
1994).
These statistics can also be used to describe the accuracy of prognostic systems. Specificity and
sensitivity are particularly important, as they are the base statistics for receiver operating
characteristic (ROC) curve analysis (Miller, Langefeld et al. 1993). ROC curves are increasingly being
used to judge the discrimination ability of various statistical methods (Hanley and McNeil 1982). One
of the underlying concepts of a ROC curve that makes it very applicable to prognostic studies is that,
given the available data, an operator is forced to choose between two alternative decisions (Hanley and
McNeil 1982). For a given confidence threshold the operator attempts to classify the pattern as either
of the outcome states. Probability distributions can then be derived for the various truth states
(Swaving, Houwelingen et al. 1996). In the case of prognostic scores the operator is the scoring system
itself, and the confidence threshold is the probability of the specified outcome according to the scoring
system. This threshold sweep is sketched below.
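To illustrate how sweeping the confidence threshold turns a scoring system into the "operator" described above, the sketch below records the (1 - specificity, sensitivity) pair produced at each threshold; the predicted probabilities and outcomes are fabricated toy values, not ANZICS data.

```python
# Sketch: sweeping the confidence threshold of a scoring system to
# generate ROC points. Probabilities and outcomes are toy values.
probs = [0.05, 0.2, 0.3, 0.4, 0.55, 0.6, 0.8, 0.9]   # predicted P(death)
died  = [0,    0,   0,   1,   0,    1,   1,   1  ]   # observed outcome

def roc_point(threshold):
    pred = [p >= threshold for p in probs]
    tp = sum(1 for p, d in zip(pred, died) if p and d)
    fp = sum(1 for p, d in zip(pred, died) if p and not d)
    fn = sum(1 for p, d in zip(pred, died) if not p and d)
    tn = sum(1 for p, d in zip(pred, died) if not p and not d)
    return fp / (fp + tn), tp / (tp + fn)   # (1 - specificity, sensitivity)

for t in [0.1, 0.3, 0.5, 0.7, 0.9]:
    print(t, roc_point(t))
```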
Numerous examples where ROC curve analysis has been used in the medical prognostic
literature can be cited, including APACHE (Knaus, Zimmerman et al. 1981), APACHE II (Knaus,
Draper et al. 1985), ODIN (Fagon, Chastre et al. 1993), SAPS II (Le Gall, Lemeshow et al. 1993) and the
Logistic Organ Dysfunction System (LODS) (Le Gall, Klar et al. 1996); meanwhile, a recent search on
INSPEC showed only three examples of ROC curve analysis applied to domains other than the medical
domain. The non-medical citations are acoustics (Pflug, Ioup et al. 1992), LADAR imaging (Keller,
Gader et al. 1998) and object recognition (Shin, Goldgof et al. 1999).
As an initial investigation, ROC curves were plotted for each of the four hospital types; the result is
shown in Figure 2 (a sketch of how such per-type curves can be computed follows the figure). Further
research is indicated to clarify whether the apparent
differences in the curves are significant and, if so, why. The probability of mortality equation of the
APACHE II score (Knaus, Draper et al. 1985) is adjusted based on admission category, so in theory
other factors must be affecting the outcomes between the hospital groups.
[Figure 2 appears here: "ROC Curves for APACHE II, ANZICS Database", plotting true positive rate (sensitivity) against false positive rate (1 - specificity) for the Tertiary, Metropolitan, Rural and Private hospital types.]
Figure 2 ROC Curves of APACHE II scores for Differing Hospital Types.
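A sketch of producing one ROC curve per hospital type with scikit-learn follows. The file name and column names (hospital_type, apache2_risk, died) are assumptions about a preprocessed extract, not the actual ANZICS field names.

```python
# Sketch: one ROC curve (and AUC) per hospital type, as in Figure 2.
# File and column names are assumed, not the actual ANZICS fields.
import pandas as pd
from sklearn.metrics import roc_curve, auc

df = pd.read_csv("anzics_subset.csv")  # hypothetical extract

for hospital_type, group in df.groupby("hospital_type"):
    fpr, tpr, _ = roc_curve(group["died"], group["apache2_risk"])
    print(f"{hospital_type}: AUC = {auc(fpr, tpr):.3f}")
```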
The following section leads into a discussion of work in progress.
Why does APACHE lend itself to modeling with neural network tools?
Clinical decisions, like most decision-making processes, are rarely made on the basis of a single factor.
In medicine it is rare for a single symptom or measurement to be pathognomonic for a single disease
(Cross, Harrison et al. 1995). Clinicians regularly perform multidimensional analysis, applying
previously observed patterns to the pattern under current observation. Of interest in a KDD sense is whether
the multidimensional analysis performed by clinicians can be modeled using the tools of KDD. To this end
a series of neural networks is currently being trained.
The APACHE II severity index (Knaus, Draper et al. 1985) is a widely used, well understood and
well validated means of stratifying patients in ICU on the basis of the degree of physiological
derangement from the norm (Wong and Young 1999). The transformations applied by the APACHE II
model itself can be shown to be non-linear, disjoint transformations of patient data. For any of the
physiological characteristics essential to the APACHE II score, the transformation is non-linear.
Consider the transformation applied to rectal temperature: if the mid-point of each range is plotted
against the APACHE II value for that range, the plot is non-linear (see Figure 3), even when the high
abnormal range is represented as negative values of the APACHE II value. In general the APACHE II
values form the non-linear ordered set {4, 3, 2, 1, 0, 1, 2, 3, 4}, with ranges of values for each of the
physiological attributes mapped into this set.
[Figure 3 appears here: the APACHE II transformation value plotted against the mid-point of each rectal temperature range, with the high abnormal range represented as negative values; the resulting line is non-linear.]
Figure 3 APACHE II Value against mid-point of Rectal Temperature Ranges.
The ranges for each transformation do not necessarily abut each other, leading to a disjoint
transformation sequence. For example, the transformations for rectal temperature are shown in Table 1.
Table 1 Transformation of Rectal Temperature (TEMP, °C).

Points   High Abnormal Range   Low Abnormal Range
+4       >41                   <=29.9
+3       39-40.9               30-31.9
+2       (none)                32-33.9
+1       38.5-38.9             34-35.9
0        36-38.4
The size of the transformation step from 38.9 °C to 39 °C is not the same as the step between any other
ranges, so we have a disjoint transformation with all the attendant problems of boundary classifiers;
the sketch below makes this explicit. The
fact that the APACHE II score is a non-linear, disjoint transformation of the underlying data, combined
with the continuous nature of the underlying variables, indicates that it would be very difficult to learn
this score using techniques poorly suited to handling this type of data.
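Written as code, the disjoint, non-linear character of the transformation in Table 1 is easy to see; the range boundaries below are transcribed from the table (treating the ">41" boundary as inclusive of 41.0) and the function name is ours.

```python
def apache2_temp_points(temp_c):
    """APACHE II points for rectal temperature (degrees C),
    transcribed from Table 1. Note the step from 38.9 to 39.0
    jumps from +1 to +3: the transformation is disjoint."""
    if temp_c >= 41.0:  return 4
    if temp_c >= 39.0:  return 3
    if temp_c >= 38.5:  return 1
    if temp_c >= 36.0:  return 0
    if temp_c >= 34.0:  return 1
    if temp_c >= 32.0:  return 2
    if temp_c >= 30.0:  return 3
    return 4            # <= 29.9

print(apache2_temp_points(38.9), apache2_temp_points(39.0))  # 1 3
```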
Further work is planned to investigate the differences in ROC curves and the applicability of a Neural
Network Model to the ANZICS database.
References
Agrawal, R., T. Imielinski, et al. (1993). “Database mining: a performance perspective.” IEEE
Transactions on Knowledge and Data Engineering 5(6): 914-25.
Brachman, R. J. and T. Anand (1996). The Process of Knowledge Discovery in Databases. Advances in
Knowledge Discovery and Data Mining. U. M. Fayyad, G. Piatetsky-Shapiro, P. Smyth and R.
Uthurusamy. Menlo Park, Calif., AAAI/MIT Press: 37-57.
Chen, M.-S., J. Han, et al. (1996). “Data mining: an overview from a database perspective.” IEEE
Transactions on Knowledge and Data Engineering 8(6): 866-83.
Cross, S. S., R. F. Harrison, et al. (1995). “Introduction to neural networks.” The Lancet 346: 1075-1079.
Decker, K. M. and S. Focardi (1995). Technology Overview: A Report on Data Mining. Manno, Swiss
Scientific Computing Centre.
Fagon, J. Y., J. Chastre, et al. (1993). “Characterization of Intensive Care Unit Patients Using a Model
Based on the Presence or Absence of Organ Dysfunctions and/or Infections: The ODIN Model.”
Intensive Care Medicine 19: 137-144.
Fayyad, U., G. Piatetsky-Shapiro, et al. (1996). “From data mining to knowledge discovery in
databases.” AI Magazine 17(3): 37-54.
Fayyad, U. M. (1996). “Data mining and knowledge discovery: making sense out of data.” IEEE
Expert 11(5): 20-5.
Fayyad, U. M. (1997). Data Mining and Knowledge Discovery in Databases: Implications for
Scientific Databases. Proceedings. Ninth International Conference on Scientific and Statistical
Database Management, Olympia, Washington, IEEE.
Fayyad, U. M., G. Piatetsky-Shapiro, et al. (1996). From Data Mining to Knowledge Discovery: An
Overview. Advances in Knowledge Discovery and Data Mining. U. M. Fayyad, G. Piatetsky-Shapiro,
P. Smyth and R. Uthurusamy. Menlo Park, Calif., AAAI Press: 1-34.
Frawley, W. J., G. Piatetsky-Shapiro, et al. (1991). Knowledge Discovery in Databases: An Overview.
Knowledge Discovery in Databases. G. Piatetsky-Shapiro and W. J. Frawley. Cambridge, Mass.,
AAAI/MIT Press: 1-27.
Hanley, J. A. and B. J. McNeil (1982). “The Meaning and Use of the Area under a Receiver Operating
Characteristic (ROC) Curve.” Diagnostic Radiology. 143: 29-36.
Keller, J. M., P. Gader, et al. (1998). A fuzzy logic automatic target detection system for LADAR
range images. IEEE International Conference on Fuzzy Systems Proceedings. IEEE World Congress on
Computational Intelligence, Anchorage, AK, USA, IEEE.
Knaus, W. A., E. A. Draper, et al. (1985). “APACHE II: a severity of disease classification system.”
Crit Care Med 13(10): 818-29.
Knaus, W. A., D. P. Wagner, et al. (1991). “The APACHE III prognostic system. Risk prediction of
hospital mortality for critically ill hospitalized adults [see comments].” Chest 100(6): 1619-36.
Knaus, W. A., J. E. Zimmerman, et al. (1981). “APACHE - acute physiology and chronic health
evaluation: a physiologically based classification system.” Critical Care Medicine 9(8): 591-597.
Krischer, J. P. (1976). “Indexes of Severity: Underlying Concepts.” Health Services Research 11(1):
45-53.
Krischer, J. P. (1979). “Indexes Of Severity: Conceptual Development.” Health Services Research
14(1): 56-68.
Le Gall, J. R., J. Klar, et al. (1996). “The Logistic Organ Dysfunction system. A new way to assess
organ dysfunction in the intensive care unit. ICU Scoring Group.” JAMA 276(10): 802-10.
Le Gall, J. R., S. Lemeshow, et al. (1993). “A new Simplified Acute Physiology Score (SAPS II) based
on a European/North American multicenter study [published erratum appears in JAMA 1994 May
4;271(17):1321].” JAMA 270(24): 2957-63.
Lemeshow, S., D. Teres, et al. (1993). “Mortality Probability Model (MPM II) based on an
international cohort of intensive care unit patients.” Journal of the American Medical Association
270(20): 2478-2486.
Mannila, H. (1996). Data mining: machine learning, statistics, and databases. Proceedings. Eighth
International Conference on Scientific and Statistical Database Management, IEEE Comput. Soc. Press
Los Alamitos CA.
Marshall, J. C., D. J. Cook, et al. (1995). “Multiple Organ Dysfunction Score - a Reliable Descriptor of
a Complex Clinical Outcome.” Critical Care Medicine 23(10): 1638-1652.
Matheus, C. J., P. K. Chan, et al. (1993). “Systems for knowledge discovery in databases.” IEEE
Transactions on Knowledge and Data Engineering 5(6): 903-13.
Miller, M. E., C. D. Langefeld, et al. (1993). “Validation of probabilistic predictions.” Med Decis
Making 13(1): 49-58.
Miyano, S. (1995). “Learning theory toward genome informatics.” IEICE Transactions on Information
and Systems E78-D(5): 560-7.
Myles, P. S., N. J. Williams, et al. (1994). “Predicting Outcome in Anaesthesia: Understanding
Statistical Methods.” Anaesthesia and Intensive Care. 22(4): 447-453.
Nazem, S. M. and B. Shin (1999). “Data mining: new arsenal for strategic decision-making.” Journal of
Database Management 10(1): 39-42.
Pflug, L. A., G. E. Ioup, et al. (1992). “Detection of oscillatory and impulsive transients using higher-order correlations and spectra.” Journal of the Acoustical Society of America 91(5): 2763-76.
Piatetsky-Shapiro, G. (1994). An overview of knowledge discovery in databases: recent progress and
challenges. Rough Sets, Fuzzy Sets and Knowledge Discovery. Proceedings of the International
Workshop on Rough Sets and Knowledge Discovery (RSKD'93), Banff, Alberta, Canada, Springer-Verlag.
Shin, M. C., D. Goldgof, et al. (1999). Comparison of edge detectors using an object recognition task.
1999 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Fort Collins,
CO, USA., IEEE.
Simoudis, E. (1996). “Reality Check for Data Mining.” IEEE Expert (Oct.): 26-33.
Swaving, M., H. V. Houwelingen, et al. (1996). “Statistical Comparison of ROC Curves from Multiple
Readers.” Medical Decision Making 16(2): 143-152.
Tafeit, E. and G. Reibnegger (1999). “Artificial neural networks in laboratory medicine and medical
outcome prediction.” Clinical Chemistry & Laboratory Medicine. 37(9): 845-853.
Vickery, B. (1997). “Knowledge discovery from databases: an introductory review.” Journal of
Documentation 53(2): 107-22.
Wong, D. T. and W. A. Knaus (1991). “Predicting outcome in critical care: the current status of the
APACHE prognostic scoring system.” Can J Anaesth 38(3): 374-83.
Wong, L. S. S. and J. D. Young (1999). “A comparison of ICU mortality prediction using the
APACHE II scoring system and artificial neural networks.” Anaesthesia. 54(11): 1048-1054.