Proceedings of the 35th Hawaii International Conference on System Sciences - 2002
Using Bayesian Networks for
Discovering Temporal-State Transition patterns in Hemodialysis
Fu-ren Lin Chih-hung Chiu
San-chiang Wu
Department of Information Management Kaohsiung Municipal Women and Children
National Sun Yat-sen University
General Hospital
Kaohsiung, Taiwan 804, ROC
Kaohsiung, Taiwan 804, ROC
[email protected]
The constructed probabilistic network is kept up to date by
propagating new instances to the existing networks. The
probabilistic network can be used for proposing alternatives
for decision making, and for facilitating communication
and training among professionals.
Due to the prevalence of information technology in
medical care, data collected from clinical processes can be
used for discovering useful patterns. These patterns can be
analyzed automatically or by medical professionals in order
to develop better strategies to improve the quality of medical
treatment. For example, numerous physiological states of
patients, medical activities, and therapeutic interventions
performed by physicians, nurses, and other staff are
recorded for future reference and as legal documents. These
records, as medical histories, are valuable sources for
discovering clinical rules and enhancing clinical knowledge
with the help of information technology (e.g. [3]).
This paper reports our research in adopting Bayesian
networks for discovering temporal-state transition patterns
in the Hemodialysis process. The discovered Hemodialysis
clinical pathway patterns can be used for predicting possible
paths for an admitted patient, and for helping medical
professionals to react to exceptions during the Hemodialysis
process. We also suggest that knowledge management systems
for the Hemodialysis process can be created using the results
of this study. In Section 2, we introduce the theory of
Bayesian networks, and explain why Bayesian networks are
adopted to represent probabilistic networks. Bayesian
networks are constructed to codify knowledge of
temporal-state transitions from workflow logs in Section 3.
In Section 4, we apply the Bayesian networks to represent
causal relationships between transitions of patient’s
physiological states and medical treatments in the
Hemodialysis process. Section 5 illustrates the usage of
Bayesian networks for generalizing clinical paths of
Hemodialysis, and supporting medical decision-making.
Conclusions and future research are described in Section 6.
Abstract
In this paper, we adopt Bayesian networks for
discovering temporal-state transition patterns in the
Hemodialysis process. A Bayesian network is a graphical
model that encodes probabilistic relationships among
variables, and easily incorporates new instances to keep
rules up to date. We demonstrate the proposed method in
representing the causal relationships between medical
treatments and transitions of patient’s physiological states in
the Hemodialysis process. The discovered Hemodialysis
clinical pathway patterns can be used for predicting
possible paths for an admitted patient, and for helping medical
professionals react to exceptions during the Hemodialysis
process. The discovery of clinical pathway patterns enables
a reciprocal learning cycle for medical organizational
knowledge management.
1. Introduction
The importance of extracting professional knowledge
from domain experts and representing it in an explicit form
has been widely recognized. Taking the medical industry as
an example, medical professionals make decisions
throughout the clinical paths. When is the right time to
perform the next therapeutic interventions? To what extent
should a patient with medical history of high blood pressure
or diabetes remain comfortable during hemodialysis? These
questions indicate the need to explicitly document the professional
knowledge embedded in medical professionals.
The explicit knowledge representation facilitates the
communication between medical professionals, accelerates
practical training, and supports professional judgment.
The probabilistic network is usually used for
representing knowledge of temporal-state transitions from
workflow logs. A probabilistic network consists of states,
pathways, and causal probabilistic relationships [1]. It is an
explicit representation of inter-dependencies between
variables that ignores the specific numeric or functional
details. Depending on interpretation, it can also represent
causality [2]. Probabilistic networks are a good model for
representing frequent state transitions in medical processes.
2. Literature Review
2.1 The Hemodialysis
Normally, the human body has two kidneys positioned
0-7695-1435-9/02 $17.00 (c) 2002 IEEE
in the back of the body behind the lower ribs. A kidney is a
filtering system. As blood passes through the kidneys, the
wastes and excess fluid it contains are filtered out and flow
through the ureters to the bladder. They are passed out of
the body as urine. Healthy kidneys produce a total of
about 1.5 to 2.5 quarts of urine daily. Kidneys also balance
chemicals (sodium, potassium, calcium, phosphorous and
others) in blood, and produce hormones, which help to
regulate blood pressure, stimulate red blood cell production,
and promote strong bones.
When both kidneys fail, the body holds fluid and the
blood pressure rises. There is an accumulation of certain
toxic substances (urea and creatinine) and an increase in the
volume of body water. This excess water results in swelling
of tissues and high blood pressure and prevents normal
activities of other organs, such as the heart and lungs. For
example, if the body cannot remove its own excess
phosphate, calcium levels drop resulting in bone diseases,
harmful wastes build up in the body, and the body will not
produce enough red blood cells. If total kidney function
drops below ten percent of normal capacity and the
impairment is irreversible, the condition is known as
end-stage renal disease (ESRD). When this situation occurs,
artificial dialysis is needed to replace the work of the failed
kidneys. There are two different kinds of dialysis: one is
called Hemodialysis, which cleans blood outside the body
via a machine, and the other is called peritoneal dialysis,
which cleans blood inside the body. Both peritoneal dialysis
and Hemodialysis are based on the same principles. The first
principle is osmosis, which denotes that water moves from a
low concentration of particulates to a higher concentration,
and the second one is diffusion, which denotes that particles
spread out evenly in a solution [4].
Hemodialysis (also called kidney dialysis) is the most
popular treatment for ESRD patients. It not only
replaces the functions of the kidneys in cleaning and filtering
blood to remove harmful wastes and extra salt and fluids,
but also controls blood pressure and helps the body keep the
proper balance of chemicals such as potassium, sodium, and
chloride. In this procedure, two needles are inserted into a
blood vessel (usually in the patient's arm). Each needle is
attached to a thin length of tube. One tube carries blood to a
machine containing a dialyzer, which is a unit comprised of
many very fine hollow fibers. These fibers are made of a
semi-permeable membrane. As blood flows through the
fibers, dialysate flows around them, removing impurities
and excess water and adjusting the chemical balance of the
blood. After being cleansed and adjusted, the blood returns
to the patient's body through the second tube. Less than five
percent of a patient's blood is outside the body during the
dialysis process. The treatment lasts 3 to 6 hours and is
usually performed three times a week. Nurses record
the patient's status, such as blood pressure and pulse, and the
operating state of the dialysis machine, such as dialysis fluid
velocity and blood velocity.
2.2 Why Bayesian network is chosen
There are several data mining techniques available for
extracting and representing knowledge from data. To
represent the complex causal relationships and probabilistic
semantics among numerous variables, three techniques
are usually used: decision trees [5], artificial neural networks [6],
and probabilistic networks. Buntine (1996)
points out that probabilistic networks have
characteristics that distinguish them from decision trees and
artificial neural networks. Probabilistic networks have
clear semantics that allow them to be processed in order to
perform diagnosis, learning, explanation, and many other
inference tasks necessary for intelligent systems. An artificial
neural network usually has higher prediction accuracy, but
fails to explain explicitly the causal relationship between
input conditions and output outcomes. The decision tree
technique is not suited to representing inter-dependencies or
independencies between variables, and may cause a
minority of special cases to be overlooked or ignored
in our task.
The Bayesian network is a popular
representation of probabilistic networks. Having weighed the
fitness of these candidate techniques, we assert that the
Bayesian network is more suitable for our task than the other
two techniques.
The Bayesian network is a graphical model that encodes
probabilistic relationships among variables of interest.
Heckerman [7] states four initial steps to construct the
Bayesian networks: (1) correctly identify the goals of
modeling, (2) identify many possible observations that may
be relevant to the problem, (3) determine what subset of
these observations are worthwhile to model, and (4)
organize the observations into variables that can be assigned
to distinct and exhaustive states. For tutorial articles on
Bayesian networks, please refer to [8][9].
3. Constructing Bayesian Networks with
temporal-state transitions from workflow
logs
In this section, we propose a method for constructing
Bayesian networks with temporal-state transitions from
workflow logs. In the following subsections, we first
define the terms used for representing temporal-state transitions
and Bayesian networks. Second, we design the process of
constructing Bayesian networks. Third, we describe how
we perform inference on and propagate Bayesian networks.
3.1 Some definitions for constructing Bayesian
networks
The following definitions will help us to understand
the detailed process of constructing Bayesian networks
described in the next subsection.
five steps: data discretization, data merging, defining events
and actions, defining states, and constructing Bayesian
networks with temporal-state transitions.
Step 1: Discretizing continuous variables
The variables of conditions in the workflow processes
may be continuous, which creates the possibility of an
infinite number of states. Therefore it is necessary to
discretize the continuous variables to nominal variables
according to certain splitting criteria to enumerate important
states. Experts’ domain knowledge or clustering methods
can be used to discretize continuous variables. For example,
body temperature can be discretized into low (≤35℃), fair
(35℃ to 36.5℃), and high (>36.5℃).
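The discretization in Step 1 can be sketched as follows. This is a minimal illustration rather than the system described in this paper: the function name `discretize` and the sample readings are hypothetical, and only the temperature split points (35℃ and 36.5℃) come from the example above.

```python
from bisect import bisect_left


def discretize(value, cut_points, labels):
    """Return the nominal label whose interval contains value.

    cut_points holds the inclusive upper bound of every interval except
    the last one, so labels must have exactly one more entry.
    """
    if len(labels) != len(cut_points) + 1:
        raise ValueError("need exactly one more label than cut points")
    # bisect_left places a value equal to a cut point in the lower interval,
    # which matches "low is <= 35" and "high is > 36.5".
    return labels[bisect_left(cut_points, value)]


# Split points taken from the body temperature example in the text.
TEMPERATURE_CUTS = [35.0, 36.5]
TEMPERATURE_LABELS = ["low", "fair", "high"]

# Hypothetical readings, for illustration only.
readings = [34.8, 35.0, 36.2, 36.5, 37.1]
discretized = [
    discretize(t, TEMPERATURE_CUTS, TEMPERATURE_LABELS) for t in readings
]
print(discretized)
```

Split points chosen by domain experts or by a clustering method can be passed in the same way, one list of cut points per variable.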
Step 2: Consolidating contiguous similar states
If the values of two contiguous states fall in the same ranges
(except for the time variable), the two states
have no significant differences. In this case, we consolidate
the two states into one that represents a contiguous period
of time. By concatenating contiguous similar states, we
obtain a more concise state space for describing a workflow.
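Step 2 can be sketched as a single pass over the discretized log. The record layout `(time, values)` and the sample log are assumptions made for illustration, and keeping the start time of each run is one plausible reading of "a contiguous period of time".

```python
from itertools import groupby


def consolidate(records):
    """Merge runs of contiguous records whose non-time values are identical.

    Each record is a (time, values) pair. The merged record keeps the
    start time of the run it replaces.
    """
    merged = []
    for values, run in groupby(records, key=lambda record: record[1]):
        start_time = next(run)[0]
        merged.append((start_time, values))
    return merged


# Hypothetical discretized log: (minutes since start, (temperature, pressure)).
log = [
    (0, ("fair", "normal")),
    (5, ("fair", "normal")),
    (10, ("high", "normal")),
    (15, ("fair", "normal")),
]
consolidated = consolidate(log)
print(consolidated)
```

Only contiguous duplicates are merged, so a state that recurs later in the process (as at minute 15 here) remains a separate record.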
Step 3: Defining events and actions
When there are differences between two
contiguous states, an event can be specified between the two
states. An action is performed after a certain event occurs;
alternatively, a state transition may occur naturally, that is, without
intervening actions.
Step 4: Specifying distinct states
After executing Step 2, any two contiguous states are
different. We use a name-mapping method to assign each state a
state identification Sid, where the subscript id denotes the unique number
in the state table of the database. When a new record is inserted
into the database, an attempt is made to match its state to an
existing state id. If there is a match, we assign the matched
identifier to the inserted state. Otherwise, we issue a distinct
state identifier to the new record. By doing this, each state is
distinct.
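The name-mapping method of Step 4 can be sketched with an in-memory dictionary standing in for the state table of the database. The class name `StateTable` and the `S<number>` label format are assumptions made for illustration.

```python
class StateTable:
    """In-memory stand-in for the state table: maps state values to ids."""

    def __init__(self):
        self._ids = {}

    def state_id(self, values):
        """Return the identifier of a state, issuing a new one if unseen."""
        key = tuple(values)
        if key not in self._ids:
            # No existing state matches, so issue the next unused number.
            self._ids[key] = len(self._ids) + 1
        return "S{}".format(self._ids[key])


table = StateTable()
first = table.state_id(("fair", "normal"))
second = table.state_id(("high", "normal"))
repeat = table.state_id(("fair", "normal"))
print(first, second, repeat)
```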
Step 5: Constructing Bayesian networks with temporal-state
transitions
A Bayesian network is a graphical representation of
conditional probability distributions for a set of variables,
and represents the causal relationships between variables.
Because workflow logs are sequential records, the workflow
logs themselves imply the causal relationships. Since time is
irreversible, the Bayesian networks can be constructed
directly along the time line.
In our Bayesian network, a state is represented by a set
of variables (excluding time, event, and action), which is
different from traditional Bayesian network construction. A
pair of an event and an action is defined as the state value.
The occurrence of state variables is encoded with
conditional probabilities. Therefore, we can construct a
Bayesian network on a coordinate system in which the y-axis is the
state identification and the x-axis is the time line, as shown in
Figure 1. Note that, in the state T5S4, the asterisk (*) in
P(E7A1|E1A2,*) indicates that E1A2 and E8A9 are mutually
exclusive, and will not occur at the same time. By tracing a
Definition 1: Log database (L)
A log database of a workflow consists of records,
denoted as L = {Pi, Tij, Sij}, where Pi is the process i, Tij is the
recorded start time of this log, and Sij is the state of the
process i at time Tij.
Definition 2: State (S)
A state represents a certain condition of the process over
a time duration. A state is defined by all variables excluding
time, event and action. It can be represented as Sj = (cj1, cj2,
cj3, …, cjN), where cjk is the value of the kth variable in Sj, and
Sj is unique in the state set.
Definition 3: Event (E)
An event occurs and makes contiguous states Si and Sj
different. That is, Ek: Si → Sj. Note that Ek is composed of a
subset of state variables.
Definition 4: Action (A)
An action is performed in reaction to a given event. If
the state transitions from Si to Sj naturally, the action is
represented by the empty set {φ}. The action taken on the
occurrence of an event can be represented formally as EkAh:
Si → Sj.
Definition 5: Path graph (Gp)
A process record contains the process name, recorded
time, state variables, events, and actions. That is, a process
record can be represented in the format {Pi, T, S, (Event
∩Action)}. Once we have collected process records, we use
them to create a process path with temporal-state
transitions. A path graph is denoted as Gp=(V, E, μp), where
V is a finite nonempty set of vertices, E ⊂ V × V is a set of
directed edges, and μp(V)=〈T, S, (Event∩Action)〉 is a
function assigning attributes and marking labels to each
vertex. In 〈T, S, (Event∩Action)〉, T is the start time of the
vertex, S is the name of the vertex, and (Event∩Action) is
the value of this vertex.
Definition 6: Bayesian network (B)
A Bayesian network (B) consists of many path graphs.
Therefore, B has the same components as a path graph.
When a path graph is added to B, the conditional probability
of each vertex of B is recalculated. When all path graphs
have been added to B, the probabilities of all vertices are
encoded in B. A Bayesian network can be represented as
∪Gpi =(V, E, μB), where V is a finite nonempty set of nodes,
E ⊂ V × V is a set of directed edges, and μB(V)=〈T, S, ∪
(Event∩Action), p〉 is a function assigning attributes and
marking labels to each vertex. T is the start time of the
vertex, S is the name of the vertex, (Event∩Action) is the
value of this vertex, and p is the conditional probability of
each value.
3.2 The process of constructing Bayesian networks
for representing temporal-state transitions
We propose a process of constructing Bayesian
networks with temporal-state transitions which consists of
path, only one of the states, T3S2 or T1S3, leads to state T5S4
at any one time. That is, only one prior condition exists,
denoted as P(E7A1|E1A2,*) or P(E7A1|*,E8A9).
clinical pathway of Hemodialysis.
Finally, various scenarios are used to illustrate the
benefits of these research results.
4.1 Representing clinical pathways of Hemodialysis
The efforts involved in using Bayesian networks to represent
the clinical pathway of Hemodialysis are spent on data
collection, data cleaning, Bayesian network construction,
clinical path prediction, and Bayesian network propagation.
Figure 2 illustrates the model of applying Bayesian networks
to generalize clinical pathways of Hemodialysis. The
tasks involved are elaborated as follows.
Figure 1. An example of Bayesian networks
(Figure not reproduced. The y-axis lists the states, SSE and S1 to S10, and the x-axis is the time line, T0 to T10. Vertices carry the event-action values EsAs, E1A2, E2A1, E2A5, E2A8, E4A3, E4A5, E7A1, E7A2, E8A6, and E8A9. Links are labeled with the conditional probabilities P(E1A2|EsAs)=0.7, P(E2A1|EsAs)=0.3, P(E8A9|E2A1)=1, P(E8A6|E8A9)=0.8, P(E2A5|E8A6)=1.0, P(E7A1|E1A2,*)=0.3, P(E4A3|E1A2,*)=0.7, P(E7A1|*,E8A9)=0.04, P(E4A3|*,E8A9)=0.16, P(E2A8|E7A1,*)=0.3, P(E7A2|E7A1,*)=0.7, P(E2A8|E4A3,*)=0.5, P(E7A2|E4A3,*)=0.5, P(E4A5|E2A8,*)=0.2, P(E1A2|E2A8,*)=0.8, P(E4A5|E7A2,*)=0.7, P(E1A2|E7A2,*)=0.3, P(E4A5|*,E2A5)=0.7, and P(E1A2|*,E2A5)=0.3.)
After we construct the Bayesian network, we can
calculate the probability distributions in the network. For
example, in Figure 1, a process reaches the vertex at T3S2
with value E8A9. There are two paths including ten
conditions starting from T3S2. One path is
{T3S2→T5S4→T9S3→T11SSE}, and the other is
{T3S2→T7S1→T9S2→T11SSE}. There are ten different
combinations of events and actions in these two paths.
A Bayesian network can be updated after inserting new
process data. Each new process is prepared by discretizing
continuous variables, consolidating contiguous similar
states, and defining events and actions as described in Steps 1
to 3. By keeping the count of each vertex, we can easily
obtain the new probability distribution of each vertex after
inserting new data.
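The count-based update described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: each vertex value is conditioned only on the value of the immediately preceding vertex in the same path, and the class name `TransitionNetwork`, its methods, and the sample paths are hypothetical. The vertex labels (such as T5S4) and values (such as E1A2) follow the naming used in Figure 1.

```python
from collections import Counter, defaultdict


class TransitionNetwork:
    """Keeps occurrence counts so probabilities can be refreshed cheaply."""

    def __init__(self):
        # counts[(vertex, parent_value)][value] = number of observed paths
        self.counts = defaultdict(Counter)

    def add_path(self, path, start_value="EsAs"):
        """Insert one process path, a time-ordered list of (vertex, value)."""
        parent_value = start_value
        for vertex, value in path:
            self.counts[(vertex, parent_value)][value] += 1
            parent_value = value

    def probability(self, vertex, value, parent_value):
        """Estimate P(value | parent_value) at a vertex from the counts."""
        seen = self.counts.get((vertex, parent_value), Counter())
        total = sum(seen.values())
        return seen[value] / total if total else 0.0


net = TransitionNetwork()
# Hypothetical paths, for illustration only.
net.add_path([("T1S3", "E1A2"), ("T5S4", "E7A1")])
net.add_path([("T1S3", "E1A2"), ("T5S4", "E4A3")])
net.add_path([("T1S3", "E1A2"), ("T5S4", "E4A3")])
before = net.probability("T5S4", "E4A3", "E1A2")

# Inserting a new process only increments counts along its own path.
net.add_path([("T1S3", "E1A2"), ("T5S4", "E7A1")])
after = net.probability("T5S4", "E4A3", "E1A2")
print(before, after)
```

Because only counts are stored, inserting a new process touches just the vertices on its path, and every probability is recomputed on demand from the current counts.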
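Figure 2. The Model of applying Bayesian networks to
generalize Hemodialysis clinical pathways
(Figure not reproduced. Patient data and Hemodialysis log data feed six stages: 1. Data collection; 2. Preprocessing; 3. Constructing Hemodialysis clinical pathways using Bayesian network; 4. Evaluating Hemodialysis clinical pathways by professionals; 5. Hemodialysis clinical pathways prediction and suggestion; 6. The propagation of Bayesian networks. The result is the set of Hemodialysis clinical pathway patterns in Bayesian networks.)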
1. Data collection
The data used for this research were collected from the
Division of Nephrology at Kaohsiung Municipal Women
and Children General Hospital during August and
September 1999. The data consist of three categories: (1)
patient profile, including patient identification, gender,
birthday, and age; (2) pre-dialysis data, such as start time, systolic
and diastolic pressure, weight, weight difference between
dialysis sessions, pulse, dialysis machine model, dialyzer, and
dialysis fluid; and (3) Hemodialysis log data, composed of
log time, systolic and diastolic pressures, pulse, blood
velocity, dialysis fluid velocity, excess weight filtered,
physiological saline, patient’s chronic conditions, and treatment.
2. Pre-processing
In order to generate accurate clinical patterns, data
accuracy is very important. We consulted doctors and
nurses to determine the important variables and their domains.
We clean new data in the following steps: (1) filling
missing values from the contiguous data over the extended
periods, (2) deleting records with too many missing
values, and (3) censoring data that exceed the domain.
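The three cleaning steps can be sketched as one pass over the records of a dialysis session. Several choices here are assumptions made for illustration and are not stated in the text: out-of-domain values are censored by treating them as missing, gaps are filled from the previous kept record, the steps are applied in that order within each record, and the domains, the `max_missing` threshold, and the sample records are hypothetical.

```python
def clean(records, domains, max_missing=1):
    """Censor out-of-domain values, drop sparse records, fill small gaps.

    records is a time-ordered list of dicts; domains maps a variable name
    to its (low, high) valid range.
    """
    cleaned = []
    previous = {}
    for record in records:
        row = dict(record)
        # Censor: a value outside its domain is treated as missing.
        for name, (low, high) in domains.items():
            value = row.get(name)
            if value is not None and not (low <= value <= high):
                row[name] = None
        missing = [name for name in domains if row.get(name) is None]
        # Delete records with too many missing values.
        if len(missing) > max_missing:
            continue
        # Fill the remaining gaps from the previous kept record, if any.
        for name in missing:
            row[name] = previous.get(name)
        cleaned.append(row)
        previous = row
    return cleaned


# Hypothetical domains and log records, for illustration only.
domains = {"systolic": (50, 250), "pulse": (30, 200)}
raw = [
    {"time": 0, "systolic": 130, "pulse": 80},
    {"time": 5, "systolic": None, "pulse": 82},
    {"time": 10, "systolic": None, "pulse": None},
    {"time": 15, "systolic": 900, "pulse": 85},
]
result = clean(raw, domains)
print(result)
```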
3. Constructing the Hemodialysis clinical pathway
The construction of Hemodialysis clinical pathway
4. The application of Bayesian networks to
discovering the clinical pathway patterns of
Hemodialysis
Bayesian methods are not new to medical care. In fact,
Bayes’ theorem has been used successfully in medical
expert systems for about thirty years [10]. However, this
research is an initial effort at using Bayesian networks for
representing temporal-state transitions in Hemodialysis and
demonstrating reciprocal knowledge management for
better quality of dialysis services.
In this section, we
demonstrate how to apply the Bayesian network to
represent temporal-state transitions in Hemodialysis.
First, we briefly show how to apply a Bayesian network
framework to represent the clinical pathway of
Hemodialysis. Second, we specify the stages of constructing
the temporal-state transitions in Bayesian networks as the
First, we obtain an initial state Ss, which is a common start
vertex. Second, we record the state occurrences in each vertex.
Therefore, we can calculate the probabilities in each vertex
and easily propagate the Bayesian network. Finally, we
construct the Bayesian network.
patterns follows such steps as discretizing continuous
variables, consolidating contiguous similar records, defining
events and actions, specifying distinct states, and then
constructing Bayesian networks with temporal-state
transitions.
4. Evaluating the pattern of clinical pathways by
professionals
To ensure the resulting patterns are meaningful for
professionals in Hemodialysis, a team of dialysis
professionals including doctors and nurses should evaluate
these patterns to verify their quality.
5. Clinical pathway prediction and suggestion
The clinical path patterns can be used to predict possible
outcomes of a new patient admitted to the Hemodialysis. By
estimating the possible result, a clinical professional can
take actions in reaction to certain events during the dialysis
to induce desirable outcomes.
6. The propagation of Bayesian networks
The clinical path patterns can be maintained by
propagating the existing patterns with the new patients’ data
as described in Section 3.
4.2 Constructing the Hemodialysis clinical pathways
5. The Empirical Results and Discussion
The dialysis data collected from the Division of
Nephrology at Kaohsiung Municipal Women and Children
General Hospital during August and September 1999
include 140 patients and 2213 records in total. We developed
the system in Java to visualize networks and
calculate probabilities. Users can operate this system
through Web browsers. Applying Bayesian networks to
represent the clinical pathways of Hemodialysis as proposed in
Section 4, we generalize the
temporal-state transitions in Hemodialysis and predict the
clinical pathways for individuals and groups, respectively.
The dialysis process represented by Bayesian networks can
be used for on-site suggestion and clinical knowledge
management, as elaborated in Subsection 5.3.
5.1 The generalization of temporal-state transition
patterns in Hemodialysis
From clinical data, we build Bayesian networks to
represent the clinical pathway of Hemodialysis. The
construction of the Hemodialysis clinical pathways proceeds
with the following steps.
Step 1: Discretizing continuous variables
Most of the dialysis data variables are continuous
variables. The discretization of these continuous values is
necessary to obtain distinguishable states. The splitting
points and results are shown in Tables 5 and 6, respectively.
Step 2: Consolidating contiguous similar states
After discretizing continuous variables, contiguous
similar data are assigned to the same discretized value, and
then consolidated. By consolidating contiguous similar
records, we obtain concise records to describe the
Hemodialysis clinical pathway.
Step 3: Defining events and actions
By consolidating contiguous similar records, we
ensure that every two contiguous records are different.
That is, there exists an event and an action between any two
contiguous records.
Step 4: Specifying distinct states
We specify distinct states, and then transform the
dialysis data with these distinct states.
Step 5: Constructing Bayesian networks with temporal-state
transitions
Once we have transformed dialysis logs into the four-tuple {T,
S, (E∩A)}, we can start to construct Bayesian networks.
After constructing Bayesian networks for
temporal-state transitions, we can display the clinical
pathway patterns for individuals or a group of patients
across an extended period of time. Figures 3 and 4 are two
example Bayesian networks showing individual and group
clinical pathways, respectively. In these figures, the y-axis
denotes the systolic pressure, and the x-axis denotes the time
sequence, where each scale unit represents five minutes. Nodes in
the figures represent states occurring in the dialysis. A link
from one state to another represents an action triggered
by an event in the source state. A probability can be added
to each link to show the percentage of cases that move from the
occurrence of events at the source state to the destination
state. The bar on the right side represents the scale of quality for
the dialysis process: the lighter the bar, the better the dialysis
quality; the darker the bar, the worse the quality. Table 1 specifies
the attributes used for determining the dialysis quality. The
situations attributed to the worst quality are life
crucial exceptions.
The resulting clinical pathways for individuals can be
used for tracing the individual dialysis history to achieve
better individual care. The group clinical pathways reveal
the common sequence of individuals within the group, and
highlight variations so that medical professionals can analyze their
causes and possible treatments in the future.
predicting the clinical pathways of incoming patients, we
perform the following experiments to evaluate the coverage of
the generalized patterns. First, we evaluate coverage at the
individual level by using an individual's dialysis patterns
across these two months to predict her/his latest dialysis
pathway. The result shows that the dialysis parameters are
tuned according to the patient's condition on arrival;
thus, an individual is treated with large variation each time.
Therefore, coverage is very low.
Second, we evaluate coverage at the group level by
using group dialysis patterns to predict the individual
clinical pathways. We clustered the clinical pathways into
three groups, Group A, B, and C, using the
demographic clustering function in Intelligent Miner™
based on the attributes of weight before dialysis, age,
systolic pressure, weight difference after dialysis, diastolic
pressure, and pulse. The resulting three groups are shown in
Figure 5, where the clinical pathways assigned to Group A, B,
and C are 38%, 32%, and 25%, respectively. We select 10%
of the clinical pathways from each group as the testing set to
evaluate coverage for each group's patterns. Coverage for a
pathway is defined as the percentage of states
in the pathway that share the same links with pathways in a
group. Coverage of a testing set is the average of the
percentages obtained from each pathway in the testing set.
Figure 3. A patient’s Bayesian Network
Formally, we define coverage as

$$\text{Coverage} = \frac{1}{N}\left(\frac{h_1}{n_1} + \frac{h_2}{n_2} + \cdots + \frac{h_N}{n_N}\right),$$

where $N$ is the total
number of testing pathways, $n_i$ is the number of links of the
ith pathway, and $h_i$ is the number of links matching the ith
pathway.
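The coverage measure can be computed as sketched below, reading the formula as the average over the testing pathways of the fraction of matched links. The representation of a link as a `(source state, destination state)` pair, the function name `coverage`, and the sample data are assumptions made for illustration; time-window matching is not modeled here.

```python
def coverage(test_pathways, group_links):
    """Average, over the testing pathways, of matched links / total links.

    test_pathways is a list of non-empty pathways, each a list of links;
    group_links is the set of links present in the group's network.
    """
    ratios = []
    for links in test_pathways:
        matched = sum(1 for link in links if link in group_links)  # h_i
        ratios.append(matched / len(links))                        # h_i / n_i
    return sum(ratios) / len(ratios)                               # divide by N


# Hypothetical group network and testing set, for illustration only.
group_links = {("S1", "S2"), ("S2", "S3"), ("S3", "S4")}
testing_set = [
    [("S1", "S2"), ("S2", "S3")],
    [("S1", "S2"), ("S2", "S5"), ("S5", "S4"), ("S3", "S4")],
]
score = coverage(testing_set, group_links)
print(score)
```

In this example the first pathway matches 2 of 2 links and the second 2 of 4, so the coverage is (1.0 + 0.5) / 2.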
Figure 4. A group patients’ Bayesian Network
Table 1. The variables for dialysis quality
Quality variables | Quality indicator
Coma, sleeping, convulsions, palpitation, dyspnea, unconscious | Life crucial
Systolic and diastolic pressures increasing or decreasing 20%, suffocation | Medium exceptions
Headache, vomiting, stomach ache, abdominal_dist, diarrhea, yawning, spasm, chillness, tumid, fever, blood_nf, tub_hemorrhage, tub_inflammation, tub_oppilation, vein_loss, vein_clot, skin_itch, tinnithus, cold_sweat, Malaise, groan, restless, fatigue | Minor uncomfortable
Figure 5. The clustering results using Intelligent Miner™
The results are shown in Figure 6, where the x-axis
denotes the time window size, and the y-axis denotes
coverage. The time window is the allowance of time
forward or backward in matching states. One time window
unit is set to 5 minutes in the experiment. The results show
a trend of increasing coverage as the time window size
5.2 The prediction of temporal-state transitions in
Hemodialysis
In order to further analyze the generalized patterns for
~6~
0-7695-1435-9/02 $17.00 (c) 2002 IEEE
Proceedings of the 35th Annual Hawaii International Conference on System Sciences (HICSS-35’02)
0-7695-1435-9/02 $17.00 © 2002 IEEE
6
Proceedings of the 35th Hawaii International Conference on System Sciences - 2002
increases in these three groups. This implies that the
occurrence of states, events, and actions cannot be
accurately specified within a narrow time frame in the
Hemodialysis process. Relaxing the time frame
improves coverage, although coverage is not as high as we
expected. In future research, considerable improvement
can be made in various directions, such as data quality and
pattern generalization.
professionals can improve their clinical decisions, which in
turn enhances the quality of the knowledge base. This can be
viewed as an organizational knowledge management effort
to acquire, store, and evaluate knowledge of dialysis, and in turn
improve organizational competitiveness. In the next two
subsections, we will further illustrate these extensions.
5.3.1 On-site suggestions. We illustrate the on-site
suggestion function with the following scenario. As a
patient is admitted into the Hemodialysis process, he or she
is assigned to a dialysis machine. Nurses assign initial
settings, such as the blood velocity, dialysis fluid velocity,
and saline flow, according to the clinical pathway patterns
exhibited by the Bayesian network system. By recording the
events that occur and the actions taken, the Bayesian network
traces the progress of the dialysis. If an event occurs and
nurses have no previous experience with it or lack confidence in
how to handle it, the Bayesian network system can
calculate the percentage of cases in which certain actions were taken, and
predict the resulting quality after taking each action. This
on-site tracing and suggestion capability enables nurses to
incorporate experiences from clinical pathways contributed
by other colleagues, which enlarges their set of decision
alternatives and in turn may enhance their learning
performance. Ultimately, it benefits the patients.
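The on-site suggestion scenario can be sketched as a lookup over previously recorded cases. This is an illustration under stated assumptions rather than the system described in this paper: the functions `build_action_stats` and `suggest`, the `(event, action, quality)` record layout, the quality labels, and the sample observations are all hypothetical, and the predicted quality is simply the most frequent outcome observed for each action.

```python
from collections import Counter, defaultdict


def build_action_stats(observations):
    """Count outcomes per event and action from (event, action, quality)."""
    stats = defaultdict(lambda: defaultdict(Counter))
    for event, action, quality in observations:
        stats[event][action][quality] += 1
    return stats


def suggest(stats, event):
    """Return (action, share of past cases, most frequent quality) tuples,
    ordered from the most to the least frequently taken action."""
    actions = stats.get(event, {})
    total = sum(sum(q.values()) for q in actions.values())
    suggestions = []
    for action, qualities in actions.items():
        taken = sum(qualities.values())
        likely_quality = qualities.most_common(1)[0][0]
        suggestions.append((action, taken / total, likely_quality))
    return sorted(suggestions, key=lambda item: item[1], reverse=True)


# Hypothetical past cases, for illustration only.
history = [
    ("E7", "A1", "good"),
    ("E7", "A1", "good"),
    ("E7", "A1", "minor"),
    ("E7", "A2", "medium"),
]
stats = build_action_stats(history)
options = suggest(stats, "E7")
print(options)
```

An event with no recorded cases yields an empty list, which signals that the pattern base offers no precedent for that situation.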
Figure 6. Coverage with different time window size
(Figure not reproduced. The x-axis is the time window size, from 0 to 12; the y-axis is coverage, from 0 to 0.35; one series each for Group A, Group B, and Group C.)
5.3 Discussion
In Subsection 5.1, we demonstrate the usage of
Bayesian networks for generalizing the temporal state
transitions in Hemodialysis. Through the visualization
interface, the medical professionals can obtain the structure
of clinical pathways of individuals and groups. This
provides them a basis for them to evaluate the quality of
treatment, and facilitate variation analysis. In Subsection
5.2, we test the predictability of the clinical pathway patterns
of Hemodialysis in individual and group levels. The results
show that the clinical pathway patterns have better coverage
for predicting the clinical pathway of an individual patient in
his or her similar group as the window size increases.
However, coverage is not as high as we anticipate. The low
coverage may result from data quantity and quality. We only
collected data from two months, which may not cover most
of the cases for generalizing the pathway patterns, and the
data cleaning efforts cannot remove all of noisy data.
Therefore, in the future, we can further verify the result that
the group patterns can be used for predicting the possible
pathways of individuals embedded with the group
characteristics by improving the data quality and increasing
data quantity. In summary, we have initiated the efforts on
formulating the Hemodialysis as the temporal-state
transitions using Bayesian networks, and obtain better
coverage in the group level. We can extend the analysis to
perform on-site suggestion along the dialysis process to give
nurses real-time suggestions to take actions to respond to
certain events. The clinical pathway pattern base can serve
as the knowledge base, and through the cycle of variation
analysis, execution, and evaluation, the medical
Figure 7. The reciprocal learning cycle of the Hemodialysis
[Diagram components: Pattern generation, Pattern base, On-site
suggestion, Execution, Data log, Data base, Evaluation.]
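The on-site suggestion function described in Subsection 5.3.1 can be sketched as follows. This is a minimal illustration that uses plain frequency counts over logged records as a stand-in for inference on the Bayesian network. The record layout (event, action, quality), the function name `suggest_actions`, and all event, action, and quality labels are hypothetical and are not taken from the system described in this paper.

```python
from collections import Counter, defaultdict


def suggest_actions(log, event):
    """For one event, return {action: (share, most_frequent_quality)}.

    `share` is the fraction of logged cases of this event in which the
    action was taken; the quality is the outcome seen most often after it.
    """
    action_counts = Counter()
    quality_counts = defaultdict(Counter)
    for ev, action, quality in log:
        if ev == event:
            action_counts[action] += 1
            quality_counts[action][quality] += 1

    total = sum(action_counts.values())
    suggestions = {}
    for action, n in action_counts.items():
        best_quality, _ = quality_counts[action].most_common(1)[0]
        suggestions[action] = (n / total, best_quality)
    return suggestions


# Hypothetical log of (event, action taken, resulting quality).
log = [
    ("low_blood_pressure", "reduce_blood_velocity", "good"),
    ("low_blood_pressure", "reduce_blood_velocity", "good"),
    ("low_blood_pressure", "increase_saline_flow", "fair"),
    ("low_blood_pressure", "reduce_blood_velocity", "fair"),
    ("cramp", "increase_saline_flow", "good"),
]

suggestions = suggest_actions(log, "low_blood_pressure")
for action, (share, quality) in suggestions.items():
    print(f"{action}: taken in {share:.0%} of cases, usual quality {quality}")
```

An event that never appears in the log yields an empty result, which matches the situation where the pattern base has no experience to offer.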
5.3.2 Facilitating reciprocal learning cycle and
organizational knowledge management. The additional
effort of developing Intranet infrastructure and database
applications can be justified by the benefits of the reciprocal
learning cycle and organizational knowledge management.
Figure 7 illustrates the reciprocal learning cycle in
improving the Hemodialysis quality within the Nephrology
department.
The clinical pathway patterns can be
generalized from the log data in daily dialysis operations.
Periodically, medical professionals evaluate dialysis
variation under various situations to obtain an overall
perception of the performance of professionals in
the department. At this point, the pattern base acts as the
organizational memory, which can be used by professionals
for on-site suggestion to new patients. The log data
generated during the dialysis process are added to the
database, which can be used for refining the pattern base. In
turn, the patterns suggested by the system become more effective
in handling the progress of dialysis. Therefore, we view the
reciprocal learning cycle as the core of organizational
knowledge management.
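The refinement step of this cycle can be sketched as an incremental update of transition counts. The sketch assumes, for illustration only, that the pattern base stores how often one state follows another and re-estimates transition probabilities from those counts whenever a new logged pathway is added. The class `PatternBase`, its methods, and the state labels are hypothetical and are not the implementation used in this research.

```python
from collections import Counter, defaultdict


class PatternBase:
    """Keeps state-to-state transition counts and derives probabilities."""

    def __init__(self):
        # counts[current][next] = number of observed transitions
        self.counts = defaultdict(Counter)

    def add_pathway(self, states):
        """Add one logged pathway given as an ordered list of states."""
        for current, nxt in zip(states, states[1:]):
            self.counts[current][nxt] += 1

    def transition_probability(self, current, nxt):
        """Relative frequency of `nxt` following `current` (0.0 if unseen)."""
        total = sum(self.counts[current].values())
        return self.counts[current][nxt] / total if total else 0.0


base = PatternBase()
base.add_pathway(["S1", "S2", "S3"])
base.add_pathway(["S1", "S2", "S4"])
before = base.transition_probability("S2", "S3")

# A newly logged dialysis session refines the pattern base.
base.add_pathway(["S1", "S2", "S3"])
after = base.transition_probability("S2", "S3")

print(f"P(S3 | S2) before: {before:.2f}, after: {after:.2f}")
```

Because only counts are stored, each new pathway updates the estimates without reprocessing the earlier log data.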
6. Conclusions and Future Research
In this research, we have worked on the following
endeavors: (1) representing temporal-state transitions using
Bayesian networks, (2) viewing the Hemodialysis process as
temporal-state transitions, and using Bayesian networks
for generalizing the clinical pathways of the dialysis, and (3)
evaluating the coverage of the clinical pathway patterns using
real-world data collected from a hospital. The major
achievements of these endeavors are summarized as
follows:
(1) Developing systematic methods to convert data into
temporal-state transitions. The dialysis parameters, such
as body temperature, pulse, systolic pressure, and
diastolic pressure, are discretized and treated as
attributes that represent states. A change of machine
settings is treated as an action taken by nurses when
certain events occur.
(2) Displaying Bayesian networks using visualization
technology to facilitate on-site suggestion by tracing the
progress of current events.
(3) Suggesting a possible set of states for Bayesian networks
from the data. This is a unique contribution of using
Bayesian networks for temporal-state transitions, and it
differs from the traditional approach in which users
specify the states.
(4) Exhibiting the coverage of the generalized patterns at the
group level by testing the patterns with unseen
pathways. The results encourage dialysis professionals
to select clinical pathways according to the
pathways of patients with similar characteristics.
In future research, the following directions can be
followed. First, the field data collection system can be
enhanced, building on the batch data entry system used during
this research, in order to obtain high-quality raw data at low cost.
Second, clinical professionals can proactively evaluate the
clinical pathway patterns and analyze variations among
different settings and patients, so that the data collected
from the revised medical treatments can refine the pattern
base. Third, the reciprocal learning cycle can be better
measured to bring insights into organizational knowledge
management.
References
[1] W. Buntine, “A guide to the literature on learning probabilistic
networks from data,” IEEE Transactions on Knowledge and
Data Engineering, Vol. 8, No. 2, pp. 195-210, 1996.
[2] D. Heckerman and R. Shachter, “A definition and graphical
representation for causality,” Proceedings of the Eleventh
Conference on Uncertainty in Artificial Intelligence, Montreal,
1995.
[3] F.-r. Lin, S.-c. Chou, S.-m. Pan, and Y.-m. Chen, “Mining time
dependency patterns in clinical pathways,” Proceedings of the
33rd Hawaii International Conference on System Sciences,
2000.
[4] National Kidney Foundation, NKF-DOQI, “Clinical Practice
Guidelines for Hemodialysis Adequacy,” American Journal of
Kidney Diseases, Vol. 30, No. 3, Suppl. 2, pp. 17-66, September
1997.
[5] J.R. Quinlan, “Induction of decision trees,” Machine Learning,
Vol. 1, No. 1, pp. 81-106, 1986.
[6] M.J.A. Berry and G. Linoff, “Artificial neural networks,”
in Data Mining Techniques: for Marketing, Sales, and
Customer Support (Chapter 13), John Wiley & Sons, Inc.,
1997.
[7] D. Heckerman, “Bayesian networks for knowledge
representation and learning,” Advances in Knowledge
Discovery and Data Mining, MIT Press, 1995.
[8] D. Heckerman, “Bayesian networks for data mining,” Data
Mining and Knowledge Discovery, Vol. 1, No. 1, pp. 79-119,
1997.
[9] E. Charniak, “Bayesian networks without tears,” AI Magazine,
Vol. 12, No. 4, pp. 50-63, 1991.
[10] F.T. De Dombal, et al., “Computer-aided diagnosis of acute
abdominal pain,” British Medical Journal, Vol. 2, pp. 9-13,
1972.