Proceedings of the 35th Hawaii International Conference on System Sciences - 2002
0-7695-1435-9/02 $17.00 (c) 2002 IEEE

Using Bayesian Networks for Discovering Temporal-State Transition Patterns in Hemodialysis

Fu-ren Lin, Chih-hung Chiu, San-chiang Wu
Department of Information Management, National Sun Yat-sen University, Kaohsiung, Taiwan 804, ROC
Kaohsiung Municipal Women and Children General Hospital, Kaohsiung, Taiwan 804, ROC
[email protected]

Abstract

In this paper, we adopt Bayesian networks for discovering temporal-state transition patterns in the Hemodialysis process. A Bayesian network is a graphical model that encodes probabilistic relationships among variables and easily incorporates new instances to keep rules up to date. We demonstrate the proposed method by representing the causal relationships between medical treatments and transitions of a patient's physiological states in the Hemodialysis process. The discovered Hemodialysis clinical pathway patterns can be used for predicting possible paths for an admitted patient, and for helping medical professionals react to exceptions during the Hemodialysis process. The discovery of clinical pathway patterns enables a reciprocal learning cycle for medical organizational knowledge management.

1. Introduction

The importance of extracting professional knowledge from domain experts and representing it in an explicit form has been widely recognized. Taking the medical industry as an example, medical professionals make decisions throughout the clinical paths. When is the right time to perform the next therapeutic intervention? To what extent should a patient with a medical history of high blood pressure or diabetes remain comfortable during hemodialysis? These questions indicate the need to explicitly document the professional knowledge embedded in medical professionals. An explicit knowledge representation facilitates communication between medical professionals, accelerates practical training, and supports professional judgment.

The probabilistic network is usually used for representing knowledge of temporal-state transitions from workflow logs. A probabilistic network consists of states, pathways, and causal probabilistic relationships [1]. It is an explicit representation of inter-dependencies between variables that ignores the specific numeric or functional details. Depending on interpretation, it can also represent causality [2]. Probabilistic networks are a good model for representing frequent state transitions in medical processes. The constructed probabilistic network is kept up to date by propagating new instances to the existing networks. The probabilistic network can be used for proposing alternatives for decision making, and for facilitating communication and training among professionals.

Due to the prevalence of information technology in medical care, data collected from clinical processes can be used for discovering useful patterns. These patterns can be analyzed automatically or by medical professionals in order to develop better strategies to improve the quality of medical treatment. For example, numerous physiological states of patients, medical activities, and therapeutic interventions performed by physicians, nurses, and other staff are recorded for future reference and as legal documents. These records, as medical histories, are valuable sources for discovering clinical rules and enhancing clinical knowledge with the help of information technology (e.g. [3]).

This paper reports our research in adopting Bayesian networks for discovering temporal-state transition patterns in the Hemodialysis process. The discovered Hemodialysis clinical pathway patterns can be used for predicting possible paths for an admitted patient, and for helping medical professionals react to exceptions during the Hemodialysis process. We also suggest that knowledge management systems for the Hemodialysis process can be created using the results of this study.

In Section 2, we introduce the theory of Bayesian networks and explain why Bayesian networks are adopted to represent probabilistic networks. Bayesian networks are constructed to codify knowledge of temporal-state transitions from workflow logs in Section 3. In Section 4, we apply the Bayesian networks to represent causal relationships between transitions of a patient's physiological states and medical treatments in the Hemodialysis process. Section 5 illustrates the usage of Bayesian networks for generalizing clinical paths of Hemodialysis and supporting medical decision-making. Conclusions and future research are described in Section 6.

2. Literature Review

2.1 The Hemodialysis

Normally, the human body has two kidneys positioned in the back of the body behind the lower ribs. A kidney is a filtering system. As blood passes through the kidneys, the wastes and excess fluid it contains are filtered out and flow through the ureters to the bladder. They are passed out of the body as urine. Healthy kidneys produce a total of about 1.5 to 2.5 quarts of urine daily. Kidneys also balance chemicals (sodium, potassium, calcium, phosphorous and others) in the blood, and produce hormones, which help to regulate blood pressure, stimulate red blood cell production, and promote strong bones. When both kidneys fail, the body holds fluid and the blood pressure rises. There is an accumulation of certain toxic substances (urea and creatinine) and an increase in the volume of body water. This excess water results in swelling of tissues and high blood pressure, and prevents normal activities of other organs, such as the heart and lungs.
For example, if the body cannot remove its own excess phosphate, calcium levels drop, resulting in bone diseases; harmful wastes build up in the body, and the body will not produce enough red blood cells. If total kidney function drops below ten percent of normal capacity and the impairment is irreversible, the condition is known as end-stage renal disease (ESRD). When this situation occurs, artificial dialysis is needed to replace the work of the failed kidneys.

There are two different kinds of dialysis: one is called Hemodialysis, which cleans blood outside the body via a machine, and the other is called peritoneal dialysis, which cleans blood inside the body. Both peritoneal dialysis and Hemodialysis are based on the same principles. The first principle is osmosis, which denotes that water moves from a low concentration of particulates to a higher concentration, and the second one is diffusion, which denotes that particles spread out evenly in a solution [4].

Hemodialysis (also called kidney dialysis) is the most popular treatment for ESRD patients. It not only replaces the kidneys' function of cleaning and filtering blood to remove harmful wastes and extra salt and fluids, but also controls blood pressure and helps the body keep the proper balance of chemicals such as potassium, sodium, and chloride. In this procedure, two needles are inserted into a blood vessel (usually in the patient's arm). Each needle is attached to a thin length of tube. One tube carries blood to a machine containing a dialyzer, which is a unit comprised of many very fine hollow fibers. These fibers are made of a semi-permeable membrane. As blood flows through the fibers, dialysate flows around them, removing impurities and excess water and adjusting the chemical balance of the blood. After being cleansed and adjusted, the blood returns to the patient's body through the second tube. Less than five percent of a patient's blood is outside the body during the dialysis process.
The treatment lasts 3 to 6 hours and is usually performed three times a week. Nurses record the patient's status, such as blood pressure and pulse, and the operating state of the dialysis machine, such as dialysis fluid velocity and blood velocity.

2.2 Why Bayesian network is chosen

There are several data mining techniques available for extracting and representing knowledge from data. To represent the complex causal relationships and probabilistic semantics between numerous variables, three techniques are usually used: decision trees [5], artificial neural networks [6], and probabilistic networks. Buntine (1996) points out the characteristics that distinguish probabilistic networks from decision trees and artificial neural networks. Probabilistic networks have clear semantics that allow them to be processed for diagnosis, learning, explanation, and many other inference tasks necessary for intelligent systems. An artificial neural network usually has higher prediction accuracy, but fails to explicitly explain the causal relationship between input conditions and output outcomes. The decision tree technique is not suited to representing inter-dependencies or independencies between variables, and in our task it may allow a minority of special cases to be overlooked or ignored. The Bayesian network is a popular representation of the probabilistic network. Having weighed the fitness of these candidate techniques, we assert that the Bayesian network is more suitable for our task than the other two. The Bayesian network is a graphical model that encodes probabilistic relationships among variables of interest.
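As a brief illustration of what "encoding probabilistic relationships" means, the standard factorization from the Bayesian-network literature is shown here. It is not taken from this paper, and the symbols are assumptions introduced for illustration: X_1, ..., X_n are the variables of the network and Pa(X_i) is the set of parents of X_i in the directed graph.

```latex
P(X_1, \dots, X_n) = \prod_{i=1}^{n} P\bigl(X_i \mid \mathrm{Pa}(X_i)\bigr)
```

Each vertex therefore only needs a conditional probability table over its own values given its parents' values, which is what makes it possible to update the network incrementally from counts.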
Heckerman [7] states four initial steps to construct a Bayesian network: (1) correctly identify the goals of modeling, (2) identify many possible observations that may be relevant to the problem, (3) determine what subset of these observations is worthwhile to model, and (4) organize the observations into variables that can be assigned to distinct and exhaustive states. For tutorial articles on Bayesian networks, please refer to [8][9].

3. Constructing Bayesian Networks with temporal-state transitions from workflow logs

In this section, we propose a method for constructing Bayesian networks with temporal-state transitions from workflow logs. In the following subsections, we first define the terms used for representing temporal-state transitions and Bayesian networks. Second, we design the process of constructing Bayesian networks. Third, we describe how we perform inference on and propagate Bayesian networks.

3.1 Some definitions for constructing Bayesian networks

The following definitions will help us to understand the detailed process of constructing Bayesian networks described in the next subsection.

Definition 1: Log database (L)
A log database of a workflow consists of records, denoted as L = {Pi, Tij, Sij}, where Pi is process i, Tij is the recorded start time of the log entry, and Sij is the state of process i at time Tij.

Definition 2: State (S)
A state represents a certain condition of the process over a time duration. A state is defined by all variables excluding time, event, and action. It can be represented as Sj = (cj1, cj2, cj3, …, cjN), where cjk is the value of the kth variable in Sj, and Sj is unique in the state set.

Definition 3: Event (E)
An event occurs and makes contiguous states Si and Sj different. That is, Ek: Si → Sj. Note that Ek is composed of a subset of the state variables.

Definition 4: Action (A)
An action is performed in reaction to a given event. If the state transitions from Si to Sj naturally, the action is represented by the empty set {φ}. An action accompanying the occurrence of an event can be represented formally as EkAh: Si → Sj.

Definition 5: Path graph (Gp)
A process record contains the process name, recorded time, state variables, events, and actions. That is, a process record can be represented in the format {Pi, T, S, (Event ∩ Action)}. Once we have collected process records, we use them to create a process path with the temporal-state transitions. A path graph is denoted as Gp = (V, E, μp), where V is a finite nonempty set of vertices, E ⊂ V × V is a set of directed edges, and μp(V) = 〈T, S, (Event ∩ Action)〉 is a function assigning attributes and marking labels to each vertex. In 〈T, S, (Event ∩ Action)〉, T is the start time of the vertex, S is the name of the vertex, and (Event ∩ Action) is the value of the vertex.

Definition 6: Bayesian network (B)
A Bayesian network (B) consists of many path graphs. Therefore, B has the same components as a path graph. When a path graph is added to B, the conditional probability at each vertex of B is recalculated. When all path graphs have been added to B, the overall probabilities of the vertices are encoded in B. A Bayesian network can be represented as ∪Gpi = (V, E, μB), where V is a finite nonempty set of nodes, E ⊂ V × V is a set of directed edges, and μB(V) = 〈T, S, ∪(Event ∩ Action), p〉 is a function assigning attributes and marking labels to each vertex. T is the start time of the vertex, S is the name of the vertex, (Event ∩ Action) is the value of the vertex, and p is the conditional probability of each value.

3.2 The process of constructing Bayesian networks for representing temporal-state transitions

We propose a process of constructing Bayesian networks with temporal-state transitions which consists of five steps: data discretization, data merging, defining events and actions, defining states, and constructing Bayesian networks with temporal-state transitions.

Step 1: Discretizing continuous variables
The condition variables in workflow processes may be continuous, which creates the possibility of an infinite number of states. It is therefore necessary to discretize the continuous variables into nominal variables according to certain splitting criteria, so that the important states can be enumerated. Experts' domain knowledge or clustering methods can be used to discretize continuous variables. For example, body temperature can be discretized into low (≤35℃), fair (35℃ to 36.5℃), and high (>36.5℃).

Step 2: Consolidating contiguous similar states
If the values of contiguous states are in the same range (except for the time variable), the two states have no significant differences. In this case, we consolidate the two states into one state representing a contiguous period of time. By concatenating contiguous similar states, we obtain a more concise state space for describing a workflow.

Step 3: Defining events and actions
When there are differences between two contiguous states, an event can be specified between the two states. An action is performed after a certain event occurs; alternatively, a state transition may occur naturally, that is, without an intervening action.

Step 4: Specifying distinct states
After executing Step 2, any two contiguous states are different. We can use the name-mapping method to assign the state identification Sid, where id denotes the unique number in the state table of the database. When a new record is inserted into the database, an attempt is made to match its state to an existing state id. If there is a match, we assign the matched identifier to the inserted state. Otherwise, we issue a distinct state identifier to the new record. By doing this, each state is distinct.

Step 5: Constructing Bayesian networks with temporal-state transitions
A Bayesian network is a graphical representation of conditional probability distributions for a set of variables, and represents the causal relationships between variables. Because workflow logs are sequential records, the logs themselves imply the causal relationships. Since time is irreversible, the Bayesian networks can be constructed directly along the time line. In our Bayesian network, a state is represented by a set of variables (except time, event, and action), which is different from traditional Bayesian network construction. A pair of an event and an action is defined as the state value. The occurrence of state variables is encoded with the conditional probabilities. Therefore, we can construct a Bayesian network on a coordinate system in which the y-axis is the state identification and the x-axis is the time line, as shown in Figure 1. Note that, in the state T5S4, the asterisk (*) in P(E7A1|E1A2,*) indicates that E1A2 and E8A9 are mutually exclusive and will not occur at the same time. When tracing a path, only one of the states T3S2 or T1S3 leads to state T5S4 at a time. That is, only one prior condition exists, denoted as P(E7A1|E1A2,*) or P(E7A1|*,E8A9).

Figure 1. An example of Bayesian networks (x-axis: Time, T0 to T10; y-axis: States, S1 to S10, with the start/end vertex SSE carrying the value EsAs). Conditional probabilities labeled in the figure: P(E1A2|EsAs)=0.7, P(E2A1|EsAs)=0.3, P(E8A9|E2A1)=1, P(E8A6|E8A9)=0.8, P(E2A5|E8A6)=1.0, P(E7A1|E1A2,*)=0.3, P(E4A3|E1A2,*)=0.7, P(E7A1|*,E8A9)=0.04, P(E4A3|*,E8A9)=0.16, P(E2A8|E7A1,*)=0.3, P(E7A2|E7A1,*)=0.7, P(E2A8|E4A3,*)=0.5, P(E7A2|E4A3,*)=0.5, P(E4A5|E2A8,*)=0.2, P(E1A2|E2A8,*)=0.8, P(E4A5|E7A2,*)=0.7, P(E1A2|E7A2,*)=0.3, P(E4A5|*,E2A5)=0.7, P(E1A2|*,E2A5)=0.3.

After we construct the Bayesian network, we can calculate the probability distributions in the network. For example, in Figure 1, a process reaches the vertex at T3S2 with value E8A9. There are two paths including ten conditions starting from T3S2. One path is {T3S2 → T5S4 → T9S3 → T11SSE}, and the other is {T3S2 → T7S1 → T9S2 → T11SSE}. There are ten different combinations of events and actions in these two paths.

A Bayesian network can be updated after inserting new process data. Each new process is specified by discretizing continuous variables, consolidating contiguous similar states, and defining events and actions as described in Steps 1 to 3. By keeping the count of each vertex, we can easily obtain the new probability distribution of each vertex after inserting new data.

4. The application of Bayesian networks to discovering the clinical pathway patterns of Hemodialysis

Bayesian methods are not new to medical care. In fact, Bayes' theorem has been used successfully in medical expert systems for about thirty years [10]. However, this research is an initial effort at using Bayesian networks for presenting temporal-state transitions in Hemodialysis and demonstrating reciprocal knowledge management for better quality of dialysis services. In this section, we demonstrate how to apply the Bayesian network to representing temporal-state transitions in Hemodialysis. First, we briefly show how to apply a Bayesian network framework to represent the clinical pathway of Hemodialysis. Second, we specify the stages of constructing the temporal-state transitions in Bayesian networks as the clinical pathway of Hemodialysis. Finally, various scenarios are used to illustrate the benefits of the research results.

4.1 Representing clinical pathways of Hemodialysis

The effort of using Bayesian networks to represent the clinical pathway of Hemodialysis is spent on data collection, data cleaning, Bayesian network construction, clinical path prediction, and Bayesian network propagation. Figure 2 illustrates the model of applying Bayesian networks to generalize clinical pathways of Hemodialysis. The involved tasks are elaborated as follows.

Figure 2. The model of applying Bayesian networks to generalize Hemodialysis clinical pathways (inputs: Hemodialysis log data and patient data; output: Hemodialysis clinical pathway patterns in Bayesian networks; tasks 1 to 6 as listed below).

1. Data collection
The data used for this research were collected from the Division of Nephrology at Kaohsiung Municipal Women and Children General Hospital during August and September 1999. The data consist of three categories: (1) the patient profile, including patient identification, gender, birthday, and age, (2) pre-dialysis data, such as start time, systolic and diastolic pressure, weight, weight difference between dialysis sessions, pulse, dialysis machine model, dialyzer, and dialysis fluid, and (3) Hemodialysis log data, composed of log time, systolic and diastolic pressures, pulse, blood velocity, dialysis fluid velocity, excess weight filtered, physiological saline, the patient's chronic conditions, and treatment.

2. Pre-processing
Data accuracy is very important for generating accurate clinical patterns. We consulted doctors and nurses to determine the important variables and their domains. We clean new data by the following steps: (1) filling in missing values from the contiguous data over the extended periods, (2) deleting records with too many missing values, and (3) censoring data that exceed the domain.

3. Constructing the Hemodialysis clinical pathway
The construction of Hemodialysis clinical pathway patterns follows these steps: discretizing continuous variables, consolidating contiguous similar records, defining events and actions, specifying distinct states, and then constructing Bayesian networks with temporal-state transitions.

4. Evaluating the pattern of clinical pathways by professionals
To ensure the resulting patterns are meaningful for professionals in Hemodialysis, a team of dialysis professionals including doctors and nurses should evaluate these patterns to verify their quality.

5. Clinical pathway prediction and suggestion
The clinical path patterns can be used to predict possible outcomes for a new patient admitted to Hemodialysis. By estimating the possible result, a clinical professional can take actions in reaction to certain events during the dialysis to induce desirable outcomes.

6. The propagation of Bayesian networks
The clinical path patterns can be maintained by propagating the existing patterns with the new patients' data as described in Section 3.

4.2 Constructing the Hemodialysis clinical pathways

First, we obtain an initial state Ss, which is a common start vertex. Second, we record the state occurrences in each vertex. Therefore, we can calculate the probabilities in each vertex and easily propagate the Bayesian network. Finally, we construct the Bayesian network.

5. The Empirical Results and Discussion

The dialysis data collected from the Division of Nephrology at Kaohsiung Municipal Women and Children General Hospital during August and September 1999 include 140 patients and 2213 records in total. We use the Java language to develop the system that visualizes networks and calculates probabilities. Users can operate this system through Web browsers. Applying Bayesian networks to represent clinical pathways of Hemodialysis as proposed in Section 4, we generalize the temporal-state transitions in Hemodialysis and use the results to predict the clinical pathways for individuals and groups respectively. The dialysis process represented by Bayesian networks can be used for on-site suggestion and clinical knowledge management, as elaborated in Subsection 5.3.

5.1 The generalization of temporal-state transition patterns in Hemodialysis

From clinical data, we build Bayesian networks to represent the clinical pathway of Hemodialysis. The construction of the Hemodialysis clinical pathways proceeds with the following steps.

Step 1: Discretizing continuous variables
Most of the dialysis data variables are continuous. The discretization of these continuous values is necessary to obtain distinguishable states. The splitting points and results are shown in Table 5 and 6 respectively.

Step 2: Consolidating contiguous similar states
After discretizing continuous variables, contiguous similar data are assigned the same discretized values and are then consolidated. By consolidating contiguous similar records, we obtain concise records to describe the Hemodialysis clinical pathway.

Step 3: Defining events and actions
By consolidating contiguous similar records, we ensure that every two contiguous records are different. That is, there exist an event and an action between any two contiguous records.

Step 4: Specifying distinct states
We specify distinct states, and then transform the dialysis data with these distinct states.

Step 5: Constructing Bayesian networks with temporal-state transitions
Once we have transformed the dialysis logs into the four-tuple {T, S, (E∩A)}, we can start to construct Bayesian networks.
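The five steps above can be sketched in a few lines of code. The following is only an illustrative sketch in Python (the authors' system was written in Java and its code is not shown in the paper). All names here (`discretize`, `consolidate`, `TransitionNetwork`, `START`), the split points for systolic pressure, and the sample records are hypothetical; an event is simplified to the indices of the state variables that changed, and the conditional probability at a vertex is estimated from raw counts, as suggested by the remark that keeping a count at each vertex is enough to update the network.

```python
from collections import defaultdict

START = ("Es", "As")  # value of the common start vertex, named after Figure 1


def discretize(value, cuts, labels):
    """Step 1: map a continuous value to a nominal label (cuts are upper bounds)."""
    for cut, label in zip(cuts, labels):
        if value <= cut:
            return label
    return labels[-1]


def consolidate(records):
    """Step 2: keep only the first record of each run of identical states."""
    merged = []
    for time, state, action in records:
        if not merged or merged[-1][1] != state:
            merged.append((time, state, action))
    return merged


class TransitionNetwork:
    def __init__(self):
        self.state_ids = {}  # Step 4: state tuple -> state identifier
        # counts[vertex][parent_value][value] = number of observed paths
        self.counts = defaultdict(lambda: defaultdict(lambda: defaultdict(int)))

    def state_id(self, state):
        # Reuse the identifier of a known state, otherwise issue a new one.
        return self.state_ids.setdefault(state, len(self.state_ids) + 1)

    def add_path(self, records):
        """Steps 3 and 5: derive (event, action) values and update vertex counts."""
        parent_value, prev_state = START, None
        for time, state, action in consolidate(records):
            if prev_state is None:
                event = "init"
            else:
                # Simplified event: indices of the state variables that changed.
                event = tuple(
                    i for i, (a, b) in enumerate(zip(prev_state, state)) if a != b
                )
            value = (event, action)
            vertex = (time, self.state_id(state))
            self.counts[vertex][parent_value][value] += 1
            parent_value, prev_state = value, state

    def probability(self, vertex, value, parent_value):
        """Conditional probability of a value at a vertex, estimated from counts."""
        seen = self.counts[vertex][parent_value]
        total = sum(seen.values())
        return seen[value] / total if total else 0.0


# Hypothetical split points for systolic pressure (not the paper's tables).
CUTS, LABELS = [100, 140], ["low", "fair", "high"]


def to_records(raw):
    # raw rows: (minute, systolic pressure, action accompanying this reading)
    return [(t, (discretize(s, CUTS, LABELS),), act) for t, s, act in raw]


net = TransitionNetwork()
net.add_path(to_records([(0, 150, None), (5, 152, None), (10, 95, "saline")]))
net.add_path(to_records([(0, 145, None), (10, 90, "slow_pump")]))
net.add_path(to_records([(0, 155, None), (10, 98, "saline")]))

# Share of paths reaching vertex (T=10, state 2) via the action "saline".
p = net.probability((10, 2), ((0,), "saline"), ("init", None))
print(round(p, 3))
```

Because only counts are stored, adding a new patient's log with `add_path` updates every affected conditional probability without rebuilding the network.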
After constructing Bayesian networks for temporal-state transitions, we can display the clinical pathway patterns for individuals or a group of patients across an extended period of time. Figure 3 and 4 are two example Bayesian networks showing individual and group clinical pathways respectively. In these figures, the y-axis denotes the systolic pressure, and the x-axis denotes the time sequence and each scale represents five minutes. Nodes in the figure represent states occurring in the dialysis. Links from one state to the other state represent an action triggered by an event in the source state. The probability can be added onto links to show the percentage of cases from the occurrence of events at the source state to the destination state. The right side bar represents the scale of quality for the dialysis process. The lighter the bar the better dialysis quality while the darker the worse quality. Table 1 specifies the attributes for specifying the dialysis quality. The situations attributing to the worst quality are those life crucial exceptions. The resulting clinical pathways for individuals can be used for tracing the individual dialysis history to achieve better individual care. The group clinical pathways release the common sequence of individuals within the group, and specify variations for medical professionals to analyze their causes and possible treatments in the future. ~5~ 0-7695-1435-9/02 $17.00 (c) 2002 IEEE Proceedings of the 35th Annual Hawaii International Conference on System Sciences (HICSS-35’02) 0-7695-1435-9/02 $17.00 © 2002 IEEE 5 Proceedings of the 35th Hawaii International Conference on System Sciences - 2002 predicting the clinical pathway for the coming patients, we perform the following experiments to evaluate coverage of the generalized patterns. First, we evaluate coverage at the individual level by using an individual dialysis pattern across these two months to predict her/his latest dialysis pathway. 
The result shows that the dialysis parameters are tuned according to the situation when the patient arrives; thus, an individual is treated with large variation each time. Therefore, coverage is very low. Second, we evaluate coverage at the group level by using group dialysis patterns to predict the individual clinical pathways. We clustered the clinical pathways into three groups, Group A, B, and C, according to the demographical clustering function in Intelligent MinerTM based on the attributes of weight before dialysis, age, systolic pressure, weight difference after dialysis, diastolic pressure, and pulse. The resulting three groups are shown in Figure 5, where clinical pathways assigned to Group A, B, and C are 38%, 32%, and 25% respectively. We select 10% of clinical pathways from each group as the testing set to evaluate coverage for each group patterns. Coverage for a pathway is defined as the percentage of the number of states in the pathway sharing the same links with pathways in a group. Coverage of a testing set is the sum of the percentages obtained from each pathway in the testing set. Formally, we define coverage as Figure 3. A patient’s Bayesian Network h n Coverage= 1 + 1 h n 2 + ... + 2 h n N N , where N is the total N number of testing pathways, ni is the number of links of the ith pathway, and hi is the number of links matching the ith pathway. Figure 4. A group patients’ Bayesian Network Table 1. The variables for dialysis quality Quality variables Quality indicator Coma, sleeping, convulsions, palpitation, Life crucial dyspnea, unconscious Systolic and diastolic pressures increasing Medium or decreasing 20%, suffocation. exceptions Headache, vomiting, stomach ache, Minor abdominal_dist, diarrhea, yawning, uncomfortable spasm, chillness, tumid, fever, blood_nf, tub_hemorrhage, tub_inflammation, tub_oppilation, vein_loss, vein_clot, skin_itch, tinnithus, cold_sweat, Malaise, groan, restless, fatigue. Figure 5. 
The clustering results using Intelligent Miner™

5.2 The prediction of temporal-state transitions in Hemodialysis

In order to further analyze the generalized patterns for

The results are shown in Figure 6, where the x-axis denotes the time window size and the y-axis denotes coverage. The time window is the allowance of time, forward or backward, in matching states. A time window size of one is set to 5 minutes in the experiment. The results show a trend of increasing coverage as the time window size increases in all three groups. This implies that the occurrence of states, events, and actions cannot be accurately specified within a narrow time frame in the Hemodialysis process. Relaxing the time frame improves coverage, although coverage is not as high as we expected. Future research can seek substantial improvement in several directions, such as data quality and pattern generalization.

Figure 6. Coverage with different time window size (x-axis: time window size, 0 to 12; y-axis: coverage, 0 to 0.35; series: Group A, Group B, Group C)

5.3 Discussion

In Subsection 5.1, we demonstrated the use of Bayesian networks for generalizing the temporal-state transitions in Hemodialysis. Through the visualization interface, medical professionals can obtain the structure of the clinical pathways of individuals and groups. This gives them a basis for evaluating the quality of treatment and facilitates variation analysis. In Subsection 5.2, we tested the predictability of the clinical pathway patterns of Hemodialysis at the individual and group levels. The results show that the clinical pathway patterns of a group achieve better coverage in predicting the clinical pathway of an individual patient from that similar group as the window size increases. However, coverage is not as high as we anticipated. The low coverage may result from data quantity and quality. We collected data over only two months, which may not cover most of the cases needed for generalizing the pathway patterns, and the data cleaning efforts cannot remove all noisy data. Therefore, by improving data quality and increasing data quantity, future work can further verify that group patterns can be used for predicting the possible pathways of individuals who share the group characteristics.

In summary, we have begun formulating the Hemodialysis process as temporal-state transitions using Bayesian networks, and obtained better coverage at the group level. We can extend the analysis to perform on-site suggestion along the dialysis process, giving nurses real-time suggestions on actions to take in response to certain events. The clinical pathway pattern base can serve as the knowledge base, and through the cycle of variation analysis, execution, and evaluation, the medical professionals can improve their clinical decisions, which in turn enhances the quality of the knowledge base. This can be viewed as an organizational knowledge management effort to acquire, store, and evaluate knowledge of dialysis, and in turn improve organizational competitiveness. In the next two subsections, we further illustrate these extensions.

5.3.1 On-site suggestions. We illustrate the on-site suggestion function with the following scenario. When a patient is admitted into the Hemodialysis process, he or she is assigned to a dialysis machine. Nurses assign initial settings, such as the blood velocity, dialysis fluid velocity, and saline flow, according to the clinical pathway patterns exhibited by the Bayesian network system. By recording the events that occur and the actions taken, the Bayesian network traces the progress of the dialysis. If an event occurs and the nurses have no previous experience or lack confidence in how to handle it, the Bayesian network system can calculate the percentage of cases in which certain actions were taken, and predict the resulting quality after taking each action. This on-site tracing and suggestion capability enables nurses to incorporate experience from clinical pathways contributed by other colleagues, which broadens their decision alternatives and in turn may enhance their learning performance. Ultimately, it benefits the patients.

Figure 7. The reciprocal learning cycle of the Hemodialysis (diagram elements: Evaluation, Pattern generation, Pattern base, On-site suggestion, Data base, Data log, Execution)

5.3.2 Facilitating reciprocal learning cycle and organizational knowledge management. The additional effort of developing Intranet infrastructure and database applications can be justified by the benefits of the reciprocal learning cycle and organizational knowledge management. Figure 7 illustrates the reciprocal learning cycle for improving Hemodialysis quality within the Nephrology department. The clinical pathway patterns can be generalized from the log data of daily dialysis operations. Periodically, medical professionals evaluate dialysis variation under various situations to obtain a synthesized view of the performance of professionals in the department. At this point, the pattern base acts as the organizational memory, which can be used by professionals for on-site suggestion to new patients.
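The on-site suggestion step of this cycle can be illustrated with a small sketch. It is not the Bayesian network system described in this paper: as a simplifying assumption, it replaces network inference with plain empirical frequencies over a hypothetical log of (event, action, resulting state) records. For a given event, it reports the percentage of cases in which each action was taken and the distribution of resulting states. The record layout, the function name `suggest_actions`, and the labels `E1`, `E2`, `A1`, `A2`, `good`, and `poor` are all illustrative assumptions.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Tuple

# Hypothetical record layout (an assumption, not the paper's schema):
# (event observed, action taken by the nurse, resulting state)
Record = Tuple[str, str, str]


def suggest_actions(
    log: List[Record], event: str
) -> Dict[str, Tuple[float, Dict[str, float]]]:
    """For one event, return {action: (share of cases, {resulting state: share})}.

    Shares are plain empirical frequencies from the log; this stands in for
    inference over a learned Bayesian network.
    """
    action_counts: Counter = Counter()
    outcome_counts: Dict[str, Counter] = defaultdict(Counter)
    for logged_event, action, outcome in log:
        if logged_event == event:
            action_counts[action] += 1
            outcome_counts[action][outcome] += 1

    total = sum(action_counts.values())
    # An event never seen in the log yields an empty suggestion set.
    return {
        action: (
            count / total,
            {o: n / count for o, n in outcome_counts[action].items()},
        )
        for action, count in action_counts.items()
    }


# Illustrative log with placeholder labels (not data from the study).
log: List[Record] = [
    ("E1", "A1", "good"),
    ("E1", "A1", "good"),
    ("E1", "A1", "poor"),
    ("E1", "A2", "good"),
    ("E2", "A2", "good"),
]

for action, (share, outcomes) in suggest_actions(log, "E1").items():
    print(f"{action}: taken in {share:.0%} of cases, outcomes {outcomes}")
```

Each newly logged record changes these frequencies, which mirrors, in a much simplified form, how added log data can refine the suggestions drawn from the pattern base.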
The log data generated during the dialysis process are added to the database, where they can be used for refining the pattern base. In turn, the patterns suggested by the system become more effective in handling the progress of dialysis. Therefore, we view the reciprocal learning cycle as the core of organizational knowledge management.

6. Conclusions and Future Research

In this research, we have pursued the following endeavors: (1) representing temporal-state transitions using Bayesian networks, (2) viewing the Hemodialysis process as temporal-state transitions, and using Bayesian networks for generalizing the clinical pathways of the dialysis, and (3) evaluating coverage of the clinical pathway patterns using real-world data collected from a hospital. The major achievements of these endeavors are summarized as follows:

(1) Developing systematic methods to convert data into temporal-state transitions. The dialysis parameters, such as body temperature, pulse, systolic pressure, and diastolic pressure, are discretized and treated as attributes that represent states. Changes of machine settings are treated as actions taken by nurses when certain events occur.
(2) Displaying Bayesian networks using visualization technology to facilitate on-site suggestion by tracing the progress of current events.
(3) One unique contribution of using Bayesian networks for temporal-state transitions is that we suggest a possible set of states for the Bayesian networks from the data. This differs from the traditional approach, in which users specify the states.
(4) Exhibiting coverage of the generalized patterns at the group level by testing the patterns with unseen pathways. The results encourage dialysis professionals to select clinical pathways according to the pathways of patients with similar characteristics.

In future research, the following directions can be followed. First, the field data collection system can be enhanced, building on the batch data entry system used during this research, in order to obtain high-quality raw data at low cost. Second, clinical professionals can proactively evaluate the clinical pathway patterns and analyze variations among different settings and patients, so that the data collected from the revised medical treatments can refine the pattern base. Third, the reciprocal learning cycle can be better measured to bring insights into organizational knowledge management.

References

[1] W. Buntine, “A guide to the literature on learning probabilistic networks from data,” IEEE Transactions on Knowledge and Data Engineering, Vol. 8, No. 2, pp. 195-210, 1996.
[2] D. Heckerman and R. Shachter, “A definition and graphical representation for causality,” Proceedings of the Eleventh Conference on Uncertainty in Artificial Intelligence, Montreal, 1995.
[3] F.-r. Lin, S.-c. Chou, S.-m. Pan, Y.-m. Chen, “Mining time dependency patterns in clinical pathways,” Proceedings of the 33rd Hawaii International Conference on System Sciences, 2000.
[4] National Kidney Foundation, NKF-DOQI, “Clinical Practical Guidelines for Hemodialysis Adequacy,” American Journal of Kidney Diseases, Vol. 30, No. 3, Suppl. 2, pp. 17-66, September 1997.
[5] J.R. Quinlan, “Induction of decision trees,” Machine Learning, Vol. 1, No. 1, pp. 81-106, 1986.
[6] M.J.A. Berry and G. Linoff, “Artificial neural networks,” in Data Mining Techniques: for Marketing, Sales, and Customer Support (Chapter 13), John Wiley & Sons, Inc., 1997.
[7] D. Heckerman, “Bayesian networks for knowledge representation and learning,” Advances in Knowledge Discovery and Data Mining, MIT Press, 1995.
[8] D. Heckerman, “Bayesian networks for data mining,” Data Mining and Knowledge Discovery, Vol. 18, No. 6, pp. 79-119, 1997.
[9] E. Charniak, “Bayesian networks without tears,” AI Magazine, Vol. 12, No. 4, pp. 50-63, 1991.
[10] F.T. De Dombal, et al., “Computer-aided diagnosis of acute abdominal pain,” British Medical Journal, Vol. ii, pp. 9-13, 1972.

0-7695-1435-9/02 $17.00 © 2002 IEEE