Pre-Triage Decision Support
Improvement in Maternity Care by
means of Data Mining
Eliana Pereira (a), Andreia Brandão (a), Maria Salazar (c), Carlos Filipe Portela (b), Manuel Filipe Santos (b), José Machado (a), António Abelha (a), Jorge Braga (d)
(a) Computer Science and Technology Center (CCTC). University of Minho. Braga. Portugal;
(b) Algoritmi Research Center. University of Minho. Guimarães. Portugal;
(c) Serviços de Sistemas de Informação, Centro Hospitalar do Porto, Porto, Portugal
(d) Centro Materno Infantil do Norte, Centro Hospitalar do Porto, Porto, Portugal
ABSTRACT
Conventional triage systems such as the Manchester Triage System (MTS) are not suitable for maternity
care. For this reason, a decision model for pre-triaging patients into emergency (URG) and consultation
(ARGO) classes was built and incorporated into a Decision Support System (DSS) implemented in Centro
Materno Infantil do Norte (CMIN). In addition, the DSS produces several indicators to support clinical and
management decisions. A recent data analysis revealed a bias in the classification of URG cases:
frequently, cases classified as URG correspond to ARGO. This misclassification has been studied by
means of Data Mining (DM) techniques in order to improve the pre-triage model and to discover
knowledge for developing a new triage system based on waiting times and on a five-level scale of classes.
This chapter presents a kind of sensitivity analysis combining input variables in six scenarios and
considering four different DM techniques. The CRISP-DM methodology was used to conduct the project.
INTRODUCTION
Nowadays, we live in an age where information, knowledge and globalization are important issues.
Organizations should be able to respond to new challenges and new demands in an environment of
constant change. In the healthcare sector, information technologies provide complete and reliable
information for healthcare professionals and support their clinical and administrative decisions
(Khodambashi, 2013). One example is the triage system in the hospital emergency unit.
In a hospital setting, various types of triage systems are used. The most commonly used consider five
levels of severity: the Emergency Severity Index (ESI), the Manchester Triage System (MTS) and the
Canadian Triage and Acuity Scale (CTAS). The main limitation of this type of scale is the lack of
flexibility, since these systems are usually designed for general emergency units and are not specific to
specialized emergency units (Portela et al., 2013).
In Centro Hospitalar do Porto (CHP), in particular in the women's emergency care of Gynaecology and/or
Obstetrics (GO), it was found that the MTS implemented in the emergency service is not the most
accurate for specific triage cases such as obstetrics and gynaecology. This happens because most of the
questions used for triage identified cases as urgent when in fact they were not (false positives). Such an
approach increases the number of patients in the emergency service and, consequently, the waiting times
of patients who actually need priority attention.
Due to the limitations identified, a new system has been developed in order to reduce the false
positive rate. An Intelligent Decision Support System (IDSS) has been implemented for pre-triaging
patients into two different classes: Urgent (URG), when the patient should be treated at the
emergency service in CMIN; and Less Urgent (ARGO), when the patient is referred for a consultation
in CMIN. The IDSS is intended to run in real time and to include business intelligence
components (e.g. indicators of voluntary interruption of pregnancy, triage indicators) and Data Mining.
This system has been in use since 2010 and, over its four years of existence, the number of GO patients in
the emergency department of the CHP has decreased significantly. However, this only solves part of the
problems inherent to an emergency department, because the system only routes the patient efficiently and
does not perform a priority triage according to patient symptoms. Further work is needed in order to
understand and improve the pre-triage rules.
Data Mining (DM) techniques have been used to determine the level of accuracy of the implemented
pre-triage system, identifying opportunities to improve the quality of patient care. To this end,
classification models were induced to predict whether the assignment of URG and ARGO occurs according
to the questionnaire for evaluating clinical characteristics of the patient.
The best result was obtained in the scenario without ARGO and without URG (accuracy close to 100%).
The results demonstrate the reliability of the pre-triage system. However, other scenarios (accuracy of
around 80% in the worst case) demonstrate the need to transform the pre-triage system into a 5-level
priority system to allow better categorization of patients.
Beyond this introductory section, the document includes five more sections. The first covers the
background and related work: it describes the context in which the problem occurs, the process of
Knowledge Discovery from Databases and the CRISP-DM method. The second section presents a DM case
study following the CRISP-DM methodology. The results are discussed in the third section. The fourth
section presents some conclusions and, finally, section five points out some future directions.
BACKGROUND AND RELATED WORK
Context
Centro Materno Infantil do Norte (CMIN) is a unit of the Centro Hospitalar do Porto (formerly Maternidade
Júlio Dinis). Women needing urgent gynaecological or obstetric health care are submitted to a pre-triage
system developed in CHP. This system aims to anticipate the routing of the patient, taking into
account her clinical characteristics, and therefore helps select the best decision for each situation.
Knowledge discovery and data mining techniques were used to develop this system. This Intelligent
Decision Support System (IDSS) uses different available data, collected through a specific triage
questionnaire. For this approach, the knowledge was obtained directly from the empirical and scientific
expertise of health professionals to make the first version of the decision models. Later, adjustments were
made to optimize the process.
This system has been running since January 2010. During the last four years, 66,730 patients were
admitted to CMIN: 18,773 in 2010, 18,348 in 2011, 12,445 in 2012 and 17,164 in 2013. As mentioned
before, the routing system implemented in CMIN distinguishes the urgent cases (URG) from the less
urgent cases (ARGO); however, healthcare professionals are in charge of the final decision. Healthcare
professionals can force a URG or ARGO classification if they do not agree with the result returned by the
pre-triage system. Only a pre-triage is performed; no priorities are assigned.
Triage systems and pre-triage system in CMIN
Most emergency services in North America and Europe use triage tools to ensure that patients who need
intensive care receive priority treatment and to determine which patients require minor care and can wait,
giving priority to the patients in the worst clinical condition (Murray, Bullard, & Grafstein, 2004;
Smithson et al., 2013). In a hospital environment various types of triage systems are used. The most
commonly used are those with five levels of severity: the Emergency Severity Index (ESI), the Manchester
Triage System (MTS) and the Canadian Triage and Acuity Scale (CTAS). The main limitation of this type
of scale is the lack of flexibility, since, typically, these systems are used only in emergency units (Murray,
Bullard, & Grafstein, 2004) and are not prepared for the specificities of other units, such as Maternity
Care. Patients who go to the specialties of obstetrics and gynaecology are a particular case, since they
present very specific symptoms that are characteristic of these fields.
An example of a priority triage system specific for gynaecology and obstetrics is the Obstetric Triage
Acuity Scale (OTAS). OTAS was developed based on the Canadian Triage and Acuity Scale (CTAS), a tool
that was introduced in 1999 and was reviewed in 2006 and 2008 (Murray et al., 2004). This system is very
limited because it only covers pregnant women, and CMIN also serves patients who are not pregnant
(Smithson et al., 2013). So in CMIN a pre-triage system was built by using conventional knowledge
acquisition and representation techniques in order to classify those patients into two levels of
importance: URG for urgent cases and ARGO for less urgent cases. This pre-triage system consists of
a set of six flowcharts supported by a detailed questionnaire that attempts to cover the
entire range of patients admitted to CMIN: pregnant women, postpartum women, non-postpartum
women, women who may be pregnant, patients for VIP, and patients for CTG.
Agency for Integration, Archive and Diffusion
Interoperability among information systems in CHP is guaranteed by the Agency for Integration, Archive
and Diffusion of Medical Information (AIDA), which is based on the use of intelligent agents to enable
communication among different systems. This multi-agent system allows for the standardization of
clinical systems and overcomes medical and administrative complexity inherent to different sources of
information. All medical information systems are supported by AIDA, including Electronic Health
Record (EHR) and triage routing system implemented in CMIN (Peixoto, Santos, Abelha, & Machado,
2012; Abelha, Analide, Machado, Neves, & Novais, 2007).
SWOT analysis
SWOT analysis is an approach based on Strengths, Weaknesses, Opportunities and Threats. It is an
important tool to support decision making and is usually used to systematically analyze strategic
situations and to assess the position of organizations with respect to their external and internal
environments. With this tool, strategies can be developed that build on the strengths, eliminate the
weaknesses, take advantage of the opportunities and counter the threats. The strengths and
weaknesses are identified by an internal review of the organization, while opportunities and threats are the
result of an external review. This analysis helps organizations, projects or even individuals to think
systematically and to diagnose the relevant factors comprehensively. Thus, the positive and negative
factors can be identified and a strategy can subsequently be developed that results in a good fit of these
factors (Salar & Salar, 2014; Shariatmadari, Sarfaraz, Hedayat, & Vadoudi, 2013; Pereira, Salazar,
Abelha, & Machado, 2013).
Knowledge Discovery and Data Mining
The Knowledge Discovery from Databases (KDD) process encompasses a set of ongoing activities that
share the knowledge discovered from databases. It consists of five stages (Fayyad, Piatetsky-Shapiro, &
Smyth, 1996; Krzysztof, Witold, Roman, & Lukasz, 2007):
• Selection: at this stage the data set needed for the DM is selected;
• Pre-processing: this stage comprises the cleaning and processing of the data in order to make them
consistent;
• Transformation: this phase consists of working out the data according to the target variable;
• Data Mining: at this step the objectives and the type of result to be achieved are defined.
According to the type of desired result, the type of task to be executed is defined (e.g.
classification, segmentation, summarization, dependency modelling) and the technique to be used
is identified (e.g. decision trees, association rules, linear regression, artificial neural networks).
Subsequently, the selected data mining techniques are applied to the data set to obtain patterns;
• Interpretation/Evaluation: involves the interpretation and evaluation of the patterns obtained.
The results obtained are validated by applying the patterns found to new datasets (Azevedo,
2011).
Until 1995, many authors considered KDD and DM to be equivalent terms. Nowadays, DM is regarded as a
phase of the Knowledge Discovery in Databases (KDD) process that consists in finding patterns or
relationships that may exist in the data stored in data repositories, while KDD refers to the whole process
of discovering useful knowledge.
Cross Industry Standard Process for Data Mining (CRISP-DM)
Cross Industry Standard Process for Data Mining (CRISP-DM) was the DM methodology adopted due
to its characteristics. CRISP-DM has a close relationship with the phases of the KDD process described
above. CRISP-DM divides the process of data mining into six major phases (Chapman, 2000;
Krzysztof et al., 2007; Machado, Abelha, Rua, & Centre, 2013). These phases are:
• Business Understanding: this phase focuses on understanding the goal of the project from a
business perspective and on defining a first plan to achieve it;
• Data Understanding: involves data collection and initial activities for familiarization with the
data, identifying problems or interesting subsets;
• Data Preparation: this stage includes all the tasks responsible for creating the cases that will
be used to build the model table. Data preparation tasks are expected to be executed multiple
times. These tasks comprise building the table of cases, selection of attributes, data cleaning
and transformation. In addition, new attributes derived from existing ones can be added. The
data preparation phase can significantly improve the information that can be discovered through
DM;
• Modelling: at this stage modelling techniques are applied (e.g. decision trees, association rules,
linear regression, artificial neural networks) and their parameters are calibrated for optimization.
During this phase it is common to return to the data preparation stage;
• Evaluation: at this point a model has been built that seems to have great quality from a data
analysis perspective. However, it is necessary to check whether the model meets the business
objectives;
• Deployment: the knowledge obtained by the model is presented in a way that the customer can
use.
Model Evaluation
DM models built for predicting a particular process analyze the relationship between the variables used
and the contribution they make to the target. In the case of a two-class target, a confusion
matrix M can be used. Cell M(1,1) stands for the number of True Positive (TP) results, where the value
obtained corresponds to the expected value. Cell M(2,1) stands for False Positive (FP) results, in which
the resulting value incorrectly identifies the occurrence of the procedure (type I error). Cell M(2,2)
stands for True Negative (TN) results, in which the model correctly predicts the non-occurrence of the
procedure. Finally, cell M(1,2) stands for False Negative (FN) results, in which the model incorrectly
predicts the non-occurrence of the procedure (type II error) (Beguería, 2006; Ripley, 2002).
From these values, statistical metrics for assessing model quality can be derived, in particular:
• Sensitivity: the ability to correctly detect the occurrence of the process. It is the ratio of the
correct positive results (TP) to all the actual positive values (TP + FN);
• Specificity: the ability of a model to correctly identify the non-occurrence of a procedure. It is
measured by the ratio of the values correctly identified as negative (TN) to all the negative values
(TN + FP);
• Accuracy: the overall proportion of correct results. It is measured by the ratio of all the results
obtained correctly (TP + TN) to all the cases (TP + TN + FP + FN) (João, 2007).
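The three metrics above follow directly from the four cells of the confusion matrix. The sketch below is a minimal illustration, assuming URG is taken as the positive class; the class name `ConfusionMatrix` and the example counts are hypothetical and do not come from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionMatrix:
    """Counts for a two-class target (assumption: URG is the positive class)."""
    tp: int  # URG cases predicted as URG
    fn: int  # URG cases predicted as ARGO (type II error)
    fp: int  # ARGO cases predicted as URG (type I error)
    tn: int  # ARGO cases predicted as ARGO

    def sensitivity(self) -> float:
        # TP / (TP + FN): share of actual positives that were detected
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        # TN / (TN + FP): share of actual negatives that were detected
        return self.tn / (self.tn + self.fp)

    def accuracy(self) -> float:
        # (TP + TN) / all cases
        return (self.tp + self.tn) / (self.tp + self.tn + self.fp + self.fn)


# Hypothetical counts, chosen only to illustrate the arithmetic.
cm = ConfusionMatrix(tp=90, fn=10, fp=30, tn=70)
print(cm.sensitivity(), cm.specificity(), cm.accuracy())
```

The sketch does not guard against empty classes: a data set without any positive (or negative) case would make the corresponding ratio undefined.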
IMPROVEMENT OF PRE-TRIAGE DECISION USING DATA MINING
In order to validate the pre-triage system of CMIN, Data Mining techniques were used, in particular
classification techniques, to verify whether the decision model for URG and ARGO is well calibrated with
respect to the questionnaires answered by the patients. In other words, this set of experiments aims
to predict whether the definition of URG and ARGO can be standardized according to the specific
questionnaire for the evaluation of clinical characteristics of the patient. This procedure has been carried
out for all flowcharts covering the 6 types of patients in CMIN (pregnant, postpartum, not pregnant,
patients who may be pregnant, VIP patients and patients for CTG).
Business Understanding
As mentioned before, the result of the pre-triage system implemented in CMIN can be URG (patients are
routed to the urgency of CMIN) or ARGO (patients are routed for urgent consultations). As also
mentioned, 6 types of patients are attended in the CMIN emergency service. Each one of these classes
of patients is associated with a set of specific questions that determine the state of Urgency (URG) or
not Urgency (ARGO). In this context, the problem can be formulated as "How likely is the answer URG or
ARGO, taking into account the clinical characteristics of patients?". This can be translated into a Data
Mining problem as "How accurately is a patient distinguished as URG, taking into account a set of specific
clinical features?"
Data Understanding
The data were extracted from AIDA and analyzed in terms of the quality of the variables to be used
in the process. The sample covers the period between 06.01.2010 and 08.04.2014. In total, 78984 cases
were analysed, divided as follows:
• 35238 cases of pregnant women;
• 4050 cases of postpartum women;
• 24547 cases of non-postpartum, non-pregnant women;
• 4754 patients who may be pregnant;
• 2843 patients who use the CMIN for the process of Voluntary Interruption of Pregnancy (VIP);
• 2511 patients for Cardiotocography (CTG).
Attributes were extracted from the different forms used in the pre-triage system, as explained below.
Patient “Pregnant" consists of the following variables:
• Results of the triage (RoT): this is the target variable and dictates the outcome of the triage
process.
Possible results: {URG, ARGO};
• Symptoms: specific symptoms that can occur in pregnant women and be related to the
well-being of the fetus or the pregnant woman.
Possible results: {Headache (Hd), Visual Changes (VC), Tension Increase of reference (TIR),
epigastric pain/right hypochondrium (EP\RH), nausea/vomiting (N\V), changes in skin/mucous
color (CS\MC), breakthrough bleeding (BB), decreased fetal movement (DFM), loss of amniotic
fluid (LAF), Trauma in pregnancy (TP)};
• Another pathological reason (APR): if none of the symptoms mentioned in the previous point is
found, the pathological reason should be pointed out in this topic;
• General state (GS): in this parameter, the nurses assess the general condition of the patient. This
parameter is defined by a range of three possible outcomes.
Possible results: {good, bad, reasonable};
• Pain Scale (PS): a scale between 1 and 10, where 1 represents the total absence of pain and 10
represents the worst possible pain;
• Symptoms1: these variables represent symptoms of a more general nature.
Possible results: {Fever (Fv), Urinary Symptoms (US), Hemorrhage (Hm), Convulsions (Cv),
Syncope (Sc)}.
Patient "Postpartum" consists of the following variables:
• Symptoms2: specific symptoms that can occur in postpartum women and are
related to the well-being of the woman.
Possible results: {Breast swelling (BS), Foul lochia (FL), Remove Suture (RS), Fluid (blood or
other) passes through the dressing (FPTD)};
• The other variables used are already described in the type of patient "Pregnant": Results of the
triage (RoT), Pain Scale (PS) and Symptoms1.
Patient "Not Postpartum" and "Maybe pregnant" consist of the following variables:
• The variables used are the same as those already described in the type of patient "Pregnant":
Results of the triage (RoT), Pain Scale (PS) and Symptoms1.
Patient “For VIP” and “For CTG” consist of the following variables:
• The variables used are described in the type of patient "Pregnant": Results of the triage (RoT)
and Pain Scale (PS).
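To make the variable definitions above concrete, the following is a minimal sketch of how one record of the "Pregnant" flowchart could be represented and validated. The class `PregnantRecord`, its field names and the validation rules are illustrative assumptions built from the abbreviations listed above; they are not the schema used in AIDA or in the CMIN system.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional

# Abbreviations taken from the variable lists above.
PREGNANCY_SYMPTOMS = frozenset(
    {"Hd", "VC", "TIR", "EP\\RH", "N\\V", "CS\\MC", "BB", "DFM", "LAF", "TP"}
)
GENERAL_SYMPTOMS = frozenset({"Fv", "US", "Hm", "Cv", "Sc"})


@dataclass(frozen=True)
class PregnantRecord:
    """Hypothetical in-memory form of one 'Pregnant' pre-triage record."""
    rot: str                                # Results of the triage (target)
    ps: int                                 # Pain Scale, 1..10
    gs: str                                 # General state
    symptoms: FrozenSet[str] = frozenset()  # pregnancy-specific symptoms
    symptoms1: FrozenSet[str] = frozenset() # general symptoms
    apr: Optional[str] = None               # another pathological reason

    def __post_init__(self) -> None:
        if self.rot not in {"URG", "ARGO"}:
            raise ValueError("RoT must be URG or ARGO")
        if self.gs not in {"good", "bad", "reasonable"}:
            raise ValueError("GS must be good, bad or reasonable")
        if not 1 <= self.ps <= 10:
            raise ValueError("PS must be between 1 and 10")
        if not self.symptoms <= PREGNANCY_SYMPTOMS:
            raise ValueError("unknown pregnancy symptom")
        if not self.symptoms1 <= GENERAL_SYMPTOMS:
            raise ValueError("unknown general symptom")


# Illustrative record only; the values are made up.
record = PregnantRecord(rot="URG", ps=7, gs="bad", symptoms=frozenset({"BB"}))
```

The other flowcharts would use a subset of these fields (for example only RoT and PS for VIP and CTG patients), plus Symptoms2 for postpartum patients.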
Data Preparation
At this stage, some studies were performed in order to construct scenarios for achieving the desired
models. Four possible scenarios were considered:
• All data: all data present in the data repository were used to build the models, for
each type of patient;
• Without ARGO: all the data were used, except the records in which the target variable was not
filled with ARGO when ARGO would be expected, for each type of patient;
• Without URG: all the data were used, except the records in which the target variable was not
filled with URG when URG would be expected, for each type of patient;
• Without URG and ARGO: all the data were used, except the records in which the target variable
did not match the expected ARGO or URG result, for each type of patient.
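The four scenarios can be read as filters over the same table of cases. The sketch below is one possible interpretation, assuming each record carries both the recorded triage result and the result that the flowchart rules would be expected to produce; the keys `recorded` and `expected` and the function `build_scenarios` are hypothetical names introduced for illustration.

```python
def build_scenarios(records):
    """Return the four scenario data sets.

    Each record is a dict with two assumed keys:
      'recorded' - the triage result stored in the repository (URG or ARGO)
      'expected' - the result the flowchart rules would be expected to give
    """
    def argo_mismatch(r):
        # ARGO was expected, but the target was not filled with ARGO
        return r["expected"] == "ARGO" and r["recorded"] != "ARGO"

    def urg_mismatch(r):
        # URG was expected, but the target was not filled with URG
        return r["expected"] == "URG" and r["recorded"] != "URG"

    return {
        "All data": list(records),
        "Without ARGO": [r for r in records if not argo_mismatch(r)],
        "Without URG": [r for r in records if not urg_mismatch(r)],
        "Without URG and ARGO": [
            r for r in records if not argo_mismatch(r) and not urg_mismatch(r)
        ],
    }


# Made-up records covering the four possible combinations.
sample = [
    {"expected": "URG", "recorded": "URG"},
    {"expected": "ARGO", "recorded": "ARGO"},
    {"expected": "ARGO", "recorded": "URG"},
    {"expected": "URG", "recorded": "ARGO"},
]
scenarios = build_scenarios(sample)
```

Under this reading, the last scenario keeps only the records in which the recorded result agrees with the expected one.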
After a preliminary analysis of the data, it was found that they exhibited adequate quality. Thus, a
statistical analysis of the data was performed. Figure 1 shows the number of occurrences
of the target variable for each scenario and for each flowchart.
Figure 1- Distribution of values of the target result of triage (ARGO or URG) in the different models for
pregnant, postpartum, non-postpartum, patients who may be pregnant, VIP patients and patients
for CTG.
Modelling
To induce classification models, four DM classification techniques were used (Fayyad et al., 1996;
Rojão, 2011):
• Decision Tree (DT): automatically generates rules, which are conditional statements that show the
logic used to build the tree;
• Naive Bayes (NB): uses Bayes' Theorem, a formula that calculates a probability by
counting the frequency of values and combinations of values in the historical data;
• Generalized Linear Models (GLM): a popular statistical technique for linear modelling, used here
for binary classification;
• Support Vector Machine (SVM): a powerful DM technique based on linear and nonlinear
regression for binary and multiclass classification.
In order to use the GLM model to perform a binary classification, it was necessary to go back to the
previous step and transform the variable Result of the Triage into a binary variable, since this variable
contains the following values:
• URG: if patients are routed to the urgency of CMIN;
• ARGO: if patients are routed for urgent consultations.
Because URG and ARGO are mutually exclusive classes, “1” was assigned to URG cases and “0” was
assigned to ARGO cases. The developed models for each flowchart can be represented by:
M_n ≡ <A_f, V_i, TDM_y>
Model M_n belongs to the approach (A) and is composed of variables (V) and a DM technique (TDM),
where:
A_f ∈ {Classification}
TDM_y ∈ {SVM, NB, GLM, DT}
For each one of the flowcharts, different variables (V) can be used. For the flowchart “Pregnant”:
V_i ∈ {RoT, Hd, VC, TIR, EP\RH, N\V, CS\MC, BB, DFM, LAF, TP, APR, GS, PS, Fv, US, Hm,
Cv, Sc}
For the flowchart “Postpartum”:
V_i ∈ {BS, FL, RS, FPTD, RoT, PS, Fv, US, Hm, Cv, Sc}
For the flowcharts “Not Postpartum” and “Maybe Pregnant”:
V_i ∈ {RoT, PS, Fv, US, Hm, Cv, Sc}
Finally, for the flowcharts “For VIP” and “For CTG”:
V_i ∈ {RoT, PS}
Globally, 96 models were induced (4 Scenarios * 4 techniques * 6 flowcharts / type of patients * 1 target).
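The binary coding of the target and the count of induced models can be checked with a short sketch. The labels come from the text; the function `encode_rot` and the list names are hypothetical and only serve the illustration.

```python
from itertools import product

SCENARIOS = ["All data", "Without ARGO", "Without URG", "Without URG and ARGO"]
TECHNIQUES = ["SVM", "NB", "GLM", "DT"]
FLOWCHARTS = [
    "Pregnant", "Postpartum", "Not Postpartum",
    "Maybe Pregnant", "For VIP", "For CTG",
]
TARGETS = ["RoT"]


def encode_rot(value: str) -> int:
    """Binary coding of the target: URG -> 1, ARGO -> 0."""
    return {"URG": 1, "ARGO": 0}[value]


# One configuration per (scenario, technique, flowchart, target) combination.
models = list(product(SCENARIOS, TECHNIQUES, FLOWCHARTS, TARGETS))
print(len(models))  # 4 * 4 * 6 * 1 = 96
```

Each tuple in `models` identifies one induced model; the variables available to it depend on the flowchart, as listed above.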
Evaluation
To evaluate the results achieved by the DM models, the evaluation metrics described before were used.
The models used 60% of the data for training and 40% of the data for testing (holdout). For each model
and for each type of patient, the values of sensitivity, specificity and accuracy were calculated. They are
presented in Tables 1 to 6.
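A 60/40 holdout split such as the one described above can be sketched as follows. The source does not state how the cases were assigned to each partition, so the random shuffle, the fixed seed and the function name `holdout_split` are assumptions made for illustration.

```python
import random


def holdout_split(cases, train_fraction=0.6, seed=0):
    """Shuffle the cases and split them into training and testing sets.

    Assumption: a simple random split; the study may have used a
    different assignment (for example a stratified one).
    """
    shuffled = list(cases)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Ten placeholder case identifiers: 6 go to training, 4 to testing.
train, test = holdout_split(range(10))
print(len(train), len(test))
```

The metrics reported for each model are then computed on the testing partition only.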
Table 1- Models evaluation for pregnant patients.
Technique                 Scenario               Sensitivity  Specificity  Accuracy
Support Vector Machine    All Data               0.953        0.660        0.800
Support Vector Machine    Without URG            0.957        0.647        0.789
Support Vector Machine    Without ARGO           1.000        0.702        0.850
Support Vector Machine    Without URG and ARGO   1.000        1.000        1.000
Naïve Bayes               All Data               0.951        0.685        0.818
Naïve Bayes               Without URG            0.949        0.693        0.822
Naïve Bayes               Without ARGO           1.000        0.701        0.849
Naïve Bayes               Without URG and ARGO   1.000        1.000        1.000
Generalized Linear Model  All Data               0.951        0.685        0.818
Generalized Linear Model  Without URG            0.949        0.693        0.822
Generalized Linear Model  Without ARGO           1.000        0.702        0.850
Generalized Linear Model  Without URG and ARGO   1.000        1.000        1.000
Decision Tree             All Data               0.952        0.603        0.751
Decision Tree             Without URG            0.957        0.605        0.753
Decision Tree             Without ARGO           1.000        0.614        0.778
Decision Tree             Without URG and ARGO   1.000        0.838        0.918
Table 2- Models evaluation for postpartum patients.
Technique                 Scenario               Sensitivity  Specificity  Accuracy
Support Vector Machine    All Data               0.925        0.716        0.823
Support Vector Machine    Without URG            0.911        0.998        0.947
Support Vector Machine    Without ARGO           1.000        0.743        0.870
Support Vector Machine    Without URG and ARGO   1.000        1.000        1.000
Naïve Bayes               All Data               0.925        0.715        0.822
Naïve Bayes               Without URG            0.911        0.993        0.945
Naïve Bayes               Without ARGO           1.000        0.739        0.868
Naïve Bayes               Without URG and ARGO   1.000        0.993        0.997
Generalized Linear Model  All Data               0.925        0.716        0.823
Generalized Linear Model  Without URG            0.911        0.998        0.947
Generalized Linear Model  Without ARGO           1.000        0.742        0.870
Generalized Linear Model  Without URG and ARGO   1.000        0.998        0.999
Decision Tree             All Data               0.931        0.692        0.809
Decision Tree             Without URG            0.916        0.942        0.927
Decision Tree             Without ARGO           1.000        0.718        0.853
Decision Tree             Without URG and ARGO   1.000        0.945        0.976
Table 3- Models evaluation for non-postpartum patients.
Technique                 Scenario               Sensitivity  Specificity  Accuracy
Support Vector Machine    All Data               0.823        0.888        0.869
Support Vector Machine    Without URG            0.821        1.000        0.942
Support Vector Machine    Without ARGO           1.000        0.888        0.917
Support Vector Machine    Without URG and ARGO   1.000        1.000        1.000
Naïve Bayes               All Data               0.823        0.888        0.869
Naïve Bayes               Without URG            0.821        1.000        0.942
Naïve Bayes               Without ARGO           1.000        0.888        0.917
Naïve Bayes               Without URG and ARGO   1.000        1.000        1.000
Generalized Linear Model  All Data               0.823        0.888        0.869
Generalized Linear Model  Without URG            0.821        1.000        0.942
Generalized Linear Model  Without ARGO           1.000        0.888        0.917
Generalized Linear Model  Without URG and ARGO   1.000        1.000        1.000
Decision Tree             All Data               0.822        0.887        0.868
Decision Tree             Without URG            0.821        0.999        0.942
Decision Tree             Without ARGO           1.000        0.887        0.916
Decision Tree             Without URG and ARGO   1.000        0.999        0.999
Table 4- Models evaluation for maybe pregnant patients.
Technique                 Scenario               Sensitivity  Specificity  Accuracy
Support Vector Machine    All Data               0.247        0.000        0.247
Support Vector Machine    Without URG            0.789        1.000        0.961
Support Vector Machine    Without ARGO           0.246        0.000        0.246
Support Vector Machine    Without URG and ARGO   1.000        1.000        1.000
Naïve Bayes               All Data               0.795        0.861        0.850
Naïve Bayes               Without URG            0.789        0.997        0.959
Naïve Bayes               Without ARGO           1.000        0.861        0.879
Naïve Bayes               Without URG and ARGO   1.000        0.998        0.998
Generalized Linear Model  All Data               0.795        0.863        0.851
Generalized Linear Model  Without URG            0.789        1.000        0.961
Generalized Linear Model  Without ARGO           1.000        0.862        0.879
Generalized Linear Model  Without URG and ARGO   1.000        1.000        1.000
Decision Tree             All Data               0.795        0.861        0.850
Decision Tree             Without URG            0.789        0.997        0.959
Decision Tree             Without ARGO           1.000        0.861        0.879
Decision Tree             Without URG and ARGO   1.000        0.998        0.998
Table 5- Models evaluation for VIP patients.
Technique                 Scenario               Sensitivity  Specificity  Accuracy
Support Vector Machine    All Data               0.026        0.000        0.026
Support Vector Machine    Without URG            0.667        1.000        0.997
Support Vector Machine    Without ARGO           0.027        0.000        0.027
Support Vector Machine    Without URG and ARGO   1.000        1.000        1.000
Naïve Bayes               All Data               0.667        0.980        0.977
Naïve Bayes               Without URG            0.667        1.000        0.997
Naïve Bayes               Without ARGO           1.000        0.977        0.977
Naïve Bayes               Without URG and ARGO   1.000        1.000        1.000
Generalized Linear Model  All Data               0.667        0.980        0.977
Generalized Linear Model  Without URG            0.667        1.000        0.997
Generalized Linear Model  Without ARGO           1.000        0.977        0.977
Generalized Linear Model  Without URG and ARGO   1.000        1.000        1.000
Decision Tree             All Data               0.667        0.980        0.977
Decision Tree             Without URG            0.667        1.000        0.997
Decision Tree             Without ARGO           1.000        0.977        0.977
Decision Tree             Without URG and ARGO   0.000        0.995        0.995
Table 6- Models evaluation for CTG patients.
Technique                 Scenario               Sensitivity  Specificity  Accuracy
Support Vector Machine    All Data               0.199        0.000        0.199
Support Vector Machine    Without URG            0.941        1.000        0.998
Support Vector Machine    Without ARGO           0.200        0.000        0.200
Support Vector Machine    Without URG and ARGO   1.000        1.000        1.000
Naïve Bayes               All Data               1.000        0.823        0.827
Naïve Bayes               Without URG            0.941        1.000        0.998
Naïve Bayes               Without ARGO           1.000        0.815        0.818
Naïve Bayes               Without URG and ARGO   1.000        1.000        1.000
Generalized Linear Model  All Data               1.000        0.823        0.827
Generalized Linear Model  Without URG            0.941        1.000        0.998
Generalized Linear Model  Without ARGO           1.000        0.815        0.818
Generalized Linear Model  Without URG and ARGO   1.000        1.000        1.000
Decision Tree             All Data               1.000        0.823        0.827
Decision Tree             Without URG            0.941        1.000        0.998
Decision Tree             Without ARGO           1.000        0.815        0.818
Decision Tree             Without URG and ARGO   1.000        1.000        1.000
Deployment
The models obtained will be used to improve the pre-triage system implemented in CMIN. The DM models
will be integrated in the DSS implemented in CMIN. An improvement in the quality of patient care and
service is expected by optimizing resource allocation and by reducing waiting times. In order to assess the
DSS and the pre-triage system and to define a strategy, a SWOT analysis has been carried out:
Strengths:
• System calibrated for discriminating between URG and ARGO;
• Specific system for gynaecology and obstetrics;
• Usability;
• Interoperability;
• High availability;
• Health professionals are interested in the benefits of the implemented IDSS;
• High collaboration between clinical (nurses and physicians) and information systems staff.
Weaknesses:
• Limited and reduced range;
• Possibility of error in referring patients;
• The system is only a routing model able to distinguish patients into two levels (URG, ARGO).
Opportunities:
• System with great potential for growth;
• Evolution to a specific priority system similar to MTS and OTAS;
• Introduce / improve real-time and online learning components;
• Use Data Mining to improve the DSS.
Threats to the pre-triage system:
• Wrong diagnoses;
• System failures;
• Competition from other similar systems;
• Security of the system.
DISCUSSION
In the data preparation phase it was found that a minority of cases (10 %) classified as URG do not have
enough information to justify such a classification. On the other hand, some cases labelled as ARGO have
associated parameters for URG classification. These situations occur only when the patients have one or
two parameters associated with the URG classification. This is because healthcare professionals,
responsible for triage, have the final decision and can force a different outcome (URG, ARGO and EMERG)
if they do not agree with the result of the pre-triage system. In 12% of the cases healthcare professionals
force a result different from the one produced by the pre-triage system of CMIN.
The graphs of Figure 1, presented in the Data Preparation subsection, show the data distribution for each
one of the flowcharts / types of patients considered in this study. Taking into account the induced models
for each one of the four proposed scenarios (all data, without URG, without ARGO, and without URG and
ARGO), the best results obtained for each one of the flowcharts / types of patients are presented in
Table 7. The number of correct or incorrect predictions was calculated using 40% (testing data) of the
total data for each target / set of data.
Table 7 - Number of cases that the best Data Mining technique(s) applied hit and missed for each of the
scenarios defined, in the pregnant, postpartum, non-postpartum, maybe pregnant, VIP and CTG
flowcharts / types of patients.

Flowchart / Scenario      Data Mining technique(s)   Correct   Incorrect   % of Correct
Pregnant
  All Data                NB                           11527        2568          81.78
  Without URG             NB                           14544        3148          82.21
  Without ARGO            GLM and SVM                  14658        2596          84.95
  Without URG and ARGO    GLM and SVM                  11502           0         100.00
Post-Partum
  All Data                GLM and SVM                   1351         291          82.28
  Without URG             GLM and SVM                   1331          75          94.67
  Without ARGO            SVM                           1398         208          87.05
  Without URG and ARGO    SVM                           1311           0         100.00
Non Postpartum
  All Data                SVM, NB and GLM               8597        1295          86.91
  Without URG             GLM, SVM and NB               8566         527          94.21
  Without ARGO            GLM, SVM and NB               8519         775          91.66
  Without URG and ARGO    GLM, SVM and NB               8488           0         100.00
Maybe Pregnant
  All Data                GLM                           1592         278          85.13
  Without URG             GLM                           1589          64          96.13
  Without ARGO            GLM                           1650         227          87.91
  Without URG and ARGO    GLM and SVM                   1519           0         100.00
For VIP
  All Data                GLM, NB and DT                1105          26          97.70
  Without URG             GLM, SVM, DT and NB           1091           3          99.73
  Without ARGO            GLM, DT and NB                1066          25          97.71
  Without URG and ARGO    GLM and SVM                   1102           0         100.00
For CTG
  All Data                GLM, NB and DT                 801         167          82.74
  Without URG             GLM, SVM, DT and NB            805           2          99.75
  Without ARGO            GLM, DT and NB                 788         175          81.83
  Without URG and ARGO    GLM, SVM, DT and NB            802           0         100.00
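The "% of Correct" column of Table 7 appears to be the share of correct predictions among all test
cases, i.e. correct / (correct + incorrect). The following is a minimal sketch under that assumption; the
function name percent_correct is illustrative and does not come from the study, and the counts are those
reported for the pregnant flowchart.

```python
def percent_correct(correct: int, incorrect: int) -> float:
    """Share of correct predictions, as a percentage of all test cases."""
    total = correct + incorrect
    if total == 0:
        raise ValueError("no test cases")
    return 100.0 * correct / total


# (correct, incorrect) counts as reported in Table 7 for the pregnant flowchart.
pregnant = {
    "All Data": (11527, 2568),
    "Without URG": (14544, 3148),
    "Without ARGO": (14658, 2596),
    "Without URG and ARGO": (11502, 0),
}

for scenario, (hit, miss) in pregnant.items():
    print(f"{scenario}: {percent_correct(hit, miss):.2f} %")
```

The same function can be applied to the other flowcharts to cross-check the reported percentages against
the reported counts.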
In general, for all flowcharts, the all data model obtained the worst results. This is explained by the
existence of records classified as URG that do not have any parameters associated with URG, and of
records where the result was ARGO but should have been URG (because they have parameters
associated with URG). For this scenario the worst performance is that of the pregnant flowchart, with
81.78 % of correct predictions, and the best is that of VIP patients, with 97.70 %. The VIP model
can be considered acceptable for a decision support system, because in all scenarios the
percentage of correct predictions was higher than 97.5 %.
The models without URG and without ARGO were induced in order to validate the pre-triage and to find
possible improvement points. In the scenario without URG the worst result is that of pregnant patients,
with 82.21 % of correct classifications, and the best is that of CTG, with 99.75 % of correct
classifications. The model without ARGO presented the worst result for CTG patients, with 81.83 %
of correct classifications, and the best for VIP, with about 97.71 % of correct classifications.
By removing the ARGO and URG cases that were not expected, accuracy increases. This means that
when the output is not forced the flowcharts (without URG and ARGO) are adequate, i.e., only in
these cases does the pre-triage system work without failures.
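One possible reading of how the four scenarios are derived from the data can be sketched as follows.
This is an illustration only: the Record fields and the two "unexpected" rules are assumptions based on
the description above (URG records with no URG parameters, ARGO records with URG parameters), not
the study's actual data schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    result: str          # final triage outcome: "URG" or "ARGO"
    urg_parameters: int  # hypothetical count of recorded parameters associated with URG


def unexpected_urg(r: Record) -> bool:
    """URG outcome with no parameter that justifies it (assumed rule)."""
    return r.result == "URG" and r.urg_parameters == 0


def unexpected_argo(r: Record) -> bool:
    """ARGO outcome although URG parameters were recorded (assumed rule)."""
    return r.result == "ARGO" and r.urg_parameters > 0


def build_scenarios(records):
    """Return the four data sets, keyed by scenario name."""
    records = list(records)
    return {
        "all data": records,
        "without URG": [r for r in records if not unexpected_urg(r)],
        "without ARGO": [r for r in records if not unexpected_argo(r)],
        "without URG and ARGO": [
            r for r in records
            if not unexpected_urg(r) and not unexpected_argo(r)
        ],
    }


# Tiny made-up sample: one expected and one unexpected record per class.
sample = [Record("URG", 2), Record("URG", 0), Record("ARGO", 0), Record("ARGO", 1)]
for name, subset in build_scenarios(sample).items():
    print(name, len(subset))
```

In the last scenario only records whose outcome agrees with the recorded parameters remain, which is
consistent with the absence of incorrect predictions reported for it.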
The pre-triage system can be improved, as shown by the all data, without ARGO and without URG studies.
The study without URG and ARGO showed that the pre-triage system is calibrated (DM
accuracy of 100 %) and, consequently, that the flowcharts are adequate. However, in the other cases (all
data, without URG and without ARGO) the attained results (accuracy lower than 89.71 %) showed that
sometimes it is necessary to force the pre-triage result in order to better characterize the patient condition.
After an analysis of patient records from January 2010 to April 2014, some weaknesses were found.
Errors may occur in the categorization of the patient outcome, and the system only distinguishes patients
into two levels of priority (URG or ARGO). For example, patients in the URG class are seen in order of
arrival, which is not always consistent with their level of severity. Lastly, healthcare
professionals still sometimes need to force an output different from the one expected by the system (URG
or ARGO). The idea of transforming the pre-triage system into a priority triage system specific to
gynecology and obstetrics, similar to the Manchester Triage System, is gaining momentum. This
change would allow patients who are currently triaged as URG to be distinguished among various
levels of priority. It would bring great gains in healthcare, since patients with
greater severity would be served first, and only afterwards those in need of minor clinical care.
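The difference between serving patients by order of arrival and serving them by severity can be
illustrated with a priority queue. This is a minimal sketch, not the planned system: the class name
TriageQueue, the numbering of the levels (1 as the most severe of 5) and the rule that ties are broken
by arrival order are all assumptions made for the example.

```python
import heapq
from itertools import count


class TriageQueue:
    """Serve the most severe patient first; break ties by order of arrival."""

    def __init__(self):
        self._heap = []
        self._arrival = count()  # increasing counter used as arrival stamp

    def admit(self, patient_id: str, level: int) -> None:
        # Assumed convention: level 1 is the most severe, level 5 the least.
        if not 1 <= level <= 5:
            raise ValueError("level must be between 1 and 5")
        heapq.heappush(self._heap, (level, next(self._arrival), patient_id))

    def next_patient(self) -> str:
        _level, _stamp, patient_id = heapq.heappop(self._heap)
        return patient_id

    def __len__(self) -> int:
        return len(self._heap)


queue = TriageQueue()
queue.admit("A", 3)  # arrives first, moderate severity
queue.admit("B", 1)  # arrives second, highest severity
queue.admit("C", 3)  # arrives last, same level as A
while queue:
    print(queue.next_patient())
```

With a single URG class every patient would share the same level and the queue would reduce to
order of arrival, which is the current behaviour described above.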
The SWOT analysis showed some threats. The biggest threats are the possibility of wrong diagnosis and
the existence of system failures. Security is an issue that administrators should always take into account
in the development process of any system. In this sense, it is very important to ensure the security and
confidentiality of the information. It is also very important to ensure system availability. This means that
contingency plans should be followed in disaster situations or if the system fails (CMIN
activity cannot be stopped). The DSS will be supported by the interoperability platform AIDA, which
assures a high level of security.
CONCLUSIONS
In the health sector it is very important to make quick and assertive decisions because they are often
related to human life. This paper presented data mining models for assessing a triage DSS implemented
in the CMIN for routing patients between URG and ARGO. This system classifies patients into two levels
and was built to reduce the number of cases classified as URG when in fact they were not. With this
system non-URG cases are routed to consultation. Making use of DM techniques, work was carried out in
order to verify whether or not the system is calibrated according to what is expected from the flowcharts,
i.e., whether the pre-triage result matches the patient condition.
Six distinct scenarios were explored. The scenarios without ARGO and without URG showed the need for
improvements in the pre-triage system. This system should evolve to a priority-based system similar to
MTS and OTAS, but prepared to attend all types of maternity care patients. This change can increase
patient care quality and satisfaction by reducing the number of misclassified cases (cases where the nurses
responsible for triage do not agree with the pre-triage system result) and decreasing the hospital care
waiting time. The new triage system will be more sensitive to the patient condition and will better
characterize the patient care needs, prioritizing treatment according to their clinical condition.
A SWOT analysis demonstrated the pertinence of adapting the current pre-triage system into a
priority system specific to Gynaecology and Obstetrics units. This new system will be similar to MTS,
enabling the triage of patients into 5 levels of acuity, and similar to the OTAS, in the case of the pregnant
flowchart, being a system specific to the Obstetrics and Gynaecology specialties.
FUTURE RESEARCH DIRECTIONS
Work will be done joining the methodologies used by MTS and the OTAS and the variables used in the
pre-triage system implemented in CMIN, along with new variables that are shown to be relevant, in order to
map the existing system to a priority system specific to Gynaecology and Obstetrics with 5 levels of
acuity. Like the pre-triage system, this new system of priorities will not be limited to pregnant patients (as
is the OTAS system), but will consider the remaining types of patients who are treated at CMIN
(postpartum, non-postpartum, maybe pregnant, IGO and CTG patients).
Studies concerning the adaptation of this new triage system are currently in development together with
clinical specialists in the field of Maternity Care.
Complementary work will be carried out in order to make the DSS intelligent by exploring the adaptive
capacities of the triage system. Via ensemble DM models, the recorded data can be used to adapt the
triage model in real time.
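One way an ensemble could adapt as new triage records arrive is the classic weighted majority scheme,
in which each model votes with a weight that is reduced whenever the model disagrees with the final
decision recorded by the healthcare professional. The text does not say which ensemble method will be
used, so this is only a sketch of one option; the class, the three toy models and the urg_parameters
field are hypothetical.

```python
from collections import defaultdict


class WeightedMajority:
    """Weighted majority vote whose weights are updated after each labelled case."""

    def __init__(self, models, beta: float = 0.5):
        if not 0.0 < beta < 1.0:
            raise ValueError("beta must be in (0, 1)")
        self.models = list(models)
        self.weights = [1.0] * len(self.models)
        self.beta = beta  # penalty factor applied to a model that was wrong

    def predict(self, case) -> str:
        votes = defaultdict(float)
        for model, weight in zip(self.models, self.weights):
            votes[model(case)] += weight
        return max(votes, key=votes.get)

    def update(self, case, true_label: str) -> None:
        # Shrink the weight of every model that mispredicted this case.
        for i, model in enumerate(self.models):
            if model(case) != true_label:
                self.weights[i] *= self.beta


# Toy stand-ins for trained DM models (hypothetical rules, for illustration only).
def always_urg(case):
    return "URG"


def always_argo(case):
    return "ARGO"


def parameter_rule(case):
    return "URG" if case["urg_parameters"] > 0 else "ARGO"


ensemble = WeightedMajority([always_urg, always_argo, parameter_rule])
stream = [({"urg_parameters": 0}, "ARGO"), ({"urg_parameters": 2}, "URG")]
for case, final_decision in stream:
    print(ensemble.predict(case), "recorded:", final_decision)
    ensemble.update(case, final_decision)
print(ensemble.weights)
```

Models that keep agreeing with the recorded decisions retain their influence, while the others are
gradually discounted, which is one simple form of real-time adaptation.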
ACKNOWLEDGEMENTS
This work is funded by National Funds through the FCT - Fundação para a Ciência e a Tecnologia
(Portuguese Foundation for Science and Technology) within projects PEst-OE/EEI/UI0752/2014 and
PEst-OE/EEI/UI0319/2014. The work of Filipe Portela was supported by a postdoctoral grant associated
to FCT project INTCare II - PTDC/EEI-SII/1302/2012.
REFERENCES
Abelha, A., Analide, C., Machado, J., Neves, J., & Novais, P. (2007). Ambient Intelligence And
Simulation In Health Care Virtual Scenarios, IFIP — The International Federation for
Information Processing, 243, 461–468.
Beguería, S. (2006). Validation and evaluation of predictive models in hazard assessment and
risk management. Natural Hazards, 37(3), 315–329.
Chapman, P. (2000). The CRISP-DM User Guide. NCR Systems Engineering Copenhagen.
Fayyad, U., Piatetsky-Shapiro, G., & Smyth, P. (1996). From Data Mining to Knowledge
Discovery in Databases. AI Magazine, 17(3), 37–54.
João, A. (2007). Avaliação de Artigos Científicos. Bases da Epidemiologia Clínica.
Khodambashi, S. (2013). Business Process Re-engineering Application in Healthcare in a
Relation to Health Information Systems. Procedia Technology, 9(2212), 949–957.
doi:10.1016/j.protcy.2013.12.106
Krzysztof, C., Witold, P., Roman, S., & Lukasz, K. (2007). Data Mining. A knowledge
Discovery Approach. Springer.
Machado, J., Abelha, A., Rua, F., & Centre, A. (2013). Real-time Predictive Analytics for Sepsis
Level and Therapeutic Plans in Intensive Care Medicine. International Information
Institute.
Murray, M., Bullard, M., & Grafstein, E. (2004). Revisions to the Canadian Emergency
Department Triage and Acuity Scale implementation guidelines. CJEM, 6(6), 421–7.
Retrieved from http://www.ncbi.nlm.nih.gov/pubmed/17378961
Peixoto, H., Santos, M., Abelha, A., & Machado, J. (2012). Intelligence in Interoperability with
AIDA. In L. Chen, A. Felfernig, J. Liu, & Z. Raś (Eds.), Foundations of Intelligent Systems
SE - 31 (Vol. 7661, pp. 264–273). Springer Berlin Heidelberg. doi:10.1007/978-3-642-34624-8_31
Pereira, R., Salazar, M., Abelha, A., & Machado, J. (2013). SWOT Analysis of a Portuguese
Electronic Health Record. In C. Douligeris, N. Polemi, A. Karantjias, & W. Lamersdorf
(Eds.), I3E (Vol. 399, pp. 169–177). Springer.
Portela, F., Cabral, A., Abelha, A., Salazar, M., Quintas, C., Machado, J., … Santos, M. F.
(2013). Knowledge Acquisition Process for Intelligent Decision Support in Critical Health
Care. In R. Martinho, R. Rijo, M. Cruz-Cunha, & J. Varajão (Eds.) (Vol. Informatio).
Hershey, PA: Medical Information Science Reference.
Ripley, B. D. (2002). Statistical Data Mining, (May).
Rojão, A. I. R. (2011). Data mining languages for business intelligence. University of Minho.
Retrieved from http://repositorium.sdum.uminho.pt/handle/1822/22892
Salar, M., & Salar, O. (2014). Determining Pros and Cons of Franchising by Using Swot
Analysis. Procedia - Social and Behavioral Sciences, 122, 515–519.
doi:10.1016/j.sbspro.2014.01.1385
Shariatmadari, M., Sarfaraz, A. H., Hedayat, P., & Vadoudi, K. (2013). Using SWOT Analysis
and Sem to Prioritize Strategies in Foreign Exchange Market in Iran. Procedia - Social and
Behavioral Sciences, 99, 886–892. doi:10.1016/j.sbspro.2013.10.561
Smithson, D. S., Twohey, R., Rice, T., Watts, N., Fernandes, C. M., & Gratton, R. J. (2013).
Implementing an obstetric triage acuity scale: interrater reliability and patient flow analysis.
American Journal of Obstetrics and Gynecology, 209(4), 287–93.
doi:10.1016/j.ajog.2013.03.031
ADDITIONAL READING SECTION
Angelini, D. J., Zannieri, C. L., Silva, V. B., Fein, E., & Ward, P. J. (1990). Toward a concept of
triage for labor and delivery: staff perceptions and role utilization. The Journal of Perinatal
& Neonatal Nursing, 4(3), 1–11.
Bellazzi, R., & Zupan, B. (2008). Predictive data mining in clinical medicine: current issues and
guidelines. International journal of medical informatics, 77(2), 81-97.
Cabral, A., Pina, C., Machado, H., Abelha, A., Salazar, M., Quintas, C., … Santos, M. F. (2011).
Data acquisition process for an intelligent decision support in gynecology and obstetrics
emergency triage. ENTERprise Information Systems Communications in Computer and
Information Science in Maria Manuela Cruz-Cunha, João Varajão, Philip Powell. Ricardo
Martinho. Volume 221, 2011, pp 223-232. Springer.
Cronin, J. G. (2003). The introduction of the Manchester triage scale to an emergency
department in the Republic of Ireland. Accident and Emergency Nursing.
Kantardzic, M. (2011). Data mining: concepts, models, methods, and algorithms: Wiley-IEEE
Press.
Koh, H. C., & Tan, G. (2011). Data mining applications in healthcare. Journal of Healthcare
Information Management, 19(2), 65.
Maconochie, I., & Dawood, M. (2008). Manchester triage system in paediatric emergency care.
BMJ (Clinical research ed.) (Vol. 337, p. a1507). Retrieved from
http://www.ncbi.nlm.nih.gov/pubmed/22334644
Martins, H. M. G., Cuña, L. M. D. C. D., & Freitas, P. (2009). Is Manchester (MTS) more than a
triage system? A study of its association with mortality and admission to a large Portuguese
hospital. Emergency Medicine Journal, 26(3), 183–186.
Peng, Y., Kou, G., Shi, Y., & Chen, Z. (2008). A descriptive framework for the field of data
mining and knowledge discovery. International Journal of Information Technology &
Decision Making, 7(04), 639–682.
Portela, F., Santos, M. F., Silva, Á., Machado, J., Abelha, A., & Rua, F. (2013). Data mining for
real-time intelligent decision support system in intensive care medicine.
S. Khodambashi, “Business Process Re-engineering Application in Healthcare in a Relation to
Health Information Systems,” Procedia Technology in Maria Manuela Cruz-Cunha, João
Varajão, Helmut Krcmar and Ricardo Martinho (Eds), vol. 9, no. 2212, pp. 949–957, Jan.
2013.
Santos, M. F., & Azevedo, C. S. (2005). Data mining: descoberta de conhecimento em bases de
dados. FCA editores.
Saúde, M. da. (2006). Serviço de Urgência - Recomendações para a organização dos cuidados
urgentes e emergentes. Ministério Da Saúde - Hospitais SA.
Services, U. S. D. o. H. H. (2003). Emergency Severity Index - Five-level Triage Systems.
Ministério da Saúde (2013). Triagem Obstétrica- modelo de Triagem. Lisboa: Direção Geral de
Saúde.
Turban E., A. J. E. & L. T.-P. (2005). Decision Support Systems and Intelligent Systems.
W. Bonney, “Applicability of Business Intelligence in Electronic Health Record,” Procedia Soc. Behav. Sci., vol. 73, pp. 257–262, Feb. 2013.
KEY TERMS & DEFINITIONS
AIDA - Platform developed to ensure interoperability among healthcare information systems.
Data Mining - Process of exploring large amounts of data in search of consistent patterns.
Decision Support System - A computerized information system used to support the decision-making
process in an organization or business.
Interoperability - Autonomous ability to interact and communicate.
Knowledge Discovery from Databases – Process that encompasses a set of ongoing activities that share
the knowledge discovered from databases. It consists of five stages, namely, selection, pre-processing,
transformation, data mining and interpretation/evaluation.
Manchester Triage System - The MTS is a scale used in the triage process of patients when they are
admitted in the Emergency Department.
Maternity Care - Health institution where patients of the gynecology and obstetrics specialties are admitted.
Obstetric Triage Acuity Scale – A 5-category scale used in the triage process of patients when they are
admitted in the Emergency Department of an Obstetric unit.
SWOT analysis - Identification and discussion of strengths, weaknesses, opportunities and threats with
the purpose of better understanding and improving a system.
Triage System - A system whose main aim is to improve the quality of care by providing a
service based on clinical characteristics and the target time.