Intelligent Modeling To Predict Ejection Fraction
From Echocardiographic Reports
José Pedro Gonçalves Fernandes da Mota
Email: [email protected]
Abstract
Left ventricular ejection fraction is one of the most common physiological parameters used by
clinicians to monitor cardiac function. Patients with preserved ejection fraction are expected to have
a lower risk of undergoing cardiac events. However, recent studies have shown that the incidence of
mortality and cardiac events in these patients still constitutes a major concern. The aim of
this thesis is to develop a model to predict heart failure in patients in the intensive care unit.
Two sets of data were extracted: (i) data referring to physiological parameters measured before an
echocardiogram, and (ii) data recorded during the stay of the patient in the ICU. Fuzzy modeling was
applied to the datasets to develop the inference systems.
The model created using the first dataset revealed a low capability of the parameters acquired at
the time of the echocardiographic exam to predict heart failure.
The second dataset delivered significantly better performance, which was further improved using a
range of sequential feature selection methods, namely: sequential forward selection (SFS), sequential
backward selection (SBS), modified sequential forward selection (MSFS) and modified sequential
backward selection (MSBS). These methods improve the performance measures when compared with
the model, which did not use any feature selection. The two feature selection methods that provided
the best results (AUC higher than 0.77 and accuracy up to 0.73) were the MSFS and the MSBS, using
only 5 and 6 of the 14 available features, respectively, available in the second dataset. The features
common to all sets resulting from feature selection were the left ventricular ejection fraction (LVEF)
value, heart rate and age.
Results were compared with those of a model intended to mimic the real clinical decision. This
model obtained an accuracy of 59.5%, and 60.6% and 58.6% for sensitivity and specificity,
respectively.
The fuzzy modeling approach suggested in this thesis allows non-linear analysis of several
variables, which could improve critical clinical decisions. To the best of the author's knowledge, this is
the first approach to predicting heart failure using non-linear multivariate analysis, and the results
obtained suggest that the clinical decision process can be improved by using such methods.
Keywords:
Heart Failure, Fuzzy Modeling, Feature Selection, Sequential Forward Selection, Sequential Backward
Selection, modified Sequential Forward Selection, modified Sequential Backward Selection
I. Introduction
In the past, healthcare practitioners believed
that patient outcomes depended almost
exclusively on the training, capabilities and skills
of individual physicians, patient characteristics,
and the specifics of the illness or procedures
being performed. Nowadays, healthcare
decision-makers have begun to look toward
engineering systems concepts and approaches
for solutions to challenging problems, and as a
way to improve quality and reduce costs.
Healthcare delivery systems are one of the
most technologically intense and data-rich
industries [1].
Having access to large medical databases
enables researchers to build models with the
objective of predicting illnesses. In this work, the
objective was to create a model that could predict
heart failure in patients in intensive care units
(ICU).
Heart failure
Among all illnesses, Heart Failure (HF) is a
common, costly, disabling, and potentially deadly
condition. In developed countries, around 2% of
adults suffer from heart failure, but in those over
the age of 65, this increases to 6-10% [2].
Heart failure, often called congestive heart
failure (CHF), occurs when the heart is unable to
provide sufficient pump action to maintain blood
flow to meet the needs of the body. The condition
is diagnosed by patient physical examination and
confirmed with echocardiographic exam.
The echocardiogram has become routinely used
in the diagnosis, management, and follow-up of
patients with any suspected or known heart
disease. It is one of the most widely used
diagnostic tests in cardiology. It can provide
helpful information, including the size and shape
of the heart (internal chamber size
quantification), pumping capacity, and the
location and extent of any tissue damage. An
echocardiogram can also give physicians other
estimates of heart function, such as a calculation
of the cardiac output, ejection fraction (EF), and
diastolic function (how well the heart relaxes).
The measurement of the volumetric fraction of
blood pumped out of the ventricle is called
ejection fraction. A value of 60 percent means
that 60 percent of the total amount of blood in the
ventricle is pushed out with each heartbeat.
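As a minimal illustration of this definition (the standard formula, stated here for clarity rather than taken from the thesis), EF is the stroke volume expressed as a fraction of the end-diastolic volume:

```python
def ejection_fraction(edv_ml: float, esv_ml: float) -> float:
    """Ejection fraction (%) from end-diastolic and end-systolic volumes.

    EF = (EDV - ESV) / EDV * 100, i.e. the fraction of the blood in the
    ventricle that is pushed out with each heartbeat.
    """
    stroke_volume = edv_ml - esv_ml
    return 100.0 * stroke_volume / edv_ml

# An EDV of 120 ml with an ESV of 48 ml gives the 60% example from the text.
print(ejection_fraction(120, 48))  # 60.0
```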
The heart has four valves that ensure blood
flows in only one direction through the heart.
From the left ventricle, the blood is pumped into a
network of arteries that carry it throughout the
body.
Measurement of left ventricular ejection
fraction (LVEF) is a well-established clinical
parameter that has essential diagnostic,
therapeutic and prognostic implications,
particularly in the settings of coronary artery
disease (CAD) and HF [2], [3], [4].
Healthy individuals typically have EF between
50% and 65% [5]. However, normal values
depend upon the modality being used to
calculate the EF, and some sources consider an
EF of 55-75% to be normal [6]. Damage to the
heart muscle (myocardium), such as that
sustained during myocardial infarction, in atrial
fibrillation, or through the many etiologies of
cardiomyopathy, compromises the heart's ability
to perform as an efficient pump and therefore
reduces ejection fraction.
Healthy older adults adapt favorably as the
ventricles become less compliant and, with good
genetics and a healthy lifestyle, are routinely
shown echocardiographically to have an EF of
55-85%.
The EF is one of the most important
predictors of prognosis; those with significantly
reduced ejection fractions typically have poorer
prognoses. However, recent studies have
indicated that a preserved ejection fraction does
not mean freedom from risk.
Among HF patients in sinus rhythm, a higher
LVEF was associated with a linear decrease in
mortality up to an LVEF of 45%; however,
increases above 45% were not associated with
further reductions in mortality [7], [8]. This entity
has been termed "heart failure with preserved
ejection fraction" (HFPEF).
As seen in Figure 1, HF occurrence
decreases up to an LVEF of 45%, then increases
again at an LVEF of around 55%. The figure
represents HF occurrence in patients from the
database used in this thesis, which clearly
includes patients with HFPEF.
HFPEF is also referred to as diastolic heart
failure (DHF) and currently accounts for more
than 50% of all heart failure patients. Though the
guidelines for diagnosing DHF have been
updated successively through recent years,
diagnosis is still highly based on imaging [8].
Figure 1 - Percentage of patients with HF along the
LVEF values.
Data sets used to predict Heart Failure
The aim of this work is to unravel the sets of
variables that achieve the best performance in
predicting DHF. To that end, two different data
sets were tested:
1. The physiologic data acquired at the time of
the echocardiographic exam.
This data set was obtained when an ICU
patient was about to undergo an
echocardiogram, and it consists of textual
medical reports with information on all the
features measured before the exam. These textual
medical reports are contained in the MIMIC II
database. The set of eight features obtained from
these reports is:
• LVEF value
• Indication (clinical state of patient)
• Height
• Weight
• Body surface area (BSA)
• Systolic Non-invasive Blood Pressure (sNBP)
• Diastolic Non-invasive Blood Pressure (dNBP)
• Heart rate (HR)
The outcome is whether or not the patient had
diastolic heart failure.
2. The most frequently acquired physiologic
variables in the intensive care unit (ICU).
This data set is a set of features recorded
throughout the patient's stay in the ICU. We
extracted a total of 15 numerical physiological
variables from the MIMIC II database, using a
standard variable selection process that seeks to
maximize the amount of data within any given
variable while maintaining robust numbers and
statistical power.
The extracted features are presented in
Table 1, and the outcome is whether or not the
patient had diastolic heart failure.
Table 1 - List of features from the second dataset
and corresponding number

#    Feature
1    Heart Rate
2    Systolic Non-invasive Blood Pressure
3    Mean Non-invasive Blood Pressure
4    Respiratory Rate
5    Oxygen Saturation in the blood
6    Temperature (ºC)
7    Glucose
8    Hematocrit
9    Platelets
10   Potassium
11   Diastolic Non-invasive Blood Pressure
12   Age
13   Weight on admission
14   LVEF value
The differences between these two data sets
lie in the number of available features and in how
these features were obtained: in the first, features
were recorded immediately before the exam,
whereas in the second they are mean values
computed from time-series data.
Both data sets were extracted from the MIMIC II
database.

II. Methods
This work made use of the MIMIC II (Multi-parameter
Intelligent Monitoring in Intensive
Care) database. It encompasses a diverse and
very large population of ICU patients from the
Beth Israel Deaconess Medical Center, in
Boston, dating from 2001.

Data extraction
The methodology used to extract the
previously referred features from this data set
consists in applying text mining to the physicians'
textual reports.
We focused only on physiological parameters,
because the textual reports contain other features
that depend on medical decisions/assessments,
which might bias the results. In the textual reports
the reference to LVEF can be either textual (e.g.
"normal LVEF") or numeric, but only numeric
values were used. When a numeric interval was
reported for LVEF, which is common, its mean
value was used.

Figure 2 - Patient selection flowchart for the first
dataset

Pre-processing
Starting from the new data set with the
numeric values, we discarded the patients with
no information about the LVEF value; those with
no ICU stay identification, since we want to
predict HF for patients who are in the ICU; and
those with missing physiologic values.
After these exclusions, it was necessary to
search for outliers to ensure they had no
influence on model development, using the IQR
method. Usually a factor of 1.5 times the IQR is
used, but we took a "greedy" approach and used
a factor of 3, excluding only the "extreme
outliers". Since we take conservative measures
throughout this work, we preferred to remove
only these extreme outliers: applying the 1.5
factor would exclude many more patients that
could actually be informative for our models.
To the resulting data sets we applied min-max
normalization, in which the minimum value of a
feature maps to 0 and the maximum value maps
to 1.
For the two obtained datasets, the flowcharts
in Figure 2 and Figure 3 summarize the
pre-processing steps applied and the resulting
amount of available data.

Figure 3 - Patient selection flowchart for the
second dataset
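The extraction and pre-processing steps described in this section (numeric LVEF parsing with interval averaging, extreme-outlier removal with a 3×IQR fence, and min-max normalization) can be sketched as follows. This is an illustrative sketch, not the thesis code: the function names and the regular expression are assumptions about the report format.

```python
import re
import statistics

def parse_lvef(report: str):
    """Return the LVEF value (%) from a textual report, or None.

    Only numeric mentions are used; textual ones ("normal LVEF") are
    discarded.  A reported interval such as "40-45 %" yields its mean.
    """
    interval = re.search(r"LVEF[^0-9]*(\d+)\s*-\s*(\d+)\s*%", report)
    if interval:
        return (int(interval.group(1)) + int(interval.group(2))) / 2
    single = re.search(r"LVEF[^0-9]*(\d+)\s*%", report)
    return float(single.group(1)) if single else None

def drop_extreme_outliers(values, factor=3.0):
    """Keep only values inside [Q1 - factor*IQR, Q3 + factor*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lo, hi = q1 - factor * iqr, q3 + factor * iqr
    return [v for v in values if lo <= v <= hi]

def min_max(values):
    """Min-max normalization: the minimum maps to 0, the maximum to 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

print(parse_lvef("LVEF 40-45 %"))  # 42.5
print(parse_lvef("normal LVEF"))   # None
```

The same fence logic with `factor=1.5` would give the conventional outlier rule mentioned above; the factor of 3 keeps everything except extreme outliers.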
Modeling
Usually, non-linear modeling techniques are
designated "black-box" models, since they do
not need prior knowledge of the system [9].
Given that the majority of real-world systems are
complex and only partly understood, this is
extremely useful.
Within medical classification problems,
several fuzzy-based models have shown
performance comparable to other non-linear
modeling techniques [10], [11]. Fuzzy modeling is
appealing mostly because it provides not only a
transparent, non-crisp model, but also a linguistic
interpretation in the form of rules and logical
connectives. More than "black-box" models, they
are "grey-box" models [12], since their rule-based
nature allows for a linguistic description of
knowledge.
In this work we used first-order Takagi-Sugeno
(TS) fuzzy models to perform classification, as
they are usually more suitable for the accurate
modeling and identification of nonlinear systems.
These models are described by rules of the type:

R_i: If x_1 is A_{i1} and ... and x_n is A_{in} then y^i = (a^i)^T x + b^i

where R_i denotes the i-th rule, x is the vector of
antecedent variables, A_{i1}, ..., A_{in} and y^i are the
antecedent fuzzy sets and the one-dimensional
consequent variable of the i-th rule, respectively,
a^i is a vector of parameters and b^i is a scalar
offset that relates the antecedent fuzzy sets with
the consequents.
The number of classes of similar objects
(a.k.a. clusters) that exist within the dataset
determines the number of these if-then rules.
Each cluster can be treated collectively as one
group and so may be explained by a common
rule. Several clustering techniques have been
presented in the literature; the one used in this
work was fuzzy c-means (FCM) clustering.
Each cluster creates one rule. Upon
evaluation of the model all rules are activated,
each according to the membership degree of the
sample in the corresponding cluster, and a
continuous real output is returned as the
membership-weighted combination of the
outputs of the individual rules.
The membership functions (fitted Gaussian
functions) generated by fuzzy modeling allow
interpretation of the rules when coupled with the
information from the inference built from the
data, contained in the partition matrix. Though
this interpretation is very simplistic, it allows
unraveling the most striking rules in order to
seek expert understanding and validation.

Feature Selection
Most databases contain redundant and
irrelevant features that will only mislead the
modeling process. A feature is said to be relevant
if the model performance increases when it is
added to the feature vector; it is considered
redundant if it is highly correlated with other
features. Once feature selection is applied, the
majority of these features will be discarded,
which should increase the quality of the
database.
As said before, in real problems the available
data have high dimensionality, so feature
selection (FS) techniques make it possible to
reduce the available data to a manageable size.
This reduction improves training and utilization
speed in cases where the computational
complexity of the learning algorithms is
unbearable. Nevertheless, in this work the main
objective of the FS algorithms is to improve
predictor performance.
The class of methods used in this work was
the wrapper method. It was chosen because it
involves training a predictor as part of the
selection procedure, so the performance of the
predictor with the selected subset is directly tied
to the objective function evaluated during the
search.
The subset evaluation is made at two different
phases, during and after FS. The search process
is guided by the value of an objective function
that measures the quality of the solutions, and
the performance of the final subset is evaluated
on a validation set by using cross-validation.
Therefore a very important part of the wrapper
approach is the objective function, which should
incorporate the goals that are meant to be
achieved with FS (e.g. model accuracy, number
of features).
The performance of the models was
evaluated in terms of the area under the receiver
operating characteristic curve (AUC), accuracy
(correct classification rate), sensitivity (true
positive classification rate) and specificity (true
negative classification rate).
The main goal of feature selection is to find an
optimal subset of the full set of features that
provides relevant information, in order to match
or improve the accuracy of the classifiers. In this
thesis four different methods were used:
Sequential Forward Selection (SFS), Sequential
Backward Selection (SBS), a modified algorithm
based on SFS called Modified Sequential
Forward Selection (MSFS), and a modified
algorithm based on SBS called Modified
Sequential Backward Selection (MSBS).
The SFS method sequentially adds features
to the best set previously evaluated until a
stopping criterion is achieved (e.g. no
improvement in performance). It considers the
best set obtained in the previous step to advance
to the next step.
Analogously, the SBS method removes
features from the total set of features until the
stopping criterion is achieved.
The MSFS and MSBS are modifications of
the previous methods in which two criteria were
added, based on the performance of the best
sets evaluated at step k: several sets are
allowed to be selected for evaluation at the next
step, and the advance of the method is restricted
based on the improvement in performance [13].
The model was assessed in terms of AUC,
accuracy, sensitivity and specificity.
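A minimal sketch of how a first-order TS model produces its continuous output (Gaussian antecedent memberships, affine rule consequents, outputs blended by membership degree). The two rules and all parameter values below are made up for illustration; they are not the fitted thesis model:

```python
import math

# Each rule: Gaussian antecedent (center, sigma per input) and an
# affine consequent y_i = a_i . x + b_i (first-order Takagi-Sugeno).
RULES = [
    {"center": [0.2, 0.3], "sigma": [0.2, 0.2], "a": [0.5, -0.1], "b": 0.1},
    {"center": [0.8, 0.7], "sigma": [0.2, 0.2], "a": [0.2, 0.4], "b": 0.5},
]

def membership(rule, x):
    """Product of Gaussian memberships over all antecedent variables."""
    mu = 1.0
    for xi, c, s in zip(x, rule["center"], rule["sigma"]):
        mu *= math.exp(-((xi - c) ** 2) / (2 * s ** 2))
    return mu

def ts_output(x):
    """Membership-weighted mean of the rule consequents."""
    mus = [membership(r, x) for r in RULES]
    ys = [sum(ai * xi for ai, xi in zip(r["a"], x)) + r["b"] for r in RULES]
    return sum(m * y for m, y in zip(mus, ys)) / sum(mus)

# The continuous output can then be thresholded into HF / no-HF classes.
print(round(ts_output([0.2, 0.3]), 3))  # 0.171
```

Near the center of a rule, that rule's consequent dominates; between clusters, the output interpolates smoothly between the rule outputs.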
Study design
For the first dataset:
After data extraction from the MIMIC II
database and pre-processing of the obtained
dataset, we needed to identify the best number
of clusters in which to group the data, so we
used the fcm function from the MathWorks
Fuzzy Toolbox. Using this function we obtained
the membership function of each data sample.
Since we want to find the optimum number of
clusters, we needed cluster validity indices to
determine the best cluster number. The indices
used were the Xie-Beni index [14] and the
Partition Coefficient index [15]. For the first, we
search for the number of clusters that gives the
minimum index value, whereas for the Partition
Coefficient index we search for the number of
clusters that originates the maximum value.
After obtaining the optimum number of
clusters, we needed to divide the data to create
our model. Initially this data set was divided into
balanced train/test sets of 70%/30%. This means
we ensure that 70% of the patients with HF
(1016 patients) and 70% of the patients without
HF (2087 patients) are included in the training
set; analogously, 30% of the patients with HF
(435 patients) and 30% of the patients without
HF (894 patients) are included in the test set.
With the optimum number of clusters, and the
data divided into train/test sets, we were able to
build a TS Fuzzy Inference System (FIS) using
the fuzzy c-means algorithm. To do that, we
used the genfis3 function, also from the
MathWorks Fuzzy Toolbox.
To obtain the model's best performance, the
optimum threshold value had to be found. To
calculate this value, an iterative process is used
in which models are created using thresholds
from 0.2 to 0.8 with a 0.01 step, and the best
result is the one with the minimum difference
between sensitivity and specificity. This
threshold value was the one used in the model
assessment.
For the second dataset:
As in the first dataset, we started with data
extraction, followed by pre-processing, ending
with a usable dataset. Finding the optimum
number of clusters for this dataset, using the fcm
function, was the next step. We used the same
indices as for the first dataset.
In order to have a baseline to compare with,
we started with a model with no FS, and then
applied the different FS methods described
previously.
Create model with no FS - Since we want to
create the model, test it, and finally validate it, we
divided the data into two balanced sets of 50%,
where the first one is used to build and test the
model.
Given the number of clusters, we built the TS
inference models, based on the regular FCM
algorithm, with the first 50% of the data set. Since
we end up with just 97 patients with HF in each
data set, we built the model based on 2-fold
cross-validation, which guarantees that all the
available data is used to train and test the model.
The method chosen to obtain the best threshold
value was the same as in the first dataset's
model.
Instead of using the second half of the
database, the entire database (936 patients) was
used in 10-fold cross-validation to validate the
model, as in [16]. If we had used the second half
of the database to assess the model, when
applying k-fold cross-validation we would have
few patients with heart failure in each fold. In
other words, all the available data was divided
into 10 different folds, and the performance of the
model was tested using iteratively a different fold
as the test data set. This process was repeated
ten times, and the final values of the model
performance were obtained as the mean of the
ten iterations.
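The threshold search used for the models (thresholds from 0.2 to 0.8 in steps of 0.01, keeping the one with the minimum sensitivity-specificity gap) can be sketched as follows; the toy scores and labels are illustrative only:

```python
def sens_spec(scores, labels, thr):
    """Sensitivity and specificity of the rule: score >= thr -> positive."""
    tp = sum(s >= thr and y for s, y in zip(scores, labels))
    fn = sum(s < thr and y for s, y in zip(scores, labels))
    tn = sum(s < thr and not y for s, y in zip(scores, labels))
    fp = sum(s >= thr and not y for s, y in zip(scores, labels))
    return tp / (tp + fn), tn / (tn + fp)

def best_threshold(scores, labels):
    """Scan 0.20..0.80 in 0.01 steps; minimize |sensitivity - specificity|."""
    candidates = [round(0.20 + 0.01 * k, 2) for k in range(61)]
    return min(candidates,
               key=lambda t: abs(sens_spec(scores, labels, t)[0]
                                 - sens_spec(scores, labels, t)[1]))

scores = [0.1, 0.3, 0.45, 0.55, 0.7, 0.9]
labels = [0, 0, 0, 1, 1, 1]
print(best_threshold(scores, labels))
```

On this perfectly separable toy data, any threshold between 0.45 (exclusive) and 0.55 gives sensitivity = specificity = 1, so the scan settles on the first such candidate.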
Models created with feature selection - The
previous model was created using all the
available features, with a performance
associated with that set. With FS, we intend to
obtain an even better performance while
reducing the number of features used to build
the model.
For SFS, in the first step, models were created
each with a single different feature. To find the
"best feature", all models were assessed, and
the one with the best performance (highest AUC
value) indicates the feature to start with. Fixing
this first feature and adding a new feature from
the remaining ones, we created models with the
different combinations of features. The
combination of two features that originates the
best performance indicates which feature to add
to the one already selected. A new feature is
only added to the "best feature set" if the
performance value is higher than the one
previously obtained. This process stops when
there is no other feature that, if added to the set
of selected features, originates a model with
better performance.
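This greedy loop can be sketched as follows, with the model-scoring function left abstract; the stand-in `score` below is a hypothetical objective that rewards one particular subset, used only to show the mechanics:

```python
def sfs(features, score):
    """Sequential forward selection: greedily add the feature that most
    improves the score; stop when no addition improves it."""
    selected = []
    best = float("-inf")
    while True:
        gains = [(score(selected + [f]), f)
                 for f in features if f not in selected]
        if not gains:
            break
        top_score, top_f = max(gains)
        if top_score <= best:   # stopping criterion: no improvement
            break
        selected.append(top_f)
        best = top_score
    return selected, best

# Toy objective: subsets of {1, 12, 14} score their size, anything else
# is penalized -- so SFS should recover exactly that subset.
useful = {1, 12, 14}
score = lambda subset: len(set(subset) & useful) - len(set(subset) - useful)
print(sfs(list(range(1, 15)), score))
```

In the thesis the scoring function is the cross-validated AUC of a fuzzy model built on the candidate subset; here it is replaced by a cheap surrogate so the loop itself is testable.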
As in the model creation with no feature
selection, the 10-fold cross-validation method
was applied with all the available data in the
model assessment. Models assessing the
relation of the set of features with the outcome
were iteratively built and evaluated from 10
random configurations of the dataset. Once
again, in each fold patients were randomly
divided into two datasets maintaining the
proportion between classes. We finally obtain
the mean and standard deviation of the model
performance over the ten iterations as the
resulting performance of the model.
With SBS, analogously to SFS, features are
sequentially removed from a full candidate set
that is used to create models. The combination
that provides the highest performance (highest
AUC value) indicates which feature was left out
of the set used to create the model, and that is
the one excluded from the final set of features.
This process of sequentially removing features
from the candidate set is repeated until there is
no improvement in the model performance,
meaning that no other feature should be
excluded from the final set. In this final step, we
end up with a list of fewer features than the full
candidate set, which provided the highest AUC
value possible for this method.
The model assessment in this method is the
same as in SFS, and the final performance
values are once again obtained as the mean
over the ten iterations of the model assessment.
For the modified SFS and modified SBS, the
main difference is that, when the modified
methods finish, before the model assessment,
we end up with different sets of features that
result in different performance values. This is
because in each step new search branches are
created, which expands the search field, and in
each new branch further branches can be
created.
The set with the best AUC value is the one
used in the model assessment. Again, ten
iterations of 10-fold cross-validation are
performed in order to obtain mean assessment
values for the chosen set of features.
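The repeated stratified 10-fold cross-validation used throughout the assessment (mean and standard deviation over the iterations) can be sketched as follows. The fold-level evaluation function is a placeholder, not the thesis's fuzzy-model evaluation:

```python
import random
import statistics

def stratified_folds(labels, k, rng):
    """Split sample indices into k folds, keeping class proportions."""
    folds = [[] for _ in range(k)]
    for cls in set(labels):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for j, i in enumerate(idx):
            folds[j % k].append(i)
    return folds

def cv_mean_std(labels, evaluate, k=10, iterations=10, seed=0):
    """Mean and std of a fold-level metric over repeated k-fold CV."""
    rng = random.Random(seed)
    scores = []
    for _ in range(iterations):
        for test_idx in stratified_folds(labels, k, rng):
            scores.append(evaluate(test_idx))
    return statistics.mean(scores), statistics.stdev(scores)

# Placeholder metric: fraction of positives in the test fold, which the
# stratification should keep equal to the overall class prevalence.
labels = [1] * 30 + [0] * 70
mean, std = cv_mean_std(labels, lambda idx: sum(labels[i] for i in idx) / len(idx))
print(round(mean, 2))  # 0.3
```

In the real pipeline `evaluate` would train a model on the other nine folds and return its AUC (or accuracy, sensitivity, specificity) on the held-out fold.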
III. Results
For the first dataset:
Since the Xie-Beni index and the Partition
Coefficient index indicated different numbers of
clusters, we decided to test models for both
results. The results obtained for the models
created with 2 and 3 clusters are presented in
Table 2.
Table 2 - Performance of the created models

Performance Measure   2 Clusters    3 Clusters
AUC                   0.57±0.04     0.58±0.05
Accuracy (%)          55.61±2.89    56.01±4.10
Sensitivity (%)       53.95±2.72    54.28±3.24
Specificity (%)       56.04±3.12    56.87±4.13
Since the results were disappointing (low
capability to predict the patients' heart failure),
the only way to improve them was to add new
features to the database. The features contained
in this database were the only ones measured
before the echocardiogram, so we changed to
another dataset from MIMIC II, of patients who
were in the ICU and for whom information about
the presence or absence of heart failure is
available.
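For both datasets the number of clusters was chosen with the Xie-Beni and Partition Coefficient indices. A minimal sketch of the two indices using their standard formulas (this is textbook math, not the thesis code; lower Xie-Beni is better, higher Partition Coefficient is better):

```python
def partition_coefficient(U):
    """PC = (1/N) * sum_i sum_k u_ik^2; equals 1 for a crisp partition."""
    n = len(U[0])
    return sum(u ** 2 for row in U for u in row) / n

def xie_beni(U, centers, data, m=2):
    """XB = sum_i sum_k u_ik^m * (x_k - v_i)^2
            / (N * min_{i != j} (v_i - v_j)^2), for 1-D data."""
    n = len(data)
    num = sum(U[i][k] ** m * (data[k] - centers[i]) ** 2
              for i in range(len(centers)) for k in range(n))
    sep = min((centers[i] - centers[j]) ** 2
              for i in range(len(centers))
              for j in range(len(centers)) if i != j)
    return num / (n * sep)

# A crisp partition of two well-separated 1-D clusters: compact and
# well separated, so PC is maximal and XB is small.
data = [0.0, 0.1, 1.0, 1.1]
U = [[1, 1, 0, 0],   # memberships in the cluster with center 0.05
     [0, 0, 1, 1]]   # memberships in the cluster with center 1.05
print(partition_coefficient(U))        # 1.0
print(xie_beni(U, [0.05, 1.05], data))
```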
For the second dataset:
For this dataset, the Xie-Beni index and the
Partition Coefficient index also indicated different
numbers of clusters, so we decided to test
models for both results. The results obtained for
the models created with 2 and 3 clusters are
presented in Table 3.
Table 3 - Performance of the created models with
no FS

Performance Measure   2 Clusters    3 Clusters
AUC                   0.75±0.05     0.75±0.05
Accuracy (%)          72.11±4.36    72.21±5.10
Sensitivity (%)       72.09±4.29    72.38±5.24
Specificity (%)       72.13±4.56    72.07±5.13
From Table 3 it is possible to see that both
models have similar performance. The model
with 2 clusters has slightly lower values on the
performance measures than the model with 3
clusters, but it also has lower standard
deviations, which makes it more reliable.
Because of this, the following models were all
created with 2 clusters.
Results for SFS method:
In this method the improvement required for a
feature to be added to the final set of features
has no restrictions, so the slightest improvement
is sufficient for a feature to be added.
Table 4 - Performance of the created model using
SFS as FS method

Performance Measure   SFS method
AUC                   0.76±0.05
Accuracy (%)          72.01±4.89
Sensitivity (%)       71.95±5.12
Specificity (%)       72.04±4.82
Selected Features     1,2,5,6,8,10,12,14
In Table 4 it is possible to verify a slight
improvement in terms of AUC, which was our
principal performance measure, compared with
the model with no feature selection. As
expected, the number of features is smaller than
in the model with no FS, which was also our
objective.
Results for SBS method:
In this method features are removed from the
total set of features in order to obtain the final
set. As in the previous method, the slightest
improvement in AUC causes a feature to be
excluded from the final set. In this case that is
desirable, because the fewer the features, the
less information is needed to predict whether or
not a patient has HF, so there is no reason to
impose a minimum improvement.
Table 5 - Performance of the created model using
SBS as FS method

Performance Measure   SBS method
AUC                   0.76±0.05
Accuracy (%)          71.23±4.81
Sensitivity (%)       71.08±4.87
Specificity (%)       71.18±4.92
Selected Features     1,10,12,13,14
Comparing the results of SFS and SBS
(Table 5), the models have similar values for all
the performance measures; the big difference
between them is the number of features
necessary to obtain those values. The SBS
method used fewer features to obtain nearly the
same results.
Results for MSFS method:
Since in each step more than one combination
of features could lead to a better performance, in
this method "branches" are created with different
combinations of features to be tested, which
expands the search field.

Table 6 - Performance of the created model using
MSFS as FS method

Performance Measure   MSFS method
AUC                   0.77±0.01
Accuracy (%)          73.79±2.06
Sensitivity (%)       74.46±3.84
Specificity (%)       73.25±0.62
Selected Features     1,4,11,12,14

The result of the best branch is presented in
Table 6. It is possible to see that this
modification of the SFS algorithm leads to an
improvement in the results. Instead of having
more features, as in SFS, we obtained a model
created with the same number of features as in
the SBS method, but with a substantial
improvement in the performance measures.
Results for MSBS method:
As in MSFS, different branches with different
combinations of features are tested, and the one
that provided the best result is shown in Table 7.

Table 7 - Performance of the created model using
MSBS as FS method

Performance Measure   MSBS method
AUC                   0.77±0.04
Accuracy (%)          73.84±1.69
Sensitivity (%)       73.80±1.76
Specificity (%)       73.87±1.64
Selected Features     1,6,10,11,12,14

Comparing the results of this last method with
the previous FS methods, it is possible to see
that this modification of the SBS algorithm leads
to improvements in the results. Although it uses
more features than the SBS method, there is an
improvement in the final result; compared with
the MSFS method, it uses one more feature and
the final values are very similar.
Figure 4 - Graphical view of the performance
measures of each model

In Figure 4 the results obtained for the
different models, with and without FS, are
presented for easy comparison.
Comparison of the obtained results with a clinical
decision method
It is known that physicians evaluate a
patient's heart failure tendency based on the
LVEF value and heart rate. In order to compare
our results with those of a clinical decision
method, we created a model that replicates
physicians' decisions.
Since they use both LVEF and HR, the
created model assumes that a patient has heart
failure if either of these two values is outside the
default range. For LVEF, we assumed that a
patient with a value over 45% is safe from HF
[7], [8]; for heart rate, a stable patient has
between 60 and 100 beats per minute.
With this model we calculated the accuracy,
sensitivity and specificity on our dataset. The
confusion matrix of this model is presented in
Table 8.
Table 8 - Resulting confusion matrix for the
medical model obtained, based on LVEF and heart
rate

                      Actual Positive   Actual Negative
Predicted Positive    255               213
Predicted Negative    166               302
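The clinical-decision baseline and the metrics derived from the confusion matrix in Table 8 can be sketched as follows (the 45% LVEF and 60-100 bpm heart-rate cut-offs are the ones stated above; the counts are taken directly from Table 8):

```python
def clinical_rule(lvef: float, heart_rate: float) -> bool:
    """Predict heart failure if LVEF <= 45% or HR is outside 60-100 bpm."""
    return lvef <= 45 or not (60 <= heart_rate <= 100)

# Counts from Table 8 (predicted class x actual class).
tp, fp = 255, 213
fn, tn = 166, 302

accuracy = (tp + tn) / (tp + fp + fn + tn)
sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
print(f"{100*accuracy:.2f} {100*sensitivity:.2f} {100*specificity:.2f}")
# 59.51 60.57 58.64 -- matching the values reported for the clinical model
```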
In Table 9 it is possible to compare the
values obtained for accuracy, sensitivity and
specificity across our different models and the
one based on the clinical decision method.
Based on these values, MSBS is the most
reliable model, obtaining a higher accuracy with
a lower standard deviation, while having results
similar to MSFS.
Table 9 - Performance of all models compared with
the Medical model based on LVEF and Heart Rate

Model               Accuracy (%)   Sensitivity (%)   Specificity (%)   Selected Features
No FS               72.11±4.36     72.09±4.29        72.13±4.56        All features
SFS                 72.01±4.89     71.95±5.12        72.04±4.82        1,2,5,6,8,10,12,14
SBS                 71.23±4.81     71.08±4.87        71.18±4.92        1,10,12,13,14
MSFS                73.79±2.06     74.46±3.84        73.25±0.62        1,4,11,12,14
MSBS                73.84±1.69     73.80±1.76        73.87±1.64        1,6,10,11,12,14
Clinical decision   59.51          60.57             58.64             1,14
IV. Conclusions
In the present thesis, fuzzy systems were
applied to predict heart failure among ICU
patients. Since no previous work exists on this
theme, there are no results against which we
can compare. We combined fuzzy modeling with
different feature selection methods, which led us
to different results.
Our main goal was to predict the occurrence of
heart failure in ICU patients, but we also had in
mind that fewer features is better in terms of
required equipment and associated costs, and
that fewer features also means that patients are
less burdened.
Comparing both databases, the first one
contained many more patients, which could lead
to better performance, since more data was
available to train the models, but it offered few
features. The second one provides more
information about each patient, but after
pre-processing the data we ended up with fewer
patients than in the first one.
With this thesis, we arrived at a list of
features that provides accurate predictions.
Comparing the results obtained across the
different FS methods, we conclude that Modified
Sequential Forward Selection found the features
that provide the best performance in terms of
AUC.
These features are:
• Heart Rate (1)
• Respiratory Rate (4)
• Diastolic Non-invasive Blood Pressure (11)
• Age (12)
• LVEF value (14)
It is important to note that, across all the
FS methods, three features are always chosen:
LVEF value, heart rate, and age.
These three features are physiologically
related in the context of heart failure. If a person
has a low LVEF, the heart is not pumping enough
blood to the body, so the heart rate increases in
order to supply the necessary oxygen to the cells.
If, on top of this heart rate increase, we consider
the patient's age, the probability of heart failure
occurring increases with aging, because an older
heart adapts less easily to changes in blood
perfusion.
Given the values obtained, it is difficult to
claim that this tool could replace physician
knowledge, which in any case is not our goal;
rather, it is meant to help physicians prevent the
occurrence of heart failure. As referred to
previously, heart failure with preserved ejection
fraction has a significant number of occurrences,
as can be seen in Figure 1, so this model is a
helpful tool for physicians, because the LVEF
value by itself is not sufficient to predict heart
failure.
With the model based on the clinical
decision method, we were able to compare our
results against a model that replicates the
physicians' decision. From this comparison, we
conclude that we obtained a model that would
help physicians by relating more than two
features, thereby improving their decisions.
Given the work developed, and since so far
no models exist to predict heart failure in ICU
patients, we can say that we obtained
satisfactory results.
V. References
[1] Rainu Kaushal, David Blumenthal, Eric G.
Poon, Ashish K. Jha, Calvin Franz, Blackford
Middleton, John Glaser, Gilad Kuperman,
Melissa Christino, Rushika Fernandopulle,
Joseph P. Newhouse, and David W. Bates. The
costs of a national health information network.
Annals of Internal Medicine, 143(3): 165–173,
2005.
[2] LJ Shaw, ED Peterson, K Kesler, V
Hasselblad, and RM Califf. A meta-analysis of
pre-discharge risk stratification after acute
myocardial infarction with stress
electrocardiographic, myocardial perfusion, and
ventricular function imaging. American Journal of
Cardiology, 78: 1327–1337, 1996.
[3] GJ Taylor, JO Humphries, ED Mellits, et al.
Predictors of clinical course, coronary anatomy
and left ventricular function after recovery from
acute myocardial infarction. Circulation, 62–70,
1980.
[4] HD White, RM Norris, MA Brown, et al. Left
ventricular end-systolic volume as the major
determinant of survival after recovery from
myocardial infarction. Circulation, 76: 44–51,
1987.
[5] Vinay Kumar, Abul K. Abbas, and Jon Aster.
Robbins and Cotran Pathologic Basis of Disease.
Philadelphia, PA: Saunders/Elsevier, 2009.
[6] Simon O'Connor. Examination Medicine (The
Examination). Edinburgh: Churchill Livingstone,
2009.
[7] JP Curtis, SI Sokol, Y Wang, SS Rathore, DT
Ko, F Jadbabaie, EL Portnay, SJ Marshalko, MJ
Radford, and HM Krumholz. The association of
left ventricular ejection fraction, mortality, and
cause of death in stable outpatients with heart
failure. Journal of the American College of
Cardiology, 42(4): 736–742, 2003.
[8] RS Bhatia, JV Tu, DS Lee, PC Austin, J Fang,
A Haouzi, Y Gong, and PP Liu. Outcome of heart
failure with preserved ejection fraction in a
population-based study. New England Journal of
Medicine, 355(3): 260–269, July 20, 2006.
[9] João M. C. Sousa and U. Kaymak. Fuzzy
Decision Making in Modeling and Control. World
Scientific Publishing Company, 2002.
[10] S. N. Ghazavi and T. W. Liao. Medical data
mining by fuzzy modeling with selected features.
Artificial Intelligence in Medicine, 43(3): 195–206,
2008.
[11] A. S. Fialho, F. Cismondi, S. M. Vieira, J. M.
C. Sousa, S. R. Reti, M. D. Howell, and S. N.
Finkelstein. Predicting outcomes of septic shock
patients using feature selection based on soft
computing techniques. In Proc. of the IPMU 13th
International Conference, pages 65–74, 2010.
[12] Vojislav Kecman. Learning and Soft
Computing: Support Vector Machines, Neural
Networks, Fuzzy Logic Systems. MIT Press,
2001.
[13] Rúben D. Pereira, João M.C. Sousa, Susana
M. Vieira, Shane Reti, and Stan Finkelstein.
Modified sequential forward selection applied to
predicting septic shock outcome in the intensive
care unit. Advances in Intelligent Systems and
Computing, 190: 469–477, 2013.
[14] Yuangang Tang, Fuchun Sun, and Zengqi
Sun. Improved validation index for fuzzy
clustering. Proceedings of the American Control
Conference, page 1120, 2005.
[15] J.C. Bezdek. Cluster validity with fuzzy sets.
Journal of Cybernetics, 3: 58–73, 1973.
[16] Susana M. Vieira, Luís F. Mendonça,
Gonçalo J. Farinha, and João M.C. Sousa.
Modified binary PSO for feature selection using
SVM applied to mortality prediction of septic
patients. Applied Soft Computing, 13(8): 3494–
3504, 2013.