Intelligent Modeling To Predict Ejection Fraction From Echocardiographic Reports

José Pedro Gonçalves Fernandes da Mota
Email: [email protected]

Abstract
Left ventricular ejection fraction is one of the most common physiological parameters used by clinicians to monitor cardiac function. Patients with preserved ejection fraction are expected to have a lower risk of adverse cardiac events. However, recent studies have shown that the incidence of mortality and cardiac events in these patients still constitutes a major concern. The aim of this thesis is to develop a model to predict heart failure in patients in the intensive care unit (ICU). Two sets of data were extracted: (i) data referring to physiological parameters measured before an echocardiogram, and (ii) data recorded during the stay of the patient in the ICU. Fuzzy modeling was applied to the datasets to develop the inference systems. The model created using the first dataset revealed a low capability of the parameters acquired at the time of the echocardiographic exam to predict heart failure. The second dataset delivered significantly better performance, which was further improved using a range of sequential feature selection methods, namely: sequential forward selection (SFS), sequential backward selection (SBS), modified sequential forward selection (MSFS) and modified sequential backward selection (MSBS). These methods improved the performance measures when compared with the model that did not use any feature selection. The two feature selection methods that provided the best results (AUC higher than 0.77 and accuracy up to 0.73) were MSFS and MSBS, using only 5 and 6 of the 14 features available in the second dataset, respectively. The features common to all sets resulting from feature selection were the left ventricular ejection fraction (LVEF) value, heart rate and age. Results were compared with those of a model intended to mimic the real clinical decision. This model obtained an accuracy of 59.5%, with a sensitivity of 60.6% and a specificity of 58.6%. The fuzzy modeling approach suggested in this thesis allows non-linear analysis of several variables, which could improve critical clinical decisions. To the best knowledge of the author, this is the first approach to predicting heart failure using non-linear multivariate analysis, and the results obtained suggest that the clinical decision process can be improved by using such methods.

Keywords: Heart Failure, Fuzzy Modeling, Feature Selection, Sequential Forward Selection, Sequential Backward Selection, Modified Sequential Forward Selection, Modified Sequential Backward Selection

I. Introduction
In the past, healthcare practitioners believed that patient outcomes depended almost exclusively on the training, capabilities and skills of individual physicians, on patient characteristics, and on the specifics of the illness or procedures being performed. Nowadays, healthcare decision-makers have begun to look toward engineering systems concepts and approaches for solutions to challenging problems and for ways to improve quality and reduce costs. Healthcare delivery is one of the most technologically intense and data-rich industries [1]. Access to large medical databases enables researchers to build models with the objective of predicting illnesses. In this work, the objective was to create a model that could predict heart failure in patients in intensive care units (ICU).
Heart failure
Among all illnesses, Heart Failure (HF) is a common, costly, disabling, and potentially deadly condition. In developed countries, around 2% of adults suffer from heart failure, but in those over the age of 65, this increases to 6-10% [2]. Heart failure, often called congestive heart failure (CHF), occurs when the heart is unable to provide sufficient pump action to maintain blood flow to meet the needs of the body. The condition is diagnosed by physical examination of the patient and confirmed with an echocardiographic exam. The echocardiogram has become routinely used in the diagnosis, management, and follow-up of patients with any suspected or known heart disease, and it is one of the most widely used diagnostic tests in cardiology. It can provide helpful information, including the size and shape of the heart (internal chamber size quantification), pumping capacity, and the location and extent of any tissue damage. An echocardiogram can also give physicians other estimates of heart function, such as a calculation of the cardiac output, ejection fraction (EF), and diastolic function (how well the heart relaxes). The ejection fraction is the volumetric fraction of blood pumped out of the ventricle with each heartbeat: a value of 60 percent means that 60 percent of the total amount of blood in the ventricle is pushed out with each heartbeat. The heart has four valves that ensure blood flows in only one direction. From the left ventricle, the blood is pumped into a network of arteries that carry it throughout the body. Measurement of left ventricular ejection fraction (LVEF) is a well-established clinical parameter with essential diagnostic, therapeutic and prognostic implications, particularly in the settings of coronary artery disease (CAD) and HF [2], [3], [4]. Healthy individuals typically have an EF between 50% and 65% [5]. However, normal values depend upon the modality used to calculate the EF, and some sources consider an EF of 55-75% to be normal [6]. Damage to the heart muscle (myocardium), such as that sustained during myocardial infarction, in atrial fibrillation, or in the many etiologies of cardiomyopathy, compromises the heart's ability to perform as an efficient pump and therefore reduces the ejection fraction. Healthy older adults adapt favorably as the ventricles become less compliant, and, aided by good genetics and a healthy lifestyle, are routinely shown echocardiographically to have an EF between 55% and 85%. The EF is one of the most important predictors of prognosis: patients with significantly reduced ejection fractions typically have poorer prognoses. However, recent studies have indicated that a preserved ejection fraction does not mean freedom from risk. Among HF patients in sinus rhythm, a higher LVEF was associated with a linear decrease in mortality up to an LVEF of 45%; increases above 45% were not associated with further reductions in mortality [7], [8]. This entity has been termed "heart failure with preserved ejection fraction" (HFPEF). As seen in Figure 1, HF occurrence decreases with LVEF up to 45%, followed by an increase in HF occurrence around an LVEF of 55%. This figure represents HF occurrence in patients from a database used in this thesis, which clearly includes patients with HFPEF.
HFPEF is also referred to as diastolic heart failure (DHF) and currently accounts for more than 50% of all heart failure patients. Though the guidelines for diagnosing DHF have been updated successively in recent years, the diagnosis is still highly based on imaging [8].

Figure 1 - Percentage of patients with HF along the LVEF values (occurrence of heart failure, in %, versus LVEF value).

Data sets used to predict Heart Failure
The aim of this work is to unravel the sets of variables that achieve the best performance in predicting DHF. To do so, two different data sets were tested:
1. The physiologic data acquired at the time of the echocardiographic exam. This data set was obtained when the patient in the ICU was about to undergo an echocardiogram, and it consists of textual medical reports with information on all the features measured before the exam. These textual medical reports are contained in the MIMIC II database. The set of eight features obtained from these reports is:
• LVEF value
• Indication (clinical state of the patient)
• Height
• Weight
• Body surface area (BSA)
• Systolic Non-invasive Blood Pressure (sNBP)
• Diastolic Non-invasive Blood Pressure (dNBP)
• Heart rate (HR)
The outcome is whether or not the patient had Diastolic Heart Failure.
2. The most frequently acquired physiologic variables in the intensive care unit (ICU). This data set consists of features recorded along the patient's stay in the ICU. We extracted a total of 15 numerical physiological variables from the MIMIC II database, using a standard variable selection process that seeks to maximize the amount of data within any given variable while maintaining robust numbers and statistical power. The extracted features are presented in Table 1, and the outcome is whether or not the patient had Diastolic Heart Failure.

Table 1 - List of features from the second dataset and corresponding number
#   Feature
1   Heart Rate
2   Systolic Non-invasive Blood Pressure
3   Mean Non-invasive Blood Pressure
4   Respiratory Rate
5   Oxygen Saturation in the blood
6   Temperature (ºC)
7   Glucose
8   Hematocrit
9   Platelets
10  Potassium
11  Diastolic Non-invasive Blood Pressure
12  Age
13  Weight on admission
14  LVEF value

The differences between these two data sets lie in the number of available features and in how these features were obtained: in the first, features were recorded immediately before the exam, whereas in the second they are mean values computed from time series data. Both data sets were extracted from the MIMIC II database.
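Since the second data set's feature values are per-patient means of the ICU time series described above, the following minimal sketch illustrates that aggregation step. It assumes a hypothetical long-format extract with patient_id, variable and value columns; the actual MIMIC II tables and column names differ.

```python
import pandas as pd

# Hypothetical long-format extract of ICU chart events:
# one row per (patient, variable) measurement taken during the stay.
events = pd.DataFrame({
    "patient_id": [1, 1, 1, 2, 2],
    "variable":   ["Heart Rate", "Heart Rate", "Glucose", "Heart Rate", "Glucose"],
    "value":      [88.0, 92.0, 110.0, 61.0, 95.0],
})

# Collapse each variable's time series into a single per-patient mean,
# producing one row per patient and one column per feature (cf. Table 1).
features = events.pivot_table(index="patient_id", columns="variable",
                              values="value", aggfunc="mean")
print(features)
```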
II. Methods
This work made use of the MIMIC II (Multiparameter Intelligent Monitoring in Intensive Care) database, which encompasses a diverse and very large population of ICU patients from the Beth Israel Deaconess Medical Center, in Boston, dating from 2001.

Data extraction
The methodology used to extract the previously referred features from the first data set consists in applying text mining to the textual reports written by physicians. We focus only on physiological parameters, because the textual reports contain other features that depend on medical decisions/assessments, which might bias the results. In the textual reports the reference to LVEF can be either textual (e.g. "normal LVEF") or numeric, but only numeric values were used. When a numeric interval was reported for LVEF, which is common, the mean value of the interval was used.

Figure 2 - Patients selection flowchart for the first dataset

Pre-processing
Having a new data set with the numeric values, data mining was applied, discarding the patients that have no information about the LVEF value, the ones that have no ICU stay identification (since we want to predict HF for patients that are in the ICU), and the ones that have missing data in the physiologic values. After these exclusions, it was necessary to search for outliers, using the IQR method, to ensure that those patients had no influence on model development. Usually a factor of 1.5 times the IQR is used, but we adopted a more conservative approach and used a factor of 3, excluding only the "extreme outliers"; applying the factor of 1.5 would exclude many more patients that can actually contribute to our models. To the resulting data sets we applied normalization using the min-max procedure, in which the minimum value of a feature is mapped to 0 and the maximum value to 1. For the two obtained datasets, the flowcharts in Figure 2 and Figure 3 summarize the pre-processing steps applied and the resulting amount of available data, respectively.

Figure 3 - Patients selection flowchart for the second dataset

Modeling
Non-linear modeling techniques are usually designated as "black-box" models, since they do not need prior knowledge of the system [9]. Given that the majority of real-world systems are complex and only partly understood, this is extremely useful. Within medical-related classification problems, several fuzzy-based models have shown performance comparable to other non-linear modeling techniques [10], [11]. Fuzzy modeling is particularly appealing because it not only provides a transparent, non-crisp model, but also a linguistic interpretation in the form of rules and logical connectives. More than "black-box" models, they are "grey-box" models [12], since their rule-based nature allows for a linguistic description of knowledge. In this work we used first-order Takagi-Sugeno (TS) fuzzy models to perform classification, as they are usually more suitable for the accurate modeling and identification of nonlinear systems. These models are described by rules of the type:

R_i: If x_1 is A_{i1} and ... and x_n is A_{in} then y_i = a_i^T x + b_i

where R_i denotes the i-th rule, x = [x_1, ..., x_n]^T is the vector of antecedent variables, A_{i1}, ..., A_{in} are the antecedent fuzzy sets (whose conjunction forms the multidimensional antecedent fuzzy set A_i), y_i is the one-dimensional consequent variable of the i-th rule, a_i is a vector of parameters and b_i is a scalar offset that relate the antecedent fuzzy sets to the consequent. The number of classes of similar objects (clusters) that exist within the dataset determines the number of these if-then rules. Each cluster can be treated collectively as one group and so may be explained by a common rule. Several clustering techniques have been presented in the literature, and the one used in this work was fuzzy c-means (FCM) clustering. Each cluster creates one rule. Upon evaluation of the model, all rules are activated, each according to the membership degree of the sample to the corresponding cluster, and a continuous real output is returned as the weighted sum of the outputs of each rule. The membership functions (fitted Gaussian functions) generated by fuzzy modeling allow interpretation of the rules when coupled with the information, built from the data, contained in the partition matrix. Though this interpretation is very simplistic, it allows unraveling the most striking rules in order to seek expert understanding and validation.
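As an illustration of this modeling scheme, the sketch below clusters training data with fuzzy c-means, fits a Gaussian membership function and a linear consequent per cluster, and evaluates the membership-weighted rule outputs. It is a minimal, self-contained approximation of the procedure described above (the thesis itself used the MATLAB fcm/genfis3 functions); the function names, the random toy data and the weighted least-squares consequent fit are illustrative assumptions.

```python
import numpy as np

def fcm(X, n_clusters, m=2.0, n_iter=100, seed=0):
    """Basic fuzzy c-means: returns cluster centers and the partition matrix U."""
    rng = np.random.default_rng(seed)
    U = rng.random((n_clusters, len(X)))
    U /= U.sum(axis=0)                                 # memberships of each sample sum to 1
    for _ in range(n_iter):
        W = U ** m
        centers = W @ X / W.sum(axis=1, keepdims=True)
        d = np.linalg.norm(X[None, :, :] - centers[:, None, :], axis=2) + 1e-12
        U = 1.0 / (d ** (2 / (m - 1)))
        U /= U.sum(axis=0)
    return centers, U

def fit_ts_model(X, y, n_clusters=2):
    """First-order Takagi-Sugeno model: one rule per FCM cluster."""
    centers, U = fcm(X, n_clusters)
    sigmas = np.array([np.sqrt(np.average((X - c) ** 2, axis=0, weights=u))
                       for c, u in zip(centers, U)])   # Gaussian widths per cluster
    Xe = np.hstack([X, np.ones((len(X), 1))])          # append column for the offset b_i
    consequents = []                                   # a_i, b_i per rule (weighted least squares)
    for u in U:
        Wd = np.diag(np.sqrt(u))
        theta, *_ = np.linalg.lstsq(Wd @ Xe, Wd @ y, rcond=None)
        consequents.append(theta)
    return centers, sigmas, np.array(consequents)

def predict_ts(model, X):
    """Continuous output: membership-weighted sum of the linear rule outputs."""
    centers, sigmas, consequents = model
    Xe = np.hstack([X, np.ones((len(X), 1))])
    # Rule firing strength: product of per-feature Gaussian memberships
    w = np.array([np.exp(-0.5 * ((X - c) / (s + 1e-12)) ** 2).prod(axis=1)
                  for c, s in zip(centers, sigmas)])
    w /= w.sum(axis=0) + 1e-12
    rule_out = consequents @ Xe.T                      # each rule's linear output
    return (w * rule_out).sum(axis=0)

# Tiny usage example with random data standing in for the normalized ICU features
rng = np.random.default_rng(1)
X = rng.random((50, 3)); y = (X[:, 0] + X[:, 1] > 1).astype(float)
model = fit_ts_model(X, y, n_clusters=2)
scores = predict_ts(model, X)                          # thresholded later to classify HF vs. no HF
```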
Feature Selection
Most databases contain redundant and irrelevant features that will only mislead the modeling process. A feature is said to be relevant if the model performance increases when it is added to the feature vector; it is considered redundant if it is highly correlated with other features. Once feature selection is applied, the majority of these features are discarded, which should increase the quality of the database. As said before, in real problems the available data have high dimensionality, so feature selection (FS) techniques make it possible to reduce the available data to a manageable size. This reduction improves training and utilization speed in cases where the computational complexity of the learning algorithms is prohibitive. Nevertheless, in this work the main objective of using FS algorithms is to improve the predictor performance.

The class of methods used in this work was the wrapper method. It was chosen because it involves training a predictor as part of the selection procedure, and the performance of the predictor with the selected subset is correlated with the objective function used during the search. The subset evaluation is made at two different phases, during and after FS: the search process is guided by the value of an objective function that measures the quality of the solutions, and the performance of the final subset is evaluated with a validation set by using cross-validation. A very important part of the wrapper approach is therefore the objective function, which should incorporate the goals that are meant to be achieved with FS (e.g. model accuracy, number of features). The performance of the models was evaluated in terms of the area under the receiver operating characteristic curve (AUC), accuracy (correct classification rate), sensitivity (true positive classification rate) and specificity (true negative classification rate).

The main goal of feature selection is to find an optimal subset of the full set of features that provides relevant information, in order to match or improve the accuracy of the classifiers. In this thesis four different methods were used: Sequential Forward Selection (SFS), Sequential Backward Selection (SBS), a modified algorithm based on SFS called Modified Sequential Forward Selection (MSFS), and a modified algorithm based on SBS called Modified Sequential Backward Selection (MSBS). The SFS method sequentially adds features to the best set previously evaluated until a stopping criterion is achieved (e.g. no improvement in performance); it considers the best set obtained in the previous step to advance to the next step. Analogously, the SBS method removes features from the total set of features until the stopping criterion is achieved. The MSFS and MSBS are modifications of the previous methods in which two criteria were added, based on the performance of the best sets evaluated at each step, allowing several sets to be selected for evaluation at the next step and restricting the advance of the search based on the improvement in performance [13]. The model was assessed in terms of AUC, accuracy, sensitivity and specificity.
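A minimal sketch of these performance measures, computed from the continuous model output and a decision threshold, follows; scikit-learn's roc_auc_score is assumed here as one way to obtain the AUC, while accuracy, sensitivity and specificity follow the definitions given above.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def performance(y_true, scores, threshold=0.5):
    """AUC from the continuous scores; accuracy/sensitivity/specificity after thresholding."""
    y_pred = (scores >= threshold).astype(int)
    tp = np.sum((y_pred == 1) & (y_true == 1))
    tn = np.sum((y_pred == 0) & (y_true == 0))
    fp = np.sum((y_pred == 1) & (y_true == 0))
    fn = np.sum((y_pred == 0) & (y_true == 1))
    return {
        "auc": roc_auc_score(y_true, scores),
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),   # true positive classification rate
        "specificity": tn / (tn + fp),   # true negative classification rate
    }

# Example with random scores standing in for the fuzzy model output
rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, 100)
scores = rng.random(100)
print(performance(y_true, scores))
```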
Study design
For the first dataset: after data extraction from the MIMIC II database and pre-processing of the obtained dataset, we identified the best number of clusters in which to divide the data, using the fcm function from the MathWorks Fuzzy Logic Toolbox. Using this function we obtained the membership degree of each data sample. Since we want to find the optimum number of clusters, we needed cluster validation indices; the indices used were the Xie-Beni index [14] and the Partition Coefficient index [15]. For the first, we search for the number of clusters that gives the minimum index value, whereas for the Partition Coefficient index we search for the number of clusters that gives the maximum value. After obtaining the optimum number of clusters, we divided the data to create the model. This data set was divided into balanced train/test sets of 70%/30%, meaning that 70% of the patients with HF (1016 patients) and 70% of the patients without HF (2087 patients) were included in the training set; analogously, 30% of the patients with HF (435 patients) and 30% of the patients without HF (894 patients) were included in the test set. With the optimum number of clusters and the data divided into train/test sets, we built a TS Fuzzy Inference System (FIS) using the fuzzy c-means algorithm, through the genfis3 function, also from the MathWorks Fuzzy Logic Toolbox. To obtain the best model performance it was necessary to find the optimum threshold value. To calculate this value, an iterative process is used in which models are evaluated with thresholds from 0.2 to 0.8 in steps of 0.01, and the best result is the one with the minimum difference between sensitivity and specificity. This threshold value was then used in the model assessment.
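A sketch of the threshold search just described: thresholds from 0.2 to 0.8 in steps of 0.01 are evaluated on the continuous model output, and the one minimizing the difference between sensitivity and specificity is kept. Variable names are illustrative.

```python
import numpy as np

def best_threshold(y_true, scores):
    """Pick the threshold (0.20..0.80, step 0.01) minimizing |sensitivity - specificity|."""
    best_t, best_gap = 0.5, np.inf
    for t in np.arange(0.20, 0.80 + 1e-9, 0.01):
        y_pred = (scores >= t).astype(int)
        sens = np.mean(y_pred[y_true == 1] == 1)   # true positive rate
        spec = np.mean(y_pred[y_true == 0] == 0)   # true negative rate
        gap = abs(sens - spec)
        if gap < best_gap:
            best_t, best_gap = t, gap
    return best_t

# Example with random scores standing in for the TS model output
rng = np.random.default_rng(0)
y = rng.integers(0, 2, 200); s = rng.random(200)
print(best_threshold(y, s))
```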
For the second dataset: as for the first dataset, we started with data extraction, followed by pre-processing, ending with a usable dataset. Finding the optimum number of clusters for this dataset, using the fcm function, was the next step, with the same indices as for the first dataset. In order to have a baseline to compare with, we started with a model with no FS, and then we applied the different FS methods described previously.

Create model with no FS - Since we want to create the model, test it, and finally validate it, we divided the data into two balanced sets of 50%, where the first one is used to build and test the model. Given the number of clusters, we built the TS inference models, based on the regular FCM algorithm, with the first 50% of the data set. Since we ended up with just 97 patients with HF in each data set, we built the model based on 2-fold cross-validation, which guarantees that all the available data is used to train and test the model. The method chosen to obtain the best threshold value was the same as in the first-dataset model. Instead of using the second half of the database, the entire database (936 patients) was used in 10-fold cross-validation to validate the model, as done in [16]; had we used only the second half of the database, k-fold cross-validation would leave few patients with heart failure in each fold. In other words, all the available data was divided into 10 different folds, and the performance of the model was tested using iteratively a different fold as test set. This process was repeated ten times, and the final values of the model performance were obtained as the mean over the ten iterations.

Models created with feature selection - The previous model was created using all the available features, with an associated performance for that set of features. With FS, we intend to obtain an even better performance while reducing the number of features used to build the model.

For SFS, in the first step, models are created with a single feature each. To find the "best feature", all models are assessed and the one with the best performance, measured by the AUC value, indicates the feature to start with. Fixing this first feature and adding one new feature from the remaining ones, we created models with the different combinations of two features. The combination of two features yielding the best performance indicates which feature is added to the one already selected. A new feature is only added to the "best feature set" if the performance value is higher than the one previously obtained. This process stops when no remaining feature, if added to the set of selected features, produces a model with better performance. As in the model created with no feature selection, 10-fold cross-validation was applied over all the available data in the model assessment. Models assessing the relation of the set of features with the outcome were iteratively built and evaluated over 10 random configurations of the dataset. Once again, in each fold, patients were randomly divided into two datasets maintaining the proportion between classes. We finally obtain the mean and standard deviation of the model performance over the ten iterations as the resulting performance of the model.

With SBS, analogously to SFS, features are sequentially removed from a full candidate set used to create models. The combination that provides the highest performance (highest AUC value) indicates which feature was left out of the set used to create the model, and that feature is excluded from the final set of features. This process of sequentially removing features from the candidate set is repeated until there is no improvement in model performance, meaning there are no more features that should be excluded from the final set. At this point we end up with a list of fewer features than the full candidate set, which provided the highest AUC value achievable by this method. The model assessment in this method is the same as in SFS, and the final performance values are once again the mean over the ten iterations of the model assessment.

For the modified SFS and modified SBS, the main difference is that, when the modified methods finish, before the model assessment, we end up with different sets of features that result in different performance values. This is because in each step new search branches are created, expanding the search field, and in each new branch further branches can be created. The set with the best AUC value is the one used in the model assessment. Again, ten iterations of the 10-fold cross-validation are performed to obtain mean values of the assessment measures for the chosen set of features.
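A compact sketch of the SFS wrapper loop just described, using the mean cross-validated AUC as the objective function. The helper build_and_score and the logistic-regression classifier are illustrative stand-ins used only to keep the sketch self-contained; in the thesis the candidate subsets are scored with the TS fuzzy models.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold
from sklearn.metrics import roc_auc_score
from sklearn.linear_model import LogisticRegression  # stand-in classifier for the TS fuzzy model

def build_and_score(X, y, features, n_splits=10, seed=0):
    """Mean cross-validated AUC of a model trained on the given feature subset."""
    cv = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    aucs = []
    for tr, te in cv.split(X, y):
        clf = LogisticRegression(max_iter=1000).fit(X[tr][:, features], y[tr])
        scores = clf.predict_proba(X[te][:, features])[:, 1]
        aucs.append(roc_auc_score(y[te], scores))
    return np.mean(aucs)

def sfs(X, y):
    """Sequential forward selection: add the feature that most improves AUC, stop when none does."""
    selected, best_auc = [], -np.inf
    remaining = list(range(X.shape[1]))
    while remaining:
        trials = [(build_and_score(X, y, selected + [f]), f) for f in remaining]
        auc, f = max(trials)
        if auc <= best_auc:          # no improvement: stop
            break
        selected.append(f); remaining.remove(f); best_auc = auc
    return selected, best_auc
```

SBS is the mirror image (start from all features and drop the one whose removal most improves the AUC), while the modified variants keep several candidate subsets per step instead of only the single best one.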
III. Results
For the first dataset: since the Xie-Beni index and the Partition Coefficient index indicate different numbers of clusters, we decided to test models for both results. The results obtained for the models created with 2 and 3 clusters are presented in Table 2.

Table 2 - Performance of the created models (first dataset)
Performance Measure   2 Clusters    3 Clusters
AUC                   0.57±0.04     0.58±0.05
Accuracy (%)          55.61±2.89    56.01±4.10
Sensitivity (%)       53.95±2.72    54.28±3.24
Specificity (%)       56.04±3.12    56.87±4.13

Since these results were disappointing (low capability to predict heart failure), the only way to improve them was to add new features to the database. The features contained in this database were the only ones measured before the echocardiogram, so we moved to another dataset from MIMIC II, of patients in the ICU with information about whether or not they had heart failure.

For the second dataset: for this dataset the Xie-Beni index and the Partition Coefficient index also indicate different numbers of clusters, so we again tested models for both results. The results obtained for the models created with 2 and 3 clusters are presented in Table 3.

Table 3 - Performance of the created models with no FS (second dataset)
Performance Measure   2 Clusters    3 Clusters
AUC                   0.75±0.05     0.75±0.05
Accuracy (%)          72.11±4.36    72.21±5.10
Sensitivity (%)       72.09±4.29    72.38±5.24
Specificity (%)       72.13±4.56    72.07±5.13

From Table 3 it is possible to see that both models perform similarly. The model with 2 clusters has slightly lower values for most performance measures, but it also has lower standard deviations, which makes it more reliable. For this reason, the following models were all created with 2 clusters.

Results for the SFS method: in this method there is no restriction on the improvement required for a feature to be added to the final set, so the slightest improvement is sufficient for a feature to be added.

Table 4 - Performance of the created model using SFS as FS method
Performance Measure   SFS method
AUC                   0.76±0.05
Accuracy (%)          72.01±4.89
Sensitivity (%)       71.95±5.12
Specificity (%)       72.04±4.82
Selected Features     1,2,5,6,8,10,12,14

Table 4 shows a slight improvement in terms of AUC, our principal performance measure, compared to the model with no feature selection. As expected, the number of features is smaller than in the model with no FS, which is also one of our objectives.

Results for the SBS method: in this method features are removed from the total set of features in order to obtain the final set. As in the previous method, the slightest improvement in AUC is enough for a feature to be excluded from the final set. In this case that is desirable, because fewer features mean less information is needed to predict whether a patient has HF, so there is no reason to impose a minimum improvement.

Table 5 - Performance of the created model using SBS as FS method
Performance Measure   SBS method
AUC                   0.76±0.05
Accuracy (%)          71.23±4.81
Sensitivity (%)       71.08±4.87
Specificity (%)       71.18±4.92
Selected Features     1,10,12,13,14

Comparing the results of SFS and SBS (Table 5), the models have similar results for all performance measures, but the big difference between them is the number of features needed to obtain these values: the SBS method used fewer features to obtain nearly the same results.
Results for the MSFS method: since in each step more than one combination of features can lead to a better performance, in this method "branches" are created with different combinations of features to be tested, which expands the search field. The result of the best branch is presented in Table 6.

Table 6 - Performance of the created model using MSFS as FS method
Performance Measure   MSFS method
AUC                   0.77±0.01
Accuracy (%)          73.79±2.06
Sensitivity (%)       74.46±3.84
Specificity (%)       73.25±0.62
Selected Features     1,4,11,12,14

It is possible to see that this modification of the SFS algorithm leads to an improvement in the results. Instead of using more features, as in SFS, we obtained a model created with the same number of features as in the SBS method, but with a substantial improvement in the performance measures.

Results for the MSBS method: as in MSFS, different branches are tested, with different combinations of features, and the one that provided the best result is shown in Table 7.

Table 7 - Performance of the created model using MSBS as FS method
Performance Measure   MSBS method
AUC                   0.77±0.04
Accuracy (%)          73.84±1.69
Sensitivity (%)       73.80±1.76
Specificity (%)       73.87±1.64
Selected Features     1,6,10,11,12,14

Comparing the results of this last method with the previous methods, it is possible to see that this modification of the SBS algorithm also leads to improvements: although it uses more features than the SBS method, the final result is better. When compared with the MSFS method, it uses one more feature and the final values are very similar.

Figure 4 - Graphical view of the performance measures (AUC, accuracy, sensitivity and specificity) obtained for each model (No FS, SFS, SBS, MSFS, MSBS).

In Figure 4 the results obtained for the different models, with and without FS, are presented for easy comparison.

Comparison of the obtained results with a clinical decision method
It is known that physicians evaluate the heart failure tendency of a patient based on the LVEF value and the heart rate. In order to compare our results with those of a clinical decision method, we created a model that replicates the physicians' decision. Since they use both LVEF and HR, the created model assumes that a patient has heart failure if either of these two values is outside the normal range. For LVEF, we assumed that a patient with a value over 45% [7], [8] is considered free of HF, and for heart rate, a stable patient has between 60 and 100 beats per minute. With this model we calculated the accuracy, sensitivity and specificity on our dataset. The confusion matrix of this model is presented in Table 8.

Table 8 - Resulting confusion matrix for the medical model, based on LVEF and heart rate
                      Actual Positive    Actual Negative
Predicted Positive    255                213
Predicted Negative    166                302
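A minimal sketch of this clinical-decision baseline, assuming arrays of raw (unnormalized) LVEF and heart-rate values; the rule is the one stated above (HF predicted when LVEF is not above 45% or the heart rate is outside 60-100 bpm), and the example data are illustrative.

```python
import numpy as np

def clinical_decision(lvef, heart_rate):
    """Baseline mimicking the clinical rule: flag HF if LVEF <= 45% or HR outside 60-100 bpm."""
    lvef = np.asarray(lvef, dtype=float)
    heart_rate = np.asarray(heart_rate, dtype=float)
    return ((lvef <= 45) | (heart_rate < 60) | (heart_rate > 100)).astype(int)

# Example: evaluate the rule against known outcomes (1 = HF, 0 = no HF)
lvef = np.array([35.0, 60.0, 55.0, 42.0])
hr = np.array([95.0, 110.0, 72.0, 65.0])
y_true = np.array([1, 1, 0, 1])
y_pred = clinical_decision(lvef, hr)
accuracy = np.mean(y_pred == y_true)
sensitivity = np.mean(y_pred[y_true == 1] == 1)
specificity = np.mean(y_pred[y_true == 0] == 0)
print(accuracy, sensitivity, specificity)
```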
Table 9 compares the values obtained for accuracy, sensitivity and specificity between our different models and the one based on the clinical decision method. Based on these values, MSBS is the most reliable model, obtaining a higher accuracy with a lower standard deviation, while having results similar to MSFS.

Table 9 - Performance of all models compared with the medical model based on LVEF and Heart Rate
Model               Accuracy (%)   Sensitivity (%)   Specificity (%)   Selected Features
No FS               72.11±4.36     72.09±4.29        72.13±4.56        All features
SFS                 72.01±4.89     71.95±5.12        72.04±4.82        1,2,5,6,8,10,12,14
SBS                 71.23±4.81     71.08±4.87        71.18±4.92        1,10,12,13,14
MSFS                73.79±2.06     74.46±3.84        73.25±0.62        1,4,11,12,14
MSBS                73.84±1.69     73.80±1.76        73.87±1.64        1,6,10,11,12,14
Clinical decision   59.51          60.57             58.64             1,14

IV. Conclusions
In the present thesis, fuzzy systems were applied to predict heart failure among patients in the ICU. Since, to our knowledge, no previous work addresses this problem, there are no published results to compare with. We combined fuzzy modeling with different feature selection methods, which led to different results. Our main goal was to predict the occurrence of heart failure in patients in the ICU; we also had in mind that fewer features are better in terms of required equipment and associated costs, and that fewer features also mean less burden on the patients. Comparing the two databases, the first had many more patients, which could lead to better performance since there was more data to train the models, but the problem was the small number of available features. The second contains more information about each patient, but after pre-processing we ended up with fewer patients than in the first. With this thesis, we end up with a list of features that provides a reasonably accurate prediction. Comparing the results obtained with the different FS methods, we conclude that, with the Modified Sequential Forward Selection, we were able to find the features that provide the best performance in terms of AUC. These features are:
• Heart Rate (1)
• Respiratory Rate (4)
• Diastolic Non-invasive Blood Pressure (11)
• Age (12)
• LVEF value (14)
It is important to notice that in all the different FS methods there are three features that are always chosen: the LVEF value, Heart Rate and Age. These three features are indeed related when we think about heart failure: if a person has a low LVEF, the heart is not pumping enough blood to the body, so the heart rate increases in order to supply the necessary oxygen to the cells. If, in addition to this heart rate increase, we consider the age of the patient, the probability of heart failure increases with aging, because the heart does not easily adapt to changes in blood perfusion. With the values obtained, it is difficult to claim that this tool could replace physician knowledge, which was never our goal; rather, it can help physicians prevent the occurrence of heart failure. As referred previously, heart failure with preserved ejection fraction has a significant number of occurrences, as can be seen in Figure 1, so this model can be a helpful tool for physicians, because the LVEF value by itself is not sufficient to predict heart failure. With the model based on the clinical decision method, we were able to compare our results with a model that replicates the physicians' decision. Making this comparison, we conclude that we obtained a model that can help physicians by relating more than two features, thereby improving their decisions. With the developed work, and since so far there are no existing models to predict heart failure in ICU patients, we consider that we obtained satisfactory results.
V. References
[1] Rainu Kaushal, David Blumenthal, Eric G. Poon, Ashish K. Jha, Calvin Franz, Blackford Middleton, John Glaser, Gilad Kuperman, Melissa Christino, Rushika Fernandopulle, Joseph P. Newhouse, and David W. Bates. The costs of a national health information network. Annals of Internal Medicine, 143(3): 165-173, 2005.
[2] LJ Shaw, ED Peterson, K Kesler, V Hasselblad, and RM Califf. A meta-analysis of pre-discharge risk stratification after acute myocardial infarction with stress electrocardiographic, myocardial perfusion, and ventricular function imaging. American Journal of Cardiology, 78: 1327-37, 1996.
[3] GJ Taylor, JO Humphries, ED Mellits, et al. Predictors of clinical course, coronary anatomy and left ventricular function after recovery from acute myocardial infarction. Circulation, pages 62-70, 1980.
[4] HD White, RM Norris, MA Brown, et al. Left ventricular end-systolic volume as the major determinant of survival after recovery from myocardial infarction. Circulation, 76: 44-51, 1987.
[5] Vinay Kumar, Abul K. Abbas, and Jon Aster. Robbins and Cotran Pathologic Basis of Disease. Philadelphia, PA: Saunders/Elsevier, 2009.
[6] Simon O'Connor. Examination Medicine (The Examination). Edinburgh: Churchill Livingstone, 2009.
[7] JP Curtis, SI Sokol, Y Wang, SS Rathore, DT Ko, F Jadbabaie, EL Portnay, SJ Marshalko, MJ Radford, and HM Krumholz. The association of left ventricular ejection fraction, mortality, and cause of death in stable outpatients with heart failure. Journal of the American College of Cardiology, 42(4): 736-42, 2003.
[8] RS Bhatia, JV Tu, DS Lee, PC Austin, J Fang, A Haouzi, Y Gong, and PP Liu. Outcome of heart failure with preserved ejection fraction in a population-based study. New England Journal of Medicine, 355(3): 260-9, July 2006.
[9] João M. C. Sousa and U. Kaymak. Fuzzy Decision Making in Modeling and Control. World Scientific Publishing Company, 2002.
[10] S. N. Ghazavi and T. W. Liao. Medical data mining by fuzzy modeling with selected features. Artificial Intelligence in Medicine, 43(3): 195-206, 2008.
[11] A. S. Fialho, F. Cismondi, S. M. Vieira, J. M. C. Sousa, S. R. Reti, M. D. Howell, and S. N. Finkelstein. Predicting outcomes of septic shock patients using feature selection based on soft computing techniques. In Proc. of the 13th IPMU International Conference, pages 65-74, 2010.
[12] Vojislav Kecman. Learning and Soft Computing: Support Vector Machines, Neural Networks, Fuzzy Logic Systems. MIT Press, 2001.
[13] Rúben D. Pereira, João M. C. Sousa, Susana M. Vieira, Shane Reti, and Stan Finkelstein. Modified sequential forward selection applied to predicting septic shock outcome in the intensive care unit. Advances in Intelligent Systems and Computing, 190: 469-477, 2013.
[14] Yuangang Tang, Fuchun Sun, and Zengqi Sun. Improved validation index for fuzzy clustering. Proceedings of the American Control Conference, page 1120, 2005.
[15] J. C. Bezdek. Cluster validity with fuzzy sets. Journal of Cybernetics, 3: 58-73, 1973.
[16] Susana M. Vieira, Luís F. Mendonça, Gonçalo J. Farinha, and João M. C. Sousa. Modified binary PSO for feature selection using SVM applied to mortality prediction of septic patients. Applied Soft Computing, 13(8): 3494-3504, 2013.