Real-Time Clinical Warning for Hospitalized Patients via Data Mining (数据挖掘实现的住院病人的实时预警)

Department of Computer Science and Engineering: Yixin Chen (陈一昕), Yi Mao, Minmin Chen, Rahav Dor, Greg Hackermann, Zhicheng Yang, Chengyang Lu
School of Medicine: Kelly Faulkner, Kevin Heard, Marin Kollef, Thomas Bailey

Background
• The direct cost per ICU day for survivors is six to seven times that of non-ICU care.
• Unlike ICU patients, patients in general hospital wards (GHW) are not under extensive electronic monitoring and nursing care.
• Clinical studies have found that 4–17% of patients undergo cardiopulmonary or respiratory arrest while in the GHW of a hospital.

Project Mission
• Sudden deteriorations (e.g., septic shock, cardiopulmonary or respiratory arrest) of GHW patients can often be severe and life threatening.
• Goal: provide early detection and intervention based on data mining
– to prevent these serious, often life-threatening events
– using both clinical data and wireless body sensor data
• An NIH-ICTS funded project, currently in clinical trials at Barnes-Jewish Hospital, St. Louis, MO

What Exactly Do We Predict?
• Is the patient going to die?
• Is the patient going to be transferred to the ICU?
System Architecture
• Tier 1: EWS (early warning system). Clinical data and lab tests; manually collected, low frequency.
• Tier 2: RDS (real-time data sensing). Body sensor data; automatically collected, wirelessly transmitted, high frequency.

Agenda
1 Background and overview
2 Early warning system (EWS)
3 Real-time data sensing (RDS)
4 Future work

Medical Record
• 34 vital signs: pulse, temperature, oxygen saturation, shock index, respirations, age, blood pressure, …
[Figure: example vital-sign time series; x-axis in seconds]

Related Work
• Medical data mining: machine learning methods combined with medical knowledge
• Scoring systems: SCAP and PSI; the Acute Physiology Score, Chronic Health Score, and APACHE score are used to predict renal failure; the Modified Early Warning Score (MEWS)
• Models: decision trees, neural networks, SVMs
• Main problem: most previous general work uses a snapshot method that takes all the features at a given time as input to a model, discarding the temporal evolution of the data.

Overview of EWS
Goal: design a data mining algorithm that automatically identifies patients at risk of clinical deterioration based on their existing electronic medical record time series.
Challenges:
• classification of high-dimensional time-series data
• irregular data gaps
• measurement errors
• class imbalance
[Figure: class distribution; non-ICU patients far outnumber ICU transfers]

Key Techniques in the EWS Algorithm
• Temporal bucketing
• Discriminative classification
• Bootstrap aggregating (bagging)
• Exploratory under-sampling
• Exponential moving average (EMA) smoothing
• Kernel-density estimation

Workflow of the System
(A) Model-building phase: data set (D, T) → data preprocessing → bucketing → exploratory undersampling → bucket bagging → logistic regression; repeat until the predictions converge or the iteration count is reached, yielding the final model and threshold.
(B) Deployment phase: real-time data stream → 24-hour window → data preprocessing → bucketing → EMA smoothing → score with the final model; raise an alarm when the score exceeds the threshold.
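The bucketing step above can be sketched in a few lines. This is a minimal illustration, not the deployed code: the function name, the hour-based timestamps, and the NaN convention for empty buckets are our assumptions; the 24-hour window, six buckets, and per-bucket min/max/average features come from the slides.

```python
import numpy as np

def bucket_features(times, values, window=24.0, n_buckets=6):
    """Split one vital-sign series from the last `window` hours into
    `n_buckets` equal time spans and compute min/max/mean per bucket.
    `times` are hours since the start of the window; readings arrive
    irregularly, so a bucket may be empty, in which case we emit NaN."""
    times = np.asarray(times, dtype=float)
    values = np.asarray(values, dtype=float)
    edges = np.linspace(0.0, window, n_buckets + 1)
    feats = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bucket = values[(times >= lo) & (times < hi)]
        if in_bucket.size == 0:
            feats += [np.nan, np.nan, np.nan]  # irregular gap: no reading
        else:
            feats += [in_bucket.min(), in_bucket.max(), in_bucket.mean()]
    return feats  # 3 features per bucket for this vital sign
```

Concatenating these per-bucket features across all monitored vital signs yields the roughly 400-dimensional input vector used by the classifier.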
Data Preprocessing
• Outlier removal
• Normalization

Temporal Bucketing
• We retain data in a sliding window of the last 24 hours and divide it evenly into 6 buckets.
• To capture temporal variation, we compute several feature values for each bucket, including the minimum, maximum, and average.

Discriminative Classification
• Logistic regression (LR) and support vector machines (SVM)
• Use the max, min, and average of each bucket of each vital sign as input features (~400 features in total).
• Use the training data to learn the model parameters; output a model and a threshold.

Bootstrap Aggregating (Bagging)
Advantages:
1. Handles outliers
2. Avoids over-fitting
3. Better model quality
[Diagram: bootstrap samples → individual predictors → aggregated final model]

Biased Bucket Bagging
[Diagram: bucketing → bootstrap models built per bucket → final model]

Exploratory Undersampling
• Use the current predictive model to restore class balance: remove records from the majority class according to their predicted values.

Exponential Moving Average (EMA)
• Exponentially weighted smoothing applied before prediction; the EMA parameter is chosen by cross validation.

Evaluation Criteria
• AUC (area under the receiver operating characteristic (ROC) curve) represents the probability that a randomly chosen positive example is correctly rated with greater suspicion than a randomly chosen negative example.
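The probabilistic definition of AUC above can be evaluated directly by comparing every positive/negative score pair. A minimal sketch (quadratic in the number of examples; the function name is ours, not the project's code):

```python
def auc_rank(scores_pos, scores_neg):
    """AUC as the probability that a randomly chosen positive example
    receives a higher score than a randomly chosen negative example;
    ties count as one half."""
    wins = 0.0
    for sp in scores_pos:
        for sn in scores_neg:
            if sp > sn:
                wins += 1.0
            elif sp == sn:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))
```

For example, `auc_rank([0.9, 0.8, 0.4], [0.7, 0.3])` compares six pairs, of which five rank the positive higher, giving an AUC of 5/6.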
Results on Historical Database (at specificity = 0.95)

Method  AUC      SENS     PPV      NPV      ACCU
1       0.86809  0.44753  0.29562  0.97345  0.92747
2       0.8907   0.5135   0.3386   0.9751   0.9293
3       0.91995  0.58558  0.36864  0.97871  0.93269
4       0.92108  0.60087  0.37466  0.97948  0.93342
5       0.9221   0.60961  0.37805  0.97992  0.93384

1: bucketing + logistic regression
2: bucketing + logistic regression + bagging
3: bucketing + logistic regression + bucket bagging
4: bucketing + logistic regression + biased bucket bagging
5: bucketing + logistic regression + biased bucket bagging + exploratory undersampling

Comparison of Various Models

Method                  AUC     SPEC     SENS     PPV      NPV      ACCU
RPART                   0.6703  0.93     0.55     0.287    0.977    0.912
SVM (linear kernel)     0.6879  0.9762   0.3997   0.4405   0.9719   0.95033
SVM (quadratic kernel)  0.6851  0.9675   0.4028   0.3676   0.9718   0.94216
SVM (cubic kernel)      0.6792  0.9681   0.3904   0.3646   0.9713   0.94216
SVM (RBF kernel)        0.6968  0.9615   0.4321   0.3448   0.9730   0.93774
Our method 5            0.9221  0.94996  0.60961  0.37805  0.97992  0.93384

Clinical Trial at Barnes-Jewish Hospital
• Dates: 1/24/2011 to 11/1/2011 (277 days)

ICU transfers   Total   With alert  Without alert
ICU transfers   510     243         267
All patients    11286   1430        9856
Ratio           4.5%    17.0%       2.7%

Deaths          Total   With alert  Without alert
Deaths          239     138         102
All patients    11286   1430        9856
Ratio           2.12%   9.65%       1.02%

• Alerts have already triggered early interventions that may have prevented deaths.

Agenda
1 Background & related work
2 Early warning system (EWS)
3 Real-time data sensing (RDS)
4 Future work

Overview of RDS
A challenging problem:
• classification based on multiple high-frequency, real-time time series (heart rate, pulse, oxygen saturation, CO2, temperature, etc.)
Wireless Sensor Network at BJH

Overview of the Learning Algorithm
Key techniques:
• feature extraction from multiple time series
• feature selection
• classification algorithms
• exploratory undersampling

A Large Pool of Features
• Detrended fluctuation analysis (DFA) features
• Approximate entropy (ApEn)
• Spectral features
• First-order features
• Second-order features
• Cross-sign features

Detrended Fluctuation Analysis (DFA)
• DFA is a method for quantifying the statistical self-affinity of a time-series signal (see, e.g., Peng et al. 1994).
• Applicable to both pulse rate and SpO2.

Spectral Analysis (FFT)
• We use the component values of VLF (< 0.04 Hz), LF (0.04–0.15 Hz), and HF (0.15–0.4 Hz), and the ratio LF/HF, for each signal.

Other Features
• Approximate entropy (ApEn) quantifies the unpredictability of fluctuations in a time series:
– a low value indicates a deterministic signal
– a high value indicates an unpredictable signal
• First-order features: mean, standard deviation, skewness (symmetry of the distribution), kurtosis (peakedness of the distribution)
• Second-order features, related to the co-occurrence of patterns: first quantize the time series into Q discrete bins, then construct a pattern matrix and compute its energy (E), entropy (S), correlation (COR), inertia (F), and local homogeneity (LH)
• Cross-sign features link multiple vital signs together:
– correlation: the degree of departure of two signals from independence
– coherence: amplitude and phase of the frequencies held in common between the two signals

Forward Feature Selection
• Start with an empty feature set.
• Evaluate each remaining feature and add the one that improves performance the most.
• Stop when no remaining feature yields an improvement; the current set becomes the final feature set.

Experimental Setup
• Dataset: MIMIC-II (Multiparameter Intelligent Monitoring in Intensive Care II), a public-access ICU database
• The data model can be used for both GHW patients with sensors and ICU patients.
• Our data: collected between 2001 and 2008 from a variety of ICUs (medical, surgical, coronary care, and neonatal)
• Prediction goal: death or survival
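A minimal sketch of DFA in the spirit of Peng et al. (1994), not the project's implementation: the window sizes, first-order (linear) detrending, and averaging scheme are illustrative assumptions.

```python
import numpy as np

def dfa_exponent(signal, scales=(4, 8, 16, 32, 64)):
    """Simplified detrended fluctuation analysis: integrate the
    mean-removed signal, split the profile into non-overlapping windows
    of each scale, remove a linear trend per window, and fit the
    log-log slope of RMS fluctuation versus scale. The slope is the
    DFA scaling exponent used as a per-signal feature."""
    x = np.asarray(signal, dtype=float)
    profile = np.cumsum(x - x.mean())          # integrated series
    flucts = []
    for n in scales:
        n_win = len(profile) // n
        rms = []
        for w in range(n_win):
            seg = profile[w * n:(w + 1) * n]
            t = np.arange(n)
            coef = np.polyfit(t, seg, 1)       # local linear trend
            detrended = seg - np.polyval(coef, t)
            rms.append(np.sqrt(np.mean(detrended ** 2)))
        flucts.append(np.mean(rms))
    slope, _ = np.polyfit(np.log(scales), np.log(flucts), 1)
    return slope
```

Uncorrelated noise yields an exponent near 0.5, while strongly trended or self-correlated signals yield larger exponents; the exponent itself is the DFA feature computed for heart rate and oxygen saturation.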
• Real-time vital signs: heart rate and oxygen saturation
• Class imbalance: most patients survived
• Evaluation: 10-fold cross validation

Result – Linear and Nonlinear Classification

Method  Feature  AUC     Specificity  Sensitivity  PPV     NPV
LSVM    1        0.5759  0.9497       0.0755       0.2550  0.7781
LR      1        0.4742  0.9483       0.0729       0.3181  0.7555
KSVM    1        0.5897  0.9497       0.1265       0.3643  0.7879
LSVM    2        0.4473  0.9497       0.0346       0.1300  0.7705
LR      2        0.4902  0.9483       0.0313       0.1667  0.7473
KSVM    2        0.5016  0.9497       0.0676       0.2450  0.7768
LSVM    1&2      0.5757  0.9497       0.1416       0.3917  0.7694
LR      1&2      0.5370  0.9483       0.0521       0.2500  0.7513
KSVM    1&2      0.6332  0.9497       0.1428       0.4146  0.7911

LSVM: linear SVM; LR: logistic regression; KSVM: RBF-kernel SVM
1: DFA of heart rate; 2: DFA of oxygen saturation

Result – Feature Combinations

Algorithm            Features                          AUC
KSVM                 DFA                               0.6332
                     DFA + cross-sign features         0.6565
                     DFA + cross-sign features + ApEn  0.6753
                     All features                      0.7079
Logistic regression  DFA                               0.5370
                     DFA + cross-sign features         0.5731
                     DFA + cross-sign features + ApEn  0.5974
                     All features                      0.7402

Result – Feature Selection

Method  #Selected Features  AUC     Specificity  Sensitivity  PPV     NPV
KSVM    5                   0.7752  0.9654       0.4852       0.8041  0.8651
LR      23                  0.7844  0.9483       0.5208       0.7692  0.8567

LR is our first choice: better AUC, interpretability, and efficiency.

First 12 Selected Features (in logistic regression)
• standard deviation of heart rate
• ApEn of heart rate
• energy of oxygen saturation
• LF of oxygen saturation
• LF of heart rate
• DFA of oxygen saturation
• mean of heart rate
• HF of heart rate
• inertia of heart rate
• homogeneity of heart rate
• energy of heart rate
• linear correlation of heart rate and oxygen saturation

Result – Our Final Model

Method  AUC     Specificity  Sensitivity  PPV     NPV
1       0.7402  0.9500       0.3646       0.7000  0.8185
2       0.7767  0.9500       0.4615       0.9000  0.6440
3       0.8082  0.9500       0.4865       0.9000  0.6546

Method 1: logistic regression + all features
Method 2: logistic regression + all features + exploratory undersampling
Method 3: logistic regression + feature selection + exploratory undersampling

Current Work: Density-based LR
• Standard logistic regression uses φk(x) = xk:
– P(y=1|x) = 1/(1 + exp(−∑k wk xk))
– The probability of an event (e.g., ICU transfer, death) then grows or shrinks monotonically with each feature.
– This is not true in many cases: e.g., ICU transfer rate vs. age.
• Idea: transform each feature xk.
• Use a kernel-density estimator to estimate p(xk, y=1) and p(xk, y=0) for each feature xk.
• The result is a nonlinear separating surface that conforms to the true distribution of the data.
• Advantages over kernel logistic regression and SVM: efficiency and interpretability.

Example of Density-based LR
[Figure: test data classified by the original LR and by the density-based LR]

Future Work
• Distance-based classification algorithms for multidimensional time series
– dynamic time warping, information distance
• Combination of feature-based and distance-based classification algorithms
– include distance information in the objective function
• Combining Tier-1 and Tier-2 data
– multi-kernel methods
• Interpretation of alerts
– based on the magnitude and sign of the model coefficients

Real-Time Simulation on Historical Data (at specificity = 0.95)

Method   AUC       SENS     PPV      NPV      ACCU
1        0.6834    0.30159  0.2345   0.9634   0.9128
1 + EMA  0.78203   0.36508  0.27059  0.96664  0.9128
2        0.74359   0.30159  0.23457  0.96342  0.9293
2 + EMA  0.777737  0.38095  0.27907  0.96342  0.92134
4        0.77689   0.38905  0.27907  0.96745  0.9336
4 + EMA  0.81411   0.39683  0.28736  0.96825  0.92212
5        0.79902   0.4127   0.29545  0.96096  0.9229
5 + EMA  0.79902   0.4127   0.29545  0.96096  0.9229

Model Coefficients (assuming feature independence)

Feature                                  Coefficient
local homogeneity of heart rate          -14.50
standard deviation of oxygen saturation  10.20
entropy of oxygen saturation             10.17
LF of heart rate                         8.62
local homogeneity of oxygen saturation   7.77
LF/HF of oxygen saturation               4.53
inertia of heart rate                    3.86
entropy of heart rate                    2.97
low frequency of oxygen saturation       -2.89
mean of oxygen saturation                -2.86

Why Bagging Works
Let each (D_i, y_i), 1 ≤ i ≤ m, be a bucket sample drawn independently from (D, y).
φ(D_i, y_i) is the predictor trained on it. The aggregated predictor is

  φ_A(D, y) = E[φ(D_i, y_i)].

The average prediction error of the individual predictors is

  e′ = E[(y − φ(D_i, y_i))²],

and the error of the aggregated predictor is

  e = E[(y − φ_A(D, y))²].

Using the inequality (E[Z])² ≤ E[Z²] gives e ≤ e′.

Algorithm Details – Biased Bucket Bagging (BBB)
[Figure: variance of the bootstrap models for BBB with 2, 3, 4, and 5 buckets, and the standard deviation of their predictions]
• A critical factor deciding how much bagging improves accuracy is the variance of the bootstrap models.
• BBB with 4 buckets has the largest difference between E[φ²(D_i, y_i)] and E²[φ(D_i, y_i)], i.e., the largest variance.
• BBB with 4 buckets also has the highest standard deviation in its prediction results, so we choose BBB with 4 buckets as the final method.

Algorithm Details – Bucket Bagging
[Diagram: bucketing → bootstrap models built per bucket → final model]

Result on Real-Time System
[Figure: cross validation for the EMA parameter]
• All cases attain their best performance when the EMA parameter is around 0.06, showing that the choice of the parameter is robust.
• This small optimal value shows that historical records play an important role in prediction.
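The EMA smoothing step presumably uses the standard recurrence s_t = α·x_t + (1 − α)·s_{t−1}; a minimal sketch under that assumption (the function name is ours, and α ≈ 0.06 is the cross-validated value reported above):

```python
def ema_smooth(series, alpha=0.06):
    """Exponentially smooth a stream of values:
    s_t = alpha * x_t + (1 - alpha) * s_{t-1}, seeded with the first
    value. A small alpha weights history heavily, damping one-off
    spikes in the incoming data."""
    smoothed = []
    s = None
    for x in series:
        s = x if s is None else alpha * x + (1 - alpha) * s
        smoothed.append(s)
    return smoothed
```

With α this small, each new reading contributes only about 6% of the smoothed output, so the smoothed series leans heavily on history; this matches the observation above that historical records play an important role in prediction.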