Real-Time Clinical Warning for Hospitalized Patients via
Data Mining
Department of Computer Science and Engineering
Yixin Chen (陈一昕), Yi Mao, Minmin Chen, Rahav Dor, Greg
Hackermann, Zhicheng Yang, Chengyang Lu
School of Medicine
Kelly Faulkner, Kevin Heard, Marin Kollef, Thomas Bailey
Background
• ICU direct costs per day for
survivors are between six and seven
times those for non-ICU care.
• Unlike patients at ICUs, general
hospital wards (GHW) patients are not
under extensive electronic monitoring
and nurse care.
• A clinical study has found that 4–17%
of patients undergo
cardiopulmonary or respiratory arrest
while in the GHW of a hospital.
Project mission
• Sudden deteriorations (e.g. septic
shock, cardiopulmonary or respiratory
arrest) of GHW patients can often be
severe and life threatening.
• Goal: Provide early detection and
intervention based on data mining
– to prevent these serious, often life-threatening events.
– Using both clinical data and wireless
body sensor data
• An NIH-ICTS-funded project: currently
under clinical trials at Barnes-Jewish
Hospital, St. Louis, MO
What exactly do we predict?
• Is he going to die?
• Is he going to the ICU?
System Architecture
•Tier 1: EWS (early warning system)
• Clinical data, lab tests, manually collected, low frequency
•Tier 2: RDS (real-time data sensing)
• Body sensor data, automatically collected, wirelessly transmitted, high frequency
Agenda
1 Background and overview
2 Early warning system (EWS)
3 Real-time data sensing (RDS)
5 Future work
Medical record (34 vital signs: pulse, temperature, oxygen
saturation, shock index, respirations, age, blood pressure …)
[Figure: vital-sign time series; x-axis: time (seconds)]
Related Work
Medical data mining
• Medical knowledge: SCAP and PSI; Acute Physiology Score, Chronic Health Score, and APACHE score (used to predict renal failures); Modified Early Warning Score (MEWS)
• Machine learning methods: decision trees, neural networks, SVM
Main problem: Most previous work uses a snapshot method
that takes all the features at a given time as input to a model, discarding
the temporal evolution of the data.
Overview of EWS
Goal: Design a data mining algorithm that can automatically
identify patients at risk of clinical deterioration based on the
time series in their existing electronic medical records.
Challenges:
• Classification of high-dimensional time
series data
• Irregular data gaps
• Measurement errors
• Class imbalance
[Figure: bar chart comparing Non-ICU and ICU counts; axis range 0 to 30,000]
Key Techniques in the EWS Algorithm
• Temporal bucketing
• Discriminative classification
• Bootstrap aggregating (bagging)
• Exploratory under-sampling
• Exponential moving average smoothing
• Kernel-density estimation
Workflow of the System
(A) Model building phase
Data set D, T → Data preprocessing → Bucketing → Exploratory undersampling → Bucket bagging → Logistic regression → Predict model → repeat until "Converge?" or "> iteration count?" is Yes → Final model
(B) Deployment phase
Real-time data stream → Generate a 24-hour window → Data preprocessing → Bucketing → Final model → EMA smoothing → "> threshold?" (Yes: alarm warning; No: no alarm)
Data Preprocessing
Outlier removal
Normalization
Temporal Bucketing
Bucket 1 | Bucket 2 | Bucket 3 | Bucket 4 | Bucket 5 | Bucket 6
We retain data in a sliding window of the last 24 hours and
divide it evenly into 6 buckets.
To capture temporal variations, we compute several
feature values for each bucket, including the minimum,
maximum, and average.
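The bucketing step can be sketched as follows. This is a minimal illustration for a single vital sign, not the project's code: the function name `bucket_features`, the (time, value) input layout, and the use of `None` for empty buckets are assumptions made for the example.

```python
from statistics import mean

def bucket_features(samples, now_h, window_h=24.0, n_buckets=6):
    """samples: list of (time_in_hours, value) pairs for one vital sign.
    Returns one (min, max, avg) tuple per bucket, oldest bucket first;
    a bucket with no readings yields None (hypothetical gap marker)."""
    width = window_h / n_buckets
    start = now_h - window_h
    buckets = [[] for _ in range(n_buckets)]
    for t, v in samples:
        if start <= t < now_h:  # keep only the last 24 hours
            idx = min(int((t - start) // width), n_buckets - 1)
            buckets[idx].append(v)
    return [(min(b), max(b), mean(b)) if b else None for b in buckets]

# Usage: four pulse readings inside a 24-hour window ending at hour 24.
pulse = [(1.0, 80), (2.0, 90), (5.0, 100), (23.0, 70)]
features = bucket_features(pulse, now_h=24.0)
```

With 6 buckets each bucket spans 4 hours, so the first two readings share bucket 1 and the remaining buckets without readings are left as gaps.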
Discriminative Classification
• Logistic regression (LR)
• Support vector machine (SVM)
• Use max, min, and avg of each
bucket and each vital sign as the
input features (~400 features in
total).
• Use the training data to learn the
model parameters.
Pipeline: Clinical data → Data preprocessing → Temporal bucketing → Classification algorithm → Output model, threshold
Bootstrap Aggregating (Bagging)
Advantages:
1. Handles outliers
2. Avoids over-fitting
3. Better model quality
[Figure: bagging diagram ending in the final model]
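A minimal sketch of plain bagging, for illustration only. The base learner here (`fit_mean_threshold`, a one-feature midpoint rule) is a toy stand-in chosen to keep the example short; the slides use logistic regression, and the bucket-based variants are not reproduced.

```python
import random

def fit_mean_threshold(rows):
    """Toy base learner (assumption, not the deck's logistic regression):
    score = x minus the midpoint between the two class means."""
    pos = [x for x, y in rows if y == 1]
    neg = [x for x, y in rows if y == 0]
    mid = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2
    return lambda x: x - mid

def bagging(rows, fit, n_models=25, seed=0):
    """Train n_models learners on bootstrap samples and average their scores."""
    rng = random.Random(seed)
    models = []
    while len(models) < n_models:
        sample = [rng.choice(rows) for _ in rows]  # draw with replacement
        if {y for _, y in sample} == {0, 1}:       # both classes are needed to fit
            models.append(fit(sample))
    return lambda x: sum(m(x) for m in models) / len(models)

# Usage: (feature value, label) pairs with label 1 = deteriorated.
rows = [(0.0, 0), (1.0, 0), (2.0, 0), (8.0, 1), (9.0, 1), (10.0, 1)]
predictor = bagging(rows, fit_mean_threshold)
```

A positive averaged score votes for the positive class; averaging over resampled training sets is what dampens the influence of any single outlier.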
Biased Bucket Bagging
[Figure: bucketing diagram ending in the final model]
Exploratory Undersampling
Predict model → Class balance
Remove selected records from the majority class,
chosen according to their predicted values.
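One way to read this step, sketched under a stated assumption: the slide does not say which majority-class records are removed, so the rule below (drop the records the current model already scores as lowest risk, keep the hardest ones) is a guess made for illustration. The function name `undersample_majority` is hypothetical.

```python
def undersample_majority(records, scores, n_keep):
    """records: majority-class (non-ICU) records.
    scores: the current model's predicted risk for each record.
    Keeps the n_keep highest-scoring records in their original order.
    ASSUMPTION: 'drop the easiest records' is not confirmed by the slides."""
    ranked = sorted(zip(scores, range(len(records))), reverse=True)
    keep = sorted(i for _, i in ranked[:n_keep])
    return [records[i] for i in keep]

# Usage: shrink four majority records down to two.
kept = undersample_majority(["a", "b", "c", "d"], [0.1, 0.8, 0.05, 0.6], n_keep=2)
```

Repeating predict, remove, and refit moves the training set toward class balance while retaining the majority records closest to the decision boundary.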
Exponential Moving Average (EMA)
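The slide gives no formula, so the sketch below assumes the usual recursion s_t = alpha * x_t + (1 - alpha) * s_(t-1), seeded with the first value. The name `alpha` for the smoothing parameter is an assumption.

```python
def ema(values, alpha):
    """Exponential moving average of a sequence of model outputs.
    alpha weights the newest value; (1 - alpha) weights the history."""
    smoothed = []
    for x in values:
        s = x if not smoothed else alpha * x + (1 - alpha) * smoothed[-1]
        smoothed.append(s)
    return smoothed

# Usage: a jump in the raw score is absorbed gradually.
smoothed_scores = ema([0.0, 1.0, 1.0], alpha=0.5)
```

Under this form, a small alpha means the alarm score depends mostly on the patient's earlier predictions rather than on the latest one.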
Evaluation Criteria
AUC (area under the receiver operating characteristic (ROC) curve)
represents the probability that a randomly chosen positive example is
correctly rated with greater suspicion than a randomly chosen negative
example.
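This probabilistic reading of AUC can be computed directly by comparing every positive score with every negative score. The sketch below is a straightforward illustration; counting ties as one half is a common convention and is an assumption here.

```python
def auc(scores_pos, scores_neg):
    """Fraction of (positive, negative) pairs in which the positive
    example receives the higher score; ties count as 0.5."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            wins += 1.0 if p > n else 0.5 if p == n else 0.0
    return wins / (len(scores_pos) * len(scores_neg))

# Usage: three of the four pairs are ranked correctly.
example_auc = auc([0.9, 0.4], [0.5, 0.1])
```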
Results on Historical Database
Method  AUC      SENS     PPV      NPV      ACCU
1       0.86809  0.44753  0.29562  0.97345  0.92747
2       0.8907   0.5135   0.3386   0.9751   0.9293
3       0.91995  0.58558  0.36864  0.97871  0.93269
4       0.92108  0.60087  0.37466  0.97948  0.93342
5       0.9221   0.60961  0.37805  0.97992  0.93384
All results are at specificity = 0.95.
1: bucketing + logistic regression
2: bucketing + logistic regression + bagging
3: bucketing + logistic regression + bucket bagging
4: bucketing + logistic regression + biased bucket bagging
5: bucketing + logistic regression + biased bucket bagging + exploratory undersampling
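Reporting at a fixed specificity means choosing the alarm threshold from the negative (non-ICU) scores first and then reading off the other metrics. The sketch below shows one way to do that; the function name and the quantile-based threshold rule are assumptions, not the deck's procedure.

```python
import math

def metrics_at_specificity(scores, labels, target_spec=0.95):
    """Pick the threshold so that at least target_spec of the negatives
    fall at or below it, then compute the usual confusion-matrix metrics."""
    neg = sorted(s for s, y in zip(scores, labels) if y == 0)
    threshold = neg[math.ceil(target_spec * len(neg)) - 1]
    tp = fp = tn = fn = 0
    for s, y in zip(scores, labels):
        alarm = s > threshold
        if alarm and y == 1:
            tp += 1
        elif alarm:
            fp += 1
        elif y == 1:
            fn += 1
        else:
            tn += 1
    return {
        "SPEC": tn / (tn + fp),
        "SENS": tp / (tp + fn),
        "PPV": tp / (tp + fp) if tp + fp else float("nan"),
        "NPV": tn / (tn + fn),
        "ACCU": (tp + tn) / len(labels),
    }

# Usage: a tiny example evaluated at specificity 0.75.
m = metrics_at_specificity([0.1, 0.2, 0.3, 0.9, 0.8, 0.25],
                           [0, 0, 0, 0, 1, 1], target_spec=0.75)
```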
Comparison of various models
Method                  AUC     SPEC     SENS     PPV      NPV      ACCU
RPART                   0.6703  0.93     0.55     0.287    0.977    0.912
SVM (Linear kernel)     0.6879  0.9762   0.3997   0.4405   0.9719   0.95033
SVM (Quadratic kernel)  0.6851  0.9675   0.4028   0.3676   0.9718   0.94216
SVM (Cubic kernel)      0.6792  0.9681   0.3904   0.3646   0.9713   0.94216
SVM (RBF kernel)        0.6968  0.9615   0.4321   0.3448   0.9730   0.93774
Our method 5            0.9221  0.94996  0.60961  0.37805  0.97992  0.93384
Clinical Trial at Barnes-Jewish Hospital
Dates: start date 1/24/2011, last date 11/1/2011 (277 days)

ICU Transfers  Total   With alert  W/o alert
ICU transfer   510     243         267
Total          11286   1430        9856
Ratio          4.5%    17.0%       2.7%

Deaths         Total   With alert  W/o alert
Deaths         239     138         102
Total          11286   1430        9856
Ratio          2.12%   9.65%       1.02%
Alerts have already triggered early interventions that may have prevented deaths.
Agenda
1
Background & Related work
2
Early warning system (EWS)
3
Real-time data sensing (RDS)
5
Future work
Overview of RDS
A challenging problem
• Classification based on multiple high-frequency real-time time series (heart rate, pulse, oxygen saturation, CO2, temperature, etc.)
Wireless Sensor Network at BJH
Overview of Learning Algorithm
Key techniques:
Feature extraction from multiple time series
Feature selection
Classification algorithms
Exploratory undersampling
A Large Pool of Features
Features:
• Detrended fluctuation
analysis (DFA) features
• Approximate entropy
(ApEn)
• Spectral features
• First-order features
• Second-order features
• Cross-sign features
Detrended Fluctuation Analysis (DFA)
DFA is a method for quantifying the statistical self-affinity of a time-series
signal. (See: e.g., Peng et al. 1994)
Applicable to both pulse rate and SpO2
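A compact sketch of first-order DFA, assuming the standard procedure: integrate the mean-removed signal, detrend it linearly in non-overlapping windows, and fit the slope of log F(n) against log n. The window sizes in `scales` and the helper names are arbitrary choices for the example, not values from the slides.

```python
import itertools
import math
import random

def _linfit(xs, ys):
    """Least-squares line through (xs, ys); returns (slope, intercept)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx

def dfa_alpha(series, scales=(4, 8, 16, 32)):
    """Scaling exponent alpha of a signal via detrended fluctuation analysis."""
    mean = sum(series) / len(series)
    profile = list(itertools.accumulate(x - mean for x in series))
    log_n, log_f = [], []
    for n in scales:
        sq_resid, count = 0.0, 0
        t = list(range(n))
        for start in range(0, len(profile) - n + 1, n):
            seg = profile[start:start + n]
            a, b = _linfit(t, seg)  # local linear trend in this window
            sq_resid += sum((y - (a * i + b)) ** 2 for i, y in zip(t, seg))
            count += n
        log_n.append(math.log(n))
        log_f.append(math.log(math.sqrt(sq_resid / count)))
    return _linfit(log_n, log_f)[0]

# Usage: uncorrelated noise versus its running sum (a random walk).
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(2048)]
walk = list(itertools.accumulate(noise))
alpha_noise, alpha_walk = dfa_alpha(noise), dfa_alpha(walk)
```

Uncorrelated noise gives an exponent near 0.5 and a random walk gives a value near 1.5, so the exponent summarizes how strongly a vital sign's fluctuations are correlated over time.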
Spectral Analysis (FFT)
Used component values of VLF (<0.04 Hz), LF (0.04–0.15 Hz),
HF (0.15–0.4 Hz), and the ratio LF/HF for each signal.
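The band components can be illustrated with a direct discrete Fourier transform. This is a sketch only: the power normalization, the handling of band edges, and the function name `band_powers` are assumptions, and a real pipeline would use an FFT library rather than this O(n^2) loop.

```python
import cmath
import math

def band_powers(signal, fs):
    """Sum spectral power in the VLF, LF, and HF bands of a signal
    sampled at fs Hz, and report the LF/HF ratio."""
    n = len(signal)
    mean = sum(signal) / n
    x = [v - mean for v in signal]  # remove the DC component
    bands = {"VLF": (0.0, 0.04), "LF": (0.04, 0.15), "HF": (0.15, 0.4)}
    power = dict.fromkeys(bands, 0.0)
    for k in range(1, n // 2 + 1):
        f = k * fs / n  # frequency of DFT bin k in Hz
        coeff = sum(v * cmath.exp(-2j * math.pi * k * i / n)
                    for i, v in enumerate(x))
        for name, (lo, hi) in bands.items():
            if lo <= f < hi:
                power[name] += abs(coeff) ** 2 / n
    power["LF/HF"] = power["LF"] / power["HF"] if power["HF"] else float("inf")
    return power

# Usage: a 0.1 Hz component (LF) plus a weaker 0.25 Hz component (HF), 1 Hz sampling.
sig = [math.sin(2 * math.pi * 0.1 * i) + 0.5 * math.sin(2 * math.pi * 0.25 * i)
       for i in range(200)]
p = band_powers(sig, fs=1.0)
```

Halving the amplitude of the HF component quarters its power, so this test signal has an LF/HF ratio of about 4.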
Other Features
• Approximate Entropy (ApEn): quantifies the unpredictability of
fluctuations in a time series.
– A low value → deterministic
– A high value → unpredictable
• First-order features:
– Mean, standard deviation
– Skewness (symmetry of distribution), kurtosis (peakedness of distribution)
• Second-order features: related to co-occurrence of patterns
– First quantize a time series into Q discrete bins, then construct a pattern matrix
– Energy (E), entropy (S), correlation (COR), inertia (F), local homogeneity (LH)
• Cross-sign features: link multiple vital signs together
– Correlation: the degree of departure of two signals from independence
– Coherence: amplitude and phase about the frequencies held in common
between two signals
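As an illustration of one of these features, approximate entropy can be sketched with its textbook definition. The defaults `m=2` and `r=0.2` are assumptions for the example; the slides do not state the parameters used.

```python
import math
import random

def approximate_entropy(series, m=2, r=0.2):
    """ApEn(m, r): how often windows that match over m points
    (within tolerance r) stop matching when extended to m + 1 points."""
    def phi(length):
        n = len(series) - length + 1
        windows = [series[i:i + length] for i in range(n)]
        total = 0.0
        for w in windows:
            # Count windows within Chebyshev distance r (self-match included).
            matches = sum(1 for v in windows
                          if max(abs(a - b) for a, b in zip(w, v)) <= r)
            total += math.log(matches / n)
        return total / n
    return phi(m) - phi(m + 1)

# Usage: a perfectly periodic signal versus uniform random noise.
periodic = [0.0, 1.0] * 50
rng = random.Random(0)
noisy = [rng.random() for _ in range(100)]
apen_periodic = approximate_entropy(periodic)
apen_noisy = approximate_entropy(noisy)
```

The periodic signal scores close to zero because every matching window continues to match, while the noise scores much higher.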
Forward Feature Selection
Empty feature set → Current feature set → Evaluate each of the remaining features → Pick one feature to add into the set → repeat; if no improvement → Final feature set
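The greedy loop above can be sketched generically. The `score` callback is a placeholder for whatever evaluation is used (for instance cross-validated AUC); both it and the toy gains in the usage example are assumptions for illustration.

```python
def forward_select(features, score):
    """Greedy forward selection.
    features: candidate feature names.
    score(subset) -> float, higher is better (hypothetical evaluator)."""
    selected, best = [], float("-inf")
    remaining = list(features)
    while remaining:
        # Evaluate each remaining feature added to the current set.
        cand_score, cand = max((score(selected + [f]), f) for f in remaining)
        if cand_score <= best:  # no improvement: stop
            break
        selected.append(cand)
        remaining.remove(cand)
        best = cand_score
    return selected, best

# Usage with a toy additive score; "noise" lowers the score and is never picked.
gains = {"hr_std": 0.2, "hr_apen": 0.1, "noise": -0.05}
chosen, final_score = forward_select(
    gains, lambda subset: 0.5 + sum(gains[f] for f in subset))
```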
Experimental Setup
Dataset: MIMIC-II (Multiparameter Intelligent Monitoring in
Intensive Care II): A public-access ICU database
The data model can be used for both GHW patients
with sensors and ICU patients
Our data: between 2001 and 2008 from a variety of ICUs
(medical, surgical, coronary care, and neonatal)
Prediction goal: death or survival
Real-time vital signs: heart rate and oxygen saturation rate
Class imbalance: most patients survived
Evaluation: Based on a 10-fold cross validation
Result – Linear and Nonlinear Classification
Method  Feature  AUC     Specificity  Sensitivity  PPV     NPV
LSVM    1        0.5759  0.9497       0.0755       0.2550  0.7781
LR      1        0.4742  0.9483       0.0729       0.3181  0.7555
KSVM    1        0.5897  0.9497       0.1265       0.3643  0.7879
LSVM    2        0.4473  0.9497       0.0346       0.1300  0.7705
LR      2        0.4902  0.9483       0.0313       0.1667  0.7473
KSVM    2        0.5016  0.9497       0.0676       0.2450  0.7768
LSVM    1&2      0.5757  0.9497       0.1416       0.3917  0.7694
LR      1&2      0.5370  0.9483       0.0521       0.2500  0.7513
KSVM    1&2      0.6332  0.9497       0.1428       0.4146  0.7911
LSVM: Linear SVM
LR: Logistic Regression
KSVM: RBF Kernel SVM
1: DFA of Heart Rate
2: DFA of Oxygen Saturation
Result – Feature Combinations
Algorithm            Features                          AUC
KSVM                 DFA                               0.6332
KSVM                 DFA + Cross-sign features         0.6565
KSVM                 DFA + Cross-sign features + ApEn  0.6753
KSVM                 All features                      0.7079
Logistic Regression  DFA                               0.5370
Logistic Regression  DFA + Cross-sign features         0.5731
Logistic Regression  DFA + Cross-sign features + ApEn  0.5974
Logistic Regression  All features                      0.7402
Result – Feature Selection
Method  #Selected Features  AUC     Specificity  Sensitivity  PPV     NPV
KSVM    5                   0.7752  0.9654       0.4852       0.8041  0.8651
LR      23                  0.7844  0.9483       0.5208       0.7692  0.8567
LR is our first choice: better AUC, interpretability, efficiency
First 12 Selected Features (in logistic regression)
standard deviation of heart rate
ApEn of heart rate
Energy of oxygen saturation
LF of oxygen saturation
LF of heart rate
DFA of oxygen saturation
Mean of heart rate
HF of heart rate
Inertia of heart rate
Homogeneity of heart rate
Energy of heart rate
linear correlation of heart rate and oxygen saturation
Result – Our Final Model
Method  AUC     Specificity  Sensitivity  PPV     NPV
1       0.7402  0.9500       0.3646       0.7000  0.8185
2       0.7767  0.9500       0.4615       0.9000  0.6440
3       0.8082  0.9500       0.4865       0.9000  0.6546
Method 1: Logistic Regression + all features
Method 2: Logistic Regression + all features + exploratory undersampling
Method 3: Logistic Regression + feature selection + exploratory undersampling
Current Work: Density-based LR
• Standard logistic regression uses φk(x) = xk:
– P(y=1|x) = 1/(1 + exp(−∑ wk xk))
– The probability of an event (e.g., ICU transfer, death) increases or decreases
monotonically with each feature
– This is not true in many cases: e.g., ICU transfer rate vs. age
• Idea: transform each feature xk
Current Work: Density-based LR
• Use a kernel-density estimator to estimate p(xk, y=1)
and p(xk, y=0) for each feature xk
• Resulting in a nonlinear separation plane that
conforms to the true distribution of data
• Advantages over KLR, SVM
– Efficiency, interpretability
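A sketch of what such a feature transform could look like. The slides say only that p(xk, y=1) and p(xk, y=0) are estimated with a kernel-density estimator; using the log ratio of the two estimates as the transformed feature, the Gaussian kernel, and the fixed bandwidth are all assumptions made for this illustration.

```python
import math

def kde_joint(values, n_total, bandwidth):
    """Gaussian KDE estimate of the joint density p(x, y=c), built from the
    feature values of class c; n_total counts the records of all classes."""
    norm = n_total * bandwidth * math.sqrt(2 * math.pi)
    return lambda x: sum(math.exp(-0.5 * ((x - v) / bandwidth) ** 2)
                         for v in values) / norm

def density_feature(x_pos, x_neg, bandwidth=1.0, eps=1e-12):
    """Transformed feature: log of p(x, y=1) / p(x, y=0).
    ASSUMPTION: the log-ratio form is illustrative, not taken from the slides."""
    n = len(x_pos) + len(x_neg)
    p1 = kde_joint(x_pos, n, bandwidth)
    p0 = kde_joint(x_neg, n, bandwidth)
    return lambda x: math.log((p1(x) + eps) / (p0(x) + eps))

# Usage: events cluster at mid-range values, so risk is not monotone in x.
phi = density_feature(x_pos=[4.5, 5.0, 5.5], x_neg=[1.0, 2.0, 8.0, 9.0])
```

The transformed value is high in the middle of the range and low at both ends, a pattern that a linear term in the raw feature cannot express.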
Example of Density-based LR
[Figure panels: Test data; Original LR; Density-based LR]
Future Work
• Distance-based classification algorithms for multidimensional time series
– Dynamic time warping, information distance
• Combination of feature-based and distance-based
classification algorithms
– Include distance information in the objective function
• Combining Tier-1 and Tier-2 data
– Multi-kernel methods
• Interpretation of alerts
– Based on the magnitude and sign of model coefficients
Real-Time Simulation on Historical Data
Method   AUC       SENS     PPV      NPV      ACCU
1        0.6834    0.30159  0.2345   0.9634   0.9128
1 + EMA  0.78203   0.36508  0.27059  0.96664  0.9128
2        0.74359   0.30159  0.23457  0.96342  0.9293
2 + EMA  0.777737  0.38095  0.27907  0.96342  0.92134
4        0.77689   0.38905  0.27907  0.96745  0.9336
4 + EMA  0.81411   0.39683  0.28736  0.96825  0.92212
5        0.79902   0.4127   0.29545  0.96096  0.9229
5 + EMA  0.79902   0.4127   0.29545  0.96096  0.9229
All results are at specificity = 0.95.
Model coefficients (assuming feature independence)
Feature                                   Coefficient
local homogeneity of heart rate           -14.50
standard deviation of oxygen saturation   10.20
entropy of oxygen saturation              10.17
LF of heart rate                          8.62
local homogeneity of oxygen saturation    7.77
LF/HF of oxygen saturation                4.53
inertia of heart rate                     3.86
entropy of heart rate                     2.97
low frequency of oxygen saturation        -2.89
mean of oxygen saturation                 -2.86
Why Bagging Works?
Let each $(D_i, y_i)$, $1 \le i \le m$, be a bucket sample that is independently
drawn from $(D, y)$, and let $\varphi(D_i, y_i)$ be the predictor.
The aggregated predictor is:
$\varphi_A(D, y) = E\big(\varphi(D_i, y_i)\big)$
The average prediction error in $\varphi$ is:
$e' = E\big[(y_i - \varphi(D_i, y_i))^2\big]$
The error in the aggregated predictor is:
$e = E\big[(y - \varphi_A(D, y))^2\big]$
Using the inequality $(EZ)^2 \le EZ^2$ gives us $e \le e'$.
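The inequality can be checked numerically for a single case. The sketch below compares the squared error of the averaged prediction with the average squared error of the individual predictions; the function name and the three example predictions are made up for illustration.

```python
def bagging_errors(predictions, y):
    """predictions: outputs of the individual predictors for one case.
    y: the true value. Returns (e, e_prime) for that case."""
    m = len(predictions)
    e_prime = sum((y - p) ** 2 for p in predictions) / m  # average individual error
    e = (y - sum(predictions) / m) ** 2                   # error of the aggregate
    return e, e_prime

# Usage: three predictors that disagree about a positive case (y = 1).
e, e_prime = bagging_errors([0.2, 0.6, 1.0], y=1.0)
```

The gap between the two errors equals the variance of the individual predictions, which is why more diverse bootstrap models leave more room for bagging to help.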
Algorithm details – Biased Bucket bagging (BBB)
[Figure: bar chart of standard deviation for BBB with 2, 3, 4, and 5 buckets; axis range 0 to 180,000]
A critical factor deciding how much bagging will improve accuracy is the
variance of these bootstrap models. We see that BBB with 4 buckets has the
largest difference between $\big(E\,\varphi(D_i, y_i)\big)^2$ and $E\,\varphi^2(D_i, y_i)$. In addition, BBB
with 4 buckets has the highest standard deviation in its predicted results. We
therefore choose BBB with 4 buckets as the final method.
Algorithm Details –Bucket Bagging
[Figure: bucketing diagram ending in the final model]
Result on Real-Time System
We can see that all cases attain their best performance
when the EMA parameter is around 0.06,
showing that the choice of this parameter is
robust. This small optimal
value shows that historical
records play an important
role in prediction.
[Figure: cross-validation for the EMA parameter]