An approach to predict the survival time of
childhood Acute Lymphoblastic Leukemia
(ALL) patients
Name: Rangana Jayashanka
Supervisor: Mrs. Rupika Wijesinghe
Co-Supervisor: Dr. A.R. Weerasinghe
Overview
Introduction
Objectives
Related Work
Research Design
Results and Discussion
Childhood Acute Lymphoblastic
Leukemia (ALL)
•Leukemia is a cancer of blood cells.
•Acute lymphoblastic leukemia is a cancer of the
lymphoblast.
•This is the commonest cancer among children.
•This is the most dangerous cancer among
children.
Survival Prediction
•Survival prediction is the task of predicting the
length of time that a patient will survive.
•Doctors can’t accurately predict the survival
time (prognosis) since the relationship between
health status and survivability is still unknown.
•This is a difficult task due to the complex
relationship between biological, genetic and
environmental factors in the human body.
Importance of Predicting
Survival Time
•An accurate survival model can help in the
treatment and care of patients.
•A correct prognosis can guide doctors in
planning future care and selecting treatments.
•Can show which features are more dominant in
predicting survival time.
Main Objectives
1.Identify the factors that affect the survival of
patients diagnosed with ALL.
2.Design a model to predict the survival time of
newly diagnosed ALL patients.
TNM Staging System
Burke and Henson, 1993
•Introduced in the 1960s.
•Attributes to measure cancer outcome:
1.Tumor size
2.Number of lymph nodes with
metastatic disease
3.Distant metastases
TNM Staging System
Burke and Henson, 1993
•New prognosis factors have been identified
after introducing this system.
•Several problems with the model:
1.Accuracy (breast cancer: 44%)
2.Predictive variables can’t be added
to the model
3.Cannot be applied to all cancers
Research Design
Overview of the proposed
methodology
1.Data Collection
2.Data Pre-processing
3.Feature Selection
4.Build Prediction Model (Statistical Models, Machine Learning Models)
5.Evaluation
Machine Learning Model
Evaluation
•Use 5-fold cross-validation to estimate the
performance of each predictor
Primary Criteria
-Relative Absolute Error (RAE)
-Concordance Index (CI)
Secondary Criteria
-Average L1 Error
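The slide names the criteria but not their formulas, so the following is only a sketch under stated assumptions. It takes RAE to be the mean of |actual - predicted| / actual, the CI to be the fraction of patient pairs whose predicted order matches the actual order (ignoring censoring), and the L1 error to be the mean absolute difference. The function names and the fold-splitting helper are illustrative, not taken from the original work.

```python
import random


def l1_error(actual, predicted):
    """Average absolute difference between actual and predicted times."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def relative_absolute_error(actual, predicted):
    """Mean of |actual - predicted| / actual (assumed definition of RAE)."""
    return sum(abs(a - p) / a for a, p in zip(actual, predicted)) / len(actual)


def concordance_index(actual, predicted):
    """Fraction of pairs whose predicted order matches the actual order.

    Pairs with equal actual times are skipped, tied predictions count
    as 0.5, and censoring is ignored in this sketch.
    """
    concordant, comparable = 0.0, 0
    n = len(actual)
    for i in range(n):
        for j in range(i + 1, n):
            if actual[i] == actual[j]:
                continue
            comparable += 1
            d_actual = actual[i] - actual[j]
            d_pred = predicted[i] - predicted[j]
            if d_pred == 0:
                concordant += 0.5
            elif (d_actual > 0) == (d_pred > 0):
                concordant += 1.0
    return concordant / comparable


def k_fold_indices(n, k=5, seed=0):
    """Shuffle range(n) and split it into k folds of near-equal size."""
    order = list(range(n))
    random.Random(seed).shuffle(order)
    return [order[i::k] for i in range(k)]
```

Under this definition of the CI, a predictor that returns the same value for every patient scores exactly 0.5, because every pair is a tie.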
Evaluation
•Measure the performances of classifiers using
–Accuracy
–Sensitivity
–Specificity
•Visualization
–Plot the actual survival time and predicted survival
time on the same graph
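As a minimal sketch, the three classifier measures listed above can be computed from boolean labels as follows. The function name is illustrative, and treating True as the positive class is an assumption.

```python
def classification_scores(actual, predicted):
    """Return (accuracy, sensitivity, specificity) for boolean labels.

    True is treated as the positive class. Sensitivity or specificity
    is None when the corresponding class does not occur in `actual`.
    """
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a and p)
    tn = sum(1 for a, p in pairs if not a and not p)
    fp = sum(1 for a, p in pairs if not a and p)
    fn = sum(1 for a, p in pairs if a and not p)
    accuracy = (tp + tn) / len(pairs)
    sensitivity = tp / (tp + fn) if tp + fn else None
    specificity = tn / (tn + fp) if tn + fp else None
    return accuracy, sensitivity, specificity
```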
Results and Discussion
Machine Learning models
Feature selection
•Exhaustive sequential backward selection method
used
•Features selected:
–Age group (age > 10, age < 10)
–Cranial RT
–First Lumbar Puncture test results
–Prednisolone Response
–Regimen
–WBC count
–ALL subtype
–Relapse
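The slide does not describe the selection procedure in detail. The sketch below shows the usual greedy form of sequential backward selection: start from all features and repeatedly drop the one whose removal gives the best score. The `score` callback (lower is better, for example a cross-validated error) and the function name are assumptions for illustration.

```python
def backward_select(features, score, min_features=1):
    """Greedy sequential backward selection.

    `score(subset)` returns a number where lower is better. Starting
    from all features, drop the feature whose removal gives the lowest
    score, and stop once every possible removal makes the score worse.
    A removal that leaves the score unchanged is accepted, which
    favours smaller feature sets.
    """
    selected = list(features)
    best = score(selected)
    while len(selected) > min_features:
        candidates = [
            (score([f for f in selected if f != drop]), drop)
            for drop in selected
        ]
        cand_score, drop = min(candidates, key=lambda c: c[0])
        if cand_score > best:
            break
        best = cand_score
        selected.remove(drop)
    return selected, best
```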
Baseline
Take average or median value of the entire
population as the predicted survival time for
each individual patient
Regression Method   RAE     CI      L1 < 12     RAE > 12
Mean Value          0.457   0.500   undefined   0.457
Average Value       0.531   0.500   undefined   0.531
Baseline
[Figure: plots for Median and Average]
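The baseline described above can be sketched in a few lines. The function name and the choice to compute the statistic on a training split are assumptions.

```python
from statistics import mean, median


def baseline_predictions(train_times, n_test, use_median=False):
    """Predict the same value for every test patient.

    The value is the mean (or median) survival time of the training
    patients, so the prediction ignores all patient features.
    """
    value = median(train_times) if use_median else mean(train_times)
    return [value] * n_test
```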
Machine Learning algorithms for
whole data set
•We applied 2 main machine learning
approaches
1. Multiple Linear Regression (MLR)
2. Support Vector Regression (SVR)
Regression Method   RAE     CI      L1 < 12     RAE > 12
MLR                 0.298   0.537   0.832       0.380
SVR                 0.280   0.543   undefined   0.280
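For orientation, a minimal sketch of multiple linear regression fitted by the normal equations is shown below. It is not the implementation used in this work, SVR is not shown, and all function names are illustrative. It assumes the features are not perfectly collinear, since otherwise the linear system has no unique solution.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Partial pivoting: move the largest remaining entry to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_mlr(rows, targets):
    """Return least-squares coefficients [intercept, b1, ..., bd]."""
    design = [[1.0] + list(r) for r in rows]
    d = len(design[0])
    xtx = [[sum(x[i] * x[j] for x in design) for j in range(d)]
           for i in range(d)]
    xty = [sum(x[i] * t for x, t in zip(design, targets)) for i in range(d)]
    return solve(xtx, xty)


def predict_mlr(coef, row):
    """Predicted survival time for one patient's feature values."""
    return coef[0] + sum(c * v for c, v in zip(coef[1:], row))
```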
Machine Learning algorithms for
whole data set
[Figure: plots for MLR and SVR]
Grouping
•We used two different grouping approaches in
this work
1.Group according to the Risk categorization
2.Classification and Regression Trees (CART)
Group according to the Risk
categorization
Medical doctors categorize patients into 3 main
risk groups
1. Standard Risk Group (SRG)
2. High Risk Group (HRG)
3. Very High Risk Group (VHRG)
Multiple Linear Regression
[Figure: plots for SRG, HRG and VHRG]
Support Vector Regression
[Figure: plots for SRG, HRG and VHRG]
Classification and Regression
Trees (CART)
Place a patient into a sub group and then predict
that the patient will live the average or median
of that sub group
Regression Method   RAE     CI      L1 < 12     RAE > 12
Average             0.276   0.535   undefined   0.276
Median              0.274   0.525   undefined   0.274
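To illustrate the idea of placing a patient into a subgroup and predicting that subgroup's average, the sketch below builds a single regression split on one numeric feature, which is the basic step a regression tree repeats. It is a simplified stand-in, not the tree used in this work, and the function names are hypothetical.

```python
from statistics import mean


def best_split(values, times):
    """Find the threshold that minimises the squared error of two leaf means.

    Returns (threshold, left_mean, right_mean), where the left leaf
    holds patients with value < threshold.
    """
    def sse(group):
        m = mean(group)
        return sum((t - m) ** 2 for t in group)

    best = None
    for threshold in sorted(set(values))[1:]:
        left = [t for v, t in zip(values, times) if v < threshold]
        right = [t for v, t in zip(values, times) if v >= threshold]
        cost = sse(left) + sse(right)
        if best is None or cost < best[0]:
            best = (cost, threshold, mean(left), mean(right))
    if best is None:
        raise ValueError("need at least two distinct feature values")
    return best[1:]


def predict_from_split(split, value):
    """Predict the mean survival time of the leaf the patient falls into."""
    threshold, left_mean, right_mean = split
    return left_mean if value < threshold else right_mean
```

Replacing `mean` with `median` for the leaf values gives the median variant.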
Combination of CART and
Regressions
•Segregate the patients using the classification
and regression tree.
•Apply the learning algorithms at each node of the
resulting tree.
Regression Method   RAE     CI      L1 < 12     RAE > 12
MLR                 0.248   0.654   4.407       0.253
SVR                 0.212   0.661   4.234       0.223
Outlier Detection
•Outliers are the patients who are extremely
different from the rest of the population.
•Eliminate the 5% of instances with the highest
Mahalanobis distance.
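A minimal sketch of this step, assuming the distance is measured from the sample mean using the sample covariance of numeric features. The function names are illustrative, and the covariance matrix must be non-singular.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def mahalanobis_sq(rows):
    """Squared Mahalanobis distance of every row from the sample mean."""
    n, d = len(rows), len(rows[0])
    mu = [sum(r[j] for r in rows) / n for j in range(d)]
    centred = [[r[j] - mu[j] for j in range(d)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centred) / (n - 1) for j in range(d)]
           for i in range(d)]
    # diff^T * inverse(cov) * diff, computed without forming the inverse.
    return [sum(x * y for x, y in zip(c, solve(cov, c))) for c in centred]


def drop_outliers(rows, fraction=0.05):
    """Remove the given fraction of rows with the largest distance."""
    dist = mahalanobis_sq(rows)
    n_drop = round(len(rows) * fraction)
    ranked = sorted(range(len(rows)), key=lambda i: dist[i])
    keep = ranked[: len(rows) - n_drop]
    return [rows[i] for i in sorted(keep)]
```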
Multiple Linear Regression after
removing outliers
[Figure: plots for HR, SR and VHR]
Support Vector Regression after
removing outliers
[Figure: plots for HR, SR and VHR]
Classification
Classify patients into “long survivor” versus
“short survivor”, where the classification
boundary is the average survival time of the
entire population.
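The labelling rule can be sketched as below. The slide does not say how a time exactly equal to the boundary is treated, so counting it as a "short survivor" is an assumption, as is the function name.

```python
from statistics import mean


def label_long_survivors(times, boundary):
    """Return True for "long survivor" (time above the boundary)."""
    return [t > boundary for t in times]


# Hypothetical survival times in months, for illustration only.
actual_times = [10, 20, 60]
boundary = mean(actual_times)
actual_labels = label_long_survivors(actual_times, boundary)
# A predictor that returns the population average for every patient
# never exceeds the boundary, so it labels nobody a long survivor.
constant_labels = label_long_survivors([boundary] * 3, boundary)
```

Under this convention a constant average prediction has a sensitivity of 0 and a specificity of 1.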
Classification
Regression Model             Group            Accuracy   Sensitivity   Specificity
Baseline - Average           Whole Data Set   0.5234     0.0000        1.0000
Baseline - Median            Whole Data Set   0.5723     0.1267        0.8565
Multiple Linear Regression   SRG              0.7189     0.8246        0.6245
Multiple Linear Regression   HRG              0.7267     0.7637        0.7253
Multiple Linear Regression   VHRG             0.7392     0.7862        0.7154
Support Vector Regression    SRG              0.7115     0.8084        0.6025
Support Vector Regression    HRG              0.7337     0.7634        0.7097
Support Vector Regression    VHRG             0.6955     0.8084        0.6025
Combined Model               Whole Data Set   0.7512     0.7612        0.7174
Classification
After handling censored data, eliminating
outliers and grouping, most predictors were able
to achieve at least 70% accuracy.
Conclusion
•Multiple Linear Regression and Support Vector
Regression are effective
•The performance of survival prediction models
improves after segregating the patients into
groups
•For the classification task, most of the classifiers
were able to achieve at least 70% accuracy
Conclusion
•We can effectively predict childhood acute
lymphoblastic leukemia patients’ survival times
by taking the following steps:
1. Processing Data
2. Segregating Patients
3. Handling Censoring
4. Eliminating Outliers
5. Learning Predictors
Future Work
•Explore better techniques to handle censored
data
•Collect more data for analysis
•Study kernel functions for the support vector
regression, such as:
–Gaussian Radial Basis
–Polynomial of degree 2
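For reference, the two kernels named above have the following standard forms. The parameter names `gamma` and `coef0` and their default values are assumptions for illustration and are not settings from this work.

```python
import math


def rbf_kernel(x, y, gamma=1.0):
    """Gaussian radial basis kernel: exp(-gamma * ||x - y||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def poly2_kernel(x, y, coef0=1.0):
    """Polynomial kernel of degree 2: (x . y + coef0)^2."""
    dot = sum(a * b for a, b in zip(x, y))
    return (dot + coef0) ** 2
```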