Machine Learning in
Computational Biology
CSC 2431
Lecture 11: Cancer
Instructor: Anna Goldenberg
Luminaries in the cancer field
Bert Vogelstein
Robert Weinberg
Cancer – definition
Neoplasm (from Ancient Greek νεο- neo- "new" and πλάσμα plasma
"formation, creation") is an abnormal
growth of tissue, and when also forming a
mass is commonly referred to as a tumor or
tumour. This abnormal growth (neoplasia)
usually but not always forms a mass.
Malignant neoplasms are called cancer.
Wikipedia =)
Clonality of cancer
—  Only one cell needs to become malignant
—  Form clonal populations
—  All genetically derived from a single clone population
Weinberg, 2014
Functions important for cancer growth
Uncontrolled cell proliferation
—  Cancer cells require relatively few growth factors to proliferate.
—  Why: Cancer cells themselves release growth factors into the medium and have a receptor for them as well, creating a positive feedback loop (autocrine signaling or autocrine stimulation)
◦  Glioblastoma - PDGF
◦  Sarcoma – TGF-alpha, EGFR
—  Overexpressed growth receptors: 30-40% of epithelial growth receptors (EGFR) are overexpressed, causing random interactions and creating a spontaneous signal (not needing the growth factor itself)
Loss of Apoptosis
Inhibitor of Apoptosis Protein (IAP)
Bcl-2 family
Unbalancing these factors prevents apoptosis
Tissue invasion and metastasis
1. Paving the path (figure labels: extracellular space, MMP)
2. Enter blood stream
3. Enter target tissue: a) Weak adhesion; b) Roll; c) Stronger Attachment; d) Enter
Angiogenesis
—  Additional vasculature
—  Excrete proteins
—  Stimulate blood vessel growth
MMP degrades the extracellular matrix
MMP and other factors make the vessels more permeable
Extracellular space
Angiogenesis
1.  VEGF – pro-angiogenic factor
2.  Stimulates angiogenesis in the vessel
3.  Activates proteins necessary for new blood vessels to form
Angiogenesis result
Allow more blood/nourishment into the tumor
Invasiveness
Hallmarks of cancer
(Hanahan and Weinberg, 2000)
1. Unlimited proliferation
2. Lack of response to inhibitory signals
3. Resistance to programmed cell death
4. Counting mechanism, embedded in the telomere. Stem cells and developing cells avoid that. Cancer cells figure out how to preserve telomere length as well
5. Tumors in organs – tumors turn on angiogenesis
6. Invasiveness, metastasis, immortalized proliferation. Co-opting processes and subverting standard functions to their own processes
Updated hallmarks
(Hanahan and Weinberg, 2011)
—  No general counter-arguments that the original hallmarks were not correct
—  Emerging hallmarks:
◦  Reprogramming of the cellular energetics of metabolism (e.g. in some cancers there are mutations that don’t affect cell division but do affect metabolism)
◦  Immune system may act as a barrier to cancer development in some cancers:
–  Cytotoxic T cells in cancers are good for prognosis
Hallmarks
Invasion
Computational questions that are
asked in the cancer field
—  Cancer classification
—  Subtyping of cancer
◦  Intra-tumor heterogeneity
◦  Inter-tumor heterogeneity
—  Cancer biomarkers
◦  Associative
◦  Causal
◦  Single cell
—  Evolution of cancer clonality
—  Drug response
◦  Chemotherapy
◦  Targeted treatment
◦  Immunotherapies
—  Metastasis, reoccurrence
—  Prevention
Cancer Classification
—  Golub et al, Science, 1999: AML vs ALL. Single gene rules
—  Hoadley et al, Cell, 2014: 12 cancers. 3000+ samples. Cluster of clusters
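To make the AML vs ALL example concrete, here is a minimal sketch of gene ranking by a signal-to-noise score, (mean_1 - mean_2) / (sd_1 + sd_2), followed by a weighted vote over the top genes, in the spirit of Golub et al. The gene names, expression values, and function names are made up for illustration and are not taken from the paper.

```python
from statistics import mean, stdev

def signal_to_noise(values_1, values_2):
    """(mean_1 - mean_2) / (sd_1 + sd_2) for one gene across two classes."""
    return (mean(values_1) - mean(values_2)) / (stdev(values_1) + stdev(values_2))

def train(expr_aml, expr_all, n_genes):
    """expr_*: dict gene -> list of expression values in that class.
    Keeps the n_genes with the largest absolute signal-to-noise score and
    returns a list of (gene, weight, midpoint) triples."""
    scored = []
    for gene in expr_aml:
        weight = signal_to_noise(expr_aml[gene], expr_all[gene])
        midpoint = (mean(expr_aml[gene]) + mean(expr_all[gene])) / 2
        scored.append((gene, weight, midpoint))
    scored.sort(key=lambda t: abs(t[1]), reverse=True)
    return scored[:n_genes]

def predict(model, sample):
    """Each gene votes weight * (expression - midpoint); a positive total
    favours AML, a negative total favours ALL."""
    total = sum(weight * (sample[gene] - mid) for gene, weight, mid in model)
    return "AML" if total > 0 else "ALL"

# Toy, invented expression values (hypothetical genes g1..g3)
expr_aml = {"g1": [9.0, 8.5, 9.4], "g2": [2.0, 2.2, 1.9], "g3": [5.0, 5.2, 4.9]}
expr_all = {"g1": [3.0, 3.3, 2.8], "g2": [7.5, 8.0, 7.7], "g3": [5.1, 4.8, 5.0]}

model = train(expr_aml, expr_all, n_genes=2)
print([gene for gene, _, _ in model])
print(predict(model, {"g1": 9.0, "g2": 2.0, "g3": 5.0}))
```

The uninformative gene g3 gets a score near zero and is dropped, so only genes that separate the two classes take part in the vote.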
Cancer Subtyping
—  Perou et al, Nature 2000: Hierarchical Clustering, Breast cancer (BC)
—  Brunet et al, PNAS 2004: Non-negative matrix Factorization, Leukemia
—  Paquet, JNCI, 2015: Absolute Intrinsic Molecular Subtyping (AIMS), 4294 BC
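As an illustration of the hierarchical clustering approach to subtyping, the following is a small sketch of average-linkage agglomerative clustering with a correlation-based distance. The choice of linkage and distance, the sample names, and the expression vectors are assumptions for the example, not the settings or data of the cited studies.

```python
from itertools import combinations
from statistics import mean

def pearson_distance(x, y):
    """1 - Pearson correlation between two expression vectors."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return 1 - cov / (var_x * var_y) ** 0.5

def average_linkage(profiles, n_clusters):
    """profiles: dict sample -> expression vector.
    Repeatedly merges the two clusters with the smallest mean pairwise
    distance until n_clusters remain. Returns a list of sets of samples."""
    clusters = [{name} for name in profiles]

    def linkage(c1, c2):
        return mean(pearson_distance(profiles[a], profiles[b])
                    for a in c1 for b in c2)

    while len(clusters) > n_clusters:
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]))
        merged = clusters[i] | clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters

# Invented tumour profiles over four hypothetical genes
profiles = {
    "s1": [8, 7, 1, 2],
    "s2": [9, 8, 2, 1],
    "s3": [1, 2, 9, 8],
    "s4": [2, 1, 8, 7],
}
subtypes = average_linkage(profiles, n_clusters=2)
print(sorted(sorted(c) for c in subtypes))
```

Cutting the tree at two clusters groups s1 with s2 and s3 with s4, because their profiles are strongly correlated within each pair and anti-correlated across pairs.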
Biomarkers
Ewing, 1921
Diffuse endothelioma of bone (1921) ->
Ewing’s Sarcoma
“A fourteen-year-old girl had been treated
by an outside physician in 1918 for nasal
discharge and occasional bleeding. In
November, 1918, while pulling on a rope, a
spontaneous fracture of the ulna occurred,
followed by swelling which gradually
subsided....”
Radiographic and microscopic structures
were nearly identical in the 7 cases
published in the study.
“The main point of the present
communication lies in the demonstration
that there is a rather common tumor
occurring in young subjects, commonly
identified with osteogenic sarcoma, and
usually called round cell sarcoma, which is
really of endothelial origin, and which is
marked by such peculiar gross anatomical,
clinical, and therapeutic features as to
constitute a specific neoplastic disease of
bone.”
2006: EWS gene on chromosome 22 and ETS-type FLI1 gene on chr 11 are implicated in more than 95% of Ewing's sarcomas.
Fusions are used as biomarkers
Chemicals can also cause cancer
Weinberg, 2014
Carcinogens induce cancer through mutations
Weinberg, 2014
Biomarkers
—  Chuang et al, MSB, 2007: Finds differentially expressed subnetworks, using Mutual Information based score
—  Hart et al, Nature Methods, 2015: Pareto analysis determines low dimensional polytope embedding to define ‘tasks’
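One way to read "Mutual Information based score" is sketched below: summarise a candidate subnetwork by its mean expression per sample, discretise that activity, and compute its mutual information with the class label. This is a simplified sketch under those assumptions and not necessarily the exact procedure of the cited paper; the gene names, values, labels, and helper names are hypothetical.

```python
from collections import Counter
from math import log2
from statistics import mean

def mutual_information(xs, ys):
    """Mutual information (in bits) between two discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px, py = Counter(xs), Counter(ys)
    return sum((c / n) * log2((c / n) / ((px[x] / n) * (py[y] / n)))
               for (x, y), c in joint.items())

def subnetwork_activity(expression, genes):
    """Mean expression of the subnetwork's genes in each sample."""
    n_samples = len(next(iter(expression.values())))
    return [mean(expression[g][i] for g in genes) for i in range(n_samples)]

def discretize(values, n_bins=2):
    """Equal-width binning of continuous activity values."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0
    return [min(int((v - lo) / width), n_bins - 1) for v in values]

def score(expression, genes, labels):
    activity = discretize(subnetwork_activity(expression, genes))
    return mutual_information(activity, labels)

# Invented data: six samples, labels 0 = control, 1 = case
labels = [0, 0, 0, 1, 1, 1]
expression = {
    "geneA": [1.0, 1.2, 0.9, 3.0, 3.1, 2.9],
    "geneB": [0.8, 1.1, 1.0, 3.2, 2.8, 3.0],
    "geneC": [2.0, 1.0, 2.0, 1.0, 2.0, 1.0],
}
print(score(expression, ["geneA", "geneB"], labels))
print(score(expression, ["geneC"], labels))
```

The subnetwork made of geneA and geneB separates the two classes and scores higher than geneC, whose expression is almost unrelated to the label.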
Drivers vs Passengers
Vogelstein, breakthroughs in cancer lecture
Evolution of cancer clonality
Clonal Theory of Cancer (Nowell 1976)
—  Initial oncogenic driver mutation (but normal cell already has at least 50 passenger mutations)
—  1st subclone. New mutation provides a selection advantage over ancestral subpopulation
—  Subsequent subclones gain further selection advantages and can arise in parallel
Figure: subpopulation compositions of the tumour over time
Slide credit: A Deshwar
Brosnan & Iacobuzio-Donahue, 2012
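The idea that each new subclone outgrows its ancestors can be sketched with a toy deterministic growth model. The growth factors, arrival times, and clone names below are hypothetical, and the model ignores cell death, spatial structure, and the cell removed from the parent clone when a subclone is seeded.

```python
def clone_fractions(fitness, arrival, generations, seed_size=1.0):
    """fitness: dict clone -> growth factor per generation (hypothetical).
    arrival: dict clone -> generation at which the clone first appears.
    Returns one dict per generation mapping clone -> fraction of the tumour."""
    sizes = {clone: 0.0 for clone in fitness}
    history = []
    for g in range(generations):
        for clone in fitness:
            if arrival[clone] == g:
                sizes[clone] = seed_size          # clone is founded
            elif arrival[clone] < g:
                sizes[clone] *= fitness[clone]    # clone expands
        total = sum(sizes.values())
        history.append({clone: s / total for clone, s in sizes.items()})
    return history

fitness = {"founder": 1.1, "subclone1": 1.3, "subclone2": 1.5}
arrival = {"founder": 0, "subclone1": 10, "subclone2": 25}
history = clone_fractions(fitness, arrival, generations=60)

for g in (0, 20, 40, 59):
    print(g, {clone: round(f, 3) for clone, f in history[g].items()})
```

Printing a few generations shows the tumour starting as pure founder clone and the later, fitter subclones taking over the composition, which is the pattern drawn in the subpopulation figure.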
Tumour genome sequencing produces VAF
clusters for simple somatic mutations (SSMs)
[Figure: number of SSMs (y-axis) against variant allele frequency (x-axis, 0.1 to 0.6), with mutation sets marked]
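A minimal sketch of how VAF clusters could be recovered from read counts: compute VAF as variant reads over total reads for each SSM, then group the values with one-dimensional k-means. The read counts and starting centres are invented, and k-means is a stand-in chosen for brevity rather than the method of any specific tool.

```python
def vaf(variant_reads, total_reads):
    """Variant allele frequency of one SSM."""
    return variant_reads / total_reads

def kmeans_1d(values, centers, n_iter=50):
    """Lloyd's algorithm in one dimension.
    Returns (final centers, cluster index for each value)."""
    centers = list(centers)

    def nearest(v):
        return min(range(len(centers)), key=lambda k: abs(v - centers[k]))

    for _ in range(n_iter):
        assign = [nearest(v) for v in values]
        updated = []
        for k, old in enumerate(centers):
            members = [v for v, a in zip(values, assign) if a == k]
            updated.append(sum(members) / len(members) if members else old)
        if updated == centers:   # converged
            break
        centers = updated
    return centers, [nearest(v) for v in values]

# Invented (variant reads, total reads) pairs for eight SSMs
counts = [(48, 100), (52, 100), (45, 100), (22, 100),
          (18, 100), (20, 100), (9, 100), (11, 100)]
vafs = [vaf(v, t) for v, t in counts]
centers, assignment = kmeans_1d(vafs, centers=[0.1, 0.2, 0.5])
print([round(c, 3) for c in centers])
print(assignment)
```

Each resulting cluster plays the role of one mutation set in the histogram: SSMs with similar VAF are assumed to have arisen in the same subpopulation.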
Evolution of cancer clonality
—  Roth et al, PyClone, Nature Methods, 2014: LDA-type model
—  Deshwar et al, PhyloWGS, arXiv, 2015: Phylogenetic tree, incorporates CNVs and mutations
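Both tools reason about the fraction of cancer cells that carry a mutation, which differs from the raw VAF once tumour purity and copy number are taken into account. The sketch below shows only that point-estimate conversion under simple assumptions (all tumour cells share the same copy number at the locus, and the number of mutated copies is known). It is not the model used by PyClone or PhyloWGS, and the function and parameter names are hypothetical.

```python
def cellular_prevalence(vaf, purity, tumour_copies=2, mutated_copies=1,
                        normal_copies=2):
    """Estimated fraction of cancer cells carrying the mutation.

    Expected VAF = purity * prevalence * mutated_copies / average copies,
    where average copies = purity * tumour_copies
                           + (1 - purity) * normal_copies.
    The estimate is capped at 1.0 because sampling noise can push it higher.
    """
    average_copies = purity * tumour_copies + (1 - purity) * normal_copies
    return min(1.0, vaf * average_copies / (purity * mutated_copies))

# Heterozygous mutations in a diploid region, tumour purity 80% (invented)
print(cellular_prevalence(0.40, purity=0.8))   # close to 1: clonal
print(cellular_prevalence(0.20, purity=0.8))   # close to 0.5: subclonal
```

In the diploid, heterozygous case this reduces to prevalence = 2 * VAF / purity, which is why a clonal mutation in a pure tumour sits near a VAF of 0.5.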
Evolution of cancer clonality
The lifetime course of cancer
Other computational tasks
important in helping to fight cancer
—  Diagnostic image analysis
—  Differential networks
—  Somatic variant detection
◦  Current accuracy is really low – 50% false positive rates (see DREAM challenges)
◦  Tools are also very computationally inefficient, taking weeks to run
◦  Structural aberrations – numerous and complex. Methods to find them - relatively simplistic
—  Models of drug-target interactions
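For the somatic variant detection item, accuracy figures like the one quoted are obtained by comparing a caller's output with a truth set. A minimal sketch follows, reading "false positive rate" as the share of reported calls that are wrong (an assumption about the slide's meaning); the variants are invented tuples of (chromosome, position, reference allele, alternate allele).

```python
def evaluate_calls(called, truth):
    """Compare called variants with a truth set; both are iterables of
    hashable variant keys such as (chrom, pos, ref, alt)."""
    called, truth = set(called), set(truth)
    tp = len(called & truth)   # called and real
    fp = len(called - truth)   # called but not real
    fn = len(truth - called)   # real but missed
    precision = tp / (tp + fp) if called else 0.0
    recall = tp / (tp + fn) if truth else 0.0
    return {
        "tp": tp, "fp": fp, "fn": fn,
        "precision": precision,
        "recall": recall,
        "false_discovery_rate": (fp / (tp + fp)) if called else 0.0,
    }

truth = [("chr1", 1000, "A", "T"), ("chr2", 500, "G", "C"),
         ("chr7", 42, "C", "T")]
called = [("chr1", 1000, "A", "T"), ("chr2", 500, "G", "C"),
          ("chr3", 77, "T", "G"), ("chr9", 12, "A", "C")]

report = evaluate_calls(called, truth)
print(report)
```

In this toy case half of the reported calls are absent from the truth set, so the false discovery rate is 0.5 while recall is two out of three.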
The lifetime course of cancer
Weinberg, 2014
Takes 30 years for the cancer to “start”. Best weapon is prevention!
Bert Vogelstein
Next class
—  Discussion and conclusions
—  Start preparing the final project reports