Genome-Wide Mutational Analyses of Human Cancers:
Lessons Learned From Sequencing Cancer Genomes
Will Parsons, M.D., Ph.D.
Ludwig Center for Cancer Genetics and Therapeutics
The Sidney Kimmel Cancer Center
Johns Hopkins University
Sept 5, 2008
Overview
I. Background and overview of cancer genome studies
II. Lessons from prior analyses of cancer genomes
III. Results and implications of the current brain cancer study
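Cancer is a genetic disease
[Figure: colorectal tumor progression over 30 to 40 years]
Stages: Normal epithelium -> Dysplastic ACF -> Early adenoma -> Intermediate adenoma -> Late adenoma -> Carcinoma -> Metastasis
Genetic alterations along the progression: APC/β-catenin, K-RAS, 18q, p53, other changes?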
Cancer genotype directed
therapies

Gleevec (imatinib)
– CML (BCR-ABL)
– Gastrointestinal Stromal Tumors (c-KIT)

Herceptin (trastuzumab)
– Breast Cancer (HER-2)

Iressa (gefitinib) and Tarceva (erlotinib)
– NSCLC (EGFR)
What we know about cancer genetics
[Figure: pictorial equation; recoverable terms: high throughput sequencing (>10 million bp per day), "+", "$$", "="]
Methods to identify mutations
Pre-genome
Candidate approach
Post-genome
High throughput
Mutational analysis of signaling
pathways in colorectal cancer
– 138 protein tyrosine kinases: Bardelli et al., Science 300:949 (2003)
– 16 phosphatidylinositol 3-kinases: Samuels et al., Science 304:554 (2004)
– 87 protein tyrosine phosphatases: Wang et al., Science 304:1164 (2004)
– 200 chromosomal instability genes: Wang et al., Cancer Res 64:2998 (2004)
– 350 serine / threonine kinases: Parsons et al., Nature 436:792 (2005)
Analyzed in a collection of colorectal and other human
tumors
High frequency of mutations of the
PI3-kinase PIK3CA in human cancer
Tumor                    Fraction mutated
Colorectal cancer        74/234 (32%)
Breast cancer            13/53; 1/12 (8%)
Hepatocellular cancer    26/73
Brain cancer             4/15 (27%)
Gastric cancer           3/12 (25%)
Lung cancer              1/24 (4%)
[Figure: bar chart of fraction mutated by tumor type (bar labels: 32%, 27%, 35%, 27%, 25%, 4%) and PIK3CA diagram (labels: C2, 8%, 47%, 33%)]
Samuels et al., Science 304, 554 (2004),
Bachman et al., CBT 3 e49 (2004), Broderick et al., Can Res 64, 5048
(2004), Lee et al., Oncogene 24, 1477 (2005)
Mutations of PI3K pathway genes
in colorectal cancer
Parsons et al. Nature 436: 792 (2005)
Goals for “Cancer Genomics”

To develop a strategy for unbiased genome-wide
analyses of cancer genes in human tumors

To determine the spectrum and extent of somatic
mutations in human tumors of similar and different
histologic types

To identify new cancer genes for basic research and
improvements in diagnosis, prevention, and therapy
Genome-wide mutational analyses
Discovery Screen
1. Select gene set and tumors
2. Design primers
3. PCR amplify coding exons from samples of tumor DNA
4. Dye terminator sequencing
5. Find tumor-specific mutations
Validation Screen
6. Validate mutated genes in larger panel of additional tumors
7. Compare gene mutation frequency to expected background
Outcome: candidate cancer genes versus genes with passenger mutations
Driver vs. Passenger
mutations
Driver mutations – provide a net
growth advantage and are positively
selected for during tumorigenesis
 Passenger mutations – neutral
mutations that provide no advantage
to the tumor

Mutation Prioritization
1. Frequency
2. Type
3. Predicted effects
4. Structural models
5. Analogous mutations
6. Functional studies
Evaluating Genes based on
Mutation Frequency

CaMP Score
– Metric used to rank genes based on their mutation frequency
and type
– Takes account of number of mutations, length and nucleotide
content of gene, context of mutations

Can use statistical methods to determine the likelihood
that genes with CaMP scores over a threshold are
mutated at a frequency higher than background
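The comparison against background can be illustrated with a simple binomial tail probability. This is only a minimal sketch and is not the CaMP score: it ignores mutation type, nucleotide content, and sequence context, and the function names, background rate, gene length, and counts below are hypothetical values chosen for illustration.

```python
from math import comb


def prob_at_least(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p), computed as 1 - P(X <= k - 1)."""
    below = sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k))
    return max(0.0, 1.0 - below)


def passenger_tail_probability(observed: int, coding_bp: int,
                               tumors: int, rate_per_bp: float) -> float:
    """Chance of seeing at least `observed` mutations in a gene if every
    mutation arose at the background (passenger) rate.

    Simplifying assumption: each sequenced base in each tumor is an
    independent trial with the same mutation probability.
    """
    trials = coding_bp * tumors
    return prob_at_least(observed, trials, rate_per_bp)


# Hypothetical gene: 1,500 coding bases, 22 tumors, 5 observed mutations,
# assumed background rate of 1e-6 mutations per base.
p_value = passenger_tail_probability(observed=5, coding_bp=1500,
                                     tumors=22, rate_per_bp=1e-6)
print(f"tail probability under background: {p_value:.3g}")
```

A small tail probability means the observed count is unlikely under the background model alone, which is the intuition behind ranking genes by mutation frequency.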
Overview
I. Background and overview of cancer genome studies
II. Lessons from prior analyses of cancer genomes
III. Results and implications of the current brain cancer study
What tumors?
Breast and Colon cancers
2004 Estimated US Cancer Cases*
Men (699,560)                      Women (668,470)
Prostate                  33%      Breast                    32%
Lung & bronchus           13%      Lung & bronchus           12%
Colon & rectum            11%      Colon & rectum            11%
Urinary bladder            6%      Uterine corpus             6%
Melanoma of skin           4%      Ovary                      4%
Non-Hodgkin lymphoma       4%      Non-Hodgkin lymphoma       4%
Kidney                     3%      Melanoma of skin           4%
Oral Cavity                3%      Thyroid                    3%
Leukemia                   3%      Pancreas                   2%
Pancreas                   2%      Urinary bladder            2%
All Other Sites           18%      All Other Sites           20%
*Excludes basal and squamous cell skin cancers and in situ carcinomas except urinary bladder.
Source: American Cancer Society, 2004.
What genes?
Protein-coding genes in CCDS and RefSeq
Gene sets:
– Consensus Coding Sequences (CCDS): ~13,000 genes
– RefSeq: ~18,500 genes
– Ensembl: ~21,500 genes
CCDS criteria:
– Identical in RefSeq and Ensembl
– Canonical start / stop codons
– Cross-species conservation
– Consensus splice sites
– Translatable from reference genome without frameshift (fs) or stop
Lessons learned – 1
Mutations and candidate cancer genes

Many genes are mutated in these solid tumors
[Figure: mutations per tumor (axis 0 to 120) for tumors 1 to 11; series: total mutations, non-silent mutations, CAN-gene mutations]
Lessons learned – 1
Mutations and candidate cancer genes


Many genes are mutated in these solid tumors
Vast majority of previously known breast and
colon cancer genes were identified
Genes known to be mutated in breast
and colorectal cancers are CAN-genes
Mutation frequency   Breast cancers   Colon cancers
>10%                 TP53, PIK3CA     TP53, APC, KRAS, PIK3CA, SMAD4, FBXW7 (CDC4)
<10%                 MRE11, BRCA1     EPHA3, NF1, SMAD2, SMAD3, TCF7L2 (TCF4), TGFBRII
Lessons learned – 1
Mutations and candidate cancer genes




Many genes are mutated in these solid tumors
Vast majority of previously known breast and
colon cancer genes were identified
Many new breast and colon CAN-genes were
discovered
New CAN-genes are likely to exist in other
tumor types
The majority of CAN-genes had not
previously been implicated in cancer
[Figure: pie charts of prior evidence for cancer involvement among CAN-genes in breast cancers (n=122 genes) and colon cancers (n=69 genes); categories: Mutation, Translocation, Amplification, Deletion, Methylation, Expression, Not known]
Lessons learned – 2
Genomic landscape of cancers

More genes involved in cancer than previously
anticipated – few “mountains”, many “hills”
Top colon CAN-genes
Gene       Name                                                          CaMP score
APC        adenomatosis polyposis coli                                   >10
KRAS       v-Ki-ras2 Kirsten rat sarcoma viral oncogene homolog          >10
TP53       tumor protein p53                                             >10
PIK3CA     phosphoinositide-3-kinase, catalytic, alpha polypeptide       >10
FBXW7      F-box and WD-40 domain protein 7                              9.6
NAV3       neuron navigator 3                                            8.0
EPHA3      EPH receptor A3                                               7.1
MAP2K7     mitogen-activated protein kinase kinase 7                     7.0
SMAD4      SMAD, mothers against DPP homolog 4                           6.0
ADAMTSL3   ADAMTS-like 3                                                 5.9
GUCY1A2    guanylate cyclase 1, soluble, alpha 2                         5.8
OR51E1     olfactory receptor, family 51, subfamily E, member 1          5.6
TCF7L2     transcription factor 7-like 2 (TCF4)                          5.2
ADAMTS18   ADAM metallopeptidase with thrombospondin type 1 motif, 18    5.0
SEC8L1     exocyst complex component 4                                   4.7
RET        ret proto-oncogene                                            4.6
PTEN       phosphatase and tensin homolog                                4.5
MMP2       matrix metallopeptidase 2                                     4.3
GNAS       GNAS complex locus                                            4.3
TGM3       transglutaminase 3                                            4.0
Slide annotation: mutated in <1-5% of cancers
Landscape of colon cancers
[Figure: mutational landscape of colon cancers; labeled genes: FBXW7, TP53, PIK3CA, KRAS, APC]
Lessons learned – 2
Genomic landscape of cancers


More genes involved in cancer than previously
anticipated – few “mountains”, many “hills”
There is significant heterogeneity between
individual tumors (even of the same type)
Landscape of a single colon cancer
[Figure: mutational landscape of a single colon cancer; labeled genes: FBXW7, TP53, PIK3CA, KRAS, APC]
Lessons learned – 2
Genomic landscape of cancers



More genes involved in cancer than previously
anticipated – few “mountains”, many “hills”
There is significant heterogeneity between
individual tumors (even of the same type)
Simpler gene groups and pathways emerge
when mutation data are considered as a whole
PI3K/AKT pathway is mutated in both breast and colorectal
cancers, but the specific mutated genes are different.
Overview
I. Background and overview of cancer genome studies
II. Lessons from prior analyses of cancer genomes
III. Results and implications of the current brain cancer study
Glioblastoma multiforme
(GBM)



Most common and lethal primary brain
tumor
Occurs in both adults and children
Categorized into two groups
– Primary (>90%)
– Secondary (<10%): have evidence of preexisting lower-grade lesion
What genes?
All available protein-coding genes
Gene sets:
– Consensus Coding Sequences (CCDS): ~13,000 genes
– RefSeq: ~18,500 genes
– Ensembl: ~21,500 genes
CCDS criteria:
– Identical in RefSeq and Ensembl
– Canonical start / stop codons
– Cross-species conservation
– Consensus splice sites
– Translatable from reference genome without frameshift (fs) or stop
MUTATION ANALYSIS
1. Human Genome Reference and Ensembl Sequences: 23,219 transcripts from 20,661 genes
2. Design primers for PCR-based amplification and sequencing of coding exons: 208,311 passing primer pairs, 31.8 Mb coding sequence
3. Amplify and sequence DNA from 22 GBM samples: 689 Mb total tumor sequence
4. Assemble sequence data and filter putative somatic mutations
5. Resequence tumor and normal DNA to confirm mutations and exclude germline variants
6. Result: 2325 somatic mutations in 2043 genes
COPY NUMBER ANALYSIS
1. Hybridisation to high density oligo arrays: 1.06 million genomic loci
2. Result: 134 homozygous deletions and 147 amplifications
EXPRESSION ANALYSIS
1. Serial analysis of gene expression using next generation sequencing: 2 million tags / sample
2. Result: differential expression of genetically altered genes
Integrated bioinformatic analyses of altered genes
– Identification of CAN-genes
– Identification of mutated pathways
– Integration of expression analyses



Identification of potential target genes in
previously-uncharacterized deletions and
amplifications
Identification of differentially-expressed
genes in GBMs relative to normal brain
Analysis of expression changes in pathways
implicated by genetic alterations
Table 1. Summary of genomic analyses
Sequencing analysis
Number of genes successfully analyzed                                   20,661
Number of transcripts successfully analyzed                             23,219
Number of exons successfully analyzed                                   175,471
Primer pairs designed for amplification                                 219,229
Fraction of passing amplicons*                                          95.0%
Total number of nucleotides successfully sequenced                      689,071,123
Fraction of passing amplicon sequences successfully analyzed†           98.3%
Fraction of targeted bases successfully analyzed†                       93.0%
Number of somatic mutations identified (n=22 samples)                   2,325
Number of somatic mutations (excluding Br27P)                           993
    Missense                                                            622
    Nonsense                                                            43
    Insertion                                                           3
    Deletion                                                            46
    Duplication                                                         7
    Splice site or UTR                                                  27
    Synonymous                                                          245
Average number of sequence alterations per sample                       47.3
Copy number analysis
Total number of SNP loci assessed for copy number changes               1,069,688
Number of copy number alterations identified (n=22 samples)             281
    Amplifications                                                      147
    Homozygous deletions                                                134
Average number of amplifications per sample                             6.7
Average number of homozygous deletions per sample                       6.1
*Passing amplicons were defined as having PHRED20 scores or better over 90% of the
target sequence in 75% of samples analyzed.
†Fraction of nucleotides having PHRED20 scores or better (see Supporting Online
Materials for additional information).
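The totals and per-sample averages in Table 1 can be cross-checked with a few lines of arithmetic. The sketch below uses only values transcribed from the table; the one assumption, flagged in the comments, is that the 47.3 average is computed over the 21 samples that remain after excluding Br27P, which the slide does not state explicitly.

```python
# Values transcribed from Table 1.
mutation_types = {
    "Missense": 622,
    "Nonsense": 43,
    "Insertion": 3,
    "Deletion": 46,
    "Duplication": 7,
    "Splice site or UTR": 27,
    "Synonymous": 245,
}
total_excluding_br27p = 993
samples = 22
amplifications = 147
homozygous_deletions = 134

# The per-type counts should add up to the total excluding Br27P.
type_sum = sum(mutation_types.values())

# Assumption (not stated on the slide): the per-sample average of
# sequence alterations is taken over the 21 samples other than Br27P.
alterations_per_sample = round(total_excluding_br27p / (samples - 1), 1)

# Copy number averages are taken over all 22 samples.
amps_per_sample = round(amplifications / samples, 1)
dels_per_sample = round(homozygous_deletions / samples, 1)

print(type_sum, alterations_per_sample, amps_per_sample, dels_per_sample)
```

Under that assumption the recomputed values agree with the table: the mutation types sum to 993, and the averages come out to 47.3, 6.7, and 6.1.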
Altered genes in GBM
Table 2. Most frequently altered GBM CAN-genes
Cells give number of tumors (fraction of tumors).
Gene      Point mutations^   Amplifications&   Homozygous deletions&   Fraction of tumors with any alteration   Passenger probability*
CDKN2A    0/22 (0%)          0/22 (0%)         11/22 (50%)             50%                                      <0.01
TP53      37/105 (35%)       0/22 (0%)         1/22 (5%)               40%                                      <0.01
EGFR      15/105 (14%)       5/22 (23%)        0/22 (0%)               37%                                      <0.01
PTEN      27/105 (26%)       0/22 (0%)         1/22 (5%)               30%                                      <0.01
NF1       16/105 (15%)       0/22 (0%)         0/22 (0%)               15%                                      0.04
CDK4      0/22 (0%)          3/22 (14%)        0/22 (0%)               14%                                      <0.01
RB1       8/105 (8%)         0/22 (0%)         1/22 (5%)               12%                                      0.02
IDH1      12/105 (11%)       0/22 (0%)         0/22 (0%)               11%                                      <0.01
PIK3CA    10/105 (10%)       0/22 (0%)         0/22 (0%)               10%                                      0.10
PIK3R1    8/105 (8%)         0/22 (0%)         0/22 (0%)               8%                                       0.10
The most frequently-altered CAN-genes are listed; all CAN-genes are listed in Table S7.
^Fraction of tumors with point mutations indicates the fraction of mutated GBMs out of the 105 samples in the Discovery and Prevalence Screens. CDKN2A and CDK4 were not analyzed for point mutations in the Prevalence Screen because no sequence alterations were detected in these genes in the Discovery Screen.
&Fraction of tumors with amplifications and deletions indicates the number of tumors with these types of alterations in the 22 Discovery Screen samples.
*Passenger probability indicates the Passenger probability - Mid (12).
Core genetic pathways in GBMs
Table 3. Mutations of the TP53, PI3K, and RB1 pathways in GBM samples*
Tumor samples: Br02X, Br03X, Br04X, Br05X, Br06X, Br07X, Br08X, Br09P, Br10P, Br11P, Br12P, Br13X, Br14X, Br15X, Br16X, Br17X, Br20P, Br23X, Br25X, Br26X, Br27P, Br29P
[Table body: per-sample Mut / Amp / Del / Alt entries for each gene and pathway]
Fraction of tumors with altered gene/pathway#
TP53 pathway:   TP53 0.55   MDM2 0.05   MDM4 0.05   All genes 0.64
PI3K pathway:   PTEN 0.27   PIK3CA 0.09   PIK3R1 0.09   IRS1 0.05   All genes 0.50
RB1 pathway:    RB1 0.14   CDK4 0.14   CDKN2A 0.45   All genes 0.68
*Mut, mutated; Amp, amplified; Del, deleted; Alt, altered.
#Fraction of affected tumors in 22 Discovery Screen samples.
IDH1 mutations
Normal
C394A (R132S)
Br104X
G395A (R132H)
Br122X
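The two nucleotide changes above map onto the same codon, which is why both are reported as changes at residue 132. A minimal sketch of that mapping follows. It assumes the wild-type codon 132 of IDH1 is CGT (arginine), which is not stated on the slide, and the helper names are hypothetical.

```python
# Only the codons needed for this example (standard genetic code).
CODON_TO_AA = {"CGT": "R", "AGT": "S", "CAT": "H"}


def apply_substitution(codon: str, codon_number: int, change: str) -> str:
    """Apply a coding-sequence substitution such as 'G395A' to one codon.

    Coding positions are 1-based, so codon N spans bases 3N-2 to 3N.
    """
    ref, position, alt = change[0], int(change[1:-1]), change[-1]
    if (position - 1) // 3 + 1 != codon_number:
        raise ValueError("position lies outside this codon")
    offset = (position - 1) % 3
    if codon[offset] != ref:
        raise ValueError("reference base does not match the codon")
    return codon[:offset] + alt + codon[offset + 1:]


def protein_change(codon: str, codon_number: int, change: str) -> str:
    """Return the amino acid change, e.g. 'R132H'."""
    mutant = apply_substitution(codon, codon_number, change)
    return f"{CODON_TO_AA[codon]}{codon_number}{CODON_TO_AA[mutant]}"


# Assumed wild-type codon 132 of IDH1: CGT (Arg).
for change in ("C394A", "G395A"):
    print(change, "->", protein_change("CGT", 132, change))
```

Position 394 is the first base of codon 132 and position 395 the second, so C394A gives AGT (serine) and G395A gives CAT (histidine).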
Isocitrate dehydrogenases (IDHs)
Catalyze the oxidative decarboxylation of
isocitrate to α-ketoglutarate
Isocitrate + NAD(P)+ ----------> α-ketoglutarate + CO2 + NAD(P)H
Isocitrate binding site residues:
One subunit: Thr77, Ser94, Arg100, Arg109,
Arg132, Tyr139, Asp275
Other subunit: Lys212, Thr214, Asp252
Five isocitrate dehydrogenase (IDH) genes reported
NAD(+) as e- acceptor (mitochondria):
– IDH3A (CCDS10297.1, Chr 15)
– IDH3B (CCDS13031.1, CCDS13032.1, Chr 20)
– IDH3G (CCDS14730.1, Chr X)
– Form heterotetramer α2βγ
– Catalyze rate-limiting step of TCA cycle
NADP(+) as e- acceptor:
– IDH2 (CCDS10359.1, Chr 15): mitochondria
– IDH1 (CCDS2381.1, Chr 2): cytoplasm/peroxisomes
– Form homodimer
– Regeneration of NADPH for biosynthetic processes
– Defense against oxidative damage?
Fig. 1. Structure of the active site of IDH1. The crystal structure of the human cytosolic
NADP(+) -dependent IDH is shown in ribbon format (PDBID: 1T0L) (44). The active cleft of IDH1
consists of a NADP-binding site and the isocitrate-metal ion-binding site. The alpha-carboxylate
oxygen and the hydroxyl group of isocitrate chelate the Ca2+ ion. NADP is colored in orange,
isocitrate in purple and Ca2+ in blue. The Arg132 residue, displayed in yellow, forms hydrophilic
interactions, shown in red, with the alpha-carboxylate of isocitrate. Displayed image was
created with UCSF Chimera software version 1.2422
Characteristics of IDH1-mutated GBMs
Table 4. Characteristics of GBM patients with IDH1 mutations
Patient ID                      Age (years)*  Sex    Recurrent GBM#  Secondary GBM^  Overall survival (years)&  IDH1 nucleotide  IDH1 amino acid  Mutation of TP53  Mutation of PTEN, RB1, EGFR, or NF1
Br10P                           30            F      No              No              2.2                        G395A            R132H            Yes               No
Br11P                           32            M      No              No              4.1                        G395A            R132H            Yes               No
Br12P                           31            M      No              No              1.6                        G395A            R132H            Yes               No
Br104X                          29            F      No              No              4.0                        C394A            R132S            Yes               No
Br106X                          36            M      No              No              3.8                        G395A            R132H            Yes               No
Br122X                          53            M      No              No              7.8                        G395A            R132H            No                No
Br123X                          34            M      No              Yes             4.9                        G395A            R132H            Yes               No
Br237T                          26            M      No              Yes             2.6                        G395A            R132H            Yes               No
Br211T                          28            F      No              Yes             0.3                        G395A            R132H            Yes               No
Br27P                           32            M      Yes             Yes             1.2                        G395A            R132H            Yes               No
Br129X                          25            M      Yes             Yes             3.2                        C394A            R132S            No                No
Br29P                           42            F      Yes             Unknown         Unknown                    G395A            R132H            Yes               No
IDH1 mutant patients (n=12)     33.2          67% M  25%             42%             3.8                        100%             100%             83%               0%
IDH1 wildtype patients (n=93)   53.3          65% M  16%             1%              1.1                        0%               0%               27%               60%
*Patient age refers to age at which patient GBM sample was obtained.
#Recurrent GBM designates a GBM which was resected >3 months after a prior diagnosis of GBM.
^Secondary GBM designates a GBM which was resected >1 year after a prior diagnosis of a lower grade glioma (WHO I-III).
&Overall survival was calculated using date of GBM diagnosis and date of death or last patient contact: patients Br10P and Br11P were alive at last contact. Median survival for IDH1 mutant patients and IDH1 wildtype patients was calculated using logrank test.
Previous pathologic diagnoses in secondary GBM patients were oligodendroglioma (WHO grade II) in Br123X, low grade glioma (WHO grade I-II) in Br237T and Br211T, anaplastic astrocytoma (WHO grade III) in Br27P, and anaplastic oligodendroglioma (WHO grade III) in Br129X.
Abbreviations: GBM (glioblastoma multiforme, WHO grade IV), WHO (World Health Organization), M (male), F (female), mut (mutant). Mean age and median survival are listed for the groups of IDH1-mutated and IDH1-wildtype patients.
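The median survival listed for the IDH1-mutant group can be approximately reproduced from the per-patient values in Table 4 with a Kaplan-Meier estimate. This is a sketch under stated assumptions, not the authors' analysis: Br10P and Br11P are treated as censored because the footnote says they were alive at last contact, every other listed survival time is assumed to end in death, and Br29P is left out because the survival time is unknown.

```python
# (overall survival in years, death observed?) from Table 4.
# Assumption: only Br10P and Br11P are censored (alive at last contact).
patients = {
    "Br10P": (2.2, False), "Br11P": (4.1, False), "Br12P": (1.6, True),
    "Br104X": (4.0, True), "Br106X": (3.8, True), "Br122X": (7.8, True),
    "Br123X": (4.9, True), "Br237T": (2.6, True), "Br211T": (0.3, True),
    "Br27P": (1.2, True), "Br129X": (3.2, True),
}


def kaplan_meier(records):
    """Return [(time, survival probability)] at each observed death time."""
    curve = []
    surviving = 1.0
    at_risk = len(records)
    for time in sorted({t for t, _ in records}):
        deaths = sum(1 for t, died in records if t == time and died)
        leaving = sum(1 for t, _ in records if t == time)
        if deaths:
            surviving *= (at_risk - deaths) / at_risk
            curve.append((time, surviving))
        at_risk -= leaving  # deaths and censored patients leave the risk set
    return curve


def median_survival(curve):
    """First time at which estimated survival drops to 50% or below."""
    for time, surviving in curve:
        if surviving <= 0.5:
            return time
    return None


curve = kaplan_meier(list(patients.values()))
print("estimated median survival (years):", median_survival(curve))
```

With these assumptions the estimate crosses 50% at 3.8 years, which matches the median listed for the IDH1-mutant group.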
IDH1 mutation and patient age
[Figure: patient age in years (axis 0 to 80) for patients with mutated IDH1 versus patients with wildtype IDH1]
IDH1 mutation, age and tumor type
Age (years)            IDH1 mutated / Total    Fraction
<20                    0/12                    0%
20-29                  6/10                    60%
30-39                  8/16                    50%
40-49                  2/25                    8%
50-59                  2/36                    6%
>59                    0/50                    0%
All                    18/149                  12%

Young adult patients and secondary GBMs
Group                  IDH1 mutated / Total    Fraction
All patients           18/149                  12%
Patients < 35 years    13/32                   41%
Patients 35+ years     5/117                   4%
Secondary GBMs         8/10                    80%
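The contrast between younger and older patients can be illustrated with a one-sided Fisher's exact test on the counts above (13/32 mutated under 35 years versus 5/117 at 35 years and older). The slide does not report such a test, so this is only an illustrative sketch; the function names are hypothetical.

```python
from math import comb


def hypergeom_pmf(x: int, total: int, successes: int, draws: int) -> float:
    """P(X = x) when drawing `draws` items from `total`, of which
    `successes` are marked."""
    return (comb(successes, x) * comb(total - successes, draws - x)
            / comb(total, draws))


def fisher_one_sided(a: int, b: int, c: int, d: int) -> float:
    """P(top-left cell >= a) for the 2x2 table [[a, b], [c, d]]
    with all row and column totals held fixed."""
    total = a + b + c + d
    successes = a + c   # all IDH1-mutated patients
    draws = a + b       # all patients in the first age group
    upper = min(successes, draws)
    return sum(hypergeom_pmf(x, total, successes, draws)
               for x in range(a, upper + 1))


# Rows: age < 35 years, age 35+ years. Columns: IDH1 mutated, wildtype.
under_35 = (13, 32 - 13)
over_35 = (5, 117 - 5)

p_value = fisher_one_sided(*under_35, *over_35)
print(f"fraction mutated < 35 years: {13 / 32:.0%}")
print(f"fraction mutated 35+ years: {5 / 117:.0%}")
print(f"one-sided Fisher p-value: {p_value:.2g}")
```

The test asks how often 13 or more of the 18 mutated tumors would fall in the younger group by chance if IDH1 status were unrelated to age.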
IDH1 mutation and patient survival
[Figure: overall survival (%) versus years (0 to 10) for IDH1 mutated (n=11) and IDH1 wildtype (n=79) patients; p<0.001]
Conclusions – 1
Pathway analyses


Core set of pathways identified in GBMs using
integrated genomic data, including processes
specific to the nervous system
Necessity for pathway or process-specific view
to guide further analyses and therapeutic
design
Conclusions – 2
Identification of IDH1




IDH1 was identified as a commonly mutated
GBM gene, particularly in specific subsets of
patients
IDH1-mutated GBMs have characteristic
clinical and genetic findings
Identifies IDH1 as a potentially-useful target for
diagnostics and therapeutics
Further functional studies required
Acknowledgements
JHU participants in prior genome studies
Tobias Sjoblom
Laura Wood
Yardena Samuels
Steve Szabo
Ben Ho Park
Kurtis E. Bachman
Additional JHU participants in current study
Janine Ptak
Natalie Silliman
Lisa Dobbyn
Melissa Whalen
GBM study participants (JHU)
Sian Jones
Xiaosong Zhang
Jimmy Lin
Rebecca Leary
Philipp Angenendt
Parminder Mankoo
Hannah Carter
I-Mei Siu
Gary Gallia
Alessandro Olivi
Luis Diaz, Jr.
Gregory Riggins
Rachel Karchin
Nick Papadopoulos
Giovanni Parmigiani
Bert Vogelstein
Victor Velculescu
Ken Kinzler
GBM study participants (Duke)
Hai Yan
Roger McLendon
B. Ahmed Rasheed
Stephen Keir
Darell Bigner
GBM study (other collaborators)
Tatiana Nikolskaya
Yuri Nikolsky
Dana Busam
Hanna Tekleab
James Hartigan
Doug Smith
Robert Strausberg
Suely Kazue Nagahashi Marie
Sueli Mieko Oba Shinjo