Genome-Wide Mutational Analyses of Human Cancers: Lessons Learned From Sequencing Cancer Genomes
Will Parsons, M.D., Ph.D.
Ludwig Center for Cancer Genetics and Therapeutics
The Sidney Kimmel Cancer Center, Johns Hopkins University
Sept 5, 2008

Overview
I. Background and overview of cancer genome studies
II. Lessons from prior analyses of cancer genomes
III. Results and implications of the current brain cancer study

I. Background and overview of cancer genome studies

Cancer is a genetic disease
[Figure: progression from normal epithelium through dysplastic ACF, early adenoma, intermediate adenoma, and late adenoma to carcinoma and metastasis over 30 to 40 years. Labeled alterations: APC/β-catenin, K-RAS, 18q, p53, other changes?]

Cancer genotype directed therapies
Gleevec (imatinib)
- CML (BCR-ABL)
- Gastrointestinal Stromal Tumors (c-KIT)
Herceptin (trastuzumab)
- Breast Cancer (HER-2)
Iressa (gefitinib) and Tarceva (erlotinib)
- NSCLC (EGFR)

[Slide graphic: What we know about cancer genetics + High throughput sequencing (>10 million bp per day) + $$ = ...]

Methods to identify mutations
- Pre-genome: Candidate approach
- Post-genome: High throughput

Mutational analysis of signaling pathways in colorectal cancer
- 138 protein tyrosine kinases
- 16 phosphatidylinositol 3-kinases
- 87 protein tyrosine phosphatases
- 200 chromosomal instability genes
- 350 serine / threonine kinases
Analyzed in a collection of colorectal and other human tumors
Bardelli et al., Science 300:949 (2003); Samuels et al., Science 304:554 (2004); Wang et al., Science 304(5674):1164 (2004); Wang et al., Cancer Res 64(9):2998 (2004); Parsons et al., Nature 436(7052):792 (2005)

High frequency of mutations of the PI3-kinase PIK3CA in human cancer

Tumor                    Fraction mutated
Colon                    74/234 (32%)
Brain                    4/15 (27%)
Gastric                  3/12 (25%)
Breast                   1/12 (8%)
Lung                     1/24 (4%)
Breast cancer            13/53
Hepatocellular cancer    26/73

Samuels et al., Science 304, 554 (2004); Bachman et al., CBT 3, e49 (2004); Broderick et al., Can Res 64, 5048 (2004); Lee et al., Oncogene 24, 1477 (2005)

Mutations of PI3K pathway genes in colorectal cancer
[Figure] Parsons et al., Nature 436:792 (2005)

Goals for "Cancer Genomics"
- To develop a strategy for unbiased genome-wide analyses of cancer genes in human tumors
- To determine the spectrum and extent of somatic mutations in human tumors of similar and different histologic types
- To identify new cancer genes for basic research and improvements in diagnosis, prevention, and therapy

Genome-wide mutational analyses
Discovery Screen
1. Select gene set and tumors
2. Design primers
3. PCR amplify coding exons from samples of tumor DNA
4. Dye terminator sequencing
5. Find tumor-specific mutations
Validation Screen
6. Validate mutated genes in larger panel of additional tumors
7. Compare gene mutation frequency to expected background
Outcome: candidate cancer genes versus genes with passenger mutations

Driver vs. Passenger mutations
- Driver mutations: provide a net growth advantage and are positively selected for during tumorigenesis
- Passenger mutations: neutral mutations that provide no advantage to the tumor

Mutation Prioritization
1. Frequency
2. Type
3. Predicted effects
4. Structural models
5. Analogous mutations
6. Functional studies

Evaluating Genes based on Mutation Frequency
CaMP Score
- Metric used to rank genes based on their mutation frequency and type
- Takes account of number of mutations, length and nucleotide content of gene, context of mutations
Statistical methods can be used to determine the likelihood that genes with CaMP scores over a threshold are mutated at a frequency higher than background

II. Lessons from prior analyses of cancer genomes

What tumors? Breast and Colon cancers
2004 Estimated US Cancer Cases*

Men (699,560)                    Women (668,470)
Prostate               33%       Breast                 32%
Lung & bronchus        13%       Lung & bronchus        12%
Colon & rectum         11%       Colon & rectum         11%
Urinary bladder         6%       Uterine corpus          6%
Melanoma of skin        4%       Ovary                   4%
Non-Hodgkin lymphoma    4%       Non-Hodgkin lymphoma    4%
Kidney                  3%       Melanoma of skin        4%
Oral Cavity             3%       Thyroid                 3%
Leukemia                3%       Pancreas                2%
Pancreas                2%       Urinary bladder         2%
All Other Sites        18%       All Other Sites        20%

*Excludes basal and squamous cell skin cancers and in situ carcinomas except urinary bladder.
Source: American Cancer Society, 2004.

What genes?
Protein-coding genes in CCDS and RefSeq
Consensus Coding Sequences (CCDS): ~13,000 genes
- Identical in RefSeq and Ensembl
- Canonical start / stop codons
- Cross-species conservation
- Consensus splice sites
- Translatable from reference genome without frameshift (fs) or stop
RefSeq: ~18,500 genes
Ensembl: ~21,500 genes

Lessons learned - 1
Mutations and candidate cancer genes

- Many genes are mutated in these solid tumors
[Figure: mutations per tumor (y-axis, 0 to 120) for tumors 1 to 11 (x-axis); series: total mutations, non-silent mutations, CAN-gene mutations]

- Vast majority of previously known breast and colon cancer genes were identified

Genes known to be mutated in breast and colorectal cancers are CAN-genes

Mutation frequency    Breast cancers    Colon cancers
>10%                  TP53, PIK3CA      TP53, APC, KRAS, PIK3CA, SMAD4, FBXW7 (CDC4)
<10%                  MRE11, BRCA1      EPHA3, NF1, SMAD2, SMAD3, TCF7L2 (TCF4), TGFBRII

- Many new breast and colon CAN-genes were discovered
- New CAN-genes are likely to exist in other tumor types

The majority of CAN-genes had not previously been implicated in cancer
[Figure: pie charts for breast cancers (n=122 genes) and colon cancers (n=69 genes) showing how CAN-genes were previously implicated in cancer; categories: mutation, translocation, amplification, deletion, methylation, expression, not known]

Lessons learned - 2
Genomic landscape of cancers

- More genes involved in cancer than previously anticipated: few "mountains", many "hills"

Top colon CAN-genes (those below the top tier are mutated in <1-5% of cancers)

Gene        Name                                                           CaMP score
APC         adenomatosis polyposis coli                                    >10
KRAS        v-Ki-ras2 Kirsten rat sarcoma viral oncogene homolog           >10
TP53        tumor protein p53                                              >10
PIK3CA      phosphoinositide-3-kinase, catalytic, alpha polypeptide        >10
FBXW7       F-box and WD-40 domain protein 7                               9.6
NAV3        neuron navigator 3                                             8.0
EPHA3       EPH receptor A3                                                7.1
MAP2K7      mitogen-activated protein kinase kinase 7                      7.0
SMAD4       SMAD, mothers against DPP homolog 4                            6.0
ADAMTSL3    ADAMTS-like 3                                                  5.9
GUCY1A2     guanylate cyclase 1, soluble, alpha 2                          5.8
OR51E1      olfactory receptor, family 51, subfamily E, member 1           5.6
TCF7L2      transcription factor 7-like 2 (TCF4)                           5.2
ADAMTS18    ADAM metallopeptidase with thrombospondin type 1 motif, 18     5.0
SEC8L1      exocyst complex component 4                                    4.7
RET         ret proto-oncogene                                             4.6
PTEN        phosphatase and tensin homolog                                 4.5
MMP2        matrix metallopeptidase 2                                      4.3
GNAS        GNAS complex locus                                             4.3
TGM3        transglutaminase 3                                             4.0

Landscape of colon cancers
[Figure: mutation landscape with labeled peaks FBXW7, TP53, PIK3CA, KRAS, APC]

- There is significant heterogeneity between individual tumors (even of the same type)

Landscape of a single colon cancer
[Figure: mutation landscape of one tumor with labeled peaks FBXW7, TP53, PIK3CA, KRAS, APC]

- Simpler gene groups and pathways emerge when mutation data are considered as a whole

The PI3K/AKT pathway is mutated in both breast and colorectal cancers, but the specific mutated genes are different.

III. Results and implications of the current brain cancer study

Glioblastoma multiforme (GBM)
- Most common and lethal primary brain tumor
- Occurs in both adults and children
- Categorized into two groups
  - Primary (>90%)
  - Secondary (<10%): have evidence of preexisting lower-grade lesion

What genes?
All available protein-coding genes
Consensus Coding Sequences (CCDS): ~13,000 genes
- Identical in RefSeq and Ensembl
- Canonical start / stop codons
- Cross-species conservation
- Consensus splice sites
- Translatable from reference genome without frameshift (fs) or stop
RefSeq: ~18,500 genes
Ensembl: ~21,500 genes

[Flow chart of the GBM study]

MUTATION ANALYSIS
1. Human Genome Reference and Ensembl Sequences: 23,219 transcripts from 20,661 genes
2. Design primers for PCR-based amplification and sequencing of coding exons: 208,311 passing primer pairs, 31.8 Mb coding sequence
3. Amplify and sequence DNA from 22 GBM samples: 689 Mb total tumor sequence
4. Assemble sequence data and filter putative somatic mutations
5. Resequence tumor and normal DNA to confirm mutations and exclude germline variants
Result: 2325 somatic mutations in 2043 genes

COPY NUMBER ANALYSIS
1. Hybridisation to high density oligo arrays: 1.06 million genomic loci
Result: 134 homozygous deletions and 147 amplifications

EXPRESSION ANALYSIS
1. Serial analysis of gene expression using next generation sequencing: 2 million tags / sample
Result: differential expression of genetically altered genes

Integrated bioinformatic analyses of altered genes
- Identification of CAN-genes
- Identification of mutated pathways

Integration of expression analyses
- Identification of potential target genes in previously-uncharacterized deletions and amplifications
- Identification of differentially-expressed genes in GBMs relative to normal brain
- Analysis of expression changes in pathways implicated by genetic alterations

Table 1.
Summary of genomic analyses

Sequencing analysis
Number of genes successfully analyzed                                  20,661
Number of transcripts successfully analyzed                            23,219
Number of exons successfully analyzed                                  175,471
Primer pairs designed for amplification                                219,229
Fraction of passing amplicons*                                         95.0%
Total number of nucleotides successfully sequenced                     689,071,123
Fraction of passing amplicon sequences successfully analyzed†          98.3%
Fraction of targeted bases successfully analyzed†                      93.0%
Number of somatic mutations identified (n=22 samples)                  2,325
Number of somatic mutations (excluding Br27P)                          993
  Missense                                                             622
  Nonsense                                                             43
  Insertion                                                            3
  Deletion                                                             46
  Duplication                                                          7
  Splice site or UTR                                                   27
  Synonymous                                                           245
Average number of sequence alterations per sample                      47.3

Copy number analysis
Total number of SNP loci assessed for copy number changes              1,069,688
Number of copy number alterations identified (n=22 samples)            281
  Amplifications                                                       147
  Homozygous deletions                                                 134
Average number of amplifications per sample                            6.7
Average number of homozygous deletions per sample                      6.1

*Passing amplicons were defined as having PHRED20 scores or better over 90% of the target sequence in 75% of samples analyzed.
†Fraction of nucleotides having PHRED20 scores or better (see Supporting Online Materials for additional information).

Altered genes in GBM
Table 2.
Most frequently altered GBM CAN-genes

Entries give the number of tumors and, in parentheses, the fraction of tumors.

Gene      Point mutations^   Amplifications&   Homozygous deletions&   Any alteration   Passenger probability*
CDKN2A    0/22 (0%)          0/22 (0%)         11/22 (50%)             50%              <0.01
TP53      37/105 (35%)       0/22 (0%)         1/22 (5%)               40%              <0.01
EGFR      15/105 (14%)       5/22 (23%)        0/22 (0%)               37%              <0.01
PTEN      27/105 (26%)       0/22 (0%)         1/22 (5%)               30%              <0.01
NF1       16/105 (15%)       0/22 (0%)         0/22 (0%)               15%              0.04
CDK4      0/22 (0%)          3/22 (14%)        0/22 (0%)               14%              <0.01
RB1       8/105 (8%)         0/22 (0%)         1/22 (5%)               12%              0.02
IDH1      12/105 (11%)       0/22 (0%)         0/22 (0%)               11%              <0.01
PIK3CA    10/105 (10%)       0/22 (0%)         0/22 (0%)               10%              0.10
PIK3R1    8/105 (8%)         0/22 (0%)         0/22 (0%)               8%               0.10

The most frequently altered CAN-genes are listed; all CAN-genes are listed in Table S7.
^Fraction of tumors with point mutations indicates the fraction of mutated GBMs out of the 105 samples in the Discovery and Prevalence Screens. CDKN2A and CDK4 were not analyzed for point mutations in the Prevalence Screen because no sequence alterations were detected in these genes in the Discovery Screen.
&Fraction of tumors with amplifications and deletions indicates the number of tumors with these types of alterations in the 22 Discovery Screen samples.
*Passenger probability indicates the Passenger probability - Mid (12).

Core genetic pathways in GBMs
Table 3.
Mutations of the TP53, PI3K, and RB1 pathways in GBM samples*

Tumor samples: Br02X, Br03X, Br04X, Br05X, Br06X, Br07X, Br08X, Br09P, Br10P, Br11P, Br12P, Br13X, Br14X, Br15X, Br16X, Br17X, Br20P, Br23X, Br25X, Br26X, Br27P, Br29P

Columns
TP53 pathway: TP53, MDM2, MDM4, All genes
PI3K pathway: PTEN, PIK3CA, PIK3R1, IRS1, All genes
RB1 pathway: RB1, CDK4, CDKN2A, All genes

[Per-sample entries (Mut, Amp, Del, Alt) for each gene and pathway are not reproduced here.]

Fraction of tumors with altered gene/pathway#
TP53 pathway: TP53 0.55, MDM2 0.05, MDM4 0.05, All genes 0.64
PI3K pathway: PTEN 0.27, PIK3CA 0.09, PIK3R1 0.09, IRS1 0.05, All genes 0.50
RB1 pathway: RB1 0.14, CDK4 0.14, CDKN2A 0.45, All genes 0.68

*Mut, mutated; Amp, amplified; Del, deleted; Alt, altered
#Fraction of affected tumors in 22 Discovery Screen samples

IDH1 mutations
[Figure: sequence traces for normal DNA, C394A (R132S) in Br104X, and G395A (R132H) in Br122X]

Isocitrate dehydrogenases (IDHs)
Catalyze the oxidative decarboxylation of isocitrate to α-ketoglutarate
Isocitrate + NAD(P)+ -> α-ketoglutarate + CO2 + NAD(P)H
Isocitrate binding site residues:
- One subunit: Thr77, Ser94, Arg100, Arg109, Arg132, Tyr139, Asp275
- Other subunit: Lys212, Thr214, Asp252

Five isocitrate dehydrogenase (IDH) genes reported, grouped by electron acceptor

NAD(+)-dependent (mitochondria)
- IDH3A, CCDS10297.1, Chr 15
- IDH3B, CCDS13031.1 and CCDS13032.1, Chr 20
- IDH3G, CCDS14730.1, Chr X
- Form heterotetramer α2βγ
- Catalyze rate-limiting step of TCA cycle

NADP(+)-dependent
- IDH2, CCDS10359.1, Chr 15 (mitochondria)
- IDH1, CCDS2381.1, Chr 2 (cytoplasm/peroxisomes)
- Form homodimer
- Regeneration of NADPH for biosynthetic processes
- Defense against oxidative damage?

Fig. 1.
Structure of the active site of IDH1. The crystal structure of the human cytosolic NADP(+)-dependent IDH is shown in ribbon format (PDBID: 1T0L) (44). The active cleft of IDH1 consists of a NADP-binding site and the isocitrate-metal ion-binding site. The alpha-carboxylate oxygen and the hydroxyl group of isocitrate chelate the Ca2+ ion. NADP is colored in orange, isocitrate in purple and Ca2+ in blue. The Arg132 residue, displayed in yellow, forms hydrophilic interactions, shown in red, with the alpha-carboxylate of isocitrate. Displayed image was created with UCSF Chimera software version 1.2422

Characteristics of IDH1-mutated GBMs
Table 4. Characteristics of GBM patients with IDH1 mutations

Patient ID   Age (years)*   Sex   Recurrent GBM#   Secondary GBM^   Overall survival (years)&   IDH1 nucleotide   IDH1 amino acid   Mutation of TP53   Mutation of PTEN, RB1, EGFR, or NF1
Br10P        30             F     No               No               2.2                         G395A             R132H             Yes                No
Br11P        32             M     No               No               4.1                         G395A             R132H             Yes                No
Br12P        31             M     No               No               1.6                         G395A             R132H             Yes                No
Br104X       29             F     No               No               4.0                         C394A             R132S             Yes                No
Br106X       36             M     No               No               3.8                         G395A             R132H             Yes                No
Br122X       53             M     No               No               7.8                         G395A             R132H             No                 No
Br123X       34             M     No               Yes              4.9                         G395A             R132H             Yes                No
Br237T       26             M     No               Yes              2.6                         G395A             R132H             Yes                No
Br211T       28             F     No               Yes              0.3                         G395A             R132H             Yes                No
Br27P        32             M     Yes              Yes              1.2                         G395A             R132H             Yes                No
Br129X       25             M     Yes              Yes              3.2                         C394A             R132S             No                 No
Br29P        42             F     Yes              Unknown          Unknown                     G395A             R132H             Yes                No

IDH1 mutant patients (n=12)      33.2   67% M   25%   42%   3.8   100%   100%   83%   0%
IDH1 wildtype patients (n=93)    53.3   65% M   16%   1%    1.1   0%     0%     27%   60%

*Patient age refers to age at which patient GBM sample was obtained.
#Recurrent GBM designates a GBM which was resected >3 months after a prior diagnosis of GBM.
^Secondary GBM designates a GBM which was resected >1 year after a prior diagnosis of a lower grade glioma (WHO I-III).
&Overall survival was calculated using date of GBM diagnosis and date of death or last patient contact: patients Br10P and Br11P were alive at last contact.
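The nucleotide and amino acid columns of Table 4 can be cross-checked with a few lines of Python. This is a minimal sketch, not the authors' analysis pipeline. It assumes that nucleotides are numbered from the first base of the coding sequence and that the reference codon 132 of IDH1 is CGT (arginine), which is not stated on the slides. The helper name `apply_substitution` and the small codon table are hypothetical and cover only the codons needed here.

```python
# Partial codon table: only the codons needed for this check.
CODON_TO_AA = {"CGT": "R", "CAT": "H", "AGT": "S"}

# Assumption (not stated on the slides): reference codon 132 of IDH1 is CGT.
REFERENCE_CODONS = {132: "CGT"}


def apply_substitution(change: str) -> str:
    """Turn a coding change such as 'G395A' into a protein change such as 'R132H'."""
    ref_base, position, alt_base = change[0], int(change[1:-1]), change[-1]
    codon_number = (position - 1) // 3 + 1  # 1-based codon index
    offset = (position - 1) % 3             # 0-based position within the codon
    ref_codon = REFERENCE_CODONS[codon_number]
    if ref_codon[offset] != ref_base:
        raise ValueError(f"{change}: reference base does not match {ref_codon}")
    alt_codon = ref_codon[:offset] + alt_base + ref_codon[offset + 1:]
    return f"{CODON_TO_AA[ref_codon]}{codon_number}{CODON_TO_AA[alt_codon]}"


for change in ("G395A", "C394A"):
    print(change, "->", apply_substitution(change))
```

Under these assumptions, positions 394 and 395 fall on the first and second bases of codon 132, so both reported substitutions alter the same arginine residue.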
Median survival for IDH1 mutant patients and IDH1 wildtype patients was calculated using logrank test. Previous pathologic diagnoses in secondary GBM patients were oligodendroglioma (WHO grade II) in Br123X, low grade glioma (WHO grade I-II) in Br237T and Br211T, anaplastic astrocytoma (WHO grade III) in Br27P, and anaplastic oligodendroglioma (WHO grade III) in Br129X. Abbreviations: GBM (glioblastoma multiforme, WHO grade IV), WHO (World Health Organization), M (male), F (female), mut (mutant). Mean age and median survival are listed for the groups of IDH1-mutated and IDH1-wildtype patients.

IDH1 mutation and patient age
[Figure: patient age in years (y-axis, 0 to 80) for patients with mutated IDH1 and patients with wildtype IDH1]

IDH1 mutation, age and tumor type

Age (years)    IDH1 mutated / Total    Fraction
<20            0/12                    0%
20-29          6/10                    60%
30-39          8/16                    50%
40-49          2/25                    8%
50-59          2/36                    6%
>59            0/50                    0%
All            18/149                  12%

Group                  IDH1 mutated / Total    Fraction
All patients           18/149                  12%
Patients < 35 years    13/32                   41%
Patients 35+ years     5/117                   4%
Secondary GBMs         8/10                    80%

Highlighted groups: young adult patients and secondary GBMs.

IDH1 mutation and patient survival
[Figure: overall survival (%) versus years (0 to 10) for IDH1 mutated (n=11) and IDH1 wildtype (n=79) patients; p<0.001]

Conclusions - 1
Pathway analyses
- Core set of pathways identified in GBMs using integrated genomic data, including processes specific to the nervous system
- Necessity for pathway or process-specific view to guide further analyses and therapeutic design

Conclusions - 2
Identification of IDH1
- IDH1 was identified as a commonly mutated GBM gene, particularly in specific subsets of patients
- IDH1-mutated GBMs have characteristic clinical and genetic findings
- Identifies IDH1 as a potentially useful target for diagnostics and therapeutics
- Further functional studies required

Acknowledgements

JHU participants in prior genome studies: Tobias Sjoblom, Laura Wood, Yardena Samuels, Steve Szabo, Ben Ho Park, Kurtis E. Bachman

Additional JHU participants in current study: Janine Ptak, Natalie Silliman, Lisa Dobbyn, Melissa Whalen

GBM study participants (JHU): Sian Jones, Xiaosong Zhang, Jimmy Lin, Rebecca Leary, Philipp Angenendt, Parminder Mankoo, Hannah Carter, I-Mei Sui, Gary Gallia, Allesandro Olivi, Luis Diaz, Jr., Gregory Riggins, Rachel Karchin, Nick Papadopoulos, Giovanni Parmigiani, Bert Vogelstein, Victor Velculescu, Ken Kinzler

GBM study participants (Duke): Hai Yan, Roger McLendon, B. Ahmed Rasheed, Stephen Keir, Darell Bigner

GBM study (other collaborators): Tatiana Nikolskaya, Yuri Nikolsky, Dana Bsam, Hanna Tekleab, James Hartigan, Doug Smith, Robert Strausberg, Sely Kazue Nagahashi Marie, Sueli Mieko Oba Shinjo
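
The age-stratified IDH1 counts presented in the talk (0/12, 6/10, 8/16, 2/25, 2/36, and 0/50 across the six age bands) can be tallied with a short script to confirm the overall 18/149 figure. This is only an arithmetic sketch of the slide numbers; the names `AGE_BANDS`, `percent`, and `overall` are illustrative and do not come from the study.

```python
# (IDH1-mutated, total) patients per age band, as read from the slide.
AGE_BANDS = {
    "<20": (0, 12),
    "20-29": (6, 10),
    "30-39": (8, 16),
    "40-49": (2, 25),
    "50-59": (2, 36),
    ">59": (0, 50),
}


def percent(mutated: int, total: int) -> int:
    """Fraction of mutated patients as a whole-number percentage."""
    return round(100 * mutated / total)


def overall(bands: dict) -> tuple:
    """Sum mutated and total patient counts across all age bands."""
    mutated = sum(m for m, _ in bands.values())
    total = sum(t for _, t in bands.values())
    return mutated, total


for band, (mutated, total) in AGE_BANDS.items():
    print(f"{band:>6}: {mutated}/{total} ({percent(mutated, total)}%)")

all_mutated, all_total = overall(AGE_BANDS)
print(f"   All: {all_mutated}/{all_total} ({percent(all_mutated, all_total)}%)")
```

The per-band counts sum to 18 mutated patients out of 149, matching the "All" row on the slide.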