The Cancer Genome Browser
Sofie Salama
COAT-PhD Summer School 2012

Outline
– Slide show to introduce the Cancer Genomics Browser
  • What’s there?
  • How to visualize the data?
  • Tools
– Live Demo
  • Basic setup
  • Breast cancer data
    – Using signatures
    – Microarray vs RNA-Seq
    – Comparing across datasets
  • GBM data
    – Genesets
    – What genes correlate with phenotypes?
– Playtime!

UCSC Genome Browser
• Base level to full genome display capability
• ENCODE
• Human sequence variation
• Whole genome association studies
• Human genetic and disease-related genome annotation
https://genome.ucsc.edu

Large-scale Medical Genomics Datasets
Visualizing high-throughput cancer genomics data raises new issues: data security and access control, sample cohorts, multiple analytes, and clinical and phenotypic information.
https://genome-cancer.ucsc.edu

UCSC Cancer Genomics Browser
• Simultaneously display patient genomic and clinical data from a cohort of samples
• Base level to full genome display capability
• Multiple studies
• Growing list of published studies, including public-tier TCGA data
• Integrated with the popular UCSC Genome Browser and its vast store of genomic information
Zhu J et al. Nature Methods. 2009
Sanborn JZ et al. Nucleic Acids Res. 2010

New UCSC Cancer Browser Portal
genome-cancer.ucsc.edu

User Interface: A portal to display high-throughput data sets
(Teresa Swatloski, Brian Craft, Mary Goldman)

User Interface Features
• help menu
• resize panels
• select dataset to view
• link to tumor image browser
• link to human genome browser
• view in chromosome mode
• view in gene mode
• user sign in
• toggle on/off RefSeq genes
• position or gene search bar
• configure genesets
• configure genomic signatures
(Teresa Swatloski, Brian Craft, Mary Goldman)

Dataset selection showing TCGA breast cancer data
TCGA breast cancer datasets
• Gene expression, copy number, DNA methylation, RPPA, Paradigmlite
• TCGA clinical data
(Teresa Swatloski)

Genomic and phenotypic data heatmaps
[Screenshot: TCGA glioblastoma multiforme (GBM) copy number, Gistic2 estimate, N=538, with Heatmap / Box Plot / Proportions / Adjust controls; a genomic data panel and a clinical data panel]

Individual dataset layout
[Screenshot: same GBM dataset; axes labeled Samples and Genomic locations / Genes; genomic data panel and clinical data panel]

Genomics Heatmap
[Screenshot: same GBM dataset; one row per sample; color legend: amplification, deletion]

Clinical Heatmap
• Multiple clinical features
• Clinical data encoded in color
[Screenshot: one row per sample; features shown: sample_type (Primary solid tumor, Solid tissue normal, Metastatic), days_to_last_followup]

Sample sorting determined by clinical data
• Sample (i.e. vertical) order is determined by the clinical data on the right
• Samples are always sorted by clinical features
• Ties are broken using subsequent clinical features

Zoom in to See Individual Sample
• drag zoom
• slider

Individual Dataset Control
• click to show dataset detail
• heatmap view
• box plot summary view
• proportions summary view
• remove dataset
• adjust display coloring
• genomic heatmap
• clinical heatmap
• configuration window for clinical variables, sample subgrouping and statistics
(Teresa Swatloski)

Summary Views
• Heatmap View: Amplified / Deleted Regions
• Box Plot Summary View
• Proportions Summary View

DNA Copy Number Profile Summary View
[Figure: TCGA CNV profiles for glioblastoma multiforme, breast carcinoma and lung squamous cell; highlighted genes: EGFR, CDKN2A, CDKN2B]

Genes View Mode

“Genes” Configuration
• Currently displayed gene list
• Three ways to add a gene list (callouts 1 to 3 in the screenshot)
• Type or copy and paste user-defined genes
(Teresa Swatloski)

Genes view to see the PAM50 intrinsic gene expression subtypes in TCGA Breast data
[Figure labels: Basal, LuA, LuB, Her2-like, Normal-like]
PAM50: Parker et al., Journal of Clinical Oncology (2009)

Same thing with RNA-Seq Data
[Figure labels: Her2, Basal, LumA, LumB; Tumor, Solid normal]

Online statistical tests compare two subgroups
[Screenshot: same GBM dataset with samples split into subgroups and p values displayed]

Sample subgroup configuration
• click to view detail and use the variable to subgroup samples
• variables used in defining subgroups
• “Active Feature List” area
• subgroup 1, subgroup 2
• perform statistical tests to compare subgroup 1 and subgroup 2

Compare subgroups using the summary view
• EGFR amplification in GBM is largely in the non-CpG-island DNA methylator (non-G-CIMP) samples
• Methylator samples in GBM are largely proneural by gene expression, come from younger patients, and have better survival

Evaluate Genomic Signature on the Browser
A. Enter signature as an algebraic expression
B. Computed signatures online -> approximate prediction

Evaluate Genomic Signature on the Browser
• 21-gene signature predicts rate of recurrence at 10 years in ER+ patients treated with TAM (Paik 2004)
• Genomic signature online approximation: higher score -> higher likelihood of recurrence; lower score -> lower likelihood of recurrence

Evaluate Genomic Signature on the Browser
• Browser view of ER+ patients in a preoperative chemotherapy study dataset
• Signature score correlates with pathCR. The paradox: an ER+ patient who is more likely to have recurrent disease within 10 years when treated with TAM is also more likely to respond to chemotherapy

Genomic Signature Configuration
• Current signatures
• Three ways to add a genomic signature (callouts 1 to 3 in the screenshot)
• Enter signature as an algebraic expression, such as: +TP53 - 0.25*ERBB2
(Teresa Swatloski)

User Support
[email protected]
(Mary Goldman, Teresa Swatloski)

Web API
Create a URL to specify a view in the Cancer Browser
• base: https://genome-cancer.ucsc.edu/hgHeatmap/#?
• data track(s): comma-separated dataset names (e.g. dataset=vijver2002)
• display mode
• gene list: comma-separated gene names
• chromosomal position
• genomic signature: e.g. +TP53-0.25*ERBB2
Examples
• dataset=vijver2002&pos=chr2:123767566-chr2:187943340
• dataset=ucsfNeveCGH&displayas=geneset&gene_list=TP53,ERBB2
Documentation
https://genome-cancer.soe.ucsc.edu/proj/site/help
(Brian Craft, Mary Goldman)

User Account and Security
(Brian Craft)

cgData: Cancer Genomic data specification
• Gene expression, copy number, RPPA, DNA methylation, siRNA viability, phenotypes, clinical data
• Support large-scale genomic data repository
  - Currently supports Cancer Browser
  - Plan to support automated data analysis pipeline
• “Solve” (address) common data linking problem
• Metadata tracking
• Once data is in this specification, ingestion into the UCSC Cancer Browser is automated
(Kyle Ellrott)

Cancer Browser Updates
• Current improved version launched January 2012
• Monthly data freeze
• Latest freeze data viewable on the Cancer Browser within a few days
• July 2012: added ability to download processed datasets and improved user interface for clinical features, subgrouping and statistics

Data freeze 2012-02-28 summary (sample number)
[Table of sample numbers; contents not recovered]

Summary
• Simultaneously display patient genomic and clinical data from a cohort of samples
• Data visualization across multiple studies
• Base level to full genome, and geneset display capability
• Driven by the cgData data repository
• Monthly data freeze and version control
• User account
• Project-specific access control
• Single sign-on portal
• Web API for linking

[Diagram: DCC and Firehose, converter, UCSC cgData Repository, UCSC Cancer Genomics Browser, cBio, PARADIGM pathway analysis, Clinical Predictors (TopModel), Mutation call comparison, UCSC Next-gen Sequencing Data Analysis]
UCSC Next-gen Sequencing Data Analysis (BAM files)
• DNA-seq (bambam, bridget)
• mutation, allele-specific copy number, structural rearrangement
• Combined RNA/DNA analysis
• RNA editing

Acknowledgment
UCSC Cancer Genomics Group: Brian Craft, Teresa Swatloski, Mary Goldman, Kyle Ellrott, Erich Weiler, Chris Wilks, Singer Ma, Christopher Szeto, Sofie Salama, Mia Grifford, Sam Ng, Ted Goldstein, Dan Carlin, Daniel Zerbino, Melissa Cline, Mark Diekhans, Josh Stuart, David Haussler
Collaborators: The Cancer Genome Atlas; Stand Up To Cancer; Intl. Cancer Genomics Consortium; ISPY consortium; MSKCC; LINCS consortium; Christopher Benz, Buck Institute; Laura Esserman, UCSF; Joe Gray, OHSU; Eric Collisson, UCSF; Gordon Mills, MDACC; Rachel Schiff, BCM
Funding Agencies: NCI/NIH, NHGRI; American Association for Cancer Research

cgData Packages
Each data file is paired with a meta-data file:
• clinical data 1 (FFPE, timepoint) + meta-data
• clinical data 2 (patient, age, ..) + meta-data
• genomic data A (CNV) + meta-data
• genomic data B (RPPA) + meta-data
The data files are most likely your own; the meta-data files need to be added.

cgData Packages
idMap (TCGA BRCA) links identifiers at the patient, sample and aliquot levels:
• clinical data 1 (FFPE, timepoint): TCGA-01-ABCD-01A
• clinical data 2 (patient, age, ..): TCGA-01-ABCD
• genomic data A (CNV) and genomic data B (RPPA): aliquots TCGA-01-ABCD-01A-EG, TCGA-01-ABCD-01A-JH

cgData Packages
[Diagram labels: clinical data 1 (FFPE, timepoint); clinical data 2 (patient, age, ..); idMap (TCGA BRCA); genomic data A (CNV); assembly (hg18); genomic data B (RPPA); probeMap B (antibody)]
• Data files: most likely your own; a meta-data file needs to be added
• idMap: parent-child relationships
• probeMap B: identifiers used in data files
• Most likely already in the UCSC cgData library
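
The Web API slide earlier in the deck describes building a URL from a base plus parameters. As a minimal sketch of that idea, the Python below assembles such a URL. The base string and the parameter names `dataset`, `pos`, `displayas` and `gene_list` are taken from the slide's two examples; the function name `browser_url` and its argument layout are hypothetical, and the genomic-signature parameter is left out because the slide does not name it.

```python
from urllib.parse import urlencode

# Base as printed on the Web API slide; view parameters follow the "#?".
BASE = "https://genome-cancer.ucsc.edu/hgHeatmap/#?"


def browser_url(dataset, pos=None, displayas=None, gene_list=None):
    """Build a Cancer Browser view URL (hypothetical helper, not part of the site).

    Only parameters that appear in the slide's examples are supported.
    """
    params = {"dataset": dataset}
    if pos is not None:
        params["pos"] = pos                        # e.g. "chr2:123767566-chr2:187943340"
    if displayas is not None:
        params["displayas"] = displayas            # e.g. "geneset"
    if gene_list:
        params["gene_list"] = ",".join(gene_list)  # comma-separated gene names
    # Keep ":" and "," unescaped so the result matches the slide's examples.
    return BASE + urlencode(params, safe=":,")


if __name__ == "__main__":
    print(browser_url("vijver2002", pos="chr2:123767566-chr2:187943340"))
    print(browser_url("ucsfNeveCGH", displayas="geneset", gene_list=["TP53", "ERBB2"]))
```

The two calls in the `__main__` block reproduce the two example parameter strings from the slide, appended to the base. Whether the server accepts other parameter orders or percent-encoded values is not stated in the deck, so the sketch mirrors the examples as closely as possible.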