Mining TCGA Gene Expression Data
Kimberly J. Bussey, Ph.D.
Assistant Professor
Integrated Cancer Genomics Division
Translational Genomics Research Institute

Outline
• Biology
• Types of gene expression measurements
• Gene expression data in TCGA
• Tools for working with the data

What do we mean by gene expression?
[Diagram: DNA, RNA, Protein, labeled GENE EXPRESSION]

Gene Expression Platforms: RNA
• Array-based
  – Affymetrix, Illumina, Agilent, etc.
  – Can be total mRNA, focused on exons or splice variants, or miRNA
  – Probes designed to specific sequences
• RNA-seq
  – Sequencing-based
  – Either mRNA or miRNA

RNA workflow
• RNA extraction
  – Different protocols depending on the target pool of RNA: total RNA, mRNA, or miRNA
• Create cDNA or cRNA library
• For array-based methods, hybridization and image analysis

RNA-Seq
Illumina Sequencing Technology Overview
[Figure: DNA (0.1-5.0 µg), library preparation, single-molecule array, cluster growth, sequencing, image acquisition, base calling]

Gene Expression Platforms: Protein
• Reverse Phase Protein Arrays
  – Serial dilutions of protein lysate spotted onto an array
  – Probed with antibody

Protein workflow

TCGA data sets
• https://wiki.nci.nih.gov/display/TCGA/TCGA+Data+Primer
• Remember: what is public depends on the risk of being able to identify the subject
  – No BAM files, no FASTQ files without controlled access approval

[Figure labels: More, Cell types, Protein, RNA, DNA]

Integration is the goal, but…
• You need to understand what the technology measured
• You need to know how that measurement was annotated
• Remember that not all identifiers are stable over time
• Excel does bad things to gene symbols and clone IDs

About Copy Number…
[Figure: Normal vs. Cancer]

Tools and Questions
• IGV: Visualize the output of FireHose
• UCSC Cancer Genome Viewer: Visualize and do subset analysis
• Regulome Explorer: Integrative analysis driven by statistical associations
• GenePattern: Suite of tools for many different types of data sets
• IPA, GeneGO, etc.: Pathway enrichment analysis

Biosig

cBio

Signatures/Gene Sets - MSigDB

UCSC Cancer Genomics Browser

Cancer Genome Workbench

IGV

Take Home Points
• Gene expression data can be RNA or protein-based measurements
• Different tools are good for showing you different relationships
• Interpretation of the results requires an understanding of what was actually measured

QUESTIONS?
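
As a follow-up to the warning that Excel does bad things to gene symbols and clone IDs: a minimal Python sketch of one way to screen an identifier column for likely damage before integrating data sets. It assumes the two commonly reported failure modes, symbols turned into dates (SEPT2 becoming "2-Sep") and numeric-looking clone IDs turned into scientific notation (2310009E13 becoming "2.31E+13"). The function name `flag_excel_damage` and both patterns are illustrative assumptions, not part of any TCGA tool, and the patterns are not exhaustive.

```python
import re

# Assumed patterns for Excel auto-conversions; illustrative, not exhaustive.
# Date-like: "2-Sep", "1-Mar", "1-Dec" (from symbols such as SEPT2, MARCH1, DEC1).
DATE_LIKE = re.compile(
    r"^\d{1,2}-(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)$",
    re.IGNORECASE,
)
# Scientific-notation-like: "2.31E+13" (from clone IDs such as 2310009E13).
SCI_LIKE = re.compile(r"^\d(\.\d+)?E\+\d+$", re.IGNORECASE)


def flag_excel_damage(identifiers):
    """Return the identifiers that look like dates or scientific notation."""
    return [
        s for s in identifiers
        if DATE_LIKE.match(s.strip()) or SCI_LIKE.match(s.strip())
    ]


# Hypothetical identifier column read from a spreadsheet export.
symbols = ["TP53", "2-Sep", "1-Mar", "EGFR", "2.31E+13", "SEPT2"]
print(flag_excel_damage(symbols))  # ['2-Sep', '1-Mar', '2.31E+13']
```

A flagged value only says the identifier probably passed through a spreadsheet. The original symbol usually cannot be recovered from the damaged value alone, so the safer fix is to go back to the source file.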