RNA-seq Analysis in Galaxy
Pawel Michalak ([email protected])

Two applications of RNA-Seq

Discovery
• find new transcripts
• find transcript boundaries
• find splice junctions

Comparison
• Given samples from different experimental conditions, find effects of the treatment on gene expression strengths
• Isoform abundance ratios, splice patterns, transcript boundaries

Specific Objectives
By the end of this module, you should
1) Be more familiar with the DE user interface
2) Understand the starting data for RNA-seq analysis
3) Be able to align short sequence reads with a reference genome in the DE
4) Be able to analyze differential gene expression in the DE
5) Be able to use DE text manipulation tools to explore the gene expression data

Conceptual Overview

Key Definitions

RNA-seq file formats
File formats – FASTQ
File formats – SAM/BAM
File formats – GTF

Experimental Design

Steps in RNA-seq Analysis

http://galaxyproject.org/

Galaxy workflow

QC and Data Prepping in Galaxy

Data Quality Assessment: FastQC

Read Mapping

Why TopHat?

TopHat2 in Galaxy

CuffLinks and CuffDiff
• CuffLinks is a program that assembles aligned RNA-Seq reads into transcripts, estimates their abundances, and tests for differential expression and regulation transcriptome-wide.
• CuffDiff is a program within CuffLinks that compares transcript abundance between samples.

Cuffcompare and Cuffmerge

CuffDiff results example

RNA-seq results normalization
Differential Expression (DE) requires comparison of 2 or more RNA-seq samples.
The number of reads (coverage) will not be exactly the same for each sample.

Problem: Read counts per gene need to be scaled to total sample coverage
Solution – divide counts by the number of millions of reads in the sample

Problem: Longer genes collect more reads, which gives a better chance of detecting DE
Solution – divide counts by gene length

Result = RPKM (Reads Per Kilobase per Million reads)

RPKM normalization

RNA-seq hands-on
Go to http://galaxyproject.org/ and then type in the URL address field https://usegalaxy.org/u/jeremy/d/257ca40a619a8591 (GM12878 cell line)
Click the green + near the top right corner to add the dataset to your history, then click on "start using the dataset" to return to your history, and then repeat with https://usegalaxy.org/u/jeremy/d/7f717288ba4277c6 (h1-hESC cell line)

http://staff.vbi.vt.edu/pawel/RNASeq.pdf
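
RPKM worked example
The two scaling steps from the normalization slide (divide by millions of reads in the sample, then divide by gene length in kilobases) can be sketched in a few lines of Python. This is only an illustrative sketch, not the calculation CuffDiff performs internally; the function name rpkm and all read counts, gene lengths, and library sizes below are made-up values chosen for the example.

```python
def rpkm(read_count, gene_length_bp, total_reads):
    """Reads Per Kilobase per Million reads for a single gene.

    read_count     -- reads assigned to the gene in this sample
    gene_length_bp -- gene (or transcript) length in base pairs
    total_reads    -- total reads in the sample (its coverage)
    """
    if gene_length_bp <= 0 or total_reads <= 0:
        raise ValueError("gene length and total reads must be positive")
    # Step 1: scale to sample coverage (counts per million reads).
    per_million = read_count / (total_reads / 1_000_000)
    # Step 2: scale to gene length (per kilobase).
    return per_million / (gene_length_bp / 1_000)


# Hypothetical numbers: the same 2 kb gene in two samples whose
# sequencing depth differs by a factor of two.
sample_a = rpkm(read_count=500, gene_length_bp=2_000, total_reads=10_000_000)
sample_b = rpkm(read_count=1_000, gene_length_bp=2_000, total_reads=20_000_000)

print(sample_a, sample_b)
```

In this made-up case the raw counts differ twofold, but both samples give the same RPKM, because the difference comes entirely from sequencing depth and not from expression.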