RNA-seq Analysis in Galaxy
Hot Topics: RNA-seq Analysis in Galaxy. January, 2013 (1/17/2013)
Local copy of Galaxy: https://galaxy.wi.mit.edu/
Main site: http://main.g2.bx.psu.edu/

Outline
• What is the Galaxy Interface?
• What is RNA-seq?
• Data upload in Galaxy
• Data preprocessing in Galaxy
• Tuxedo tools for RNA-seq analysis
• RNA-seq analysis workflows: Hands-on

The Galaxy Interface
A web-based platform for analysis of large genomic datasets.
• No need of programming experience.
• Integrates many bioinformatics tools within one interface.
• Keeps track of all the steps performed in an analysis. Even if you delete the datasets, the history keeps the tools used.
• Can create a workflow to repeat an analysis.

Local copy
• Faster
• Customizable
• 250Gb of storage
• Data is private
• Jobs are sent to the cluster

Galaxy Interface: Analyze Data
Type “https://galaxy.wi.mit.edu/” in your browser's address bar. You will be prompted for your name and password (these are the same ones that you use for your email).
• Tools window
• Data display and tool’s dialog window
• History window: All analysis steps are saved. Data is not overwritten.
• Screenshot labels: Data analysis, Processed data

Job status colors in the history:
• Green: job is finished
• Yellow: job is running
• Gray: job is in queue
• Red: there is a problem

How to find your previous histories
• History menu

RNA-seq Experiment
[Figure] Wang, Z. et al. RNA-Seq: a revolutionary tool for transcriptomics. Nature Reviews Genetics (2009)

RNA-seq Applications
• Annotation: Identify novel genes, transcripts, exons, splicing events, ncRNAs.
• Detecting RNA editing and SNPs.
• Measurements: RNA quantification and differential gene expression. Abundance of transcripts between different conditions.

Getting Data: Uploading Large Files
Step 1: Copy your file to /nfs/galaxy/uploads/[email protected] using an sftp client (CyberDuck).
Step 2: Select and upload the file within Galaxy.
• Screenshot labels: Upload File, Genome Assembly, Execute

Preprocessing
• NGS Quality Control: FastQC
• Remove bad quality reads: FASTX-TOOLKIT -> Filter by quality

RNA-seq Analysis: Tuxedo Tools
Tool          Tool description
Bowtie        Ultrafast short read aligner
Tophat        Aligns RNA-seq reads to the genome using Bowtie. Discovers splice sites
Cufflinks     Assembles transcripts
Cuffcompare   Compares transcript assemblies to annotation
Cuffmerge     Merges two or more transcript assemblies
Cuffdiff      Finds differentially expressed genes and transcripts. Detects differential splicing and promoter use
Cufflinks, Cuffcompare, Cuffmerge, and Cuffdiff make up the Cufflinks package.

Examples of analysis workflows
• Hands-on 1: Differential expression analysis
• Hands-on 2: Transcript assembly and differential expression analysis
• Hands-on 3: Transcript assembly and transcript comparison

Tutorials and References
• Galaxy tutorials: http://galaxy.psu.edu/screencasts.html
• Previous Hot Topics: http://jura.wi.mit.edu/bio/education/hot_topics
• SOPs (Standard operating procedures): https://gir.wi.mit.edu/trac/wiki/barc/SOPs
• References
  Taylor et al. (2007) Using Galaxy to perform large-scale interactive data analyses. Current Protocols in Bioinformatics, Chapter 10, unit 10.
  Blankenberg et al. (2010) Manipulation of FASTQ data with Galaxy. Bioinformatics 26(14):1783-5.
  Trapnell et al. (2012) Differential gene and transcript expression analysis of RNA-seq experiments with TopHat and Cufflinks. Nature Protocols 7, 562–578.
  Tophat Manual: http://tophat.cbcb.umd.edu/manual.html
  Cufflinks Manual: http://cufflinks.cbcb.umd.edu/manual.html
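
Appendix sketch: the Tuxedo tools as a command chain
In Galaxy each Tuxedo tool is run from its own form, but it can help to see how the tools feed into one another. The Python sketch below only builds the command lines for something like Hands-on 2 (transcript assembly followed by differential expression) and does not run them. It follows the general pattern described in Trapnell et al. (2012). The function name, the sample labels, the output directory names, and the file assemblies.txt are hypothetical choices made for illustration, not part of the slides or of the Galaxy setup.

```python
from typing import Dict, List


def tuxedo_commands(
    bowtie_index: str,
    annotation_gtf: str,
    genome_fasta: str,
    samples: Dict[str, List[str]],
) -> List[List[str]]:
    """Build (but do not run) commands for assembly + differential expression.

    `samples` maps a condition label to its FASTQ files, one per replicate.
    All output names below are illustrative assumptions.
    """
    commands: List[List[str]] = []
    bams_by_condition: Dict[str, List[str]] = {}

    for condition, fastq_files in samples.items():
        for replicate, fastq in enumerate(fastq_files, start=1):
            name = f"{condition}_R{replicate}"
            # TopHat: align RNA-seq reads to the genome and discover splice sites.
            commands.append(
                ["tophat", "-G", annotation_gtf, "-o", f"{name}_thout",
                 bowtie_index, fastq]
            )
            bam = f"{name}_thout/accepted_hits.bam"
            # Cufflinks: assemble transcripts from the alignments.
            commands.append(["cufflinks", "-o", f"{name}_clout", bam])
            bams_by_condition.setdefault(condition, []).append(bam)

    # Cuffmerge: merge the per-sample assemblies. assemblies.txt is assumed to
    # list every <name>_clout/transcripts.gtf and must be written separately.
    commands.append(
        ["cuffmerge", "-g", annotation_gtf, "-s", genome_fasta,
         "-o", "merged_asm", "assemblies.txt"]
    )

    # Cuffdiff: compare conditions; replicates of one condition are comma-joined.
    labels = ",".join(bams_by_condition)
    replicate_groups = [",".join(bams) for bams in bams_by_condition.values()]
    commands.append(
        ["cuffdiff", "-o", "diff_out", "-L", labels,
         "merged_asm/merged.gtf", *replicate_groups]
    )
    return commands
```

Calling tuxedo_commands("genome", "genes.gtf", "genome.fa", {"C1": ["a.fq", "b.fq"], "C2": ["c.fq"]}) returns eight commands: a TopHat and a Cufflinks call per replicate, then one Cuffmerge and one Cuffdiff call. Printing each with " ".join(cmd) gives lines that can be compared against the corresponding Galaxy tool forms.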