DNA Subway Green Line: Onramp to HPC in Biology Education

Dave Micklos and Uwe Hilgert
iPlant Collaborative
DNA Learning Center, Cold Spring Harbor Laboratory; Bio5 Institute, University of Arizona

…ride an educational Discovery Environment

Green Line: RNA Sequence (RNA-Seq) Analysis
• First fully graphical (GUI) interface for RNA-Seq analysis: no command line or data conversions
• Accesses XSEDE system through the iPlant Agave API
• Co-localizes up to 100 GB of data in iPlant Data Store
• Look for differential gene expression in different tissues, life stages, or treatments
• Generate lists of expressed genes and fold-changes
• Annotate sequenced genomes; add results to Red Line projects

RNA code represents “active” DNA in genome
[Figure label: 150 feet]

Homo sapiens bitter taste receptor (TAS2R38)
DNA code > RNA code

CCTTTCTGCACTGGGTGGCAACCAGGTCTTTAGATTAGCCAACTAGAGAAGAGAAGTAGAATAGCC
AATTAGAGAAGTGACATCATGTTGACTCTAACTCGCATCCGCACTGTGTCCTATGAAGTCAGGAGT
ACATTTCTGTTCATTTCAGTCCTGGAGTTTGCAGTGGGGTTTCTGACCAATGCCTTCGTTTTCTTG
GTGAATTTTTGGGATGTAGTGAAGAGGCAGGCACTGAGCAACAGTGATTGTGTGCTGCTGTGTCTC
AGCATCAGCCGGCTTTTCCTGCATGGACTGCTGTTCCTGAGTGCTATCCAGCTTACCCACTTCCAG
AAGTTGAGTGAACCACTGAACCACAGCTACCAAGCCATCATCATGCTATGGATGATTGCAAACCAA
GCCAACCTCTGGCTTGCTGCCTGCCTCAGCCTGCTTTACTGCTCCAAGCTCATCCGTTTCTCTCAC
ACCTTCCTGATCTGCTTGGCAAGCTGGGTCTCCAGGAAGATCTCCCAGATGCTCCTGGGTATTATT
CTTTGCTCCTGCATCTGCACTGTCCTCTGTGTTTGGTGCTTTTTTAGCAGACCTCACTTCACAGTC
ACAACTGTGCTATTCATGAATAACAATACAAGGCTCAACTGGCAGATTAAAGATCTCAATTTATTT
TATTCCTTTCTCTTCTGCTATCTGTGGTCTGTGCCTCCTTTCCTATTGTTTCTGGTTTCTTCTGGG
ATGCTGACTGTCTCCCTGGGAAGGCACATGAGGACAATGAAGGTCTATACCAGAAACTCTCGTGAC
CCCAGCCTGGAGGCCCACATTAAAGCCCTCAAGTCTCTTGTCTCCTTTTTCTGCTTCTTTGTGATA
TCATCCTGTGCTGCCTTCATCTCTGTGCCCCTACTGATTCTGTGGCGCGACAAAATAGGGGTGATG
GTTTGTGTTGGGATAATGGCAGCTTGTCCCTCTGGGCATGCAGCCATCCTGATCTCAGGCAATGCC
AAGTTGAGGAGAGCTGTGATGACCATTCTGCTCTGGGCTCAGAGCAGCCTGAAGGTAAGAGCCGAC
CACAAGGCAGATTCCCGGACACTGTGCTGAGAATGGACATGAAATGAGCTCTTCATTAATACGCCT
GTGAGTCTTCATAAATATGCC

Differential Gene Expression
RNA Sequence (RNA-Seq) gives a “snapshot” of genes active in different cells at different times

RNA Sequence (RNA-Seq) Analysis
• Design RNA-Seq experiment, e.g., differential expression
• Isolate total RNA; convert to DNA library
• Sequence experiment and control libraries
• Analyze sequence data on DNA Subway Green Line
• Follow-up experimental validation
Image source: http://www.bgisequence.com

Green Line stops
1) Manage Data: Quality assessment with FastQC; ~100 million 75/150-nucleotide reads in <1 hr
2) FastX Toolkit: Quality control with FastX Toolkit; ~100 million 75/150-nucleotide reads in <1 hr (some took up to 19 hours…)
3) TopHat: Aligns ~100 million 75/150-nucleotide (paired-end) reads to a reference genome of 100M–5B in 6–19 hr
   [Figures: TopHat Alignment, JBrowse]
4) CuffLinks: Assembles transcripts and calculates abundance on BAM files, 1–12 GB in 6–19 hr
5) CuffDiff: Merges assemblies from Cufflinks and performs differential expression analysis on 4–9 samples in 6–19 hr

Green Line: Queue time vs. run time
• Asking for a long run time leads to longer queue times
• Asking for a short run time may lead to the job being terminated
• Users don't like to wait too long
• Users want the results right away
• Finding the right balance is not easy

Green Line: Dealing with the unexpected
• Systems taken offline
• Maintenance
• Network outages, data transfer issues
• Science API glitches
• Authentication

Green Line: “Monitoring XSEDE”

DNA Subway “Power Desktop”
• Intuitive interface to support seamless genome “round trip” for eukaryote of choice
• Access high performance computing to analyze whole genome data (RNA-Seq, initially)
• Scaffold data to sequenced genomes available in iPlant Data Store
• Directly upload RNA-Seq reads as biological evidence for genome annotation using Red Line

NSF CCLI Project Retreat
June 8–20, 2014, CSHL
• 11 faculty from PUIs
• Program included lectures/practical sessions
  • Wet lab: RNA library prep
  • Green Line analysis & bioinformatics
  • Pedagogy/teaching resources
  • Virtual training materials

NSF CCLI Project Retreat: Faculty Participants
• Agnes Ayme-Southgate, College of Charleston, SC: Flight muscle development during life-stage transitions in Apis mellifera (honeybee)
• Judy Brusslan, California State University, Long Beach, CA: Leaf development and senescence in Arabidopsis thaliana
• Raymond Enke, James Madison University, VA: Retina development in Gallus gallus
• Shaye Lewis, Prairie View A&M University, TX: Testes development from juvenile to puberty in caprine (goat)
• Irina Makarevitch, Hamline University, MN: Response to cold stress in maize
• Judith Ogilvie, Saint Louis University, MO: Retinal changes of mice with retinitis pigmentosa
• Jeremy Seto, New York City College of Technology, CUNY, NY: Differentiation of rat pheochromocytoma line cells (PC12) to a neuronal-like phenotype
• Carrie Thurber, Abraham Baldwin Agricultural College, IL: Seed abscission in Sorghum bicolor
• George Ude, Bowie State University, MD: Floral inflorescence genes in banana/plantains
• Deirdre Vaden, Prairie View A&M University, TX: Peripheral blood mononuclear cells from hypertensive rats treated with captopril
• Scott Woody, University of Wisconsin, WI: Gibberellic acid exposure in Brassica rapa (Fast Plants) gibberellic acid (gad) mutants

Flight muscle development during life-stage transitions in Apis mellifera (honeybee)
Agnes Ayme-Southgate, College of Charleston, SC
All honeybees begin as worker bees, flying short distances. Some honeybees transition into foragers, flying long distances. This transition necessitates major changes in flight muscles.
Goal is to identify the gene expression changes in flight muscles during this transition.

Courses
• Biol 322: Developmental Biology, 30–38 students
• Genetics, 100 students
• Undergraduate research in lab, 2–3 students

Differential gene expression in Capra hircus (goat) testes during juvenile development
Shaye Lewis, Prairie View A&M University, TX
Fertility phenotypes show low heritability, and semen analysis parameters cannot determine fertility status. Molecular biomarkers can increase efficiency of artificial insemination and embryo transfer in goats.
Goal is to identify genes important for normal testes development and function.

Courses
• 4533: Animal Breeding & Genetics, 20 students
• Undergraduate research in lab, 4 students

Understanding transcriptional response to cold stress in maize
Irina Makarevitch, Hamline University, MN
Maize is grown worldwide and is a staple for >1 billion people. Maize is thermophilic and sensitive to low temperatures, and understanding how plants respond to cold can improve yields.
Goal is to identify genes that are differentially expressed when maize is grown under cold stress.

Courses
• Biol 201: Principles of Genetics, 80 students
• Biol 301: Genomics & Bioinformatics, 20 students
• Undergraduate research in lab, 4 students

RNA-Seq Datasets Generated and Analyzed Using the Green Line of DNA Subway
• 8 eukaryotic organisms
• 21 controls paired with 26 experimental conditions
• 402 Gbases sequenced
• 837 jobs submitted to TACC
• 87% of jobs completed
• 695 hours total CPU time
• 16 threads/processors running concurrently

Intended Implementation 2014-15
100 level 200 level 300 level 400 level 500 level
Intro Genetics, 270 Genetics, 220 Molecular & Cell Molecular Biology, Biology, 50 100 Molecular Applications in Crop Improvement 15 Biology Cell & Molecular Biology, 75 20 15 Genomics, 40 Genomics & Bioinformatics, 70 Animal Breeding & Genetics, 20 Developmental Biology, 35 Independent Research, 5 Undergrad Research Cell Structure & Function, 30 Synthetic Biology, 30 Anatomy/Physiology, 50 Advanced Genetic Techniques, 15
100s 320 550 140

DNA Subway is…
• Producers: Uwe Hilgert, David Micklos, Jason Williams
• Designers: Eun-Sook Jeong, Susan Lauter
• Programmers: Cornel Ghiban, Mohammed Khalfan, Sheldon McKay
• Contributors: Matt Vaughn, Rion Dooley, Anthony Biondo, Jim Burnette, Scott Cain, Ed Lee, Zhenyuan Lu
• Advisors: Matt Conte, Carson Holt, Bruce Nash, Oscar Pineda-Catalan

HPC in Undergraduate Biology Education
Banbury Center, CSHL, September 3-5, 2014
A Great Gatsby era estate on Long Island’s “Gold Coast”
Contact: Dave Micklos ([email protected])

Funded by NSF and the Alfred P. Sloan Foundation
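
Illustration: what a "fold-change" list looks like

The Green Line overview mentions generating lists of expressed genes and fold-changes. The sketch below is only a rough illustration of that idea, not the method CuffDiff uses (CuffDiff applies its own normalization and statistical testing). It assumes each gene has one abundance value per condition. The gene names, the abundance values, and the pseudocount of 1.0 (added to avoid dividing by zero) are all hypothetical.

```python
import math


def log2_fold_change(control, treated, pseudocount=1.0):
    """Return log2 of (treated + pseudocount) / (control + pseudocount).

    The pseudocount is an assumption of this sketch; it keeps the ratio
    defined when a gene has zero abundance in one condition.
    """
    return math.log2((treated + pseudocount) / (control + pseudocount))


def rank_by_fold_change(abundances, pseudocount=1.0):
    """Rank genes by the magnitude of their log2 fold-change.

    `abundances` maps a gene name to a (control, treated) pair.
    Returns a list of (gene, log2_fold_change) tuples, largest change first.
    """
    scored = [
        (gene, log2_fold_change(control, treated, pseudocount))
        for gene, (control, treated) in abundances.items()
    ]
    return sorted(scored, key=lambda item: abs(item[1]), reverse=True)


# Hypothetical abundance values (control, treated) for three made-up genes.
example = {
    "geneA": (3.0, 15.0),   # higher in the treated sample
    "geneB": (7.0, 7.0),    # unchanged
    "geneC": (63.0, 7.0),   # lower in the treated sample
}

ranked = rank_by_fold_change(example)
for gene, change in ranked:
    print(f"{gene}\t{change:+.2f}")
```

A positive value means the gene is more abundant in the treated sample and a negative value means it is less abundant. Ranking by magnitude puts the most strongly changed genes at the top, which is the kind of list a follow-up validation experiment would start from.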