I. Introduction and Red Line

Education for Data-unlimited Science

Educational Challenge
For the first time in the history of biology, students can work with the same data, at the same time, and with the same tools as research scientists.

[Diagram: Research and Education in the context of scientific discovery]

“My own suspicion is that the universe is not only queerer than we suppose, but queerer than we can suppose.”
J.B.S. Haldane, Possible Worlds and Other Essays (1927)

Plant Genomes Vary Widely in Size
[Figure: phylogeny of dicots and monocots plotted against time (million years), with genome duplication events marked. Species shown: Glycine max (soy), Arabidopsis, Oryza (rice), Avena (oats), Brachypodium, Hordeum (barley), Triticum (wheat), Setaria (foxtail millet), Pennisetum (pearl millet), Sorghum, and Zea (maize). Labeled genome sizes range from 145 Mb to more than 20,000 Mb.]

Genome Duplication/Fractionation

DNA Subway Concepts (Big Ideas)
• Genomes are complex and dynamic (queer).
• DNA sequence is information.
• DNA sequence is biological identity.
• Gene annotation adds meaning to DNA sequence.
• The concept of a gene continues to evolve.
• A genome is more than genes.

Insights from Genomics in Education
Washington University, June 16-19, 2009
44 participants from three worlds and three kingdoms
• Bioinformatics: Students have limited patience for pure computer work and want a wet-bench hook.
• Student-scientist partnerships: Someone has to care about the data generated by students.
• Students as co-investigators: Projects should potentially lead to publication.
• Scale: Need to move from individual experiments to course-based and distributed research projects.
Walk or… Ride…

DNA Subway: an educational Discovery Environment
• Simplified bioinformatics workflows
• Developed with 25 collaborators at 11 institutions
• Since March 2010 launch: 2,905 registered users; 52,591 visits; 24,593 unique visits
• Red Line: predict and annotate genes in <150 kb
• Yellow Line: identify homologs in sequenced genomes
• Blue Line: analyze DNA barcodes and build gene trees
• Green Line: align and analyze RNA-seq data (coming)

Red Line Learning Questions
• What is a gene and how does it relate to DNA sequence?
• What are the components of genes?
• How does a gene relate to the central dogma of molecular biology: DNA <> RNA > Protein?
• How does a gene encode a protein?
• How is mathematical evidence used to predict genes?
• How does biological evidence (from RNA and proteins) confirm gene predictions?

Genes as Beads on a String
Morgan’s Beads on a String
http://www.ncbi.nlm.nih.gov/genome/guide/human/
Human Globin Locus on Chromosome 11

Human Genome Insights (ENCODE)
• Majority of genome is transcribed
• ~50% transposons
• ~25% protein coding genes/1.3% exons
• ~23,700 protein coding genes
• ~160,000 transcripts
• Average gene ~36,000 bp: 7 exons @ ~300 bp, 6 introns @ ~5,700 bp
• 7 alternatively spliced products (95% of genes)

Piano Keys?
Keys dynamically placed by real data (features, coordinates)

What is a gene and how does it relate to DNA?
• This map lets students appreciate some of the complexity of the genome.
• Clicking on links to sequence confirms a relationship between something called a gene and a DNA sequence.

Gene Annotation Workflow
• Submit Sequence
• Identify & Mask Repeats
• Predict Genes
• (Optional) Load User Data
• Search Datasets
• Build Gene Models
• Prospect Genomes
• Predict Function
• Compare Annotations

Brent Buckner, Ph.D.
Truman State University
“I have found that students are overwhelmed by their first introduction to genome sequences viewed on a genome browser.
Students who used DNA Subway needed little or no guidance when they moved on to use MaizeGDB and had an easier time transitioning to genomes depicted in different genome browsers.”

DNA Subway Case Study
Brent Buckner, Ph.D., Truman State University
• Sophomore genetics class, spring 2010 and 2011
– 70 students used Red Line to annotate 3.7 Mbp of maize genome
– 12 hours effort, each student annotated 100 kb
– Follow-up research projects by 7 undergraduates:
• Compared syntenic regions of maize Chr. 6 and sorghum
• 65 hours effort, each student annotated 1 million bp
• MaizeGDB, MaizeSequence.org, InterProScan, CoGE, PlexDB, Circos
• Sophomore genetics class, spring 2012
– 19 students used Red Line to visualize next-gen RNA-Seq data to investigate presence/absence variation (PAV) in maize
– 12 hours effort, each student group annotated 100 kb and then imported next-gen RNA-Seq data from 5 different tissues in 30 maize inbred lines for a gene that they had previously shown exhibits PAV

Judy Brusslan, Ph.D.
CSU, Long Beach
“When I used the Red Line exercise in six lab sections of my General Genetics class this Fall, it went smoothly and, best of all, there was a mass “Ah-ha” moment when the results of the gene prediction programs were displayed on the Genome Browser. The use of BLASTX and BLASTN within the Red Line allowed the students to visualize the different outputs and understand the value of sequenced cDNAs for gene prediction.”
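
Worked example: the average-gene figures on the Human Genome Insights slide (7 exons at ~300 bp, 6 introns at ~5,700 bp, average gene ~36,000 bp) can be checked with simple arithmetic. The sketch below is only an illustration of that check. The function name and variables are hypothetical and are not part of DNA Subway, and the inputs are the rounded slide values rather than measured data.

```python
# Illustrative check of the slide's "average gene" numbers.
# average_gene_length is a hypothetical helper, not a DNA Subway function.

def average_gene_length(n_exons, exon_bp, n_introns, intron_bp):
    """Return total gene length in bp from average exon and intron sizes."""
    return n_exons * exon_bp + n_introns * intron_bp

# Rounded values taken from the slide.
N_EXONS, EXON_BP = 7, 300
N_INTRONS, INTRON_BP = 6, 5700

total_bp = average_gene_length(N_EXONS, EXON_BP, N_INTRONS, INTRON_BP)
exonic_bp = N_EXONS * EXON_BP
exon_fraction = exonic_bp / total_bp

print(f"Total gene length: {total_bp} bp")        # 36300 bp, i.e. ~36,000 bp
print(f"Exonic sequence:   {exonic_bp} bp")       # 2100 bp
print(f"Exon share of gene: {exon_fraction:.1%}")  # about 5.8%
```

With these rounded inputs, introns account for most of an average gene's length, which is consistent with the slide's point that protein coding genes span far more of the genome than their exons do.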