iPlant Genomics in Education Workshop
Genome Exploration in Your Classroom

Major Workshop Concepts:
• Biology is becoming a “Data Unlimited” science.
• Genomes are dynamic.
• Genomes are more than just protein-coding genes.
• DNA sequence is information.
• Gene annotation adds “meaning” to DNA sequence.
• Biological concepts like “genes” and “species” are continually evolving.
• DNA barcoding bridges molecular genetics, evolution, and ecology.

The Problem of Big Data in Biology
The abundance of biological data generated by high-throughput sequencing creates challenges as well as opportunities:
• How do scientists share their data and make it publicly available?
• How do scientists extract maximum value from the datasets they generate?
• How can students and educators (who will need to come to grips with data-intensive biology) be brought into the fold?

The iPlant Collaborative
A 5-10 year project to develop a computing infrastructure that applies computational thinking to biological problems.
• High-performance computing
• Data and data analysis
• Virtual organization
• Learning and workforce

Bringing Genomics into the Classroom
Visualization of the Pectobacterium atrosepticum genome: http://www.scri.ac.uk/research/pp/plantpathogengenomics/pathogenbioinformatics

• 1866 – Mendel publishes work on inheritance
• 1869 – DNA discovered
• 1915 – Thomas Hunt Morgan describes linkage and recombination
• 1953 – Structure of DNA described
• 1956 – Human chromosome number determined
• 1968 – First gene mapped to an autosome
• 1977 – Dideoxy sequencing
• 1983 – PCR
• 1986 – Human Genome Project proposed
• 1993 – First microRNAs described
• 2003 – First “Gold Standard” human genome sequence
• 2005 – First draft of human haplotype map (HapMap)
• 2007 – ENCODE project

Timeline: Wellcome Trust
(http://www.wellcome.ac.uk/stellent/groups/corporatesite/@policy_communications/documents/web_document/wtx063807.pdf)

“Essentially, all models are wrong, but some are useful” – George E.P. Box

From this… to this…

• Majority of genome is transcribed
• ~50% transposons
• ~25% protein-coding genes / 1.3% exons
• ~23,700 protein-coding genes
• ~160,000 transcripts
• Average gene: ~36,000 bp
• 7 exons @ ~300 bp
• 6 introns @ ~5,700 bp
• 7 alternatively spliced products (95% of genes)
• RefSeq: ~34,600 “reference sequence” genes (includes pseudogenes, known RNA genes)

Using Plants to Explore Genomics
A large number of plant genomes are available for analysis.

“Plant genomes range from simple to exceptionally complex” – Richard Cronn, USDA Forest Service

This diversity among plant genomes provides a rich platform for examining the genome as a phenomenon.
• Genlisea margaretae: 63 Mb
• Paris japonica: 150 Gb

The “weirdness” of plant genomes on your dinner plate
Figure labels: Brachypodium, Sorghum, Oryza (numbered chromosomes); Triticum aestivum: allohexaploid.

Figure labels: Dicots: Glycine max (soy), Arabidopsis (145 Mb). Monocots: Oryza (rice, 430 Mb), Avena (oats), Brachypodium, Hordeum (barley), Triticum (wheat), Pennisetum (pearl millet), Zea (maize), Setaria (foxtail millet), Sorghum. Axis: Time (million years), ending at Present. Legend: genome duplication event. Other genome sizes shown: 270 Mb, 750 Mb, 1,115 Mb, 2,500 Mb, 5,200 Mb, 20,000 Mb, >20,000 Mb, and two marked ?? Mb.
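
As a possible classroom exercise, the numbers quoted in the slides above can be checked with a few lines of Python. This is a minimal sketch, not part of the original workshop material: the function and variable names are illustrative, and it assumes 1 Mb = 1,000,000 bp and 1 Gb = 1,000 Mb. It compares the smallest and largest plant genome sizes quoted, and adds up the exon and intron lengths given for the average gene.

```python
# Values copied from the slides; variable and function names are illustrative.
MB = 1_000_000   # base pairs per megabase (assumed decimal units)
GB = 1_000 * MB  # base pairs per gigabase

genome_sizes_bp = {
    "Genlisea margaretae": 63 * MB,
    "Paris japonica": 150 * GB,
}


def fold_difference(larger_bp: int, smaller_bp: int) -> float:
    """Return how many times larger one genome is than another."""
    return larger_bp / smaller_bp


def gene_length_bp(n_exons: int, exon_bp: int, n_introns: int, intron_bp: int) -> int:
    """Sum exon and intron lengths, treating all exons (and all introns) as equal in size."""
    return n_exons * exon_bp + n_introns * intron_bp


smallest = min(genome_sizes_bp, key=genome_sizes_bp.get)
largest = max(genome_sizes_bp, key=genome_sizes_bp.get)
ratio = fold_difference(genome_sizes_bp[largest], genome_sizes_bp[smallest])
print(f"{largest} is about {ratio:,.0f}x the size of {smallest}")

avg_gene_bp = gene_length_bp(n_exons=7, exon_bp=300, n_introns=6, intron_bp=5_700)
print(f"7 exons + 6 introns = {avg_gene_bp:,} bp")
```

The exon and intron figures sum to 2,100 + 34,200 = 36,300 bp, consistent with the quoted average gene length of ~36,000 bp, and the two plant genomes differ in size by roughly 2,400-fold.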