Download Red Line - iPlant Pods

I. Introduction and Red Line
Education for Data-unlimited Science
Educational Challenge
For the first time in the history of biology students
can work with the same data at the same time and
with the same tools as research scientists.
Research
Education
Context of scientific discovery
My own suspicion is that the universe is not only queerer
than we suppose, but queerer than we can suppose.
J.B.S. Haldane, Possible Worlds and Other Essays (1927)
Plant Genomes Vary Widely in Size
[Figure: phylogenetic tree of dicots and monocots drawn against a time axis (million years, ending at the present), with genome duplication events marked.
Taxa shown: Glycine max (soy), Arabidopsis, Oryza (rice), Avena (oats), Brachypodium, Hordeum (barley), Triticum (wheat), Setaria (foxtail millet), Pennisetum (pearl millet), Sorghum, Zea (maize).
Genome size labels: 145 Mb, 270 Mb, 430 Mb, 750 Mb, 1,115 Mb, 2,500 Mb, 5,200 Mb, 20,000 Mb, >20,000 Mb; two taxa are labeled ?? Mb.]
Genome Duplication/Fractionation
DNA Subway Concepts (Big Ideas)
• Genomes are complex and dynamic (queer).
• DNA sequence is information.
• DNA sequence is biological identity.
• Gene annotation adds meaning to DNA sequence.
• Concept of gene continues to evolve.
• A genome is more than genes.
Insights from Genomics in Education
Washington University, June 16-19, 2009
44 participants from three worlds and three kingdoms
• Bioinformatics: Students have limited patience for
pure computer work and want a wet bench hook.
• Student-scientist partnerships: Someone has to care
about the data generated by students.
• Students as co-investigators: Projects should
potentially lead to publication.
• Scale: Need to move from individual experiments to
course-based and distributed research projects.
Walk or…
Ride…
DNA Subway
an educational Discovery Environment
• Simplified bioinformatics workflows
• Developed with 25 collaborators at 11 institutions
• Since March 2010 launch: 2,905 registered users,
52,591 visits, 24,593 unique visits
• Red Line: predict and annotate genes in <150 kb
• Yellow Line: identify homologs in sequenced genomes
• Blue Line: analyze DNA barcodes and build gene trees
• Green Line: align and analyze RNA-seq data (coming)
Red Line Learning Questions
• What is a gene and how does it relate to DNA
sequence?
• What are the components of genes?
• How does a gene relate to the central dogma of
molecular biology: DNA <> RNA > Protein?
• How does a gene encode a protein?
• How is mathematical evidence used to predict
genes?
• How does biological evidence (from RNA and
proteins) confirm gene predictions?
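The question of how a gene encodes a protein can be made concrete with a few lines of Python. The following is a minimal sketch, not part of DNA Subway: it assumes the input is the coding strand, uses the standard genetic code, and ignores introns, so it only applies to an already spliced coding sequence. The function names and the example sequence are made up for illustration.

```python
# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def transcribe(coding_dna: str) -> str:
    """Return the mRNA for a coding-strand DNA sequence (T becomes U)."""
    return coding_dna.upper().replace("T", "U")


def translate(coding_dna: str) -> str:
    """Translate from the first base, stopping at the first stop codon.

    Codons containing anything other than A, C, G, T become 'X'.
    """
    seq = coding_dna.upper()
    protein = []
    for pos in range(0, len(seq) - 2, 3):
        residue = CODON_TABLE.get(seq[pos:pos + 3], "X")
        if residue == "*":  # stop codon ends the protein
            break
        protein.append(residue)
    return "".join(protein)


if __name__ == "__main__":
    dna = "ATGGCCAAGTAA"  # hypothetical example sequence
    print(transcribe(dna))  # mRNA
    print(translate(dna))   # one-letter amino acid string
```

Each three-base codon maps to one amino acid, which is why a single-base shift in the reading frame changes every downstream residue.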
Genes as Beads on a String
Morgan’s Beads on a String
http://www.ncbi.nlm.nih.gov/genome/guide/human/
Human Globin Locus on Chromosome 11
Human Genome Insights (ENCODE)
• Majority of genome is transcribed
• ~50% transposons
• ~25% protein coding genes/1.3% exons
• ~23,700 protein coding genes
• ~160,000 transcripts
• Average gene ~36,000 bp
  7 exons @ ~300 bp
  6 introns @ ~5,700 bp
• 7 alternatively spliced products (95% of genes)
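The figures on this slide are consistent with one another, which a short calculation shows. This is only an arithmetic check of the rounded numbers quoted above; the variable names are made up for illustration.

```python
# Rounded figures quoted on the slide.
EXONS_PER_GENE = 7
EXON_LENGTH_BP = 300
INTRONS_PER_GENE = 6
INTRON_LENGTH_BP = 5_700
PROTEIN_CODING_GENES = 23_700
TRANSCRIPTS = 160_000
GENIC_FRACTION_OF_GENOME = 0.25    # ~25% protein coding genes
EXONIC_FRACTION_OF_GENOME = 0.013  # ~1.3% exons

# Average gene length rebuilt from its parts.
exonic_bp = EXONS_PER_GENE * EXON_LENGTH_BP
intronic_bp = INTRONS_PER_GENE * INTRON_LENGTH_BP
gene_bp = exonic_bp + intronic_bp

# Share of an average gene that is exon, computed two ways.
exon_share_from_gene = exonic_bp / gene_bp
exon_share_from_genome = EXONIC_FRACTION_OF_GENOME / GENIC_FRACTION_OF_GENOME

# Average number of transcripts per protein coding gene.
transcripts_per_gene = TRANSCRIPTS / PROTEIN_CODING_GENES

if __name__ == "__main__":
    print(f"gene length: {gene_bp} bp")
    print(f"exon share (per gene): {exon_share_from_gene:.1%}")
    print(f"exon share (genome-wide): {exon_share_from_genome:.1%}")
    print(f"transcripts per gene: {transcripts_per_gene:.1f}")
```

Seven exons and six introns add up to 36,300 bp, matching the ~36,000 bp average; exons make up roughly 5 to 6% of a gene by either route; and 160,000 transcripts over 23,700 genes gives close to 7 products per gene.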
Piano Keys?
Keys dynamically placed by real data (features, coordinates)
What is a gene and how does it relate to DNA?
• This map can help students appreciate some of the complexity of the genome.
• Clicking on links to the sequence confirms a relationship between something called a gene
and a DNA sequence.
Gene Annotation Workflow
• Submit Sequence
• Identify & Mask Repeats
• Predict Genes
• (Optional) Load User Data
• Search Datasets
• Build Gene Models
• Prospect Genomes
• Predict Function
• Compare Annotations
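The "Identify & Mask Repeats" and "Predict Genes" steps can be illustrated with a toy example. The sketch below is not how the Red Line works internally and does not use its tools; it assumes repeats arrive soft-masked (lowercase), hides them as N, and then scans the three forward reading frames for open reading frames as a crude stand-in for mathematical gene prediction. Function names, the minimum length, and the test sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def hard_mask(soft_masked_dna: str) -> str:
    """Replace lowercase (soft-masked repeat) bases with N."""
    return "".join("N" if base.islower() else base for base in soft_masked_dna)


def find_orfs(dna: str, min_codons: int = 3):
    """Return (start, end, frame) for forward-strand ORFs.

    An ORF runs from the first ATG in a frame to the next in-frame stop
    codon. Coordinates are 0-based and `end` is exclusive (stop included).
    `min_codons` counts codons before the stop.
    """
    seq = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for pos in range(frame, len(seq) - 2, 3):
            codon = seq[pos:pos + 3]
            if start is None and codon == "ATG":
                start = pos
            elif start is not None and codon in STOP_CODONS:
                if (pos - start) // 3 >= min_codons:
                    orfs.append((start, pos + 3, frame))
                start = None  # look for the next ATG in this frame
    return orfs


if __name__ == "__main__":
    raw = "ccATGAAACCCGGGTAGtt"  # hypothetical soft-masked sequence
    masked = hard_mask(raw)
    print(masked)
    print(find_orfs(masked))
```

Real gene predictors also model splice sites, codon usage, and both strands, which is why the workflow goes on to check predictions against biological evidence before gene models are built.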
Brent Buckner, Ph.D.
Truman State University
“I have found that students are overwhelmed by their first
introduction to genome sequences viewed on a genome browser.
Students who used DNA Subway needed little or no guidance
when they moved on to use MaizeGDB and had an easier time
transitioning to genomes depicted in different genome
browsers.”
DNA Subway Case Study
Brent Buckner, Ph.D., Truman State University
• Sophomore genetics class, spring 2010 and 2011
– 70 students used Red Line to annotate 3.7 Mbp of maize genome
– 12 hours effort, each student annotated 100 kb
– Follow-up research projects by 7 undergraduates:
• Compared syntenic regions of maize Chr. 6 and sorghum
• 65 hours effort, each student annotated 1 million bp
• MaizeGDB, MaizeSequence.org, InterProScan, CoGE, PlexDB, Circos
• Sophomore genetics class, spring 2012
– 19 students used Red Line to visualize next-gen RNA-Seq data to
investigate presence/absence variation (PAV) in maize
– 12 hours effort, each student group annotated 100 kb and then
imported next-gen RNA-Seq data from 5 different tissues in 30 maize
inbred lines for a gene that they had previously shown exhibits PAV
Judy Brusslan, Ph.D.
CSU, Long Beach
“When I used the Red Line exercise in six lab sections of my
General Genetics class this fall, it went smoothly and, best
of all, there was a mass ‘Ah-ha’ moment when the results of
the gene prediction programs were displayed on the Genome
Browser. The use of BLASTX and BLASTN within the Red Line
allowed the students to visualize the different outputs and
understand the value of sequenced cDNAs for gene
prediction.”