Overview

What is Annotation?
Annotation is the process of determining the location and function of all identifiable genes in a genome.

Annotation is an important part of bioinformatics
• Whole-genome shotgun sequencing provides the raw material
• Annotation provides an interpretation of the sequencing results
Figure 1 from Stothard & Wishart (2006) Automated bacterial genome analysis and annotation. Current Opinion in Microbiology 9: 505-510.

1. Find start and stop codons – separated by 800-900 bp?
2. Find Shine-Dalgarno sequence (RBS) – upstream of start codon?
3. Find core promoter – consensus sequences for -10 & -35?
4. Find rho-independent terminator
5. Predict whether the gene could be organized into an operon – compare chromosomal neighborhood

1. Verify predicted function based on amino acid sequence homology
2. Predict protein structure and localization

What will we be doing?
Verifying ORF calls
Verifying function based on sequence conservation
Insert Figure 8-40 from Microbiology – An Evolving Science © 2009 W.W. Norton & Company, Inc.
Verifying function based on localization data
Verifying function based on structural conservation (insert image of E. coli lac permease)

Why manually annotate?
• Automated annotations tend to over-predict and produce many false positives
• Automated annotations also miss things
• Accuracy of any annotation is only as good as the quality of annotated genes in reference databases
• High sequencing error rates
A curated, finished genome has gene calls verified & proteins organized into pathways

Possible solutions?
Reference paper: Genome re-annotation: a wiki solution? by Steven Salzberg, Genome Biology (2007), 8:102
Undergraduates provide “human expertise”
GOAL: Demonstrate that student annotations can be accurate, up-to-date, reliable, and useful to the scientific community!

What is imgACT?
http://img-act.jgi-psf.org/user/login
- Web portal to access genome database, img/edu
- Contains wiki-based Lab Notebook & Report Page for organizing annotation data

What is img/edu?
http://imgweb.jgi-psf.org/cgi-bin/img_edu_v260/main.cgi
- Simplified database for undergraduate genome annotation
- Features and functions similar to those found in IMG
- Directly linked to imgACT
Click! IMG companion system

What is IMG?
http://img.jgi.doe.gov/cgi-bin/pub/main.cgi
INTEGRATED MICROBIAL GENOMES (IMG)
- Database managed by the U.S. Department of Energy (DOE) Joint Genome Institute (JGI)
- JGI is currently producing ~22% of the reported number of bacterial genome projects worldwide
- Key mission of IMG is to provide a data management platform that supports comprehensive analysis and annotation of all publicly available genomes in a comparative genomics context

What are we annotating?
(insert information about organism including location/map of collection site, image and description of organism, etc.)

Why annotate a GEBA organism?
Phylogenetic tree of Bacteria showing established & candidate phyla
Note that genome sequences from members of those phyla in yellow and orange are under-represented relative to those in red
The goal of GEBA (Genomic Encyclopedia of Bacteria and Archaea) is to sequence genomes from under-represented phyla
Insert Figure 1 from Handelsman (2004) Microbiol. Mol. Biol. Rev. 68: 669-685.

What is our goal?
Annotate genes in pathways & complexes
Insert Figure 2 from Scott KM et al. (2006) The Genome of Deep-Sea Vent Chemolithoautotroph Thiomicrospira crunogena XCL-2.
PLoS Biology, 4: 2196

Student Goals: Conceptual
• Apply basic concepts in biochemistry, microbial physiology & ecology, and evolutionary biology
• Question basic assumptions about biochemistry, physiology and evolution
• Understand the power and limitations of bioinformatics

Student Goals: Technical
• Proficiently use multiple database analysis software packages
• Strengthen web-based library search skills (PubMed)
• Develop skills creating hypotheses and designing experiments to test them
• Sharpen skills in analysis, synthesis and presentation of results and data interpretation
• Experience the collaborative nature of science

Annotation Project
• Each team will annotate genes encoding enzymes in a metabolic pathway or components of a cellular complex in [insert organism name]
• Your T.A. or instructor will tell you specific assignments
• Consult the KEGG map and use orthologous genes in other related organisms to query the genome of [insert organism name] in the IMG/EDU database
• For the best “hit”, complete the corresponding modules of the imgACT lab notebook and lab report for that gene
• Complete the module(s) presented each week. The imgACT online notebook & report for Modules #1 – 8 must be finished for all genes assigned (3 per student).

Annotation Project Assignments
• Online notebook checks end of weeks:
• Final Report due dates:

How do we get started?
http://img-act.jgi-psf.org/user/login
Click “Create an account”
Register for an img-act account
Email address
First Name
Last Name
xxxxxxxxxxxxxxxxxxxxxxxxxx
No abbreviations or nicknames
Pick something you can remember
Specific for our class
Click “Register” once information is entered
Once registration is complete, log in to imgACT

What you should see. . .
Winter 2010
If you can’t get this far, tell your instructor immediately!
Next, take the pre-annotation survey
Cookies must be enabled for the survey to work properly.

What next? Practice!
Explore the imgACT web portal
• All students will be assigned at least one gene, which should be used to navigate through the imgACT online lab notebook (Modules #1 – 8) and the lab report
• Note that students are not responsible for annotating this gene. It may be used to help students get used to navigating the web portal.
“Practice gene”
click
imgACT Lab Notebook
click
The first time you log in to Lab Notebook, you will also need to log in to the wiki. Use the same username & password as created for imgACT account.

imgACT Lab Notebook
Only responsible for Modules #1 – 8 in this class

imgACT Lab Report
To be completed at end of the quarter
Correspond to modules in Lab Notebook
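
Optional illustration: what an automated ORF call looks like
The first two gene-finding steps in the Overview (find start and stop codons, then look for a Shine-Dalgarno sequence upstream of the start codon) can be sketched in a few lines of Python. This is a minimal sketch, not the method used by IMG or any other annotation pipeline. The names find_orfs, has_sd_upstream and Orf are hypothetical, as are the 300 bp minimum length, the 20 bp upstream window, and the rule that any 4-base piece of AGGAGG counts as a ribosome-binding site. Only the forward strand is scanned.

```python
from typing import List, NamedTuple

START_CODONS = {"ATG", "GTG", "TTG"}  # common bacterial start codons
STOP_CODONS = {"TAA", "TAG", "TGA"}
SD_CORE = "AGGAGG"  # Shine-Dalgarno consensus; matching rule below is an assumption


class Orf(NamedTuple):
    start: int     # 0-based index of the first base of the start codon
    end: int       # index one past the last base of the stop codon
    has_rbs: bool  # True if part of the SD motif lies just upstream


def has_sd_upstream(seq: str, start: int, window: int = 20, min_match: int = 4) -> bool:
    """Return True if any min_match-long piece of AGGAGG occurs in the
    `window` bases immediately upstream of `start`."""
    upstream = seq[max(0, start - window):start]
    pieces = {SD_CORE[i:i + min_match] for i in range(len(SD_CORE) - min_match + 1)}
    return any(piece in upstream for piece in pieces)


def find_orfs(seq: str, min_len: int = 300) -> List[Orf]:
    """Scan the three forward reading frames for start...stop stretches
    of at least min_len bases (stop codon included)."""
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None  # position of the first start codon since the last stop
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon in START_CODONS:
                start = i
            elif start is not None and codon in STOP_CODONS:
                if i + 3 - start >= min_len:
                    orfs.append(Orf(start, i + 3, has_sd_upstream(seq, start)))
                start = None
    return orfs


if __name__ == "__main__":
    # Toy sequence: SD motif, short spacer, then a 36 bp ORF.
    toy = "CC" + "AGGAGG" + "TTTTT" + "ATG" + "AAA" * 10 + "TAA" + "CC"
    for orf in find_orfs(toy, min_len=30):
        print(orf)
```

A scan this simple over-predicts on a real genome, because any long stop-free stretch is reported as a gene. That is one reason each ORF call is checked by hand in the modules.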