Sequence Databases
What are they and why do we need them?

What is sequence data?
DNA, RNA and Protein (Amino Acids)

Why do I need it?
• Evolution
• Mutation
• Natural Selection
• Intra- and inter-species relationships
• Niche exploitation
• Ecosystems

REALLY? YES!
Evolution, Mutation, Natural Selection, Phenotypes, Intra- and inter-species relationships, Niche exploitation, Ecosystems
• Phenotypes come from the proteins.
• Proteins come from the DNA via RNA.
• Changes in DNA cause changes in proteins.
• Changes in proteins cause changes in phenotypes.

How do we find those changes?
Sequencing

Is the Sequence everything?
The sequence itself is not informative; it must be analyzed by comparative methods against existing databases to develop hypotheses concerning relatives and function.

What do Databases let you do?
• Explore and investigate sequence data
• Classify organisms
• Assign a possible function to a gene
• Verify a sequence's identity
• Annotate a genome
• Design primers for PCR and probe experiments

What is a Database?
[Images not available]
Databases allow us to more easily find what we need.

What Databases are there?
Ten Important Bioinformatics Databases

Name                Address                Description
GenBank/DDBJ/EMBL   www.ncbi.nlm.nih.gov   Nucleotide sequences
Ensembl             www.ensembl.org        Human/Mouse genome
PubMed              www.ncbi.nlm.nih.gov   Literature references
NR                  www.ncbi.nlm.nih.gov   Protein sequences
SWISS-PROT          www.expasy.ch          Protein sequences
InterPro            www.ebi.ac.uk          Protein domains
OMIM                www.ncbi.nlm.nih.gov   Genetic diseases
Enzymes             www.chem.qmul.ac.uk    Enzymes
PDB                 www.rcsb.org/pdb/      Protein structures
KEGG                www.genome.ad.jp       Metabolic pathways

Many other specialized Databases are available.
Source: Bioinformatics for Dummies, 2003

What Database should I use?
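One rough way to answer this from the table above is to match the kind of data you need against each database's description. The Python sketch below does that with the ten rows copied from the table. The function name `suggest_databases` and the keyword-matching approach are illustrative assumptions, not part of the original slides; the addresses are reproduced exactly as listed there.

```python
# Rows copied from the "Ten Important Bioinformatics Databases" table:
# (name, address, description).
DATABASES = [
    ("GenBank/DDBJ/EMBL", "www.ncbi.nlm.nih.gov", "Nucleotide sequences"),
    ("Ensembl", "www.ensembl.org", "Human/Mouse genome"),
    ("PubMed", "www.ncbi.nlm.nih.gov", "Literature references"),
    ("NR", "www.ncbi.nlm.nih.gov", "Protein sequences"),
    ("SWISS-PROT", "www.expasy.ch", "Protein sequences"),
    ("InterPro", "www.ebi.ac.uk", "Protein domains"),
    ("OMIM", "www.ncbi.nlm.nih.gov", "Genetic diseases"),
    ("Enzymes", "www.chem.qmul.ac.uk", "Enzymes"),
    ("PDB", "www.rcsb.org/pdb/", "Protein structures"),
    ("KEGG", "www.genome.ad.jp", "Metabolic pathways"),
]


def suggest_databases(keyword):
    """Return (name, address) pairs whose description mentions the keyword.

    Hypothetical helper: a case-insensitive substring match, nothing more.
    """
    keyword = keyword.lower()
    return [
        (name, address)
        for name, address, description in DATABASES
        if keyword in description.lower()
    ]


if __name__ == "__main__":
    # Example: which listed databases hold protein-related data?
    for name, address in suggest_databases("protein"):
        print(name, address)
```

A plain substring match is enough for ten rows; a larger catalogue would probably need curated categories rather than free-text descriptions.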
[Images not available]
A.K.A. GenBank

How big is GenBank?
[Figure not available. Labeled milestones:]
• 1977 DNA Sequencing
• 1985 PCR
• 1987 Automated Sequencing
• 1997 Capillary Sequencing

Who can put data into GenBank?
Sequence data are submitted to GenBank by scientists from around the world.
Warning: GenBank does not check the validity or accuracy of submitted sequences. Verification is left to the scientific community, as with all published scientific data.

How do I use GenBank?
www.ncbi.nlm.nih.gov
Problem 1. You are constructing a phylogeny of Euglenoids, and you have determined from the literature that the Beta-tubulin gene is a good gene for this purpose. How do I start?
[Video not available]

How do I use GenBank?
www.ncbi.nlm.nih.gov
Euglenozoa AND tubulin NOT kinetoplastida
AF182759

How do I use GenBank?
Problem 2. You are studying the domestication of Sorghum vulgare. From reading about sorghum, you find out that it is closely related to Zea mays. You also find out that maize has a wild relative, teosinte, that forms multiple stalks, whereas domesticated maize forms a single stalk. Likewise, domesticated sorghum has a single stalk, while wild sorghum (Johnsongrass) has multiple stalks.
• Broomcorn (Sorghum), domesticated: Sorghum vulgare [image not available]
• Johnsongrass, wild: Sorghum halepense [image not available]

How do I use GenBank?
Problem 2, continued. Moreover, the paper states that this trait is controlled by a single gene, teosinte branched 1 (tb1). You wonder, “Does sorghum have this gene?”
The paper does provide a set of PCR primers (forward and reverse) that were used to isolate and sequence the tb1 gene. Will they work for Sorghum?

Sequencing Sorghum
[Images not available]

Does sorghum have the tb1 gene?
>Sorghum_vulgare_sequence
ATGGACTTACCGCTTTACCAACAACTGCAGCTCAGCCCGCCTTCCCCAAAGCCGGACCAATCAAGCAGCT
TCTACTGCTGCTACCCATGCTCCCCTCCCTTCGCCGCCGCCGCCGCCGACGCCAGCTTTCACCTGAGCTA
CCAGATCGGTAGTGCCGCCGCCGCCATCCCTCCACAAGCCGTGATCAACTCGCCGGAGGACCTGCCGGTG
CAGCCGCTGATGGAGCAGGCGCCGGCGCCGCCTACAGAGCTTGTCGCCTGCGCCAGTGGTGGTGCACAAG
GCGCCGGCGTCAGCGTCAGCCTGGACAGGGCGGCGGCCGCGGCCGCCGCGAGGAAAGACCGGCACAGCAA
GATATGCACCGCCGGCGGGATGAGGGACCGCCGGATGCGGCTGTCCCTTGACGTCGCCCGCAAGTTCTTC
GCGCTCCAGGACATGCTTGGCTTCGACAAGGCCAGCAAGACGGTACAATGGCTCCTCAACACGTCCAAGG
CCGCCATCCAGGAGATCATGGCCGACGACGTCGACGCGTCGTCGGAGTGCGTGGAGGATGGCTCCAGCAG
CCTCTCCGTCGACGGCAAGCACAACCCGGCGGAGCAGCTGGGAGATCAGAAGCCCAAGGGTAATGGCCGC
AGCGAGGGGAAGAAGCCGGCCAAGTCAAGGAAGGCGGCGACCACCCCAAAGCCGCCAAGAAAATCGGGGA
ATAATGCGCACCCGGTCCCCGACAAGGAGACGAGGGCGAAGGCGAGGGAGAGGGCGAGGGAGCGAACCAA
GGAGAAGCACCGGATGCGTTGGGTAAAGCTTGCATCAGCAATTGACGTGGAGGCGGCGGCTGCCTCGGTG
GCTAGCGACAGGCCGAGCTCGAACCATTTGAACCACCACCACCACTCATCGTCGTCCATGAACATGCCGC
GTGCTGCGGAGGCTGAATTGGAGGAGAGGGAGAGGTGCTCATCAACTCTCAACAATAGAGGAAGGATGCA
AGAAATCACAGGGGCGAGCGAGGTGGTCCTAGGCTTTGGCAACGGAGGAGGATACGGCGGCGGCAACTAC
TACTGCCAAGAACAATGGGAACTCGGTGGAGTCGTCTTTCAGCAGAACTCACGCTTCTACTGA

www.ncbi.nlm.nih.gov/BLAST/

Resources at NCBI
• GenBank – Molecular Databases: Nucleotides, Proteins, Structures, Expression (ESTs) and Taxonomy.
• Literature Databases: PubMed, Journals, OMIM, Books, and Citation Matcher.
• Genomes and Maps – Entrez: Map Viewer, UniGene, COGs, Organism-specific, Organelle, Virus, and Plasmids.
• Tools – Software Engineering: BLAST, Sequence Analysis, 3-D Structures, Gene Expression, Literature and Genome Analysis.
• Education: Books, Courses, Public Information.
• Research: Biology, Computers.

Objectives
1. Explain what you can do with sequence data.
2. Explain what a database is.
3. Describe what kinds of data and resources are available.
4. Describe some of the uses of databases.

Other Specialty Databases
[Images not available]
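As a small illustration of the first objective (what you can do with sequence data), the Python sketch below reads a FASTA record, reports its length and GC content, and translates it in the first reading frame using the standard genetic code. It is a minimal sketch, not a replacement for BLAST or any NCBI tool. The helper names (`parse_fasta`, `gc_content`, `translate`) are hypothetical, and the example record is only the first 30 nucleotides of the Sorghum sequence shown earlier, under a made-up header.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def parse_fasta(text):
    """Return {header: sequence} for FASTA-formatted text."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            records[header] = ""
        elif header is not None:
            records[header] += line.upper()
    return records


def gc_content(seq):
    """Fraction of G and C bases in the sequence (0.0 for an empty one)."""
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


def translate(seq):
    """Translate complete codons in frame 1; unknown codons become 'X'."""
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    codons = (seq[i:i + 3] for i in range(0, usable, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


if __name__ == "__main__":
    # Assumption: an excerpt (first 30 nt) of the deck's Sorghum sequence,
    # with a hypothetical header.
    fasta = """>Sorghum_vulgare_sequence_excerpt
ATGGACTTACCGCTTTACCAACAACTGCAG
"""
    for name, seq in parse_fasta(fasta).items():
        print(name, len(seq), round(gc_content(seq), 2), translate(seq))
```

Summaries like these describe a sequence but do not identify it; as the slides note, that still requires comparison against existing databases.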