Bioinformatics
BIO520/INF520
Jim Lund
Assigned reading: Ch1 & 2

Bioinformatics
Bioinformatics applies principles of information science (derived from applied math, computer science, and statistics) to make the vast, diverse, and complex life sciences data more understandable and useful. It automates simple but repetitive types of analysis.
Computational biology uses mathematical and computational approaches to address theoretical and experimental questions in biology.

BIO520 Topics
• Navigating biological databases.
• Sequence alignment.
• Proteins - 3D structure visualization, prediction, motif analysis.
• DNA sequence annotation.
  – Gene finding in prokaryotes and eukaryotes.
• RNA structure.
• Phylogenetic inference.
• Genome/transcriptome/proteome
  – Function & Analyses.

Molecular information: DNA
• Raw bacterial DNA sequence
  – Coding or not?
  – Parse into genes?
  – Find regulatory sequences?
  – PCR primers, vector engineering?
  – 4 bases: ACGT
• 1 kb for a gene
• Mb for a genome

[Figure: Growth of GenBank (1982-2009). Axes: sequences (millions) and base pairs (billions). Source: http://www.ncbi.nlm.nih.gov/Genbank/genbankstats.html]

Protein Structure Prediction

Proteomics
1978-1998
MALDI-TOF? ESI-MS?

Metabolic Networks
KEGG, 1998

Regulatory Networks
KEGG

Bioinformatics: what is it?
Acquisition, curation, and analysis of biological data
Hypothesis

Bioinformatic Data, 1978 to 2008
• DNA sequence
• Gene expression
• Protein expression
• Protein structure
• Genome mapping
• Metabolic networks
• Regulatory networks
• Trait mapping
• Gene function analysis
• Scientific literature

Goals of the HGP, 1998-2003
• Reference Human Genome Sequence
  – Draft 2001, finished in 2003
• Improved Sequence Technology
  – $0.25 per finished base
• Human Genome Sequence Variation
• Technology for Functional Genomics
• Comparative Genomics
  – Finish mouse by 2005 (well ahead here)
• ELSI (ethical, legal, and social implications)

Genome sequences highlight the finiteness of the set of sequences!
• What remains to be done?

Comparative Genomics
• Description of mRNAs, proteins (identity and structure)
• Functional analysis
• Detailed understanding of development, regulation, variation

The Gene for…

Other Reasons to Care
Affymetrix
Genentech

Biologist User Training
• Internet sites
  – Range from high quality to unreliable.
• Unread documentation
• Popular program sites with NO documentation
  – “Perhaps one day I will get around to writing some documentation” (help text from a WWW service, hit several hundred times per day!)

Dramatic Changes in Information Science
• Information Storage
  – Digital: text, numbers, images
• Computerized Data Analysis
• Automated Data Analysis
• Information Distribution
  – Internet, cloud, etc.

Moore’s Law
[Figure credit: Intel Corporation]

Computer Science and bioinformatics
• Operating Systems
• Programming
• Algorithms
  – New problems keep turning up!
• Data structures/databases
• Interfaces
• Search and visualization

BIO520 Nuts and Bolts
• Syllabus & Schedule
• Labs on Fridays in Young B-35
• Textbook
  – Internet
  – Program documentation
• Exams (2 + final)
• Grading:
  – 12 labs: 10 pts
  – Exams: 50 pts
  – Final: 50 pts
http://elegans.uky.edu/520

Textbooks
Required textbook:
• Understanding Bioinformatics by Marketa Zvelebil and Jeremy Baum
Supplemental reading (don’t buy):
• Bioinformatics: A Practical Guide to the Analysis of Genes and Proteins, 3rd Ed.
  – Baxevanis and Ouellette
Biology background material:
  – Genes IX (Lewin)
  – Cell Biology (Watson et al, Darnell et al)
  – NCBI Bookshelf (http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=Books&itool=toolbar)

Computer Resources
• http://elegans.uky.edu/520
• Locally installed programs:
  – Cn3D, Clustal, TreeView, Chime
• Web-based tools:
  – Databases
  – Software programs

Biological Principles
Evolution by natural selection
DNA -> RNA -> Protein
Structure -> Function
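
Example: DNA -> RNA -> Protein in code
A minimal Python sketch of the DNA -> RNA -> Protein principle above. The function names (transcribe, translate) and the example sequence are illustrative assumptions, not course material. The codon table is the standard genetic code (NCBI translation table 1), and the input is assumed to be a coding-strand sequence read in frame from its first base.

```python
from itertools import product

# Standard genetic code, listed in the conventional TCAG order
# (first base varies slowest, third base fastest). "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(dna):
    """Coding-strand DNA -> mRNA: same sequence with T replaced by U."""
    return dna.upper().replace("T", "U")


def translate(mrna):
    """mRNA -> protein (one-letter codes), reading from the first base.

    Stops at the first stop codon; a trailing partial codon is ignored.
    Codons with unexpected letters (e.g. N) are reported as "X".
    """
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3].replace("U", "T")  # table is keyed on DNA letters
        aa = CODON_TABLE.get(codon, "X")
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


if __name__ == "__main__":
    dna = "ATGGCCATTGTAATGGGCCGCTGA"  # hypothetical example sequence
    mrna = transcribe(dna)
    print(mrna)
    print(translate(mrna))
```

This sketch reads a single frame only. Deciding "coding or not?" for raw genomic sequence would also require checking the other reading frames and the reverse strand.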