Welcome to the Bioinformatics Workshop
July 18, 2002

Introduction
  Workshop objectives
    Module 1: Retrieval of literature dealing with molecular life sciences.
    Module 2: Sequence databases and similarity searches.
    Module 3: Protein structure analysis.
  Workshop logistics
    Course Website (http://www.calstatela.edu/faculty/jmomand/Bioinformaticscourse.html)
    Power point presentation
    In-class workshop

Definition of Bioinformatics
  Many definitions at the moment:
    Use of computers to catalog and organize molecular life science information into meaningful entities.
    Subset of Computational Biology

How can Bioinformatics help make scientific discoveries?
  Bioinformatics is not just the storage of data in a computer.
  Bioinformatics is the use of computers to test a biological hypothesis prior to performing the experiment in the laboratory.
  Bioinformatics is the design of software programs that analyze data.

Basis of molecular biology
  Hierarchy of relationships (not exactly true):
    Genome
    Gene 1      Gene 2      Gene 3      Gene X
    Protein 1   Protein 2   Protein 3   Protein X
    Function 1  Function 2  Function 3  Function X

Genome Sizes
  Organism          Genome size (bp)    Genes
  FERN              160,000,000,000
  LUNGFISH          139,000,000,000
  SALAMANDER        81,300,000,000
  NEWT              20,600,000,000
  ONION             18,000,000,000
  GORILLA           3,523,200,000
  MOUSE             3,454,200,000
  HUMAN             3,400,000,000       31,000
  Drosophila        137,000,000         13,500
  C. elegans        96,000,000          19,000
  Yeast             12,000,000          6,315
  E. coli           5,000,000           5,361
  Smallest Genome   ??????

What is the approach used to sequence genomes?
  Divide and conquer
  Split the genome into fragments.
  Clone into vectors that can accept large fragments: yeast artificial chromosomes (YAC Library).
  Landmarks within the genome can be obtained using a Sequence Tagged Site (STS).
  Sequences of YAC clones are matched with each other.
  Sequences that overlap form contigs.
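The "overlapping sequences form contigs" step can be illustrated with a small sketch. This is not the procedure used by any genome project; it is a toy greedy merge, assuming error-free fragments that all read in the same direction. The function names, the min_overlap parameter, and the example fragments are made up for illustration.

```python
def overlap_length(left, right, min_overlap=3):
    """Length of the longest suffix of `left` that is also a prefix of `right`."""
    max_len = min(len(left), len(right))
    for size in range(max_len, min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def assemble_contigs(fragments, min_overlap=3):
    """Greedily merge the best-overlapping pair until no pair overlaps.

    Toy assumption: fragments are error-free and share one orientation.
    """
    contigs = list(dict.fromkeys(fragments))  # drop exact duplicates, keep order
    while True:
        best_size, best_i, best_j = 0, None, None
        for i, left in enumerate(contigs):
            for j, right in enumerate(contigs):
                if i == j:
                    continue
                size = overlap_length(left, right, min_overlap)
                if size > best_size:
                    best_size, best_i, best_j = size, i, j
        if best_size == 0:
            return contigs  # nothing left to merge
        # Join the pair, writing the shared bases only once.
        merged = contigs[best_i] + contigs[best_j][best_size:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)


if __name__ == "__main__":
    # Hypothetical fragments: three overlap end to end, one stands alone.
    reads = ["ATGGCGT", "GCGTACC", "TACCTGA", "GGGAAAC"]
    for contig in assemble_contigs(reads):
        print(contig)
```

The three overlapping fragments collapse into one contig, while the fragment with no overlap stays separate, which is why landmarks such as STSs are still needed to order contigs along a chromosome.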
History of the Human Genome Project
  1953  Watson, Crick: DNA structure
  1972  Berg: 1st recombinant DNA
  1977  Maxam, Gilbert, Sanger sequence DNA
  1980  Botstein, Davis, Skolnick, White propose to map human genome with RFLPs
  1982  Wada proposes to build automated sequencing robots
  1984  MRC publishes first large genome: Epstein-Barr virus (170 kb)
  1985  Sinsheimer hosts meeting to discuss HGP at UC Santa Cruz; Kary Mullis develops PCR
  1986  DOE begins genome studies with $5.3 million
  1987  Gilbert announces plans to start company to sequence and copyright DNA; Burke, Olson, Carle develop YACs; Donis-Keller publishes first map (403 markers)

History of the Human Genome Project (continued)
  1987 (cont)  Hood produces first automated sequencer; Dupont develops fluorescent dideoxynucleotides
  1988  NIH supports the HGP; Watson heads the project and allocates part of the budget to study social and ethical issues
  1989  Hood, Olson, Botstein, Cantor propose using STSs to map the human genome
  1990  Proposal to sequence 20 Mb in model organism by 2005; Lipman, Myers publish the BLAST algorithm
  1991  Venter announces strategy to sequence ESTs. He plans to patent partial cDNAs; Uberbacher develops GRAIL, a gene finding program
  1992  Simon develops BACs; US and French teams publish first physical maps of chromosomes; first genetic maps of mouse and human genome published
  1993  Collins is named director of NCHGR; plan is revised to complete sequencing of the human genome by 2005
  1995  Venter publishes first sequence of free-living organism: H. influenzae (1.8 Mb); Brown publishes on DNA arrays
  1996  Yeast genome is sequenced (S. cerevisiae)

History of the Human Genome Project (continued)
  1997  Blattner, Plunkett complete E. coli sequence; a capillary sequencing machine is introduced
  1998  SNP project is initiated; rice genome project is started; Venter creates new company called Celera and proposes to sequence the human genome within 3 years; C. elegans genome completed
  1999  NIH proposes to sequence mouse genome in 3 years; first sequence of chromosome 22 is announced
  2000  Celera and others publish Drosophila sequence (180 Mb); human chromosome 21 is completely sequenced; proposal to sequence puffer fish; Arabidopsis sequence is completed
  2001  Celera publishes human sequence in Science; the HGP consortium publishes the human sequence in Nature

Public funding vs. Private funding
  Public: taxpayers' money, international effort.
  Private: companies that invest money hope to provide access to their information on a fee basis. Celera also allows some free information to small research groups.
  Both groups published the sequence of the human genome in 2001.

Bioinformatics is Multidisciplinary
  Genomics
  Drug Design
  Computer Science
  Molecular Biology
  Phylogenetics
  Structural Biology
  Math
  Statistics

Bioinformatics at CSULA
  www.calstatela.edu/faculty/jmomand/Bioinformaticscourse.html
  Upper Div. Standing in Biology or Biochem
    One course in C/C++ programming (CIS 283)
  Upper Div. Standing in CS, IS, CE
    One course in Molec. Biology/Biochem or Chem/Biol 154L (W'03)
  Introduction to Bioinformatics (Chem/Biol 454L) (offered in Spring '03)

How is Bioinformatics Used?
  Bioinformatics isn't going to replace lab work anytime soon.
  Experimental proof is still the "Gold Standard".
  Bioinformatics is used to help "focus" the experiments of the benchtop scientist.

What's Left To Do?
  Find out what the rest of the genome does.
  Unknown Function

What is left to do?
  Sequence genomes of other organisms
  Analyze genes to predict function
  Analyze interactions of gene products
  Create genetic networks

Once this is finished, then what?
  Start making changes
  Modify gene expression patterns to make better crops or better medicines

Increasing levels of complexity
  Metabolome (metabolic pathways)
  Proteome (proteins)
  Transcriptome (RNA)
  Genome (DNA)

Primary public domain bioinformatics servers
  Public Domain Bioinformatics Facilities
    National Center For Biotechnology Information (NCBI), United States: Databases, Analysis Tools
    European Bioinformatics Institute (EBI), United Kingdom: Databases, Analysis Tools
    Genome Net (KEGG & DDBJ), Japan: Databases, Analysis Tools

Literature Databases and NCBI
  Learning objective: How does one retrieve information on a particular subject?
  National Center for Biotechnology Information (NCBI)
  Databases outside of NCBI

Retrieval of information
  Literature Databases
    Medline (PubMed)
    OMIM
    CSULA Library
  Other biological databases
    BIOSIS
    Agriculture http://www.fao.org
    Melvyl (Books at UC Libraries)

NCBI ENTREZ
  A search engine that provides access to, and links between, various databases
  ENTREZ: PubMed, GenBank, Protein, Genomes databases, PopSet, Taxonomy, OMIM

Online Mendelian Inheritance in Man (OMIM)
  A catalog of human genes linked to diseases
  Begun by Victor A. McKusick at Johns Hopkins University
  A good place to start when you want to know about a certain disease.
  This database is linked to PubMed, the OMIM Morbid Map, and the OMIM Gene Map.

CSULA and other resources
  The best way to access articles at Cal State LA is to obtain the exact reference from PubMed. Then search the CSULA library database for the article: http://www.calstatela.edu/library/mudir1.htm
  Publishers to search through at the CSULA Library Site:
    ACS
    Wiley InterScience
    IDEAL
  There is one Website that also offers free access to journals:
    PubMedCentral: http://www.pubmedcentral.nih.gov/

How to keep up to date on your favorite subject?
  Set up Cubby, an automatic retrieval system that searches PubMed and deposits the literature citations in your own account (there is no charge).
  Demonstration of how Cubby works. Requires a login.

Workshop Exercise 1
  Retrieve information on a topic from literature databases.
  Set up a Cubby account for yourself.
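
The workshop performs literature retrieval through the Entrez web pages. As an optional aside, the same kind of PubMed search can be scripted. The sketch below assumes NCBI's E-utilities ESearch endpoint, which the slides do not mention. The search term, the helper names, and the sample response (including its PubMed IDs) are made up for illustration, and the network call is defined but left for the reader to run.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

# Assumption: NCBI E-utilities ESearch endpoint (not part of the workshop slides).
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

# Made-up response in the ESearch XML layout; the IDs are placeholders.
SAMPLE_RESPONSE = """<eSearchResult>
  <Count>2</Count>
  <IdList>
    <Id>11111111</Id>
    <Id>22222222</Id>
  </IdList>
</eSearchResult>"""


def build_esearch_url(term, retmax=5):
    """Build a PubMed ESearch URL for a query term."""
    query = urllib.parse.urlencode({"db": "pubmed", "term": term, "retmax": retmax})
    return f"{ESEARCH_URL}?{query}"


def parse_pmids(xml_text):
    """Extract the PubMed IDs from an ESearch XML response."""
    root = ET.fromstring(xml_text)
    return [node.text for node in root.findall("./IdList/Id")]


def search_pubmed(term, retmax=5):
    """Run the search over the network (not called in this sketch)."""
    with urllib.request.urlopen(build_esearch_url(term, retmax), timeout=30) as reply:
        return parse_pmids(reply.read())


if __name__ == "__main__":
    print(build_esearch_url("p53 AND mdm2"))
    print(parse_pmids(SAMPLE_RESPONSE))
```

Each returned ID identifies one citation, which can then be looked up in PubMed to obtain the exact reference before searching the library database.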