Introduction to bioinformatics
Biological databases
• International genome sequencing and protein structure determination
• Protein Data Bank (PDB)

Sequence data = strings of letters
• Nucleotides (bases): Adenine (A), Cytosine (C), Guanine (G), Thymine (T)
• triplet codons, genetic code
• 20 amino acids (A, L, V, S etc.)

Three-dimensional protein structure = atomic coordinates in 3D space
• Conversion into metric
• Protein folding

Data types
• primary data (primary database): sequence, DNA and amino acid, e.g. DMPVERILEALAVE…
• secondary data (secondary db): secondary protein structure "motifs": regular expressions, blocks, profiles, fingerprints
• tertiary data (tertiary db): tertiary protein structure, atomic co-ordinates
• interaction data (interaction db): binary protein-protein interactions, functional networks and pathways
• structural units: e.g., alpha-helices, beta-strands; domains, folding units

Primary biological databases
• Nucleic acid: EMBL, GenBank, DDBJ (DNA Data Bank of Japan)
• Protein: PIR, MIPS, SWISS-PROT, TrEMBL, NRL-3D

International nucleotide data banks
• EMBL (Europe): EMBL, EBI
• GenBank (USA): NLM, NCBI
• DDBJ (Japan): NIG, CIB
• International Advisory Meeting, Collaborative Meeting
• TrEMBL, NRDB

GenBank file format

Swiss-Prot

SWISS-PROT file format

Other primary protein databases
• TrEMBL (translated EMBL)
  - in SWISS-PROT format
  - rapid access to sequence data from genome projects
  - computer-annotated supplement to SWISS-PROT
  - translations of all coding sequences (CDS) in EMBL
• SP-TrEMBL

Other primary protein databases: The Protein Information Resource (PIR)
• integrated system of protein sequence databases and derived related databases, e.g., alignment databases
• rapid searching, comparison, and pattern matching of protein sequences
• retrieval of descriptive, bibliographic, feature, and concurrent cross-reference information
• aims to be comprehensive and consistently annotated

PIR: related databases
NRL-3D Sequence-Structure Database
• produced by PIR from sequence and annotation information extracted from three-dimensional structures in the Protein Data Bank (PDB)
• allows keyword and similarity searches

Two other useful sites
• INFOBIOGEN: The Public Catalog of Databases
  http://www.infobiogen.fr/services/dbcat/
• KEGG: Kyoto Encyclopedia of Genes and Genomes
  http://www.genome.ad.jp/kegg/

Kyoto Encyclopedia of Genes and Genomes (KEGG) is an effort to computerize current knowledge of molecular and cellular biology in terms of the information pathways that consist of interacting molecules or genes, and to provide links from the gene catalogs produced by genome sequencing projects.

Sequence Retrieval System (SRS)
Database browser that allows users to
• retrieve
• link
• access
entries from all interconnected resources. Users can formulate queries across a range of different database types.

Guide to Protein Databases:
http://www.biochem.ucl.ac.uk/~robert/bioinf/lecture1/index.html
http://www.biochem.ucl.ac.uk/~robert/bioinf/lecture2/index.html
With thanks to Dr Roman Laskowski.
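
The slides on sequence data describe DNA as a string of letters read in triplet codons, and secondary "motifs" as regular expressions over amino acid letters. The sketch below illustrates both ideas in Python. It is not taken from the slides: the names `CODON_TABLE`, `translate` and `find_motif` are hypothetical, the table is the standard genetic code, and the motif used in the example is the PROSITE-style N-glycosylation pattern N-{P}-[ST]-{P}, chosen only as a familiar illustration.

```python
import re
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G order.
# '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA string codon by codon, starting at position 0.

    Trailing bases that do not fill a whole codon are ignored.
    Letters other than A, C, G, T raise a KeyError.
    """
    dna = dna.upper()
    return "".join(
        CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3)
    )


def find_motif(protein, pattern):
    """Return (start, matched_text) for every occurrence of a regex motif.

    The lookahead wrapper lets overlapping occurrences be reported too.
    """
    return [
        (m.start(), m.group(1))
        for m in re.finditer(f"(?=({pattern}))", protein)
    ]


if __name__ == "__main__":
    # Codons chosen to spell the first five residues of the slide's
    # example sequence DMPVERILEALAVE...
    print(translate("GATATGCCGGTTGAA"))              # DMPVE
    # Hypothetical peptide; N-{P}-[ST]-{P} written as a regular expression.
    print(find_motif("MKNASLNPTQ", "N[^P][ST][^P]"))  # [(2, 'NASL')]
```

Blocks, profiles and fingerprints capture more information than a single regular expression can, so this covers only the simplest of the motif types listed above.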
Interaction databases
Biomolecule-ligand interactions
• SRS: Enzymes, reactions and metabolic pathway databases
• Receptor-ligand database searches: relibase.ebi.ac.uk/

Interaction databases
Yeast model
• YPD - http://www.incyte.com/sequence/proteome
  - proteome database of model organism
  - 6142 proteins: 3430 known, 804 similarity, 1908 unknown
  - data on protein interaction maps
  - derived from literature and experiment
• Curagen - http://curatools.curagen.com
  - Yeast two-hybrid screen data
  - 957 putative interactions of 1004 yeast proteins
  - Uetz et al., 2000 - Nature 403 p623-630

Protein-Protein Interaction Databases
http://www.hgmp.mrc.ac.uk/GenomeWeb/protinteraction.html

Protein-Protein Interactions
• DIP
• Biocarta
• KEGG

KEGG
http://www.genome.ad.jp/kegg/
• Search database for metabolic and regulatory pathways
• Compute KEGG: Generate possible reaction pathways between two compounds
http://www.genome.ad.jp/
• Metabolic pathways
• Signal transduction pathways (species-specific, Homo sapiens shown)

Biocarta pathway database
http://www.biocarta.com
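
The KEGG slide mentions generating possible reaction pathways between two compounds. One simple way to picture that kind of query is a breadth-first search over a directed graph whose nodes are compounds and whose edges are reactions. The sketch below is only an illustration under that assumption: it does not use KEGG data or the KEGG interface, the tiny `REACTIONS` graph is hand-written from the first steps of glycolysis, and `shortest_pathway` is a hypothetical helper name.

```python
from collections import deque

# Toy directed reaction graph (substrate -> products), hand-written for
# illustration only; a real pathway database holds far more compounds.
REACTIONS = {
    "glucose": ["glucose-6-phosphate"],
    "glucose-6-phosphate": ["fructose-6-phosphate", "6-phosphogluconolactone"],
    "fructose-6-phosphate": ["fructose-1,6-bisphosphate"],
    "fructose-1,6-bisphosphate": [],
    "6-phosphogluconolactone": [],
}


def shortest_pathway(reactions, start, goal):
    """Breadth-first search for a shortest chain of reactions.

    Returns the list of compounds from start to goal, or None when the
    goal cannot be reached from the start compound.
    """
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for product in reactions.get(path[-1], []):
            if product not in seen:
                seen.add(product)
                queue.append(path + [product])
    return None


if __name__ == "__main__":
    print(shortest_pathway(REACTIONS, "glucose", "fructose-1,6-bisphosphate"))
```

The same adjacency-list layout would also hold binary protein-protein interactions, such as yeast two-hybrid pairs, if each interaction were stored in both directions.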