PSI Structural Genomics Knowledgebase
Helen M. Berman
Bottlenecks Workshop
April 14, 2008

Knowledgebase Vision
- The PSI Structural Genomics Knowledgebase (PSI SG KB) will turn the products of the PSI effort into major advances in knowledge that can be used to understand living systems and human disease.
- The PSI SG KB will be a key resource for the advancement of biology, biochemistry, functional genomics, pharmacology, bioinformatics, chemistry, education, and clinical medicine.

Knowledgebase Goals
- To provide a "marketplace of ideas" that
  - connects protein sequence information to 3D structures and homology models
  - enhances functional annotations
  - provides access to new experimental protocols and materials
- To kick-start and enable advancements in structural genomics
  - by communicating and providing visibility and accessibility of information and technology advances of the PSI
  - through presentation and discussion of the most provocative challenges with the general community
  - by fostering community collaborations

Scope
- To capture, make accessible, and highlight elements of the high-throughput pipelines for general use in the community
- To leverage such information through the generation of hundreds of thousands of molecular models and functional annotations
- Standard metrics will be used to measure progress.
[Diagram: the structure determination pipeline (Genomic Based Target Selection; Isolation, Expression, Purification, Crystallization; Data Collection; Structure Determination; PDB Deposition & Release) shown with the Knowledgebase elements Experimental Tracking, Materials, Models, Annotations, Publications, Technology, and Metrics]

Knowledgebase Users
- Biologists
- Biochemists
- Functional Genomists
- Pharmacologists
- Bioinformaticians
- Chemists
- Clinical Researchers and Physicians
- Teachers and Students

KB Site Features
- Search by
  - Sequence
  - Keyword
  - PDB ID
- Featured Structure
- News and Events
- Technology Feature
- Molecules of Unknown Function
- Link to Functional Sleuth Gallery
- Link to Technology Module

PSI SG KB Portal
- Collects sequences, common features, and common identifiers
- Maintains correspondences in a local database
- Delivers aggregate reports, inventories, and e-publications that contain links to PSI projects, modules, and external resources
- Delivers featured articles describing PSI news and events, featured molecules and technologies, and molecules of unknown function
- Provides collaborative environments for discussion, annotation, and target suggestions

PSI SG KB Portal Databases
[Diagram; labels: Portal Queries (Sequence, Keyword, PDB ID); Portal Resource Database; Keyword Database; PSI Modules; PSI Centers; Models Portal; PSI Info Site; TargetDB; PepcDB; PDB; TargetDB Sequences; PDB Sequences; Literature (PubMed); Related Biological Resources; Archival Sequence Databases; Domain Databases (Pfam)]

Modules
Modules derived from PSI information and external resources:
- Target Selection & Experimental Data Tracking
- Materials Repository
- Models
- Annotation
- Metrics
- Technology
- Outreach

Target Selection & Experimental Data Tracking
- Target Selection: PSI-2 BIG4
  - Family definitions and target management
- TargetDB
  - Search by sequence, Target ID, project site, status, update date, protein name, and source organism
  - Links to other sequence databases, domain databases, other structural genomics centers, and PDB
  - Download target data
  - Target statistics summary
- PepcDB
  - All the functionality of TargetDB, plus
    - Experimental protocols
    - Detailed status history of experimental trials
    - Information on failed experiments

Experimental Tracking
[Screenshots: PepcDB Search Form; Protocol Keywords Search; Experimental Tracking Module]

Materials Repository
[Screenshot: PSI Materials Repository Module]

Modeling Portal
- The current Phase 1 Model Portal contains models from 4 PSI centers and 2 public model databases (SwissModel and ModBase), integrated on a common UniProt reference system.
- The current release consists of 5.8 million comparative protein models for 1.97 million distinct UniProt entries.

Metrics Module
- Provides objective measures of the progress and output of the PSI project
- Centered around the "Goals and Milestones" document

PSI-2 Summary Statistics (updated April 1, 2008)
- I.1.A Number of novel experimental PSI-2 structures: 1031
- I.1.B Number of distinct experimental PSI-2 structures (nonredundant sequences): 1428
- I.1.D Total number of experimental PSI-2 structures: 1628
- I.1.E Number of experimentally determined distinct residues: 319977
- I.1.E Number of experimentally determined novel residues: 225518
- I.2.J Number of experimental structures of human proteins: 61
- I.2.K Number of experimental structures of eukaryotic proteins: 186
- I.2.M Number of experimental structures of membrane proteins: 1
- I.2.N Number of experimental structures determined at the atomic level using X-ray crystallography: 1484
- I.2.N Number of experimental structures determined at the atomic level using NMR methods: 144

PSI-2 Summary Statistics for Domain and Modeling Leverage
Updated January 15, 2008:
- I.1.C Number and size of BIG domain families for which PSI-2 provides the first experimental structure representative: 474
- I.1.C Number and size of MEGA domain families for which PSI-2 provides the first experimental structure representative: 399
- I.1.E Number of experimentally determined distinct BIG family residues: 76579
- I.1.E Number of experimentally determined distinct MEGA family residues: 76121
Updated February 21, 2008:
- I.3.A Total modeling leverage: 583735
- I.3.B Novel modeling leverage: 114407

Technology Module
PSI Centers are actively developing technologies and methodologies for all aspects of the structure determination pipeline:
- Genomic Based Target Selection
- Isolation, Expression, Purification, Crystallization
- Data Collection
- Structure Determination
- PDB Deposition & Release
- Publication
- Functional Annotation

Technology Module Progress
- Phase 1 Technology Portal in place
- Summary information from all PSI Centers
- Keyword search from the KB portal

Outreach Module
- Provides information to the public about the products and accomplishments of the PSI
  - Media reports
  - Publications
  - Community activities
- Plans for a Nature Gateway

Current Annotation Module
- Provides paths to unravel sequence, structure, and function relationships
- 10 PSI interactive services for sequence, structure, and functional annotations
- 11 PSI galleries and summaries of sequence, structure, and functional annotations
- 35 other resources for annotation

Annotation Module
[Screenshot: Annotation Module]

Biological Annotation of Novel Proteins
March 7-8, 2008, Calit2, UCSD
- Participants
  - PSI groups
  - Annotation system authors
  - General biological community
- Outcome
  - Recommendations for standard annotations
  - Processes for community input

Standard Annotations
- Genomic features: gene identifier, name and synonyms, operon/regulon mappings
- Protein sequence features: amino acid sequence, taxonomy & phylogeny, sequence database accession, isoform, SNPs, PTMs, sequence families, residue conservation
- Structure features: oligomeric state, structural and functional domains, DNA binding motifs, nests & clefts, sites of interaction, residue regions of protein-protein and ligand-protein interaction, catalytic sites, secondary structure, structural neighbors and comparison of groups of structures with a common feature, properties/features mapped to 3D and their similarities (e.g. electrostatics, cavities, conserved residues, quality assessment)
- Ligands: chemical structure, interactions, functional role
- Functional classification: GO, FunCat, EC, epitope mapping, cellular location, organ location, substrate specificity, disease involvement
- Mapping to biological systems: mapping to networks and pathways (e.g. Reactome, KEGG, HPRD, BioCyc, NetPath, MINT, MIPS, DIP, STRING, STITCH, PROLINKS)
- Literature: synonyms for protein names, links to PubMed by database identifier and related text and authors

Future Improvements
- Experimental Data Tracking
  - Standardization of the protocols in PepcDB
  - PepcDB data deposition tool
  - Integration with the Materials Repository
- Materials Repository
  - Searchable database of clones
  - Ordering system
  - Integration with PepcDB and PSI SGKB
- Models Module
  - Public web service interface
  - Additional quality assessment
  - Interactive homology modeling

Future Improvements (continued)
- Technology Module
  - Improved navigation over technology topic areas
  - Keyword search option of descriptions and publications
- PSI SGKB
  - Integration with Nature Gateway
  - Simple presentation and search of standard annotations
  - Incorporation of data about ligands and modified residues
  - Molecular visualization tool

Access Information
http://kb.psi-structuralgenomics.org/KB/

Acknowledgements
- KB Team: Wendy Tao, Raship Shah, James Chun, John Westbrook
- Modules: Torsten Schwede (Models), Andrei Kouranov (Exp. Data Tracking), Paul Adams (Technology), Wladek Minor (Publications), Josh LaBaer (Materials), Rajesh Nair (Metrics)
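
Note: cross-checking the PSI-2 summary statistics

The counts reported for the Metrics Module can be sanity-checked with a little arithmetic. The sketch below is only an illustration: the dictionary keys and the function names `percent` and `summarize` are names introduced here, not part of any PSI tool or service. The numbers are copied from the PSI-2 Summary Statistics (April 1, 2008) and the modeling leverage figures (February 21, 2008). How "modeling leverage" is defined is not stated in the slides, so the code only compares the novel figure to the total.

```python
# Counts copied from the PSI-2 Summary Statistics slide (updated April 1, 2008).
# The key names are illustrative labels chosen for this sketch.
STATS = {
    "novel_structures": 1031,      # I.1.A
    "distinct_structures": 1428,   # I.1.B
    "total_structures": 1628,      # I.1.D
    "distinct_residues": 319977,   # I.1.E
    "novel_residues": 225518,      # I.1.E
    "xray_structures": 1484,       # I.2.N
    "nmr_structures": 144,         # I.2.N
}

# Counts copied from the modeling leverage slide (updated February 21, 2008).
LEVERAGE = {"total": 583735, "novel": 114407}  # I.3.A, I.3.B


def percent(part, whole):
    """Return part as a percentage of whole, rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


def summarize(stats, leverage):
    """Derive a few consistency checks and ratios from the reported counts."""
    total = stats["total_structures"]
    return {
        # X-ray and NMR counts should add up to the reported total.
        "methods_sum_to_total":
            stats["xray_structures"] + stats["nmr_structures"] == total,
        "novel_structure_pct": percent(stats["novel_structures"], total),
        "xray_pct": percent(stats["xray_structures"], total),
        "novel_residue_pct":
            percent(stats["novel_residues"], stats["distinct_residues"]),
        "novel_leverage_pct": percent(leverage["novel"], leverage["total"]),
    }


if __name__ == "__main__":
    for name, value in summarize(STATS, LEVERAGE).items():
        print(f"{name}: {value}")
```

Under these figures the X-ray and NMR counts add up to the reported total of 1628 structures. Roughly 63% of structures are novel, about 91% were determined by X-ray crystallography, and novel modeling leverage is a little under 20% of the total.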