Son of Standardizing Drug Target Types
A Pistoia Vocabulary Standard Initiative
Lee Harland, Christopher Larminie and Phoebe Roberts
With input from the Pistoia VSI group
Was a PUBLIC DOCUMENT

Drug Targets
• A ‘simple’ monomer?
  – HTR1B (or 5ht-1b, 5ht1b, htr1b… etc.)
• A ‘stable’ (core) complex
  – 2(NR1:NR2A)
• A ‘dynamic’ complex
  – NMDA-MASC
What is the “target” of insulin?
From: A. J. Pocklington, J. D. Armstrong and S. G. N. Grant (2006): Organization of brain complexity – synapse proteome form and function. Briefings in Functional Genomics and Prot 5(1), pp66-73

Example from a pipeline database
• Research programme: Wnt signalling pathway inhibitors
• Three proprietary drug targets which inhibit the Wnt pathway have been identified by the company. The company's most advanced candidates inhibit the interaction between Bcl9/hLgs and beta-catenin, the key regulatory protein in Wnt-signal transduction

Problem Statement
• No agreed standard for representing a Molecular Drug Target within information systems.
• Ad hoc solutions, often free text or a simple list of gene identifiers. Especially for non-single protein entities (see right)
• Crucially, no public URI to link identical concepts across different sources. For instance “Protein Kinase C”
• Effect: No mechanism for association of data between crucial drug discovery entities
• More than a “vocabulary” issue – it's how a target should be represented
• A molecular drug target standard could accomplish 3 things:
  – Represent: A common scheme to describe a drug target.
    • Including URIs for all known drug targets(*)
  – Organise: Map all known drug targets to a pharmacologically relevant taxonomy
  – Discover: Exploit the resource to identify assertions relating to drug target concepts
http://dx.doi.org/10.1517/17460440903049290

High Level Summary
• Representation of a molecular drug target in structured databases is ad hoc
  – Single-protein targets are “OK” (being linked via Entrez gene, but this is not an agreed standard)
  – Multi-protein targets, complexes, biologicals and many more are poorly described, often simply raw text
• This project will focus on industry & suppliers to describe a specification for reporting drug targets within structured content
  – Minimal cost, just FTE time required
  – This could feed into the IMI Open Pharmacology (OPS) call as an industry-publisher requirement
  – Output would be a specific set of “rules” regarding the representation of complex molecular targets
  – Aim would not be to define a list of all known targets; this would be out of scope, as will any text-mining efforts.
  – Recommendation to suppliers and industry to adopt specification along with industry-generated mappings for pre-existing targets
  – Deliverable – specification & publication
• Could be a start to a future, wider pharmacological data standard project
  – All databases providing pharmacological activity content delivered in a standard way
  – Could gain a quick-start building on MIABE standard

Definitions
• Target Form ontology:
  – A small list of TYPES/FORMS/ROLES that a drug target can be. E.g. single protein, protein complex, fuzzy group etc.
• Target Instance ontology:
  – An ontology representing the targets themselves, providing a URI for things such as Protein Kinase C, NMDA Receptor etc.
• Target Family ontology:
  – An ontology grouping together different targets under the same functional property, e.g. “Aminergic Receptors”, “Serine Proteases”, “Phosphodiesterases”
This first phase of the Target Standard project will look to define the Target Form ontology ONLY.

Other things we need to say about a drug and its target
• What the drug is doing to the activity of the target (a.k.a. “mode of action”) – inhibiting, activating, or mimicking?
  – This is relatively well standardized, not an impediment to use and integration
• What are the implications of the drug binding to the host protein?
  – Efficacious target
  – Secondary pharmacology
  – Metabolism of drug
  – Convert pro-drug to active form

The Project
• We are looking to Collaborate: 4+ Pharma, 3+ Vendors, 2+ Academic/Database providers
• Define a small ontology of “types” of drug target
• Create an implementation-independent standard to represent target-types in vendor systems
• Analyse how vendor content would fit this standard
• If possible, assist vendors in adopting this standard
• Discuss standard with non-commercial providers/public domain ontologies
• Est Cost: low $$, intellectual input required

Pharma Could Aid Vendor Adoption
• We have already mapped many non-single protein targets to types within our systems
• We could contribute these back to vendors in the form of:
  – <TARGET> <TYPE>
• Vendors could then load these back into their systems without any limitation
• Vendors would then only need to add target typing going forward
• Cross-pharma means good consensus view of target-2-form mappings

Output
• Would be a simple table of types, type IDs and definitions
• Vendors then able to add target-type mappings in their systems

ID    Name                   Def
DTS1  Single Protein Target  A target which is a single protein
DTS2  Fuzzy protein target   A known protein target where the list of proteins is ambiguous
DTS3  Complex                A protein complex drug target

• Targets would also list public database protein identifiers.
• The combination of the two helps consumers understand the nature and composition of non-single protein targets
• Recommendations on other meta-data essentials (e.g. organism, mutation etc.) would also be defined
Example target-type mappings:
  – PDE5 = DTS1
  – Protein Kinase C = DTS2
  – Gastric Pump = DTS3

Application of Target Form Ontology
[Figure: Target form ontology, ranging from Fuzzy (Fuzzy groups, Fuzzy families, Non-protein targets) to Granular (Single protein, Well-defined complex)]
• GO CC: GO:0005694 Large Ribosomal Subunit has_target_form complex shared by gram-positive bacteria
• GO MF: GO:0004697 Protein Kinase C activity has_target_form fuzzy group with shared activity
• PRO complex: PRO:xxx integrin alpha4 beta1 has_target_form well-defined complex
• PRO: PRO:000000535 has_target_form single protein

Vendor implementation
• The types would be a purely abstract list; there would be no requirement for vendors to change their database infrastructure
• Only change that would be required is the addition of the “type” identifier within their system (e.g. new db column, new field etc.)
• We will not dictate how the information should be presented, only that the field should be available within the data

Benefits
• Questions this will address (Core)
  – What type of target is this?
  – What are the molecular components of this target?
  – How are the molecular components of this target related?
  – How does this target relate to other targets?
  – How can information in two separate sources be associated to the same target concept?
• Questions this will address (Fact Identification Module)
  – What are all the synonyms for this target?
• Industry: Deal with a specific, common problem
  – Namely the representation and integration of drug target associated data across public and commercial sources
• Develop a more complete picture of the existing drug target universe
• Explore business models that include fully public and licencing elements
• Contribute to the PRO ontology?
• Content Providers
• Reduction in the cost of development and the cost of sale of drug-target data products
• Customer-ready solution
• Increased potential to find target-specific data, maximising use of content
• Address incomplete and inconsistent search results and improve customer satisfaction

Moving Beyond
• This is a small project, to address a specific issue.
• However, this should also create a base upon which to build. The group may explore:
  – Further development of Target instance and family ontologies
  – Further standards around competitor/chemogenomic/pharmacological data provision
  – Collaborative opportunities with public domain resources
  – Technology opportunities for providers & consumers in the target space

Discussions with PRO
• PRO is the public domain, OBO-compliant protein ontology http://pir.georgetown.edu/pro/pro.shtml
• We have opened a dialog to understand what aspects of target standards could be represented by PRO. This would start to address the “public URI” problem
• There is an option of a cross-industry funded researcher placed in an academic group to implement standards in a resource such as PRO
• Early days. Does not preclude the involvement of other groups or other mechanisms

Acknowledgements
• Pistoia Member Companies & Individuals
• Pistoia Vocabulary Standards Group
• OBO Co-ordinators

Vocabulary Standards Initiative Pilot: Molecular Drug Target Reporting Standard

Project Objectives
• To identify a methodology to represent drug targets within information systems which is simple and can be implemented at minimal cost to the provider
• Specifically:
  – To create a rule-base for the description of complex molecular drug targets
  – To create a controlled vocabulary for representing drug target classes
  – To consider additional “add-on” components (e.g. synonyms & text-search capabilities) which would add considerable value to all participants
• To deliver a specification document which can be used in a next phase of a Pistoia, IMI or other funded programme to deliver a suitable implementation

Activities / Deliverables
• Development of a project team of interested parties
• Agree area of focus for pilot
• All parties agree that the major outcome is a specification document
• All parties agree components of the specification
• All parties prepared to submit exemplar data for the specification
• All parties to review existing standards and make recommendations on their use within this initiative
• Provider and industry review value, cost, technical feasibility and minimum core services required to move to specification
• OBO guidance on utility as an “ontology” and value within the public domain
• Primary delivery of the pilot is a specification report, documenting requirements & recommendations for any subsequent implementation.

Business Challenge
• No universal standard for describing a molecular drug target within structured content:
• For consumers:
  – Inability to navigate, find and exploit information on a core pharmaceutical entity. Inability to use same standard to connect internal & external data
• For providers:
  – Undermines efforts to make data more accessible. Missing results reduce the value of the product
• For community:
  – No knowledge of industry/provider experience dealing with these concepts to increase definition and access for all.

Background
• Pistoia Alliance sponsored project.
• Part of the Vocabulary Standards Initiative (VSI) within the KIS Domain.

Success Criteria
• 4+ pharma, 4+ supply chain company participants, at least one academic group
• Agreed specification document generated by end 2010, including 2+ content provider assessments of feasibility of deployment. Agreed pharma commitment to implementation via IMI
• Clear picture of feasibility from all stakeholders and >75% of partners interested in longer term service.

Stakeholders & Resource Requirements
• Pistoia Alliance
• Chair: Pfizer, GSK
• Relatively low level funding, time/input from members will be critical
• Will open dialog with the PRO group (and other interested parties) as to support in the public domain

Key Milestones
[Timeline chart, Dec 09 to Q1 11 (columns: Dec 09, Jan 10, Feb 10, Mar 10, Q2 10, Q3 10, Q4 10, Q1 11). Items shown: VSI Kick-off; Pilot Planning; Pilot Planning & Recruitment; Target Ontology Kick-off; Resourcing & IP; EBI OBO Meeting; Execution. "M" marks a milestone.]

Threats / Opportunities
• Inability to separate requirements from implementation
• Inability to identify common path to adoption
• IMI Pharmacological space call, opportunity for funding but also complex logistics
• “PRO” protein ontology, opportunity to enhance this resource in tandem

Expected Benefits (Value)
• Reduced system development costs (supplier)
• Reduced integration costs (consumer)
• Improved usability of drug target information
• Improved scientific analysis and hypothesis generation capabilities for all
• Increased visibility of content.

Draft. V1.2 of 19-01-2010
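
Illustrative sketch (not part of the original slides): the Output table of target forms and the <TARGET> <TYPE> pairs that pharma could contribute back to vendors might be held in a structure like the one below. This is only a minimal sketch under stated assumptions. The names TargetForm, TARGET_FORMS, parse_pairs and form_of are hypothetical, and the tab-separated layout of the pairs is assumed for illustration; the deck deliberately does not dictate any implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TargetForm:
    """One row of the target-form table (ID, Name, Def)."""
    form_id: str
    name: str
    definition: str


# Example rows copied from the Output slide; not a complete or agreed list.
TARGET_FORMS = {
    form.form_id: form
    for form in (
        TargetForm("DTS1", "Single Protein Target",
                   "A target which is a single protein"),
        TargetForm("DTS2", "Fuzzy protein target",
                   "A known protein target where the list of proteins is ambiguous"),
        TargetForm("DTS3", "Complex", "A protein complex drug target"),
    )
}


def parse_pairs(lines):
    """Parse '<TARGET><TAB><TYPE>' lines into a target -> form-id mapping.

    The tab-separated layout is an assumption made for this sketch.
    Unknown form ids raise ValueError so bad mappings are caught on load.
    """
    mapping = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        target, form_id = (part.strip() for part in line.split("\t", 1))
        if form_id not in TARGET_FORMS:
            raise ValueError(f"unknown target form id: {form_id!r}")
        mapping[target] = form_id
    return mapping


def form_of(mapping, target):
    """Return the TargetForm recorded for a target name."""
    return TARGET_FORMS[mapping[target]]


# Example mappings given in the slides.
pairs = parse_pairs([
    "PDE5\tDTS1",
    "Protein Kinase C\tDTS2",
    "Gastric Pump\tDTS3",
])
print(form_of(pairs, "Protein Kinase C").name)  # prints: Fuzzy protein target
```

In this sketch the only thing a vendor record needs is the form id, which mirrors the slide's point that adoption should require no more than one new field or column.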