Ontology, RDF, SW for Chemical Structures
T N Bhat & J. Barkley
NIST
[email protected]
Query tool, Use Case, Publications

Major Features, Goal – to Reduce User Frustration
We have established a use case at the HCLS website: Chemical taxonomies
Combining rule-based terms with vocabulary-based terms to define the elements of RDF
Organizing the elements of RDF into a predictable ontology using concepts from use cases
Developing tools and techniques to present the information using familiar database environments
  – Allows easier portability and implementation of the information by the community
Illustrating the concept using high-profile data such as AIDS inhibitors and Protein Data Bank contents

Combining Rule-based with Vocabulary-based Elements to Define RDF
Chemical structures are definable by atomic connectivity, so they are suitable for identification using graph theory
  – InChI
  – Suitable for machine reasoning
Graphs are hard for humans to digest, so the proposal is to combine InChI with familiar vocabularies such as Ala, Phenyl, Adenine
  – Also include synonyms in the vocabulary for greater coverage among diverse users
  – Vocabularies make it easier for humans to recognize the information

InChI – a Scalable URI
InChI is generated by software that breaks the chemical connectivity information down into layers (such as chirality, ring structure, atom type) and then encodes them as a text string
InChI is a naming standard for chemicals recommended by IUPAC

InChI – a Rule-based URI
InChI – _1_2FC10H11NO2_2Fc1110_2812_2913-9-5-7-3-1-2-48_287_296-9_2Fh1-4_2C9H_2C56H2_2C_28H2_2C11_2C12_29

Vocabulary-based Definitions
For decades scientists have been developing names to identify structures and their images
  – Simple names: His, Ala, DNA, ATP
  – Semi-rule-based IUPAC names:
      2-amino-3-methylpentanamide
      4-amino-3-hydroxy-6-methylheptanoic_acid
      1-[(Benzenesulfonyl-methyl-amino)-phenyl-butyl]-piperidin-4-yl}propyl-carbamic acid, naphthalen-1-ylmethyl ester
Names facilitate text-based queries of desired components
Names, when used together with InChI, provide a smoother integration of machine and human needs

Use-Case for SW; Treatment for AIDS is a Work in Progress
Treatments for AIDS are of two types
  – Prevention, the most effective
  – Containment
Drugs to contain and reduce the viral load
  – The majority of the drugs (~17) target either HIV protease or RT
  – Complete suppression of either of these viral enzymes could cure AIDS
  – But drug resistance means the enzymes are only partially suppressed
All drug design efforts for AIDS are based on structures
Data needed for drug design is scattered over many Web resources, and users often wade through the data manually
Therefore AIDS drug design is an ideal target for Semantic Web and novel database-related technologies

SW Connection between NIST and NIAID AIDS Database
Choose the problem that matters
Website

Annotation Technique/Developing Structural Ontology
Define compounds using chemical features of interest to use cases
  – Fragment, subgroup, class
[Figure labels: 000503, 000505, 1A8K, 030798]

Modeling with Protégé – Suitable for Text-based Ontology

Web Tools
Structures are different from text-based information
  – Structures are not amenable to text-based query/rendering techniques
  – Most structural users have never heard of (nor want to hear about!) SPARQL, the query language for RDF
  – The commonly preferred/expected way to query is by 'click'
Semantic Web for Structures needs new Web tools that allow navigation by clicking on structural features

Chem-BLAST for Structural Semantic Web
http://bioinfo.nist.gov/SemanticWeb_pr3d/chemblast.do
Prasanna et al. PROTEINS 60, 1-4 (2005).
Prasanna et al. PROTEINS 63(4), 907-917 (2006).
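
The pairing of a rule-based InChI identifier with vocabulary-based names, described in the earlier slides, can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation. The escaping rule (an underscore followed by two hex digits, e.g. _2F for "/", _28 and _29 for parentheses, _2C for a comma) is inferred from the example string on the "InChI – a Rule-based URI" slide. The base namespace `http://example.org/chem/`, the choice of `rdfs:label` and `skos:altLabel` as predicates, and all function names are hypothetical. The alanine InChI (stereochemistry layers omitted) is included only as sample input.

```python
import re

# Hypothetical base namespace; the slides do not state the real one.
BASE = "http://example.org/chem/"
# Real W3C predicates, but using them here is an assumption of this sketch.
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
SKOS_ALT_LABEL = "http://www.w3.org/2004/02/skos/core#altLabel"


def inchi_to_token(inchi):
    """Escape each byte outside [A-Za-z0-9-] as '_' plus two uppercase hex digits."""
    out = []
    for byte in inchi.encode("ascii"):
        ch = chr(byte)
        out.append(ch if ch.isalnum() or ch == "-" else "_%02X" % byte)
    return "".join(out)


def token_to_inchi(token):
    """Reverse inchi_to_token by expanding every '_XX' escape."""
    return re.sub(r"_([0-9A-F]{2})", lambda m: chr(int(m.group(1), 16)), token)


def describe(inchi, name, synonyms=()):
    """Return (subject, predicate, object) triples for one compound."""
    subject = BASE + inchi_to_token(inchi)
    triples = [(subject, RDFS_LABEL, name)]
    triples += [(subject, SKOS_ALT_LABEL, s) for s in synonyms]
    return triples


def find_by_name(triples, text):
    """Case-insensitive text query over names and synonyms; returns subject URIs."""
    wanted = text.lower()
    return sorted({s for s, _, o in triples if o.lower() == wanted})


# Sample input: alanine without stereochemistry layers.
ALA = "InChI=1S/C3H7NO2/c1-2(4)3(5)6/h2H,4H2,1H3,(H,5,6)"
triples = describe(ALA, "Ala", synonyms=["Alanine", "2-aminopropanoic acid"])

if __name__ == "__main__":
    for uri in find_by_name(triples, "alanine"):
        print(uri)                              # machine-oriented identifier
        print(token_to_inchi(uri[len(BASE):]))  # original InChI recovered
```

The machine-facing subject is derived from connectivity alone, while the labels and synonyms give human users something to search for by text. Because the underscore itself is escaped, the token can be decoded back to the original InChI.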
Download publications

Future Plans
Extend the work to chemical structures from Protein Data Bank
If interest exists, hold a workshop at NIST
  – Proposed dates: last two weeks of March 2008
  – Workshop will be in conjunction with the NIST-wide Ontology week
Possible collaboration with IUPAC (International Union of Pure and Applied Chemistry) and ChEBI
  – Contact: Colin Batchelor [email protected] RSC Publishing, Royal Society of Chemistry
Community participation is essential for further development
Contact [email protected] 301 975 5448 (US)