Bio-Ontologies in the Context of the BOOTStrep Project
Andrea Splendiani1, Elena Beisswanger2, Jung-Jae Kim3, Vivian Lee3, Olivier Dameron1,
Udo Hahn2 and Dietrich Rebholz-Schuhmann3
1 Laboratoire d'Informatique Médicale, Université de Rennes 1, Rennes, FR
2 Language and Information Engineering (JULIE) Lab, Friedrich-Schiller-Universität, Jena, DE
3 Rebholz Group, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, U.K.
[Figure: GRO (0.1) – Fragment of Continuants]
Aims of the BOOTStrep Project
• Construction of a wide-coverage integrated lexical, conceptual and factual
knowledge resource for the biology domain, consisting of:
- Bio-Lexicon
- Bio-Ontology
- Bio-Fact Store
• Development of a comprehensive natural language processing (NLP) pipeline
• Incremental enrichment of the BOOTStrep knowledge resources by information
automatically extracted from scientific literature
Bio-Ontology: Pivotal Knowledge Resource
• OWL DL as the knowledge representation language for the Bio-Ontology
• Bio-Ontology provides:
- Vocabulary for semantic annotation of a selected text corpus
(only a subset of the ontology is used)
- Semantics for accessing the Bio-Fact Store and for populating the
Bio-Lexicon
• Bio-Ontology enables:
- Consistency checking of ontological terms and facts in the Bio-Fact
Store (using a terminological classifier)
- Logical inferencing that allows for advanced text analysis and
interpretation (disambiguation of terms, event identification, anaphora
resolution, etc.) and interpretation of facts in the Bio-Fact Store
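The consistency check on Bio-Fact Store entries can be approximated, in a much simplified form, as a type check against an is-a hierarchy. The sketch below is plain Python, not OWL DL reasoning with a terminological classifier. The taxonomy fragment, the `SIGNATURE` table, and the function names are illustrative assumptions and are not taken from the BOOTStrep resources.

```python
# Hypothetical is-a fragment (child -> parent); an assumption, not the actual GRO taxonomy.
IS_A = {
    "TranscriptionRepressor": "TranscriptionFactor",
    "TranscriptionFactor": "Protein",
    "Protein": "MolecularEntity",
    "Gene": "MolecularEntity",
    "GeneExpression": "BiologicalProcess",
}


def is_subtype(cls, ancestor):
    """Walk up the is-a chain and report whether `ancestor` subsumes `cls`."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = IS_A.get(cls)
    return False


# Assumed argument types per fact predicate: (regulator type, regulated type).
SIGNATURE = {"Regulates": ("MolecularEntity", "BiologicalProcess")}


def check_fact(predicate, regulator_type, regulated_type):
    """Return True if both argument types fit the predicate's assumed signature."""
    expected_regulator, expected_regulated = SIGNATURE[predicate]
    return (is_subtype(regulator_type, expected_regulator)
            and is_subtype(regulated_type, expected_regulated))


# A repressor regulating gene expression fits the assumed signature.
print(check_fact("Regulates", "TranscriptionRepressor", "GeneExpression"))
# A process in the regulator slot does not.
print(check_fact("Regulates", "GeneExpression", "Gene"))
```

A description logic reasoner would derive such subsumptions from class definitions instead of reading them from a hand-written table; the sketch only shows the kind of check involved.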
From a Specific GRO to an Integrated Architecture for the Bio-Ontology
• The current version of GRO (V 0.1) covers fundamental types of the gene regulation domain
• Bottom-up construction process that was primarily driven by the task of semantic annotation of gene regulation events in a corpus of scientific abstracts
• Top-down definition according to the upper-level ontology BFO1 is under way
• (Semi-automatic) Enhancement towards a broader and more comprehensive three-layer Bio-Ontology reusing existing alternative terminological and database resources
1 http://www.ifomis.org/bfo/
Three-Layer Architecture for Bio-Ontology
[Figure: architecture diagram. Recoverable labels: Text, Bio-Lexicon, Bio-Fact Store, Top-Level Ontology, Bio-Ontology (Three Layers), Entity, Continuant; links labelled Type-to-Linguistic-Variants-Linking, Semantic Annotation, Inference / Consistency Checking]
Characteristics of the Gene Regulation Ontology (GRO, V 0.1)
- 250 classes
- 150 property restrictions
- Taxonomic is-a relation plus five fundamental semantic relation types:
has-part, has-participant with the subrelations has-agent and has-patient, and species (plus inverses and reciprocals)
- Two major branches describing physical objects and processes in the
field of gene regulation, and their interrelations
- Integrates parts of existing bio-ontologies and terminological resources,
such as:
• Gene Ontology1
• INOH Molecular Role Ontology1
• INOH Event Ontology1
• Sequence Ontology1
• TransFac2
1 http://obofoundry.org
2 http://www.gene-regulation.com/
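The relation types listed above form a small hierarchy: has-agent and has-patient specialise has-participant, and each relation has an inverse. The following minimal Python sketch shows how asserted relations could be expanded accordingly. The inverse names other than part-of, the `expand` function, and the example individuals are assumptions made for illustration; the species relation is left out.

```python
# Subrelations named on the poster: has-agent and has-patient specialise has-participant.
SUBRELATION_OF = {
    "has-agent": "has-participant",
    "has-patient": "has-participant",
}

# Inverse relation names; apart from "part-of" these are hypothetical.
INVERSE = {
    "has-part": "part-of",
    "has-participant": "participant-of",
    "has-agent": "agent-of",
    "has-patient": "patient-of",
}


def expand(triples):
    """Add the triples implied by the subrelation and inverse declarations.

    One pass per rule is enough here because the hierarchy is one level deep.
    """
    inferred = set(triples)
    for subject, relation, obj in triples:
        parent = SUBRELATION_OF.get(relation)
        if parent is not None:
            inferred.add((subject, parent, obj))
    for subject, relation, obj in list(inferred):
        inverse = INVERSE.get(relation)
        if inverse is not None:
            inferred.add((obj, inverse, subject))
    return inferred


# Illustrative individuals loosely based on the uxuR example on the poster.
asserted = {
    ("regulation of uxuR expression", "has-agent", "exuR repressor"),
    ("regulation of uxuR expression", "has-patient", "uxuR expression"),
}
for triple in sorted(expand(asserted)):
    print(triple)
```

In OWL DL the same effect would normally come from subproperty and inverse-property axioms evaluated by a reasoner, not from hand-written expansion.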
Focus on Gene Regulation
• Domain chosen: gene regulation (in E. coli, though scaling to other model organisms is anticipated in the design)

[Figure: GRO and Parts of Other Bio-Ontologies, linked to an example text, the Bio-Lexicon, and the Bio-Fact Store. Recoverable labels:
- Nodes: Occurrent, Independent Cont., Dependent Cont., Quality, Function, Biological Function, Process, Boundary, Object, Cell, Organism, E.coli, Mouse, Human, Molecular Entity, Nucleotide Sequence, Protein, Gene, Genes and Proteins, Transcription Factor, DNA-Binding, Transcription Repressor, Biological Process, Gene Expression, Transcription, Translation, Process Regulation, Regulation of Biological Process, Regulation of Gene Expression
- Relations: is-a, instance-of, part-of, has-agent, has-patient, species
- Other labels: Database Entries, Population and Cleaning of Ontology]

Example text
Multiple regulation involved in the expression of the uxuR regulatory gene in Escherichia coli K-12
"... These results indicate that the expression of the uxuR gene is repressed by its own product but also by the exuR repressor. ..."

Bio-Lexicon
Escherichia coli  Synonym: E.coli
uxuR  Synonym: ECK4315  Synonym: JW4287  Variant: …
exuR  Synonym: ECK3085  Synonym: JW3065

Bio-Fact Store
Regulates (uxuR, uxuR expression, E.coli K-12)
Regulates (exuR, uxuR expression, E.coli K-12)
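The Bio-Lexicon entries and Bio-Fact Store facts in the example can be represented roughly as follows. This is a minimal Python sketch, not the project's storage format. The field names `regulator`, `regulated`, and `context` are assumed readings of the three arguments of `Regulates`, and the helper functions are hypothetical.

```python
from collections import namedtuple

# Synonyms copied from the Bio-Lexicon example; the elided variant ("…") is left out.
LEXICON = {
    "Escherichia coli": ["E.coli"],
    "uxuR": ["ECK4315", "JW4287"],
    "exuR": ["ECK3085", "JW3065"],
}

# Reverse index from every surface form to its lexicon entry.
SURFACE_TO_ENTRY = {
    form: entry
    for entry, synonyms in LEXICON.items()
    for form in [entry, *synonyms]
}

# Assumed argument roles for Regulates(regulator, regulated, context).
Fact = namedtuple("Fact", "predicate regulator regulated context")

FACTS = [
    Fact("Regulates", "uxuR", "uxuR expression", "E.coli K-12"),
    Fact("Regulates", "exuR", "uxuR expression", "E.coli K-12"),
]


def normalize(name):
    """Map a synonym to its lexicon entry; unknown names pass through unchanged."""
    return SURFACE_TO_ENTRY.get(name, name)


def regulators_of(regulated):
    """List the regulators recorded for a given regulated process."""
    return [f.regulator for f in FACTS
            if f.predicate == "Regulates" and f.regulated == regulated]


print(normalize("ECK3085"))
print(regulators_of("uxuR expression"))
```

Normalising names through the lexicon before querying lets a mention such as ECK3085 in text be matched against facts stored under exuR.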
Acknowledgements
BOOTStrep (FP6 - 028099) is a Specific Targeted Research Project
(STREP) funded by the European Union
http://www.bootstrep.eu/