IN51A-1685
Semantically Enabled Knowledge Representation of Metamorphic Petrology Data
Patrick West (1), Peter Fox (1), Frank Spear (2), Sibel Adali (3), Cam Le Nguyen (1), Benjamin Hallett (2), L. K. S. Horkley (2)
(1) Tetherless World Constellation, Rensselaer Polytechnic Institute, 110 8th St., Troy, NY, 12180, United States
(2) Earth and Environmental Science, Rensselaer Polytechnic Institute, 110 8th St., Troy, NY, 12180, United States
(3) Department of Computer Science, Rensselaer Polytechnic Institute, 110 8th St., Troy, NY, 12180, United States
Current System
There is not a lot of information here beyond the results of the search. Clicking on one of the result samples brings up
a screen with some information, but nothing that links to "knowledge information". In other words, there are no links
describing what the minerals are, which chemical analyses were used, who owns the sample, linked data, project
information, and so on.
If you know where to look, you can click on the "Wiki" tab to get some additional information, but context is lost at
that point.
Abstract
A growing volume of metamorphic petrology data is being collected around the world and organized into virtual data
portals by virtual organizations. Examples include the Petrological Database of the Ocean Floor (PetDB,
http://www.petdb.org), which organizes geochemical data on ocean-floor igneous and metamorphic rocks, and the
Metamorphic Petrology Database (MetPetDB, http://metpetdb.rpi.edu), which is being created by a global community of
metamorphic petrologists in collaboration with software engineers and data managers at Rensselaer Polytechnic
Institute. The current focus is to give scientists and researchers the ability to register their data and search the
databases for information about sample collections.
What we present here is the next step in the evolution of the MetPetDB portal, utilizing semantically enabled features
such as discovery, data casting, faceted search, knowledge representation, and linked data, as well as organizing
information about the community and collaboration within the virtual community itself. We take the information that is
currently represented in a relational database and make it available through web services, SPARQL endpoints, and
semantic triple stores where inferencing is enabled. We will be leveraging research that has taken place in virtual
observatories, such as the Virtual Solar Terrestrial Observatory (VSTO) and the Biological and Chemical Oceanography
Data Management Office (BCO-DMO); vocabulary work done in various communities, such as Observations and Measurements
(ISO 19156), FOAF (Friend of a Friend), BIBO (Bibliographic Ontology), and domain-specific ontologies; provenance
traces of samples and subsamples using the different provenance ontologies; and the much-needed linking of data from
the various research organizations into a common, collaborative virtual observatory.
In addition to better representing and presenting the actual data, we also aim to organize and represent the knowledge
and expertise behind the data. Domain experts hold a lot of knowledge in their minds, in their presentations and
publications, and elsewhere. This is not only a technical issue but also a social one: we need to encourage domain
experts to share their knowledge in a way that can be searched and queried. With this additional focus, MetPetDB can be
used more efficiently by other domain experts and can also be used by non-specialists, both to educate people about the
importance of the work being done and to help develop future domain experts.
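The provenance traces of samples and subsamples mentioned in the abstract can be pictured as a chain of "derived from" links. The following is a minimal sketch only: the identifiers and the dictionary layout are hypothetical and stand in for whatever a provenance ontology would actually record.

```python
def trace_provenance(derived_from, item):
    """Follow 'derived from' links from an item back to its original sample."""
    chain = [item]
    while chain[-1] in derived_from:
        parent = derived_from[chain[-1]]
        if parent in chain:
            # Guard against malformed data that loops back on itself.
            raise ValueError(f"cycle detected at {parent}")
        chain.append(parent)
    return chain


# Hypothetical identifiers, for illustration only.
DERIVED_FROM = {
    "thin-section-3": "subsample-2",
    "subsample-2": "sample-1",
}

if __name__ == "__main__":
    print(" <- ".join(trace_provenance(DERIVED_FROM, "thin-section-3")))
```

In a semantically enabled system the same chain would be expressed as triples, so that a query could walk it alongside the sample data.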
Modern informatics enables a new scale-free framework approach
To redevelop the MetPetDB portal, we have decided to follow the Semantic Web Methodology and Technology Development
Process, as developed by Peter Fox at Tetherless World Constellation. This methodology is an iterative approach to
developing semantically enabled knowledge information systems.
• Start with face-to-face meetings with current researchers and developers.
• Take a look at the current portal (upper right).
• Develop use cases from current usage models as well as future usage.
• Analyze the information models.
• Develop more expressive representations of the information.
• Review the information.
• Iterate and evolve.
• A relational database is used to store the data.
• This is good enough for portals with simple search capabilities.
• But more requirements are being placed on systems, not only to provide data but also to provide knowledge
information and links to other systems.
• For this, we need a more expressive representation of the data, the metadata, and community information.
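As a rough illustration of moving from relational rows to a more expressive representation, the sketch below reads rows from an in-memory SQLite table and emits subject-predicate-object triples. The table layout, column names, sample values, and the `http://example.org/metpetdb/` namespace are all hypothetical; they are not the MetPetDB schema.

```python
import sqlite3

# Hypothetical namespace, for illustration only.
BASE = "http://example.org/metpetdb/"


def build_demo_db():
    """Create an in-memory table with made-up sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sample (id INTEGER PRIMARY KEY, rock_type TEXT, owner TEXT)"
    )
    conn.executemany(
        "INSERT INTO sample VALUES (?, ?, ?)",
        [(1, "schist", "alice"), (2, "gneiss", "bob")],
    )
    conn.commit()
    return conn


def rows_to_triples(conn):
    """Turn each row of the sample table into subject-predicate-object triples."""
    triples = []
    cursor = conn.execute("SELECT id, rock_type, owner FROM sample ORDER BY id")
    for sample_id, rock_type, owner in cursor:
        subject = f"{BASE}sample/{sample_id}"
        triples.append((subject, "rdf:type", f"{BASE}Sample"))
        triples.append((subject, f"{BASE}rockType", rock_type))
        # The owner becomes a resource of its own, so it can carry further links.
        triples.append((subject, f"{BASE}owner", f"{BASE}person/{owner}"))
    return triples


if __name__ == "__main__":
    for triple in rows_to_triples(build_demo_db()):
        print(triple)
```

Because the owner is emitted as a resource rather than a plain string, it can later be linked to community information, which a foreign key value alone does not convey outside the database.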
[Figure: Ontology Spectrum/Expressivity. Levels, from least to most expressive: Catalog/ID; Terms/glossary; Thesauri
"narrower term" relation; Informal is-a; Formal is-a; Formal instance; Frames (properties); Value Restrs.; Selected
Logical Constraints (disjointness, inverse, …). The Current System (Relational) and the Next Generation System are
marked on the spectrum.]
Originally from AAAI 1999 Ontologies Panel by Gruninger, Lehmann, McGuinness, Uschold, Welty; updated by McGuinness.
Description in: www.ksl.stanford.edu/people/dlm/papers/ontologies-come-of-age-abstract.html
• Not only do we represent concepts, but also relationships between concepts.
• Both concepts and relationships are first-class citizens.
• Common vocabulary, taxonomy, RDFS, OWL ontology … more expressive representation of knowledge information.
• Share and tie together formal representation with other efforts.
• Becomes a linked virtual observatory.
• Easier implementation and evolution of concepts and relationships.
• Implementation of faceted browsing capabilities.
• Discovery mechanisms.
• Tying together science domain, community domain, educational domain, provenance domain.
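To make the role of formal is-a relationships concrete, the sketch below shows the kind of inference a triple store with inferencing enabled can draw: an instance typed with a narrow class is also an instance of every broader class. It is a simplified stand-in written in plain Python, not a reasoner, and the mini-taxonomy and the instance name `grain-17` are hypothetical.

```python
def infer_types(subclass_of, instance_of):
    """Return every (instance, class) pair implied by following is-a links upward."""
    inferred = set()
    for instance, cls in instance_of:
        stack = [cls]
        seen = set()
        while stack:
            current = stack.pop()
            if current in seen:
                continue  # already visited; also protects against cycles
            seen.add(current)
            inferred.add((instance, current))
            stack.extend(subclass_of.get(current, []))
    return inferred


# Hypothetical mini-taxonomy: class -> list of direct superclasses.
SUBCLASS_OF = {"Garnet": ["Silicate"], "Silicate": ["Mineral"]}
# Hypothetical asserted fact: one grain typed with the narrowest class.
INSTANCE_OF = [("grain-17", "Garnet")]

if __name__ == "__main__":
    for pair in sorted(infer_types(SUBCLASS_OF, INSTANCE_OF)):
        print(pair)
```

With this in place, a search for all minerals would also find `grain-17`, even though only "Garnet" was asserted for it.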
Next Generation System
Knowledge Representation
SWEET ontology for elements, minerals, processes, matter
FOAF – Friend of a Friend for person, roles, documents, organizations, relationships between
O&M – ontology for sampling, measurements of components, observational information
bibTeX – ontology for bibliographic information
Embedded faceted search interface into the MetPetDB Community Portal.
Facets for concepts:
• People
• Projects
• Minerals
• Elements
• Chemical Analysis
• Regions
Information is easy to click to.
Information pages for each instance of a concept.
Links to more information.
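The faceted search idea above can be sketched as two small operations: narrowing the result set by the selected facet values, and counting the values that remain available in each facet. This is an illustrative sketch only; the sample records and their values are made up, and it does not reflect the S2S interface named under Next Steps.

```python
from collections import Counter

# Made-up sample records; the facet names follow the list above.
SAMPLES = [
    {"id": "s1", "Minerals": ["garnet", "biotite"], "Regions": ["Vermont"], "People": ["alice"]},
    {"id": "s2", "Minerals": ["garnet"], "Regions": ["New York"], "People": ["bob"]},
    {"id": "s3", "Minerals": ["staurolite"], "Regions": ["Vermont"], "People": ["alice"]},
]


def filter_by_facets(samples, selections):
    """Keep samples that carry every selected value in every selected facet."""
    return [
        sample
        for sample in samples
        if all(
            value in sample.get(facet, [])
            for facet, values in selections.items()
            for value in values
        )
    ]


def facet_counts(samples, facet):
    """Count how many samples carry each value of the given facet."""
    counts = Counter()
    for sample in samples:
        counts.update(sample.get(facet, []))
    return dict(counts)


if __name__ == "__main__":
    hits = filter_by_facets(SAMPLES, {"Minerals": ["garnet"]})
    print([sample["id"] for sample in hits])
    print(facet_counts(hits, "Regions"))
```

Recomputing the counts after each selection is what lets the interface show only the facet values that still lead to results.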
Acknowledgments:
Stephan Zednik and Evan Patton – Tetherless World Constellation
Results and Next Steps
• Results
  • Developed an information model from knowledge extraction of the relational schema
  • Developed web services to provide common information extraction
  • Created a community portal using Drupal CMS for community participation
• Next Steps
  • Determine useful existing vocabularies and ontologies, e.g. FOAF, SWEET, O&M, VSTO, bibTeX
  • Complete development of the RDFS/OWL representation of knowledge information
  • Expand web services, response representation, and availability to faceted search
  • Develop a faceted search/browse interface using S2S
  • Implement Linked Data and link to other efforts (PetDB, EarthChem)
Sponsors:
Tetherless World Constellation, Rensselaer Polytechnic Institute
Glossary:
bibTeX – ontology for bibliographic data (http://zeitkunst.org/bibtex/0.1/)
CMS – Content Management System
FOAF – Friend of a Friend
MetPetDB – Metamorphic Petrology Database
O&M – Observations and Measurements (http://www.opengeospatial.org/standards/om)
OWL – Web Ontology Language
RDFS – Resource Description Framework Schema
RPI/TWC – Rensselaer Polytechnic Institute / Tetherless World Constellation
SWEET – Semantic Web for Earth and Environmental Terminology (http://sweet.jpl.nasa.gov)
XSL – Extensible Stylesheet Language
VSTO – Virtual Solar Terrestrial Observatory