RDF and triplestores
CMSC 461
Michael Wilson
Reasoning
• Relational databases allow us to reason about data that is organized in a specific way
  - Data that models specific relationships
  - Data that is very cleanly structured
• What other reasoning methods are available to us?
Metadata
• "Data about data"
  - Data that describes other data
• Gives context
• Example metadata:
  - Image EXIF data (geolocation, rotation, etc.)
  - User statistics
  - Last saved information in a file
What's so important?
• The context that we gather from metadata often allows us to understand a much greater picture
  - Can correlate and tie metadata together
  - Calculate statistics on metadata
  - Understand trends
  - Infinite possibilities
The depth of metadata
• Many systems have their own way of storing metadata
  - Database tables may be organized to house specific metadata
• This does not lend itself well to discovering new types of metadata
  - Person may have age, DOB
  - Later want to add new types (friends, Facebook ID, Twitter ID, etc.)
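To make that rigidity concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `person` table, its columns, and the sample row are illustrative assumptions, not taken from any real system. The point is only that a new kind of metadata means altering the table itself.

```python
import sqlite3

def column_names(conn, table):
    """Return the column names of a table, in declaration order."""
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]

# In-memory database with a hypothetical, fixed-shape person table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (name TEXT, age INTEGER, dob TEXT)")
conn.execute("INSERT INTO person VALUES ('Alice', 30, '1994-02-01')")

before = column_names(conn, "person")

# A new type of metadata requires a schema change that touches the whole table.
conn.execute("ALTER TABLE person ADD COLUMN twitter_id TEXT")
after = column_names(conn, "person")

print(before)
print(after)
```

Every further type (friends, Facebook ID, and so on) would need another schema change or another table.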
Metadata structures
• RDF
  - Resource Description Framework
• OWL
  - Web Ontology Language
• Ontology: established vocabulary to describe knowledge within a domain
• RDF is more widely used
Schemas
• RDF and other structured metadata formats allow us to establish a common language to describe different sorts of metadata
• We can make schemas that describe
  - Social media
  - Physical location
  - Job details
• Moreover, we can tie them all to one subject
  - Doesn't require database reorganization
Why is that cool?
• What this means is that we can tie any arbitrary sets of data together with very little work on our part
• We make a schema that describes a new domain, and staple that information onto an existing subject
Triples
• Within these schemas, data is conceptually organized as
  - <subject> <predicate> <object>
• Subject
  - The subject of the expression
• Predicate
  - The relationship between the subject and object
• Object
  - The direct object of the expression
• These expressions are called "triples"
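A few concrete triples, sketched in plain Python. The subject `ex:alice` and the `person:` and `social:` predicate names are made up for illustration and do not come from a real ontology. The sketch also shows a second vocabulary being attached to the same subject without touching the triples already there.

```python
from collections import namedtuple

# A triple is an ordered (subject, predicate, object) record.
Triple = namedtuple("Triple", ["subject", "predicate", "object"])

# Hypothetical subject described with a hypothetical "person" vocabulary.
alice = "ex:alice"
triples = [
    Triple(alice, "person:age", 30),
    Triple(alice, "person:dob", "1994-02-01"),
]

# Later: describe a new domain (social media) and attach it to the
# same subject. No existing triple has to change.
triples.append(Triple(alice, "social:twitterId", "@alice"))
triples.append(Triple(alice, "social:friendOf", "ex:bob"))

predicates = [t.predicate for t in triples if t.subject == alice]
print(predicates)
```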
Triple examples
• Examples?
Storing triples
• Since we are often interested in large amounts of data, we need to think about how to store these
• Triplestores
  - Pretty obvious
  - What do these give us over doing something like storing the information in a relational database?
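One way to get a feel for the answer is a toy in-memory store. This is only a sketch under simple assumptions (a set of tuples, made-up `ex:`, `person:`, and `social:` names), not how a production triplestore is implemented. Any predicate can be stored without declaring it first, and any position can be left open when matching.

```python
class TripleStore:
    """Toy in-memory store; a real triplestore adds indexing and persistence."""

    def __init__(self):
        self._triples = set()

    def add(self, subject, predicate, obj):
        """Store one (subject, predicate, object) triple."""
        self._triples.add((subject, predicate, obj))

    def match(self, subject=None, predicate=None, obj=None):
        """Return sorted triples matching the pattern; None means 'anything'."""
        return sorted(
            t for t in self._triples
            if (subject is None or t[0] == subject)
            and (predicate is None or t[1] == predicate)
            and (obj is None or t[2] == obj)
        )

store = TripleStore()
store.add("ex:alice", "person:age", "30")
store.add("ex:alice", "social:friendOf", "ex:bob")
store.add("ex:bob", "person:age", "25")

about_alice = store.match(subject="ex:alice")   # everything known about one subject
ages = store.match(predicate="person:age")      # one predicate across all subjects
print(about_alice)
print(ages)
```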
Triplestore querying
• Triplestores can also be queried
  - SQL is more limited for the kinds of queries we'd like to be able to make
• SPARQL
  - The acronym stands for: SPARQL Protocol and RDF Query Language
SPARQL
• SPARQL is a SQL-like query language
  - Allows us to query on the various schemas we have assigned to our subjects
• SPARQL queries can look surprisingly readable
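SPARQL example
PREFIX abc: <http://example.com/exampleOntology#>
SELECT ?capital ?country
WHERE {
  # ?x is a city that has a name and is the capital of ?y
  ?x abc:cityname ?capital ;
     abc:isCapitalOf ?y .
  # ?y is a country that has a name and lies in Africa
  ?y abc:countryname ?country ;
     abc:isInContinent abc:Africa .
}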
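To see what the query above asks for, here is a rough hand evaluation of the same pattern in plain Python. The data set and the `city:` and `country:` identifiers are made up for illustration; a real SPARQL engine does this matching for you and far more efficiently.

```python
AFRICA = "abc:Africa"

# Made-up data that uses the predicates from the query.
triples = {
    ("city:Nairobi", "abc:cityname", "Nairobi"),
    ("city:Nairobi", "abc:isCapitalOf", "country:Kenya"),
    ("country:Kenya", "abc:countryname", "Kenya"),
    ("country:Kenya", "abc:isInContinent", AFRICA),
    ("city:Paris", "abc:cityname", "Paris"),
    ("city:Paris", "abc:isCapitalOf", "country:France"),
    ("country:France", "abc:countryname", "France"),
    ("country:France", "abc:isInContinent", "abc:Europe"),
}

def objects(subject, predicate):
    """All objects o such that (subject, predicate, o) is in the data."""
    return [o for s, p, o in triples if s == subject and p == predicate]

results = []
for x, p, y in triples:
    # ?x abc:isCapitalOf ?y
    if p != "abc:isCapitalOf":
        continue
    # ?y abc:isInContinent abc:Africa
    if AFRICA not in objects(y, "abc:isInContinent"):
        continue
    # ?x abc:cityname ?capital and ?y abc:countryname ?country
    for capital in objects(x, "abc:cityname"):
        for country in objects(y, "abc:countryname"):
            results.append((capital, country))

results.sort()
print(results)
```

Each shared variable (`?x`, `?y`) acts as a join between triple patterns, which is why the Paris rows drop out.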
Querying power
• Using SPARQL, you can make extremely deep, powerful queries and reason very intuitively on the data present in a triplestore
• Organizing data this way allows computers to actually be able to reason on data as well
Caveats
• All this tech is SUPER new
  - All tied very heavily into the Semantic Web
    - Basically, the idea is to introduce a system like this into the web at large
    - Metadata is stored about web pages so that computers can reason about them
• Much of this is a moving target
  - Not a whole lot of production applications using this stuff yet
Tools
• There are a few triplestore servers and other tools you can use
• Jena
  - Apache project
  - Framework that allows for Semantic Web concepts to be employed
  - Can query using SPARQL
  - Jena can use Postgres in the background
More tools
• RDFLib
  - https://github.com/RDFLib
  - Python library for RDF
  - Can run entirely in memory
    - Good for experimentation purposes and more