SeLeNe-related Research At Birkbeck
Alex Poulovassilis and Peter T. Wood
Database and Web Technologies Group
School of Computer Science and Information Systems
Birkbeck, University of London
SeLeNe Kick-off Meeting 15-16/11/2002
Research in CS & IS at Birkbeck
 Main groups:
• Database and Web Technologies
• Computational Intelligence
• Bioinformatics
• Software Engineering
 Main research funding sources: EPSRC, BBSRC, EU, Wellcome Trust,
HEFCE, industry
 URL http://www.dcs.bbk.ac.uk/~research/groups.html
Teaching in CS & IS at Birkbeck
Foundation Degree in IT (part-time)
BSc Computing (pt)
BSc Information Systems and Management (pt)
MSc Computing Science (ft and pt)
PG Dip & MSc in e-commerce (ft and pt)
 MSc in Advanced Information Systems (ft and pt)
 MRes in Computer Science (ft and pt)
 MPhil/PhD in Computer Science (ft and pt)
 URL http://www.dcs.bbk.ac.uk/~courses/
1. ECA Rules for XML
 This is work by us in collaboration with James Bailey at Melbourne.
It is currently being implemented by George Papamarkos, who has just
started at Birkbeck as a research student and part-time RA on SeLeNe
 XML repositories are increasingly being used in dynamic applications
where actions need to be taken in a timely fashion in response to
updates to the data
 Thus, there is a need for reactive functionality on XML repositories:
event-condition-action (ECA) rules are a natural candidate
ECA Rules
 ECA rules take the form: on event if condition do action
[Diagram: Users/Apps, Event Detection, Condition Evaluation, Action Execution]
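The on/if/do form can be sketched as a tiny rule dispatcher. This is only an illustration in Python, not the system described in these slides; the names Rule and dispatch, the event strings and the price threshold are all hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class Rule:
    event: str                         # event the rule listens for (hypothetical names)
    condition: Callable[[Any], bool]   # evaluated on the event payload
    action: Callable[[Any], None]      # executed only when the condition holds


def dispatch(rules, event, payload):
    """Fire every rule whose event matches and whose condition is true."""
    fired = 0
    for rule in rules:
        if rule.event == event and rule.condition(payload):
            rule.action(payload)
            fired += 1
    return fired


log = []
rules = [
    Rule("insert",
         lambda row: row["price"] > 100,      # if: condition
         lambda row: log.append(row["id"])),  # do: action
]
dispatch(rules, "insert", {"id": "p1", "price": 150})  # event matches, condition true
dispatch(rules, "insert", {"id": "p2", "price": 50})   # condition false
dispatch(rules, "delete", {"id": "p1", "price": 150})  # event does not match
```

Only the first call fires the rule, so log ends up holding the single id p1.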
ECA rules in Active Databases
 ECA rules in active relational databases are of the form
on insert/delete/update of a table
if SQL condition
do SQL statement(s)
 When an insertion/deletion/update occurs, the DBMS provides a set of
instantiations for the variables $new and $old
 These variables can be used within the condition and action parts of
rules
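As a concrete instance of this pattern, SQLite triggers expose the row before and after a change as OLD and NEW, which play roughly the role of $old and $new here. The sketch below uses Python's sqlite3 module; the tables product and price_audit and the trigger name are hypothetical, and SQLite fires the trigger once per row rather than handing over a set of instantiations.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product (id TEXT PRIMARY KEY, price REAL);
CREATE TABLE price_audit (id TEXT, old_price REAL, new_price REAL);

-- on:  update of product.price
-- if:  the price actually changed
-- do:  record the change in an audit table
CREATE TRIGGER log_price_change
AFTER UPDATE OF price ON product
WHEN NEW.price <> OLD.price
BEGIN
    INSERT INTO price_audit VALUES (OLD.id, OLD.price, NEW.price);
END;
""")

conn.execute("INSERT INTO product VALUES ('p1', 10.0)")
conn.execute("UPDATE product SET price = 12.5 WHERE id = 'p1'")  # fires the trigger
conn.execute("UPDATE product SET price = 12.5 WHERE id = 'p1'")  # condition false, no action
audit = conn.execute("SELECT id, old_price, new_price FROM price_audit").fetchall()
```

After the two updates, audit holds one row recording the change from 10.0 to 12.5.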
ECA in Active Databases
 ECA rules are used in conventional data warehouses for
• generation and incremental maintenance of materialised views
• checking integrity constraints
• performing automatic repairs when violations are detected
• maintaining audit trails of the data
• maintaining statistics of data warehouse performance and usage
 By analogy, ECA rules can be used to provide similar functionality on
semi-structured data such as XML and RDF.
Our ECA Language for XML
 In our WWW2002 and Computer Networks 2002 papers, we present a
language for defining ECA rules on XML
 Rather than introducing yet another language for XML, we use
fragments of the XPath and XQuery languages within the event,
condition and action parts of our ECA rules
 This allows leverage of ongoing work on XPath and XQuery
Our ECA Language for XML
 The event part of an ECA rule is of the form
INSERT e
or
DELETE e
where e is a simple XPath expression
 Simple XPath disallows the use of any axis other than the child, parent,
self, or descendant-or-self axes, and the use of all functions other than
document()
Our ECA Language - Events
 In a rule event part of the form
INSERT e
the XPath expression e evaluates to a set of nodes
 The rule is triggered if this set of nodes includes any node that has
been inserted by the most recent update on the XML database
 The set of instantiations for the variable $delta is the set of new nodes
returned by e
Our ECA Language - Events
 Similarly, in a rule event part of the form
DELETE e
the XPath expression e evaluates to a set of nodes
 The rule is triggered if this set of nodes includes any node that has
been deleted by the most recent update on the XML database
 The set of instantiations for the variable $delta is the set of deleted
nodes returned by e
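The triggering test for both event kinds amounts to intersecting the nodes matched by e with the nodes changed by the latest update. A minimal sketch of that reading, using Python's xml.etree.ElementTree (which supports only a small XPath subset) and a hypothetical helper delta_for_event:

```python
import xml.etree.ElementTree as ET


def delta_for_event(root, path, changed_nodes):
    """Return the nodes matched by `path` that are also among `changed_nodes`."""
    changed_ids = {id(node) for node in changed_nodes}
    return [node for node in root.findall(path) if id(node) in changed_ids]


stores = ET.fromstring('<stores><store id="s1"><product id="p1"/></store></stores>')

# The "most recent update": insert a new product under store s1.
store = stores.find("store")
new_product = ET.SubElement(store, "product", {"id": "p2"})
inserted = [new_product]

# Event part: INSERT /stores/store/product (path given relative to the root element).
delta = delta_for_event(stores, "store/product", inserted)
triggered = bool(delta)  # the rule fires only if $delta is non-empty
```

For a DELETE event the same intersection would be taken with the deleted nodes. This sketch assumes e is then evaluated against the document as it stood before the update, which the slides do not spell out.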
Our ECA Language - Conditions

The condition part of a rule is either
• TRUE, or
• one or more simple XPath expressions connected by and, or, not

A rule's actions are executed on each XML document that has been
changed by an event of the form specified in the rule's event part,
once for each value of $delta for which the rule's condition is true
Our ECA Language - Actions
 Each rule action is of the form
INSERT r BELOW e
or
DELETE e
 e is a simple XPath expression
 r is a simple XQuery expression
 Simple XQuery disallows the use of full FLWR expressions,
essentially permitting only the Return part of an expression.
An Example
 An XML database containing two documents s.xml and p.xml:
s.xml:
<stores>
  <store id="s1">
    <location>...</location>
    <manager>...</manager>
    <product id="p1"/>
    <product id="p2"/>
    ...
  </store>
  ...
</stores>

p.xml:
<products>
  <product id="p1">
    <name>...</name>
    <price>...</price>
    <store id="s1"/>
    <store id="s2"/>
    ...
  </product>
  ...
</products>
Example (cont’d)
 If one or more products are added to a store in s.xml, this rule appends
that store to the children of those products in p.xml if it’s not already a
child:
 Rule 1:
on INSERT document('s.xml')/stores/store/product
if not (document('p.xml')/products/
product[@id=$delta/@id]/store[@id=$delta/../@id])
do INSERT <store id='{$delta/../@id}'/>
BELOW document('p.xml')/products/product[@id=$delta/@id]
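The effect of Rule 1 can be imitated outside the rule language. The following Python sketch uses xml.etree.ElementTree and is not the ECA engine itself; apply_rule_1 and parent_of are hypothetical names, the parent map stands in for $delta/../@id because ElementTree nodes carry no parent pointer, and product ids are assumed to be unique within p.xml.

```python
import xml.etree.ElementTree as ET


def apply_rule_1(p_root, delta_products, parent_of):
    """For each product inserted in s.xml, add its store under that product in p.xml."""
    for delta in delta_products:
        product_id = delta.get("id")             # $delta/@id
        store_id = parent_of[delta].get("id")    # $delta/../@id
        for product in p_root.findall(f"product[@id='{product_id}']"):
            # Condition: this product has no <store> child with that id yet.
            if product.find(f"store[@id='{store_id}']") is None:
                # Action: INSERT <store id='...'/> BELOW the matching product.
                ET.SubElement(product, "store", {"id": store_id})


s_root = ET.fromstring('<stores><store id="s1"><product id="p1"/></store></stores>')
p_root = ET.fromstring('<products><product id="p1"/><product id="p2"/></products>')

# The update that triggers the rule: product p2 is added to store s1 in s.xml.
new_product = ET.SubElement(s_root.find("store"), "product", {"id": "p2"})
parent_of = {child: parent for parent in s_root.iter() for child in parent}

apply_rule_1(p_root, [new_product], parent_of)
apply_rule_1(p_root, [new_product], parent_of)  # condition now false, nothing is added
```

Running the rule twice leaves exactly one store s1 under product p2, which is the purpose of the not(...) condition.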
Example (cont’d)
 In a symmetric way, if one or more stores are added to a product in
p.xml, this rule appends that product to the children of those stores in
s.xml if it’s not already a child:
 Rule 2:
on INSERT document('p.xml')/products/product/store
if not (document('s.xml')/stores/
store[@id=$delta/@id]/product[@id=$delta/../@id])
do INSERT <product id='{$delta/../@id}'/>
BELOW document('s.xml')/stores/store[@id=$delta/@id]
ECA Rule Analysis
 We have also developed techniques for analysing the triggering and
activation dependencies between our XML ECA rules, described in
the two papers mentioned earlier
 These analysis techniques are also useful beyond ECA rules, since they
generally determine the effects of updates upon queries.
 They can therefore also be used to analyse the effects of other, not
necessarily rule-initiated, updates made to an XML repository, e.g.
• to determine if integrity constraints may have been violated, or
• whether materialised views need to be re-calculated.
Relation to SeLeNe
 Similarly, we are planning to define an ECA rule language for RDF as
part of the SeLeNe project
 We need to specify the syntax and semantics of:
• queries (for rule conditions),
• updates (for rule actions), and
• events (for rule event parts)
e.g. as fragments of FORTH RDF suite’s RQL language (and the
planned extensions to it with update facilities for SeLeNe)
Relation to SeLeNe
 George Papamarkos will implement a prototype RDF ECA rule
execution engine
 Within the SeLeNe architecture, such RDF ECA rules could be used to
materialise views and to propagate changes from source learning
objects to derived learning objects
 Also, GP will work on developing techniques for automatically
generating such ECA rules from declarative view specifications (cf.
earlier such techniques developed for relational databases)
2. The AutoMed Project
 In work with Peter McBrien, AP has developed a new framework to
support integration of heterogeneous data sources
 The theoretical foundation of the framework consists of:
• a new notion of schema equivalence
• a set of primitive schema transformations which can be
composed to define unconditional or conditional equivalences
between schemas
The AutoMed Project
 The modelling constructs of higher-level data models (e.g. relational,
object-oriented, semi-structured, XML, RDF) are specified in terms of
a low-level hypergraph data model (HDM)
 The specification of a modelling construct C automatically generates
addC, delC and renC primitive schema transformations
 add and del transformations have as an argument a query
 Composite schema transformations consist of a sequence of primitive
transformations, and allow constructs from different modelling
languages to be mixed within the same intermediate schema
Query and Data Translation
 Schema transformations set up a two-way transformation pathway
between pairs of schemas:
 From a pathway T: S → S’ we:
• compose the queries in the add steps to derive a definition of each
construct in S’ as a view over S, and
• compose the queries in the del steps to derive a definition of each
construct in S as a view over S’
 These view definitions can then be used to automatically translate data
and queries between S and S’. The process generalises to a set of local
schemas being integrated into a global schema
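A rough sketch of how a pathway yields views in both directions, in Python. This is not the AutoMed API: Step, views and reverse are hypothetical names, queries are kept as opaque strings, and the substitution needed to fully compose queries across intermediate constructs is left out. The reversal rule (add and del swap, ren swaps its two names) is an assumption not stated on the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    kind: str       # "add", "del" or "ren"
    construct: str  # construct being added, deleted or renamed
    arg: str        # query for add/del, new name for ren


def views(pathway):
    """Collect add-step queries (S' over S) and del-step queries (S over S')."""
    forward = {s.construct: s.arg for s in pathway if s.kind == "add"}
    backward = {s.construct: s.arg for s in pathway if s.kind == "del"}
    return forward, backward


def reverse(pathway):
    """Derive the pathway from S' back to S (assumed reversal rule, see above)."""
    swap = {"add": "del", "del": "add"}
    out = []
    for s in reversed(pathway):
        if s.kind == "ren":
            out.append(Step("ren", s.arg, s.construct))
        else:
            out.append(Step(swap[s.kind], s.construct, s.arg))
    return out


# Hypothetical pathway merging two constructs of S into one construct of S'.
pathway = [
    Step("add", "person", "student union staff"),
    Step("del", "student", "person where role = 'student'"),
    Step("del", "staff", "person where role = 'staff'"),
]
forward, backward = views(pathway)
```

Here forward defines person over S, while backward defines student and staff over S’; reversing the pathway simply exchanges the two dictionaries.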
Both-As-View integration
 Our schema transformation pathways capture at least the information
available from global-as-view (GAV) or local-as-view (LAV)
 We discuss this in a forthcoming paper (ICDE’03) and term our
integration approach both-as-view (BAV)
 Unlike GAV and LAV, our framework readily supports the evolution
of both local and global schemas (CAiSE’02, ICDE’03)
Unstructured Text Sources
 As well as integrating structured and semi-structured data sources, we
are also working on extracting structure from unstructured text
sources – Dean Williams
 We are using existing IE technology (the GATE tool from Sheffield)
for text annotation. Natural language and domain ontologies will
extend these annotations.
 The extracted information will be matched with existing structured
information to derive new facts and perhaps new global schema
constructs
Materialised integration
 Finally, as well as virtual integration of data sources, we are also
investigating using the AutoMed framework for materialised data
integration i.e. a data warehousing approach
 In particular, we are looking at incremental view maintenance and data
lineage tracing using the AutoMed schema transformation pathways –
Hao Fan