Download Document

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Concurrency control wikipedia , lookup

Entity–attribute–value model wikipedia , lookup

Microsoft Jet Database Engine wikipedia , lookup

Database wikipedia , lookup

Clusterpoint wikipedia , lookup

Relational algebra wikipedia , lookup

Database model wikipedia , lookup

Relational model wikipedia , lookup

Object-relational impedance mismatch wikipedia , lookup

Transcript
Language Extensions for the
Automation of Database Schema
Evolution
G. Papastefanatos1, P. Vassiliadis2, A. Simitsis3, K. Aggistalis2 , F.
Pechlivani2, Yannis Vassiliou1
(1) National Technical University of Athens
{gpapas,yv}@dbnet.ece.ntua.gr
(2) University of Ioannina
{pvassil,kostazz,fpechliv}@cs.uoi.gr
(3) IBM Almaden Research Center
[email protected]
Outline
• Motivation
• Graph-based modeling
• A language for policy annotation
• Evaluation
• Conclusions
ICEIS'08, Barcelona, June 2008
2
Outline
• Motivation
• Graph-based modeling
• A language for policy annotation
• Evaluation
• Conclusions
ICEIS'08, Barcelona, June 2008
3
Data Warehouse Environment
Act3
Act4
Act2
Act5
ICEIS'08, Barcelona, June 2008
WWW
Act1
4
Motivation
Act3
Act4
Act2
Act5
ICEIS'08, Barcelona, June 2008
WWW
Act1
5
Database Schema Evolution
• Current database systems are continuously evolving
environments, where design constructs are
– Added
– Removed
– Modified
• Evolution is not handled by current DBMS with an
automatic way
– Syntactic as well as semantic adaptation of queries and
views is a time-consuming task, treated in most of the cases
manually
• Evolution-driven database design is missing
ICEIS'08, Barcelona, June 2008
6
Evolution Effects
• SW artifacts around the DBMS are affected:
– Syntactically – i.e., become invalid
– Semantically – i.e., query must conform to the new
source database semantics
• Adaptation of activities, queries and views
– time-consuming task
– treated in most of the cases manually by the
administrators/developers
ICEIS'08, Barcelona, June 2008
7
We would like to know…
• What part of the process is affected and how
if e.g., an attribute is deleted?
• Can we predict the impact of changes?
• To what extent can readjustment be
automated?
• Can we perform what-if analysis for potential
changes of source configurations?
ICEIS'08, Barcelona, June 2008
8
Outline
• Motivation
• Graph-based modeling
• A language for policy annotation
• Evaluation
• Conclusions
ICEIS'08, Barcelona, June 2008
9
Database Schema Evolution – Our
approach
Graph based representation
of database constructs (i.e.,
relations, views, constraints,
queries)
Annotation of graph with
rules for adapting queries to
database schema evolution
Mechanism for performing
what-if analysis for
potential changes of
database configurations
ICEIS'08, Barcelona, June 2008
Evolving
applications
Evolving
databases
Database
Schema
Queries
Graph-based
modeling
for uniform
representation
Rules for
Handling Evolution
Metrics for
Evaluating Evolution
Design
10
Graph based representation
SELECT EMP.Emp#, Sum(WORKS.Hours)
as T_Hours
FROM EMP, WORKS
WHERE EMP.Emp# = WORKS.Emp#
GROUP BY EMP.Emp#
ICEIS'08, Barcelona, June 2008
11
Annotating the graph with adaptation
rules
1
Set of evolving database constructs:
• relations
• attributes
3
• constraints
Set of reaction policies:
• propagate
• block
• prompt
2 Set of potential evolution changes:
• addition
• deletion
• modification
ICEIS'08, Barcelona, June 2008
12
Query Adaptation - Example
Q: SELECT EID, Name
FROM EMP
Q: SELECT EID, Name, Phone
FROM EMP
Annotated Query Graph
Event
ON attribute addition
TO EMP
THEN propagate
map-select
S
Q
EID
S
EMP
S
map-select
…
S
Q
map-select
Name
ON attribute addition
TO EMP
THEN propagate
Add attribute Phone to
relation EMP
S
EID
Transformed Query Graph
Name
EID
S
S
EID
S
map-select
Name
map-select
Phone
ICEIS'08, Barcelona, June 2008
EMP
S
…
S
Name
Phone
13
Contributions
• Based on
– a graph-based model for database constructs (relational
tables, views, database constraints) and SQL queries
• we present
– a mechanism for the annotation of the graph’s constructs
with elements that facilitate what-if analysis and
predetermine the reaction to evolution events occurring in
the database schema
– SQL extensions that enable the implementation of our
approach for evolution management
• Case study
ICEIS'08, Barcelona, June 2008
14
Outline
• Motivation
• Graph-based modeling
• A language for policy annotation
• Evaluation
• Conclusions
ICEIS'08, Barcelona, June 2008
15
Annotation of graph constructs with
policies
ON <event> TO <element> THEN <policy>
{delete update insert}
• {relation, query,
view, attribute
constraint,
condition}
• specific nodes
{block propagate prompt}
ICEIS'08, Barcelona, June 2008
16
Global constraints
ON DELETE NODE THEN PROPAGATE
ON DELETE ATTRIBUTE THEN PROMPT
Example:
CREATE DATABASE company
ON DELETE ATTRIBUTE THEN PROMPT
ICEIS'08, Barcelona, June 2008
17
Top – level constructs
Example for relations:
CREATE TABLE works (EMP# NUMBER(3), PROJ#
NUMBER(3), HOURS NUMBER(5),
ON Add Attribute TO works THEN propagate)
This statement for relation works allows the addition
of attributes and propagates this addition to all
queries and views accessing this relation
ICEIS'08, Barcelona, June 2008
18
Top – level constructs
Example for views:
CREATE VIEW emps-prjs AS
SELECT E.Emp#, E.Name, P.Projname
FROM Emp E,Works W,Proj P
WHERE E.EMP#=W.EMP# AND W.Proj#=P.Proj#
ON Modify Condition TO emps-prjs THEN
block
This syntax blocks changes in the WHERE clause of
the view definition for view emps-prjs
ICEIS'08, Barcelona, June 2008
19
Top – level constructs
Example for queries:
Q:
SELECT EP.Emp#, EP.Name
FROM emps-prjs EP
WHERE EP.PRJNAME = ‘Olympic Games’
ON Add Attribute TO emps-prjs THEN
block
The above syntax blocks the inclusion of added
attributes in the underlying view emps-prjs in the
select clause of the query.
ICEIS'08, Barcelona, June 2008
20
Specific node annotation
CREATE TABLE emp
(EMP# NUMBER(3),
Name Varchar2(150),
... ,
ON Delete Attribute TO Name THEN block)
Q: SELECT E.Emp#, E.Name, P.Projname
FROM Emp E,Works W,Proj P
WHERE E.EMP#=W.EMP# AND W.Proj#=P.Proj#
ON Delete Attribute TO Name THEN propagate
ICEIS'08, Barcelona, June 2008
21
Constraints
CREATE TABLE emp
(EMP# NUMBER(3),
Name Varchar2(150),
Constraint EMP.PK PRIMARY KEY (EMP#),
ON Modify Constraint TO EMP.PK THEN propagate)
ICEIS'08, Barcelona, June 2008
22
Conditions
CREATE CONDITION <condition> AS <expression>
CREATE CONDITION Emp_Age_Cond AS AGE>50
CREATE CONDITION Works_Emp_FK AS WORKS.EMP# IN
EMP.EMP#
CREATE CONDITION Works_Emp_J AS
WORKS.EMP#=EMP.EMP#
• Then, a query SELECT * FROM EMP WHERE AGE_COND
would simply use the condition as a macro.
• Parametric conditions, to allow referring to aliases in SQL
queries are straightforward.
• Reverse eng. of existing code: automatic condition names can be
assigned to all the queries.
ICEIS'08, Barcelona, June 2008
23
Outline
• Motivation
• Graph-based modeling
• A language for policy annotation
• Evaluation
• Conclusions
ICEIS'08, Barcelona, June 2008
24
Testbed
• A data mart of an agency for the Greek
public sector
• 6 months of monitoring
• 52 “queries” (in reports, scripts) over 18
relations
• When converted to our graph structure:
2500 nodes
ICEIS'08, Barcelona, June 2008
25
Distribution of policies over the
annotated nodes
Event
# of nodes
Propagate
Block
Rename relations
Add attributes
Delete Attributes
18
64
0
13
1608
92
Rename Attributes
1615
85
Domain Modification
1690
10
0
21
4995
221
Condition Modification
Total Annotations
ICEIS'08, Barcelona, June 2008
26
Global policy and individual
annotations
# of operations
Scope
Query scope
Relation
Scope
Database
Scope
ICEIS'08, Barcelona, June 2008
Annotations
Policy Clauses
486
9
5180
293
36
2
27
Outline
• Motivation
• Graph-based modeling
• A language for policy annotation
• Evaluation
• Conclusions
ICEIS'08, Barcelona, June 2008
28
Conclusions
• Coherent framework for propagating potential
changes of the database to all the software artifacts
around the database based on a model-policy-event
scheme
• Extension to the SQL language specifically tailored
for the management of evolution.
• Limited overhead imposed on both the system and
the humans, who design and maintain it.
• Tested in a real-world scenario of the Greek public
sector.
ICEIS'08, Barcelona, June 2008
29
On-going/Future Work
• Hecataeus: A tool for visualizing
and performing what-if analysis
for several evolution scenarios.
• Patterns of evolution sequences
ICEIS'08, Barcelona, June 2008
30
Acknowledgment
Information dissemination of this work was supported
by the European Union in the framework of the project
“Support of Computer Science Studies in the University
of Ioannina” of the “Operational Program for Education
and Initial Vocational Training” of the 3rd Community
Support Framework of the Hellenic Ministry of
Education, funded by national sources and by the
European Social Fund (ESF).
ICEIS'08, Barcelona, June 2008
31
Gracias!
ICEIS'08,
Barcelona, June 2008
Sources: http://en.wikipedia.org/wiki/Image:Barcelona_-_planol_ciutat_vella_1860.jpg
http://maps.google.com
32
Questions?
http://www.cs.uoi.gr/~pvassil/projects/architecture_graph/
ICEIS'08, Barcelona, June 2008
33
Algorithm Propagate changeS
(DaWaK 2007)
Input: an ETL summary S over a
graph Go=(Vo,Eo) and an event e
Output: a graph Gn=(Vn,En)
Variables: a set of events E, and an
affected node A
Begin
dps(S, Go, Gn, {e}, A)
End
ICEIS'08, Barcelona, June 2008
dps(S, Gn, Go, E, A) {
I = Ins_by_policy(affected(E))
D = Del_by_policy(affected(E))
Gn = Go – D  I
E = E–{e}action(affected(E))
if consumer(A)nil
for each consumer(A)
dps(S,Gn,Go,E,consumer(A))
}
34
Conflict resolution (DaWaK 2007)
• Graph constructs may have contradictory
policies for the same event
Rule
Policies defined on query
graph structures are
stronger than policies
defined on view graph
structures which in turn
prevail on policies defined
on relation graph structures
ICEIS'08, Barcelona, June 2008
User’s
Choice
Query
Conditions
Query
Module
Query
Attributes
Query
View
Conditions
View
Module
View
Attributes
View
Relation
Conditions
Relation
Attributes
Relation
Relation
Module
35
Configuration of our setting
(DaWaK 2007)
7 ETL processes
7 source tables
Patch text
fields
53 ETL activities
3 lookup tables
9 target tables
Load to DSA
Tmp1
S1
Add field
Join
Patch text
fields
Load to DSA
S2
Add field
Tmp2
Sources
ICEIS'08, Barcelona, June 2008
Data Staging Area (DSA)
L1
π
T1
π
T2
γ
V3
L2
DW
36
Adapted activities per Event
(DaWaK 2007)
Affected Activities Per Event
80
70
60
50
Affected
40
Adapted
30
20
10
0
Rename
Table
Rename
Attributes
ICEIS'08, Barcelona, June 2008
Add
Attributes
Delete
Attributes
Modify
Attributes
Change PK
37
Related work
• DB schema Evolution
– Schema Versioning
• DW schema Evolution
• Materialized View Evolution
• Evolution wrt Model Mappings
ICEIS'08, Barcelona, June 2008
38