* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Download Document
Survey
Document related concepts
Transcript
Language Extensions for the Automation of Database Schema Evolution G. Papastefanatos1, P. Vassiliadis2, A. Simitsis3, K. Aggistalis2 , F. Pechlivani2, Yannis Vassiliou1 (1) National Technical University of Athens {gpapas,yv}@dbnet.ece.ntua.gr (2) University of Ioannina {pvassil,kostazz,fpechliv}@cs.uoi.gr (3) IBM Almaden Research Center [email protected] Outline • Motivation • Graph-based modeling • A language for policy annotation • Evaluation • Conclusions ICEIS'08, Barcelona, June 2008 2 Outline • Motivation • Graph-based modeling • A language for policy annotation • Evaluation • Conclusions ICEIS'08, Barcelona, June 2008 3 Data Warehouse Environment Act3 Act4 Act2 Act5 ICEIS'08, Barcelona, June 2008 WWW Act1 4 Motivation Act3 Act4 Act2 Act5 ICEIS'08, Barcelona, June 2008 WWW Act1 5 Database Schema Evolution • Current database systems are continuously evolving environments, where design constructs are – Added – Removed – Modified • Evolution is not handled by current DBMS with an automatic way – Syntactic as well as semantic adaptation of queries and views is a time-consuming task, treated in most of the cases manually • Evolution-driven database design is missing ICEIS'08, Barcelona, June 2008 6 Evolution Effects • SW artifacts around the DBMS are affected: – Syntactically – i.e., become invalid – Semantically – i.e., query must conform to the new source database semantics • Adaptation of activities, queries and views – time-consuming task – treated in most of the cases manually by the administrators/developers ICEIS'08, Barcelona, June 2008 7 We would like to know… • What part of the process is affected and how if e.g., an attribute is deleted? • Can we predict the impact of changes? • To what extent can readjustment be automated? • Can we perform what-if analysis for potential changes of source configurations? ICEIS'08, Barcelona, June 2008 8 Outline • Motivation • Graph-based modeling • A language for policy annotation • Evaluation • Conclusions ICEIS'08, Barcelona, June 2008 9 Database Schema Evolution – Our approach Graph based representation of database constructs (i.e., relations, views, constraints, queries) Annotation of graph with rules for adapting queries to database schema evolution Mechanism for performing what-if analysis for potential changes of database configurations ICEIS'08, Barcelona, June 2008 Evolving applications Evolving databases Database Schema Queries Graph-based modeling for uniform representation Rules for Handling Evolution Metrics for Evaluating Evolution Design 10 Graph based representation SELECT EMP.Emp#, Sum(WORKS.Hours) as T_Hours FROM EMP, WORKS WHERE EMP.Emp# = WORKS.Emp# GROUP BY EMP.Emp# ICEIS'08, Barcelona, June 2008 11 Annotating the graph with adaptation rules 1 Set of evolving database constructs: • relations • attributes 3 • constraints Set of reaction policies: • propagate • block • prompt 2 Set of potential evolution changes: • addition • deletion • modification ICEIS'08, Barcelona, June 2008 12 Query Adaptation - Example Q: SELECT EID, Name FROM EMP Q: SELECT EID, Name, Phone FROM EMP Annotated Query Graph Event ON attribute addition TO EMP THEN propagate map-select S Q EID S EMP S map-select … S Q map-select Name ON attribute addition TO EMP THEN propagate Add attribute Phone to relation EMP S EID Transformed Query Graph Name EID S S EID S map-select Name map-select Phone ICEIS'08, Barcelona, June 2008 EMP S … S Name Phone 13 Contributions • Based on – a graph-based model for database constructs (relational tables, views, database constraints) and SQL queries • we present – a mechanism for the annotation of the graph’s constructs with elements that facilitate what-if analysis and predetermine the reaction to evolution events occurring in the database schema – SQL extensions that enable the implementation of our approach for evolution management • Case study ICEIS'08, Barcelona, June 2008 14 Outline • Motivation • Graph-based modeling • A language for policy annotation • Evaluation • Conclusions ICEIS'08, Barcelona, June 2008 15 Annotation of graph constructs with policies ON <event> TO <element> THEN <policy> {delete update insert} • {relation, query, view, attribute constraint, condition} • specific nodes {block propagate prompt} ICEIS'08, Barcelona, June 2008 16 Global constraints ON DELETE NODE THEN PROPAGATE ON DELETE ATTRIBUTE THEN PROMPT Example: CREATE DATABASE company ON DELETE ATTRIBUTE THEN PROMPT ICEIS'08, Barcelona, June 2008 17 Top – level constructs Example for relations: CREATE TABLE works (EMP# NUMBER(3), PROJ# NUMBER(3), HOURS NUMBER(5), ON Add Attribute TO works THEN propagate) This statement for relation works allows the addition of attributes and propagates this addition to all queries and views accessing this relation ICEIS'08, Barcelona, June 2008 18 Top – level constructs Example for views: CREATE VIEW emps-prjs AS SELECT E.Emp#, E.Name, P.Projname FROM Emp E,Works W,Proj P WHERE E.EMP#=W.EMP# AND W.Proj#=P.Proj# ON Modify Condition TO emps-prjs THEN block This syntax blocks changes in the WHERE clause of the view definition for view emps-prjs ICEIS'08, Barcelona, June 2008 19 Top – level constructs Example for queries: Q: SELECT EP.Emp#, EP.Name FROM emps-prjs EP WHERE EP.PRJNAME = ‘Olympic Games’ ON Add Attribute TO emps-prjs THEN block The above syntax blocks the inclusion of added attributes in the underlying view emps-prjs in the select clause of the query. ICEIS'08, Barcelona, June 2008 20 Specific node annotation CREATE TABLE emp (EMP# NUMBER(3), Name Varchar2(150), ... , ON Delete Attribute TO Name THEN block) Q: SELECT E.Emp#, E.Name, P.Projname FROM Emp E,Works W,Proj P WHERE E.EMP#=W.EMP# AND W.Proj#=P.Proj# ON Delete Attribute TO Name THEN propagate ICEIS'08, Barcelona, June 2008 21 Constraints CREATE TABLE emp (EMP# NUMBER(3), Name Varchar2(150), Constraint EMP.PK PRIMARY KEY (EMP#), ON Modify Constraint TO EMP.PK THEN propagate) ICEIS'08, Barcelona, June 2008 22 Conditions CREATE CONDITION <condition> AS <expression> CREATE CONDITION Emp_Age_Cond AS AGE>50 CREATE CONDITION Works_Emp_FK AS WORKS.EMP# IN EMP.EMP# CREATE CONDITION Works_Emp_J AS WORKS.EMP#=EMP.EMP# • Then, a query SELECT * FROM EMP WHERE AGE_COND would simply use the condition as a macro. • Parametric conditions, to allow referring to aliases in SQL queries are straightforward. • Reverse eng. of existing code: automatic condition names can be assigned to all the queries. ICEIS'08, Barcelona, June 2008 23 Outline • Motivation • Graph-based modeling • A language for policy annotation • Evaluation • Conclusions ICEIS'08, Barcelona, June 2008 24 Testbed • A data mart of an agency for the Greek public sector • 6 months of monitoring • 52 “queries” (in reports, scripts) over 18 relations • When converted to our graph structure: 2500 nodes ICEIS'08, Barcelona, June 2008 25 Distribution of policies over the annotated nodes Event # of nodes Propagate Block Rename relations Add attributes Delete Attributes 18 64 0 13 1608 92 Rename Attributes 1615 85 Domain Modification 1690 10 0 21 4995 221 Condition Modification Total Annotations ICEIS'08, Barcelona, June 2008 26 Global policy and individual annotations # of operations Scope Query scope Relation Scope Database Scope ICEIS'08, Barcelona, June 2008 Annotations Policy Clauses 486 9 5180 293 36 2 27 Outline • Motivation • Graph-based modeling • A language for policy annotation • Evaluation • Conclusions ICEIS'08, Barcelona, June 2008 28 Conclusions • Coherent framework for propagating potential changes of the database to all the software artifacts around the database based on a model-policy-event scheme • Extension to the SQL language specifically tailored for the management of evolution. • Limited overhead imposed on both the system and the humans, who design and maintain it. • Tested in a real-world scenario of the Greek public sector. ICEIS'08, Barcelona, June 2008 29 On-going/Future Work • Hecataeus: A tool for visualizing and performing what-if analysis for several evolution scenarios. • Patterns of evolution sequences ICEIS'08, Barcelona, June 2008 30 Acknowledgment Information dissemination of this work was supported by the European Union in the framework of the project “Support of Computer Science Studies in the University of Ioannina” of the “Operational Program for Education and Initial Vocational Training” of the 3rd Community Support Framework of the Hellenic Ministry of Education, funded by national sources and by the European Social Fund (ESF). ICEIS'08, Barcelona, June 2008 31 Gracias! ICEIS'08, Barcelona, June 2008 Sources: http://en.wikipedia.org/wiki/Image:Barcelona_-_planol_ciutat_vella_1860.jpg http://maps.google.com 32 Questions? http://www.cs.uoi.gr/~pvassil/projects/architecture_graph/ ICEIS'08, Barcelona, June 2008 33 Algorithm Propagate changeS (DaWaK 2007) Input: an ETL summary S over a graph Go=(Vo,Eo) and an event e Output: a graph Gn=(Vn,En) Variables: a set of events E, and an affected node A Begin dps(S, Go, Gn, {e}, A) End ICEIS'08, Barcelona, June 2008 dps(S, Gn, Go, E, A) { I = Ins_by_policy(affected(E)) D = Del_by_policy(affected(E)) Gn = Go – D I E = E–{e}action(affected(E)) if consumer(A)nil for each consumer(A) dps(S,Gn,Go,E,consumer(A)) } 34 Conflict resolution (DaWaK 2007) • Graph constructs may have contradictory policies for the same event Rule Policies defined on query graph structures are stronger than policies defined on view graph structures which in turn prevail on policies defined on relation graph structures ICEIS'08, Barcelona, June 2008 User’s Choice Query Conditions Query Module Query Attributes Query View Conditions View Module View Attributes View Relation Conditions Relation Attributes Relation Relation Module 35 Configuration of our setting (DaWaK 2007) 7 ETL processes 7 source tables Patch text fields 53 ETL activities 3 lookup tables 9 target tables Load to DSA Tmp1 S1 Add field Join Patch text fields Load to DSA S2 Add field Tmp2 Sources ICEIS'08, Barcelona, June 2008 Data Staging Area (DSA) L1 π T1 π T2 γ V3 L2 DW 36 Adapted activities per Event (DaWaK 2007) Affected Activities Per Event 80 70 60 50 Affected 40 Adapted 30 20 10 0 Rename Table Rename Attributes ICEIS'08, Barcelona, June 2008 Add Attributes Delete Attributes Modify Attributes Change PK 37 Related work • DB schema Evolution – Schema Versioning • DW schema Evolution • Materialized View Evolution • Evolution wrt Model Mappings ICEIS'08, Barcelona, June 2008 38