1st International Workshop on Bidirectional Transformations - Bx 2012
Tallinn, March 25, 2012
PReCISE, LIBD

Bidirectional Transformations in Database Engineering
Jean-Luc Hainaut, Anthony Cleve

Reference

Objectives of the conference: a short, practical, intuitive introduction to the (bx) transformational paradigm, applied to the DB domain.

Based on:
- Hainaut, J.-L., Cleve, A., Transformational Approach to Database Engineering and Evolution, tutorial, Dagstuhl seminar 11031 on "Bx", January 2011
- Hainaut, J.-L., The Transformational Approach to Database Engineering, in Lämmel, R., Saraiva, J., Visser, J. (Eds), Generative and Transformational Techniques in Software Engineering, pp. 95-143, LNCS 4143, Springer, 2006

Contents

1. Introduction
2. Transformational engineering
3. Modeling data structures
4. Schema transformations
5. Bidirectional transformations
6. Typology of practical elementary transformations
7. Typology of practical complex transformations
8. Transformational modeling of database engineering processes
9. Conclusions and perspectives
(A. Schema transformations in CASE tools)

1. Introduction

Goal: to introduce the conference (what else would you have expected?)

"Bidirectional transformations (bx) are a mechanism for maintaining the consistency of at least two related sources of information."
"Such sources can be databases, software models, documents, graphs, and trees."

bx appear in many aspects of the data & DB realm:
- Model-Driven Software/DB Development: to compute and synchronize views of software/DB models.
- Relational Databases: to construct updatable views.
- Data Transformation, Integration, and Exchange: to map data across paradigms, merge it from multiple sources, and exchange it between sources.
- Data Synchronizers: to bridge the gap between replicas in different formats.
- Serializers: to mediate between external data (binary or sequential data representations on the wire or the file system) and structured objects in memory.
(source: GRACE Report, 2008)

Focus of this conference:
- Model-Driven Software Development: to compute and synchronize views of software models.
- Model-Driven Database Engineering: to compute and synchronize views of database schemas.

DB Engineering = { domain analysis, logical design, physical design, view derivation, optimisation, code generation, reverse engineering, maintenance, evolution, migration, integration, etc. }

Two directions of synchronization:
- Vertical synchronization: in the design abstraction artifact hierarchy, how to ensure that there is no information loss?
- Horizontal synchronization: when an artifact changes, how to resynchronize the other artifacts?

[Figure: vertically, the chain conceptual schema / logical schema / physical schema / DDL code; horizontally, each artifact and the client program evolve into a new version (conceptual schema', logical schema', physical schema', DDL code', Client program').]

2. Transformational engineering

Goal: to show a practical application of schema transformations
or: how to motivate participants at 9 o'clock, summer time?

How to derive a target schema from a source schema?
- through translation rules
- through transformations

Rule-based vs Transformation-based engineering
- Rule-based engineering: the target specification is produced following a set of translation rules.
- Transformation-based engineering: the target specification is produced by applying a chain of substitution operators to the source specification.
Rule-based view of Database Engineering

Example: producing a relational schema from a conceptual schema.

[Figure: conceptual schema (ER): entity type BOOK (ISBN, Title, Author[0-5], DatePublished; id: ISBN) linked through rel-type "of" (BOOK 0-N, COPY 1-1) to entity type COPY (CopyNbr, DatePurchased; id: of.BOOK, CopyNbr). The physical schema (MS Access) shown on the slide is not recoverable.]

Natural procedure: through translation rules.

  Conceptual schema: entity type E
  Physical constructs: table E

  Conceptual schema: level-1 multivalued atomic attribute A of entity type E
  Physical constructs:
    - table A, comprising: column A; primary key made up of A.
    - table EA, comprising: column(s) copied from the primary key of table E; column copied from the primary key of table A; primary key comprising all these columns.

  Conceptual schema: relationship type R from B (with card. 1-1) to A (with card. 0-N)
  Physical constructs: in table B, column(s) copied from the primary key of table A; foreign key to A comprising these columns; if R was part of a candidate (primary) key of B, then add these columns to the key.

OK, but what if:
- attribute A is at level 2, 3, ...?
- attribute A is not atomic?
- relationship type R is many-to-many, or one-to-one, or N-ary?
- etc.
Combinatorial explosion and complexity of the set of rules.

Transformation-based view of Database Engineering

Transforming the multivalued attribute Author

[Figure: attribute Author[0-5] of BOOK is replaced by entity type AUTHOR (AuthorName; id: AuthorName) and rel-type "write" (BOOK 0-5, AUTHOR 1-N). COPY and rel-type "of" are unchanged.]
Transforming the many-to-many relationship type write

[Figure: rel-type "write" is replaced by entity type WRITE (id: bw.BOOK, aw.AUTHOR) and two one-to-many rel-types: "bw" (BOOK 0-5, WRITE 1-1) and "aw" (AUTHOR 1-N, WRITE 1-1).]

Transforming the one-to-many relationship type aw (and the others)

[Figure: rel-types "aw", "bw" and "of" are replaced by foreign keys, giving:
  BOOK (ISBN, Title, DatePublished; id: ISBN)
  AUTHOR (AuthorName; id: AuthorName)
  COPY (ISBN, CopyNbr, DatePurchased; id: ISBN, CopyNbr; ref: ISBN)
  WRITE (AuthorName, ISBN; id: ISBN, AuthorName; ref: ISBN; ref: AuthorName)
  Constraint: no more than 5 WRITE rows per BOOK row.]

Coding (generally simple; rule-based or transformational)

[Figure: the relational schema above is translated into DDL code; the constraint "no more than 5 WRITE rows per BOOK row" is carried over.]

What if the attribute is multivalued, compound and comprises other multivalued components?

[Figure: source schema SALESMAN (PID, Name, Sales[0-N] (Year, Customer[0-N] (CustID, Volume)); id: PID; id(Sales): Year; id(Sales.Customer): CustID). Which rule produces the target?
  SALESMAN (PID, Name; id: PID)
  SALES (PID, Year; id: PID, Year; ref: PID)
  CUSTOMER (PID, Year, CustID, Volume; id: PID, Year, CustID; ref: PID, Year)]
Step 1: transform the attribute Sales into an entity type.

[Figure: SALESMAN (PID, Name; id: PID) linked through rel-type "for" (SALESMAN 0-N, SALES 1-1) to SALES (Year, Customer[0-N] (CustID, Volume); id: for.SALESMAN, Year; id(Customer): CustID).]

Note: this is a slightly different variant of the transformation of an attribute into an entity type.

Step 2: transform the attribute Customer into an entity type.

[Figure: SALES (Year; id: for.SALESMAN, Year) linked through rel-type "to" (SALES 0-N, CUSTOMER 1-1) to CUSTOMER (CustID, Volume; id: to.SALES, CustID).]

Step 3: transform the rel-types "for" and "to" into foreign keys.

[Figure: SALESMAN (PID, Name; id: PID); SALES (PID, Year; id: PID, Year; ref: PID); CUSTOMER (PID, Year, CustID, Volume; id: PID, Year, CustID; ref: PID, Year).]

Observations
- no new operators
- iterative application of known operators
- compositional property of transformations (the composition of two transformations is still a transformation)
- no combinatorial explosion, just the right (small) set of operators
- need for meta-rules for applying the operators (a transformation plan)

What now?
Questions
- We need to represent schemas in a great variety of models (GER = Generic ER model).
- What is a transformation and how to specify it?
- Does a transformation preserve the information contents of a schema?
- Let us be more concrete: what about PRACTICAL transformations?
- How do transformations help in REAL database engineering processes?

3. Modeling data structures

Goal: to define a common formalism to specify data structures in different data models
or: how to tidy the data model Babel tower?

Dealing with multiple models

A typical organization uses N different data models. E.g., it
- commonly uses DB2 databases,
- also uses a legacy IDMS database,
- writes its conceptual schemas in the ER model,
- quite often transfers data between databases,
- exchanges data with its environment,
- standardizes on XML format,
- plans to migrate some databases to other platforms,
- prepares the development of a data warehouse,
- studies the feasibility of merging several departments (and their information systems),
- etc.

[Figure: data flows in the organization: conceptual schema and application program design; operational data; migration; ETL towards the data warehouse; extract & export to XML; import of XML from the environment.]

Considering all the inter-model and intra-model conversions, the organization requires N x N different mappings (= 16 for N = 4).

[Figure: the Relational, ER, CODASYL and XML models connected pairwise; mappings shown include Srel>er, Srel>rel, Ser>er, Ser>rel, Srel>cod, Ser>xml, Scod>rel, Sxml>er, Scod>xml, Sxml>xml, Scod>cod, Sxml>cod.]

The usual answer: introducing a pivot model. Considering all the inter-model and intra-model conversions, the organization then requires 2 x N + 1 different mappings (= 9 for N = 4).

[Figure: the Pivot model at the centre, with mappings Sp>p, Srel>p, Sp>rel, Ser>p, Sp>er, Scod>p, Sp>cod, Sxml>p, Sp>xml to and from the Relational, ER, CODASYL and XML models.]
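The two mapping counts quoted for N = 4 can be checked by enumerating the mappings. The sketch below is only an illustration: the function names are hypothetical, and the mapping labels simply imitate the S<source>><target> notation of the figures.

```python
from itertools import product

# The four models named in the figures (abbreviations as on the slides).
MODELS = ["rel", "er", "cod", "xml"]


def direct_mappings(models):
    """All inter-model and intra-model mappings when no pivot is used (N x N)."""
    return [f"S{src}>{dst}" for src, dst in product(models, repeat=2)]


def pivot_mappings(models, pivot="p"):
    """Mappings to the pivot, from the pivot, plus pivot-to-pivot (2 x N + 1)."""
    to_pivot = [f"S{m}>{pivot}" for m in models]
    from_pivot = [f"S{pivot}>{m}" for m in models]
    return to_pivot + from_pivot + [f"S{pivot}>{pivot}"]


if __name__ == "__main__":
    print(len(direct_mappings(MODELS)))  # 16
    print(len(pivot_mappings(MODELS)))   # 9
```

The gap widens quickly: each additional model adds 2N + 1 direct mappings but only 2 pivot mappings.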
Example: relational logical design.

[Figure: Logical design maps a conceptual schema to a logical schema. With a pivot model it decomposes into Ser>p (ER model to pivot, very simple), Sp>p (within the pivot model, complex) and Sp>rel (pivot to relational model, very simple).]

GER: the Generic Entity-Relationship model
- A large-spectrum data structure model.
- Encompasses several paradigms: ER, UML, SQL, CODASYL, IMS, file structures, XML, etc.
- Encompasses several levels of abstraction: conceptual, logical, physical, external.
- Chosen as the pivot model in this tutorial (Pivot Model = GER Model).

GER: conceptual schema fragment (1)

[Figure, with legend "entity type", "attribute", "is-a", "all-attribute ID", "relationship type", "role (with cardinality)", "hybrid ID": PERSON (Name, Address) with subtypes (T) CUSTOMER (Customer ID; id: Customer ID) and EMPLOYEE (Employe Nbr, Date Hired; id: Employe Nbr); rel-type "of" (CUSTOMER 0-N, ACCOUNT 1-1); ACCOUNT (Account NBR, Amount; id: of.CUSTOMER, Account NBR).]

GER: conceptual schema fragment (2)

[Figure, with legend "multivalued attribute", "optional attribute", "compound attribute", "N-ary relationship type": SALESMAN (PID, Name, Phone[0-5], Mobile[0-1], Address (Street, City)); rel-type "sold" (Date, Volume) between SALESMAN, CUSTOMER and PRODUCT, each with cardinality 0-N.]

GER: logical schema fragment

[Figure, with legend "record set / table", "array", "multivalued field", "foreign key": ORDER (ORD-ID, DATE_RECEIVED, ORIGIN, DETAIL[1-5] array (REFERENCE, QTY-ORD); id: ORD-ID; ref: ORIGIN) referencing CUSTOMER (CUSTOMER ID; id: CUSTOMER ID).]

GER: physical schema fragment (RDB)

[Figure, with legend "unique index", "index", "storage space": PRODUCT (PRO_CODE, CATEGORY, DESCRIPTION, UNIT_PRICE; id: PRO_CODE, acc; acc: CATEGORY), stored in storage space PRODUCT.DAT.]

4. Schema transformations

Goal: to define more precisely the nature of schema transformation
or: actually, what is a transformation?
A transformation T replaces a construct C in a schema S1 with another construct C', leading to schema S2.

[Figure: at the schema level, T maps C in S1 to C' in S2.]

If the schema describes actual data, the transformation should also tell how to convert the data (t).

[Figure: at the schema level, T maps C to C'; at the data level, t maps c to c'.]

A transformation S is defined by two mappings T and t:

  S = <T,t>
  C' = T(C)     (c is an instance of C)
  c' = t(c)     (c' is an instance of C')

  T: structural mapping = syntax of S
  t: instance mapping = semantics of S

Mapping T can be specified with two predicates:
  P: minimal pre-condition
  Q: maximal post-condition
  S = <T,t> = <P,Q,t>

Expressing structural predicates through any logic-based language
- relational (more concise, a name denotes an object):
    entity-type(E): there exists an entity type with name E
- object-based (more general, a name is a property of an object):
    entity-type(e): there exists an entity type denoted by e
    name(e,E): the name of e is E
- must allow specification AND reasoning (e.g., DL)

Expressing structural predicates: intuitive example
  entity-type(E): there exists an entity type with name E
  attribute(O,A,m,M,T): object (with name) O has an attribute with name A, cardinality m-M and type T
  id(O,Cp): object (with name) O has an identifier comprising components Cp
  rel-type(R): there exists a rel-type with name R
  role(R,r,E,m,M): rel-type R has a role with name r, played by E, with cardinality m-M

Specifying an entity type:
  entity-type(CUSTOMER)
  attribute(CUSTOMER,Cust#,1,1,integer)
  attribute(CUSTOMER,Name,1,1,string)
  attribute(CUSTOMER,Phone,0,5,string)
  id(CUSTOMER,{Cust#})
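To make the <P,Q,t> definition concrete, here is a minimal sketch that encodes the CUSTOMER specification as a set of facts and applies one transformation to it: turning the multivalued attribute Phone into an entity type PHONE linked through a rel-type has. The tuple encoding, the function names and the sample data are assumptions made for this illustration; they are not the notation of the GER or of any tool.

```python
# Facts use the predicate names of the slides; encoding them as Python
# tuples is an assumption of this sketch.
CUSTOMER_SCHEMA = frozenset({
    ("entity-type", "CUSTOMER"),
    ("attribute", "CUSTOMER", "Cust#", 1, 1, "integer"),
    ("attribute", "CUSTOMER", "Name", 1, 1, "string"),
    ("attribute", "CUSTOMER", "Phone", 0, 5, "string"),
    ("id", "CUSTOMER", ("Cust#",)),
})


def precondition(schema, entity, attr):
    """P: `entity` exists and owns a multivalued attribute `attr`."""
    return ("entity-type", entity) in schema and any(
        f[0] == "attribute" and f[1] == entity and f[2] == attr
        and (f[4] == "N" or f[4] > 1)
        for f in schema
    )


def attribute_to_entity_type(schema, entity, attr, new_entity, rel):
    """T: structural mapping; the returned facts play the role of Q."""
    if not precondition(schema, entity, attr):
        raise ValueError("pre-condition P is not satisfied")
    old = next(f for f in schema
               if f[0] == "attribute" and f[1] == entity and f[2] == attr)
    _, _, _, min_card, max_card, attr_type = old
    return (schema - {old}) | {
        ("entity-type", new_entity),
        ("attribute", new_entity, attr, 1, 1, attr_type),
        ("id", new_entity, (attr,)),
        ("rel-type", rel),
        ("role", rel, "", entity, min_card, max_card),
        ("role", rel, "", new_entity, 1, "N"),
    }


def t(customers):
    """t: instance mapping from CUSTOMER rows to (CUSTOMER, PHONE, has)."""
    new_customers = [{k: v for k, v in c.items() if k != "Phone"}
                     for c in customers]
    phones = {p for c in customers for p in c["Phone"]}
    has = {(c["Cust#"], p) for c in customers for p in c["Phone"]}
    return new_customers, phones, has


def t_inverse(new_customers, phones, has):
    """Inverse instance mapping: fold the `has` links back into Phone."""
    return [dict(c, Phone={p for (cust, p) in has if cust == c["Cust#"]})
            for c in new_customers]


if __name__ == "__main__":
    q = attribute_to_entity_type(CUSTOMER_SCHEMA, "CUSTOMER", "Phone", "PHONE", "has")
    print(sorted(f for f in q if f[0] == "role"))
    data = [{"Cust#": 1, "Name": "Ann", "Phone": {"111", "222"}},
            {"Cust#": 2, "Name": "Bob", "Phone": set()}]
    print(t_inverse(*t(data)) == data)  # True: no customer data is lost
```

Keeping T (on facts) and t (on rows) as separate functions mirrors the distinction between the syntax and the semantics of a transformation.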
Practically, a structural predicate can be defined graphically:

  entity-type(CUSTOMER)
  attribute(CUSTOMER,Cust#,1,1,integer)
  attribute(CUSTOMER,Name,1,1,string)
  attribute(CUSTOMER,Phone,0,5,string)
  id(CUSTOMER,{Cust#})

  = [Figure: entity type CUSTOMER (Cust#, Name, Phone[0-5]; id: Cust#)]

The structural mapping of a transformation can be defined graphically as well:

  P = entity-type(CUSTOMER)
      attribute(CUSTOMER,Cust#,1,1,integer)
      attribute(CUSTOMER,Name,1,1,string)
      attribute(CUSTOMER,Phone,0,5,string)
      id(CUSTOMER,{Cust#})

    = [Figure: CUSTOMER (Cust#, Name, Phone[0-5]; id: Cust#)]

  Q = entity-type(CUSTOMER)
      attribute(CUSTOMER,Cust#,1,1,integer)
      attribute(CUSTOMER,Name,1,1,string)
      id(CUSTOMER,{Cust#})
      entity-type(PHONE)
      attribute(PHONE,Phone,1,1,string)
      id(PHONE,{Phone})
      rel-type(has)
      role(has,,CUSTOMER,0,5)
      role(has,,PHONE,1,N)

    = [Figure: CUSTOMER (Cust#, Name; id: Cust#) linked through rel-type "has" (CUSTOMER 0-5, PHONE 1-N) to PHONE (Phone; id: Phone)]

From now on, P and Q are given as schema diagrams only.

Inverse transformation

  S2 = S1^-1 iff ∀C: P1(C) ⇒ C = T2(T1(C))

[Figure: T1 maps CUSTOMER (Cust#, Name, Phone[0-5]; id: Cust#) to CUSTOMER (Cust#, Name; id: Cust#) linked through "has" (0-5 / 1-N) to PHONE (Phone; id: Phone); T2 maps the latter back to the former.]

Intuitively, S2 undoes the effect of S1 at the structural level.

5. Bidirectional transformations

Goal: to study the semantics preservation of transformations
or: at last, could we talk about bx?

A transformation can ...
- augment the information contents of the schema
- preserve the information contents of the schema
- decrease the information contents of the schema

[Figures: variants of the CUSTOMER schema illustrating the three cases. As far as the extracted figure can be read: adding attribute Phone to CUSTOMER (Cust#, Name, Address) augments; replacing attribute Phone by entity type PHONE (Phone; id: Phone) linked through rel-type "has" preserves; dropping attribute Address from CUSTOMER (Cust#, Name, Address, Phone) decreases.]

A transformation can be ...
- not reversible: not semantics-preserving
- reversible: "one-way" semantics-preserving
- symmetrically reversible: fully semantics-preserving, or bidirectional

Examples

  not reversible:
    P: R(A,B,C)
    Q: R1(A,B); R2(A,C)

  reversible (Fagin's theorem):
    P: R(A,B,C); A →→ B|C
    Q: R1(A,B); R2(A,C)

  symmetrically reversible (bx):
    P: R(A,B,C); A →→ B|C
    Q: R1(A,B); R2(A,C); R1[A] = R2[A]

Reversible transformation

A transformation is reversible if there is an inverse mapping for instances as well.

  S1 is reversible iff there exists S2 = S1^-1 such that:
    ∀C: P(C) ⇒ C = T2(T1(C)) ∧ ∀c ∈ inst(C): c = t2(t1(c))

Symmetrically reversible (or bx) transformation

  S is symmetrically reversible iff both S and S^-1 are reversible.
    S    = <P,Q,t>
    S^-1 = <Q,P,t'>

SR-transformations are the most desirable operators:
- they preserve the information contents of the source schema;
- they are semantics-preserving.

Can we formally prove that a transformation is semantics-preserving, i.e., that it is SR? Yes, definitely. See:
Hainaut, J.-L. (2006) The Transformational Approach to Database Engineering, in Generative and Transformational Techniques in Software Engineering, LNCS, Volume 4143, pages 95-143, Springer-Verlag.
http://www.info.fundp.ac.be/~dbm/Documents/Publications-LIBD/Book-chapters/GTTSE-JLHLNCS-final-revised.pdf

6. Typology of practical transformations: elementary transformations

Goal: to look at some representative elementary transformations
or: could you give us some examples?
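The three decomposition cases of R(A,B,C) into R1(A,B) and R2(A,C) can be tried on small instances. The sketch below is an illustration only: relations are modelled as Python sets of tuples, the function names are hypothetical, and the constraint of the symmetrically reversible case is read as equality of the projections of R1 and R2 on A.

```python
def project(rows, columns):
    """Projection of a relation (a set of tuples) on the given column positions."""
    return {tuple(row[i] for i in columns) for row in rows}


def decompose(r):
    """Instance mapping t: R(A,B,C) -> (R1(A,B), R2(A,C))."""
    return project(r, (0, 1)), project(r, (0, 2))


def join(r1, r2):
    """Candidate inverse mapping: natural join of R1 and R2 on A."""
    return {(a, b, c) for (a, b) in r1 for (a2, c) in r2 if a == a2}


def satisfies_mvd(r):
    """A ->> B|C: for any two rows sharing A, the row mixing their B and C exists."""
    return all((a1, b1, c2) in r
               for (a1, b1, _) in r
               for (a2, _, c2) in r
               if a1 == a2)


# Case 1: no dependency, so the join does not give back the source rows.
r_plain = {("a1", "b1", "c1"), ("a1", "b2", "c2")}

# Case 2: A ->> B|C holds, so decomposing then joining is the identity.
r_mvd = {("a1", "b1", "c1"), ("a1", "b1", "c2"),
         ("a1", "b2", "c1"), ("a1", "b2", "c2"),
         ("a2", "b3", "c3")}

# Case 3: going the other way, joining then decomposing is the identity
# only when R1 and R2 contain the same A values.
r1_dangling = {("a1", "b1"), ("a2", "b2")}   # a2 has no partner in r2
r2 = {("a1", "c1")}

if __name__ == "__main__":
    print(join(*decompose(r_plain)) == r_plain)               # False
    print(join(*decompose(r_mvd)) == r_mvd)                   # True
    print(decompose(join(r1_dangling, r2)) == (r1_dangling, r2))  # False
```

In case 3 the row ("a2", "b2") disappears in the join, which is why the equality constraint on A is needed before the inverse transformation is itself reversible.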
The working example

[Figure: conceptual schema of a document management system.
  Entity types:
    DOCUMENT (DocID, Title, Date-Published, Keyword[0-10]; id: DocID), with disjoint (D) subtypes
      REPORT (Report Code, Version; id': Report Code) and
      BOOK (ISBN, Publisher; id': ISBN)
    AUTHOR (Name, First-Name[0-1], Origin[0-1])
    BORROWER (PID, Name, Address (Street, City), Phone[1-5]; id: PID)
    COPY (Serial-No, Date-Acquired, Location (Store, Shelf, Row); id: of.BOOK, Serial-No)
    PROJECT (ProjCode, Title, ContractNo[0-1], Company; id: ProjCode; id': ContractNo)
  Rel-types: written (DOCUMENT 0-N, AUTHOR 1-N); responsible-for; reserved (Reservation date); of (BOOK 0-N, COPY 1-1); responsible; works on; borrowing (Borrow-Date, Return-Date[0-1]; id: COPY, Borrow-Date). The remaining role cardinalities are not recoverable from the extracted text.]

The main classes of elementary SR-transformations
- mutation transformations
- ISA transformations
- other elementary transformations

Mutation transformations

A mutation changes the gender of an object while preserving its information contents.
3 genders (entity type, rel-type, attribute), hence 6 mutations: RT-to-ET, ET-to-RT, att-to-ET, ET-to-att, att-to-RT, RT-to-att.

Mutation transformations (SR): entity types and rel-types

[Figure: rel-type "written" (DOCUMENT 0-N, AUTHOR 0-N) <-> entity type WRITTEN (id: doc.DOCUMENT, by.AUTHOR) with rel-types "doc" (DOCUMENT 0-N, WRITTEN 1-1) and "by" (AUTHOR 0-N, WRITTEN 1-1).]

Mutation transformations (SR): entity types and attributes

[Figure: attribute Keyword[0-10] of DOCUMENT <-> entity type KEYWORD (Keyword; id: Keyword) with rel-type "describe" (DOCUMENT 0-10, KEYWORD 1-N); attribute Company of PROJECT <-> entity type COMPANY (Company; id: Company) with rel-type "for" (PROJECT 1-1, COMPANY 1-N).]
Mutation transformations (SR): rel-types and attributes

[Figure: rel-type "of" (BOOK 0-N, COPY 1-1), with COPY (Serial-No, Date-Acquired; id: of.BOOK, Serial-No) <-> foreign key: COPY (ISBN, Serial-No, Date-Acquired; id: ISBN, Serial-No; ref: ISBN); BOOK (ISBN, Publisher; id: ISBN) is unchanged.]

ISA transformations (SR)

[Figure: source schema DOCUMENT (DocID, Title, Date-Published, Keyword[0-10]; id: DocID) with disjoint (D) subtypes REPORT (Report Code, Version; id': Report Code) and BOOK (ISBN, Publisher; id': ISBN). Three equivalent representations:
  Materialization: each is-a relation becomes a one-to-one rel-type, "r-isa-doc" (DOCUMENT 0-1, REPORT 1-1) and "b-isa-doc" (DOCUMENT 0-1, BOOK 1-1), with the constraint excl: b-isa-doc.BOOK, r-isa-doc.REPORT on DOCUMENT.
  Downward inheritance: REPORT (DocID, Title, Date-Published, Keyword[0-10], Report Code, Version; id: DocID; id': Report Code), BOOK (DocID, Title, Date-Published, Keyword[0-10], ISBN, Publisher; id: DocID; id': ISBN) and DOCUMENT (DocID, Title, Date-Published, Keyword[0-10]; id: DocID), with the constraint excl(REPORT.DocID, DOCUMENT.DocID, BOOK.DocID).
  Upward inheritance: DOCUMENT (DocID, Title, Date-Published, Keyword[0-10], Report[0-1] (Report Code, Version), Book[0-1] (ISBN, Publisher); id: DocID; id': Report.Report Code; id': Book.ISBN; excl: Report, Book).]

Other elementary transformations: non-set attributes (SR)

[Figure: DOCUMENT (DocID, Title, Keyword[0-10]; id: DocID), where Keyword is an array, a list or a bag, is transformed into a pure set of compound values:
  array: Keyword[10-10] (Index, Value[0-1]); id(Keyword): Index
  list:  Keyword[0-10] (Sequence, Value); id(Keyword): Sequence
  bag:   Keyword[0-10] (Multiplicity, Value); id(Keyword): Value]

Other elementary transformations: multivalued attribute, instantiation and concatenation

[Figure, source schema: DOCUMENT (DocID: char(12), Title: char(30), Date-Published: date(10), Keyword[0-5]: char(30); id: DocID).]
[Figure, target schemas:
  Instantiation: DOCUMENT (DocID: char(12), Title: char(30), Date-Published: date(10), Keyword1[0-1]: char(30), Keyword2[0-1]: char(30), Keyword3[0-1]: char(30), Keyword4[0-1]: char(30), Keyword5[0-1]: char(30); id: DocID)
  Concatenation: DOCUMENT (DocID: char(12), Title: char(30), Date-Published: date(10), Keywords[0-1]: char(150); id: DocID)]

Very common but not SR.

7. Typology of practical transformations: complex transformations

Goal: to examine some higher-level transformations
or: should I really apply these elementary operations to each of the 2,000 relationship types, 750 complex attributes and 682 is-a relations???

Elementary transformations are just building blocks for more complex operators.
Challenge: developing higher-level SR-transformations with elementary SR-transformations.

The main classes of complex SR-transformations
- compound transformations
- predicate-driven transformations
- model-driven transformations

Compound transformations
- The composition of two transformations is a transformation.
- The composition of two SR-transformations is an SR-transformation.

  S1 = <T1, t1>
  S2 = <T2, t2>
  S12 = S2 o S1 = <T2 o T1, t2 o t1>

[Figure: a new compound transformation built from known ones. ACCOUNT (AccID, Available, Exp-Monday[0-1], Exp-Tuesday[0-1], Exp-Wednesday[0-1], Exp-Thursday[0-1], Exp-Friday[0-1]; id: AccID) is related, through a chain of known transformations, to:
  ACCOUNT (AccID, Available, Expenses[0-5] (Day-of-Week, Amount); id: AccID; id(Expenses): Day-of-Week);
  ACCOUNT linked through rel-type "of" (ACCOUNT 0-5, EXPENSES 1-1) to EXPENSES (Day-of-Week, Amount; id: of.ACCOUNT, Day-of-Week), with dom(Day-of-Week) = {'Monday', 'Tuesday', .., 'Friday'};
  EXPENSES (Amount; id: of.ACCOUNT, on.DAY-of-WEEK) linked through "of" to ACCOUNT and through "on" (DAY-of-WEEK 1-N, EXPENSES 1-1) to DAY-of-WEEK (Day-of-Week; id: Day-of-Week);
  rel-type "expenses" (Amount) between ACCOUNT (0-5) and DAY-of-WEEK (1-N).]
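The composition rule S12 = S2 o S1 = <T2 o T1, t2 o t1> can be sketched as follows. This is an assumption-laden toy: schemas are reduced to tuples of attribute names, instances to dictionaries, and the two steps (grouping the expenses by day, then instantiating one column per day) only loosely follow the ACCOUNT figure; none of the names below come from the GER or DB-MAIN.

```python
from dataclasses import dataclass
from typing import Any, Callable

DAYS = ("Monday", "Tuesday", "Wednesday", "Thursday", "Friday")


@dataclass(frozen=True)
class Transformation:
    T: Callable[[Any], Any]  # structural mapping (on schemas)
    t: Callable[[Any], Any]  # instance mapping (on data)


def compose(s2, s1):
    """S2 o S1 = <T2 o T1, t2 o t1>: apply s1 first, then s2."""
    return Transformation(T=lambda schema: s2.T(s1.T(schema)),
                          t=lambda data: s2.t(s1.t(data)))


# Step 1 (hypothetical): index the Expenses values by their identifier Day-of-Week.
def T1(attrs):
    return tuple("ExpensesByDay" if a == "Expenses[0-5]" else a for a in attrs)


def t1(account):
    out = {k: v for k, v in account.items() if k != "Expenses"}
    out["ExpensesByDay"] = dict(account["Expenses"])
    return out


# Step 2 (hypothetical): instantiate one optional column per day of the week.
def T2(attrs):
    out = []
    for a in attrs:
        if a == "ExpensesByDay":
            out.extend(f"Exp-{day}[0-1]" for day in DAYS)
        else:
            out.append(a)
    return tuple(out)


def t2(account):
    out = {k: v for k, v in account.items() if k != "ExpensesByDay"}
    for day in DAYS:
        out[f"Exp-{day}"] = account["ExpensesByDay"].get(day)  # None if absent
    return out


S12 = compose(Transformation(T2, t2), Transformation(T1, t1))

if __name__ == "__main__":
    print(S12.T(("AccID", "Available", "Expenses[0-5]")))
    print(S12.t({"AccID": "A1", "Available": 100,
                 "Expenses": [("Monday", 20), ("Friday", 5)]}))
```

Because compose returns another Transformation, longer chains can be built by repeated composition without any new machinery.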
Predicate-driven (conditional) transformations

Transformations that apply to a set of qualified objects in the current schema:

  S(p)
  where S is a transformation
        p is a structural predicate

- Interpretation: apply S to all the objects that satisfy p.
- Ambiguity: if p(o1) ∧ p(o2) ∧ (applying S(o1) ⇒ ¬p(o2)), should o2 be processed anyway?
- Usual strategy (snapshot): first compute the set of objects that satisfy p, then apply S to each of its elements that survive.

We need a language for p:
- structural (e.g., DL, OCL): complex and leading to huge expressions
- ad hoc for the GER: expressive, concise, parametric, but not generic, not closed

  ROLE_per_RT(I J): the number of roles of the current rel-type is between I and J
  ONE_ROLE_per_RT(I J): the number of "one" roles (with cardinality ?-1) is between I and J
  MAX_CARD_of_ATT(I J): the maximum cardinality of the current attribute is between I and J
  DEPTH_of_ATT(I J): the level of the current attribute is between I and J
  with I ∈ {0, 1, 2, .., J} and J ∈ {I, I+1, ..., N}

Examples of S(p):

  RT_into_ET(ROLE_per_RT(3 N)):
    transform each rel-type into an entity type (if it has at least 3 roles)
  RT_into_ATT(ROLE_per_RT(2 2) and ONE_ROLE_per_RT(1 2)):
    transform each rel-type into referential attributes (if it is binary and one-to-many or one-to-one)
  INSTANTIATE(MAX_CARD_of_ATT(2 4)):
    instantiate each attribute (if it is "slightly" multivalued: from 2 to 4 values)
  ATT_into_ET_VAL(DEPTH_of_ATT(1 1) and MAX_CARD_of_ATT(5 N)):
    transform each attribute into an entity type (if it is at the top level and "strongly" multivalued: at least 5 values)

Model-driven transformation

Goal: considering schema S1 in model M1, transform S1 into S2 that complies with model M2.
Of course, as far as possible through SR-transformations!

Example: considering the Entity-relationship schema S1, transform S1 into S2 that complies with the relational model. Of course, as far as possible without information loss!

Structure: a compound transformation comprising predicate-driven transformations.
Practical form: a transformation plan.

Principle:
- Identify the constructs of M1 that violate M2.
- For each such construct C, choose a transformation <T,t> = <P,Q,t> such that P(C) holds and T(C) satisfies M2.
- Things may be a bit more complex, requiring a compound transformation. Example: processing N-ary rel-types for relational compliance requires two successive transformations.

Example: ER to Binary (flat Bachman) conversion

The binary model is a variant of the ER model in which:
- there are no ISA relations,
- the rel-types are functional (binary, one-to-many or one-to-one),
- the rel-types have no attributes,
- each rel-type is defined on two distinct entity types (no cyclic rel-types),
- the attributes are single-valued and atomic.

Flat Bachman schemas - invalid constructs:
- ISA relations
- cyclic rel-types
- complex rel-types (with attributes, N-ary)
- many-to-many binary rel-types
- multivalued attributes
- compound attributes

Flat Bachman schemas - processing invalid constructs:
- ISA relations: materialization
- cyclic rel-types: transform into entity types
- complex rel-types (with attributes, N-ary): transform into entity types
- many-to-many binary rel-types: transform into entity types
- multivalued attributes: transform into entity types
- compound attributes: disaggregate
Transformation plan for ER to Flat Bachman conversion

  ISA_into_RT;
      transform ISA relations by materialization;
  RT_into_ET(RECURSIVITY_in_RT(2 N));
      transform rel-types in which the same entity type appears more than once;
  RT_into_ET(ATT_per_RT(1 N) or ROLE_per_RT(3 N));
      transform complex rel-types;
  RT_into_ET(ONE_ROLE_per_RT(0 0));
      transform rel-types in which there is no "one" role;
  LOOP;
      iteratively flatten the attribute structure
      ATT_into_ET_INST(MAX_CARD_of_ATT(2 N))
      DISAGGREGATE
  ENDLOOP;

Example of ER to Flat Bachman conversion

[Figure: the working example before and after conversion. In the result, is-a relations are materialized as one-to-one rel-types; Keyword becomes entity type KEYWORD (Keyword; id: d.DOCUMENT, Keyword); the rel-types responsible-for, reserved and borrowing become entity types RESPONSIBLE, RESERVED (id: by.BORROWER, what.DOCUMENT) and BORROWING (id: for.PROJECT, by.BORROWER, what.COPY), each attached through one-to-many rel-types; the compound attribute Location of COPY is disaggregated into Loc_Store, Loc_Shelf, Loc_Row. Further detail is not recoverable from the extracted text.]

Other popular examples
- ER to UML
- UML to ER
- ER to relational
- relational to ER
- COBOL files to ER
- ER to XML
- relational to XML

8. Transformational modeling of database engineering processes

Goal: to show that large scope processes are transformational as well
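A predicate-driven transformation S(p) with the snapshot strategy, and a small plan chaining two of them, might be sketched like this. Everything in the block is an assumption made for illustration: the dictionary-based schema, the simplified role cardinalities of the sample rel-types, and the Python functions that borrow the names ROLE_per_RT, ONE_ROLE_per_RT, ATT_per_RT and RT_into_ET. It is not the DB-MAIN implementation, and only the rel-type steps of the plan are covered.

```python
N = float("inf")  # stands for the unbounded cardinality "N"


def ROLE_per_RT(i, j):
    return lambda rt: i <= len(rt["roles"]) <= j


def ONE_ROLE_per_RT(i, j):
    # A "one" role is a role whose maximum cardinality is 1.
    return lambda rt: i <= sum(1 for (_, _, mx) in rt["roles"] if mx == 1) <= j


def ATT_per_RT(i, j):
    return lambda rt: i <= len(rt["attributes"]) <= j


def either(p, q):
    return lambda rt: p(rt) or q(rt)


def RT_into_ET(schema, name):
    """Replace rel-type `name` by an entity type plus one functional rel-type per role."""
    rt = schema["rel_types"].pop(name)
    entity = name.upper()
    schema["entity_types"][entity] = list(rt["attributes"])
    for k, (player, mn, mx) in enumerate(rt["roles"]):
        schema["rel_types"][f"{name}_{k}"] = {
            "roles": [(player, mn, mx), (entity, 1, 1)],
            "attributes": [],
        }


def apply_where(transform, predicate, schema):
    """Snapshot strategy: select first, then transform the objects that survive."""
    snapshot = [n for n, rt in schema["rel_types"].items() if predicate(rt)]
    for n in snapshot:
        if n in schema["rel_types"]:
            transform(schema, n)


PLAN = [
    (RT_into_ET, either(ATT_per_RT(1, N), ROLE_per_RT(3, N))),  # complex rel-types
    (RT_into_ET, ONE_ROLE_per_RT(0, 0)),                        # many-to-many rel-types
]


def run_plan(schema, plan=PLAN):
    for transform, predicate in plan:
        apply_where(transform, predicate, schema)
    return schema


def example_schema():
    """A simplified fragment of the working example (cardinalities assumed)."""
    return {
        "entity_types": {"DOCUMENT": ["DocID"], "AUTHOR": ["Name"], "BOOK": ["ISBN"],
                         "COPY": ["Serial-No"], "BORROWER": ["PID"], "PROJECT": ["ProjCode"]},
        "rel_types": {
            "written": {"roles": [("DOCUMENT", 0, N), ("AUTHOR", 1, N)], "attributes": []},
            "of": {"roles": [("BOOK", 0, N), ("COPY", 1, 1)], "attributes": []},
            "borrowing": {"roles": [("COPY", 0, N), ("BORROWER", 0, N), ("PROJECT", 0, N)],
                          "attributes": ["Borrow-Date", "Return-Date"]},
        },
    }


if __name__ == "__main__":
    result = run_plan(example_schema())
    print(sorted(result["rel_types"]))
```

After the plan has run, every remaining rel-type in the sample is binary, has no attribute and has at least one "one" role, which is the shape the flat Bachman model requires of rel-types.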
Most database engineering processes are high-level transformations.

Example 1: database design

  Users requirements
    -> Conceptual design -> Conceptual schema
    -> Logical design    -> Logical schema
    -> Physical design   -> Physical schema
    -> Coding            -> DDL code

  DDL code = DB-design(Users requirements)
  DB-design = Coding o PhysD o LogD o ConcD

Logical design

  Conceptual_schema -> Logical_design -> Logical_schema
  Logical_schema = Logical_design(Conceptual_schema)

Logical_design clearly is a model-driven transformation. Let us develop its transformation plan.

What are the invalid GER constructs in the relational model?
- ISA relations
- relationship types (complex, functional)
- multivalued attributes
- compound attributes
- names not compliant with the SQL syntax

A transformation plan

[Flowchart, boxes in reading order: transform is-a relations; transform complex rel-types; test "still non simple attributes?": if yes, transform level-1 multivalued attributes, disaggregate level-1 compound attributes, and test again; if no, transform functional rel-types; test "any failure?": if yes, add technical Id where needed; process names.]
Application

[Figure: the plan applied to the working example (conceptual schema of section 6). Resulting relational logical schema:
  DOCUMENT (DocID, Title, Date-Published, REPORT[0-1], BOOK[0-1]; id: DocID; excl: BOOK, REPORT)
  Keyword (DocID, Keyword; id: DocID, Keyword; ref: DocID)
  AUTHOR (Auth_ID, Name, First-Name[0-1], Origin[0-1]; id: Auth_ID)
  written (Auth_ID, DocID; id: Auth_ID, DocID; ref: DocID; equ: Auth_ID)
  REPORT (DocID, Report Code, Version; id: DocID, ref; id': Report Code)
  BOOK (DocID, ISBN, Publisher; id: DocID, ref; id': ISBN)
  COPY (DocID, Serial-No, Date-Acquired, Loc_Store, Loc_Shelf, Loc_Row; id: DocID, Serial-No; ref: DocID)
  BORROWER (PID, Name, Add_Street, Add_City, ProjCode[0-1], Responsible[0-1]; id: PID; ref: Responsible; ref: ProjCode)
  Phone (PID, Phone; id: PID, Phone; equ: PID)
  reserved (PID, DocID, Reservation date; id: DocID, PID; ref: DocID; ref: PID)
  borrowing (DocID, Serial-No, Borrow-Date, PID, ProjCode, Return-Date[0-1]; id: DocID, Serial-No, Borrow-Date; ref: DocID, Serial-No; ref: PID; ref: ProjCode)
  PROJECT (ProjCode, Title, Company; id: ProjCode)
  ContractNo (ContractNo, ProjCode; id: ContractNo; id': ProjCode, ref)]

Example 2: database reverse engineering

  code_ddl, code_prg
    -> Parsing           -> Raw physical schema
    -> Refinement        -> Physical schema
    -> Cleaning          -> Logical schema
    -> Conceptualization -> Conceptual schema

  Conceptual schema = DB-REng(code_ddl, code_prg)
  DB-REng = Concept o Clean o Refine o Parse
Interesting observations:
Refine o Parse = Coding^-1
Cleaning = Physical_design^-1
Conceptualization = Logical_design^-1

Conclusion: DB reverse engineering is (roughly speaking) the inverse of DB engineering.

Hence the transformation plan of Conceptualization. Each of its steps undoes one step of the Logical design plan, starting with the last one.

Logical design plan:
1. transform is-a relations
2. transform complex rel-types
3. transform level-1 multivalued attributes
4. disaggregate level-1 compound attributes
5. transform functional rel-types
6. add technical Id where needed
7. transform functional rel-types

Conceptualization plan:
1. transform FK into functional rel-types
2. remove technical Id
3. transform FK into functional rel-types
4. aggregate heterogeneous serial attributes
5. transform attribute entity types into multivalued attributes
6. transform relationship entity types into rel-types
7. transform one-to-one rel-types into is-a relations

Experiment: applying this plan to the relational logical schema yields the following schema.

Entity types:
DOCUMENT (DocID, Title, Date-Published, REPORT[0-1], BOOK[0-1], Keyword[0-N]); id: DocID; excl: BOOK, REPORT
REPORT (Report Code, Version); id': Report Code
BOOK (ISBN, Publisher); id': ISBN
COPY (Serial-No, Date-Acquired, Loc_Store, Loc_Shelf, Loc_Row); id: COP_BOO.BOOK, Serial-No
AUTHOR (Auth_ID, Name, First-Name[0-1], Origin[0-1]); id: Auth_ID
BORROWER (PID, Name, Add_Street, Add_City, Phone[1-N]); id: PID
PROJECT (ProjCode, Title, Company, ContractNo[0-1]); id: ProjCode; id': ContractNo

Rel-types:
BOO_DOC; REP_DOC; COP_BOO; written; Responsible; reserved (Reservation date); borrowing (Borrow-Date, Return-Date[0-1]; id: COPY, Borrow-Date); BOR_PRO_1

Convincing, but obviously needs some polishing!

9. Conclusions and perspectives

Intuitively, most database engineering processes are transformational by nature.
By combining elementary transformations, we can give these processes a precise transformational definition.
A transformation can be formalized so that its preservation properties can be proved.
We need a small set of elementary transformations (20 to 40).
Once correctly defined, a transformation is quite reliable, and is guaranteed to preserve information whatever the context in which it is applied.
Transformations are (sort of …) easy to implement in CASE tools.
Several general-purpose languages and engines are available: QVT, AST, ATL, Kermeta, GReAT, VIATRA, Tefkat, TXL and ... XSLT!

However, some problems are not (completely) solved:
- a transformation must address all the aspects of the data structures: documentation, annotations, statistics, operations (methods).
- complex problem: propagating the constraints; OK for uniqueness, but others are less obvious.
- how to efficiently transform the data, following a schema transformation? See J.-M. Hick's thesis (2003).
- modifying a high-level abstract schema is easy, but how do we propagate the modifications to the lower-level schema and code (traceability)?
- transforming the data structures is nice, but what about the programs? Notion of co-transformation. See A. Cleve's thesis (2009).
- how to derive a procedural transformation from the <P,Q> specification?
- how to derive a transformation plan from a couple (M1, M2)?

Thanks

A. Schema transformations in CASE tools

CASE tools with DB design facilities offer explicit or implicit transformations for the production of the database code and for (limited) reverse engineering:
conceptual schema -> DDL code
conceptual schema -> relational schema -> DDL code
DDL code -> relational schema -> conceptual schema

Few CASE tools include transformations as a major engineering paradigm. Example: DB-MAIN.

The DB-MAIN CASE environment
- Toolset of about 30 elementary transformations. Most are SR. The others trigger a warning.
- Predicate-driven transformations (through two transformation assistants): simplified (for the dummies), advanced (for the smarties)
- Model-driven transformations (through two transformation assistants): scripting facilities for developing transformation plans; a dozen predefined, updatable transformation plans

The DB-MAIN CASE environment: elementary transformations (transforming attribute Phone into entity type PHONE)
1. select an object
2. select a transformation
3. if needed, select the variant
4. if needed, give target names

The DB-MAIN CASE environment: elementary transformations (transforming bag attribute Keyword into a set attribute)

The DB-MAIN CASE environment: predicate-driven transformations, simplified assistant
1. choose a pattern
2. choose an action
3. execute

The DB-MAIN CASE environment: predicate-driven transformations, advanced assistant
1. select an operation
2. define the predicate
3. set its parameters

The DB-MAIN CASE environment: model-driven transformations, simplified assistant
1. build a couple (pattern, action)
2. write it
3. save script
4. execute script

The DB-MAIN CASE environment: model-driven transformations, advanced assistant
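The idea of a script made of (pattern, action) couples can be sketched in Python as follows. This is only an illustration of the principle and does not reproduce the DB-MAIN scripting language or its assistants: the dictionary representation of attributes, the run_script() function and the two couples in SCRIPT are all assumptions made for the example.

```python
import re


def is_bag(obj):
    """Pattern: multivalued attributes whose collection type is 'bag'."""
    return obj["max_card"] > 1 and obj["collection"] == "bag"


def bag_to_set(obj):
    """Action: turn a bag attribute into a set attribute."""
    obj["collection"] = "set"


def has_invalid_sql_name(obj):
    """Pattern: names that are not plain SQL identifiers (simplified rule)."""
    return re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", obj["name"]) is None


def make_sql_name(obj):
    """Action: replace every character that is not a letter, digit or '_'."""
    obj["name"] = re.sub(r"[^A-Za-z0-9_]", "_", obj["name"])


# A hypothetical script: an ordered list of (label, pattern, action) couples.
SCRIPT = [
    ("bag -> set", is_bag, bag_to_set),
    ("SQL-compliant name", has_invalid_sql_name, make_sql_name),
]


def run_script(schema, script):
    """Apply each couple, in order, to every object selected by its pattern.
    Returns a log of (label, original object name) for each action performed."""
    log = []
    for label, pattern, action in script:
        for obj in [o for o in schema if pattern(o)]:
            original_name = obj["name"]
            action(obj)
            log.append((label, original_name))
    return log


if __name__ == "__main__":
    attributes = [
        {"name": "Keyword", "max_card": 10, "collection": "bag"},
        {"name": "Date-Published", "max_card": 1, "collection": None},
    ]
    for entry in run_script(attributes, SCRIPT):
        print(entry)
    print(attributes)
```

Keeping the pattern separate from the action lets the same action be reused with different selection predicates, which is what distinguishes a predicate-driven transformation from an elementary transformation applied to one selected object.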