Object-Oriented & Object-Relational DBMSs
CENG 553 Database Management Systems

Advanced Database Applications
• Computer-Aided Design/Manufacturing (CAD/CAM)
• Computer-Aided Software Engineering (CASE)
• Network Management Systems
• Office Information Systems (OIS) and Multimedia Systems
• Digital Publishing
• Geographic Information Systems (GIS)
• Interactive and Dynamic Web sites
• Other applications with complex and interrelated objects and procedural data.

Expected features for new applications
• Complex objects
• Behavioral data
• Meta knowledge
• Long duration transactions

Weaknesses of RDBMSs
• Poor representation of “Real World” entities
  – Normalization leads to relations that do not correspond to entities in “real world”.
• Semantic overloading
  – Relational model has only one construct for representing data and data relationships: the relation.
  – Relational model is semantically overloaded
• Limited operations
  – only a fixed set of operations which cannot be extended.

Object-Oriented Concepts
• Abstraction, encapsulation, information hiding.
• Objects and attributes.
• Object identity.
• Methods and messages.
• Classes, subclasses, superclasses, and inheritance.
• Overloading.
• Polymorphism and dynamic binding.

Database Systems
First Generation DBMS: Network and Hierarchical
  – Required complex programs for even simple queries.
  – Minimal data independence.
  – No widely accepted theoretical foundation.
Second Generation DBMS: Relational DBMS
  – Helped overcome these problems.
Third Generation DBMS: OODBMS and ORDBMS.

Object-Oriented Data Model
No one agreed object data model. One definition:
Object-Oriented Data Model (OODM)
  – Data model that captures semantics of objects supported in object-oriented programming.
Object-Oriented Database (OODB)
  – Persistent and sharable collection of objects defined by an ODM.
Object-Oriented DBMS (OODBMS)
  – Manager of an ODB.

Commercial OODBMSs
• GemStone from Gemstone Systems Inc.,
• Objectivity/DB from Objectivity Inc.,
• ObjectStore from Progress Software Corp.,
• Ontos from Ontos Inc.,
• FastObjects from Poet Software Corp.,
• Jasmine from Computer Associates/Fujitsu,
• Versant from Versant Corp.

Advantages of OODBMSs
• Enriched Modeling Capabilities.
• Removal of Impedance Mismatch.
• More Expressive Query Language.
• Support for Schema Evolution.
• Support for Long Duration Transactions.
• Applicability to Advanced Database Applications.

Disadvantages of OODBMSs
• Lack of Universal Data Model.
• Lack of Experience.
• Lack of Standards.
• Query Optimization compromises Encapsulation.
• Object Level Locking may impact Performance.
• Complexity.

Alternative Strategies for Developing an OODBMS
• Extend existing object-oriented programming language.
  – GemStone extended Smalltalk.
• Provide extensible OODBMS library.
  – Approach taken by Ontos, Versant, and ObjectStore.
• Embed OODB language constructs in a conventional host language.
  – Approach taken by O2, which has extensions for C++.
• Extend existing database language with object-oriented capabilities.
  – Approach being pursued by RDBMS and OODBMS vendors.
  – Ontos and Versant provide a version of OSQL.
• Develop a novel database data model/language.

Single-Level v. Two-Level Storage Model
• With a traditional DBMS, programmer has to:
  – Decide when to read and update objects.
  – Write code to translate between application’s object model and the data model of the DBMS.
  – Perform additional type-checking when object is read back from database, to guarantee object will conform to its original type.
• Conventional DBMSs have two-level storage model: storage model in memory, and database storage model on disk.
• In contrast, OODBMS gives illusion of single-level storage model, with similar representation in both memory and in database stored on disk.

[Figure: Two-Level Storage Model for RDBMS]

[Figure: Single-Level Storage Model for OODBMS]

Object Data Management Group (ODMG)
• Established by vendors of OODBMSs to define standards.
• The ODMG Standard includes:
  – Object Data Model (ODM).
  – Object Definition Language (ODL).
  – Object Query Language (OQL).
  – C++, Smalltalk, and Java Language Binding.

Main Idea: Host Language = Data Language
• Objects in the host language are mapped directly to database objects
• Some objects in the host program are persistent. Changing such objects (through an assignment to an instance variable or with a method application) directly and transparently affects the corresponding database object
• Accessing an object using its oid causes an “object fault” similar to page faults in operating systems. This transparently brings the object into the memory and the program works with it as if it were a regular object defined, for example, in the host Java program

SQL Databases vs. ODMG
• In SQL: Host program accesses the database by sending SQL queries to it (using JDBC, ODBC, Embedded SQL, etc.)
• In ODMG: Host program works with database objects directly

ODMG Data Model
• Distinguishes between objects and pure values (which are called literals)
• Both can have complex internal structure, but only objects have oids
• Two kinds of classes: “ODMG classes” and “ODMG interfaces”, similar to Java
  – An ODMG interface: only signatures
    • does not have its own objects
    • cannot inherit from (be a subclass of) an ODMG class – only from another ODMG interface
  – An ODMG class:
    • can have attributes, methods with code, own objects
    • can inherit from (be a subclass of) other ODMG classes or interfaces – can have at most one immediate superclass

ODMG Object Model – Built-in Collections
Set: unordered collections without duplicates.
Bag: unordered collections that do allow duplicates.
List: ordered collections that allow duplicates.
Array: 1D array of dynamically varying length.
Dictionary: unordered sequence of key-value pairs with no duplicate keys.
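As a rough analogy for the five built-in collections listed above, the following sketch maps each one to the nearest standard Python container. This is only an illustration of the duplicate and ordering rules. It is not an ODMG language binding (the standard defines bindings for C++, Smalltalk, and Java), and all variable names are made up for the example.

```python
from collections import Counter

# Set: unordered, duplicates collapse into one element.
majors = {"CS", "Math", "CS"}

# Bag: unordered, but duplicates are kept (here, as counts).
grades = Counter(["A", "B", "A"])

# List: ordered, duplicates allowed.
waitlist = ["ann", "bob", "ann"]

# Array: one-dimensional, length can change over time.
samples = [0.5, 1.5]
samples.append(2.5)

# Dictionary: key-value pairs with no duplicate keys;
# assigning to an existing key overwrites its value.
office = {"ann": "B-101"}
office["ann"] = "B-202"
```

The Bag is the only collection without a direct built-in equivalent; `Counter` is used here because it records how many times each element occurs without imposing an order.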

More on the ODMG Data Model
• Can specify keys
• Class extents have their own names – this is what is used in queries
• Distinguishes between relationships and attributes
  • Attribute values are literals
  • Relationship values are objects
  • Only binary relationships supported
  • ODMG relationships have little to do with relationships in the E-R model

ODL: ODMG’s Object Definition Language
• ODL supports the semantic constructs of ODMG
• ODL is independent of any programming language
• ODL is used to create object specifications (classes and interfaces)
• ODL is not used for database manipulation

ODL Examples (1): A Very Simple Class
• A very simple, straightforward class definition:

    class Degree {
        attribute string college;
        attribute string degree;
        attribute string year;
    };

ODL Examples (2): A Class With Key and Extent
• A class definition with “extent”, “key”, and more elaborate attributes; still relatively straightforward

    class Person (extent persons key ssn) {
        attribute struct Pname {string fname …} name;
        attribute string ssn;
        attribute date birthdate;
        …
        short age();
    }

ODL Examples (3): A Class With Relationships
• Note extends (inheritance) relationship
• Also note “inverse” relationship

    class Faculty extends Person (extent faculty) {
        attribute string rank;
        attribute float salary;
        attribute string phone;
        …
        relationship Dept works_in inverse Dept::has_faculty;
        relationship set<GradStu> advises inverse GradStu::advisor;
        void give_raise (in float raise);
        void promote (in string new_rank);
    };

Referential Integrity

    class STUDENT extends PERSON
    ( extent StudentExt )
    {
        attribute Set<String> Major;
        relationship Set<COURSE> Enrolled
            inverse COURSE::Enrollment;
    }

    class COURSE: Object
    ( extent CourseExt )
    {
        attribute Integer CrsCode;
        attribute String Department;
        relationship Set<STUDENT> Enrollment
            inverse STUDENT::Enrolled;
    }

Object Query Language (OQL)
• OQL is ODMG’s query language
• Provides declarative access to object database using SQL-like syntax.
• Does not provide explicit update operators - leaves this to operations defined on object types.
• Can be used as a standalone language and as a language embedded in another language, for which an ODMG binding is defined (Smalltalk, C++, and Java).
• Embedded OQL statements return objects that are compatible with the type system of the host language

Object Query Language (OQL)
• An OQL query is a function that delivers an object whose type may be inferred from the operators contributing to the query expression.
• A query definition expression is of the form: DEFINE Q AS e
• Defines query with name Q given query expression e.

Example OQL: Extents & Traversal Paths
Get set of all faculty (with identity)

    faculty

Get set of all enrollments (with identity)

    CourseExt.Enrollment

Example schema

    class Branch (extent branchOffices key branchNo) {
        attribute string branchNo;
        ….
        relationship Manager ManagedBy inverse Manager::Manages;
        void takeOnPropertyForRent(in string propertyNo);
    }

Example (cont.)

    class Person {
        attribute struct Pname {string fName, string lName} name;
    }

    class Staff extends Person (extent staff key staffNo) {
        attribute string staffNo;
        attribute date DOB;
        ….
        short getAge();
    }

Example (cont.)

    class Manager extends Staff (extent managers) {
        relationship Branch Manages inverse Branch::ManagedBy;
    }

Example OQL: Extents & Traversal Paths
Find all branches in London

    DEFINE londonBranches AS
    SELECT b.branchNo
    FROM b IN branchOffices
    WHERE b.address.city = "London";

This returns a literal of type bag<string>.

Example OQL: Extents & Traversal Paths
Find all staff who work at London branches.

    londonBranches.Has

This returns set<SalesStaff>.

Example OQL: Use of structures
Get structured set (without identity) containing name, sex, and age of all staff who live in London.

    SELECT struct (lName:s.name.lName, sex:s.sex, age:s.age)
    FROM s IN Staff
    WHERE s.WorksAt.address.city = "London"

This returns a literal of type set<struct>.

Example OQL: Use of structures
Get structured set (without identity) containing branch number and set of all Assistants at branches in London.

    SELECT struct (branchNo:x.branchNo,
                   assistants: (SELECT y FROM y IN x.WorksAt
                                WHERE y.position = "Assistant"))
    FROM x IN (SELECT b FROM b IN branchOffices
               WHERE b.address.city = "London")

This returns a literal of type set<struct>.

OQL - Creating Objects
A type name constructor is used to create an object with identity.

    Manager(staffNo: "SL21", fName: "John", lName: "White",
            address: "19 Taylor St, London", position: "Manager",
            sex: "M", DOB: date "1945-10-01", salary: 30000)

ORDBMS

Merging Relational and Object Models
• Object-oriented models support interesting data types – not just flat files.
  – Maps, multimedia, etc.
• The relational model supports very-high-level queries.
• Object-relational databases are an attempt to get the best of both.

Evolution of DBMS’s
• Object-oriented DBMS’s failed because they did not offer the efficiencies of well-entrenched relational DBMS’s.
• Object-relational extensions to relational DBMS’s capture much of the advantages of OO, yet retain the relation as the fundamental abstraction.

ORDBMSs
• Vendors of RDBMSs conscious of threat and promise of OODBMS.
• Agree that RDBMSs not currently suited to advanced database applications, and added functionality is required.
• Can remedy shortcomings of relational model by extending model with OO features.

ORDBMSs - Features
• OO features being added include:
  – user-extensible types,
  – encapsulation,
  – inheritance,
  – polymorphism,
  – dynamic binding of methods,
  – complex objects including non-1NF objects,
  – object identity.

[Figure: Stonebraker’s View]

Objects in SQL:1999
• Object-relational extension of SQL-92
• Includes the legacy relational model
• SQL:1999 database = a finite set of relations
• relation = a set of tuples (extends legacy relations)
  OR
  a set of objects (completely new)
• object = (oid, tuple-value)
• tuple = tuple-value
• tuple-value = [Attr1: v1, …, Attrn: vn]

SQL:1999 Tuple Values
• Tuple value: [Attr1: v1, …, Attrn: vn]
  – Attri are all distinct attributes
  – Each vi is one of these:
    • Primitive value: a constant of type CHAR(…), INTEGER, FLOAT, etc.
    • Reference value: an object Id
    • Another tuple value
    • A collection value
      Only the ARRAY construct is supported – a fixed size array. SETOF and LISTOF are not supported.

Row Types
• The same as the original (legacy) relational tuple type.
  However:
  – Row types can now be the types of the individual attributes in a tuple

    CREATE TABLE PERSON (
        Name CHAR(20),
        Address ROW(Number INTEGER, Street CHAR(20), ZIP CHAR(5))
    )

Row Types (Contd.)
• Use path expressions to refer to the components of row types:

    SELECT P.Name
    FROM PERSON P
    WHERE P.Address.ZIP = '11794'

• Update operations:

    INSERT INTO PERSON(Name, Address)
    VALUES ('John Doe', ROW(666, 'Hollow Rd.', '66666'))

    UPDATE PERSON
    SET Address.ZIP = '66666'
    WHERE Address.ZIP = '55555'

    UPDATE PERSON
    SET Address = ROW(21, 'Main St', '12345')
    WHERE Address = ROW(123, 'Maple Dr.', '54321') AND Name = 'J. Public'

User Defined Types (UDT)
• UDTs allow specification of complex objects/tuples, methods, and their implementation
• Like ROW types, UDTs can be types of individual attributes in tuples
• UDTs can be much more complex than ROW types (even disregarding the methods): the components of UDTs do not need to be elementary types

A UDT Example

    CREATE TYPE PersonType AS (
        Name CHAR(20),
        Address ROW(Number INTEGER, Street CHAR(20), ZIP CHAR(5))
    );

    CREATE TYPE StudentType UNDER PersonType AS (
        Id INTEGER,
        Status CHAR(2)
    )
    METHOD award_degree() RETURNS BOOLEAN;

    CREATE METHOD award_degree() FOR StudentType
    LANGUAGE C
    EXTERNAL NAME 'file:/home/admin/award_degree';  -- file that holds the binary code

Using UDTs in CREATE TABLE
• As an attribute type:

    CREATE TABLE TRANSCRIPT (
        Student StudentType,  -- a previously defined UDT
        CrsCode CHAR(6),
        Semester CHAR(6),
        Grade CHAR(1)
    )

• As a table type:

    CREATE TABLE STUDENT OF StudentType;

  Such a table is called typed table.

Objects
• Only typed tables contain objects (i.e.
  tuples with oids)
• Compare:

    CREATE TABLE STUDENT OF StudentType;

  and

    CREATE TABLE STUDENT1 (
        Name CHAR(20),
        Address ROW(Number INTEGER, Street CHAR(20), ZIP CHAR(5)),
        Id INTEGER,
        Status CHAR(2)
    )

• Both contain tuples of exactly the same structure
• Only the tuples in STUDENT – not STUDENT1 – have oids.
• This disparity is motivated by the need to stay backward compatible with SQL-92.

Querying UDTs
• Nothing special – just use path expressions

    SELECT T.Student.Name, T.Grade
    FROM TRANSCRIPT T
    WHERE T.Student.Address.Street = 'Main St.'

Note: T.Student has the type StudentType. The attribute Name is not declared explicitly in StudentType, but is inherited from PersonType.

Updating User-Defined Types
• Inserting a record into TRANSCRIPT:

    INSERT INTO TRANSCRIPT(Student,CrsCode,Semester,Grade)
    VALUES (????, 'CS308', '2000', 'A')

  – The type of the Student attribute is StudentType. How does one insert a value of this type (in place of ????)?
  – Further complication: the UDT StudentType is encapsulated, i.e., it is accessible only through public methods, which we did not define
  – Do it through the observer and mutator methods provided by the DBMS automatically

Observer Methods
• For each attribute A of type T in a UDT, an SQL:1999 DBMS is supposed to supply an observer method, A: ( ) → T, which returns the value of A (the notation “( )” means that the method takes no arguments)
• Observer methods for StudentType:
  • Id: ( ) → INTEGER
  • Name: ( ) → CHAR(20)
  • Status: ( ) → CHAR(2)
  • Address: ( ) → ROW(INTEGER, CHAR(20), CHAR(5))
• For example, in

    SELECT T.Student.Name, T.Grade
    FROM TRANSCRIPT T
    WHERE T.Student.Address.Street = 'Main St.'

  Name and Address are observer methods, since T.Student is of type StudentType
  Note: Grade is not an observer, because TRANSCRIPT is not part of a UDT

Mutator Methods
• An SQL:1999 DBMS is supposed to supply, for each attribute A of type T in a UDT U, a mutator method A: T → U
  For any object o of type U, it takes a value t of type T and replaces the old value of o.A with t; it returns the new value of the object.
  Thus, o.A(t) is an object of type U
• Mutators for StudentType:
  • Id: INTEGER → StudentType
  • Name: CHAR(20) → StudentType
  • Address: ROW(INTEGER, CHAR(20), CHAR(5)) → StudentType

Example: Inserting a UDT Value

    INSERT INTO TRANSCRIPT(Student,CrsCode,Semester,Grade)
    VALUES (
        NEW StudentType( )                        -- create a blank StudentType object
            .Id(111111111)                        -- add a value for Id
            .Status('G5')                         -- add a value for Status
            .Name('Joe Public')
            .Address(ROW(123,'Main St.', '54321')) ,  -- add a value for the Address attribute
        'CS532', 'S2002', 'A'
    )

'CS532', 'S2002', 'A' are primitive values for the attributes CrsCode, Semester, Grade

Example: Changing a UDT Value

    UPDATE TRANSCRIPT
    SET Student = Student.Address(ROW(21,'Maple St.','12345'))   -- change Address
                         .Name('John Smith'),                     -- change Name
        Grade = 'B'
    WHERE Student.Id = 111111111 AND CrsCode = 'CS532' AND Semester = 'S2002'

• Mutators are used to change the values of the attributes Address and Name

Referencing Objects
• Consider again

    CREATE TABLE TRANSCRIPT (
        Student StudentType,
        CrsCode CHAR(6),
        Semester CHAR(6),
        Grade CHAR(1)
    )

• Problem: TRANSCRIPT records for the same student refer to distinct values of type StudentType (even though the contents of these values may be the same) – a maintenance/consistency problem
• Solution: use self-referencing column
  – Bad design, which distinguishes objects from their references
  – Not truly object-oriented

Self-Referencing Column
• Every typed table has a self-referencing column
  – Normally invisible
  – Contains explicit object Id for each tuple in the table
  – Can be given an explicit name – the only way to enable referencing of objects

    CREATE TABLE STUDENT2 OF StudentType
        REF IS stud_oid;   -- self-referencing column

Self-referencing columns can be used in queries just like regular columns
Their values cannot be changed, however
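The mutator chaining and the "distinct values" problem described above can be mimicked outside SQL. The sketch below is a loose Python analogy, not SQL:1999 semantics. `StudentType` is modelled as an immutable value, the `with_*` methods are hypothetical stand-ins for the system-supplied mutators (each returns a new value, like o.A(t)), and the dictionary `STUDENT2` with the key `"oid-1"` is an invented stand-in for a typed table with a self-referencing column.

```python
from dataclasses import dataclass, replace
from typing import Optional, Tuple

Row = Tuple[int, str, str]  # stands in for ROW(Number, Street, ZIP)


@dataclass(frozen=True)
class StudentType:
    id: Optional[int] = None
    name: Optional[str] = None
    status: Optional[str] = None
    address: Optional[Row] = None

    # Mutator-style methods: each returns a NEW StudentType value,
    # so calls can be chained the way .Id(...).Status(...) is above.
    def with_id(self, v: int) -> "StudentType":
        return replace(self, id=v)

    def with_name(self, v: str) -> "StudentType":
        return replace(self, name=v)

    def with_status(self, v: str) -> "StudentType":
        return replace(self, status=v)

    def with_address(self, v: Row) -> "StudentType":
        return replace(self, address=v)


# Analogue of NEW StudentType().Id(...).Status(...).Name(...).Address(...)
student = (StudentType()
           .with_id(111111111)
           .with_status("G5")
           .with_name("Joe Public")
           .with_address((123, "Main St.", "54321")))

# Embedded values: every TRANSCRIPT row carries its own StudentType value.
row1 = {"Student": student, "CrsCode": "CS532", "Grade": "A"}
row2 = {"Student": student, "CrsCode": "CS308", "Grade": "B"}
row1["Student"] = row1["Student"].with_name("John Smith")
# row2 still holds the old name: the consistency problem noted above.

# Typed-table analogue: rows store an oid, the object lives in one place.
STUDENT2 = {"oid-1": student}
t1 = {"Student": "oid-1", "CrsCode": "CS532", "Grade": "A"}
t2 = {"Student": "oid-1", "CrsCode": "CS308", "Grade": "B"}
STUDENT2["oid-1"] = STUDENT2["oid-1"].with_name("John Smith")
# Both t1 and t2 now resolve to the updated student.
```

Making the class frozen is a deliberate choice for the sketch: it forces every change to go through a method that returns a fresh value, which is the behaviour the slides attribute to mutators.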

Reference Types and Self-Referencing Columns
• To reference objects, use self-referencing columns + reference types: REF(some-UDT)

    CREATE TABLE TRANSCRIPT1 (
        Student REF(StudentType)   -- reference type
                SCOPE STUDENT2,    -- typed table where the values are drawn from
        CrsCode CHAR(6),
        Semester CHAR(6),
        Grade CHAR(1)
    )

• Two issues:
  • How does one query the attributes of a reference type
  • How does one provide values for the attributes of type REF(…)
    – Remember: you can’t manufacture these values out of thin air – they are oids!

Querying Reference Types
• Recall: Student REF(StudentType) SCOPE STUDENT2 in TRANSCRIPT1. How does one access, for example, student names?
• SQL:1999 has the same misfeature as C/C++ has (and which Java and OQL do not have): it distinguishes between objects and references to objects. To pass through a boundary of REF(…) use “->” instead of “.”

    SELECT T.Student->Name, T.Grade
    FROM TRANSCRIPT1 T
    WHERE T.Student->Address.Street = 'Main St.'

  – Crossing REF(…) boundary (after Student), use “->”
  – Not crossing REF(…) boundary (between Address and Street), use “.”

Inserting REF Values
• How does one give values to REF attributes, like Student in TRANSCRIPT1?
• Use explicit self-referencing columns, like stud_oid in STUDENT2
• Example: Creating a TRANSCRIPT1 record whose Student attribute has an object reference to an object in STUDENT2:

    INSERT INTO TRANSCRIPT1(Student,CrsCode,Semester,Grade)
    SELECT S.stud_oid, 'HIS666', 'F1462', 'D'   -- explicit self-referential column of STUDENT2
    FROM STUDENT2 S
    WHERE S.Id = 111111111

Modifications to support ORDBMS
• Parsing
  – Type-checking for methods pretty complex.
• Query Rewriting
  – Often useful to turn path exprs into joins!
• Optimization
  – New algebra operators needed for complex types.
    • Must know how to integrate them into optimization.
  – WHERE clause exprs can be expensive!
    • Selection pushdown may be a bad idea.
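To make the "turn path exprs into joins" remark concrete, here is a minimal Python sketch that evaluates the query on TRANSCRIPT1 two ways: by following the reference for every row, and by rewriting the path expression as a join on the oid. It only illustrates the idea and says nothing about how any particular DBMS implements it. The sample rows, the oid strings, and the flattened `Street` field (in place of Address.Street) are all assumptions made for the example.

```python
# Typed table STUDENT2: self-referencing column value (oid) -> tuple.
STUDENT2 = {
    "oid-1": {"Id": 111111111, "Name": "Joe Public", "Street": "Main St."},
    "oid-2": {"Id": 222222222, "Name": "Ann Lee", "Street": "Maple Dr."},
}

# TRANSCRIPT1: the Student column holds a reference (an oid), not a value.
TRANSCRIPT1 = [
    {"Student": "oid-1", "CrsCode": "CS532", "Grade": "A"},
    {"Student": "oid-2", "CrsCode": "CS532", "Grade": "B"},
    {"Student": "oid-1", "CrsCode": "HIS666", "Grade": "D"},
]


def by_dereference(street):
    """Evaluate T.Student->... by chasing the reference for every row."""
    result = []
    for t in TRANSCRIPT1:
        s = STUDENT2[t["Student"]]          # crossing the REF boundary
        if s["Street"] == street:
            result.append((s["Name"], t["Grade"]))
    return result


def by_join(street):
    """Same query rewritten as a join: filter STUDENT2 once, then match oids."""
    matching = {oid: s for oid, s in STUDENT2.items()
                if s["Street"] == street}
    return [(matching[t["Student"]]["Name"], t["Grade"])
            for t in TRANSCRIPT1
            if t["Student"] in matching]
```

Both functions return the same rows for the same `street`. The join form applies the selection to STUDENT2 once instead of once per transcript row, which is one reason such a rewrite can pay off.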

Modifications (Contd.)
• Execution
  – New algebra operators for complex types.
  – OID generation & reference handling.
  – Dynamic linking.
  – Support “untrusted” methods.
  – Support objects bigger than 1 page.

Modifications (Contd.)
• Access Methods
  – Indexes on methods, not just columns.
  – Need indexes for new WHERE clause exprs (not just <, >, =)!
• Data Layout
  – Clustering of nested objects.
  – Chunking of arrays.

OO/OR-DBMS Summary
• Traditional SQL is too limited for new apps.
• OODBMS: Persistent OO programming.
  – Difficult to use, no query language.
• ORDBMS: Best (?) of both worlds:
  – Catching on in industry and applications.
  – Pretty easy for SQL folks to pick up.
  – Still has growing pains (SQL-3 standard still a moving target).

Summary (Contd.)
• ORDBMS offers many new features.
  – But not clear how to use them!
  – Schema design techniques not well understood
  – Query processing techniques still in research phase.
• A moving target for OR DBA’s!