Some high level language constructs for data of type relation
By Joachim W. Schmidt
University of Hamburg, West Germany

Background
• 1970 – “A relational model of data for large shared data banks” by Codd
• 1974 – SEQUEL (Structured English Query Language) is introduced
  – Textual interface for altering and querying a relational database
• Mid-1970s – Use of relational databases grows rapidly
• Databases came to address:
  – Large amounts of data
  – Data with internal connections
  – Data that must be available to many users
• 1976 – EQUEL (Embedded QUEL) is introduced
  – Adds preprocessor instructions to C that are translated to database API calls

Motivation
• The database and its data language form an independent system
  – An additional library is required in order to work with such a database
• Data is interchanged with user programs through fixed interfaces in the form of I/O areas
  – A query is a string that can be checked only at run time
  – Result types can be checked only at run time
  – Database types do not always correspond to native types in the programming language

Example
    Database db("localhost", "dbuser", "", "testdb");
    Query q(db);
    // The query is an opaque string: a typo in it surfaces only at run time.
    q.get_result("select num,name from employees");
    while (q.fetch_row())
    {
        // Column types are assumed by the caller, not checked by the compiler.
        long num = q.getval();
        std::string name = q.getstr();
        printf("employee#%ld: %s\n", num, name.c_str());
    }
    q.free_result();

PASCAL-R
• Enhances the Pascal language with constructs for relational database management
  – Constructs are part of the language
  – Allow altering and querying relations
  – Types can be checked at compile time
  – User friendly
• The user does not program a procedure that produces the result, but declares certain properties of the result

Language data types
• Record
  – Ordered set of fields, each of scalar or string type
• Relation
  – Ordered set of records of the same type
• Key
  – Ordered subset of the record fields that uniquely identifies a record in a relation

Example
    type
        employeeRecord = record
            employeeNumber,
            employeeStatus : integer;
            employeeName : string
        end;
        employeeRelation = relation <employeeNumber> of employeeRecord;
    var
        employee : employeeRecord;
        employees : employeeRelation;

Altering relations operations
• Rel1 :+ Rel2
  – Inserts into Rel1 copies of the records of Rel2 whose key values do not already occur in any record of Rel1
• Rel1 :- Rel2
  – Removes from Rel1 all the records contained in Rel2
• Rel1 :& Rel2
  – Replaces in Rel1 all the records whose key occurs in a record of Rel2 with that Rel2 record
• Rel1 := Rel2
  – Relation assignment

Elementary relation constructors
• Rel1 := []
  – Initializes an empty relation
• Rel1 := [record]
  – Initializes a one-record relation from a record variable

Elementary retrieval operation
• rel↑
  – Implicitly declared buffer that holds the result of a retrieval operation
• low(rel)
  – Assigns to rel↑ the record with the lowest key value
• next(rel)
  – Assigns to rel↑ the record with the next key value
• aor(rel)
  – Boolean function that returns true once all of the records have been visited ("all of relation")

Example
    var
        employees, result : employeeRelation;
    begin
        result := [];
        low(employees);
        while not aor(employees) do
        begin
            if employees↑.employeeStatus = 2 then
                result :+ [employees↑];
            next(employees)
        end
    end

Disadvantages of this construct
• Unnecessary ordering of records
• Does not support nesting
• Hard to optimize

Repetition statement - foreach
• The foreach statement iterates over a relation in an arbitrary order, assigning the value of a single record to a control variable in each iteration
• The control variable is declared implicitly and has the same record type as the records that make up the relation

Example
    begin
        result := [];
        foreach curEmployee in employees do
            if curEmployee.employeeStatus = 2 then
                result :+ [curEmployee]
    end

Another example
    type
        timeTableRecord = record
            employeeNumber, courseNumber, lectureTime : integer;
            dayOfWeek, room : string
        end;
        timeTableRelation = relation <employeeNumber, courseNumber, dayOfWeek> of timeTableRecord;
    var
        timeTable :
        timeTableRelation;
    begin
        result := [];
        foreach curEmployee in employees do
            foreach curLecture in timeTable do
                if (curEmployee.employeeNumber = curLecture.employeeNumber) and
                   (curLecture.dayOfWeek = 'Friday') then
                    result :+ [curEmployee];
    end.

Disadvantages of this construct
• Iterating over the entire inner relation is sometimes redundant
  – The same record can be added several times to the result relation
• One way to resolve this is a Boolean variable
• Another way would be some kind of break statement
• It is not always obvious which control variable is used to construct the result relation and which is only used to test a condition
  – In the above example the order of the two loops can be changed without affecting the result

Predicates over relations
• Motivation: to abstract away the technical way in which the condition is evaluated
  – The inner loop in the above example is replaced by a predicate
• A predicate is composed of a quantifier, a control variable, a range and a logical expression
  – The quantifiers are 'all' and 'some'
• Predicates can contain several quantifiers

Example
    begin
        foreach curEmployee in employees do
            if some curLecture in timeTable
                   ((curEmployee.employeeNumber = curLecture.employeeNumber) and
                    (curLecture.dayOfWeek = 'Friday')) then
                result :+ [curEmployee]
    end

Advantages of predicates
• The user does not have to add an explicit exit
  – This problem has been shifted to the implementer
• An efficient implementation can parallelize the record processing
  – No implicit order

Nested quantifier example
    type
        courseRecord = record
            courseNumber, courseLvl : integer;
            courseName : string
        end;
        courseRelation = relation <courseNumber> of courseRecord;
    var
        courses : courseRelation;
    begin
        result := [];
        foreach curEmployee in employees do
            if all curLecture in timeTable
                   ((curEmployee.employeeNumber <> curLecture.employeeNumber) or
                    some course in courses
                        ((curLecture.courseNumber = course.courseNumber) and
                         (course.courseLvl = 1))) then
                result :+ [curEmployee];
    end

Construction of sub relations
• Motivation:
  – All the above examples constructed a relation by:
    • Iterating over the source relation sequentially
    • Testing a condition
    • Adding records that satisfy the condition to the result relation one at a time
  – This can be simplified and optimized with a sub-relation constructor
• Introducing the 'each' construct for relations

Examples
    begin
        result1 := [each employee in employees: employee.employeeStatus = 2];
        result2 := [each employee in employees:
                       some lecture in timeTable
                           ((employee.employeeNumber = lecture.employeeNumber) and
                            (lecture.dayOfWeek = 'Friday'))];
    end.

General relation constructor
• The previous 'each' construct was limited to a single relation
• The general form can construct a relation from several other relations
  – Several relations are tested together, each with its own control variable
  – The fields of the result relation can be taken from the fields of any of those relations
  – The logical expression that selects records for the result relation can use fields from any of those relations

Example
    type
        publicationRecord = record
            title : string;
            year, employeeNumber : integer
        end;
        publicationRelation = relation <title, employeeNumber> of publicationRecord;
    var
        publications : publicationRelation;
    begin
        result := [each (employee.employeeName, publication.title, publication.year)
                   for employee, publication in employees, publications:
                       (publication.employeeNumber = employee.employeeNumber) and
                       some lecture in timeTable
                           (employee.employeeNumber = lecture.employeeNumber)]
    end.
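
For readers without a PASCAL-R compiler, the general relation constructor above can be approximated in Python. This is only a sketch under stated assumptions, not the PASCAL-R implementation: relations are modelled as plain Python sets of named tuples, the 'some' quantifier is modelled with the built-in any(), the snake_case field names are adaptations of the slide's identifiers, and all sample data is hypothetical.

```python
from typing import NamedTuple


# Record types mirroring the slide's declarations (field names adapted to
# Python style; only the fields used by the query are kept for Lecture).
class Employee(NamedTuple):
    employee_number: int
    employee_status: int
    employee_name: str


class Lecture(NamedTuple):
    employee_number: int
    course_number: int
    day_of_week: str


class Publication(NamedTuple):
    title: str
    year: int
    employee_number: int


# Hypothetical sample relations, modelled as sets of records.
employees = {
    Employee(1, 2, "Ada"),
    Employee(2, 1, "Brian"),
    Employee(3, 2, "Chen"),
}
time_table = {
    Lecture(1, 101, "Friday"),
    Lecture(1, 102, "Friday"),
    Lecture(2, 103, "Monday"),
}
publications = {
    Publication("Relations in Practice", 1976, 1),
    Publication("Typed Queries", 1977, 3),
}

# Counterpart of:
#   [each (employee.employeeName, publication.title, publication.year)
#    for employee, publication in employees, publications:
#        (publication.employeeNumber = employee.employeeNumber) and
#        some lecture in timeTable (employee.employeeNumber = lecture.employeeNumber)]
result = {
    (e.employee_name, p.title, p.year)
    for e in employees
    for p in publications
    if p.employee_number == e.employee_number
    and any(lec.employee_number == e.employee_number for lec in time_table)
}

print(result)
```

With this sample data, Ada has both a publication and at least one lecture, so she contributes one tuple. Chen has a publication but no lecture, and Brian has a lecture but no publication, so neither appears. Ada's two Friday lectures do not produce a duplicate, because any() only tests for existence and the result is a set.
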
Advantages
• Very high level language constructs
  – The user does not program a procedure but only declares the required properties of the result
• Highly readable
  – The entire code for producing the result relation is in one place
• Can be implemented efficiently

Implementation
• These constructs were implemented in a PASCAL compiler
  – The compiler was modified to accept the syntax
  – A run-time library was added to handle the execution of the predicates and relational constructors

Conclusions
• The expressive power of the proposed constructs is satisfactory
• The work did not address some traditional database problems
  – Simultaneous access
  – Data integrity
• Type checking is possible, but requires the database schema to be known at compile time and to remain unchanged at run time