Some High Level Language Constructs for Data of Type Relation
By Joachim W. Schmidt
University of Hamburg, West Germany
Background
• 1970 – “A Relational Model of Data for Large Shared Data Banks” by Codd
• 1974 – SEQUEL (Structured English Query Language) is introduced
– A textual interface for altering and querying relational databases
• Mid-1970s – Use of relational databases grows rapidly
• Databases came to address:
– Large amounts of data
– Data with internal connections
– Data that must be available to many users
• 1976 – EQUEL (Embedded QUEL) is introduced
– Adds preprocessor statements to C that are translated into database API calls
Motivation
• The database and its data language form an independent system
– An additional library is required in order to work with such a database
• Data is exchanged with user programs through fixed interfaces in the form of I/O areas
– A query is a string that can be checked only at run time
– Result types can be checked only at run time
– Database types do not always correspond to native types in the programming language
Example
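// Connection parameters and the query text are plain strings.
Database db("localhost", "dbuser", "", "testdb");
Query q(db);
// The query is checked only when it is sent to the database at run time.
q.get_result("select num,name from employees");
while (q.fetch_row())
{
    // Column types are unknown to the compiler; a mismatch shows up at run time.
    long num = q.getval();
    std::string name = q.getstr();
    printf("employee#%ld: %s\n", num, name.c_str());
}
q.free_result();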
PASCAL-R
• Enhances the Pascal language with constructs for relational database management
– Constructs are part of the language
– Allow altering and querying relations
– Types can be checked at compile time
– User friendly
• The user does not program a procedure that produces the result but declares certain properties of the result
Language data types
• Record
– Ordered set of fields, each of which is of
scalar type or string
• Relation
– Ordered set of records of the same type
• Key
– Ordered subset of the record fields that
uniquely identify a record in a relation
Example
type
  employeeRecord =
    record
      employeeNumber, employeeStatus: integer;
      employeeName: string
    end;
  employeeRelation =
    relation <employeeNumber> of employeeRecord;
var
  employee: employeeRecord;
  employees: employeeRelation;
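As a rough analogy outside PASCAL-R, the declarations above can be modelled in Python. This is only a sketch: the names `EmployeeRecord` and `key_of`, the sample employees, and the choice of a dict indexed by key to stand in for a keyed relation are all assumptions made for illustration. Unlike PASCAL-R, Python does not check these types at compile time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EmployeeRecord:
    # Field names follow the PASCAL-R declaration above.
    employee_number: int
    employee_status: int
    employee_name: str


def key_of(record: EmployeeRecord) -> tuple:
    # Corresponds to the key <employeeNumber>.
    return (record.employee_number,)


# A relation is modelled as a dict from key to record, so a key value
# can identify at most one record. The sample records are invented.
employees: dict = {}
for rec in (EmployeeRecord(1, 2, "Ada"), EmployeeRecord(2, 1, "Bob")):
    employees[key_of(rec)] = rec
```

Looking up `employees[(1,)]` then returns the single record whose key is 1.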
Altering relations operations
• Rel1 :+ Rel2
– Inserts into Rel1 copies of the records of Rel2 whose key values do not already occur in any record of Rel1
• Rel1 :- Rel2
– Removes from Rel1 all the records contained in Rel2
• Rel1 :& Rel2
– Replaces in Rel1 all the records whose key occurs in a record of Rel2
• Rel1 := Rel2
– Relation assignment
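The four operators above can be sketched in Python, again modelling a relation as a dict from key to record. The function names `insert`, `delete` and `replace` are hypothetical, and the reading of `:-` as "remove a record only if the whole record matches" is an assumption based on the wording of the slide, not a statement about the PASCAL-R implementation.

```python
def insert(rel1, rel2):
    """Rel1 :+ Rel2: add records whose key is not yet in rel1."""
    for key, record in rel2.items():
        rel1.setdefault(key, record)


def delete(rel1, rel2):
    """Rel1 :- Rel2: remove records of rel1 that also occur in rel2."""
    for key, record in rel2.items():
        if key in rel1 and rel1[key] == record:
            del rel1[key]


def replace(rel1, rel2):
    """Rel1 :& Rel2: overwrite records of rel1 whose key occurs in rel2."""
    for key, record in rel2.items():
        if key in rel1:
            rel1[key] = record


# Relations map employee number -> (status, name); the data is invented.
rel1 = {1: (2, "Ada"), 2: (1, "Bob")}
rel2 = {2: (3, "Bob"), 3: (2, "Cy")}

insert(rel1, rel2)             # key 3 is new; key 2 exists and is kept
replace(rel1, rel2)            # key 2 now carries rel2's record
delete(rel1, {1: (2, "Ada")})  # whole record matches, so it is removed
rel3 = dict(rel1)              # Rel3 := Rel1 copies the relation
```

After these calls `rel1` holds the records with keys 2 and 3, and `rel3` is an independent copy of it.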
Elementary relation constructors
• Rel1 := []
– Initializes an empty relation
• Rel1 := [record]
– Initializes a one record relation from a record
variable
Elementary retrieval operation
• rel↑
– Implicitly declared buffer that holds the result of a retrieval operation
• low(rel)
– Assigns to rel↑ the record with the lowest key value
• next(rel)
– Assigns to rel↑ the record with the next key value
• aor(rel)
– Boolean function that returns true when all of the records have been visited ("all of relation")
Example
var
  employees, result : employeeRelation;
begin
  result := [];
  low(employees);
  while not aor(employees) do
  begin
    if employees↑.employeeStatus = 2 then
      result :+ [employees↑];
    next(employees)
  end
end
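The cursor-style retrieval above can be imitated in Python to make its cost visible. This is a sketch only: the `Cursor` class, its `buffer` property (playing the role of rel↑) and the sample data are assumptions, not part of PASCAL-R.

```python
class Cursor:
    """Hypothetical model of low/next/aor over a relation stored as a dict."""

    def __init__(self, relation):
        self.relation = relation
        self.keys = []
        self.pos = 0

    def low(self):
        # Position on the record with the lowest key value.
        self.keys = sorted(self.relation)
        self.pos = 0

    def next(self):
        # Advance to the record with the next key value.
        self.pos += 1

    def aor(self):
        # True once every record has been visited.
        return self.pos >= len(self.keys)

    @property
    def buffer(self):
        # Plays the role of rel↑.
        return self.relation[self.keys[self.pos]]


employees = {
    3: {"status": 2, "name": "Cy"},
    1: {"status": 2, "name": "Ada"},
    2: {"status": 1, "name": "Bob"},
}

result = []
cur = Cursor(employees)
cur.low()
while not cur.aor():
    if cur.buffer["status"] == 2:
        result.append(cur.buffer["name"])
    cur.next()
```

The sort inside `low` is work the selection itself never needed, since the query only asks which records have status 2.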
Disadvantages of this construct
• Unnecessary ordering of records
• Does not support nesting
• Hard to optimize
Repetition statement - foreach
• The foreach statement iterates over a relation in an arbitrary order, assigning a single record to a control variable in each iteration
• The control variable is declared implicitly to have the same record type as the records that make up the relation
Example
begin
  result := [];
  foreach curEmployee in employees do
    if curEmployee.employeeStatus = 2 then
      result :+ [curEmployee]
end
Another example
type
  timeTableRecord = record employeeNumber, courseNumber, lectureTime : integer;
                           dayOfWeek, room : string end;
  timeTableRelation = relation <employeeNumber, courseNumber, dayOfWeek> of
                      timeTableRecord;
var
  timeTable : timeTableRelation;
begin
  result := [];
  foreach curEmployee in employees do
    foreach curLecture in timeTable do
      if (curEmployee.employeeNumber = curLecture.employeeNumber)
         and (curLecture.dayOfWeek = 'Friday')
      then result :+ [curEmployee];
end.
Disadvantages of this construct
• Iterating over the entire inner relation is sometimes redundant
– The same record can be added to the result relation several times
• One way to resolve this is a Boolean variable
• Another way would be some kind of break statement
• It is not always obvious which control variable is used to construct the result relation and which is used only to test a condition
– In the above example the order of the two loops can be changed without affecting the result
Predicates over relations
• Motivation: to abstract the technical way in
which the condition is evaluated
– The inner loop in the above example is
replaced by a predicate
• A predicate is composed of a quantifier,
control variable, range and the logical
expression
– Quantifiers are ‘all’ and ‘some’
• Predicates can contain several quantifiers
Example
begin
  result := [];
  foreach curEmployee in employees do
    if some curLecture in timeTable
        ((curEmployee.employeeNumber = curLecture.employeeNumber) and
         (curLecture.dayOfWeek = 'Friday'))
    then result :+ [curEmployee]
end
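To see what the `some` predicate saves the programmer, the nested-loop version and the predicate version can be compared in Python, where the built-in `any` plays a similar role. The data and variable names here are invented for illustration, and tuples stand in for timetable records.

```python
employees = {1: "Ada", 2: "Bob", 3: "Cy"}
# (employeeNumber, courseNumber, dayOfWeek) triples
time_table = [(1, 10, "Friday"), (1, 11, "Friday"), (2, 10, "Monday")]

# Nested loops: the inner loop needs an explicit break to stop early
# and to avoid adding the same employee twice.
result_loops = []
for number, name in employees.items():
    for emp, _course, day in time_table:
        if emp == number and day == "Friday":
            result_loops.append(name)
            break

# Predicate form: `some curLecture in timeTable (...)` maps to any(),
# which stops at the first matching lecture on its own.
result_pred = [
    name
    for number, name in employees.items()
    if any(emp == number and day == "Friday" for emp, _c, day in time_table)
]
```

Both versions select the same employees; in the second, the early exit is the responsibility of `any` rather than of the user's code.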
Advantages of predicates
• The user does not have to add an explicit exit
– This problem has been shifted to the implementer
• An efficient implementation can parallelize the record processing
– No order is imposed on the records
Nested quantifier example
type
  courseRecord = record courseNumber, courseLvl : integer; courseName : string
                 end;
  courseRelation = relation <courseNumber> of courseRecord;
var
  courses : courseRelation;
begin
  result := [];
  foreach curEmployee in employees do
    if all curLecture in timeTable
        ((curEmployee.employeeNumber <> curLecture.employeeNumber) or
         some course in courses
           ((curLecture.courseNumber = course.courseNumber)
            and (course.courseLvl = 1)))
    then result :+ [curEmployee];
end
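The nested quantifiers above read as "every lecture either belongs to someone else or is for a level 1 course". A Python sketch with `all` and `any` makes the logic easy to check by hand; the data is invented and tuples and dicts stand in for the PASCAL-R relations.

```python
employees = {1: "Ada", 2: "Bob", 3: "Cy"}
time_table = [(1, 10), (1, 11), (2, 10), (2, 20)]  # (employeeNumber, courseNumber)
courses = {10: 1, 11: 1, 20: 2}                    # courseNumber -> courseLvl

result = [
    name
    for number, name in employees.items()
    if all(
        emp != number
        or any(c == course and lvl == 1 for c, lvl in courses.items())
        for emp, course in time_table
    )
]
```

Here Ada qualifies because both of her courses are level 1, Bob does not because course 20 is level 2, and Cy qualifies vacuously because no lecture in the timetable is his. The vacuous case is worth keeping in mind whenever `all` is used this way.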
Construction of sub relations
• Motivation:
– All the above examples construct a relation by:
• Iterating over the source relation sequentially
• Testing a condition
• Adding records that satisfy the condition to the result relation one at a time
– This can be simplified and optimized by constructing a sub relation directly
• Introducing the ‘each’ construct for relations
Examples
begin
  result1 := [each employee in employees:
              employee.employeeStatus = 2];
  result2 := [each employee in employees:
              some lecture in timeTable
                ((employee.employeeNumber = lecture.employeeNumber) and
                 (lecture.dayOfWeek = 'Friday'))];
end.
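The `each` constructor is close in spirit to a comprehension. The following Python sketch mirrors the two assignments above under the same modelling assumptions as before (a relation is a dict from key to record, and the data is invented).

```python
employees = {
    1: {"name": "Ada", "status": 2},
    2: {"name": "Bob", "status": 1},
    3: {"name": "Cy", "status": 2},
}
time_table = [(1, "Friday"), (2, "Friday"), (3, "Monday")]  # (employeeNumber, dayOfWeek)

# result1 := [each employee in employees: employee.employeeStatus = 2]
result1 = {k: e for k, e in employees.items() if e["status"] == 2}

# result2 := [each employee in employees: some lecture in timeTable (...)]
result2 = {
    k: e
    for k, e in employees.items()
    if any(emp == k and day == "Friday" for emp, day in time_table)
}
```

In both cases the result is described by a condition on its members, with no loop, accumulator or insertion written by the user.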
General relation constructor
• Previous ‘each’ construct was limited to one
relation only
• The general form can construct a relation from
several other relations
– Several relations are tested together, each has its
own control variable
– The result relation fields can be taken from any of
those relation fields
– The logical expression to add records to the result
relation can be made of fields from any relation
Example
type
  publicationRecord = record title: string; year, employeeNumber: integer
                      end;
  publicationRelation = relation <title, employeeNumber> of publicationRecord;
var
  publications: publicationRelation;
begin
  result := [each (employee.employeeName, publication.title, publication.year)
             for employee, publication in employees, publications:
             (publication.employeeNumber = employee.employeeNumber) and
             some lecture in timeTable (employee.employeeNumber =
                                        lecture.employeeNumber)]
end.
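The general constructor above combines two relations and picks fields from each, which corresponds to a comprehension with two generators. A minimal Python sketch follows; the publication titles, years and other data are invented, and a set of tuples stands in for the result relation.

```python
employees = {1: "Ada", 2: "Bob", 3: "Cy"}  # employeeNumber -> employeeName
# (title, year, employeeNumber) triples
publications = [("Sets", 1975, 1), ("Keys", 1976, 2), ("Types", 1977, 3)]
time_table = [(1, 10), (3, 12)]            # (employeeNumber, courseNumber)

result = {
    (name, title, year)
    for number, name in employees.items()
    for title, year, author in publications
    if author == number
    and any(emp == number for emp, _course in time_table)
}
```

Bob has a publication but no lecture, so only the rows for Ada and Cy appear in `result`.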
Advantages
• Very high level language constructs
– The user does not program a procedure but only declares the required properties of the result
• Highly readable
– The entire code for producing the result relation is in one place
• Can be implemented efficiently
Implementation
• These constructs were implemented in a PASCAL compiler
– The compiler was modified to accept the syntax
– A run-time library was added to handle the execution of the predicates and relation constructors
Conclusions
• The expressive power of the proposed constructs is satisfactory
• The paper does not address some of the traditional database problems
– Simultaneous access
– Data integrity
• Type checking is possible, but requires the database schema to be known at compile time and to remain unchanged at run time