File: X13_Tables.doc
By: J.M. Smeenk
Revision: 5/20/09 8:49:38 AM
13. TABLES.
A relational table in BPL is a non-primitive type called tbl used to store
database information. A table is a directory whose entries are all matrices or
vectors of primitive types. Relational database theory is well-known and is
employed in the popular language SQL. While BPL may not replace SQL, BPL has
extensive relational database capabilities.
13.1. TABLE CREATION, INSERTION, AND ARRANGING.
Creation of a table is straightforward:
NameMat: -1 5#"AdamsJonesSmith"
** -1 MEANS "AS MANY AS THE DATA INDICATE" (I.E., 3 ROWS)
DeptVec: 123 456 456
SalaryVec: 40000 50000 60000
Employees."Name DeptNo Salary":: tbl.{lit[,],int[],int[]}: ++
{NameMat,DeptVec,SalaryVec}
disp: 3
** USE FRAMED OUTPUT FOR NESTED ARRAYS AND DIRECTORIES
Employees ** VALUE OF TABLE IS A NESTED MATRIX OF NAMES & REFERENTS
+-----+------+------+
|Name |DeptNo|Salary|
+-----+------+------+
|Adams|123   |40000 |
|Jones|456   |50000 |
|Smith|456   |60000 |
+-----+------+------+
Table fields can be displayed in any order (rather than the default of order of
insertion into the table), with fields omitted if desired:
Employees."Salary DeptNo"
+------+------+
|Salary|DeptNo|
+------+------+
|40000 |123   |
|50000 |456   |
|60000 |456   |
+------+------+
A new employee can be inserted easily:
Employees:insert {"McDonald",789,50000}
** MATRIX Name OF EMPLOYEES BECOMES PADDED TO ACCOMMODATE LONGER NAME
Employees
+--------+------+------+
|Name    |DeptNo|Salary|
+--------+------+------+
|Adams   |123   |40000 |
|Jones   |456   |50000 |
|Smith   |456   |60000 |
|McDonald|789   |50000 |
+--------+------+------+
If an insertion supplies data wider than an existing matrix field, and that
matrix was declared without a fixed upper bound on its width, the matrix is
internally redeclared to the new width and its rows are padded with the
appropriate delimiter. If, however, the matrix was declared with an upper bound
on its width and the inserted data exceeds that bound, an error ensues.
The Employees table can be sorted by Salary as ascending primary key and Name
as descending secondary key:
Employ: Employees arrange "Salary -Name"
** arrange KEYWORD MEANS "TABLE SORT"; KEYWORD sort IS ALREADY ++
**   DEFINED FOR NON-TABLE ARRAYS.
** PRIMARY KEY IS Salary, SECONDARY KEY IS Name
** NO PREFIX INDICATES ASCENDING SORT, - INDICATES DESCENDING SORT
Employ
+--------+------+------+
|Name    |DeptNo|Salary|
+--------+------+------+
|Adams   |123   |40000 |
|McDonald|789   |50000 |
|Jones   |456   |50000 |
|Smith   |456   |60000 |
+--------+------+------+
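For readers more used to general-purpose languages, the same two-key ordering can be sketched in Python. This is only an illustration of the idea, not a description of how BPL implements arrange; the list-of-tuples layout and the function name arrange_salary_name are assumptions made for this example.

```python
# Hypothetical Python stand-in for the Employees table: one tuple per record,
# laid out as (Name, DeptNo, Salary).
employees = [
    ("Adams", 123, 40000),
    ("Jones", 456, 50000),
    ("Smith", 456, 60000),
    ("McDonald", 789, 50000),
]

def arrange_salary_name(rows):
    """Sort by Salary ascending (primary), then by Name descending (secondary)."""
    # Python's sort is stable, so sort on the secondary key first
    # and on the primary key second.
    by_name_desc = sorted(rows, key=lambda r: r[0], reverse=True)
    return sorted(by_name_desc, key=lambda r: r[2])

employ = arrange_salary_name(employees)
for row in employ:
    print(row)
```

The two records with Salary 50000 come out as McDonald and then Jones, which is the order shown in the Employ table above.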
13.2. TABLE QUERIES.
To make relational tables more powerful, BPL allows table queries that can have
the form Table.Namelist\(.FieldName DyadicFn Expression). The namelist may be
replaced by a single name or else by "" (which signifies all field names). For
example, the following code returns all employee names in department 456 and
their salaries:
Employees."Name Salary"\(.DeptNo=456)
+--------+------+
|Name    |Salary|
+--------+------+
|Jones   |50000 |
|Smith   |60000 |
+--------+------+
The \(...) query portion, which chooses the appropriate records of the table,
uses field names of the table preceded by a period (.). An expression such as .DeptNo is
erroneous outside of table queries since it is really an abbreviation for
Employees.DeptNo. The DyadicFn is typically a relational (<, <=, =, >=, >, <>)
or similar function (in, is, notin, isnot) or a rowwise relational (`<, `<=,
`=, `>=, `>, `<>) or a rowwise similar function (`in, `notin). The meta
operator ` modifies a function so that it works on rows rather than individual
items, so that comparisons are made at a "word" level rather than a "character"
level.
More complicated query expressions such as the following can be written:
Employees\((.Name `= "Adams") or .Salary>=60000)
** WHEN NAMELIST IS OMITTED IN TABLE QUERY, ALL FIELDS ARE RETURNED
+-----+------+------+
|Name |DeptNo|Salary|
+-----+------+------+
|Adams|123   |40000 |
|Smith|456   |60000 |
+-----+------+------+
or
Employees."Salary DeptNo"\((.Name eq "Adams") or .Salary>=60000)
+------+------+
|Salary|DeptNo|
+------+------+
|40000 |123   |
|60000 |456   |
+------+------+
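The query form above combines a record selection with a choice of fields. A rough Python sketch of that combination follows; the dictionary-of-columns layout and the helper name query are assumptions for illustration only and are not part of BPL.

```python
# Hypothetical column-oriented stand-in for the Employees table.
employees = {
    "Name":   ["Adams", "Jones", "Smith", "McDonald"],
    "DeptNo": [123, 456, 456, 789],
    "Salary": [40000, 50000, 60000, 50000],
}

def query(table, fields, predicate):
    """Return the named fields of every record for which predicate is true."""
    n = len(next(iter(table.values())))
    # Build one dict per record so the predicate can refer to fields by name.
    records = [{name: col[i] for name, col in table.items()} for i in range(n)]
    # One bit per record: True where the record is selected.
    mask = [bool(predicate(rec)) for rec in records]
    return {f: [v for v, keep in zip(table[f], mask) if keep] for f in fields}

result = query(
    employees,
    ["Salary", "DeptNo"],
    lambda r: r["Name"] == "Adams" or r["Salary"] >= 60000,
)
print(result)
```

The order of names in the fields list determines the order of the returned columns, mirroring the way the namelist controls field order in the query.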
Reassignments may be done as well.
To change Adams' name to Parker:
Employees.Name\(.Name eq "Adams"): "Parker"
** ASSUMES THERE IS ONLY ONE Adams IN THE TABLE
Employees
+--------+------+------+
|Name    |DeptNo|Salary|
+--------+------+------+
|Parker  |123   |40000 |
|Jones   |456   |50000 |
|Smith   |456   |60000 |
|McDonald|789   |50000 |
+--------+------+------+
To give a $10000 raise to all employees in department 456:
Employees.Salary\(.DeptNo=456):+ 10000
Employees
+--------+------+------+
|Name    |DeptNo|Salary|
+--------+------+------+
|Parker  |123   |40000 |
|Jones   |456   |60000 |
|Smith   |456   |70000 |
|McDonald|789   |50000 |
+--------+------+------+
13.3. TABLE DELETIONS.
A table always remembers which records were most recently selected in the bit
vector mask Table\(). (If no selection has yet occurred, Table\() is a mask
vector of 0s whose length is the number of records in the table, indicating
that none of the records have been selected.) For example:
Employees\()
0 1 1 0
** MOST RECENT QUERY TO Employees WAS .DeptNo=456; 2 OF 4 RECORDS ++
**   HAVE DEPARTMENT 456
Employees\(.Name eq "Parker")
1 0 0 0
** NEW QUERY MASK REPLACES OLD MASK OF 0 1 1 0
** Employees\()
1 0 0 0
Employees\(): 1 1 1 0 -- MASK MAY BE SET IF DESIRED
Employees
+--------+------+------+
|Name    |DeptNo|Salary|
+--------+------+------+
|Parker  |123   |40000 |
|Jones   |456   |60000 |
|Smith   |456   |70000 |
|McDonald|789   |50000 |
+--------+------+------+
One or more records can be queried and deleted:
Employees: delete Employees\((.Name eq "Parker") or .Salary>=60000)
or alternatively:
Employees: delete Employees\()
since the explicit query references the same records as the masked query. By
either method, the result is:
Employees
+--------+------+------+
|Name    |DeptNo|Salary|
+--------+------+------+
|McDonald|789   |50000 |
+--------+------+------+
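The idea of a table that remembers its most recent selection as a bit mask, and deletes by that mask, can be sketched in Python as follows. The class Table and its methods are hypothetical names chosen for this sketch, and resetting the mask to all 0s after a deletion is an assumption; the text above does not say what BPL does with the mask at that point.

```python
class Table:
    """Hypothetical column-oriented table that remembers its last selection."""

    def __init__(self, columns):
        self.columns = {k: list(v) for k, v in columns.items()}
        self.mask = [0] * self.num_rows()   # nothing selected yet

    def num_rows(self):
        return len(next(iter(self.columns.values())))

    def select(self, predicate):
        """Store and return a 0/1 mask marking the records that match."""
        names = list(self.columns)
        rows = zip(*self.columns.values())
        self.mask = [int(bool(predicate(dict(zip(names, row))))) for row in rows]
        return self.mask

    def delete_selected(self):
        """Drop every record whose mask bit is 1."""
        for name, col in self.columns.items():
            self.columns[name] = [v for v, m in zip(col, self.mask) if not m]
        self.mask = [0] * self.num_rows()   # assumption: selection is cleared

emp = Table({
    "Name":   ["Parker", "Jones", "Smith", "McDonald"],
    "DeptNo": [123, 456, 456, 789],
    "Salary": [40000, 60000, 70000, 50000],
})
mask = emp.select(lambda r: r["Name"] == "Parker" or r["Salary"] >= 60000)
print(mask)
emp.delete_selected()
print(emp.columns)
```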
13.4. MULTIPLE TABLES.
Relational databases typically have several tables and may use the results of
one selection to select from another table. For example, referring back to a
previous table and supposing another table EmpDetail has already been created:
Employ
+--------+------+------+
|Name    |DeptNo|Salary|
+--------+------+------+
|Adams   |123   |40000 |
|McDonald|789   |50000 |
|Jones   |456   |50000 |
|Smith   |456   |60000 |
+--------+------+------+
EmpDetail
+---+--------+--------+-----+
|Num|HireDate|Last    |First|
+---+--------+--------+-----+
|111|1 1 2001|Adams   |Pat  |
|222|2 2 2002|Jones   |Lee  |
|333|3 3 2003|McDonald|Ann  |
|444|4 4 2004|Smith   |Joe  |
+---+--------+--------+-----+
This is not particularly great database design, but such design is beyond the
scope of this document. More to the point, it is possible to do the following
query across the two tables to retrieve the last names and employee numbers of
the employees in department 456:
TempTbl: Employ."Name"\(.DeptNo=456)
TempTbl
+-----+
|Name |
+-----+
|Jones|
|Smith|
+-----+
ResultTbl: EmpDetail."Last Num"\(.Last `in TempTbl.Name)
ResultTbl
+-----+---+
|Last |Num|
+-----+---+
|Jones|222|
|Smith|444|
+-----+---+
The most recent output could also have been obtained with
EmpDetail."Last Num"\(.Last `in Employ."Name"\(.DeptNo=456)).
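The two-step selection above (pick names from one table, then use them to select from another) corresponds to a membership test. A minimal Python sketch, assuming plain tuples for the records of both tables:

```python
# Hypothetical stand-ins: Employ as (Name, DeptNo, Salary) and
# EmpDetail as (Num, HireDate, Last, First).
employ = [
    ("Adams", 123, 40000),
    ("McDonald", 789, 50000),
    ("Jones", 456, 50000),
    ("Smith", 456, 60000),
]
emp_detail = [
    (111, "1 1 2001", "Adams", "Pat"),
    (222, "2 2 2002", "Jones", "Lee"),
    (333, "3 3 2003", "McDonald", "Ann"),
    (444, "4 4 2004", "Smith", "Joe"),
]

# Step 1: names of everyone in department 456 (the role played by TempTbl).
names_in_456 = {name for name, dept, _salary in employ if dept == 456}

# Step 2: keep the EmpDetail records whose Last is among those names,
# returning only the Last and Num fields.
result = [(last, num) for num, _hired, last, _first in emp_detail
          if last in names_in_456]
print(result)
```

Whole names are compared here, which corresponds to the rowwise comparison in the BPL query rather than a character-by-character one.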
13.5. MANIPULATING TABLES.
It is relatively easy to manipulate tables using BPL. For example, the user
function RecNums displays the record numbers of a table:
|: RecNums:: fn
|" x:: tbl   ** TABLE TO NUMBER (RIGHT ARGUMENT)
z:: tbl      ** NUMBERED TABLE (RESULT)
z: (2 1#["#",-1 1#1~~numrows x])~x
|"
** numrows x RETURNS THE NUMBER OF ROWS IN TABLE x
%RecNums ResultTbl
+-+-----+---+
|#|Last |Num|
+-+-----+---+
|1|Jones|222|
|2|Smith|444|
+-+-----+---+
13.6. NULL VALUES.
BPL weakly supports the concept of null values (empty fields) in a relational
database. An employee whose employee number has not yet been assigned can be
inserted as follows:
ResultTbl:insert {"Clark",}
ResultTbl
+-----+---+
|Last |Num|
+-----+---+
|Jones|222|
|Smith|444|
|Clark|0  |
+-----+---+
The insert function looks at the referent of Num (i.e., 222 444), and notes
that this referent is numeric data, so insert inserts dnum (the numeric
delimiter, default value 0) into the Num field. (If the referent of Num had
been literal, the correct number of blanks obtained from dlit (the literal
delimiter, default value " ") would have been inserted.) In a design where 0 is
a valid employee number but -1 is not, the user could change dnum to -1 before
inserting.
Different columns of a table may have different null values.
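The type-dependent choice of null value described above can be sketched in Python. The function below is a hypothetical illustration: its parameters dnum and dlit borrow the names of the BPL delimiters, but the way it decides a field's type (by inspecting the first existing entry) and the second inserted name are assumptions made for this example.

```python
def insert(table, values, dnum=0, dlit=" "):
    """Append one record; fields with no supplied value get a type-based null."""
    for i, (name, column) in enumerate(table.items()):
        if i < len(values):
            column.append(values[i])
        elif column and isinstance(column[0], str):
            # Literal field: fill with blanks as wide as the widest entry.
            column.append(dlit * max(len(v) for v in column))
        else:
            # Numeric field: use the numeric delimiter.
            column.append(dnum)

result_tbl = {"Last": ["Jones", "Smith"], "Num": [222, 444]}
insert(result_tbl, ["Clark"])            # Num is filled with the default 0
insert(result_tbl, ["Davis"], dnum=-1)   # hypothetical record using -1 as null
print(result_tbl)
```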
13.7. RELATIONAL DATABASE MANIPULATION.
A number of manipulations are common in relational database theory. This
section explains how they are done in BPL.
13.7.1. RESTRICT OPERATION.
The restrict (aka select) relational database operation returns all rows of a
table which satisfy a logical condition.
Employees.""\((.Name eq "Adams") or .Salary>=60000) is an example of a restrict
operation in BPL.
13.7.2. PROJECT OPERATION.
The project relational database operation returns selected columns of a table.
Employees."Name Salary" is an example of a project operation in BPL.
Restrict operations and project operations are frequently combined, e.g.
Employees."Name Salary"\((.Name `= "Adams") or .Salary>=60000).
Technically, a project operation returns only distinct records. This can be
accomplished by saying dist Tbl, where the system function dist removes
duplicate records of a table leaving only distinct records.
13.7.3. TIMES OPERATION.
The times (aka product) relational database operation returns a kind of
Cartesian product of two tables, where every record in one table is
concatenated with every record in another. Given two defined tables:
Employees
** NOTE McDonald NOW HAS A NULL DeptNo
+--------+------+------+
|Name    |DeptNo|Salary|
+--------+------+------+
|Adams   |123   |40000 |
|McDonald|0     |50000 |
+--------+------+------+
Departments
+--------+---------+--------+
|DeptCode|DeptName |Building|
+--------+---------+--------+
|123     |Sales    |Main St |
|456     |Marketing|Wall St |
|789     |Design   |Main St |
+--------+---------+--------+
The product is:
{Employees,"DeptNo"} times {Departments,"DeptCode"}
+--------+------+------+--------+---------+--------+
|Name    |DeptNo|Salary|DeptCode|DeptName |Building|
+--------+------+------+--------+---------+--------+
|Adams   |123   |40000 |123     |Sales    |Main St |
|Adams   |123   |40000 |456     |Marketing|Wall St |
|Adams   |123   |40000 |789     |Design   |Main St |
|McDonald|0     |50000 |123     |Sales    |Main St |
|McDonald|0     |50000 |456     |Marketing|Wall St |
|McDonald|0     |50000 |789     |Design   |Main St |
+--------+------+------+--------+---------+--------+
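The product can be pictured as pairing every record of the first table with every record of the second. A short Python sketch, assuming each table is a list of tuples:

```python
from itertools import product

# Hypothetical stand-ins: Employees as (Name, DeptNo, Salary) and
# Departments as (DeptCode, DeptName, Building).
employees = [("Adams", 123, 40000), ("McDonald", 0, 50000)]
departments = [
    (123, "Sales", "Main St"),
    (456, "Marketing", "Wall St"),
    (789, "Design", "Main St"),
]

# Concatenate each employee record with each department record.
times = [emp + dept for emp, dept in product(employees, departments)]
for row in times:
    print(row)
```

With 2 employee records and 3 department records the result has 2 x 3 = 6 records, in the same order as the table above.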
13.7.4. INNER JOIN AND OUTER JOIN OPERATIONS.
The join relational database operation combines two tables along a common field
(or, less commonly, several fields). The two forms of join are the inner join
and the outer join. If system bit variable joinkind is 0 (the default), an
inner join is performed; if joinkind is 1, an outer join is performed.
The previous Employees and Departments could be combined along the department
number, which is also referred to as the department code, using an inner join:
joinkind: 0
{Employees,"DeptNo"} join {Departments,"DeptCode"}
+--------+------+------+--------+---------+--------+
|Name    |DeptNo|Salary|DeptCode|DeptName |Building|
+--------+------+------+--------+---------+--------+
|Adams   |123   |40000 |123     |Sales    |Main St |
+--------+------+------+--------+---------+--------+
The McDonald record does not appear in the inner join result because DeptNo in
Employees for McDonald was 0 (null) and could not be matched to any DeptCode in
Departments.
Using the outer join gives a somewhat different result:
joinkind: 1
{Employees,"DeptNo"} join {Departments,"DeptCode"}
+--------+------+------+--------+---------+--------+
|Name    |DeptNo|Salary|DeptCode|DeptName |Building|
+--------+------+------+--------+---------+--------+
|Adams   |123   |40000 |123     |Sales    |Main St |
|McDonald|0     |50000 |0       |0        |        |
+--------+------+------+--------+---------+--------+
** BUILDING FOR McDonald IS "        "
This alternate solution to the null DeptNo problem preserves the information
for McDonald in Employees in the result while assigning nulls to the fields
from Departments.
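The two behaviours shown above, dropping records that find no match (joinkind 0) and keeping them with null fields (joinkind 1), can be sketched in Python. The function join, its keep_unmatched flag, and the null_record argument are hypothetical names for this illustration, and the null values passed in the example are chosen for the sketch rather than taken from BPL.

```python
def join(left, right, left_key, right_key, keep_unmatched, null_record=()):
    """Combine records whose key fields are equal.

    left_key and right_key are tuple positions. When keep_unmatched is true,
    a left record with no partner is kept and padded with null_record.
    """
    result = []
    for l in left:
        matches = [r for r in right if r[right_key] == l[left_key]]
        if matches:
            result.extend(l + r for r in matches)
        elif keep_unmatched:
            result.append(l + null_record)
    return result

employees = [("Adams", 123, 40000), ("McDonald", 0, 50000)]
departments = [
    (123, "Sales", "Main St"),
    (456, "Marketing", "Wall St"),
    (789, "Design", "Main St"),
]

# Behaviour shown for joinkind 0: unmatched records are dropped.
matched_only = join(employees, departments, 1, 0, keep_unmatched=False)

# Behaviour shown for joinkind 1: unmatched records are kept with nulls.
with_unmatched = join(employees, departments, 1, 0,
                      keep_unmatched=True, null_record=(0, "", ""))
print(matched_only)
print(with_unmatched)
```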
13.7.5. UNION OPERATION.
The union relational database operation combines two compatible tables as if
performing a set theory union by records, similar to the union function. Given
the two tables below, the user can find everyone who is a singer or an actor or
both as follows:
Singers
+--------+---------+-----------+
|LastName|FirstName|PhoneNumber|
+--------+---------+-----------+
|Douglas |Mary     |555-1111   |
|Edwards |Mike     |555-2222   |
|Conners |Lou      |555-3333   |
+--------+---------+-----------+
Actors
+--------+---------+-----------+
|LastName|FirstName|PhoneNumber|
+--------+---------+-----------+
|Douglas |Mary     |555-1111   |
|Carter  |Pam      |555-4444   |
|Andrews |Lee      |555-5555   |
+--------+---------+-----------+
Singers table`union Actors
+--------+---------+-----------+
|LastName|FirstName|PhoneNumber|
+--------+---------+-----------+
|Douglas |Mary     |555-1111   |
|Edwards |Mike     |555-2222   |
|Conners |Lou      |555-3333   |
|Carter  |Pam      |555-4444   |
|Andrews |Lee      |555-5555   |
+--------+---------+-----------+
The table operator modifies certain functions with vector arguments so that
these functions behave similarly with tables.
13.7.6. INTERSECTION OPERATION.
The intersection relational database operation combines two compatible tables
as if performing a set theory intersection by records, similar to the inter
function. The user can find all singers who are also actors as follows:
Singers table`inter Actors
+--------+---------+-----------+
|LastName|FirstName|PhoneNumber|
+--------+---------+-----------+
|Douglas |Mary     |555-1111   |
+--------+---------+-----------+
13.7.7. MINUS OPERATION.
The minus relational database operation combines two compatible tables as if
performing a set theory difference by records, similar to the minus function.
The user can find all singers who are not actors as follows:
Singers table`minus Actors
+--------+---------+-----------+
|LastName|FirstName|PhoneNumber|
+--------+---------+-----------+
|Edwards |Mike     |555-2222   |
|Conners |Lou      |555-3333   |
+--------+---------+-----------+
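The union, intersection, and minus operations all treat each record as a single set element. A minimal Python sketch of the three, assuming records are tuples and that the order of the first table is preserved (the function names are hypothetical):

```python
def table_union(a, b):
    """Records in a or b, without duplicates, in first-seen order."""
    seen, out = set(), []
    for rec in a + b:
        if rec not in seen:
            seen.add(rec)
            out.append(rec)
    return out

def table_inter(a, b):
    """Records of a that also appear in b."""
    in_b = set(b)
    return [rec for rec in a if rec in in_b]

def table_minus(a, b):
    """Records of a that do not appear in b."""
    in_b = set(b)
    return [rec for rec in a if rec not in in_b]

singers = [
    ("Douglas", "Mary", "555-1111"),
    ("Edwards", "Mike", "555-2222"),
    ("Conners", "Lou", "555-3333"),
]
actors = [
    ("Douglas", "Mary", "555-1111"),
    ("Carter", "Pam", "555-4444"),
    ("Andrews", "Lee", "555-5555"),
]

print(table_union(singers, actors))
print(table_inter(singers, actors))
print(table_minus(singers, actors))
```

Two records count as equal only when every field matches, which is why both tables must be compatible (same fields in the same order).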
13.7.8. DIVIDE OPERATION.
The divide relational database operation employs two tables. The first table
T1 has a field F1 whose domain has the values V1, V2, ..., Vn. The second
table T2 has a field F2 whose domain is a subset of the domain of F1. The
divide operation calculates the table Temp such that the only records of Temp
are the records of T1 that have every possible value of the domain of F2; field
F1 is then deleted from Temp to yield the resulting table T3. This can be more
clearly illustrated with an example:
Athletes
+---------+--------+
|Name     |Sport   |
+---------+--------+
|Alexander|Baseball|
|Alexander|Tennis  |
|Baines   |Baseball|
|Carlton  |Tennis  |
|Carlton  |Swimming|
|Godwin   |Baseball|
|Godwin   |Tennis  |
|Godwin   |Swimming|
+---------+--------+
SportsList1
+--------+
|Sport   |
+--------+
|Baseball|
|Tennis  |
|Swimming|
+--------+
Athletes divide SportsList1
+---------+
|Name     |
+---------+
|Godwin   |
+---------+
SportsList2
+--------+
|Sport   |
+--------+
|Tennis  |
|Baseball|
+--------+
Athletes divide SportsList2
+---------+
|Name     |
+---------+
|Alexander|
|Godwin   |
+---------+
SportsList3
+----------+
|Sport     |
+----------+
|Baseball  |
|Football  |
+----------+
Athletes divide SportsList3
+---------+
|Name     |
+---------+
+---------+
In short, these divide operations yield all athletes who can do all the sports
in the given sports list.
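The three divide examples above can be reproduced with a short Python sketch. It assumes the first table is a list of (Name, Sport) pairs and the second a plain list of sports; the function name divide is borrowed from the text, but the implementation is only an illustration.

```python
def divide(pairs, required):
    """Return the names paired with every value in required, in first-seen order."""
    required = set(required)
    have = {}
    for name, value in pairs:
        # Collect the set of values seen for each name.
        have.setdefault(name, set()).add(value)
    # Keep a name only if its values cover the whole required set.
    return [name for name, values in have.items() if required <= values]

athletes = [
    ("Alexander", "Baseball"), ("Alexander", "Tennis"),
    ("Baines", "Baseball"),
    ("Carlton", "Tennis"), ("Carlton", "Swimming"),
    ("Godwin", "Baseball"), ("Godwin", "Tennis"), ("Godwin", "Swimming"),
]

print(divide(athletes, ["Baseball", "Tennis", "Swimming"]))
print(divide(athletes, ["Tennis", "Baseball"]))
print(divide(athletes, ["Baseball", "Football"]))
```

No athlete plays Football, so the third call returns an empty list, matching the empty table for SportsList3.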
13.8. AGGREGATION AND GROUPING.
Given the following table:
Customers
+-------+-------+
|AcctNum|Balance|
+-------+-------+
|12345  |100.10 |
|23456  |200.20 |
|34567  |300.30 |
|45678  |400.40 |
+-------+-------+
The number of accounts with a balance less than 300.00 can be obtained by sum
Customers.Balance<300.00 <-> 2, and the average of all the account balances can
be obtained by mean Customers.Balance <-> 250.25. Maximums, minimums, and
other statistics can be similarly obtained; in general it is not difficult to
write programs to calculate statistics about a table.
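The arithmetic behind these two figures can be checked with a few lines of Python. Decimal is used here, as a choice made for this sketch, so that the currency amounts are added exactly.

```python
from decimal import Decimal

# Balance column of the Customers table.
balances = [Decimal("100.10"), Decimal("200.20"),
            Decimal("300.30"), Decimal("400.40")]

# Count of accounts below 300.00: summing a 0/1 condition per record.
count_below_300 = sum(1 for b in balances if b < Decimal("300.00"))

# Mean balance: 1001.00 / 4.
mean_balance = sum(balances) / len(balances)

print(count_below_300)                 # 2
print(mean_balance)                    # 250.25
print(max(balances), min(balances))    # 400.40 100.10
```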
In a more complicated situation, a customer might have several accounts:
Cust2
+--------+---------+-------+-------+
|LastName|FirstName|AcctNum|Balance|
+--------+---------+-------+-------+
|Zimmers |Zsa Zsa  |12345  |100.10 |
|Zimmers |Zsa Zsa  |23456  |200.20 |
|Young   |Yvonne   |34567  |300.30 |
|Young   |Yvonne   |45678  |400.40 |
|Young   |Yvonne   |56789  |500.50 |
|Xylon   |Xavier   |67890  |600.60 |
+--------+---------+-------+-------+
The total balance (wealth) for each customer may be calculated as follows:
C2: Cust2."LastName FirstName" group`sum Cust2."Balance"
C2
+--------+---------+-------+
|LastName|FirstName|Balance|
+--------+---------+-------+
|Zimmers |Zsa Zsa  |300.30 |
|Young   |Yvonne   |1201.20|
|Xylon   |Xavier   |600.60 |
+--------+---------+-------+
C2.Wealth== C2.Balance
** RENAME FIELD TO MORE APPROPRIATE NAME
The group operator, which is used only with functions applied to tables,
modifies the sum function to calculate the sum of the Balance fields
corresponding to each customer given in the LastName and FirstName fields. The
maximum balance or minimum balance or average account balance of each customer
may also be obtained using group in similar ways.
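Grouping followed by a sum can be sketched in Python as a dictionary keyed by the grouping fields. The function name group_sum and the tuple layout are assumptions for this illustration; Decimal is used so that the balances add exactly.

```python
from decimal import Decimal

# Hypothetical stand-in for Cust2: (LastName, FirstName, AcctNum, Balance).
cust2 = [
    ("Zimmers", "Zsa Zsa", 12345, Decimal("100.10")),
    ("Zimmers", "Zsa Zsa", 23456, Decimal("200.20")),
    ("Young", "Yvonne", 34567, Decimal("300.30")),
    ("Young", "Yvonne", 45678, Decimal("400.40")),
    ("Young", "Yvonne", 56789, Decimal("500.50")),
    ("Xylon", "Xavier", 67890, Decimal("600.60")),
]

def group_sum(rows, key_positions, value_position):
    """Sum one field over all records that share the same key fields."""
    totals = {}
    for row in rows:
        key = tuple(row[i] for i in key_positions)
        totals[key] = totals.get(key, Decimal("0")) + row[value_position]
    return totals

wealth = group_sum(cust2, key_positions=(0, 1), value_position=3)
for (last, first), total in wealth.items():
    print(last, first, total)
```

Replacing the running sum with a running maximum, minimum, or a (sum, count) pair gives the other per-customer statistics mentioned above.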