File: X13_Tables.doc   By: J.M. Smeenk   Revision: 5/20/09 8:49:38 AM

----- TABLES -----

13. TABLES.

A relational table in BPL is a non-primitive type called tbl used to store database information. A table is a directory whose entries are all matrices or vectors of primitive types. Relational database theory is well known and underlies the popular language SQL. While BPL may not replace SQL, BPL has extensive relational database capabilities.

13.1. TABLE CREATION, INSERTION, AND ARRANGING.

Creation of a table is straightforward:

   NameMat: -1 5#"AdamsJonesSmith"   ** -1 MEANS "AS MANY AS THE DATA INDICATE" (I.E., 3 ROWS)
   DeptVec: 123 456 456
   SalaryVec: 40000 50000 60000
   Employees."Name DeptNo Salary":: tbl.{lit[,],int[],int[]}: ++
      {NameMat,DeptVec,SalaryVec}
   disp: 3     ** USE FRAMED OUTPUT FOR NESTED ARRAYS AND DIRECTORIES
   Employees   ** VALUE OF TABLE IS A NESTED MATRIX OF NAMES & REFERENTS
   +-----+------+------+
   |Name |DeptNo|Salary|
   +-----+------+------+
   |Adams|123   |40000 |
   |Jones|456   |50000 |
   |Smith|456   |60000 |
   +-----+------+------+

Table fields can be displayed in any order (rather than the default order of insertion into the table), with fields omitted if desired:

   Employees."Salary DeptNo"
   +------+------+
   |Salary|DeptNo|
   +------+------+
   |40000 |123   |
   |50000 |456   |
   |60000 |456   |
   +------+------+

A new employee can be inserted easily:

   Employees:insert {"McDonald",789,50000}
      ** MATRIX Name OF EMPLOYEES BECOMES PADDED TO ACCOMMODATE LONGER NAME
   Employees
   +--------+------+------+
   |Name    |DeptNo|Salary|
   +--------+------+------+
   |Adams   |123   |40000 |
   |Jones   |456   |50000 |
   |Smith   |456   |60000 |
   |McDonald|789   |50000 |
   +--------+------+------+

If an insertion makes the data in a matrix field wider, and the matrix was declared without a fixed upper bound on its width, the matrix is internally redeclared to the new width of the field data and its rows are padded with the appropriate delimiter.
However, if the matrix was declared with an upper bound on its width and the inserted data is longer than that bound, an error ensues.

The Employees table can be sorted with Salary as the ascending primary key and Name as the descending secondary key:

   Employ: Employees arrange "Salary -Name"
      ** arrange KEYWORD MEANS "TABLE SORT"; KEYWORD sort IS ALREADY ++
      ** DEFINED FOR NON-TABLE ARRAYS.
      ** PRIMARY KEY IS Salary, SECONDARY KEY IS Name
      ** NO PREFIX INDICATES ASCENDING SORT, - INDICATES DESCENDING SORT
   Employ
   +--------+------+------+
   |Name    |DeptNo|Salary|
   +--------+------+------+
   |Adams   |123   |40000 |
   |McDonald|789   |50000 |
   |Jones   |456   |50000 |
   |Smith   |456   |60000 |
   +--------+------+------+

13.2. TABLE QUERIES.

To make relational tables more powerful, BPL allows table queries of the form

   Table.Namelist\(.FieldName DyadicFn Expression)

The namelist may be replaced by a single name or by "" (which signifies all field names). For example, the following code returns the names and salaries of all employees in department 456:

   Employees."Name Salary"\(.DeptNo=456)
   +--------+------+
   |Name    |Salary|
   +--------+------+
   |Jones   |50000 |
   |Smith   |60000 |
   +--------+------+

The \(...) query portion, which chooses the appropriate records of the table, uses field names of the table preceded by a period (.). An expression such as .DeptNo is erroneous outside of a table query, since inside the query it is an abbreviation for Employees.DeptNo.

The DyadicFn is typically one of the following:

   a relational function (<, <=, =, >=, >, <>)
   a similar function (in, is, notin, isnot)
   a rowwise relational function (`<, `<=, `=, `>=, `>, `<>)
   a rowwise similar function (`in, `notin)

The meta operator ` modifies a function so that it works on rows rather than individual items, so that comparisons are made at a "word" level rather than a "character" level.
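The difference between item-level and row-level comparison can be illustrated outside BPL. The following is a rough Python sketch, not BPL code: it assumes a literal matrix behaves like a list of equal-width, blank-padded strings, and the helper names pad, itemwise_eq, and rowwise_eq are hypothetical names chosen for this illustration.

```python
def pad(word, width):
    """Pad a word with blanks to a fixed width (assumed literal-matrix layout)."""
    return word.ljust(width)


def itemwise_eq(matrix, word):
    """Compare character by character: one 0/1 per item in every row."""
    target = pad(word, len(matrix[0]))
    return [[int(a == b) for a, b in zip(row, target)] for row in matrix]


def rowwise_eq(matrix, word):
    """Compare whole rows: one 0/1 per row, i.e. a record mask."""
    target = pad(word, len(matrix[0]))
    return [int(row == target) for row in matrix]


# The Name field after McDonald was inserted: four rows, each 8 characters wide.
names = [pad(n, 8) for n in ["Adams", "Jones", "Smith", "McDonald"]]

print(rowwise_eq(names, "Adams"))      # one bit per record
print(itemwise_eq(names, "Adams")[1])  # one bit per character of the "Jones" row
```

Only the row-level form yields one bit per record, which is the shape a query mask needs; the item-level form yields a bit matrix in which "Jones" and "Adams" agree on their final letter and on the padding.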
More complicated query expressions such as the following can be written:

   Employees\((.Name `= "Adams") or .Salary>=60000)
      ** WHEN NAMELIST IS OMITTED IN TABLE QUERY, ALL FIELDS ARE RETURNED
   +-----+------+------+
   |Name |DeptNo|Salary|
   +-----+------+------+
   |Adams|123   |40000 |
   |Smith|456   |60000 |
   +-----+------+------+

or

   Employees."Salary DeptNo"\((.Name eq "Adams") or .Salary>=60000)
   +------+------+
   |Salary|DeptNo|
   +------+------+
   |40000 |123   |
   |60000 |456   |
   +------+------+

Reassignments may be done as well. To change Adams' name to Parker:

   Employees.Name\(.Name eq "Adams"): "Parker"
      ** ASSUMES THERE IS ONLY ONE Adams IN THE TABLE
   Employees
   +--------+------+------+
   |Name    |DeptNo|Salary|
   +--------+------+------+
   |Parker  |123   |40000 |
   |Jones   |456   |50000 |
   |Smith   |456   |60000 |
   |McDonald|789   |50000 |
   +--------+------+------+

To give a $10000 raise to all employees in department 456:

   Employees.Salary\(.DeptNo=456):+ 10000
   Employees
   +--------+------+------+
   |Name    |DeptNo|Salary|
   +--------+------+------+
   |Parker  |123   |40000 |
   |Jones   |456   |60000 |
   |Smith   |456   |70000 |
   |McDonald|789   |50000 |
   +--------+------+------+

13.3. TABLE DELETIONS.

A table always remembers which records were most recently selected in the bit vector mask Table\(). (If no selection has yet occurred, Table\() is a mask vector of 0s whose length is the number of records in the table, indicating that none of the records have been selected.)
For example:

   Employees\()
   0 1 1 0
      ** MOST RECENT QUERY TO Employees WAS .DeptNo=456; 2 OF 4 RECORDS ++
      ** HAVE DEPARTMENT 456
   Employees\(.Name eq "Parker")
   1 0 0 0
      ** NEW QUERY MASK REPLACES OLD MASK OF 0 1 1 0
   Employees\()
   1 0 0 0
   Employees\(): 1 1 1 0   ** MASK MAY BE SET IF DESIRED
   Employees
   +--------+------+------+
   |Name    |DeptNo|Salary|
   +--------+------+------+
   |Parker  |123   |40000 |
   |Jones   |456   |60000 |
   |Smith   |456   |70000 |
   |McDonald|789   |50000 |
   +--------+------+------+

One or more records can be queried and deleted:

   Employees: delete Employees\((.Name eq "Parker") or .Salary>=60000)

or alternatively:

   Employees: delete Employees\()

since the explicit query references the same records as the masked query. By either method, the result is:

   Employees
   +--------+------+------+
   |Name    |DeptNo|Salary|
   +--------+------+------+
   |McDonald|789   |50000 |
   +--------+------+------+

13.4. MULTIPLE TABLES.

Relational databases typically have several tables and may use the results of one selection to select from another table. For example, referring back to the earlier table Employ and supposing another table EmpDetail has already been created:

   Employ
   +--------+------+------+
   |Name    |DeptNo|Salary|
   +--------+------+------+
   |Adams   |123   |40000 |
   |McDonald|789   |50000 |
   |Jones   |456   |50000 |
   |Smith   |456   |60000 |
   +--------+------+------+

   EmpDetail
   +---+--------+--------+-----+
   |Num|HireDate|Last    |First|
   +---+--------+--------+-----+
   |111|1 1 2001|Adams   |Pat  |
   |222|2 2 2002|Jones   |Lee  |
   |333|3 3 2003|McDonald|Ann  |
   |444|4 4 2004|Smith   |Joe  |
   +---+--------+--------+-----+

This is not particularly good database design, but such design is beyond the scope of this document.
More to the point, it is possible to do the following query across the two tables to retrieve the last names and employee numbers of the employees in department 456:

   TempTbl: Employ."Name"\(.DeptNo=456)
   TempTbl
   +-----+
   |Name |
   +-----+
   |Jones|
   |Smith|
   +-----+

   ResultTbl: EmpDetail."Last Num"\(.Last `in TempTbl.Name)
   ResultTbl
   +-----+---+
   |Last |Num|
   +-----+---+
   |Jones|222|
   |Smith|444|
   +-----+---+

The most recent output could also have been obtained with

   EmpDetail."Last Num"\(.Last `in Employ."Name"\(.DeptNo=456))

13.5. MANIPULATING TABLES.

It is relatively easy to manipulate tables using BPL. For example, the user function RecNums displays the record numbers of a table:

   |: RecNums:: fn
   |"
      x:: tbl   ** TABLE TO NUMBER (RIGHT ARGUMENT)
      z:: tbl   ** NUMBERED TABLE (RESULT)
      z: (2 1#["#",-1 1#1~~numrows x])~x
   |"
      ** numrows x RETURNS THE NUMBER OF ROWS IN TABLE x

   %RecNums ResultTbl
   +-+-----+---+
   |#|Last |Num|
   +-+-----+---+
   |1|Jones|222|
   |2|Smith|444|
   +-+-----+---+

13.6. NULL VALUES.

BPL weakly supports the concept of null values (empty fields) in a relational database. An employee whose employee number has not yet been assigned can be inserted as follows:

   ResultTbl:insert {"Clark",}
   ResultTbl
   +-----+---+
   |Last |Num|
   +-----+---+
   |Jones|222|
   |Smith|444|
   |Clark|0  |
   +-----+---+

The insert function looks at the referent of Num (i.e., 222 444) and notes that this referent is numeric data, so insert inserts dnum (the numeric delimiter, default value 0) into the Num field. (If the referent of Num had been literal, the correct number of blanks obtained from dlit (the literal delimiter, default value " ") would have been inserted.) In a design where 0 is a valid employee number but -1 is not, the user could change dnum from its default of 0 to -1 before inserting. Different columns of a table may have different null values.

13.7. RELATIONAL DATABASE MANIPULATION.

A number of manipulations are common in relational database theory. This section explains how they are done in BPL.

13.7.1. RESTRICT OPERATION.
The restrict (aka select) relational database operation returns all rows of a table that satisfy a logical condition. An example of a restrict operation in BPL is:

   Employees.""\((.Name eq "Adams") or .Salary>=60000)

13.7.2. PROJECT OPERATION.

The project relational database operation returns selected columns of a table. Employees."Name Salary" is an example of a project operation in BPL. Restrict operations and project operations are frequently combined, e.g.:

   Employees."Name Salary"\((.Name `= "Adams") or .Salary>=60000)

Technically, a project operation returns only distinct records. This can be accomplished by saying dist Tbl, where the system function dist removes duplicate records of a table, leaving only distinct records.

13.7.3. TIMES OPERATION.

The times (aka product) relational database operation returns a kind of Cartesian product of two tables, where every record in one table is concatenated with every record in another. Given two defined tables:

   Employees   ** NOTE McDonald NOW HAS A NULL DeptNo
   +--------+------+------+
   |Name    |DeptNo|Salary|
   +--------+------+------+
   |Adams   |123   |40000 |
   |McDonald|0     |50000 |
   +--------+------+------+

   Departments
   +--------+---------+--------+
   |DeptCode|DeptName |Building|
   +--------+---------+--------+
   |123     |Sales    |Main St |
   |456     |Marketing|Wall St |
   |789     |Design   |Main St |
   +--------+---------+--------+

The product is:

   {Employees,"DeptNo"} times {Departments,"DeptCode"}
   +--------+------+------+--------+---------+--------+
   |Name    |DeptNo|Salary|DeptCode|DeptName |Building|
   +--------+------+------+--------+---------+--------+
   |Adams   |123   |40000 |123     |Sales    |Main St |
   |Adams   |123   |40000 |456     |Marketing|Wall St |
   |Adams   |123   |40000 |789     |Design   |Main St |
   |McDonald|0     |50000 |123     |Sales    |Main St |
   |McDonald|0     |50000 |456     |Marketing|Wall St |
   |McDonald|0     |50000 |789     |Design   |Main St |
   +--------+------+------+--------+---------+--------+

13.7.4. INNER JOIN AND OUTER JOIN OPERATIONS.
The join relational database operation combines two tables along a common field (or, less commonly, several fields). The two forms of join are the inner join, which keeps only records that have a match in both tables, and the outer join, which also keeps unmatched records. If the system bit variable joinkind is 0 (the default), an inner join is performed; if joinkind is 1, an outer join is performed.

The previous Employees and Departments tables can be combined along the department number (DeptNo in Employees, DeptCode in Departments) using an inner join:

   joinkind: 0
   {Employees,"DeptNo"} join {Departments,"DeptCode"}
   +--------+------+------+--------+---------+--------+
   |Name    |DeptNo|Salary|DeptCode|DeptName |Building|
   +--------+------+------+--------+---------+--------+
   |Adams   |123   |40000 |123     |Sales    |Main St |
   +--------+------+------+--------+---------+--------+

The McDonald record does not appear in the inner join result because DeptNo in Employees for McDonald was 0 (null) and could not be matched to any DeptCode in Departments. Using the outer join gives a somewhat different result:

   joinkind: 1
   {Employees,"DeptNo"} join {Departments,"DeptCode"}
   +--------+------+------+--------+---------+--------+
   |Name    |DeptNo|Salary|DeptCode|DeptName |Building|
   +--------+------+------+--------+---------+--------+
   |Adams   |123   |40000 |123     |Sales    |Main St |
   |McDonald|0     |50000 |0       |0        |        |
   +--------+------+------+--------+---------+--------+
      ** BUILDING FOR McDonald IS " "

This alternate solution to the null DeptNo problem preserves the information for McDonald from Employees in the result while assigning nulls to the fields from Departments.

13.7.5. UNION OPERATION.

The union relational database operation combines two compatible tables as if performing a set theory union by records, similar to the union function.
If the user has the two tables below, the user can find everyone who is a singer or an actor or both as follows:

   Singers
   +--------+---------+-----------+
   |LastName|FirstName|PhoneNumber|
   +--------+---------+-----------+
   |Douglas |Mary     |555-1111   |
   |Edwards |Mike     |555-2222   |
   |Conners |Lou      |555-3333   |
   +--------+---------+-----------+

   Actors
   +--------+---------+-----------+
   |LastName|FirstName|PhoneNumber|
   +--------+---------+-----------+
   |Douglas |Mary     |555-1111   |
   |Carter  |Pam      |555-4444   |
   |Andrews |Lee      |555-5555   |
   +--------+---------+-----------+

   Singers table`union Actors
   +--------+---------+-----------+
   |LastName|FirstName|PhoneNumber|
   +--------+---------+-----------+
   |Douglas |Mary     |555-1111   |
   |Edwards |Mike     |555-2222   |
   |Conners |Lou      |555-3333   |
   |Carter  |Pam      |555-4444   |
   |Andrews |Lee      |555-5555   |
   +--------+---------+-----------+

The table operator modifies certain functions with vector arguments so that these functions behave similarly with tables.

13.7.6. INTERSECTION OPERATION.

The intersection relational database operation combines two compatible tables as if performing a set theory intersection by records, similar to the inter function. The user can find all singers who are also actors as follows:

   Singers table`inter Actors
   +--------+---------+-----------+
   |LastName|FirstName|PhoneNumber|
   +--------+---------+-----------+
   |Douglas |Mary     |555-1111   |
   +--------+---------+-----------+

13.7.7. MINUS OPERATION.

The minus relational database operation combines two compatible tables as if performing a set theory difference by records, similar to the minus function. The user can find all singers who are not actors as follows:

   Singers table`minus Actors
   +--------+---------+-----------+
   |LastName|FirstName|PhoneNumber|
   +--------+---------+-----------+
   |Edwards |Mike     |555-2222   |
   |Conners |Lou      |555-3333   |
   +--------+---------+-----------+

13.7.8. DIVIDE OPERATION.

The divide relational database operation employs two tables.
The first table T1 has a field F1 whose domain has the values V1, V2, ..., Vn. The second table T2 has a field F2 whose domain is a subset of the domain of F1. The divide operation calculates the table Temp such that the only records of Temp are the records of T1 that have every possible value of the domain of F2; field F1 is then deleted from Temp to yield the resulting table T3. This can be illustrated more clearly with an example:

   Athletes
   +---------+--------+
   |Name     |Sport   |
   +---------+--------+
   |Alexander|Baseball|
   |Alexander|Tennis  |
   |Baines   |Baseball|
   |Carlton  |Tennis  |
   |Carlton  |Swimming|
   |Godwin   |Baseball|
   |Godwin   |Tennis  |
   |Godwin   |Swimming|
   +---------+--------+

   SportsList1
   +--------+
   |Sport   |
   +--------+
   |Baseball|
   |Tennis  |
   |Swimming|
   +--------+

   Athletes divide SportsList1
   +---------+
   |Name     |
   +---------+
   |Godwin   |
   +---------+

   SportsList2
   +--------+
   |Sport   |
   +--------+
   |Tennis  |
   |Baseball|
   +--------+

   Athletes divide SportsList2
   +---------+
   |Name     |
   +---------+
   |Alexander|
   |Godwin   |
   +---------+

   SportsList3
   +----------+
   |Sport     |
   +----------+
   |Baseball  |
   |Football  |
   +----------+

   Athletes divide SportsList3
   +---------+
   |Name     |
   +---------+
   +---------+

In short, these divide operations yield all athletes who play every sport in the given sports list.

13.8. AGGREGATION AND GROUPING.

Given the following table:

   Customers
   +-------+-------+
   |AcctNum|Balance|
   +-------+-------+
   |12345  |100.10 |
   |23456  |200.20 |
   |34567  |300.30 |
   |45678  |400.40 |
   +-------+-------+

The number of accounts with a balance less than 300.00 can be obtained by sum Customers.Balance<300.00 <-> 2, and the average of all the account balances can be obtained by mean Customers.Balance <-> 250.25. Maximums, minimums, and other statistics can be obtained similarly; in general it is not difficult to write programs that calculate statistics about a table.
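For readers more familiar with general-purpose languages, the two aggregates just described can be sketched in Python. This is an illustrative analogue, not BPL: the table is modeled as a dictionary of columns (an assumption made for this sketch), and Decimal is used so the arithmetic on balances is exact.

```python
from decimal import Decimal

# The Customers table modeled as a dictionary of columns (assumed layout).
customers = {
    "AcctNum": [12345, 23456, 34567, 45678],
    "Balance": [Decimal("100.10"), Decimal("200.20"),
                Decimal("300.30"), Decimal("400.40")],
}

balances = customers["Balance"]

# Summing a bit vector counts the records that satisfy the condition,
# which mirrors: sum Customers.Balance<300.00
below_300 = sum(b < Decimal("300.00") for b in balances)

# Mirrors: mean Customers.Balance
mean_balance = sum(balances) / len(balances)

print(below_300)      # 2
print(mean_balance)   # 250.25
```

As in the BPL expression, the count is obtained by summing a vector of truth values rather than by filtering first and then counting.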
In a more complicated situation, a customer might have several accounts:

   Cust2
   +--------+---------+-------+-------+
   |LastName|FirstName|AcctNum|Balance|
   +--------+---------+-------+-------+
   |Zimmers |Zsa Zsa  |12345  |100.10 |
   |Zimmers |Zsa Zsa  |23456  |200.20 |
   |Young   |Yvonne   |34567  |300.30 |
   |Young   |Yvonne   |45678  |400.40 |
   |Young   |Yvonne   |56789  |500.50 |
   |Xylon   |Xavier   |67890  |600.60 |
   +--------+---------+-------+-------+

The total balance (wealth) for each customer may be calculated as follows:

   C2: Cust2."LastName FirstName" group`sum Cust2."Balance"
   C2
   +--------+---------+-------+
   |LastName|FirstName|Balance|
   +--------+---------+-------+
   |Zimmers |Zsa Zsa  |300.30 |
   |Young   |Yvonne   |1201.20|
   |Xylon   |Xavier   |600.60 |
   +--------+---------+-------+
   C2.Wealth== C2.Balance   ** RENAME FIELD TO MORE APPROPRIATE NAME

The group operator, which is used only with functions applied to tables, modifies the sum function to calculate the sum of the Balance fields corresponding to each customer given in the LastName and FirstName fields. The maximum balance, minimum balance, or average account balance of each customer may also be obtained using group in similar ways.
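The grouping behavior described above can be approximated in Python as follows. This is a sketch under stated assumptions, not BPL and not a description of how BPL implements group: records are modeled as tuples, and the function name group_sum and its parameters are hypothetical.

```python
from decimal import Decimal

# Cust2 modeled as a list of records: (LastName, FirstName, AcctNum, Balance).
cust2 = [
    ("Zimmers", "Zsa Zsa", 12345, Decimal("100.10")),
    ("Zimmers", "Zsa Zsa", 23456, Decimal("200.20")),
    ("Young", "Yvonne", 34567, Decimal("300.30")),
    ("Young", "Yvonne", 45678, Decimal("400.40")),
    ("Young", "Yvonne", 56789, Decimal("500.50")),
    ("Xylon", "Xavier", 67890, Decimal("600.60")),
]


def group_sum(rows, key_idx, value_idx):
    """Sum one field per distinct combination of the key fields.

    Groups appear in order of first occurrence because dictionaries
    preserve insertion order.
    """
    totals = {}
    for row in rows:
        key = tuple(row[i] for i in key_idx)
        totals[key] = totals.get(key, Decimal("0")) + row[value_idx]
    return [key + (total,) for key, total in totals.items()]


# Group on (LastName, FirstName) and sum Balance.
wealth = group_sum(cust2, (0, 1), 3)
for record in wealth:
    print(record)
```

Replacing the running sum with a running maximum, minimum, or a sum-and-count pair would give the other per-customer statistics mentioned above.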