RDM SQL Language Guide
Raima Database Manager 11.0
Trademarks
Raima Database Manager® (RDM®), RDM Embedded® and RDM Server® are trademarks of Raima Inc. and
may be registered in the United States of America and/or other countries. All other names may be trademarks of
their respective owners.
This guide may contain links to third-party Web sites that are not under the control of Raima Inc. and Raima Inc.
is not responsible for the content on any linked site. If you access a third-party Web site mentioned in this guide,
you do so at your own risk. Inclusion of any links does not imply Raima Inc. endorsement or acceptance of the
content of those third-party sites.
Contents
Introduction
Operational Overview
How this Book is Organized
A Language for Describing a Language
A Simple Interactive SQL Scripting Utility
Interface and Scripting Commands
Defining a Database
Create Database
Create Domain
Create Table
Standard Database Table
Virtual Table
Compiling a DDL Specification
Example Databases
National Science Foundation Awards Database
Antiquarian Bookshop Database
Retrieving Data from a Database
Simple Queries
Column Expressions
Conditional Queries
Retrieving Data from Multiple Tables
Sorting Query Results
Performing Result Set Aggregate Calculations
NSF Gender Study Example
Inserting Data into a Database
Transactions
Insert Values
Insert From Select
Import
Changing and Deleting Data in a Database
Searched Delete Statement
Searched Update Statement
Writing and Using Stored Procedures
Concurrent Database Access
Locking In RDM SQL
Read Only Transactions
Modification Stored Procedures
Avoiding Deadlock
Concurrent Database Access Use in Static SQL Applications
How Queries are Processed by RDM SQL
Overview of the Query Optimization Process
Cost-Based Optimization
Restriction Factors
Table Access Methods
Sequential Table Scan
Hashed Access Retrieval
Index Access Retrieval
Joins Involving Primary and Foreign Keys
Optimizable Expressions
Access Plan Determination
Selecting From Alternative Access Methods
Selecting the Access Order
Sorting and Grouping Operations
Outer Join Processing
Returning the Number of Rows in a Table
Query Construction Guidelines
Controlling Optimizer with a User-Specified Restriction Factor
Using SQL in an Application Program
Native SQL API Basics
Comparing the ODBC API with the Native RSQL API
Connection Handles
Statement Handles
Header Files
API Function Parameters
SQL Data Types and Values
Structure of an RDM SQL Application
Hello World!
Initializing and Terminating TFS operation
Connecting to a TFS and Opening Databases
Database Unions
Compiling and Executing SQL Statements
Retrieving Select Statement Results
Basic Retrieval
Retrieving Blob Data Values
Fetching Results From Retrieval Stored Procedures
Positioned Update and Delete Statements
User-Defined Functions (UDFs) in SQL
UDF Load Table Definition and Registration
UDF Type Checking Function: udfCheck
UDF Initialization Function: udfInit
UDF Termination Function: udfTerm
Scalar Call Function: udfScalarCall
Aggregate UDF Call Function: udfAggCall
Aggregate UDF Result Function: udfAggResult
Aggregate UDF Reset Function: udfAggReset
Calling RSQL API Functions from a UDF
Using Virtual Tables to Access Any Data
Virtual Table Load Table Definition and Registration
Thread-safe Access to Global Data Used by a Virtual Table Interface
Virtual Table Execution Function: vtInsert
Virtual Table Row Count Function: vtRowCount
Virtual Table Row Count Function: vtSelectCount
Virtual Table Select Open Function: vtSelectOpen
Virtual Table Fetch Function: vtFetch
Virtual Table Select Close Function: vtSelectClose
Virtual Table Usage
Accessing a Core (non-SQL) Database in RDM SQL
How Core Database Record Types are Mapped to SQL Tables
Mapping Core Keys to SQL Keys
Mapping Core Sets to SQL Foreign Keys
Multi-Member Sets and Explicit Locking
Order of Columns in the Table
Null Values
Adding Column Information and Creating a Catalog
SQL Built-In Function Reference
Aggregate Functions
Scalar Functions
Mathematical Functions
Date and Time Functions
String Functions
abs
acos
age
asin
atan
atan2
avg
ceiling
convert
cos
cot
count
curdate
curtime
dayofmonth
dayofweek
dayofyear
exp
floor
hour
if
ifnull
log
max
min
minute
mod
month
pi
quarter
query
rand
second
sign
sin
sqrt
sum
tan
week
year
SQL Language Syntax Summary
RDM DDL Statements
RDM DML Statements
RDM Procedure Statements
SQL Reserved Words for RDM
SQL Statement Reference
close
commit
create catalog
create database
create domain
create procedure
create table
create virtual table
delete
drop database
drop procedure
end read only transaction
execute
export
import
initialize
insert
lock table
open
release
rollback
savepoint
select
set
set column
start
unlock table
update
SQL UDF Reference
udfAggCall
udfAggReset
udfAggResult
udfCheck
udfInit
udfScalarCall
udfTerm
SQL Virtual Table Function Reference
vtFetch
vtInsert
vtRowCount
vtSelectClose
vtSelectCount
vtSelectOpen
Glossary
Index
Introduction
"The days just prior to marriage are like
a snappy introduction to a tedious book."
- Wilson Mizner,
US Screenwriter (1876-1933)
According to Wikipedia's entry entitled "Elephant Joke", there's an old one that goes like this:
Q. How many elephants will fit into a Mini?
A. Four: two in the front, two in the back.
Q. How many giraffes will fit into a Mini?
A. None. It's full of elephants.
Of course, if it is possible to get four elephants into a Mini then it must be pretty easy to get one in. In which case,
there must also be no problem using SQL in an embedded computer application! But, even if one does succeed
in getting the elephant into the car, the added weight will certainly have a significant negative impact on its speed.
Such is the concern about using SQL in an embedded database application. The 2008 edition of Volume 2 of the ANSI/ISO SQL standard is over 1300 pages long. That's about twice the size of the 1992 standard,
which itself was considerably larger than the original 1989 standard. A fully-compliant implementation of SQL
(which may not actually exist) is indeed a monster. For any SQL database management system (DBMS) implementer, just the effort involved in understanding the standard in order to construct a commercially-viable, fully-compliant implementation is immense.
Nevertheless, SQL has become the industry standard database access language. As such, there are many software developers who know how to use SQL. Because of this vast availability of SQL database skills, many companies that are involved in the development of embedded computer applications with database management
requirements would like to be able to use SQL to access and manipulate that database information.
The DBMS capabilities that are needed in embedded computing applications are not nearly as broad as those
needed in enterprise systems. RDM SQL has been designed specifically for embedded systems applications. As
such, it provides a subset of the ANSI/ISO standard SQL that is suitable for running on a wide variety of computers and embedded operating systems, many of which have limited computing resources. Some non-standard
features are also included that are designed specifically for the needs of embedded computing applications.
RDM SQL is built on top of the RDM database system and thus provides all of its replication and mirroring
capabilities. However, it is important to note that RDM SQL is not designed to provide an SQL interface to existing RDM applications but to be the primary database access interface for the application. Of course, the ability to
use the core-level RDM API is available to the RDM SQL user, but the need to utilize the lower-level record-oriented API would be the exception and not the rule. On a practical level, what this means is that the application
database can only be defined through the RDM SQL DDL, which does not expose all of the DDL capabilities available in the non-SQL RDM DDL.
Features of SQL that are not all that useful in embedded applications and that, when implemented, can consume a
significant amount of computing resources have not been implemented in RDM SQL. Those features include
database views (create view), security (grant and revoke), check clause integrity constraints, triggers
(create trigger), and dynamic DDL (alter table).
Non-standard features that have been added based on embedded application requirements include the ability
to:
• include compiled C modules containing statically initialized database catalog tables and SQL stored procedures,
• include compiled C modules containing statically initialized, pre-compiled SQL stored procedure definitions,
• define user-defined SQL functions in C,
• define virtual tables that allow any kind of data source (e.g., real-time sensor network data) to be accessed through SQL,
• limit the number of returned rows from a select statement by number or time,
• produce a target SQL application that does not need to perform any dynamic compilation of SQL statements.
This manual uses standard database and SQL terminology such as DDL (database definition language), DML
(database manipulation language), etc. If there is a term that you do not understand simply refer to the glossary
toward the end of the manual for a definition.
Operational Overview
RDM SQL is designed to be used in a C language application program and to execute on virtually any operating system and hardware platform. While many platforms are supported, a given database application must only use
platforms that are architecturally identical (e.g., same endianness).
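As an illustration of what "same endianness" means in practice, the following minimal C sketch reports the byte order of the machine it runs on. The function name host_is_little_endian is hypothetical and is not part of the RDM SQL API; it only shows one common way to inspect byte order, under the assumption that the platform is either little-endian or big-endian.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not an RDM function.
   Returns 1 when the machine stores the least significant byte of a
   multi-byte integer at the lowest address (little-endian), else 0. */
int host_is_little_endian(void)
{
    const uint32_t probe = 1;
    unsigned char first_byte;

    /* Copy the lowest-addressed byte of the 32-bit probe value. */
    memcpy(&first_byte, &probe, 1);
    return first_byte == 1;
}
```

Two platforms for which this function returns different values store integers in different byte orders, so they would not count as architecturally identical in the sense used above.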
Input and output to an RDM database is managed by an RDM Transactional File Server (TFS). The RDM SQL
application makes calls to the RDM SQL application program interface (API) functions which can compile and/or
execute SQL statements embedded in the application program.
Figure 1 shows a typical RDM SQL application that includes the ability to dynamically compile and execute SQL
statements.
Figure 1 - Dynamic RDM SQL Application
Embedded applications, however, typically have well-defined data access and manipulation requirements and
so they usually do not need to have the ability to support ad hoc query processing. As much as 25-30% of an
SQL implementation goes to the support of dynamic compilation. Thus, if this can be eliminated from the embedded application code, a significant amount of memory can be saved.
In order to do this, RDM SQL provides the ability to define a basic stored procedure that can contain either one or
more select statements or one or more insert, update, or delete statements. These statements are compiled on a
host development computer system. The compiled form of the stored procedure is stored in both a C file and a
binary file. The C file can be compiled and linked in with the application and the procedures executed through a
specific RDM SQL API function call (rsqlExecProc). When all of the SQL statements used by an application
are encapsulated this way in pre-compiled stored procedures then the compilation component of RDM SQL is no
longer needed and can be omitted from the application. Figure 2 depicts this situation.
Notice that an RDM application program can access databases from any number of TFSs and that those TFSs
can be running on any computer that is accessible to the application's computer through TCP/IP. A feature of
RDM SQL is the ability to open multiple instances of the same database running on separate TFSs as a single
database that is a union of the separate instances. This allows the database to be separated into independent
partitions on which queries can be performed across all partitions. The Concurrent Database Access section will
describe this feature in more detail.
Figure 2 - Static RDM SQL Application
Provided with RDM SQL is a command-line tool called rdmsql (described in detail in the Interactive SQL Scripting Utility section) which can be used to dynamically execute user-specified SQL statements and text files containing SQL statements. A typical use of rdmsql is to process a file containing the SQL DDL statements that
define a database. This process is shown in Figure 3.
Figure 3 - How RDM SQL Processes a DDL File
Embedded development often involves doing development on a host system and deploying the application on a
target system. Catalogs and stored procedures that are created on the host platform can only be used on a target platform that is architecturally identical to the host. However, if the catalogs and stored procedures were
created by an RDM SQL running under a target simulator on the host system, then they will be output in a target-compatible format.
Besides the native RDM SQL API, standard ODBC and JDBC interfaces are also provided. Two forms of each
are available. A client-server version allows an ODBC or JDBC application to interact with an RDM SQL database engine running as a server on a separate computer. This allows, for example, third-party ODBC-based
tools to access an RDM SQL database without having to execute on the same computer, something which may
not even be possible on some embedded systems. Alternatively, if you prefer to program using a standard SQL
interface, you can link your target computer C/C++ (or Java) application directly with our ODBC (or JDBC)
library.
How this Book is Organized
The sections in this book are designed as a tutorial that incrementally introduces you to SQL in general and its
use in RDM specifically. Rather than just repeat here what's also in the Table of Contents, I recommend that you
check it out to see how the book is organized.
Following the chapters, the appendices which comprise a significant amount of the book provide a reference
manual for the system. If you already know SQL then you can skip most of the chapters and go right to the appendices. However, I would strongly suggest that you read through Chapters 3, 4, 8, 9, 10, 12, and 13 because they
describe important features that are unique to RDM SQL. Okay, so you don't really get to skip much at all.
We here at Raima have worked hard to make this manual both easy-to-read and easy-to-use as well as accurate. Any errors are the responsibility of the primary author and if you find any we would greatly appreciate your
letting us know which you can easily do through our web site at http://www.raima.com.
A Language for Describing a Language
Works of imagination should be
written in very plain language;
the more purely imaginative they are
the more necessary it is to be plain.
- Samuel Taylor Coleridge
SQL stands for "Structured Query Language". You have probably seen many different methods used in programming manuals to show how to use a specific programming language. The two most common methods use
syntax flow diagrams and what is known as Backus-Naur Form (BNF) which is a formal language for describing
a programming language. In this document we use a simplified BNF method that seeks to represent the language in a way that closely matches the way you will code your own SQL statements for your application.
For example, the following select statement:
select sale_name, company, city, state
from salesperson natural join customer;
can be described by this syntax rule:
select_stmt:
select identifier[, identifier]… from identifier [natural join identifier] ;
where "select_stmt" is the name of the rule (sometimes called a non-terminal); the bold-faced identifiers select,
from, natural, and join are key words (sometimes called terminal symbols); identifier is like a function argument
that stands in place of a user-specified value (technically, it too is the name of a rule that is matched by any user-specified value that begins with a letter followed by any sequence consisting of letters, digits, and the underscore
("_") character). Rule names are identifiers and their definitions are specified by giving the rule name beginning
in column 1 and terminating the rule with a colon (":") as shown above.
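The identifier rule described above (a letter followed by any sequence of letters, digits, and underscores) can be sketched as a small C function. This is only an illustration of the rule as stated in the text; the function name is_identifier is hypothetical, it is not how RDM SQL itself scans identifiers, and it does not check for reserved words. It relies on the standard isalpha and isalnum functions, so which characters count as letters depends on the current C locale.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper, not an RDM function.
   Returns 1 if text is a letter followed by any mix of letters, digits,
   and underscores; returns 0 otherwise (including NULL or empty text). */
int is_identifier(const char *text)
{
    size_t i;

    /* The first character must be a letter. */
    if (text == NULL || !isalpha((unsigned char)text[0]))
        return 0;

    /* Every remaining character must be a letter, digit, or "_". */
    for (i = 1; text[i] != '\0'; i++) {
        unsigned char c = (unsigned char)text[i];
        if (!isalnum(c) && c != '_')
            return 0;
    }
    return 1;
}
```

Under this sketch, sale_name and salesperson from the example select statement are accepted, while text such as 2nd (starts with a digit) or yr born (contains a space) is rejected.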
There are also special meta-symbols that are part of the syntax descriptor language. Two are shown in the
above select_stmt syntax rule. The brackets ("[" and "]") enclose optional elements. The ellipsis ("…") specifies
that the preceding item can be repeated zero or more times. Other meta-symbols include a vertical bar (i.e., an
"or" symbol) that is used to separate alternative elements and braces ("{" and "}") which enclose a set of alternatives from which one must always be matched. All other special characters (e.g., the "," and ";" in the
select_stmt rule) are considered to be part of the language definition. Meta-symbols that are themselves part of the language will be enclosed in single quotes (e.g., '[') in the syntax rule.
Rule names can be used in other rules. For example, the syntax for a stored procedure that can contain multiple
select statements could be described by the following rule:
create_proc:
create procedure identifier as
select_stmt[; select_stmt]…
end proc;
In order to make the syntax more readable, any non-bold, italicized name is considered to be matched as an identifier. Thus, the select_stmt rule can also be written as follows…
select_stmt:
select column_name[, column_name]… from table_name [natural join table_name] ;
where column_name represents identifiers that correspond to table column names and table_name represents
identifiers that correspond to table names.
Some italicized terms are used to match specific text patterns. E.g., number matches any text pattern that can be
used to represent a number (either integer or decimal) and integer matches any pattern that represents an
integer number.
These rules are summarized in the table below.
Table 1. Syntax Description Language Elements
Syntax Element        Description
keyword               Bold-faced words that identify the special words used in the language that specify
                      actions and usage. Sometimes called reserved words. Examples: select, insert,
                      create, using.
identifier (italic)   Italicized word corresponding to an identifier: a sequence of letters, digits, and "_"
                      that begins with a letter.
number                Any text that corresponds to an integer or decimal number.
integer               Any text that corresponds to an integer.
[option1 | option2]   A selection in which either nothing or option1 or option2 is specified.
{option1 | option2}   Either option1 or option2 must be specified.
element…              Repeat element zero or more times.
identifier (normal)   Normal-faced identifiers correspond to the names of syntax rules. Syntax rules are
                      defined by the name starting in column 1 and ending with a ":".
A Simple Interactive SQL Scripting Utility
Beauty of style and harmony
and grace and good rhythm
depends on simplicity.
- Plato
Okay, I know that this is the world of point-and-click, easy-to-use applications. In fact, many abound for doing just
that with SQL. So what value can there possibly be in providing a text-based, command-line-oriented, interactive
SQL scripting utility? Well, for one thing, you can keep both hands on the keyboard and never have to touch the
mouse! Novel concept isn't it? It also has provided us here at Raima with something that was easy to write and is
easily ported to any platform. Hence, the interface works identically on all platforms. It also provides us (and, presumably, you as well) with the ability to generate test cases that can be easily and automatically executed. You
will more effectively learn how to properly formulate SQL statements by actually typing them in than by simply
pointing to icons that do the job for you.
The name of this program is rdmsql. To start rdmsql, open an OS command window and enter a command
that conforms to the following syntax.
rdmsql
When started, rdmsql will display its startup banner (unless the –B option was specified) and an input prompt.
Enter ? for list of interface commands.
001 rdmsql:
The number in the command prompt above (001 rdmsql:) is a SQL statement number which is incremented
for each SQL statement executed.
Interface and Scripting Commands
The list of rdmsql interface commands is given in the following table.
Command          Description
?                Display the list of commands available.
--               Comment delimiter. Lines beginning with "--" will be ignored.
                     -- Script File Example
                     -- Open bookshop database and wait for input
                     .c 1
                     open bookshop;
                 Running the above script will open the bookshop database and then wait for input.
.c [n srv port]  Select connection handle "n". By default there are 5 connection handles available. If "n" is
                 not provided, the current connection information is displayed.
                 If the remote connection option is selected on the command line, the "srv" parameter
                 specifies the host name where rdmsqlserver is running and "port" specifies the anchor
                 port number (default is port number 21553).
.d * | n [,n]    Disconnect all connections (*) or specific connections by connection number.
.q               Exit the rdmsql utility. The process of exiting will rollback any uncommitted transactions
                 and disconnect connections before exiting.
.r filespec      Read and execute statements from filespec.
!oscmd           Execute the specified OS command. For example, the following shows executing a
                 "dir" command:
                     001 rdmsql: !dir *.txt /b
                     acctmgrs.txt
                     authors.txt
                     bnotelines.txt
                     bnotes.txt
                     bookgens.txt
                     books.txt
                     booksubs.txt
                     genres.txt
                     names.txt
                     patrons.txt
                     pnotelines.txt
                     pnotes.txt
                     sales.txt
                     subjects.txt
                     001 rdmsql:
<return>         Display the current statement.
;                Resubmit current statement.
*                Display statement history (default 25).
-[n]             Retreat current statement n lines (default 1).
+[n]             Advance current statement n lines (default 1).
#n               Make statement number n the current statement.
/old/new/[g]     Substitute 'new' for 'old' in current statement. Specify 'g' to replace all occurrences.
                 In the example below, the current statement is statement 002. The substitution command
                 (/091/081/) replaces the matching text in the calculation and redisplays the modified
                 statement. The modified current statement is then resubmitted using the ";" command.
                     002 rdmsql: select bookid, price, price*0.091 tax from book
                     where bookid like "carl%";
                     bookid     price  tax
                     carlyle01  125    11.375
                     carlyle02  1385   126.035
                     carlyle03  995    90.545
                     carlyle04  3750   341.25
                     carlyle05  5750   523.25
                     003 rdmsql: /091/081/
                     rdmsql: select bookid, price, price*0.081 tax from book where
                     bookid like "carl%"
                     + 003 rdmsql: ;
                     bookid     price  tax
                     carlyle01  125    10.125
                     carlyle02  1385   112.185
                     carlyle03  995    80.595
                     carlyle04  3750   303.75
                     carlyle05  5750   465.75
                     004 rdmsql:
.T [start|stop]  Start / stop timer. Displays elapsed time between start and stop in seconds and outputs to
                 stdout.
.e [on|off]      Turn on/off echo of executing statements. If on/off is not specified, the current echo mode
                 is displayed.
.t [on|off]      Turn on/off table display mode. If on/off is not specified, the current table display mode is
                 displayed.
.n               Display next row if table display mode is off.
                 The example below shows the usage of the display table mode:
                     116 rdmsql: .t on
                     *** table mode is on
                     116 rdmsql: select name, age(hire_date) from acctmgr where age(hire_date) = 12;
                     name             age(hire_date)
                     Fox, Joe         12
                     Kelly, Kathleen  12
                     117 rdmsql: .t off
                     *** table mode is off
                     117 rdmsql: select name, age(hire_date) from acctmgr where age(hire_date) = 12;
                     name           : Fox, Joe
                     age(hire_date) : 12
                     118 rdmsql: .n
                     name           : Kelly, Kathleen
                     age(hire_date) : 12
                     118 rdmsql: .n
                     *** no more rows
                     118 rdmsql:
.l [n]           Set output page length to n lines. If n is not specified, the current page length is displayed
                 (default 50).
.w [n]           Set output page width to n columns. If n is not specified, the current page width is displayed
                 (default 4096).
.C               Execute commit (alternative to "commit;").
.R               Execute rollback (alternative to "rollback;").
                     005 rdmsql: select avg(price) from book;
                     avg(price)
                     7200.48012232416
                     006 rdmsql: update book set price = 100;
                     *** 327 rows affected
                     007 rdmsql: select avg(price) from book;
                     avg(price)
                     100
                     008 rdmsql: .R
                     008 rdmsql: select avg(price) from book;
                     avg(price)
                     7200.48012232416
.i               Display current transaction status.
.m message       Display message on stdout.
.y [on|off]      Set prepare only mode. If on/off is not supplied, the current mode is displayed (default
                 off).
                 The example below shows the preparation of a statement requiring one parameter,
                 assigning the parameter and then executing the statement.
                     016 rdmsql: .y on
                     *** prepare only mode is on
                     016 rdmsql: select bookid, price, price*? as tax from book where
                     bookid like "ca%";
                     017 rdmsql: .p1 0.091
                     017 rdmsql: .x
                     bookid     price  tax
                     carlyle01  125    11.375
                     carlyle02  1385   126.035
                     carlyle03  995    90.545
                     carlyle04  3750   341.25
                     carlyle05  5750   523.25
                     carroll01  4500   409.5
                     carroll02  2000   182
                     carroll03  75     6.825
                     cather01   7500   682.5
                     cather02   5450   495.95
                     cather03   5895   536.445
                     cather04   1550   141.05
                     cather05   850    77.35
                     cather06   475    43.225
                     cather07   335    30.485
                     cather08   250    22.75
                     017 rdmsql:
                 The parameter value can be changed and the current statement re-executed:
                     017 rdmsql: .p1 0.092
                     017 rdmsql: .x
.o [on|off]      Set autocommit mode. If on/off is not specified, the current mode will be displayed (default
                 off).
.s filespec      Save entered commands to filespec. File will be saved and closed on exit.
.f getcursor     Get the cursor name associated with the current statement handle.
                 The following example illustrates using a cursor to update a specific row in a table.
                     002 rdmsql: .t off
                     *** table mode is off
                     002 rdmsql: select bookid, price from book for update;
                     bookid : alcott01
                     price  : 1200
                     003 rdmsql: .n
                     bookid : alcott02
                     price  : 1075
                     003 rdmsql: .f getcursor
                     *** cursor = SQL_CUR_2108_41d8
                     003 rdmsql: .h 2
                     *** using statement handle 2 of connection 1
                     003 rdmsql: update book set price=1076 where current of SQL_CUR_2108_41d8;
                     *** 1 rows affected
                     004 rdmsql: select bookid, price from book;
                     bookid : alcott01
                     price  : 1200
                     005 rdmsql: .n
                     bookid : alcott02
                     price  : 1076
                     005 rdmsql:
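The behavior of the substitution command (/old/new/[g]) described in the table above can be modeled with a short C sketch. This is not the rdmsql implementation; the function name substitute, its parameters, and its return convention are all hypothetical, and the sketch assumes plain text matching: the first occurrence of the old text is replaced, or every occurrence when the g flag is given.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical model of /old/new/[g], not rdmsql source code.
   Copies stmt into out, replacing the first occurrence of old_text with
   new_text, or every occurrence when global is nonzero. Returns the number
   of replacements made, or -1 if old_text is empty or out is too small. */
int substitute(const char *stmt, const char *old_text, const char *new_text,
               int global, char *out, size_t out_size)
{
    size_t old_len = strlen(old_text);
    size_t new_len = strlen(new_text);
    size_t used = 0;
    int count = 0;

    if (old_len == 0 || out_size == 0)
        return -1;

    while (*stmt != '\0') {
        /* A match only counts if we are still allowed to replace. */
        int match = (global || count == 0) &&
                    strncmp(stmt, old_text, old_len) == 0;
        size_t piece_len = match ? new_len : 1;
        const char *piece = match ? new_text : stmt;

        if (used + piece_len >= out_size)
            return -1;          /* no room for the piece plus terminator */
        memcpy(out + used, piece, piece_len);
        used += piece_len;
        stmt += match ? old_len : 1;
        count += match;
    }
    out[used] = '\0';
    return count;
}
```

Calling substitute with the old text 091, the new text 081, and global set to 0 on the statement from the example turns price*0.091 into price*0.081, which corresponds to what /091/081/ does to the current statement.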
Once a connection has been opened, you can submit SQL statements by simply typing in the statement from the
command prompt. Statements can span multiple input lines and are terminated with a semicolon (";"). At this
point, rdmsql will compile and execute the statement. Any errors detected during compilation or execution will
be displayed. If the statement was a select statement then the result set will be displayed and paginated according to the .l and .w settings. A sample session is shown below. User input is shown in bold-faced text.
RDMSQL Utility
Raima Database Manager 11.0.0 Build 412 [2-15-2012] http://www.raima.com/
Copyright © 2012, Raima Inc. All rights reserved.
Enter ? for list of interface commands.
001 rdmsql: .c 1
*** using statement handle 1 of connection 1
001 rdmsql: .l 50
*** lines per page = 50
001 rdmsql: .w 132
*** columns per page = 132
001 rdmsql: open bookshop;
002 rdmsql: select full_name, gender, yr_born, yr_died from author;
FULL_NAME                          GENDER YR_BORN YR_DIED
Alcott, Louisa May                 M      1832    1888
Austen, Jane                       F      1775    1817
Bacon, Francis                     M      1561    1626
Barrie, J. M. (James Matthew)      M      1860    1937
Baum, L. Frank (Lyman Frank)       M      1856    1919
Bronte, Charlotte                  F      1816    1855
Bronte, Emily                      F      1818    1848
Burns, Robert                      M      1759    1796
Burroughs, Edgar Rice              M      1875    1950
Carlyle, Thomas                    M      1795    1881
Carroll, Lewis                     M      1832    1898
Cather, Willa                      F      1873    1947
Chaucer, Geoffrey                  M      1343    1400
Chesterton, G. K. (Gilbert Keith)  M      1874    1936
Coleridge, Samuel Taylor           M      1772    1834
Conrad, Joseph                     M      1857    1924
Cooper, James Fenimore             M      1789    1851
Crane, Stephen                     M      1871    1900
Descartes, Rene                    M      1596    1650
Defoe, Daniel                      M      1661    1731
Dickens, Charles                   M      1812    1870
Dostoyevsky, Fyodor                M      1821    1881
Doyle, Arthur Conan, Sir           M      1859    1930
Dumas, Alexandre                   M      1802    1870
Eliot, George                      F      1819    1880
Faulkner, William                  M      1897    1962
Ferber, Edna                       F      1887    1968
Franklin, Benjamin                 M      1706    1790
Gaskell, Elizabeth Cleghorn        F      1810    1865
Hardy, Thomas                      M      1840    1928
Hawthorne, Nathaniel               M      1804    1864
Hemingway, Ernest                  M      1899    1961
Hobbes, Thomas                     M      1588    1679
Hugo, Victor                       M      1802    1885
Irving, Washington                 M      1783    1859
James, Henry                       M      1843    1916
Flaubert, Gustave                  M      1821    1880
Johnson, Samuel                    M      1709    1784
Kipling, Rudyard                   M      1865    1936
Lewis, Sinclair                    M      1885    1951
London, Jack                       M      1876    1916
Longfellow, Henry Wadsworth        M      1807    1882
Milton, John                       M      1608    1674
Muir, John                         M      1838    1914
Paine, Thomas                      M      1737    1809
Poe, Edgar Allan                   M      1809    1849
Potter, Beatrix                    F      1866    1943
Raleigh, Walter, Sir               M      1552    1618
Scott, Walter, Sir                 M      1771    1832
Shakespeare, William               M      1564    1616
**** press <enter> to continue or s to stop here
FULL_NAME                          GENDER YR_BORN YR_DIED
Shelley, Mary Wollstonecraft       F      1797    1851
Sinclair, Upton                    M      1878    1968
Steinbeck, John                    M      1902    1968
Stevenson, Robert Louis            M      1850    1894
Stowe, Harriet Beecher             F      1811    1896
Swift, Jonathan                    M      1667    1745
Tennyson, Alfred, Baron            M      1809    1892
Thoreau, Henry David               M      1817    1862
Tolstoy, Leo                       M      1828    1910
Trollope, Anthony                  M      1815    1882
Twain, Mark                        M      1835    1910
Verne, Jules                       M      1828    1905
Wells, H. G. (Herbert George)      M      1866    1946
Wharton, Edith                     F      1862    1937
Whitman, Walt                      M      1819    1892
Wilde, Oscar                       M      1854    1900
Woolf, Virginia                    F      1882    1941
003 rdmsql: .q
The –b startupfile command line option can be used to run the script file startupfile in batch mode, in
which rdmsql will automatically open a connection and process each statement in order. When the last one
has been executed, rdmsql will automatically compile and execute a commit statement to ensure that all of the
work has completed and the data has been stored, and then the program will terminate. Error messages associated with any
errors that are encountered will be output to stdout.
This option is useful for processing files containing a SQL DDL specification. It is also good to use when importing data into database tables.
Defining a Database
But Vronsky felt that now especially it
was essential for him to clear up
and define his position if he were
to avoid getting into difficulties.
- Leo Tolstoy, Anna Karenin
A poorly designed database can create all kinds of difficulties for the user of a database application. Unfortunately, the blame for those difficulties is often laid at the feet of the database management system which, try as it might, simply cannot use non-existent access paths to quickly get at the needed data. Good database design is as much an art as it is engineering, and a solid understanding of the application requirements is a necessary prerequisite. However, it is not the purpose of this document to teach you how to produce good database designs. But you do need to understand that designing a database is a complex task and that the quality of the application in which it is to be used is highly dependent on the quality of the database design. If you are not experienced in designing databases, it is highly recommended that you first consult one of the many good books on that subject before setting out to develop your RDM SQL database.
A database schema is the definition of what kind of data is to be stored and how that data is to be organized in the
database. The Database Definition Language (DDL) consists of the SQL statements that are used to describe a
particular database schema (also called the database definition). Three DDL statements are provided in RDM
SQL: create database (schema), create domain, and create table. The create database (schema) statement
names the database that will be defined by the create domain and create table statements that follow it. The
create domain statement allows you to define a special-purpose data type that can be used by a subsequent
create table statement in the declaration of a table column. The create table statement is used to define the characteristics of a table that will be stored in the database. Each of these DDL statements is described in detail in the following sections.
Create Database
The create database statement must be the first DDL statement issued for a new database specification. The
syntax for this statement is as follows.
create_schema_stmt:
create {schema | database} db_name
[pagesize = num] [inmemory [persistent | volatile | read]]
Use of "schema" (instead of "database") follows the ISO/ANSI SQL standard convention. The pagesize and
inmemory options are RDM SQL extensions. The pagesize option sets the default page size for all of the database files. If not specified, the default page size is 1024 bytes. The inmemory option indicates that the database
is to be kept entirely in memory. The read, persistent, and volatile options control whether the database files are
read from disk when the database is opened (read, persistent), and whether they are written to the disk when
the database is closed (persistent). The default inmemory option is volatile which means that the database is
created empty the first time it is opened and will cease to exist either after the last application closes the database
(e.g. Windows) or when the system is rebooted (Unix). The read option means that the entire database is read from the files when the database is opened; changes to the data are allowed but are not written back to the files on closing. The persistent option means that the entire database is read on opening and all changes that were made while the database was open are written when the database is closed.
The database consists of all of the tables that are declared in the create table statements that are issued after
the create database statement.
Examples
create database sales;
create database usage_stats pagesize = 512;
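As a further sketch of the options in the syntax above, the following statements combine pagesize with each of the inmemory modes. The database names are hypothetical, each statement would begin its own DDL specification, and the comments simply restate the behavior described in this section.

```sql
/* Hypothetical database names; option order follows create_schema_stmt. */

/* Created empty on first open; nothing is read from or written to disk. */
create database session_cache inmemory volatile;

/* Read from the files on open; changes are allowed but not written back. */
create database price_list pagesize = 2048 inmemory read;

/* Read from the files on open; all changes are written when closed. */
create database sensor_log pagesize = 2048 inmemory persistent;
```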
Create Domain
A "domain" is simply a user-defined and named data type which can then be specified as the data type for columns that are declared in a create table statement. The syntax for the create domain statement is shown below.
create_domain_stmt:
        create domain domain_name [as] data_type
            [default {constant | null}]
            [distinct values = num] [range constant to constant]
The name of the domain is specified as the domain_name. The data_type specifies the base type for the domain.
A constant value or null can be specified as the default.
The distinct values clause specifies the number of distinct values that will be stored in columns of this type. The
range clause specifies the minimum and maximum values that will be stored in columns of this type. These two
clauses provide important information that is only used by the RDM SQL query optimizer to determine the best
possible execution plan for a query. Note that these clauses do not specify column validation checks. It will still be
possible to store values that are outside of the specified range.
The data types that are available in RDM SQL are given in the following syntax specification.
data_type:
        base_type | blob_type
base_type:
        {character | char} [(length)]
    |   {{character | char} varying | varchar} (length)
    |   binary [(length)]
    |   {double [precision] | float | real}
    |   {tinyint | smallint | int | integer | long | bigint}
    |   date | time | timestamp
blob_type:
        {{character | char} large object | long varchar | clob} [(length)] file_option
    |   {binary large object | large varbinary | blob} [(length)] file_option
file_option:
        [pagesize = num] [inmemory [persistent | volatile | read]]
Each specific blob instance is stored in a separate set of blob file pages using only as many pages as are needed
to store the entire blob. If the size of the blob data is less than a page the unused space on that page will remain
unused. Hence, you should probably supply a pagesize value that will minimize the amount of unused page
space based on the average size of your blob data.
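The sizing advice above can be made concrete with a small sketch. The table, the column names, and the 7,000-byte average are made up for illustration, and the arithmetic in the comment ignores any per-page overhead, which this section does not specify.

```sql
/* Hypothetical table whose cover images average about 7,000 bytes.
   Ignoring any per-page overhead:
     pagesize = 1024 -> 7 pages (7,168 bytes), about   168 bytes unused per blob
     pagesize = 4096 -> 2 pages (8,192 bytes), about 1,192 bytes unused per blob
     pagesize = 8192 -> 1 page  (8,192 bytes), about 1,192 bytes unused per blob */
create table book_image(
    bookid      char(14) primary key,
    cover_image blob pagesize = 1024
);
```

Under these assumptions the smallest page size wastes the least space, at the cost of more pages per blob. Which trade-off is best for a real application depends on the actual size distribution of the blob data.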
Examples
create domain birth_date as
date range date "1900-01-01" to date "2011-01-01";
create domain gender as
char distinct values = 2;
create domain us_state as
char(2) distinct values = 53;
Create Table
Standard Database Table
The create table statement is used to define a table to be included in the database. Create table statements can
only be issued after the create database statement and before issuing any other non-DDL statements. Any
domain types that are used in column declarations included in the create table statement must have already
been declared through the issuance of a prior create domain statement. The syntax for the create table statement is as follows.
standard_table:
        create [circular] table table_name (
            column_def[, column_def]...
            [, key_def[, key_def]...]
        ) [pagesize = num] [inmemory [persistent | volatile | read]]
          [maxpgs = num] [maxrows = num]
column_def:
        column_name {type_spec | domain_name}
            [distinct values = num] [range constant to constant]
            [not null] [key_spec] [refs_spec]
type_spec:
        data_type [default {constant | null}]
key_spec:
        [primary | unique] key ['['keysize']']
    |   {primary | unique} key [hash {(num) | of num rows}] ['['keysize']']
refs_spec:
        references table_name[.column_name] [triggered_action]
key_def:
        [primary | unique] key [hash {(num) | of num rows}] ['['keysize']'] [key_name]
            (column_name [asc | desc] [, column_name [asc | desc]]...)
            [pagesize = num] [inmemory [persistent | volatile | read]] [maxpgs = num]
    |   foreign key [set_name] (column_name[, column_name]...)
            references table_name[(column_name[, column_name]...)]
            [triggered_action]
triggered_action:
        on update action_spec [on delete action_spec]
    |   on delete action_spec [on update action_spec]
action_spec:
        cascade | restrict | set null
The table_name is a user-specified identifier that names the table. The table consists of the columns that are declared within it. Each column is declared with a specific data type, which is either given explicitly or specified through a previously declared domain name. A default value and display format can also optionally be specified unless the column was declared with a domain type.
The distinct values clause specifies the number of distinct values that will be stored in this column. The range
clause specifies the minimum and maximum values that will be stored in the column. These two clauses provide
important information that is only used by the RDM SQL query optimizer to determine the best possible
execution plan for a query. Note that these clauses do not specify column validation checks. It will still be possible
to store values that are outside of the specified range.
Columns can be specified with one or more constraints which declare the column to be:
• not null: null values are not allowed for the column,
• a primary/unique or non-unique key, on which an index will be automatically created,
• a foreign key that references the primary/unique key of the specified table.
Foreign key references are automatically implemented using RDM sets. The name of the column becomes the name of the RDM set. The RDM record type into which the SQL table is mapped will not contain a data field for this column. Instead, the SQL column value is retrieved through the owner of the set (i.e., from the primary key column's value). A triggered_action can be specified with foreign key columns in order to specify what should happen when the referenced row (the owner record instance) is updated or deleted. The default action is restrict, meaning that primary key rows that have existing foreign key references cannot be updated or deleted. If on ... cascade is specified, then all of the referencing rows are updated or deleted when the primary key row is updated (i.e., its primary key column value is changed) or deleted. Note that the referencing table may itself have a primary key that is referenced by foreign keys in other tables, and those foreign keys may not have a cascade triggered action specified. Thus, a cascading delete of a referenced row may be denied because one of the rows it would delete is itself referenced through a restrict foreign key in another table.
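The interaction between cascade and restrict described above can be sketched with three hypothetical tables (none of them belongs to the example databases in this guide). The comments restate the behavior described in this section and are not a trace of actual output.

```sql
/* Hypothetical three-level schema. */
create table vendor(
    vendor_id  integer primary key,
    name       char(40)
);
create table product(
    product_id integer primary key,
    vendor_id  integer references vendor on delete cascade
);
create table order_line(
    line_id    integer primary key,
    product_id integer references product   /* default action: restrict */
);
/* Deleting a vendor row cascades to the product rows that reference it.
   If any of those product rows is still referenced by an order_line row,
   the restrict action on order_line.product_id can cause the delete
   to be denied. */
```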
A key_def on a table is used to declare multi-column primary/unique/non-unique keys and foreign keys. The [primary | unique] key clause is used to identify the columns from the table on which a key is to be formed. You can
specify the sort order for each column to be either ascending (default) or descending. A table can have only one
primary key. If a key_name is specified then that will be the name of the RDM compound key. If not specified a
unique system-generated name will be used.
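A minimal sketch of the two key_def forms follows, using hypothetical tables and key names. The first table declares a named compound primary key. The second declares a named non-unique compound key with a descending column and a compound foreign key that references the first table.

```sql
/* Hypothetical tables illustrating key_def. */
create table reading_site(
    region      char(4),
    site_no     integer,
    descr       char(48),
    primary key site_key(region, site_no)
);
create table reading(
    region      char(4),
    site_no     integer,
    rdg_date    date,
    hour_of_day smallint,
    rdg_val     double,
    /* non-unique compound key: newest dates first, then hour ascending */
    key rdg_when(rdg_date desc, hour_of_day),
    foreign key (region, site_no) references reading_site
        on delete cascade
);
```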
Each table is contained in a separate RDM data file. Each key is contained in a separate RDM key file. The values for each blob type column are stored in a separate RDM blob file. The file_option can optionally be specified to provide RDM-specific file characteristics.
Examples
create table sales_office(
office_id char(3) primary key,
city char(24),
state char(2)
);
create table salesperson(
sale_id integer primary key,
name char(38) not null,
sex gender,
dob birth_date,
hired_on date default today,
sales_tot double,
office char(3) references sales_office,
mgr_id integer references salesperson,
unique key sale_key(name, office)
);
create table customer(
cust_id integer primary key,
name char(38),
sale_id integer not null
references salesperson
on update cascade
on delete restrict
);
Virtual Table
An RDM SQL virtual table is defined through a combination of the create virtual table statement and a set of
user-written C functions that conform to a particular interface specification. A pointer to a pre-defined structure
array that contains an entry for each virtual table with the addresses of each of the virtual table interface functions
is passed into SQL before the database is opened. These functions are then called by SQL at the appropriate
times during the execution of any SQL statement that references the virtual table. This interaction is depicted in
Figure 4 which shows SQL calling the function in the application's virtual table function module to fetch a row of
weather data from a wireless sensor network (WSN). Note that in this example by storing the data retrieved from
the virtual table in a standard table, RDM can then replicate that data to an outside host DBMS (e.g., RDM
Server or some other well-known SQL DBMS). Also note that the green boxes represent code that is compiled
as part of the user's application while the blue boxes represent RDM systems code.
The syntax for the create virtual table statement is given below.
virtual_table:
        create virtual [read only] table table_name (
            vcolumn_def[, vcolumn_def]…
        )
vcolumn_def:
        column_name base_type
            [distinct values = num] [range constant to constant]
            [primary key]
base_type:
        {character | char} [(length)]
    |   {{character | char} varying | varchar} (length)
    |   binary [(length)]
    |   {double [precision] | float | real}
    |   {tinyint | smallint | int | integer | long | bigint}
    |   date | time | timestamp
Figure 4. Virtual Tables in RDM SQL
No create virtual table statement for a given database can be submitted until all standard create table statements have first been submitted. In other words, the create virtual table statements must all come at the end of
your database schema specification. Only one primary key column declaration can appear in a create virtual
table statement. Values for this column must be unique and will be used by SQL in calls to the user-function in
the virtual table interface API to find the row for a specified value.
The DDL schema specification for the aforementioned wireless weather sensor database is given in the following example.
create database weather_db;
create table location( /* location of weather sensor */
longitude integer,
latitude integer,
sensor_id bigint,
descr char(48),
county char(24),
state char(2),
primary key loc_id(longitude, latitude)
);
create table weather_summary(
longitude integer,
latitude integer,
rdg_date date,
hour_of_day smallint,
avg_temp smallint,
avg_press smallint,
avg_hum smallint,
avg_lumens smallint,
foreign key (longitude, latitude) references location
);
create virtual readonly table weather_data(
sensor_id bigint primary key,
loc_long integer,
loc_lat integer,
rdg_time timestamp,
temperature smallint,
pressure smallint,
humidity smallint,
light smallint,
power integer
);
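As a sketch of how the primary key of a virtual table comes into play, the query below asks for a single sensor's current reading. The sensor id 1001 is a made-up value, and the query assumes a standard where clause. Because sensor_id is the primary key, SQL can pass that value to the user-written virtual table function to locate the matching row.

```sql
/* 1001 is a hypothetical sensor id. */
select sensor_id, rdg_time, temperature, humidity
    from weather_data
    where sensor_id = 1001;
```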
Compiling a DDL Specification
Of course, you can interactively enter your DDL statements using rdmsql (or any other ODBC-based SQL utility) but you will normally create the DDL specification for your database using a text editor and storing it in a text
file. A good convention is to store SQL scripts in files with a ".sql" extension. A convention that I like to use is to
name the DDL specification file "dbname.sql". For example, the DDL files for the two example databases
described in the next section are "nsfawards.sql" and "bookshop.sql".
Assuming you use the same convention, you can use rdmsql to compile an SQL DDL file as follows.
rdmsql -b [@hostname:port] dbname.sql
If the @hostname:port is not specified, @localhost:21553 will be used. Errors will be reported to stdout and identify the file and line number of the offending SQL statement. A successful compilation of a DDL specification will
produce the dbname_cat.c and dbname_cat.h files in the current directory (when the "generate C files" option is
enabled; see rsqlSetGenCFiles) and the database dictionary file (dbname.dbd), catalog file (dbname.cat),
data files (dbname.d*), and key files (dbname.k*) in a directory named dbname on the TFS. The database will be
initialized and ready for use.
Example Databases
Two example databases are provided with RDM SQL that facilitate learning how to use RDM SQL and will be
used in most of the examples given in this book. This section describes the two databases by presenting the DDL
specifications along with an explanation of how that data would be used in a SQL application. The first database
contains actual data derived from over 130,000 National Science Foundation (USA) research grants that were
awarded during the years 1990 through 2003. The second database is for a hypothetical bookshop that only sells
high-end, rare antiquarian books.
National Science Foundation Awards Database
The data used in this example has been extracted from the University of California Irvine Knowledge Discovery
in Databases Archive (http://kdd.ics.uci.edu/). The original source data can be found at http://kdd.ics.uci.edu/databases/nsfabs/nsfawards.html. The data was processed by a Raima-developed RDM SQL program
that, in addition to pulling out the data from each award document, converted all personal names to a "last name,
first name, …" format and, where possible, identified each person's gender from the first name. The complete
DDL specification for the NSF awards database is shown below.
NOTE: The NSF Awards example is a large database and may take a few minutes to create and populate.
create database nsfawards;
create table person(
    name        char(35) primary key,
    gender      char(1) distinct values = 3,
    jobclass    char(1) distinct values = 2
);
create table sponsor(
    name        char(50) primary key,
    addr        char(40),
    city        char(20),
    state       char(2) distinct values = 100,
    zip         char(5)
);
create table nsforg(
    orgid       char(3) primary key,
    name        char(40)
);
create table nsfprog(
    progid      char(4) primary key,
    descr       char(40)
);
create table nsfapp(
    appid       char(10) primary key,
    descr       char(40)
);
create table award(
    awardno     integer primary key,
    title       char(200),
    award_date  date key,
    instr       char(3) distinct values = 11,
    start_date  date,
    exp_date    date key,
    amount      double key,
    abstract    long varchar,
    prgm_mgr    char(35) references person,
    sponsor_nm  char(50) references sponsor,
    orgid       char(3) references nsforg
);
create table investigator(
    awardno     integer references award,
    name        char(35) references person
);
create table field_apps(
    awardno     integer references award,
    appid       char(10) references nsfapp
);
create table progrefs(
    awardno     integer references award,
    progid      char(4) references nsfprog
);
Descriptions for each of the tables declared in the nsfawards database are given in the following table.
Table 4. NSF Awards Database Table Descriptions
Table Name      Description
person          Contains one row for each investigator or NSF program manager. An investigator
                (jobclass = "I") is a person who is doing the research. The NSF program manager
                (jobclass = "P") oversees the research project on behalf of the NSF. An award can
                have more than one investigator but only one program manager. The gender column
                is derived from the first name and has three values: "M", "F", and "U" for
                "unknown" when the gender could not be determined from the first name
                (about 13%).
sponsor         Institution that is sponsoring the research. Usually where the principal
                investigator is employed. Each award has a single sponsor.
nsforg          NSF organization. The highest level NSF division or office under which the grant
                is awarded.
nsfprog         Specific NSF programs responsible for funding research grants.
nsfapp          NSF application areas that the research impacts.
award           Specific data about the research grant. The columns are fairly self-explanatory.
                For clarity, the exp_date column contains the award expiration date (i.e., when
                the money runs out). The amount column contains the total funding amount. The
                instr column is a code indicating the award instrument (e.g., "CTG" =
                "continuing", "STD" = "standard", etc.).
investigator    The specific investigators responsible for carrying out the research. This table
                is used to form a many-to-many relationship between the person and award tables.
field_apps      NSF application areas for which the research is intended. This table is used to
                form a many-to-many relationship between the nsfapp and award tables.
progrefs        Specific programs under which the research is funded. This table is used to form
                a many-to-many relationship between the nsfprog and award tables.
Note that the interpretations given in the above descriptions are my own and may not be completely accurate
(e.g., it could be that NSF programs are not actually responsible for funding research grants). However, my
intent is to simply use this data for the purpose of illustration (although we will later delve into an interesting
gender analysis).
Note the use of the distinct values clause in the DDL specification. In particular, where the number of actual distinct values is significantly less than the total number of rows in the table it is important to indicate this so that the
SQL query optimizer can make better choices as to access method. The Concurrent Database Access section
explains in greater detail how the RDM query optimizer works.
A schema diagram for the nsfawards database is shown below. Each box represents a table and each arrow
represents a one-to-many relationship between two tables. The arrow label is the foreign key column (declared
using the references clause in the DDL specification) in the target (i.e. the "many" side) table on which the relationship is formed.
Figure 5 - NSF Awards Database Schema Diagram
Antiquarian Bookshop Database
Our fictional bookshop is located in Hertford, England (a very real and charming town north of London). It is
located in a building constructed around 1735 and has two rather smallish rooms on two floors with floor-to-ceiling bookshelves throughout. Upon entering, one is immediately transported to a much earlier era, being quite overwhelmed by the wonderful sight and odor of the ancient mahogany in which the entire interior is lined, along with the rare and ancient books that reside on its shelves. There is a little bell that announces one's entrance into the shop but it is not really needed, as the delightfully squeaky floorboards quite clearly make your presence known.
In spite of the ancient setting and very old and rare books, this bookshop has a very modern Internet storefront
through which it sells and auctions off its expensive inventory. A computer system contains a database describing the inventory and manages the sales and auction processes. The database schema for our bookshop is
given below.
create database bookshop;
create table author(
    last_name   char(13) primary key,
    full_name   char(35),
    gender      char(1) distinct values = 2,
    yr_born     smallint,
    yr_died     smallint,
    short_bio   varchar(250)
);
create table genres(
    text        char(31) primary key
);
create table subjects(
    text        char(51) primary key
);
create table book(
    bookid      char(14) primary key,
    last_name   char(13)
        references author on delete cascade on update cascade,
    title       varchar(255),
    descr       char(61),
    publisher   char(136),
    publ_year   smallint key,
    lc_class    char(33),
    date_acqd   date,
    date_sold   date,
    price       double,
    cost        double
);
create table related_name(
    bookid      char(14)
        references book on delete cascade on update cascade,
    name        char(61)
);
create table genres_books(
    bookid      char(14)
        references book on delete cascade on update cascade,
    genre       char(31)
        references genres
);
create table subjects_books(
    bookid      char(14)
        references book on delete cascade on update cascade,
    subject     char(51)
        references subjects
);
create table acctmgr(
    mgrid       char(7) primary key,
    name        char(24),
    hire_date   date,
    commission  double
);
create table patron(
    patid       char(3) primary key,
    name        char(30),
    street      char(30),
    city        char(17),
    state       char(2),
    country     char(2),
    pc          char(10),
    email       char(63),
    phone       char(15),
    mgrid       char(7)
        references acctmgr
);
create table note(
    noteid      integer primary key,
    bookid      char(14)
        references book on delete cascade on update cascade,
    patid       char(3)
        references patron on delete cascade on update cascade
);
create table note_line(
    noteid      integer
        references note on delete cascade on update cascade,
    text        char(61)
);
create table sale(
    bookid      char(14)
        references book on delete cascade on update cascade,
    patid       char(3)
        references patron on delete cascade on update cascade
);
create table auction(
    aucid       integer primary key,
    bookid      char(14)
        references book on delete cascade on update cascade,
    mgrid       char(7)
        references acctmgr,
    start_date  date,
    end_date    date,
    reserve     double,
    curr_bid    double
);
create table bid(
    aucid       integer
        references auction on delete cascade on update cascade,
    patid       char(3)
        references patron on delete cascade on update cascade,
    offer       double,
    bid_ts      timestamp
);
Descriptions for each of the above tables are given below.
Table 5. Bookshop Database Table Descriptions
Table Name      Description
author          Each row contains biographical information about a renowned author.
book            Contains information about each book in the bookshop inventory. The last_name
                column associates the book with its author. Books with a non-null date_sold are
                no longer available.
genres          Table of genre names (e.g., "Historical fiction") with which particular books
                are associated via the genres_books table.
subjects        Table of subject names (e.g., "Cape Cod") with which particular books are
                associated via the subjects_books table.
related_name    Related names are names of individuals associated with a particular book. The
                names are usually hand-written in the book's front matter or on separate pages
                that were included with the book (e.g., letters) and identify the book's
                provenance (owners). Only a few books have related names. However, their
                presence can significantly increase the value of the book.
genres_books    Used to create a many-to-many relationship between genres and books.
subjects_books  Used to create a many-to-many relationship between subjects and books.
note            Connects each note_line to its associated book. Notes include edition info and
                other comments (often coded) relating to the book's condition.
note_line       One row for each line of text in a particular note.
acctmgr         Account managers are the bookshop employees responsible for servicing the
                patrons and managing auctions.
patron          Bookshop customers and their contact info. Connected to their purchases/bids
                through their relationship with the sale and auction tables.
sale            Contains one row for each book that has been sold. Connects the book with the
                patron who acquired it through the bookid and patid columns.
auction         Some books are auctioned. Those that have been (or are currently being)
                auctioned have a row in this table that identifies the account manager who
                oversees the auction. The reserve column specifies the minimum acceptable bid;
                curr_bid contains the current amount bid.
bid             Each row provides the bid history for a particular auction.
Foreign keys are declared using the references clause. Many are specified with the on delete/update cascade
option indicating that deletions or updates to the referenced rows will cause the referencing row to automatically
be deleted or updated as well.
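To illustrate with the bookshop schema, consider the sketch below. The noteid value 12 is made up, and the statement assumes a standard delete with a where clause. The comments restate what the declarations in the schema imply and are not actual output.

```sql
/* 12 is a hypothetical noteid. */
delete from note where noteid = 12;
/* note_line.noteid references note on delete cascade, so the note_line
   rows for note 12 are deleted along with the note row. */

/* By contrast, patron.mgrid references acctmgr with no triggered action.
   The default restrict action applies, so an acctmgr row cannot be
   deleted while any patron row still references it. */
```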
A schema diagram depicting the inter-table relationships is shown below. As was mentioned above for the NSF
awards database, the arrows represent a one-to-many relationship between the source and target tables and
labels on the arrows identify the foreign key in the target table on which the relationship is formed.
Figure 6 - Bookshop Database Schema Diagram
The sample data that is included with this example contains book descriptions that were obtained from the
United States Library of Congress online card catalog: http://catalog.loc.gov. The short biographical sketches
included with each author entry are condensed descriptions from information about each author contained on
Wikipedia: http://www.wikipedia.org. The use of the Wikipedia information is governed by the Creative Commons Attribution-ShareAlike license: http://creativecommons.org/licenses/by-sa/3.0/. Pricing information and
the JPEG files of photographs of some of the books in the database were obtained from the website for Peter
Harrington Antiquarian Bookseller in Chelsea, London, http://www.peterharrington.co.uk, which is a perfect real-world example of the kind of bookshop depicted in this example.
Retrieving Data from a Database
You can use all the quantitative data you can get,
but you still have to distrust it and use your own
intelligence and judgment.
- Alvin Toffler
The reason data is stored in a database is so that it can be later retrieved and looked at. However, in order to do
something intelligent with that data it must first intelligently be retrieved. This is often much easier to say than to
do and that is particularly true with a language like SQL.
Data is retrieved from RDM databases using the SQL select statement. This section will explain how to properly
formulate select statements to view data contained in one or more RDM databases.
A completely specified select statement is commonly referred to as a query. The complete set of rows that are
returned by a select statement is called the result set.
Simple Queries
The most basic of queries is to retrieve all of the rows and columns of a table. The easiest way to do this is to use
the following statement:
select_statement:
        select * from table_name
The "*" indicates that all of the columns declared in table_name are to be returned. Thus, you can enter the following statement to see all of the account managers in the acctmgr table in the bookshop database.
select * from acctmgr;
MGRID   NAME             HIRE_DATE   COMMISSION
ALFRED  Kralik, Alfred   1997-07-02  0.025
AMY     Zonn, Amy        1994-07-06  0.025
BARNEY  Noble, Barney    1972-05-08  0.035
FRANK   Doel, Frank      1987-02-13  0.030
JOE     Fox, Joe         1998-12-18  0.025
KATE    Kelly, Kathleen  1998-12-18  0.025
KLARA   Novac, Klara     1990-01-02  0.025
Of course, if you only need to see some but not all of the columns in a table, those columns can be individually
listed as indicated in the following syntax.
select_statement:
        select column_name[, column_name]… from table_name
Each specified column_name must identify a column that is declared in table_name. The next example retrieves
the name, city, and email address of each bookshop patron.
select name, city, email from patron;
NAME                      CITY           EMAIL
Carlos Slim Helu          Acapulco       [email protected]
William Gates, III        Redmond        [email protected]
Warren Buffett            Omaha          [email protected]
Mukesh Ambani             Mumbai         [email protected]
Bernard Arnult            Cannes         [email protected]
Stephen Jobs              Cupertino      [email protected]
Scrooge McDuck            Anaheim        [email protected]
Richie Rich               San Diego      [email protected]
Jed Clampett              Beverly Hills  [email protected]
Bruce Wayne               Gotham City    [email protected]
Thurston Howell III       Newport        [email protected]
Artimis Fowel II          Dublin         [email protected]
Charles Montgomery Burns  Springfield    [email protected]
Jay Gatsby                West Egg       [email protected]
Lucille Bluth             Newport Beach  [email protected]
Chatsworth Osborne Jr.    Haddonfield    [email protected]
Jean Luc Picard           San Francisco  [email protected]
Jeffrey Bezos             Seattle        [email protected]
Giorgio Armani            Piacenza       [email protected]
Column Expressions
Besides retrieving the values of individual columns, a select statement allows you to specify expressions that can
perform arithmetic operations on the columns in a table. The normal arithmetic operators (+, -, *, /) along with a
wide range of scalar functions can be included in a select column expression. The complete syntax for column
expressions is given below.
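Before the full grammar, here is a small sketch of column expressions against the bookshop database. The column aliases (profit, pct_rate) are arbitrary names chosen for illustration. Following the syntax in this section, each alias is written directly after its expression.

```sql
/* Profit on each book: price minus cost, labeled with a column alias. */
select bookid, price - cost profit from book;

/* Commission shown as a percentage; a stored rate of 0.025
   evaluates to about 2.5. */
select name, commission * 100 pct_rate from acctmgr;

/* Scalar functions applied to columns: upper-case name and hire year. */
select ucase(name), year(hire_date) from acctmgr;
```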
select_statement:
select expression [column_alias] [, expression [column_alias] ]… fromtable_name
expression:
operand [arith_operator operand]...
operand:
constant | param_ref | column_ref | function | (expr)
param_ref:
? | :param_name
column_ref:
[{table_name | correlation_name}.]column_name
arith_operator:
+|-|*|/
function:
aggregate_fcn | scalar_fcn
aggregate_fcn:
{sum | avg | max | min} (expression)
| count ({* | column_ref })
| aggregate_udf_name ([expression][, expression]...)
scalar_fcn:
if (conditional_expr, expression, expression)
| numeric_function | datetime_function | string_function
| scalar_udf_name ([expression][, expression]...)
numeric_function:
abs(arith_expr)
| acos(arith_expr)
| asin(arith_expr)
| atan(arith_expr)
| atan2(arith_expr)
| {ceil | ceiling}(arith_expr)
| cos(arith_expr)
| cot(arith_expr)
| exp(arith_expr)
| floor(arith_expr)
| {ln | log}(arith_expr)
| mod(arith_expr)
| pi()
| rand(num)
| sign(arith_expr)
| sin(arith_expr)
| sqrt(arith_expr)
| tan(arith_expr)
datetime_function:
age(dt_expr)
| {curdate | current_date}()
| {curtime | current_time}()
| dayofmonth(dt_expr)
| dayofyear(dt_expr)
| hour(dt_expr)
| minute(dt_expr)
| month(dt_expr)
| quarter(dt_expr)
| second(dt_expr)
| week(dt_expr)
| year(dt_expr)
string_function:
ascii(string_expr)
| char(num)
| concat(string_expr, string_expr)
| convert(expression, {convert_type | {char}, width, convert_format})
| lcase(string_expr)
| left(string_expr, num)
| length(string_expr)
| locate(string_expr, string_expr, num)
| ltrim(string_expr)
| repeat(string_expr, num)
| replace(string_expr, string_expr, string_expr)
| right(string_expr, num)
| rtrim(string_expr)
| substring(string_expr, num, num)
| ucase(string_expr)
| unicode(string_expr)
convert_type:
char | smallint | integer | real
| double | date | time | timestamp | tinyint | bigint
convert_format:
numeric_format | datetime_format
numeric_format:
"[<< | >> | ><]['text' | $][- | (][#,]#[.#[#]...][e | E]['text' | $ | %]"
datetime_format:
"[<< | >> | ><]['text' | spchar | date_code | time_code]..."
date_code:
m | mm | mmm | mon | mmmm | month
| d | dd | ddd | dddd | day
| yy | yyyy
time_code:
h | hh | m | mm | s | ss | .f[f]... | [a/p | am/pm | A/P | AM/PM]
The built-in numeric functions that are available in RDM SQL are listed in the following table.
Table 6. Built-in Numeric Functions
Function        Description
abs             Returns the absolute value of an expression.
acos            Returns the arccosine of an expression.
asin            Returns the arcsine of an expression.
atan            Returns the arctangent of an expression.
atan2           Returns the arctangent of an x-y coordinate pair.
ceil | ceiling  Finds the upper bound for an expression.
cos             Returns the cosine of an angle.
cot             Returns the cotangent of an angle.
exp             Returns the value of an exponential function.
floor           Finds the lower bound for an expression.
ln | log        Returns the natural logarithm of an expression.
mod             Returns the remainder of arith_expr1/arith_expr2.
pi              Returns the value of pi.
rand            Returns next random floating-point number. Non-zero num is seed.
sign            Returns the sign of an expression (-1, 0, +1).
sin             Returns the sine of an angle.
sqrt            Returns the square root of an expression.
tan             Returns the tangent of an angle.
The RDM SQL date and time manipulation functions are listed below. Note that dt_expr is an arith_expr that
involves only date, time, and timestamp columns and values.
Table 7. Built-in Date and Time Functions
Function                Description
age                     Returns the age (in full years).
curdate | current_date  Returns the current date.
curtime | current_time  Returns the current time.
current_timestamp       Returns the current date and time.
dayofmonth              Returns the day of the month.
dayofweek               Returns the day of the week.
dayofyear               Returns the day of the year.
hour                    Returns the hour.
minute                  Returns the minute.
month                   Returns the month.
quarter                 Returns the quarter.
second                  Returns the second.
week                    Returns the week.
year                    Returns the year.
The RDM SQL string manipulation functions are listed below.
Table 8. Built-in String Functions
Function    Description
ascii       Returns the numeric ASCII value of a character.
char        Returns the ASCII character with numeric value num.
concat      Concatenates two strings.
convert     Converts an expression to a data type or a character string.
insstr      Replaces num2 chars from string_expr2 in string_expr1 beginning at position num1 (1st position is 1 not 0).
lcase       Converts a string to lowercase.
left        Returns the leftmost num characters from the string.
length      Returns the length of the string.
locate      Locates string_expr1 from position num in string_expr2.
ltrim       Removes all leading spaces from string.
repeat      Repeats string num times.
replace     Replaces string_expr2 with string_expr3 in string_expr1.
right       Returns the rightmost num characters from string.
rtrim       Removes all trailing spaces from string.
substring   Returns num2 characters from string_expr beginning at position num1.
ucase       Converts a string to uppercase.
unicode     Returns the numeric Unicode value of a character.
wchar(num)  Returns a Unicode character with numeric value num.
Arithmetic operators that are specified in an expression are evaluated based on the precedence given in the following table.
Table 9. Precedence of Arithmetic Operators
Priority  Operator  Use
Highest   ()        Parenthetical expressions
High      +         Unary plus
High      -         Unary minus
Medium    *         Multiplication
Medium    /         Division
Lowest    +         Addition
Lowest    -         Subtraction
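The precedence rules in Table 9 can be checked with a few constant expressions. The sketch below is only an illustration: it assumes SQLite (through Python's standard sqlite3 module) as a stand-in engine, not RDM SQL itself, and relies on SQLite following the same conventional ordering for these operators.

```python
import sqlite3

# SQLite is used here as an assumed stand-in for RDM SQL.
conn = sqlite3.connect(":memory:")

# Multiplication and division bind tighter than addition and subtraction;
# parentheses override that ordering; unary minus binds tightest of all.
row = conn.execute(
    "select 2 + 3 * 4, (2 + 3) * 4, -2 + 10 / 5, 20 - 6 - 4"
).fetchone()
print(row)
```

The third expression is evaluated as (-2) + (10 / 5), and the fourth shows that operators of equal priority are applied left to right.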
Okay, I know. That's a lot of detail to have to wade through but you're through it now and so we'll illustrate column
expressions with a couple of examples. More sophisticated examples will be given in subsequent sections.
The following query computes the sales tax based on a rate of 9.1% for each book.
select bookid, price, price*0.091 tax from book;
BOOKID    PRICE     TAX
alcott01  1200.00   109.20
alcott02  1075.00   97.82
alcott03  1550.00   141.05
alcott04  1250.00   113.75
alcott05  850.00    77.35
alcott06  875.00    79.62
austen01  12500.00  1137.50
austen02  13500.00  1228.50
...
wilde04   22500.00  2047.50
wilde05   2000.00   182.00
woolf01   3250.00   295.75
woolf02   1750.00   159.25
woolf03   32500.00  2957.50
The next query computes both the raw profit and the percentage profit margin for each book based on the price
and cost columns in each row of the book table.
select bookid, price, cost, price-cost profit, ((price-cost)/cost)*100 margin from
book;
BOOKID    PRICE     COST      PROFIT   MARGIN
alcott01  1200.00   960.00    240.00   25.00
alcott02  1075.00   860.00    215.00   25.00
alcott03  1550.00   1240.00   310.00   25.00
alcott04  1250.00   1000.00   250.00   25.00
alcott05  850.00    708.00    142.00   20.00
alcott06  875.00    729.00    146.00   20.00
austen01  12500.00  9615.00   2885.00  30.00
austen02  13500.00  10384.00  3116.00  30.00
...
wilde04   22500.00  17307.00  5193.00  30.00
wilde05   2000.00   1600.00   400.00   25.00
woolf01   3250.00   2600.00   650.00   25.00
woolf02   1750.00   1400.00   350.00   25.00
woolf03   32500.00  25000.00  7500.00  30.00
Notice any pattern when you compare the profit margin percentage with the price? The higher the price, the
larger the profit margin.
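The two column-expression queries above can be tried out on a small scale. The following is a minimal sketch, assuming SQLite (via Python's sqlite3 module) in place of RDM SQL and a cut-down book table holding only two of the rows shown above; the round() calls are an addition made here so the floating-point results print cleanly.

```python
import sqlite3

# Assumed stand-in: an in-memory SQLite table with three of the book columns.
conn = sqlite3.connect(":memory:")
conn.execute("create table book (bookid text primary key, price real, cost real)")
conn.executemany(
    "insert into book values (?, ?, ?)",
    [("alcott01", 1200.00, 960.00), ("woolf03", 32500.00, 25000.00)],
)

# Each select expression is followed by a column alias (tax, profit, margin).
rows = conn.execute(
    "select bookid, "
    "round(price * 0.091, 2) tax, "
    "price - cost profit, "
    "round(((price - cost) / cost) * 100, 2) margin "
    "from book order by bookid"
).fetchall()

for row in rows:
    print(row)
```

The aliases become the column headings of the result, just as TAX, PROFIT, and MARGIN do in the output shown above.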
Conditional Queries
While there are times when one needs to see all of the rows in a table, by far the more common situation is that
only some rows of a table are needed. To restrict the rows returned by a select statement, specify a conditional
expression in its where clause; only those rows for which the conditional expression is true are retrieved.
The syntax for the select statement containing the where clause is as follows.
select_statement:
select expression [column_alias] [, expression [column_alias] ]… from table_name
where conditional_expr
conditional_expr:
rel_expr [bool_oper rel_expr]...
rel_expr:
expression [not] rel_oper expression
| expression [not] between constant and constant
| expression [not] in (constant[, constant]...)
| column_ref is [not] null
| string_expr [not] like "string"
| not rel_expr
| ( conditional_expr )
rel_oper:
= | ==
| <
| >
| <=
| >=
| <> | != | /=
bool_oper:
& | && | and
| "|" | "||" | or
The like operation can be used to perform simple pattern matching. SQL defines two pattern matching symbols.
The "%" can be specified to match zero or more characters. The "?" can be specified to match any single character.
For example, most of the short biographical sketches (column short_bio) contained in the author table specify the nationality of the author. Hence, the following query will retrieve only those authors for whom
"English" is included somewhere in the short_bio column.
select full_name from author where short_bio like "%English%";
FULL_NAME
Austen, Jane
Bacon, Francis
Bronte, Charlotte
Bronte, Emily
Carroll, Lewis
Chaucer, Geoffrey
Chesterton, G. K. (Gilbert Keith)
Coleridge, Samuel Taylor
Conrad, Joseph
Defoe, Daniel
Dickens, Charles
Eliot, George
Hardy, Thomas
Hobbes, Thomas
Johnson, Samuel
Milton, John
Potter, Beatrix
Raleigh, Walter
Scott, Walter
Shakespeare, William
Tennyson, Alfred
Trollope, Anthony
Wells, H. G. (Herbert George)
Woolf, Virginia
The next query returns those books that are priced over £100,000.
select bookid, price, title from book where price > 100000.00;
BOOKID         PRICE      TITLE
shakespeare01  175000.00  The Tragicall Historie of Hamlet, Prince of Denmarke.
shakespeare02  135000.00  Midsummer night's dream
shakespeare04  250000.00  Plays
shakespeare05  225000.00  Romeo and Juliet
Books that have not been sold have a null date_sold column value. Issue the following query to list all those
books that have been sold.
select bookid, date_sold, price, title from book where date_sold is not null;
BOOKID     DATE_SOLD   PRICE     TITLE
alcott01   2010-05-04  1200.00   Moods
alcott04   2010-01-11  1250.00   Little men : life at Plumfield with Jo's boys
alcott05   2010-08-14  850.00    Eight cousins;
alcott06   2010-01-06  875.00    Rose in bloom. A sequel to 'Eight cousins.'
austen03   2009-10-28  13500.00  Mansfield Park: a novel. In three volumes.
bacon03    2010-04-01  5000.00   Sylva sylvarum. French
bacon04    2010-02-13  2500.00   History natural and experimental, of life and death.
burns01    2009-07-12  1250.00   Poems, chiefly in the Scottish dialect...
carlyle03  2009-12-13  995.00    Chartism.
...
wells04    2006-12-15  3000.00   The war of the worlds,
wells05    2010-01-02  25000.00  The first men in the moon, by H.G. Wells ...
wharton03  2009-03-20  3250.00   Crucial instances,
wharton05  2010-04-04  4000.00   The descent of man, and other stories
wharton08  2010-07-13  2500.00   Ethan Frome
wharton09  2008-12-20  2500.00   The age of innocence
wharton11  2007-08-08  1500.00   The buccaneers
wilde04    2007-12-23  22500.00  The ballad of Reading gaol.
Note that the following query does not return any rows even though you might think that it should.
select bookid, date_sold, title from book where date_sold != null;
SQL uses three-valued conditional results: a condition can be true, or false, or indeterminate. The processing
details are too complicated to get into here but in order to do null value comparisons you must use the is null and
is not null operators.
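The effect of three-valued logic on null comparisons is easy to reproduce. The sketch below assumes SQLite (via Python's sqlite3 module) rather than RDM SQL, and a cut-down book table; treating alcott02 as unsold is an assumption made for illustration. A comparison with null is indeterminate, so no row passes the where clause, whereas is null and is not null give definite answers.

```python
import sqlite3

# Assumed stand-in table: bookid plus a nullable date_sold column.
conn = sqlite3.connect(":memory:")
conn.execute("create table book (bookid text primary key, date_sold text)")
conn.executemany(
    "insert into book values (?, ?)",
    [("alcott01", "2010-05-04"), ("alcott02", None), ("alcott04", "2010-01-11")],
)

# "!= null" is neither true nor false for any row, so nothing is returned.
not_equal = conn.execute(
    "select bookid from book where date_sold != null"
).fetchall()

# "is not null" and "is null" are the operators that test for null values.
sold = conn.execute(
    "select bookid from book where date_sold is not null order by bookid"
).fetchall()
unsold = conn.execute(
    "select bookid from book where date_sold is null"
).fetchall()

print(not_equal, sold, unsold)
```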
The in operator will return all rows in which the left hand expression evaluates to one of the values specified in
the list as in the next example which lists those patrons from California and Washington.
select name, city, email from patron where state in ("CA","WA");
NAME                CITY           EMAIL
William Gates, III  Redmond        [email protected]
Stephen Jobs        Cupertino      [email protected]
Scrooge McDuck      Anaheim        [email protected]
Richie Rich         San Diego      [email protected]
Jed Clampett        Beverly Hills  [email protected]
Lucille Bluth       Newport Beach  [email protected]
Jean Luc Picard     San Francisco  [email protected]
Jeffrey Bezos       Seattle        [email protected]
The between operator returns those rows in which the left-hand expression evaluates to a value between the two
values on the right, inclusive.
select last_name, publ_year, title from book where publ_year between 1810 and
1820;
LAST_NAME  PUBL_YEAR  TITLE
AustenJ    1813       Pride and prejudice: a novel ...
AustenJ    1813       Sense and sensibility: a novel.
AustenJ    1814       Mansfield Park: a novel. In three volumes.
AustenJ    1816       Emma: a novel. In three volumes.
CooperJ    1820       Precaution; a novel...
IrvingW    1814       Biographical memoir of Capt-David Porter.
ScottW     1810       The lady of the lake. A poem.
ScottW     1811       The vision of Don Roderick: a poem.
ScottW     1815       The field of Waterloo; a poem.
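The inclusive behavior of between, and its relationship to in, can be sketched as follows. This assumes SQLite (via Python's sqlite3 module) as a stand-in for RDM SQL; three rows are taken from the output above, and the two rows named Sample1 and Sample2 are hypothetical rows added to fall just outside the range.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table book (last_name text, publ_year integer, title text)")
conn.executemany(
    "insert into book values (?, ?, ?)",
    [
        ("ScottW", 1810, "The lady of the lake. A poem."),
        ("AustenJ", 1813, "Pride and prejudice: a novel ..."),
        ("CooperJ", 1820, "Precaution; a novel..."),
        ("Sample1", 1809, "hypothetical row just below the range"),
        ("Sample2", 1821, "hypothetical row just above the range"),
    ],
)

def years(where_clause):
    # Return the matching publication years in ascending order.
    sql = "select publ_year from book where " + where_clause + " order by 1"
    return [row[0] for row in conn.execute(sql)]

between_years = years("publ_year between 1810 and 1820")
explicit_years = years("publ_year >= 1810 and publ_year <= 1820")
in_years = years("publ_year in (1810, 1820)")

print(between_years, explicit_years, in_years)
```

Both end points (1810 and 1820) are returned by between, which makes it equivalent to the pair of >= and <= comparisons, while in matches only the listed values.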
Retrieving Data from Multiple Tables
I am a lover of historical fiction. Suppose I wanted to see all of the books of that genre. You will note that there is
nothing in the book table which identifies the genre. However, there is a table called genres_books that contains a bookid column and a genre column. The declaration of bookid in genres_books indicates that it references the book table. So, one could issue the following query to list the bookid for each book that has a genre
equal to "Historical fiction".
select bookid from genre_books where genre = "Historical fiction";
BOOKID
cather03
cather07
cooper03
cooper04
defoe02
eliot04
hawthorne03
hawthorne04
scott01
scott07
stevenson06
twain05
twain09
Unfortunately, this does not tell you very much about the book. What you really need is to see the information in
the particular row from the book table that has the same bookid listed in the genres_books table. You can do
this using a query that specifies a join operation on the two tables as shown in the following example.
select last_name, title from book, genre_books
where book.bookid = genre_books.bookid and genre = "Historical fiction";
LAST_NAME   TITLE
CatherW     O pioneers! By Willa Sibert Cather ...
CatherW     Shadows on the rock.
CooperJ     The last of the Mohicans; a narrative of 1757.
CooperJ     The prairie : a tale
DefoeD      Memoirs of a cavalier:
EliotG      Romola.
HawthorneN  The scarlet letter, a romance.
HawthorneN  The house of the seven gables, a romance.
ScottW      Rob Roy.
ScottW      Ivanhoe; a romance,
StevensonR  Kidnapped : being memoirs of the adventures of David Balfour
TwainM      The prince and the pauper : a tale for young people of all ages
TwainM      Connecticut Yankee in King Arthur's court
The join is specified by listing each table in the from clause and then including in the where clause an equals operation between the bookid columns in each table. When designing a database (see Defining a Database), as
much as possible you will want to use the same column names between tables which are related in this way.
These relationships can (and should) be explicitly declared through the foreign and primary key specifications in
the create table statement. When you use the same column names in the two tables, the join operation based on
those columns containing the same values is called a natural join. SQL provides a simpler syntax for specifying
natural joins. For example, the above query can also be specified as follows.
select last_name, title from book natural join genre_books
where genre = "Historical fiction";
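The equivalence between the where-clause join and the natural join can be sketched on a small scale. The example assumes SQLite (via Python's sqlite3 module) in place of RDM SQL, uses single-quoted string literals as SQLite prefers, and loads three sample rows loosely based on the output above; the pairing of bookid values with titles and the 'Romance' genre value are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table book (bookid text primary key, last_name text, title text);
create table genre_books (bookid text references book, genre text);
insert into book values ('cooper03', 'CooperJ', 'The last of the Mohicans; a narrative of 1757.');
insert into book values ('scott07', 'ScottW', 'Ivanhoe; a romance,');
insert into book values ('austen03', 'AustenJ', 'Mansfield Park: a novel. In three volumes.');
insert into genre_books values ('cooper03', 'Historical fiction');
insert into genre_books values ('scott07', 'Historical fiction');
insert into genre_books values ('austen03', 'Romance');  -- hypothetical genre value
""")

# Join expressed with an explicit equality test in the where clause.
where_join = conn.execute(
    "select last_name, title from book, genre_books "
    "where book.bookid = genre_books.bookid and genre = 'Historical fiction' "
    "order by last_name"
).fetchall()

# Same join expressed as a natural join on the shared bookid column.
natural_join = conn.execute(
    "select last_name, title from book natural join genre_books "
    "where genre = 'Historical fiction' order by last_name"
).fetchall()

print(where_join == natural_join, natural_join)
```

Because bookid is the only column name the two tables share, the natural join matches rows on exactly that column.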
Join processing is a fundamental feature of all relational database systems. As such, SQL defines a rich set of
join specification options. The syntax for specifying joins is given below.
select_statement:
select expression [column_alias] [, expression [column_alias] ]…
from table_ref [, table_ref]… [where conditional_expr]
table_ref:
table_primary | table_join
table_primary:
table_spec | ( table_join )
table_spec:
[db_name.]table_name [[as] correlation_name]
table_join:
table_ref natural [inner | {left | right} [outer]] join table_primary
| table_ref [inner | {left | right} [outer]] join table_primary
[using ( column_name[, column_name]...) | on conditional_expr]
The natural join specification indicates that the join is to be performed based on the common columns (names
and types) from the two tables. The join is formed from the columns from the table (or tables) specified on the left
side of "natural … join" that have identical values with those columns from the table (or tables) on the right side
that have the same name. Since common column names are used to form the join, sometimes you may not get
the expected results because the tables may have unrelated columns that happen to have the same name. Thus,
if you desire to make extensive use of the natural join, care must be taken in naming the columns in your table
definitions so that common column names between related tables are only those upon which the joins are based.
It is also best to explicitly declare the relationship using the primary key and foreign key/references clauses in
your create table declarations.
By default, a natural join specification performs an inner join between two tables. An inner join returns only
those rows that have matching values in the join columns. However, sometimes it is possible to have values in
one table that have no matching entry in the other. An outer join allows one to see those unmatched rows as
well. For example, the following query will return the list of all the books in the inventory for each author as well
as those authors for which no books are available.
select bookid, full_name, title from author natural left outer join book;
FULL_NAME                    TITLE
Alcott, Louisa May           Moods
Alcott, Louisa May           On picket duty, and other tales.
Alcott, Louisa May           Little women, or, Meg, Jo, Beth, and Amy
...
Eliot, George                Middlemarch: a study of provincial life.
Faulkner, William            *NULL*
Ferber, Edna                 Dawn O'Hara, the girl who laughed,
Ferber, Edna                 Show boat; a novel by Edna Ferber.
Ferber, Edna                 American beauty,
Franklin, Benjamin           Advice to a young tradesman
Gaskell, Elizabeth Cleghorn  Mary Barton: a tale of Manchester life ...
Gaskell, Elizabeth Cleghorn  North and South.
Gaskell, Elizabeth Cleghorn  The life of Charlotte Bronte, by E.C. Gaskell.
Gaskell, Elizabeth Cleghorn  Wives and daughters. A novel.
Gaskell, Elizabeth Cleghorn  Cranford.
Hardy, Thomas                A pair of blue eyes; a novel by Thomas Hardy ...
Hardy, Thomas                Under the greenwood tree
Hardy, Thomas                Far from the madding crowd,
Hardy, Thomas                A Laodicean. A novel.
Hawthorne, Nathaniel         Fanshawe, a tale ...
Hawthorne, Nathaniel         Twice-told tales.
Hawthorne, Nathaniel         The scarlet letter, a romance.
Hawthorne, Nathaniel         The house of the seven gables, a romance.
Hemingway, Ernest            *NULL*
Hobbes, Thomas               Leviathan
...
A left outer join will include those rows from author (full_name is a column of author) that do not have a
corresponding row in book (author is the left-side table in the join clause). In this example, our bookshop
evidently does not have a book by Faulkner or Hemingway. To see only the authors that do not have a book in
the inventory, enter the query below.
select full_name, title from author natural left join book where title is null;
FULL_NAME          TITLE
Faulkner, William  *NULL*
Hemingway, Ernest  *NULL*
Steinbeck, John    *NULL*
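The left outer join behavior described above can be sketched with two tiny tables. The example assumes SQLite (via Python's sqlite3 module) instead of RDM SQL. It also assumes that last_name is the column shared by author and book, which the text does not state, and the bookid value hobbes01 is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table author (last_name text primary key, full_name text);
create table book (bookid text primary key, last_name text references author, title text);
insert into author values ('FaulknerW', 'Faulkner, William');
insert into author values ('HobbesT', 'Hobbes, Thomas');
insert into book values ('hobbes01', 'HobbesT', 'Leviathan');
""")

# Every author row is kept; title is null where no book row matches.
all_authors = conn.execute(
    "select full_name, title from author natural left outer join book "
    "order by full_name"
).fetchall()

# Filtering on the null title leaves only the authors without a book.
without_books = conn.execute(
    "select full_name from author natural left join book where title is null"
).fetchall()

print(all_authors)
print(without_books)
```

In Python the null title is reported as None, which corresponds to the *NULL* marker in the output above.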
When there are common columns between two tables in which some of the columns should not be included in
the join you can specify a qualified join where you explicitly identify the join columns. For example, each bookshop patron is serviced by one account manager. The account manager is identified by the mgrid column in the
patron table. However, both tables also have a name column but clearly that column should not be used in the
join. So, to see a list of account managers and the patrons each one services, enter the following select statement.
select acctmgr.name, patron.name from acctmgr inner join patron using(mgrid);
ACCTMGR.NAME     PATRON.NAME
Fox, Joe         Bernard Arnult
Fox, Joe         Chatsworth Osborne Jr.
Fox, Joe         Giorgio Armani
Kelly, Kathleen  Stephen Jobs
Kelly, Kathleen  Scrooge McDuck
Kelly, Kathleen  Jay Gatsby
Doel, Frank      Warren Buffett
Doel, Frank      Artimis Fowel II
Kralik, Alfred   William Gates, III
Kralik, Alfred   Thurston Howell III
Kralik, Alfred   Charles Montgomery Burns
Kralik, Alfred   Jean Luc Picard
Novac, Klara     Mukesh Ambani
Novac, Klara     Richie Rich
Novac, Klara     Lucille Bluth
Noble, Barney    Carlos Slim Helu
Noble, Barney    Bruce Wayne
Zonn, Amy        Jed Clampett
Zonn, Amy        Jeffrey Bezos
The "inner" does not actually have to be specified as the default is to perform an inner join. Also notice that the columns in the select expression list are qualified by table name to differentiate the account manager name from
the patron name.
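The reason for naming the join column explicitly can be seen by comparing a using join with a natural join on the same data. This is a sketch that assumes SQLite (via Python's sqlite3 module) rather than RDM SQL; the mgrid values 'FOX' and 'DOEL' are hypothetical, since the actual key values are not shown here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table acctmgr (mgrid text primary key, name text);
create table patron (name text, mgrid text references acctmgr);
insert into acctmgr values ('FOX', 'Fox, Joe');
insert into acctmgr values ('DOEL', 'Doel, Frank');
insert into patron values ('Bernard Arnult', 'FOX');
insert into patron values ('Giorgio Armani', 'FOX');
insert into patron values ('Warren Buffett', 'DOEL');
""")

# Qualified join: only mgrid is used to match rows.
pairs = conn.execute(
    "select acctmgr.name, patron.name "
    "from acctmgr inner join patron using (mgrid) order by 1, 2"
).fetchall()

# Natural join: both mgrid and name must match, so no row qualifies.
natural_count = conn.execute(
    "select count(*) from acctmgr natural join patron"
).fetchone()[0]

print(pairs)
print(natural_count)
```

The natural join returns nothing because no account manager shares a name with one of their patrons, which is exactly the unintended match the using clause avoids.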
Where the join columns in the two tables do not have the same name, use the on clause to provide the join conditions. Issue the following query on the NSF awards database to list the NSF grants awarded in 2001 to sponsors located in North Dakota.
select name, award_date, title from sponsor join award on(sponsor_nm = name)
where state = "ND" and award_date between date "2001-01-01" and date "2001-12-31";
NAME                        AWARD_DATE  TITLE
Bismarck St Coll            2001-07-10  Energy Technology Education Project
Cankdeska Cikana Community  2001-07-23  Cankdeska Cikana Community College Rural..
Dakota Technologies, Inc.   2001-06-22  SBIR Phase I: Novel Ultrasensitive Gas..
North Dakota State U Fargo  2001-06-11  Optics for Scientists and Engineers Lab..
North Dakota State U Fargo  2001-04-19  GOALI: Sequencing the Assembly Line and Anal..
North Dakota State U Fargo  2001-08-06  US-Egypt Cooperative Research: Development of..
North Dakota State U Fargo  2001-05-31  SGER: Evaluation and Modeling of Interlayer..
North Dakota State U Fargo  2001-09-25  Mathematics and Engineering Scholarships
North Dakota State U Fargo  2001-11-26  Developing and Assessing Impact of Problem-..
North Dakota State U Fargo  2001-12-26  Novel Instrumentation and Experimental for ..
North Dakota State U Fargo  2001-09-26  High Performance Network Connection in Suppo..
North Dakota State U Fargo  2001-05-11  Molecular Basis of Substrate Specificity, ..
North Dakota State U Fargo  2001-04-18  Statics: The next generation
SMC                         2001-11-15  SBIR Phase I: Protective Metal Foam Hybrid..
Sitting Bull College        2001-03-07  Sitting Bull College Rural Systemic Initiative
Turtle Mountain Cmty Col    2001-09-20  Rural Systemic Initiatives in Science, Math..
U of North Dakota           2001-04-26  Red River Geoscience Education Pilot Project
U of North Dakota           2001-04-10  CAREER: Thermoeconomic Modeling as a Tool for..
U of North Dakota           2001-08-30  Acquisition of a Variable Temperature Automa..
U of North Dakota           2001-07-28  Acquisition of an Automated Sequencer
U of North Dakota           2001-05-02  CAREER: Protein Export in Escherichia coli
U of North Dakota           2001-02-20  REU Site: Research Experience for Undergradu..
U of North Dakota           2001-04-27  CAREER: Environmental Heterogeneity, Populat..
U of North Dakota           2001-11-19  University of North Dakota Computer Science,..
United Tribes Tech College  2001-07-20  United Tribes- Rural Systemic Initiative
The above examples all involve joins between just two tables. However, a select statement can involve joins
between more than two tables. Joins still occur in pairs. The result of a single join operation is a virtual table that
is then joined with another table. Join processing proceeds in a left-to-right manner. Thus, the left-hand "table"
for the second join is the result of the previous join and is joined to the next table. In the above syntax specification
note that a table_ref on the left hand side of the join operator can be a fully specified join whereas the right-hand
side is table_primary—a table name. This processing order can be altered (or clarified) using parentheses. For
example, the query below will return the investigator name and the research title for all NSF awards granted to
the University of Colorado at Denver.
select person.name, title
from (award natural join investigator natural join person)
join sponsor on (sponsor_nm = sponsor.name)
where sponsor.name = "U of Colorado Denver";
PERSON.NAME            TITLE
Hirshman, Elliot       Using Midazolam to Explore the Nature of Implicit Memory
Zapien, Donald C.      RUI: Investigation of the Relationship of Ferritin's Struct..
Mandel, Jan            Scalable Submesh Computing
Andrew., Andrew        Acquisition of a High-Performance Parallel Computer for..
Mandel, Jan            Acquisition of a High-Performance Parallel Computer for..
Bennethum, Lynn S.     Acquisition of a High-Performance Parallel Computer for..
Russell, Thomas F.     Acquisition of a High-Performance Parallel Computer for..
Billups, Stephen C.    Acquisition of a High-Performance Parallel Computer for..
Stith, Bradley J.      Lipid Signaling During Fertilization
Zamudio, Stacy         Ancestry, Altitude and Placental Development in Highlands ..
Charles.§, Charles M.  REU Site: American Economic Association Summer Training Pr..
Andrew., Andrew        Preconditioned Algorithms for Large Eigenvalue Problems..
Sievering, Herman      Sea-Salt Aqueous Phase SO2 Oxidation: Contribution to Mari..
Tracer, David P.       Breast Feeding Structure and Parental Investment in Papua..
Jenkins, Peter E.      Toward T3 Tetherless Communications Workshop, University..
Sanders, Nancy M.      School District Capacity to Support the Mathematics Standar..
Billups, Stephen C.    Algorithms for Nonsmooth Equations
Weaver, Gabriela C.    Proof of Concept Proposal for Physical Chemistry in Practi..
Rens, Kevin L.         Concrete Maturity: A Quantitative Understanding of How..
Notice that both the person and sponsor tables have a column called name. Thus, references to each name must
be qualified with the table name to ensure that SQL uses the correct name.
Sorting Query Results
Suppose I want to see just the names of the investigators from the University of Colorado at Denver who have
been awarded NSF grants. Scanning the result set for familiar names would be much easier if the results were
returned sorted by the person's name. The order by clause of the select statement allows you to specify the column or columns on which to sort the result set. The syntax is given below.
select_statement:
select [distinct] expression [column_alias] [, expression [column_alias] ]…
from table_ref [, table_ref]… [where cond_expr]
order by {num | column_name} [asc | desc] [,{num | column_name} [asc | desc]]…
The num is the ordinal position of the select expression on which to sort where num = 1 refers to the first expression. The column_name is either the specified column_alias or the column name when expression is simply a
table column name. The default sort order is asc (ascending) but desc can be specified to reverse the order. If
more than one order by column is specified each subsequent column specifies a sort order within each value
from the outer sort column(s). If select distinct is specified, duplicate rows in the result set will be eliminated. All
of this is actually easier to show than to explain.
The next query will return the list of all investigators from the University of Colorado Denver that have been
awarded NSF grants.
select person.name
from award natural join investigator natural join person
join sponsor on (sponsor_nm = sponsor.name)
where sponsor.name = "U of Colorado Denver"
order by 1;
PERSON.NAME
Alaghband, Gita
Altman, Tom
Andrew., Andrew
Andrew., Andrew
Andrew., Andrew
Andrew., Andrew
Banks, David L.
Beekman, Christopher S.
Beekman, Christopher S.
...
Stith, Bradley J.
Stith, Bradley J.
Stith, Bradley J.
Tagg, Randall P.
Tagg, Randall P.
Tagg, Randall P.
Tang, Michael S.
Tracer, David P.
Walker, Kenneth
Weaver, Gabriela C.
Weaver, Gabriela C.
Weaver, Gabriela C.
Zamudio, Stacy
Zapien, Donald C.
This list includes some duplicate entries. To eliminate them add distinct to the select as shown below.
select distinct person.name
from award natural join investigator natural join person
join sponsor on (sponsor_nm = sponsor.name)
where sponsor.name = "U of Colorado Denver"
order by 1;
PERSON.NAME
Alaghband, Gita
Altman, Tom
Andrew., Andrew
Banks, David L.
Beekman, Christopher S.
Bennethum, Lynn S.
Billups, Stephen C.
...
Stith, Bradley J.
Tagg, Randall P.
Tang, Michael S.
Tracer, David P.
Walker, Kenneth
Weaver, Gabriela C.
Zamudio, Stacy
Zapien, Donald C.
The next example will show the list of awards for each investigator in order of when the grant was issued with the
most recent listed first.
select person.name, award_date, title
from award natural join investigator natural join person
join sponsor on (sponsor_nm = sponsor.name)
where sponsor.name = "U of Colorado Denver"
order by 1, 2 desc;
PERSON.NAME              AWARD_DATE  TITLE
Alaghband, Gita          1993-08-16  RIA: Parametric Modeling Tools for Performance
Altman, Tom              1992-09-04  Elimination of Certain Ambiguity Causing Constru..
Andrew., Andrew          2002-08-28  Preconditioned Algorithms for Large Eigenvalue
Andrew., Andrew          2002-07-30  Sixth IMACS International Symposium on Iterative
Andrew., Andrew          2000-08-28  Acquisition of a High-Performance Parallel Compu..
Andrew., Andrew          1995-06-26  Mathematical Sciences: Preconditioned Parallel
Banks, David L.          1998-09-11  Group Travel Award to Support U.S Participation in
Beekman, Christopher S.  2002-11-06  The Articulation of Political Strategies and Reg..
Beekman, Christopher S.  2002-06-12  The Articulation of Political Strategies and Reg..
...
Stein, Fredrick M.       2002-01-28  Energy 2020: A Teacher Enhancement Workshop To
Stith, Bradley J.        2002-04-30  Lipid Signaling During Fertilization
Stith, Bradley J.        1999-03-22  RUI: Lipid Signaling During Fertilization
Stith, Bradley J.        1996-05-15  RUI: Induction of Cell Division by Protein Kinas..
Tagg, Randall P.         2002-01-28  Energy 2020: A Teacher Enhancement Workshop To
Tagg, Randall P.         1995-06-30  Course Modules in Apparatus Design and Experime..
Tagg, Randall P.         1995-06-08  Mathematical Sciences: Patterns, Chaos, and ..
Tang, Michael S.         1995-02-02  Engineering, Technology and Culture: with an Em..
Tracer, David P.         1999-12-20  Breast Feeding Structure and Parental Investment..
Walker, Kenneth          1995-02-02  Engineering, Technology and Culture: with an Em..
Weaver, Gabriela C.      2002-01-28  Energy 2020: A Teacher Enhancement Workshop To
Weaver, Gabriela C.      1999-12-14  Proof of Concept Proposal for Physical Chemistry..
Weaver, Gabriela C.      1996-05-10  Integration of Novel Laser-Spectroscopy Experim..
Zamudio, Stacy           2002-07-17  Ancestry, Altitude and Placental Development in
Zapien, Donald C.        2002-02-11  RUI: Investigation of the Relationship of Ferri..
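The sorting features used in this section (ordinal column references, desc, and distinct) can be sketched together. The example assumes SQLite (via Python's sqlite3 module) in place of RDM SQL and replaces the four-table join with a single hypothetical table, grant_list, that holds a few of the rows shown above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# grant_list is a hypothetical flattened table standing in for the joined tables.
conn.execute("create table grant_list (name text, award_date text, title text)")
conn.executemany(
    "insert into grant_list values (?, ?, ?)",
    [
        ("Stith, Bradley J.", "1999-03-22", "RUI: Lipid Signaling During Fertilization"),
        ("Alaghband, Gita", "1993-08-16", "RIA: Parametric Modeling Tools for Performance"),
        ("Stith, Bradley J.", "2002-04-30", "Lipid Signaling During Fertilization"),
        ("Stith, Bradley J.", "1996-05-15", "RUI: Induction of Cell Division by Protein"),
    ],
)

# distinct removes the duplicate names; "order by 1" sorts on the first column.
distinct_names = [
    row[0]
    for row in conn.execute("select distinct name from grant_list order by 1")
]

# Sort by name ascending, then by award date descending within each name.
ordered = conn.execute(
    "select name, award_date from grant_list order by 1, 2 desc"
).fetchall()

print(distinct_names)
print(ordered)
```

The dates are stored as ISO-formatted text in this sketch, so sorting them as strings also sorts them chronologically.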
Performing Result Set Aggregate Calculations
All of the select statements shown thus far have produced detail rows where each row of the result set corresponds to a single row from the table (a base table or table formed from the set of joined tables in the from
clause). There are often times when you want to perform a calculation on one or more columns from a related set
of rows returning only a summary row that includes the calculation result. The set of rows over which the calculations are performed is called the aggregate. The select statement group by clause is used to identify the column or columns that define each aggregate—those rows that have identical group by column values. Five built-in
aggregate functions are provided in SQL as defined in the table below.
Table 10. Built-in Aggregate Functions
Function  Description
count     Returns the number of (distinct) rows in the aggregate.
sum       Returns the sum of the (distinct) values of expression in the aggregate.
avg       Returns the average of the (distinct) values of expression in the aggregate.
min       Returns the minimum expression value in the aggregate.
max       Returns the maximum expression value in the aggregate.
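The five aggregate functions in Table 10 can be seen working together in a single group by query. This is a minimal sketch assuming SQLite (via Python's sqlite3 module) instead of RDM SQL; the prices come from the book rows shown earlier, while the last_name value WoolfV is an assumed value that follows the naming pattern of the other authors.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table book (bookid text primary key, last_name text, price real)")
conn.executemany(
    "insert into book values (?, ?, ?)",
    [
        ("austen01", "AustenJ", 12500.00),
        ("austen02", "AustenJ", 13500.00),
        ("woolf01", "WoolfV", 3250.00),   # last_name value is an assumption
        ("woolf02", "WoolfV", 1750.00),
        ("woolf03", "WoolfV", 32500.00),
    ],
)

# One summary row is produced per distinct last_name (one row per aggregate).
summary = conn.execute(
    "select last_name, count(*), sum(price), avg(price), min(price), max(price) "
    "from book group by last_name order by last_name"
).fetchall()

for row in summary:
    print(row)
```

Each result row summarizes all of the book rows that share the same group by column value, rather than corresponding to a single row of the table.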
The complete syntax for the select statement including group by is as follows.
select_stmt:
select [first] [all | distinct] {* | select_item[, select_item]...}
from table_ref[, table_ref]...
[where conditional_expr]
[grouping | sorting | grouping sorting]
[limit (num {rows | mins | secs | msecs})]
[for {read only | update [of
column_name[, column_name]...]}]
grouping:
group by sort_col[, sort_col]... [having conditional_expr]
sorting:
order by sort_col [asc | desc][, sort_col [asc | desc]]...
sort_col:
num | column_name
select_item:
expression [alias_name | "column heading"]
table_ref:
table_primary | table_join
table_primary:
table_spec | ( table_join )
table_spec:
[db_name.]table_name [[as] correlation_name]
table_join:
natural_join | qualified_join | cross_join
natural_join:
table_ref natural [inner | {left | right} [outer]] join table_primary
qualified_join:
table_ref [inner | {left | right} [outer]] join table_primary
[using (column_name[, column_name]...) | on conditional_expr]
cross_join:
table_ref cross join table_primary
arith_expr:
expression /* involving only numeric operands and operations */
dt_expr:
expression /* involving only date/time/timestamp operands and operations */
string_expr:
expression /* involving only string operands and operations */
expression:
operand [arith_operator operand]...
operand:
constant | param_ref | column_ref | function | (expr)
param_ref:
? | :param_name
column_ref:
[{table_name | correlation_name}.]column_name
arith_operator:
+ | - | * | /
function:
aggregate_fcn | scalar_fcn
aggregate_fcn:
{sum | avg | max | min} (expression)
| count ({* | column_ref})
| aggregate_udf_name ([expression][, expression]...)
scalar_fcn:
if (conditional_expr, expression, expression)
| numeric_function | datetime_function | string_function
| scalar_udf_name ([expression][, expression]...)
numeric_function:
abs(arith_expr)
| acos(arith_expr)
| asin(arith_expr)
| atan(arith_expr)
| atan2(arith_expr)
| {ceil | ceiling}(arith_expr)
| cos(arith_expr)
| cot(arith_expr)
| exp(arith_expr)
| floor(arith_expr)
| {ln | log}(arith_expr)
| mod(arith_expr)
| pi()
| rand(num)
| sign(arith_expr)
| sin(arith_expr)
| sqrt(arith_expr)
| tan(arith_expr)
datetime_function:
age(dt_expr)
| {curdate | current_date}()
| {curtime | current_time}()
| dayofmonth(dt_expr)
| dayofyear(dt_expr)
| hour(dt_expr)
| minute(dt_expr)
| month(dt_expr)
| quarter(dt_expr)
| second(dt_expr)
| week(dt_expr)
| year(dt_expr)
string_function:
ascii(string_expr)
| char(num)
| concat(string_expr, string_expr)
| convert(expression, {convert_type | {char}, width, convert_format})
| lcase(string_expr)
| left(string_expr, num)
| length(string_expr)
| locate(string_expr, string_expr, num)
| ltrim(string_expr)
| repeat(string_expr, num)
| replace(string_expr, string_expr, string_expr)
| right(string_expr, num)
| rtrim(string_expr)
| substring(string_expr, num, num)
| ucase(string_expr)
| unicode(string_expr)
convert_type:
char | smallint | integer | real
| double | date | time | timestamp | tinyint | bigint
convert_format:
numeric_format | datetime_format
numeric_format:
"[<< | >> | ><]['text' | $][- | (][#,]#[.#[#]...][e | E]['text' | $ | %]"
datetime_format:
"[<< | >> | ><]['text' | spchar | date_code | time_code]..."
date_code:
m | mm | mmm | mon | mmmm | month
| d | dd | ddd | dddd | day
| yy | yyyy
time_code:
h | hh | m | mm | s | ss | .f[f]... | [a/p | am/pm | A/P | AM/PM]
To illustrate the basic operation of aggregate calculations, consider the following example which computes the
total sales for each bookshop account manager.
select name, count(*), sum(price)
from (acctmgr join patron using(mgrid)) natural join sale natural join book
group by 1;
NAME             COUNT(*)  SUM(PRICE)
Doel, Frank             5       31745
Fox, Joe               19       95500
Kelly, Kathleen        14       67350
Kralik, Alfred         18       72685
Noble, Barney           6      234700
Novac, Klara           21      221650
Zonn, Amy               9       15660
The from clause needs a little explanation. A natural join between acctmgr and patron cannot be used because, besides the mgrid column (the correct join column), both tables have a column called name, which is not a legitimate join column because the two name columns never contain the same value. The using clause is therefore specified to identify the particular common column on which to form the join.
The count(*) gives the number of detail rows (i.e., sold books) in the aggregate for each account manager. The
sum(price) gives the total of all of the price values in the aggregate for each account manager.
You can see all of the detail rows that were used in the aggregate calculations by issuing the following query.
select name, price
from (acctmgr join patron using(mgrid)) natural join sale natural join book
order by 1;
NAME          PRICE
Doel, Frank   25000
Doel, Frank     750
Doel, Frank    2500
Doel, Frank     995
Doel, Frank    2500
Fox, Joe       3500
Fox, Joe      12500
Fox, Joe        750
Fox, Joe       1200
...
Zonn, Amy      1250
Zonn, Amy      1200
Zonn, Amy      4375
Zonn, Amy       750
Zonn, Amy       325
Figure 7 illustrates how aggregate calculations are performed on the detail rows that are retrieved.
Figure 7 - Group By Aggregate Calculations
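The grouping step can also be tried outside RDM. The following is a minimal sketch, assuming Python's standard sqlite3 module as a stand-in for RDM SQL; the single table sold and the helper summarize are hypothetical names that replace the four-table join. The Doel rows are the five detail rows listed above; the Fox rows are only the four that the listing shows, so the Fox totals here will not match the full result.

```python
import sqlite3

# Detail rows copied from the listing above. Only the first four of the
# Fox rows are shown in the guide, so the Fox totals here are partial.
detail_rows = [
    ("Doel, Frank", 25000), ("Doel, Frank", 750), ("Doel, Frank", 2500),
    ("Doel, Frank", 995), ("Doel, Frank", 2500),
    ("Fox, Joe", 3500), ("Fox, Joe", 12500), ("Fox, Joe", 750), ("Fox, Joe", 1200),
]

def summarize(rows):
    """Return one (name, count, sum) summary row per distinct name."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical single table standing in for the joined detail rows.
    conn.execute("create table sold (name text, price integer)")
    conn.executemany("insert into sold values (?, ?)", rows)
    result = conn.execute(
        "select name, count(*), sum(price) from sold group by name order by name"
    ).fetchall()
    conn.close()
    return result

summary = summarize(detail_rows)
for name, row_count, total in summary:
    print(name, row_count, total)
```

Each distinct name value forms one aggregate, and count(*) and sum(price) are evaluated once per aggregate rather than once per detail row.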
NSF Gender Study Example
The next example is from the NSF awards database. This is a rather involved example that shows how you can
use SQL to do analytical studies based on historical data contained in a database. The conclusions that are given
are the author's own based on his interpretation of the results of the queries given below.
The person table contains a list of all of the individual research investigators (jobclass = "I") and NSF program managers (jobclass = "P"). The gender of each person was not included in the original data but was
deduced from the person's first name based on a modified version of the list of names available from the following web site:
http://www.gpeters.com/names/baby-names.php?report=pop_all&showcount=10000
Not all first names in the person table were in this list and hence the gender could not be deduced. Thus, the
gender column values can be "M", "F", or "U". You can issue the following queries to see the totals for each
gender.
select count(*) from person where gender = "M";
COUNT(*)
57386
select count(*) from person where gender = "F";
COUNT(*)
17537
select count(*) from person where gender = "U";
COUNT(*)
10983
Alternatively, the next query can be used to compute the same results in one pass through the person table.
select sum(if(gender="F",1,0)) female,
sum(if(gender="M",1,0)) male,
sum(if(gender="U",1,0)) unknown from person;
FEMALE  MALE   UNKNOWN
 17537  57386    10983
It might be interesting to see what difference there is between the ratio of male to female investigators and the
ratio of male to female program managers. The following query uses a group by to group the totals by jobclass.
select jobclass, sum(if(gender="F",1,0)) female, sum(if(gender="M",1,0)) male
from person where gender != "U"
group by 1;
JOBCLASS  FEMALE   MALE
I          17197  56813
P            340    573
The ratio of male to female investigators is 3.3 while the ratio for program managers is 1.7. Assuming that the program managers are NSF employees, it appears that, on a percentage basis, the NSF hires significantly more women to oversee research grants than it funds as investigators.
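The conditional-sum technique and the two ratios can be checked with a short sketch. It assumes Python's sqlite3 module rather than RDM SQL, so the if(...) function is written as a standard case expression; the six person rows are invented for illustration, while the four counts passed to the hypothetical helper male_to_female_ratio are taken from the result above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table person (name text, jobclass text, gender text)")
# Hypothetical rows, invented purely to exercise the query.
conn.executemany(
    "insert into person values (?, ?, ?)",
    [("A", "I", "F"), ("B", "I", "M"), ("C", "I", "M"), ("D", "I", "U"),
     ("E", "P", "F"), ("F", "P", "M")],
)
# One pass over the table: each sum counts only the rows matching its condition.
counts = conn.execute(
    """
    select jobclass,
           sum(case when gender = 'F' then 1 else 0 end) as female,
           sum(case when gender = 'M' then 1 else 0 end) as male
    from person
    where gender != 'U'
    group by jobclass
    order by jobclass
    """
).fetchall()
conn.close()
print(counts)

def male_to_female_ratio(male, female):
    """Ratio of male to female counts."""
    return male / female

# Counts taken from the JOBCLASS result shown in the guide.
investigator_ratio = male_to_female_ratio(56813, 17197)
manager_ratio = male_to_female_ratio(573, 340)
print(round(investigator_ratio, 1), round(manager_ratio, 1))
```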
To see if there is any trend in the percentage of women granted NSF awards, you can issue the query below to
see the percentage of women who were awarded NSF grants by year.
select year(award_date), 100.*sum(if(gender="F",1,0))/count(gender) pct_females
from award natural join investigator natural join person
where gender != "U" group by 1;
YEAR(AWARD_DATE)  PCT_FEMALES
1989                    21.74
1990                    22.21
1991                    19.79
1992                    17.90
1993                    18.81
1994                    17.69
1995                    19.91
1996                    18.82
1997                    19.52
1998                    20.85
1999                    19.61
2000                    20.02
2001                    20.94
2002                    21.04
2003                    21.93
Notice that there appears to be no significant variation and certainly no trend to suggest that more women are
entering into research in the sciences between the years 1989 and 2003. As noted above, the NSF does hire a
greater percentage of women program managers. The following query shows the percentage by year and while
the percentages are greater than in the prior result no trend is evident here either.
select year(award_date), 100.0*sum(if(gender="F",1,0))/count(gender) PCT_FEMALE_PMS
from award join person on(prgm_mgr = name)
where gender != "U" group by 1;
YEAR(AWARD_DATE)  PCT_FEMALE_PMS
1989                       22.95
1990                       24.57
1991                       21.86
1992                       18.71
1993                       20.11
1994                       17.82
1995                       20.61
1996                       19.50
1997                       20.42
1998                       21.75
1999                       19.60
2000                       20.57
2001                       21.14
2002                       20.83
2003                       21.99
This data can be compared to the percentage of women earning doctoral degrees in science, engineering, and
health between the years 1989 and 2003 according to the NSF's own data as shown in the following table.
Table 11. Percentage of Science & Engineering Doctorates Earned by Women¹
      All science,
      engineering,
      and health    Computer               Life                   Physical              Social
Year  fields        sciences  Engineering  sciences  Mathematics  sciences  Psychology  sciences
1989  29.7          17.6       8.3         38.7      18.0         19.1      56.1        34.1
1990  29.2          15.6       8.5         37.9      17.7         18.8      58.3        33.3
1991  30.3          14.6       9.0         39.2      19.2         19.2      61.4        36.9
1992  30.2          13.8       9.3         39.7      19.4         20.8      59.1        36.0
1993  31.6          15.7       9.2         42.0      23.0         20.9      61.1        37.7
1994  31.9          15.2      10.9         42.2      21.1         20.8      62.2        37.0
1995  32.8          18.7      11.6         42.4      22.3         22.5      63.6        37.8
1996  33.3          15.1      12.3         43.8      20.6         21.8      66.7        36.5
1997  34.5          16.5      12.3         44.9      23.4         22.7      66.4        38.7
1998  36.0          17.2      13.1         45.8      25.2         24.4      66.9        41.5
1999  36.5          18.3      14.8         44.8      25.6         23.6      66.8        41.7
2000  38.0          16.4      15.7         47.2      24.7         25.1      66.6        42.9
2001  38.0          18.7      16.9         47.2      27.3         25.5      66.7        42.9
2002  39.2          20.6      17.6         47.8      28.9         27.3      66.6        44.5
2003  39.4          20.3      17.3         48.5      26.6         27.8      68.1        44.8
¹http://www.nsf.gov/statistics/infbrief/nsf08308/
Here trends that show an increasing percentage of women who've earned doctorates in every field are clearly
evident. What isn't clear is why these same trends are not also represented in the NSF research grant awards.
Now I suppose that it is possible that those person table rows in which the gender was not deducible could be a
higher percentage of female than male but that does not strike me as likely. One might even ask why the
researcher's gender was not included in the data collection. Perhaps it was but it was not included in the report
data in order to avoid just this kind of analysis. But that is mere speculation. The culprits, if there really are any,
could be anywhere, not just among those at NSF who decide who is awarded research grants. Another potentially significant analysis would require tracking the gender of the proposed investigators for all grant requests, including those that are
rejected. If that data were to show a trend that corresponds to that in the above table then it would seem that the
fault lies in the grant awards process. However, if no such trend is evident, it is possible that the problem could be
inside the grant requesting institutions where the authority for approving grant requests resides with senior
research management. However, other NSF data¹ does show an historical increase in the percentage of
women in senior faculty positions. So, since we evidently do not have all of the data, it would be "a capital mistake
to theorize before one has data."
¹http://www.nsf.gov/statistics/seind10/pdf/c05.pdf
Inserting Data into a Database
"I never guess. It is a capital mistake to theorize
before one has data. Insensibly one begins to twist
facts to suit theories, instead of theories to suit facts".
- Sherlock Holmes
In this section you will learn how to put data into an RDM SQL database. Three methods are available to you for
doing this. The most common is through the insert values statement that stores a single row into a table.
Another is to use the insert select statement that lets you store all of the rows returned from a select statement
into a table. The select retrieves rows from other tables in the same database or in another database but can
also retrieve data from a non-SQL data source that you can make available to RDM SQL through the create virtual table feature. The third method is through the use of the non-standard import statement. This statement can
insert new rows into a table from data stored in a comma-delimited or XML text file.
When making modifications to database content it is vitally important to maintain the logical integrity of the data.
Logical integrity means that all the related rows from multiple tables, as defined by the foreign and primary key
relationships in the DDL, always exist. That means, for example, that for every book stored in the bookshop database the referenced author row exists as do all of its related names, notes, sales and auctions. Logical integrity is
achieved through the use of transactions. This section will also show you how to use SQL transactions to ensure
that the logical integrity of your database remains intact and it is with that subject that we begin.
Transactions
It is very important that any database management system (DBMS) ensure that the data stored in a database satisfies the ACID criteria: Atomicity, Consistency, Isolation, and Durability. Atomicity means that a set of
interrelated database modifications is made together as a single unit. If one modification from the set fails
then all fail. Consistency means that a database never contains errant data or relationships and that a transaction always transforms the database from one consistent state into another. Consistency is something that is
primarily the responsibility of the application because the database cannot be certain that all of the necessary
modifications have been properly included in any given transaction. In RDM SQL, consistency rules are specified
through the foreign and primary key declarations and RDM SQL does ensure that those relationships are consistent. Isolation means that the changes that are being made during a transaction are only visible to the user
(program task) making them. Not until the transaction's changes have been committed to the database are other
users (tasks) able to see them. Durability refers to the DBMS's ability to ensure that the changes made by all
transactions that have committed survive any kind of system failure.
The work necessary to ensure that a DBMS supports "ACIDicity" makes it among the most complex of all system
software components. The challenge is to maintain ACIDicity and yet allow the database data to be easily
accessed by as many users as possible, as fast as possible. However, maintaining an ACID-compliant database carries an unavoidable and significant performance cost. When enforcement of
these properties is relaxed, data can be updated and accessed much more quickly but the consistency and integrity of the data will certainly be impaired should a system failure occur.
Three statements are used for transaction processing. The start transaction statement does just that. The commit statement will write to the database all of the changes made since the last start transaction. The rollback
statement will undo all of the changes made since the last start transaction. The syntax for each of these statements is shown below.
start_stmt:
{start trans[action] | begin [work] [trans[action]]} [read only]
commit_stmt:
{commit [work] | end [trans[action]]}
release_stmt:
release savepoint savepoint_id
rollback_stmt:
rollback [work] [[to savepoint] savepoint_id]
If no start transaction statement has been executed prior to the execution of an insert, update, or delete statement, the system will automatically start a transaction for you.
The read only transaction was described in detail in the Retrieving Data from a Database section. Examples
showing how to use transactions with the insert statement are provided in the following sections.
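The all-or-nothing behavior of start transaction, commit, and rollback can be sketched as follows. This is only an illustration under stated assumptions: it uses Python's sqlite3 module, where the statement is spelled begin, instead of RDM SQL; the two cut-down tables and the helper add_author_with_book are hypothetical; and the failure is simulated by raising an exception between two related inserts.

```python
import sqlite3

# isolation_level=None turns off the module's implicit transactions so that
# begin/commit/rollback are issued explicitly, as in the statements above.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("create table author (id text primary key, name text not null)")
conn.execute("create table book (id text primary key, author_id text not null, title text)")

def add_author_with_book(conn, author, book, fail=False):
    """Insert an author and a book as one unit; return True if committed."""
    conn.execute("begin")
    try:
        conn.execute("insert into author values (?, ?)", author)
        if fail:
            # Simulated failure between the two related inserts.
            raise RuntimeError("simulated failure")
        conn.execute("insert into book values (?, ?, ?)", book)
        conn.execute("commit")
        return True
    except Exception:
        conn.execute("rollback")  # undo everything since begin
        return False

committed = add_author_with_book(
    conn, ("DescartesR", "Descartes, Rene"),
    ("descartes01", "DescartesR", "Principia philosophiae"))
rolled_back = add_author_with_book(
    conn, ("WoolfV", "Woolf, Virginia"),
    ("woolf03", "WoolfV", "Jacob's room"), fail=True)

author_count = conn.execute("select count(*) from author").fetchone()[0]
book_count = conn.execute("select count(*) from book").fetchone()[0]
print(committed, rolled_back, author_count, book_count)
```

The second call leaves no trace in either table: the author row it inserted before the failure is discarded by the rollback.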
Insert Values
The insert values statement is used to insert a new row into a table. Its syntax is as shown in the box below.
insert_values_stmt:
insert into [db_name.]table_name [( column_name[, column_name]... )]
values simple_expr[, simple_expr]...
simple_expr:
simple_operand [{+ | - | * | / | %} simple_operand]...
|
( simple_expr )
simple_operand:
constant | column_name | arg_name | ? | scalar_fcn
scalar_fcn:
numeric_function | datetime_function | string_function | system_function
|
udf_name ([simple_expr][, simple_expr]...)
The insert values statement is used to insert a single row into the table table_name. If a column_name list is
specified it must include every column which requires that a value be specified (a primary key column or one
which does not have a default value but does have a not null declared). For each column, there must be a value
specified in the same corresponding position in the values list. If no column_name list is specified then there
must be a value listed for each column declared in the table in the order in which the columns were declared in
the create table statement for table_name.
The values specified in the values list will usually simply be a constant of a data type that is compatible with the
data type of its corresponding column. However, simple expressions can be used and besides constant values
can include a reference to another column value in the list (column_name) , parameter marker references (designated by a "?") or, if the insert statement is contained within a create procedure statement, procedure argument names (arg_name). Expressions can also include calls to the built-in SQL functions or to a user-defined
function. Use of functions will be described in detail in the Changing and Deleting Data in a Database section and
in the User-Defined Functions (UDFs) in SQL section. The arithmetic operations that are supported include the
usual addition (+), subtraction (-), multiplication (*), and division (/) as well as modulo (%). If a values list entry
includes a column_name it must reference another column in the table and the values list entry for that column
cannot itself include a column_name reference.
Here are some example insert statements:
start transaction;
insert into author values "DescartesR", "Descartes, Rene", "M", 1596, 1650,
"French philosopher, mathematician, physicist, and writer";
insert into book values "descartes01", "DescartesR", "Principia philosophiae",
"12 p.l., 310 p. illus., diagrs. 21 cm.",
"Amstelodami, apud Ludovicum Elzevirium",
1644, "B1860 1644", date "2010-09-22", null, 1.20*cost, 12750.0;
insert into related_name values "descartes01", "Lessing J. Rosenwald Collection";
insert into related_name values "descartes01", "John Davis Batchelder Collection";
insert into note(noteid, bookid) values nextnote(), "descartes01";
insert into note_line values thisnote(), "Title vignette: device of Louis Elzevir.";
insert into note_line values thisnote(), "Last preliminary leaf (sig. b[4]) blank.";
commit;
There are several things to notice from this example. The first is the presence of the start transaction and commit
statements that enclose the seven insert statements. As was discussed in the last section, since all of the data
being inserted into the database is interrelated, by placing it inside a single transaction unit, the system guarantees that either all of the data will be reliably stored in the database or, in the event of a system failure during
the transaction, none of it will. If each insert statement were individually committed then, should a failure occur,
some of the data would be missing. Therefore, it is always best to enclose all related database modification statements (i.e., insert, update, and delete) in a transaction.
The value associated with the price column in the second insert statement (i.e., the next to last entry in the values
list) is an expression that references the cost column (the last entry in the list). In this example, the asking price
for the book is calculated as a 20% markup over the cost of the book.
The final three insert statements illustrate how RDM user-defined functions (UDF) can be used to implement an
"auto-increment" integer primary key. UDFs will be explained in detail in the User-Defined Functions (UDFs) in
SQL section but here all you need to know is that the call to nextnote() returns the next higher noteid value
and the call to thisnote() returns the current value (i.e., that just returned by nextnote() when the previous insert statement was executed). This allows the foreign key value for column noteid in table note_line
to reference the note row that was just entered.
Insert From Select
You can also insert new rows into a table from another table using the insert from select statement. The syntax for
the insert from select statement is given below. The select statement was described in detail in the Retrieving
Data from a Database section and its use with the insert statement will show the basics of how the two can be
used together.
insert_select_stmt:
insert into table_name [( column_name[, column_name]... )]
from select_stmt
The number of result columns returned from the select_stmt must equal the number of columns specified in
the column_name list or, if not specified, the number of columns declared in the table. The data type of each
result column must also be compatible with its corresponding table column.
The following example uses the weather sensor database example discussed in the Defining a Database section. The select statement retrieves data from the various weather sensors and stores the results in the
weather_summary table. It uses the limit clause to specify that the data is to be accumulated and summarized
every 60 minutes. Even though only the SQL statements are shown, the execution of the statement would be performed inside a loop in the application program. One row per longitude and latitude, date, and hour of the day is
stored in the weather_summary table. Note that the execution time for this statement is one hour.
insert into weather_summary from
select loc_long, loc_lat, curdate(), hour(rdg_time),
avg(temperature), avg(pressure), avg(humidity), avg(light) from weather_data
group by 1, 2, 3, 4 limit(60 mins);
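The idea of storing the grouped result of a select directly into a summary table can be sketched as below. The assumptions are significant: the sketch uses Python's sqlite3 module, where the statement is written insert into ... select (without the from keyword), hour(rdg_time) is approximated with strftime, and the RDM limit clause has no counterpart, so the query simply summarizes the rows already present. The column subset and the sample readings are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, cut-down versions of the weather tables.
conn.execute(
    "create table weather_data (loc_long real, loc_lat real, rdg_time text, temperature real)")
conn.execute(
    "create table weather_summary (loc_long real, loc_lat real, rdg_hour integer, avg_temperature real)")
conn.executemany(
    "insert into weather_data values (?, ?, ?, ?)",
    [(-122.3, 47.6, "2010-09-22 10:05:00", 10.0),
     (-122.3, 47.6, "2010-09-22 10:35:00", 12.0),
     (-122.3, 47.6, "2010-09-22 11:05:00", 15.0)],
)
# One summary row per location and hour is inserted by a single statement.
conn.execute(
    """
    insert into weather_summary
    select loc_long, loc_lat,
           cast(strftime('%H', rdg_time) as integer),
           avg(temperature)
    from weather_data
    group by loc_long, loc_lat, cast(strftime('%H', rdg_time) as integer)
    """
)
summary_rows = conn.execute(
    "select * from weather_summary order by rdg_hour").fetchall()
conn.close()
print(summary_rows)
```

The number and order of the select result columns match the columns of weather_summary, which is the requirement stated above for the column list.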
Import
Data from non-database sources that are contained in text files can be loaded into a database table by using the
import statement as shown in the syntax specification below.
import_stmt:
import into table_name
from [char | wchar | xml] file "filename"
The data must be stored in either comma-delimited or XML format. A comma-delimited format requires that
each column value be specified in the order in which the columns are declared in the table. Absence of a column
value is indicated by a blank or empty entry (e.g., ",,"). Specify wchar if the text is stored with wide-characters.
The following statements are used to load the sample data contained in comma-delimited text files into the bookshop
example database.
import into author from file "authors.txt";
import into book from file "books.txt";
import into genres from file "genres.txt";
import into subjects from file "subjects.txt";
import into related_name from file "names.txt";
import into genres_books from file "bookgens.txt";
import into subjects_books from file "booksubs.txt";
import into acctmgr from file "acctmgrs.txt";
import into patron from file "patrons.txt";
import into note from file "bnotes.txt";
import into note_line from file "bnotelines.txt";
import into note from file "pnotes.txt";
import into note_line from file "pnotelines.txt";
import into sale from file "sales.txt";
import into auction from file "auctions.txt";
In XML format, XML attributes or tags identify the column name with
which the tagged value is associated. The columns can be in any order but all necessary columns must be
included (i.e., columns declared as not null without a default value or which are declared as a primary or unique
key). Each row is bracketed between pairs of <ROW> and </ROW> tags. For each row column values are specified between pairs of <column_name> and </column_name> tags. The file begins with a <RAIMA-SQL>
tag and ends with a </RAIMA-SQL> tag. A portion of file sponsors.xml which can be used to load the sponsor
table in the nsfawards database is shown below.
<RAIMA-SQL>
...
<ROW>
<name>UNAVCO, Inc.</name>
<addr>3360 Mitchell Lane</addr>
<city>Boulder</city>
<state>CO</state>
<zip>80301</zip>
</ROW>
<ROW>
<name>UNIAX Corporation</name>
<addr>6780 Cortona Drive</addr>
<city>Santa Barbara</city>
<state>CA</state>
<zip>93117</zip>
</ROW>
<ROW>
<name>UNIVERSITY OF MICHIGAN</name>
<addr>2455 Hayward Street</addr>
<city>Ann Arbor</city>
<state>MI</state>
<zip>48109</zip>
</ROW>
<ROW>
<name>UNIVERSITY OF WISCONSIN MA</name>
<addr></addr>
<city></city>
<state> </state>
<zip> / </zip>
</ROW>
<ROW>
<name>UNT Hlth Sci Ctr at Fort W</name>
<addr>Camp Bowie at Montgomery</addr>
<city>Fort Worth</city>
<state>TX</state>
<zip>76107</zip>
</ROW>
<ROW>
<name>URS Group, Inc.</name>
<addr>566 El Dorado Street - 2nd Floor</addr>
<city>Pasadena</city>
<state>CA</state>
<zip>91101</zip>
</ROW>
<ROW>
<name>US Army Corps of Engineers</name>
<addr>Transatlantic Programs Center</addr>
<city>Winchester</city>
<state>VA</state>
<zip>22601</zip>
</ROW>
...
</RAIMA-SQL>
The following statement loads the sponsor table in the nsfawards database from the above file.
import into sponsor from xml file "sponsors.xml";
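A file in this layout can be produced by a short script. The sketch below assumes Python's standard xml.etree.ElementTree module and a hypothetical helper named build_import_xml; it only mirrors the element structure shown above (a RAIMA-SQL root, one ROW per row, one element per column) and makes no claim about whitespace or other details the import statement may require.

```python
import xml.etree.ElementTree as ET

def build_import_xml(rows):
    """Return XML text with one <ROW> per dict and one child element per column."""
    root = ET.Element("RAIMA-SQL")
    for row in rows:
        row_elem = ET.SubElement(root, "ROW")
        for column_name, value in row.items():
            # ElementTree escapes characters such as & and < in the text.
            ET.SubElement(row_elem, column_name).text = value
    return ET.tostring(root, encoding="unicode")

# Two rows copied from the sponsors.xml excerpt above.
sponsors = [
    {"name": "UNAVCO, Inc.", "addr": "3360 Mitchell Lane",
     "city": "Boulder", "state": "CO", "zip": "80301"},
    {"name": "UNIAX Corporation", "addr": "6780 Cortona Drive",
     "city": "Santa Barbara", "state": "CA", "zip": "93117"},
]

xml_text = build_import_xml(sponsors)
print(xml_text)
```

Building the file through an XML library rather than string concatenation keeps column values that contain markup characters well-formed.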
Changing and Deleting Data in a Database
Politicians are like diapers. They both need
changing regularly and for the same reason.
- Unknown
As I write this sentence and look up and see the quote at the top of the page which I found several weeks ago, I
note that today is election day in the USA. Interesting coincidence. However, what you will learn about changing
and deleting data in a database using SQL in this section will be much easier than changing politicians!
The SQL update statement is used to change the value of one or more columns in the rows of a particular table.
The SQL delete statement can be used to delete one or more rows from a particular table. Two forms are provided for each statement. A searched update or delete contains a where clause that is used to determine which
rows of the table are to be updated or deleted. Searched updates and deletes are designed to be used interactively although they are also easily used in an application program. A positioned update or delete is used in conjunction with a select statement that is being processed under a separate statement handle and is only used
within an application program. For that reason, positioned updates and deletes are discussed in the Using SQL in an Application Program section.
Searched Delete Statement
The syntax for the delete statement is as follows.
delete_stmt:
delete from [db_name.]table_name
[where {conditional_expr | current of cursor_name}]
If no where clause is specified then all of the rows in the table are deleted. If a where clause is specified then only
those rows for which the conditional expression is true will be deleted. If a referential integrity violation occurs on
any row during the execution of the delete statement, then the delete fails with no rows deleted. A referential
integrity violation occurs when there is a foreign key reference to a row to be deleted and the foreign key/references declaration does not include on delete cascade. All foreign key/references declarations that do include
on delete cascade will cause the referencing rows from those tables to be deleted as well.
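The difference between a restricting and a cascading foreign key can be sketched with a three-table miniature of the bookshop schema. The sketch assumes Python's sqlite3 module rather than RDM SQL: foreign key enforcement must be switched on with a pragma there, the restrict action is spelled out explicitly, and the table definitions are hypothetical reductions of the real ones.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("pragma foreign_keys = on")  # SQLite enforces foreign keys only when enabled
conn.executescript(
    """
    create table book (bookid text primary key);
    create table genres (text text primary key);
    create table genres_books (
        bookid text references book(bookid) on delete cascade,
        genre  text references genres(text) on delete restrict
    );
    insert into book values ('austen06');
    insert into genres values ('Gothic fiction');
    insert into genres_books values ('austen06', 'Gothic fiction');
    """
)

# Deleting a referenced genre is rejected because of on delete restrict.
try:
    conn.execute("delete from genres where text = 'Gothic fiction'")
    restrict_blocked = False
except sqlite3.IntegrityError:
    restrict_blocked = True

# Deleting the book succeeds and cascades to its genres_books row.
conn.execute("delete from book where bookid = 'austen06'")
remaining_refs = conn.execute("select count(*) from genres_books").fetchone()[0]

# With the reference gone, the genre can now be deleted.
conn.execute("delete from genres where text = 'Gothic fiction'")
remaining_genres = conn.execute("select count(*) from genres").fetchone()[0]
print(restrict_blocked, remaining_refs, remaining_genres)
```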
Our antiquarian bookshop has a limited first edition, first impression copy of Jacob's Room by Virginia Woolf
worth 32,500 pounds. The owner has loaned this copy to the British Library for an upcoming Virginia Woolf exhibition. Hence, it needs to be removed from the inventory. The following queries show the pertinent information
from the book table as well as the entries in all the tables that reference the book.
select bookid, publ_year, price, title from book where bookid = "woolf03";
BOOKID   PUBL_YEAR  PRICE     TITLE
woolf03  1922       32500.00  Jacob's room [by] Virginia Woolf.
select * from related_name where bookid = "woolf03";
BOOKID   NAME
woolf03  Hogarth Press, publisher.
select * from genres_books where bookid = "woolf03";
BOOKID   GENRE
woolf03  Psychological fiction
woolf03  Experimental fiction
select * from subjects_books where bookid = "woolf03";
BOOKID   SUBJECT
woolf03  World War, 1914-1918
woolf03  Young men
woolf03  England
select text from note natural join note_line where bookid = "woolf03";
TEXT
First edition, first impression. One of probably
40 'A' subscribers copies.
Because all of the references to this particular book have foreign keys that specify on delete cascade, all that is
needed to remove the book and its references is to issue the following statement.
delete from book where bookid = "woolf03";
The previous four select statements will now not return any results. Now suppose you want to delete the genre
"Gothic fiction." You might first attempt the direct approach as follows.
delete from genres where text = "Gothic fiction";
**** referential integrity error: row to be deleted is referenced
The referential integrity error results from the fact that the foreign key references to this table are by default on
delete restrict which prevents the deletion of rows from a table where references exist. The genres table is referenced by only one other foreign key: the genre column of the genres_books table. You can use the following
query to list all of the rows in genres_books that reference "Gothic fiction."
select * from genres_books where genre = "Gothic fiction";
BOOKID    GENRE
austen06  Gothic fiction
There is only one reference which is Jane Austen's Northanger Abbey. So to delete "Gothic fiction" from the genres table you must first delete the reference in genres_books (which is appropriate considering the book is not
gothic fiction but is, in fact, a parody of gothic fiction).
delete from genres_books where genre = "Gothic fiction";
**** 1 rows affected
delete from genres where text = "Gothic fiction";
**** 1 rows affected
At this point, since these are only examples, I suggest that you issue a rollback to restore the database back to its
original state.
select * from genres where text = "Gothic fiction";
TEXT
select * from genres_books where genre = "Gothic fiction";
BOOKID    GENRE
rollback;
select * from genres where text = "Gothic fiction";
TEXT
Gothic fiction
select * from genres_books where genre = "Gothic fiction";
BOOKID    GENRE
austen06  Gothic fiction
Searched Update Statement
The syntax for the searched update statement is given below.
update_stmt:
update [db_name.]table_name
set column_name = expression[, column_name = expression]...
[where {conditional_expr | current of cursor_name}]
The values to which the named columns in the set clause are assigned are the evaluated results of the specified
column expressions. The column values in [db_name.]table_name referenced by the expressions are the
pre-updated column values. The rows that are updated are those for which conditional_expr is true. If the
update of any of the selected rows results in a referential integrity violation (i.e., a foreign key column in the table
is changed to a value that does not exist in the referenced table), the update is aborted and the changes to the
rows that had already been modified are discarded. If the where clause is not specified, all of the rows in the specified table are updated.
If one of the columns specified in the set clause is a primary key that is referenced by one or more foreign key references in other tables then one of two results can occur. If the foreign key declaration in the create table statement of the referencing table is specified with on update cascade then the update will succeed and the column
values of all referencing rows will automatically (and instantly) be updated accordingly. If no on clause is specified or if on update restrict is specified, the update will be rejected with a referential integrity error.
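The difference between on update cascade and on update restrict can be seen in a short, runnable sketch. It uses Python's sqlite3 module as a stand-in for RDM SQL (an assumption, RDM SQL itself is not involved), and the tables cascading and restricted are hypothetical names invented for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("pragma foreign_keys = on")  # SQLite enforces foreign keys only when enabled

conn.executescript("""
create table book(bookid text primary key);
create table cascading(bookid text references book(bookid) on update cascade);
create table restricted(bookid text references book(bookid) on update restrict);
insert into book values ('cbronte01'), ('ebronte01');
insert into cascading values ('cbronte01');
insert into restricted values ('ebronte01');
""")

# Primary key referenced through "on update cascade": the update succeeds and
# the referencing row follows the new key value.
conn.execute(
    "update book set bookid = replace(bookid, 'cbronte', 'brontec') "
    "where bookid like 'cbronte%'"
)
cascaded = conn.execute("select bookid from cascading").fetchall()

# Primary key referenced through "on update restrict": the update is rejected.
try:
    conn.execute(
        "update book set bookid = replace(bookid, 'ebronte', 'brontee') "
        "where bookid like 'ebronte%'"
    )
    restrict_rejected = False
except sqlite3.IntegrityError:
    restrict_rejected = True

unchanged = conn.execute("select bookid from restricted").fetchall()
print(cascaded, restrict_rejected, unchanged)
conn.close()
```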
The following query lists the unsold books priced at £25,000 and above in the order in which the books were
acquired.
select bookid, date_acqd, price, title from book
where date_sold is null and price >= 25000.00
order by date_acqd;
BOOKID         DATE_ACQD   PRICE      TITLE
shakespeare01  2006-01-02  175000.00  The Tragicall Historie of Hamlet, Prince...
poe02          2006-02-14   25000.00  Tales of the grotesque and arabesque
decartes01     2006-03-09   75000.00  Principia philosophiae
twain01        2006-08-06   32500.00  The celebrated jumping frog of Calaveras ...
shakespeare07  2006-10-26   25000.00  Works. 1709
shakespeare03  2007-05-22   75000.00  Macbeth, a tragedy.
shakespeare06  2007-08-22   34500.00  King Richard II
twain03        2007-09-17   67500.00  The adventures of Tom Sawyer,
potter04       2007-12-19   80000.00  The tale of Peter Rabbit
shakespeare04  2008-02-09  250000.00  Plays
wells02        2009-03-24   30000.00  The island of Doctor Moreau,
woolf03        2009-08-10   32500.00  Jacob's room [by] Virginia Woolf.
shelley01      2009-11-26   25000.00  Frankenstein; or, The modern Prometheus.
raleigh01      2010-01-12   32500.00  The history of the world.
Given the difficult economic conditions and because they have been sitting in inventory unsold for some time, the
shop owner has decided to lower the price by 15% on the most expensive books that were acquired prior to
2007. The following update statement will do this.
Note that the values in the date_acqd and date_sold columns in your installation of the bookshop database
example will be comprised of dates later than those shown here.
update book set price = price - price*0.15
where date_sold is null and date_acqd < date "2007-01-01" and price >= 25000.00;
**** 5 rows affected
select bookid, date_acqd, price, title from book
where date_sold is null and price >= 25000.00 order by date_acqd;
BOOKID         DATE_ACQD   PRICE      TITLE
shakespeare01  2006-01-02  148750.00  The Tragicall Historie of Hamlet, Prince...
decartes01     2006-03-09   63750.00  Principia philosophiae
twain01        2006-08-06   27625.00  The celebrated jumping frog of Calaveras ...
shakespeare03  2007-05-22   75000.00  Macbeth, a tragedy.
shakespeare06  2007-08-22   34500.00  King Richard II
twain03        2007-09-17   67500.00  The adventures of Tom Sawyer,
potter04       2007-12-19   80000.00  The tale of Peter Rabbit
shakespeare04  2008-02-09  250000.00  Plays
wells02        2009-03-24   30000.00  The island of Doctor Moreau,
woolf03        2009-08-10   32500.00  Jacob's room [by] Virginia Woolf.
shelley01      2009-11-26   25000.00  Frankenstein; or, The modern Prometheus.
raleigh01      2010-01-12   32500.00  The history of the world.
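The arithmetic behind the price update can be worked through by hand. The short Python sketch below does so for the five books acquired before 2007, using the prices listed in the first query result; Python and its Decimal type are used here only as a calculator and are not part of RDM SQL.

```python
from decimal import Decimal

# (bookid, date_acqd, price) of the unsold books acquired before 2007-01-01
# that were priced at 25000.00 or more before the update.
before = [
    ("shakespeare01", "2006-01-02", Decimal("175000.00")),
    ("poe02",         "2006-02-14", Decimal("25000.00")),
    ("decartes01",    "2006-03-09", Decimal("75000.00")),
    ("twain01",       "2006-08-06", Decimal("32500.00")),
    ("shakespeare07", "2006-10-26", Decimal("25000.00")),
]

# price - price*0.15, exactly as written in the set clause
after = {bookid: price - price * Decimal("0.15") for bookid, _, price in before}

# Books whose new price no longer satisfies price >= 25000.00
dropped = sorted(b for b, p in after.items() if p < Decimal("25000.00"))

for bookid, price in after.items():
    print(bookid, price)
print("no longer listed:", dropped)
```

This accounts both for the five rows reported as affected and for the two books (poe02 and shakespeare07) that are missing from the second query result: their reduced price falls below the 25000.00 threshold in the where clause.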
It was also noticed that the bookid values in the book table all begin with the author's last name followed by a
two-digit ordered sequence. However, two authors share the same last name: Emily and Charlotte Bronte. The
bookid values for the two sisters begin with the first initial to differentiate between the authors. The shop owner
wants to change this so that the initial follows the last name in order to preserve the last name bookid convention.
Since all foreign key references to bookid have been declared with the on update cascade specification, it is
possible to update the bookid column even though it is the book table's primary key. The following example
shows the update statements that do this. Notice the use of the built-in string function replace.
select bookid, last_name, title from book where last_name like "Bronte%";
BOOKID     LAST_NAME  TITLE
cbronte01  BronteC    Jane Eyre. An autobiography. Ed. by Currer Bell [pseud.]
cbronte02  BronteC    Villette.
cbronte03  BronteC    Jane Eyre.
ebronte01  BronteE    Wuthering Heights. A novel.
update book set bookid = replace(bookid, "cbronte", "brontec")
where last_name = "BronteC";
*** 3 rows affected
update book set bookid = replace(bookid, "ebronte", "brontee")
where last_name = "BronteE";
*** 1 rows affected
select bookid, last_name, title from book where last_name like "Bronte%";
BOOKID     LAST_NAME  TITLE
brontec01  BronteC    Jane Eyre. An autobiography. Ed. by Currer Bell [pseud.]
brontec02  BronteC    Villette.
brontec03  BronteC    Jane Eyre.
brontee01  BronteE    Wuthering Heights. A novel.
One final comment. Notice that in none of the above examples was a commit statement issued. Hence, the
changes made by the foregoing update statements have not yet been permanently stored in the database.
Since these were just examples, let's go ahead and issue a rollback statement to discard them.
rollback;
select bookid, date_acqd, price, title from book
where date_sold is null and price >= 25000.00 order by date_acqd;
BOOKID         DATE_ACQD   PRICE   TITLE
shakespeare01  2006-01-02  175000  The Tragicall Historie of Hamlet, Prince...
poe02          2006-02-14   25000  Tales of the grotesque and arabesque
decartes01     2006-03-09   75000  Principia philosophiae
twain01        2006-08-06   32500  The celebrated jumping frog of Calaveras ...
shakespeare07  2006-10-26   25000  Works. 1709
shakespeare03  2007-05-22   75000  Macbeth, a tragedy.
shakespeare06  2007-08-22   34500  King Richard II
twain03        2007-09-17   67500  The adventures of Tom Sawyer,
potter04       2007-12-19   80000  The tale of Peter Rabbit
shakespeare04  2008-02-09  250000  Plays
wells02        2009-03-24   30000  The island of Doctor Moreau,
woolf03        2009-08-10   32500  Jacob's room [by] Virginia Woolf.
shelley01      2009-11-26   25000  Frankenstein; or, The modern Prometheus.
raleigh01      2010-01-12   32500  The history of the world.
select bookid, last_name, title from book where last_name like "Bronte%";
BOOKID     LAST_NAME  TITLE
cbronte01  BronteC    Jane Eyre. An autobiography. Ed. by Currer Bell [pseud.]
cbronte02  BronteC    Villette.
cbronte03  BronteC    Jane Eyre.
ebronte01  BronteE    Wuthering Heights. A novel.
Writing and Using Stored Procedures
There is no procedure for learning to write.
What you must do, is learn to think.
- S. Leonard Rubenstein, Pennsylvania State University
classroom lecture, 1980.
A stored procedure is a named and possibly parameterized collection of one or more SQL statements that are
precompiled and executed together as a group. In RDM SQL, stored procedures are defined using the create
procedure statement as shown in the syntax specification given below.
create_proc_stmt:
create {proc | procedure} proc_name [(arg_name arg_type[, arg_name arg_type]...)] as
{select_stmt... |
[start_stmt] {insert_stmt | update_stmt | delete_stmt}... [commit_stmt]}
end {proc | procedure}
arg_type:
        {character | char }
    |   {double [precision] | float | real }
    |   {tinyint | smallint | int | integer long | bigint}
    |   date | time | timestamp
Note that a stored procedure can contain either one or more select statements or one or more database
modification statements, the latter optionally enclosed in a transaction. Stored procedures, therefore, can be used to specify the precompiled queries and the precompiled database modifications needed by an application. However,
RDM SQL does not allow a single stored procedure to do both. These limitations are
designed to keep the RDM SQL implementation as efficient and as small as possible because of the resource limitations of many embedded computing environments.
The names used for stored procedure arguments must not conflict with column names that are declared in any of
the tables that are referenced in the SQL statements contained in the stored procedure. The argument data
types must be compatible with how they are used in the SQL statements specified in the procedure.
When a stored procedure has been successfully compiled by RDM SQL, the compiled code is stored in a file
named proc_name.ssp on the database's TFS. Also created and stored in the current directory is a file named
proc_name_ssp.c containing statically initialized C data structures that contain the compiled stored procedure
information and a file named proc_name_ssp.h which is a C header file to be included in any program that will
directly execute the stored procedure by calling function rsqlExecProc. This process is illustrated in Figure 8.
Figure 8 - How Create Procedure is Processed
There are two ways to execute a stored procedure. If all of your SQL database access is through pre-compiled
stored procedures (i.e., use of the proc_name_ssp.c module), then, as mentioned above, the application
calls rsqlExecProc. This will be explained in detail in the Using SQL in an Application Program section. The
other way to execute a stored procedure is by compiling and executing an execute statement as shown in the following syntax.
execute_stmt:
[exec[ute] | run] proc_name [(constant[, constant]...)]
The next example creates and executes a stored procedure that will retrieve some of the columns in the book
table for a specific bookid value that is passed in as an argument.
create proc getbook(bid char) as
select last_name, publ_year, price, title from book
where bookid = bid
end proc;
execute getbook("austen03");
LAST_NAME  PUBL_YEAR  PRICE     TITLE
AustenJ    1814       13500.00  Mansfield Park: a novel. In three volumes.
Now suppose we really want to see the author's full name along with the selected book information. You can do
this by including two select statements: one that returns the full_name column from the author row that's
joined with the book and another that returns the book data. Note also that the execute keyword is optional.
create proc getbook(bid char) as
select full_name from author natural join book where bookid = bid
select publ_year, price, title from book where bookid = bid
end proc;
getbook("austen03");
FULL_NAME
Austen, Jane
PUBL_YEAR  PRICE     TITLE
1814       13500.00  Mansfield Park: a novel. In three volumes.
The next example shows how to modify the database contents using a stored procedure. The newpatron procedure inserts a new row into the patron table.
create procedure newpatron(
pid char, nm char, cty char, str char, st char,
cntry char, zip char, em char, tel char, mid char) as
insert into patron values pid, nm, str, cty, st, cntry, zip, em, tel, mid
end proc;
newpatron("RLM", "Randy Merilatt", "720 3rd Ave Suite 1100", "Seattle", "WA",
"US", "98104", "[email protected]","206-748-5200","BARNEY");
select name, city, state, mgrid, email from patron where patid = "RLM";
NAME            CITY     STATE  MGRID   EMAIL
Randy Merilatt  Seattle  WA     BARNEY  [email protected]
The above version of newpatron does not encapsulate the insert inside a transaction. So in order to make the new
patron permanent, a commit needs to be separately executed. Normally, you would not use a transaction inside
a stored procedure when there is more than one modification stored procedure that you want to have as part of a
single transaction. The version of newpatron that uses a transaction is defined below.
create procedure newpatron(
pid char, nm char, cty char, str char, st char,
cntry char, zip char, em char, tel char, mid char) as
start transaction
insert into patron values pid, nm, str, cty, st, cntry, zip, em, tel, mid
commit
end proc;
A modification stored procedure can contain more than one statement. The next example records a book sale.
create procedure sold(b_id char, p_id char, amt double) as
start transaction
insert into sale values b_id, p_id
update book set price = amt, date_sold = curdate() where bookid = b_id
commit
end proc;
To record the sale of Jane Austen's Emma to Lucille Bluth for £12,500, enter the following.
select last_name, price, date_sold, title from book where bookid = "austen04";
LAST_NAME  PRICE  DATE_SOLD   TITLE
AustenJ    13500  *NULL*      Emma: a novel. In three volumes.
exec sold("austen04","BLU", 12500.00);
*** 1 rows affected
*** 1 rows affected
select last_name, price, date_sold, title from book where bookid = "austen04";
LAST_NAME  PRICE  DATE_SOLD   TITLE
AustenJ    12500  2010-11-18  Emma: a novel. In three volumes.
If an error occurs during the execution of any of the SQL statements in a stored procedure, any changes made
by that statement are aborted and the stored procedure will immediately exit, leaving any remaining statements
unexecuted. If the stored procedure is a modification procedure, any changes made by the stored procedure prior
to the attempted execution of the offending statement are automatically rolled back. If no transaction was specified in the stored procedure, any changes made during the active transaction but prior to the execution of the
stored procedure remain intact and can either be committed or rolled back as desired.
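The all-or-nothing behavior of a transactional modification procedure can be sketched outside RDM SQL. The Python function below is a hypothetical analogue of the sold procedure, written against SQLite through the standard sqlite3 module; the simplified book and sale tables and the check constraint on price are assumptions added so that the second call has something to fail on.

```python
import sqlite3

conn = sqlite3.connect(":memory:", isolation_level=None)  # transactions managed explicitly
conn.executescript("""
create table book(bookid text primary key,
                  price real check (price >= 0),
                  date_sold text);
create table sale(bookid text, patid text);
insert into book values ('austen04', 13500.00, null);
""")

def sold(conn, b_id, p_id, amt):
    """Hypothetical Python analogue of the transactional sold procedure."""
    conn.execute("begin")
    try:
        conn.execute("insert into sale values (?, ?)", (b_id, p_id))
        conn.execute(
            "update book set price = ?, date_sold = date('now') where bookid = ?",
            (amt, b_id),
        )
        conn.execute("commit")
        return True
    except sqlite3.Error:
        # The failing statement aborts, and the earlier insert is rolled back too.
        conn.execute("rollback")
        return False

ok = sold(conn, "austen04", "BLU", 12500.00)
failed = sold(conn, "austen04", "BLU", -1.00)   # update violates the check constraint

sale_rows = conn.execute("select count(*) from sale").fetchone()[0]
price = conn.execute("select price from book where bookid = 'austen04'").fetchone()[0]
print(ok, failed, sale_rows, price)
```

After the failed call the sale table still holds only the row from the first call, because the insert that preceded the failing update was discarded along with it.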
In RDM SQL, stored procedures are not intended to be an alternative way to program. They simply provide the
ability to pre-compile the SQL statements that are needed to access and manipulate the database so that an
application does not incur the cost of having to compile the statements dynamically at runtime.
Concurrent Database Access
The test of a first-rate intelligence is the ability
to hold two opposed ideas in the mind at the
same time, and still retain the ability to function.
- F. Scott Fitzgerald , "The Crack-Up" (1936)
Concurrent database access refers to the situation where the database is being accessed from more than one
connection (user) at a time. Without the database system exerting some control over what gets updated by whom
and when, all kinds of data integrity and consistency problems can arise. This can be illustrated with the simple
example given below in Table 12 which shows what can happen when the database system does not provide
some kind of concurrent access protection.
Table 12. Concurrent Update Problem
Time  Connection 1                             Connection 2
T1    select price from book where bookid =
      "cbronte03";
      PRICE
      12500.00
T2                                             select price from book where bookid =
                                               "cbronte03";
                                               PRICE
                                               12500.00
T3                                             update book set price=14500.00 where
                                               bookid = "cbronte03";
T4    update book set price=10500.00 where
      bookid = "cbronte03";
T5    select price from book where bookid =    select price from book where bookid =
      "cbronte03";                             "cbronte03";
      PRICE                                    PRICE
      10500.00                                 10500.00
At time T1 connection 1 executes a select that returns the price of the book as 12,500. At time T2 connection 2
executes the same select and gets the same result. Then at time T3 connection 2 issues an update changing the
price to 14,500 while at time T4 connection 1 changes the price to 10,500, overwriting the change just made by
connection 2. At time T5 both connections issue the same select with connection 1 getting the expected result
while the user on connection 2 wonders if there is something wrong with her keyboard!
One of the most common ways for a DBMS to prevent these kinds of problems is to use locking in order to prevent other connections from accessing the data being updated. So, in the above example, if at time T1 connection 1 places a lock on the book table, then the lock request issued by connection 2 at T2 will wait until
connection 1 frees the lock, which occurs once its update completes. Then connection 2's lock request will be granted, its select statement will return the value of price as 10,500, and
connection 2's update can proceed with no anomalies.
Table 13. Locking Solution to Concurrent Update Problem
Time  Connection 1                             Connection 2
T1    Request book table lock
T2    Lock granted                             Request book table lock
T3    select price from book where bookid =
      "cbronte03";
      PRICE
      12500.00
T4    update book set price=10500.00 where
      bookid = "cbronte03";
T5    Free book table lock
T6                                             Lock granted
T7                                             select price from book where bookid =
                                               "cbronte03";
                                               PRICE
                                               10500.00
T8                                             update book set price=14500.00 where
                                               bookid = "cbronte03";
                                               Free book table lock
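The locking sequence in Table 13 can be imitated with two threads and an ordinary mutex. This is only an analogy written in Python: the dictionary stands in for the book table, threading.Lock stands in for the table lock, and the Event exists solely to make connection 1 go first so that the run is repeatable. None of these names come from RDM SQL.

```python
import threading

book = {"cbronte03": 12500.00}     # one-row stand-in for the book table
book_lock = threading.Lock()       # stand-in for the book table lock
c1_has_lock = threading.Event()    # forces connection 1 to obtain the lock first
seen = {}                          # the price each connection read

def connection_1():
    with book_lock:                        # T1, T2: lock requested and granted
        c1_has_lock.set()
        seen[1] = book["cbronte03"]        # T3: select
        book["cbronte03"] = 10500.00       # T4: update
    # T5: leaving the with block frees the lock

def connection_2():
    c1_has_lock.wait()                     # T2: request arrives while the lock is held
    with book_lock:                        # T6: granted once connection 1 frees it
        seen[2] = book["cbronte03"]        # T7: select sees the updated price
        book["cbronte03"] = 14500.00       # T8: update, then the lock is freed

threads = [threading.Thread(target=connection_2),
           threading.Thread(target=connection_1)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(seen, book["cbronte03"])
```

Connection 2 now bases its update on the price that connection 1 wrote, so neither change is silently lost.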
Locking In RDM SQL
RDM SQL provides two types of locks. A read (share) lock locks a table for read-only access. Any number of different connections can have a read lock on a table. During the time that a table is read locked, no modifications
can occur on the table. A write (exclusive) lock locks a table for exclusive access by the connection which was
granted the write lock. When one connection has been granted a write lock on a table, lock requests from other
connections are queued and granted on a first-come, first-served basis.
Queued lock requests do not wait forever. When a lock request has waited for 10 seconds, it will be deleted from
the queue and a timeout status code (errTIMEOUT) will be returned. The timeout value for a connection can be
changed using the set timeout statement as shown below or through a call to function rsqlSetTimeout.
set_timeout_stmt:
set timeout {to | =} integer
A timeout value equal to -1 disables timeout checking so that lock calls will wait indefinitely. Timeouts should only
be disabled when you are certain that there is no possibility of a deadlock situation arising (see deadlock discussion below). Any non-negative value specifies the number of seconds to wait for the requested table lock(s)
to be granted. Setting the timeout to zero means that a lock request will return immediately if the lock cannot be
granted.
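The three timeout cases (wait indefinitely, return immediately, wait a bounded time) behave much like a timed mutex acquisition. The sketch below illustrates them with Python's threading.Lock; request_lock is a hypothetical helper, not an RDM SQL function, and it uses a fraction of a second instead of whole seconds only to keep the demonstration quick.

```python
import threading

def request_lock(lock, timeout):
    """Hypothetical helper mirroring the timeout rules described above.

    timeout == -1 waits indefinitely, 0 returns immediately, and a positive
    value waits up to that many seconds. Returns True if the lock was granted.
    """
    if timeout == -1:
        return lock.acquire()                 # block until granted
    return lock.acquire(timeout=timeout)      # 0 means "do not wait at all"

table_lock = threading.Lock()
table_lock.acquire()                          # some other connection holds the lock

immediate = request_lock(table_lock, 0)       # not granted, returns at once
short_wait = request_lock(table_lock, 0.05)   # not granted, gives up after 0.05 s

table_lock.release()                          # the holder commits or rolls back
granted = request_lock(table_lock, -1)        # lock is free, so it is granted

print(immediate, short_wait, granted)
```

In RDM SQL a request that gives up would surface as the errTIMEOUT status rather than as a False return value.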
Only table-level locking is provided in RDM SQL. Table locking is simple and is therefore very efficient but
because an entire table is locked at a time, it works best in applications where there are a limited number of concurrent connections. If, however, you keep the duration of your transactions as short as possible, good throughput is achievable for most embedded systems applications.
Lock requests are automatically issued by RDM SQL when needed (implicit locking). For example, read locks
are requested for each table that is accessed by a select statement. When the locks on all of the needed tables
have been granted then statement execution will proceed. If the select statement was executed outside a transaction, the locks are held until the statement handle with which the select is associated (i.e., the cursor) is closed,
which occurs automatically after the last row has been fetched. If the select was executed after a transaction has
started then the locks will be held until the transaction is either committed or rolled back.
A write-lock is requested by RDM SQL for the tables that are being modified by an insert, update, or delete statement. Write-locks are not freed until either a commit or rollback operation is executed.
Table locks can be explicitly requested by either executing a lock table statement or through a call to the RDM
SQL API function rsqlLockTables. The syntax for the lock table statement is shown below.
lock_stmt:
lock table [in db_name] table_lock[, table_lock]...
table_lock:
table_name [read | write | default]
If neither read nor write is specified, then read is the default outside of a transaction and write is the default inside
a transaction. If a read only transaction (see below) is active then the lock request will return an error. Either all
lock requests will succeed or none will. That is, this is an all-or-nothing request, which can be used to prevent a
deadlock situation in which one process holds a lock on table A while requesting a lock on table B while a second
process is holding a lock on table B while requesting a lock on table A.
The system will switch into explicit locking mode on execution of the first lock table statement (rsqlLockTables call). In this mode, all tables that are accessed by any subsequent SQL statements must be explicitly
locked. If not, SQL will return an errNOTLOCKED status. Note that the values of foreign key columns are
retrieved from the referenced row in the primary key table (RDM SQL does not actually store them in the foreign
key table). Hence, both the foreign and primary key tables must be explicitly locked when accessing foreign key
column values.
unlock_stmt:
unlock table {[db_name.]table_name | all}
This statement will free the read lock on table table_name or will free all read locks. This can only be executed outside of a transaction. The locks held within a transaction can only be freed through a transaction commit or rollback.
The SQL system automatically reverts to implicit locking mode when all table locks have been freed.
Read Only Transactions
A read only transaction allows a transaction consistent snapshot of the database to be queried without the need
to place locks on the accessed tables. A read only transaction can be explicitly started by executing the following
statement.
start_stmt:
{start trans[action] | begin [work] [trans[action]]} [read only]
Once a read only transaction has started, database modifications that have been committed by other connections will not be visible. Read only transactions are terminated by executing either a commit or a rollback
statement. If a read only transaction is active when a select statement executes, no lock requests will be issued.
By default, RDM SQL automatically requests read locks on the tables that are accessed by a select statement.
However, an option is available that will cause SQL to automatically initiate a read only transaction instead of
requesting locks. The read only transaction will be terminated when the select statement completes (i.e., cursor
is closed). The mode is controlled using the statement given in the following syntax.
read_only_trmode_stmt:
set read only trans[action] mode [to | =] {auto | manual}
When this mode is set to manual (default), SQL will issue lock requests on the tables to be accessed by a select
statement. When this mode is set to auto, SQL will execute each select statement within its own read only transaction.
You can also explicitly indicate that a select is to use a read only transaction instead of locks by adding the for
read only clause to the end of your select statement.
Read only transactions are very useful in concurrent database access applications because they do not block
access to the database from other connections. However, these do not come free. Long running read only transactions will eventually seriously degrade system performance. Therefore, it is best that read only transactions be
kept as short as possible.
Modification Stored Procedures
RDM SQL automatically places write locks on the tables that are being modified in an insert, update, or delete
statement. If you encapsulate all of your database modifications in stored procedures that include an opening
start transaction and a closing commit statement (transactional stored procedures), then the system will issue a
grouped lock request at the start of execution of the stored procedure to acquire all of the locks on all of the tables
involved in the modification. The execute statement (or call to rsqlExecProc) will return status errTIMEOUT
when one or more of the requested locks could not be acquired within the timeout window.
Transactional stored procedures can modify only one database at a time. If you use more than one database at a
time, then the modifications for each must be made in separate transactions.
Avoiding Deadlock
A deadlock (also known as deadly embrace) is an egregious situation that can arise in any system that involves
concurrent access to shared data from multiple processes. In its simplest form, process 1 holds an exclusive lock
on data item A and is requesting a lock on data item B while at the same time process 2 holds an exclusive lock
on data item B while requesting a lock on data item A. As you can easily see, both processes will wait forever
unless one or the other releases the lock it holds. Of course, much more complex deadlock scenarios exist that
involve multiple processes.
The primary application programming technique available in RDM that can be used to avoid deadlock is the timeout. A lock request will fail if the lock is not granted within the time duration specified by the connection's timeout
value. The default timeout is set to 10 seconds. As noted above, this value can be changed using either the set
timeout statement or through a call to the rsqlSetTimeout function.
While timeouts can be used to avoid deadlock, a related condition known as a livelock can still occur. In
the example above, the lock requests of both process 1 and process 2 time out at the same time, causing each to
free the lock it holds and then restart its transaction, with the timing of the operations such that
the same situation continues to repeat itself.
Both livelock and deadlock can be avoided by including in a single request locks on all of the tables (i.e., a
grouped lock request) that will potentially be modified by a transaction. As noted in the last section, a transactional stored procedure performs a grouped lock request for all needed locks at the beginning of the transaction, before any modification statements have executed. The table locks included in grouped lock requests
made by RDM SQL are always specified in the same order. While a timeout can still certainly occur, neither a
deadlock nor livelock situation will occur.
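The idea of always requesting a group of locks in one fixed order can be sketched with ordinary mutexes. The Python code below is an approximation only: RDM SQL issues a single grouped request, whereas this sketch acquires the locks one by one in a fixed order and releases everything if any one of them times out. The table names, DECLARED_ORDER, lock_tables and unlock_tables are all hypothetical.

```python
import threading

# Hypothetical table locks, listed in DDL declaration order.
DECLARED_ORDER = ["author", "book", "sale"]
locks = {name: threading.Lock() for name in DECLARED_ORDER}

def lock_tables(names, timeout):
    """All-or-nothing lock request; returns True only if every lock was granted."""
    ordered = sorted(set(names), key=DECLARED_ORDER.index)
    held = []
    for name in ordered:
        if locks[name].acquire(timeout=timeout):
            held.append(name)
        else:
            for h in reversed(held):       # give back whatever was acquired
                locks[h].release()
            return False
    return True

def unlock_tables(names):
    for name in set(names):
        locks[name].release()

completed = []

def transaction(label, names):
    if lock_tables(names, timeout=5):
        completed.append(label)            # the modifications would happen here
        unlock_tables(names)

# The two callers name the tables in opposite orders, the classic deadlock setup.
t1 = threading.Thread(target=transaction, args=("p1", ["book", "sale"]))
t2 = threading.Thread(target=transaction, args=("p2", ["sale", "book"]))
t1.start()
t2.start()
t1.join()
t2.join()

print(sorted(completed))
```

Because both callers end up acquiring book before sale, neither can hold one lock while waiting for a lock the other already holds.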
However, if you are issuing dynamic SQL transactions that include multiple database modification statements,
you need to explicitly lock all tables that can be modified in the transaction immediately following the start transaction statement. While not strictly necessary, it is also best to specify the tables in the lock table statement in
the order in which they are declared in your DDL specification (this is the order in which SQL automatically issues
the grouped lock request when a transactional stored procedure is executed). If you do not explicitly lock the
tables in a dynamic SQL transaction, SQL will automatically make the lock requests for each statement. If a timeout occurs during execution of a database modification statement, the correct response is to roll back the transaction and then restart it.
It is highly recommended that you encapsulate all of your transactions in transactional stored procedures in order
to ensure that deadlock and livelock situations are avoided. It is also recommended that you use read only transactions as much as possible as these will not block other updating processes. Both regular and read only transactions should execute in as short a time frame as possible.
Concurrent Database Access Use in Static SQL Applications
The locking statements described in the preceding sections are only available through dynamic SQL; they cannot be included in stored procedures.
Explicit locking within a static SQL application that uses only pre-compiled stored procedures must be done
through calls to the RDM SQL API locking functions as shown in the table below. The Using SQL in an Application Program section will describe in detail the use of these functions in an RDM SQL C application program.
Table 14. RDM SQL API Functions that Correspond to SQL Locking Statements
SQL Statement                     RDM SQL API Function
lock table                        rsqlLockTables
unlock table                      rsqlUnlockTable
set timeout                       rsqlSetTimeout
set read only transaction mode    rsqlSetReadOnlyTrmode
start transaction [ read only ]   rsqlTransStart or rsqlTransStartReadOnly
savepoint                         rsqlTransSavepoint
release savepoint                 rsqlTransRelease
rollback                          rsqlTransRollback or rsqlTransEndReadOnly
commit                            rsqlTransCommit or rsqlTransEndReadOnly
Examples
If a timeout occurs at any time during the execution of a statement within a transaction, the transaction should be
rolled back and restarted.
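The roll-back-and-restart rule can be written as a small retry loop. The following Python sketch shows only the control flow; LockTimeout is a hypothetical exception standing in for the errTIMEOUT status, and the transaction and rollback arguments are placeholders for whatever calls your application actually makes. The cap on the number of attempts is an addition of this sketch, not something the guide prescribes.

```python
class LockTimeout(Exception):
    """Hypothetical stand-in for an errTIMEOUT status returned by a statement."""


def run_with_retry(transaction, rollback, max_attempts=3):
    """Run transaction(); on a timeout, roll back and start it again.

    Returns the number of attempts that were needed.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            transaction()
            return attempt
        except LockTimeout:
            rollback()          # discard partial work before restarting
    raise LockTimeout(f"no success after {max_attempts} attempts")


# Demonstration: a transaction body that times out twice and then succeeds.
calls = {"tried": 0, "rolled_back": 0}

def body():
    calls["tried"] += 1
    if calls["tried"] < 3:
        raise LockTimeout()

def undo():
    calls["rolled_back"] += 1

attempts = run_with_retry(body, undo)
print(attempts, calls)
```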
How Queries are Processed by RDM SQL
Artificial Intelligence is no match for natural stupidity.
- Unknown
A query optimizer is the component of an SQL system that attempts to determine the best way to retrieve the
data that is needed to produce the results specified by a given select statement. The problem with the term
"query optimizer" is that it makes it sound like it can take a stupidly formulated query and turn it into one that
executes at optimal performance. The fact is, query optimizers are just not that smart. So, it is important that queries be reasonably formulated and the more you understand how the optimizer goes about its business the better
equipped you will be to do just that. That is what this section is all about. Here you will …
• learn how the RDM SQL optimizer works,
• learn the different ways in which data can be retrieved from a database,
• be given guidelines on how to construct fast-performing queries, and
• learn how to retrieve and interpret a query's access plan.
Overview of the Query Optimization Process
In SQL, queries are specified using the select statement, and many methods (or query execution plans) exist for
processing a query. The goal of the optimizer is to discover, among potentially many possible options, which plan
will execute in the shortest amount of time. Of course, the only way to guarantee a specific plan is optimal is to
execute every possibility and then choose the fastest one. As this clearly defeats the purpose of optimization,
other methods must be devised.
The query optimizer must resolve two interrelated issues: how it will access each table referenced in the query,
and in what order. To access requested rows in a table, the optimizer can choose from a variety of access methods. It determines the best execution plan by estimating the cost associated with each access method and by factoring in the constraints on these methods imposed by each possible access ordering. Note that the decisions
made by the optimizer are independent of the listed order of the tables in the from clause or the location of the
expressions in the where clause.
To illustrate, consider the declarations for the two tables defined below.
create table customer(
cust_id char(3) primary key,
company char(30) not null,
street char(30),
city char(17),
state char(2),
key cust_geo(state, city)
);
create table sales_order(
cust_id char(3) references customer,
ord_num smallint primary key,
ord_date date key,
amount double
);
RDM SQL will generate two indexes for each table. The customer table has an index on cust_id and a compound index for cust_geo on state and city. The sales_order table has an index on ord_num and
another on ord_date. With this in mind, consider the following query.
select company, ord_num, ord_date, amount from customer natural join sales_order
where state = "CO" and ord_date = date "2010-11-23";
Note that this is functionally identical to the query...
select company, ord_num, ord_date, amount from customer, sales_order
where customer.cust_id = sales_order.cust_id and
state = "CO" and ord_date = date "2010-11-23";
In this second form, two tables will be accessed: customer and sales_order. The first relational expression
in the where clause specifies the join predicate, which relates the two tables based on their declared foreign and
primary keys. RDM SQL implements foreign and primary key relationships using a bi-directional, direct access
method. This means that it is possible to quickly go from 1) the foreign key row to the referenced primary key row
and 2) from the primary key row to each row that references it. Note also that the state column in the customer table is the first column in the cust_geo key, and the ord_date column in the sales_order table is
the first column in the order_key key. Thus the optimizer has choices of which index to use. All possible
execution plans considered by the RDM Server query optimizer for this query are listed in the following table.
Table 15. Possible Execution Plans for Example Query
Plan  Description
1     Scan customer table (i.e., read all rows) to locate rows where state = "CO", then for each
      matching customer row, scan sales_order table to locate rows that match customer's
      cust_id and have ord_date = 2010-11-23.
2     Scan customer table to locate rows where state = "CO", then for each customer row, read
      each sales_order row through the primary to foreign key join, and return only those that have
      ord_date = 2010-11-23.
3     Use the cust_geo index to find the customer rows where state = "CO", then for each
      customer row, scan sales_order table to locate rows that match customer's cust_id and have
      ord_date = 2010-11-23.
4     Use the cust_geo index to find the customer rows where state = "CO", then for each
      customer row, read each sales_order row through the primary to foreign key join, and return only
      those that have ord_date = 2010-11-23.
5     Scan sales_order table to locate rows where ord_date = 2010-11-23, then for each
      sales_order row, scan customer table to locate rows that match sales_order's cust_id
      and have state = "CO".
6     Scan sales_order table to locate rows where ord_date = 2010-11-23, then for each
      sales_order row, read the customer row through the foreign to primary key join, and return
      only those that have state = "CO".
7     Use the order_ndx index to find the sales_order rows where ord_date = 2010-11-23,
      then for each sales_order row, scan customer table to locate rows that match
      sales_order's cust_id and have state = "CO".
8     Use the order_ndx index to find the sales_order rows where ord_date = 2010-11-23,
      then for each sales_order row, read the customer row through the foreign to primary key join,
      and return only those that have state = "CO".
Because the time (based on the number of disk accesses) required to scan an entire table is generally much
greater than the time needed to locate a row through an index, plans 4 and 8 seem the best. However, it is
unclear which of the two plans is optimal. In fact, both are probably good enough to obtain acceptable performance.
Additional information to help you make the best choice includes the number of rows in each table, the number of
customers from Colorado, and the number of orders for November 23, 2010. Let's assume that there are 1000
customers and 20,000 sales orders. Thus there is an average of 20 sales orders per customer. Of the 1000 customers, 25 are located in Colorado and 8 sales orders were made on 2010-11-23.
Now let's estimate the number of disk accesses for plan 4. Since all 25 Colorado customers are grouped
together in the index for cust_geo (state is the first column in the index), it is likely that no more than 3 index
reads are needed to locate them. Each of the 25 rows then needs to be read, and for each customer row its
related sales_order rows (an average of 20) need to be read and the ord_date checked. That gives a total
number of disk accesses of…
Plan 4 Cost Estimate = 3 + 25*20 = 503.
To estimate the number of disk accesses for plan 8, note that all 8 sales_order rows with an ord_date of 2010-11-23
can be located with 1 index read plus 8 row reads (one for each row). The associated customer row is then found through
the foreign to primary key join (1 read per sales_order row) and the state column value is checked. That gives a total number of disk
accesses of…
Plan 8 Cost Estimate = 1 + 8 + 8*1 = 17.
Clearly, plan 8 is the better choice.
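The arithmetic behind the two estimates can be replayed in a few lines. This is only a sketch that reuses the figures quoted in the text; the variable names are illustrative, and the last value is an assumed variant (charging one extra read for each of the 25 customer rows in plan 4), not a figure from this guide.

```python
# Figures quoted in the example above.
customers = 1_000
sales_orders = 20_000
colorado_customers = 25
orders_on_date = 8

# Average number of sales orders per customer (20 in the example).
orders_per_customer = sales_orders // customers

# Plan 4: about 3 index reads on cust_geo, then the related sales_order rows.
plan4_cost = 3 + colorado_customers * orders_per_customer

# Plan 8: 1 index read, 8 sales_order row reads, then 1 customer read per order.
plan8_cost = 1 + orders_on_date + orders_on_date * 1

# Assumed variant: also count one read per Colorado customer row in plan 4.
plan4_with_customer_reads = plan4_cost + colorado_customers

print(plan4_cost, plan8_cost, plan4_with_customer_reads)
```

Either way of counting plan 4 leaves it far more expensive than plan 8, so the conclusion does not depend on that detail.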
Note that plans 1 and 5 perform what is called a Cartesian or cross-product—for each row of the first table
accessed, all rows of the second table are retrieved. Thus given that the customer table contained 1000 rows
and the sales_order table contained 20,000 rows, the query would need to read a total of 20,000,000 rows!
Cross-products are extremely inefficient and will never be considered by the optimizer except when a necessary
join predicate has been omitted from the query. In our example, this would occur if the relational expression,
"customer.cust_id = sales_order.cust_id" was not specified. Necessary join predicates are often
erroneously omitted when four or more tables are listed in the from clause and/or when multi-column join predicates (for compound foreign and primary keys) are required. To avoid this, it is best to use an explicit join specification in the from clause, as was shown in the first select statement in the above example. When defining foreign and primary keys,
it is also important that the foreign and primary key columns be the only columns in the two tables that share a
name. The SQL standard defines a natural join as being based on the commonly named columns, not on the
declared foreign and primary keys (which is how it should define it).
The optimization process is depicted below in Figure 9. The green boxes represent internal data structures and
the blue boxes represent processes.
Figure 9 - RDM SQL Query Optimization Process
Using the information in the catalog, the select statement is parsed, validated, and represented in a set of easily
processed query description tables. These tables include a tree representation of the where clause expressions
(called the expression tree) and information about the tables, columns, and keys in the database.
The system then analyzes those tables, and constructs both the access rule table and the expression table. For
each table that is referenced in the from clause, the analysis process uses information in the catalog and other
data-related statistics such as the number of rows in each table, blocking factors, and user-specified column statistics. The access rule table contains a rule entry for each possible access method (for example, table scan or
index lookup) for each table. The expression table has one entry for each conditional expression specified in the
where clause. These tables drive the actual optimization process.
Finally, the optimizer determines the plan with the lowest total cost. An execution plan consists of a
series of steps, one for each table listed in the from clause, each specifying how the table at that step will
be accessed. The possible access rules that can be applied at that step are sorted by their cost so that the first
candidate rule is the cheapest. The optimizer's goal is to select one access rule for each step that minimizes the
total cost of the complete execution plan. As the optimizer iterates through the steps, the cost of the candidate
plan is updated. As soon as a candidate plan's cost exceeds the cost of the currently best complete plan, the candidate plan is abandoned at its current step and the next rule for that step is then tested. Conditional expressions
that are incorporated into the plan are deleted from the expression tree so that they are not redundantly
executed.
Cost-Based Optimization
The cost to determine the execution plan is the time it takes the optimizer to find the "optimal" plan. An execution
plan consists of n steps where n is the number of tables listed in the from clause. Each step of the plan specifies
the table to be accessed and the method to be used to access rows from that table. This cost grows factorially
with the number of tables listed in the from clause (n!). The performance impact starts to become noticeable for queries
that reference more than about 10-12 tables. This is due to the increasing number of combinations of access
orderings that must be considered (2 tables have 2 possible orderings, 3 have 6, 4 have 24, etc.). The cost to estimate each candidate plan also includes a linear factor of the number of access methods available at each step in
a plan from which the optimizer must choose. More access methods means the optimizer must do more work,
but the odds of finding a good plan improve.
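To get a feel for how quickly the number of access orderings grows, the counts quoted above can be reproduced with the factorial function. This is a small illustration only; the function name is hypothetical and the optimizer's pruning means it rarely visits every ordering.

```python
import math

def access_orderings(table_count: int) -> int:
    """Number of possible orderings of the tables in the from clause (n!)."""
    return math.factorial(table_count)

for n in (2, 3, 4, 10, 12):
    print(n, access_orderings(n))
```

At 10 tables there are already more than 3.6 million orderings, which is consistent with the point at which the text says the impact becomes noticeable.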
The cost to carry out an execution plan is the total number of file reads required to access the necessary database information. Because it is extremely difficult to accurately estimate the effects of caching and diverse database page sizes, physical disk read estimates are not possible. Hence, the system
estimates the number of logical file reads based on an analysis of the number of reads required to read a row for
each access method. There is also a CPU computation component, but that is much more difficult to estimate and
is controlled by a constant that is somewhat akin to Einstein's infamous cosmological constant. More on this later.
The statistics maintained for use by cost-based optimizers are used to: 1) guide the choice between alternative
access methods derived from the relational expressions specified in the where clause, 2) estimate the number of
output rows that result from each plan step, and 3) estimate the number of logical reads incurred by each possible access method.
The statistics used by the RDM cost-based optimizer include:
• Number of rows in a table
• Number of rows per page in a table (database I/O is performed a page at a time)
• Depth of an index's B-tree
• Number of keys per page in an index
• The range of possible values in a column
• The number of distinct values in a column
The last two stats can be specified by the user through distinct values and range clauses of the create domain
and create table statements or the set column stats statement.
Most SQL implementations adopt a cost-based approach because the quality of the execution plan that is
chosen is not all that sensitive to how a particular query is formulated. Another optimization approach is called
rule-based optimization, which accesses the tables in the order in which they are specified. This places a
greater responsibility on the query formulator to understand the best way for the query to be processed. This is not to suggest that cost-based optimization frees the query developer of having to put any thought
into how the query should be constructed (re: opening paragraph of this section). If that were so then this discussion would not be necessary. Nevertheless, cost-based optimizers will more reliably produce higher quality
query execution plans but no optimization strategy is perfect.
Restriction Factors
A restriction factor is associated with each relational expression that is specified in the where clause and is an
estimate of the ratio of number of rows for which the expression is true to the total number of candidate rows. A
candidate row is a row of the table being produced by the select statement before the where clause is evaluated.
Restriction factors are used by the optimizer to decide between alternative access methods. Restriction factors
are floating point values between 0 and 1 and are computed based on the kind of relational expression as follows.
Table 16. Restriction Factor Computations
Relational Expression             Restriction Factor Estimate
column = value                    1 / (number of distinct values of column)
column in (value[, value]…)       (number of values in list) * (1 / number of distinct values of column)
column >[=] value                 (max(column) - value) / (max(column) - min(column))
column <[=] value                 (value - min(column)) / (max(column) - min(column))
column between loval and hival    (hival - loval) / (max(column) - min(column))
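The estimates in Table 16 translate directly into code. The following is a minimal sketch, not RDM's implementation: the function names are hypothetical, and clamping the in-list estimate to 1 is an added assumption (the text only says that restriction factors lie between 0 and 1).

```python
def rf_equal(distinct_values: int) -> float:
    """column = value"""
    return 1.0 / distinct_values

def rf_in(list_length: int, distinct_values: int) -> float:
    """column in (value[, value]...); clamped to 1.0 as an assumption."""
    return min(1.0, list_length * rf_equal(distinct_values))

def rf_greater(value: float, col_min: float, col_max: float) -> float:
    """column >[=] value"""
    return (col_max - value) / (col_max - col_min)

def rf_less(value: float, col_min: float, col_max: float) -> float:
    """column <[=] value"""
    return (value - col_min) / (col_max - col_min)

def rf_between(loval: float, hival: float, col_min: float, col_max: float) -> float:
    """column between loval and hival"""
    return (hival - loval) / (col_max - col_min)

# Hypothetical column with 50 distinct values and a declared range of 0..100.
print(rf_equal(50), rf_in(3, 50), rf_greater(75, 0, 100), rf_between(20, 30, 0, 100))
```

The smaller the factor, the fewer rows the optimizer expects the expression to let through.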
Table Access Methods
RDM SQL provides a variety of methods for retrieving the rows in a table. Each of these access methods is
described below, including how cost is estimated for each method. The cost estimate equations use the above
statistics as represented by the following parameters.
Table 17. Table Access Method Cost Estimation Parameters
Parameter  Definition
P          The number of pages in the file in which the table's rows are stored.
D          The depth of the B-tree index.
C          The cardinality of the table being accessed (that is, the number of rows in the table).
Cf         The cardinality of the table containing the referenced foreign key.
Cp         The cardinality of the table containing the referenced primary key.
K          The maximum number of key values per index page.
R          The restriction factor, an estimate (between 0 and 1) of the fraction of the rows of the table that satisfy the conditional expression. The restriction factor for a conditional expression is the product of the restriction factors for each relational expression in the conditional expression's boolean product (i.e., rel_expr and rel_expr …)
Database access is performed by reading data and index file pages. A data file page contains at least one
(usually more) table row so each physical disk read will read that number of rows. An index file page contains
many keys per page depending on the size of the page and the size of the index values. RDM uses a B-tree structure for its indexes, which guarantees that each index page is at least half full. On average, index pages are
about 60-70% full. The depth of a B-tree indicates the number of index pages that must be read to locate a particular key value. Most B-trees have a depth of 4 to 7 levels. A hash index can usually locate a key value in 1
to 3 reads depending on the quality of the hash and the number of key values (rows).
Sequential Table Scan
Each row of a table is stored as a record in a file. A data file can contain the rows from one or more tables. The
most basic access method is to perform a sequential scan of a file where the table's rows are retrieved by sequentially reading through the file. Thus, the cost (measured in logical disk accesses) to perform a sequential scan of a
table is equal to the number of pages in the file:
Escan = Cost of sequential file scan = P
A sequential file scan is used in queries where the where clause contains no optimizable conditional expressions
that reference foreign key, primary key, or indexed columns.
Hashed Access Retrieval
Hashed access retrieval accesses an individual row based on the hashed key value. Typically more than 1 page
read is required, but usually fewer than 2 or 3 additional reads. Hence, the optimizer assumes that the cost of a
hashed retrieval is 2.
Ehash = Cost of hashed access retrieval = 2
Index Access Retrieval
The cost of an indexed access retrieval depends on the relational expression on which the access is based. The
cost estimate computations for each of the optimizable relational expressions are as follows.
• Equality Conditionals
Indexed access retrieval allows retrieval of an individual row or set of matching rows, based on the value of one
or more columns contained in a single index. These values can be specified in the query directly or through a join
predicate.
For a unique index, the cost to access a single row is equal to the depth of the index's B-tree (seldom more than 4)
+ 1 (to read the row from the data file). For a non-unique index, the cost is based on an estimate of the average
number of rows having the same index value, derived from the number of distinct column values. The fraction of
the table's rows that match the specified equality constraint is the restriction factor (R). Thus, the estimate of the
number of matching rows is equal to the cardinality of the table multiplied by the restriction factor, or:
number of matching rows = C * R
The cost estimate (in logical page reads) of an indexed access retrieval is equal to the number of index pages
that must be accessed plus the number of matching rows (1 logical page read per row), or:
Eeq = Cost of index access for column = value
= D + (C * R)/(.7 * K) + (C * R)
This assumes that each index page is an average of 70% full (D = depth of B-tree, K = maximum number of keys
per index page). Note that this formula works for both unique and non-unique indexes (for unique indexes, R =
1/C).
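As a worked illustration, the Eeq formula can be evaluated for a non-unique and a unique index. This is a sketch only: the function name and the statistics (D = 3, K = 100, and the cardinalities) are hypothetical values chosen for the example, not figures from RDM.

```python
def index_equality_cost(depth: int, cardinality: int, restriction: float,
                        keys_per_page: int, fill: float = 0.7) -> float:
    """Eeq = D + (C * R)/(.7 * K) + (C * R), in logical page reads."""
    matching_rows = cardinality * restriction
    index_leaf_pages = matching_rows / (fill * keys_per_page)
    return depth + index_leaf_pages + matching_rows

# Non-unique index: 20,000 rows with 2,500 distinct values gives R = 1/2500,
# so about 8 matching rows are expected.
non_unique = index_equality_cost(depth=3, cardinality=20_000,
                                 restriction=1 / 2_500, keys_per_page=100)

# Unique index: R = 1/C, so exactly one matching row is expected.
unique = index_equality_cost(depth=3, cardinality=1_000,
                             restriction=1 / 1_000, keys_per_page=100)

print(round(non_unique, 3), round(unique, 3))
```

For the unique case the result is only slightly above D + 1, which matches the description of a unique index lookup given earlier.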
• In Conditionals
When the in operator is used, the restriction factor is equal to the sum of the equality restriction factors for each of
the listed values. Thus, the cost is simply the sum of the costs of the individual values.
Elist = Cost of index access for column in (v1, v2, ..., vn)
= SUM(cost(column = vi)) for all i: 1..n
• Inequality Conditionals
Indexed scans use an index to access the rows satisfying an inequality relational expression involving the major
column in the index. The estimate of the cost of an index scan is calculated exactly the same as the indexed
access method. The restriction factor is calculated as given in Table 16.
Eineq = Cost of index access for inequality relational expressions
= D + (C * R)/(.7 * K) + (C * R)
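• Like Conditionals
For a like pattern that begins with a literal prefix, the summary table of cost formulas lists an estimate of the
same form as the other indexed conditionals.
Elike = Cost of index access for like with prefix
= D + (C * R)/(.7 * K) + (C * R)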
Joins Involving Primary and Foreign Keys
Foreign and primary key relationships are implemented in RDM by internally maintaining rowid pointers that are
used to optimally access the related rows and to easily ensure that referential integrity is enforced. A one-to-many relationship is created between the referenced primary key table and the referencing foreign key table.
Thus, only 1 read is needed to access the related row in the primary key table from the referencing row in the foreign key table. This is summarized below.
Efp = Cost of a foreign key to primary key access = 1
The number of reads needed to access the foreign key table rows that reference a particular primary key table
row is computed by dividing the cardinality of the foreign key table by the cardinality of the primary key table, as follows.
Epf = Cost of a primary key to foreign key access = Cf / Cp
One additional optimization occurs when a foreign key table contains a foreign_key_column = value condition.
Since the related primary key is indexed and the related foreign key table rows can be directly accessed from the
referenced primary key row the foreign key table rows can quickly be found through an index access to the primary key row and then directly accessing each of the referencing foreign key table rows. The cost for this is summarized below.
Epk = Eeq + Epf
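The three join-related estimates can be checked against the customer and sales_order example used earlier (1000 customers, 20,000 sales orders). This is a sketch with hypothetical function names; the value passed in for Eeq is an assumed figure for a unique primary key index lookup, not one given in the text.

```python
def fk_to_pk_cost() -> float:
    """Efp: one read to reach the referenced primary key row."""
    return 1.0

def pk_to_fk_cost(cf: int, cp: int) -> float:
    """Epf = Cf / Cp: average number of referencing rows per primary key row."""
    return cf / cp

def fk_through_pk_cost(eeq: float, cf: int, cp: int) -> float:
    """Epk = Eeq + Epf: index access to the primary key row, then its referencing rows."""
    return eeq + pk_to_fk_cost(cf, cp)

CF = 20_000  # rows in sales_order (the foreign key table)
CP = 1_000   # rows in customer (the primary key table)

print(fk_to_pk_cost(), pk_to_fk_cost(CF, CP), fk_through_pk_cost(4.0, CF, CP))
```

The Epf value of 20 is the same average of 20 sales orders per customer used in the plan 4 estimate.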
All of these formulas are summarized below in Table 18.
Table 18. Table Access Method Cost Estimation Formulas
Access Method                                 Cost Estimate Computation
sequential file scan                          Escan = P
direct access                                 Edirect = 1
hashed access                                 Ehash = 2
index access for column = value               Eeq = D + (C * R)/(.7 * K) + (C * R)
index access for column in (v1, v2, ..., vn)  Elist = SUM(cost(column = vi)) for all i: 1..n
index access for inequalities                 Eineq = D + (C * R)/(.7 * K) + (C * R)
index access for like with prefix             Elike = D + ((C * R)/(.7 * K)) + (C * R)
foreign key to primary key                    Efp = 1
primary key to foreign key                    Epf = Cf / Cp
to foreign key through primary key            Epk = Eeq + Epf
Optimizable Expressions
The RDM SQL query optimizer is able to optimize a restricted set of relational expressions that are specified in
the where clause of a select statement. Simple expressions involving a comparison between a simple column
and a literal constant value (or parameter marker or stored procedure argument) can be analyzed by the optimizer to determine if any access methods exist that can retrieve rows satisfying that particular conditional.
Expressions for potential use by the optimizer in an execution plan are referred to as optimizable. Table 19 summarizes the optimizable relational expressions.
Table 19. Optimizable Relational Expressions
1  KeyCol1 = constant [and KeyCol2 = constant]...
2  FkCol1 = constant [and FkCol2 = constant]...
3  FkCol1 = PkCol1 [and FkCol2 = PkCol2]...
4  KeyCol1 = Cola [and KeyCol2 = Colb]...
5  KeyCol1 in (constant[, constant]...)
6  KeyCol1 {> | >= | < | <=} constant
7  KeyCol1 {> | >=} constant [and KeyCol1 {< | <=} constant]
8  KeyCol1 between constant and constant
9  KeyCol1 like "pattern"
The constant is either a literal, a parameter marker ('?'), or a stored procedure argument (if the statement is contained in a stored procedure declaration). The KeyColi's refer to the i'th declared column in a given key. The
FkColi's (PkColi's) refer to the i'th declared column in a foreign (primary) key. An equality comparison must be
provided for all multi-column foreign and primary key columns in order for the optimizer to recognize a join predicate. Cola, Colb, etc., are columns from the same table that match (in type and length) KeyCol1, KeyCol2, etc.,
respectively.
These expressions are all written in the following form: ColumnName relop expression. Note that expressions of
the form: expression relop ColumnName are recognized and transformed by the optimizer so that the ColumnName is always listed on the left hand side. This transformation may require modification of the relational
operator. For example,
select … from … where 1000 > colname
is changed to
select … from … where colname < 1000
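The transformation amounts to swapping the operands and mirroring the operator. The sketch below illustrates the idea on simple (left, relop, right) triples; the function, the operator table, and the way column names are recognized are all assumptions made for the example, not RDM internals.

```python
# Mirror image of each relational operator when its operands are swapped.
MIRRORED = {"=": "=", "<>": "<>", "<": ">", "<=": ">=", ">": "<", ">=": "<="}

def normalize(left: str, relop: str, right: str, column_names: set) -> tuple:
    """Return (column, relop, expression) with the column name on the left."""
    if left not in column_names and right in column_names:
        return right, MIRRORED[relop], left
    return left, relop, right

print(normalize("1000", ">", "colname", {"colname"}))
print(normalize("colname", "<", "1000", {"colname"}))
```

Equality and inequality (=, <>) are their own mirror images, so only the four ordering operators actually change.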
Depending on how the where clause is organized, an expression may or may not be optimizable. Conditional
expressions composed in conjunctive normal form are optimizable. In conjunctive normal form, the where clause
is constructed as follows:
C1 and C2 and ... Cn
Each Ci is a conditional expression consisting of either a single relational comparison or multiple or'ed relational comparisons. Only those
Ci's that consist of a single optimizable relational expression are optimizable. In other words, relational expressions that are sub-branches of an or'ed conditional expression are not optimizable. The best possible optimization results are obtained when the desired conditions use and. The optimizer can recognize a sequence of
or'ed equality comparisons referencing the same KeyCol1 and will convert it into an in comparison. For example,
the optimizer will convert…
select … from book
where bookid = "austen02" or bookid = "cbronte01" or bookid = "dickens07";
into…
select … from book
where bookid in ("austen02", "cbronte01", "dickens07");
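The rewrite of or'ed equality comparisons can be sketched as follows. The representation of a comparison as a (column, literal) pair and the function name are hypothetical; the sketch simply declines (returns None) when the comparisons reference more than one column.

```python
def or_equalities_to_in(comparisons):
    """Rewrite or'ed (column, literal) equality comparisons as one in expression.

    Returns None when the comparisons do not all reference the same column.
    """
    columns = {column for column, _ in comparisons}
    if len(columns) != 1:
        return None
    column = columns.pop()
    value_list = ", ".join(literal for _, literal in comparisons)
    return column + " in (" + value_list + ")"

ored = [("bookid", '"austen02"'), ("bookid", '"cbronte01"'), ("bookid", '"dickens07"')]
print(or_equalities_to_in(ored))
```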
Access Plan Determination
Selecting From Alternative Access Methods
Consider the following query from the NSF database.
Selecting the Access Order
When a query references more than one table, the optimization process becomes more complex, because the
optimizer must choose between different methods to access each table, and the order in which to access them.
Many access methods rely only on the values specified in the conditional expression for the needed data.
However, some access methods (those associated with join predicates) require that other tables have already
been accessed. This places constraints on the possible orderings. Access methods available at the first step in
the plan are those that do not depend on any other tables.
For possible access methods at the first plan step, the optimizer chooses the method with the lowest cost from a
list of possible methods sorted by cost. The accessed table is then marked as bound. The access methods available at the next step in the plan include the choices from the first step for the other tables, plus those methods
that depend on the table bound by the first step. These too are ordered by cost. The optimizer continues in this
manner until methods have been chosen for all steps in the plan. It then selects the method with the next highest
cost and recursively evaluates a new plan. At any point in the process, if the cost of the plan being evaluated exceeds the
total cost of the current best complete plan, that plan is abandoned and another is chosen. A flowchart of the optimizer algorithm is given in Figure 10.
Figure 10 - Optimizer Algorithm Flowchart
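The search described above is a depth-first search with pruning against the best complete plan found so far. The following is a simplified sketch of that idea, not RDM's actual algorithm: the AccessRule structure, the way step costs are combined (rows flowing into a step multiplied by that step's per-execution cost), and the row and cost figures (loosely based on the customer and sales_order example, with table scans costed at one read per row) are all assumptions.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AccessRule:
    name: str
    table: str
    requires: frozenset  # tables that must already be bound
    cost: float          # logical reads per execution of this step
    rows: float          # rows produced per execution of this step

def best_plan(tables, rules):
    best = {"cost": float("inf"), "steps": None}

    def extend(bound, steps, cost_so_far, rows_in):
        if len(bound) == len(tables):
            if cost_so_far < best["cost"]:
                best["cost"], best["steps"] = cost_so_far, list(steps)
            return
        usable = [r for r in rules if r.table not in bound and r.requires <= bound]
        for rule in sorted(usable, key=lambda r: r.cost):  # cheapest rule first
            new_cost = cost_so_far + rows_in * rule.cost
            if new_cost >= best["cost"]:
                continue  # abandon this candidate; try the next rule at this step
            steps.append(rule)
            extend(bound | {rule.table}, steps, new_cost, rows_in * rule.rows)
            steps.pop()

    extend(frozenset(), [], 0.0, 1.0)
    return best["cost"], best["steps"]

none = frozenset()
rules = [
    AccessRule("scan customer", "customer", none, 1000, 25),
    AccessRule("cust_geo index", "customer", none, 3, 25),
    AccessRule("fk to pk join", "customer", frozenset({"sales_order"}), 1, 1),
    AccessRule("scan sales_order", "sales_order", none, 20000, 8),
    AccessRule("ord_date index", "sales_order", none, 9, 8),
    AccessRule("pk to fk join", "sales_order", frozenset({"customer"}), 20, 20),
]
cost, steps = best_plan(["customer", "sales_order"], rules)
print(cost, [step.name for step in steps])
```

With these assumed figures the search settles on the ord_date index followed by the foreign to primary key join, at a cost of 17, the same figure estimated for plan 8 earlier.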
Sorting and Grouping Operations
For select statements that include a group by or order by specification, the SQL optimizer performs two separate optimization passes. The first pass restricts the choice of usable access methods to only those that produce
or maintain the specified ordering. For example, an index scan retrieves its results in the order specified in the
key declaration. If the results match the specified ordering, they are included as a usable access method. This
optimization pass is fast because, typically, very few plans produce the desired ordering without performing an
external sort of the result set.
If a plan is produced by the first pass, it is saved (along with its cost estimate), and a second optimization is performed without the ordering restriction. An estimate of the cost required to sort the result set, based on the optimizer's estimate of the result set's size, is added to the cost of the plan produced by the unrestricted pass. From
the two plans, the optimizer will choose the one with the lowest cost.
The estimate of the sort cost is based on the optimizer's cardinality estimate, the length of the sort key, and the
sort index page size. The optimizer will calculate the number of I/Os as two times the number of index pages to
store the sort index (one pass to create the page and another to read each page in order) and add the number of
result rows.
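The sort cost estimate can be sketched as below. The text does not say how the number of sort index pages is derived, so the page count here (keys per page taken as page size divided by sort key length, ignoring any per-page or per-key overhead) is an assumption, as are the function name and the sample figures.

```python
import math

def sort_cost(result_rows: int, sort_key_length: int, page_size: int) -> int:
    """Estimated I/Os: 2 * (sort index pages) + (number of result rows)."""
    # Assumption: a page holds page_size // sort_key_length keys, no overhead.
    keys_per_page = max(1, page_size // sort_key_length)
    index_pages = math.ceil(result_rows / keys_per_page)
    return 2 * index_pages + result_rows

# Hypothetical result set: 10,000 rows, 20-byte sort key, 4096-byte pages.
print(sort_cost(10_000, 20, 4096))
```

The result-row term dominates, which is why sorting a large result set is expensive no matter how compact the sort key is.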
Note that if both the group by and order by clauses are specified, only the group by ordering can be satisfied by
existing indexes and joins. A separate sort of the result set will always be required for the order by clause. If there
is no index to satisfy the specified group by, then two sort passes will be needed.
Outer Join Processing
The optimizer processes outer joins by forcing all outer joins into left outer joins (right outer joins are converted
into left outer joins by simply reversing the order). It then will disable all access paths that require the right hand
table to be accessed before the left hand table. If there is no access path (that is, through an index or declared foreign key) from the left hand table to the right hand table, the optimizer will simply perform an inner join (rather
than doing a potentially very expensive cross-product).
Returning the Number of Rows in a Table
The row counts for each table in a database are maintained by the RDM runtime. SQL recognizes queries of the
following form:
select count(*) from tablename
and generates a special execution plan that returns the current row count value for the specified table. No table
or index scan is needed. However, if the query is specified as shown below, the optimizer performs a scan of the
table or index (if columnname is indexed) and counts the rows.
select count(columnname) from tablename
Thus, if you need the row count of the entire table, use the first form and not the second.
Query Construction Guidelines
Some systems perform a great deal of work to convert poorly written queries into well written queries before submitting the query to the optimizer. This is particularly useful in systems where ad hoc querying (such as in enterprise environments) is performed by non-technical people. SQL is less user friendly, so often this work is
performed by front-end tools. RDM SQL does not perform complex query transformation analysis (it will do simple things such as converting expressions like "10 = quantity" into "quantity = 10"). Therefore, a thorough understanding of the information provided here will assist you in formulating queries that can be optimized efficiently by
RDM Server SQL. Guidelines for writing efficient RDM Server SQL queries are listed below.
• Formulate where clauses in conjunctive normal form. Avoid using or.
• Formulate conditional expressions according to the forms listed in Table 19. Use literal constants as often as possible. The compile-time for most queries is insignificant compared to their execution time. Thus, dynamically constructing and compiling queries containing literal constants (as opposed to parameter markers or stored procedures) will allow the optimizer to make more intelligent access choices.
• Make sure that the only columns that have the same name in tables that are related through foreign and primary keys are the foreign and primary key columns themselves. Then use the natural join clause when formulating queries that join the two tables.
• Include more (not fewer) conditional expressions in the where clause, and include redundant expressions. For example, foreign and primary keys exist between tables A and B, B and C, and A and C. Even though it is not strictly necessary (mathematically) to include a join predicate between A and C, doing so provides the optimizer with additional access path choices. Also, assuming that join predicates exist and a simple conditional is specified for the primary key, you can include the same conditional on the foreign key as well. Look at the following query:
select ... from A,B where A.pkey = B.fkey and A.pkey = 1000
You can improve this query by adding the conditional shown in an equivalent version below.
select ... from A,B where A.pkey = B.fkey and A.pkey = 1000 and B.fkey = 1000
• If you are not using SQL's extended join syntax in the from clause of your select statements, make certain join predicates exist for all pairs of referenced tables that are related through foreign and primary keys.
• Avoid sorting queries with large result sets in which no index is available to produce the desired ordering. If you have heavy report writing requirements, consider using the replication or mirroring feature to maintain a redundant, read-only copy of the database on a separate TFS and run your reports from there. This will allow the primary system to provide the best response to update requests without blocking or being blocked by a high level of query activity.
• In defining your DDL, explicitly declare the foreign and primary key relationships. You can still do joins between tables even when the relationships are not declared but optimum join performance is guaranteed when you declare those relationships in your create table DDL statements.
• Do not include conditional expressions in the having clause that belong in the where clause. Conditional expressions contained in the having clause should always include an aggregate function reference. Note that expressions in the having clause are not taken into consideration by the optimizer.
• Use the distinct values and range clauses in either the create table or the set column stats statements to provide more statistical information to the optimizer. The distinct values clause is particularly important for equality conditions. Do not declare a key on a column that has only a few distinct values. For example, never declare a key on a column that contains a person's gender. If no distinct values clause is specified, the optimizer will use the current number of rows in the table. The range clause is used with inequality conditions.
• Only declare keys that you actually need to get the needed performance in your embedded application. More keys increase the time to insert new rows in a table and consume more storage.
Controlling Optimizer with a User-Specified Restriction Factor
The restriction factor is the fraction of a table between 0 and 1 that is returned as a result of the application of a
specific where condition. The lower the value, the greater the likelihood that the access method associated with
that condition will be chosen by the optimizer. This factor is computed by the optimizer based on the type of relational expression and the range values for the column, if specified. Note that you can override the optimizer's estimate by using a non-standard RDM SQL feature. A relational expression, relexpr, can be written as "(relexpr, factor)",
where factor is a decimal fraction between 0 and 1 indicating the fraction of the table's rows that satisfy relexpr.
For example, in the following query from the NSF database, where the optimizer would normally access the data
using the awardno key, the specified restriction factors will actually cause the optimizer to use the award_date key.
select * from award
where (awardno = 70246, 1.0)
and
(award_date > date "2002-07-01", 0.00001);
When the statistics used by the optimizer are not accurate enough for a given query and the result is unsatisfactory, you can use this feature to override the stats-based restriction factor and substitute your own value. However, a query written this way no longer responds to future changes in the data distribution statistics.
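The effect of the restriction factors in the query above can be pictured with a toy model in C. This is only a sketch under a stated assumption: the candidate with the lowest factor wins, and nothing else is weighed. The real optimizer considers more than this, and the names CONDITION and pick_access are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* One candidate where-condition with its restriction factor (0..1). */
typedef struct {
    const char *key_name;  /* key that would be used to evaluate it */
    double      factor;    /* fraction of the table expected to qualify */
} CONDITION;

/* Toy model only: return the index of the condition with the lowest
   restriction factor. A real optimizer weighs many more costs. */
size_t pick_access(const CONDITION *conds, size_t count)
{
    size_t best = 0, i;

    assert(conds != NULL && count > 0);
    for (i = 1; i < count; ++i) {
        assert(conds[i].factor >= 0.0 && conds[i].factor <= 1.0);
        if (conds[i].factor < conds[best].factor)
            best = i;
    }
    return best;
}
```

With the factors from the example query (1.0 for the awardno condition and 0.00001 for the award_date condition), this model selects the award_date condition.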
Using SQL in an Application Program
Some people like my advice so much that
they frame it upon the wall instead of using it.
- Gordon R. Dickson
The previous sections have described how to use SQL as a database language. While some programming considerations necessarily were involved with the operational aspects of the SQL language itself, how to actually
use RDM SQL from an application program is the subject of this section.
There are several different application programming interfaces (APIs) available for use with RDM SQL. The native RDM SQL API is designed for use with C application programs. Raima also provides an API that conforms to Microsoft's ODBC (Open Data Base Connectivity) API specification, which is also designed for use with C application programs. Programs written in Java can access RDM SQL through the JDBC (Java Data Base Connectivity) API that is also provided by Raima. Both the ODBC and JDBC APIs have been implemented using the RDM native API, so those of you who are familiar with ODBC or JDBC will see close similarities with them.
If you are an experienced ODBC programmer, you will have little difficulty in learning how to use the native API. However, while there are many similarities, there are also some significant differences, so read this section carefully and do not assume that the native API does something a certain way just because ODBC does. In fact, we've designed the native API to be simpler and easier to use than ODBC.
Native SQL API Basics
A complete, alphabetical list of the functions provided in the RDM SQL API is given below.
Table 1. RDM SQL API Functions
Function | Description
rsqlAllocConn | Allocate a new connection handle
rsqlAllocStmt | Allocate a new statement handle
rsqlBindNamedParam | Bind a data value to a named parameter marker
rsqlBindParam | Bind a data value to a parameter marker
rsqlCancelRow | Cancel (discard) column value changes to current row
rsqlCloseDB | Close a database
rsqlCloseDBAll | Close all databases that are open on a connection
rsqlCloseStmt | Close the open select statement cursor
rsqlDropDB | Drop (delete) a database
rsqlExecDirect | Prepare and execute a SQL statement
rsqlExecProc | Directly execute a pre-compiled SQL stored procedure
rsqlExecute | Execute a compiled SQL statement
rsqlFetch | Fetch the next row of the select statement result set
rsqlFreeConn | Free a connection handle
rsqlFreeStmt | Free a statement handle
rsqlGetAutoCommit | Get the connection handle's current auto commit status
rsqlGetColDescr | Get description information for a select statement result column
rsqlGetConnHandle | Get connection handle associated with specified statement handle
rsqlGetCursorName | Get the cursor name associated with the specified statement handle
rsqlGetData | Get data value for one select statement result column
rsqlGetDateFormat | Get the current date format setting
rsqlGetDateSeparator | Get the current date separator character
rsqlGetDBNames | Get a list of the names of the currently opened databases
rsqlGetDeferBlobMode | Get the current deferred blob reading mode setting
rsqlGetErrorInfo | Get the message associated with the current error code
rsqlGetErrorMsg | Get the message associated with a specific error code
rsqlGetGenCFiles | Get the connection handle's "generate C files" mode
rsqlGetNumParams | Get the number of parameter markers in the compiled statement
rsqlGetNumResultCols | Get the number of result columns in the compiled select statement
rsqlGetParamDescr | Get description information for a SQL statement parameter marker
rsqlGetReadOnlyTrmode | Get the current read only transaction mode
rsqlGetRowCount | Get the number of rows affected by the executed statement
rsqlGetSelectType | Get the statement handle's select statement type
rsqlGetStmtState | Get the statement handle's statement state
rsqlGetStmtString | Return the SQL statement string for a statement handle
rsqlGetStmtType | Get the statement type of the prepared statement
rsqlGetTableName | Get result column's table name
rsqlGetTimeout | Get a connection's lock request timeout value
rsqlInitDB | Initialize a database
rsqlLockTables | Issue an explicit lock request for one or more database tables
rsqlMoreResults | Execute next statement in the currently executing stored procedure
rsqlOpenCat | Open a database through its compiled catalog module
rsqlOpenDB | Open a database by name
rsqlPackDate | Pack a CAL_DATE into a binary DATE_VAL
rsqlPackTime | Pack a CAL_TIME into a binary TIME_VAL
rsqlPackTimestamp | Pack a CAL_TIMESTAMP into a binary TIMESTAMP_VAL
rsqlParamData | Check for and initialize rsqlPutData for next data-at-exec parameter
rsqlPrepare | Compile an SQL statement
rsqlPutData | Put a data value for a data-at-exec blob parameter
rsqlRegisterProc | Register a compiled stored procedure
rsqlRegisterUDFs | Register C-based user-defined functions
rsqlRegisterVirtualTables | Register C-based virtual tables
rsqlSetAutoCommit | Set the auto commit status for the specified connection
rsqlSetCursorName | Set the cursor name for the specified statement handle
rsqlSetDateFormat | Set the date constant format for the connection
rsqlSetDateSeparator | Set the current date constant separator character for the connection
rsqlSetDeferBlobMode | Set a statement's deferred reading mode for blob data
rsqlSetErrorCallback | Set an error callback user function
rsqlSetGenCFiles | Set the connection handle's "generate C files" mode
rsqlSetReadOnlyTrmode | Set the current read only transaction mode
rsqlSetTimeout | Set lock wait timeout in seconds for the connection
rsqlShowPlan | Show a query's execution plan as a result set
rsqlTFSInit | Initialize RDM SQL TFST or TFSS operation
rsqlTFSTerm | Terminate RDM SQL TFST or TFSS operation
rsqlTransCommit | Commit a transaction
rsqlTransEndReadOnly | End a read only transaction
rsqlTransRelease | Release a transaction savepoint
rsqlTransRollback | Rollback to transaction savepoint or start
rsqlTransSavepoint | Mark a transaction savepoint
rsqlTransStart | Start a transaction
rsqlTransStartReadOnly | Start a read only transaction
rsqlTransStatus | Return the current transaction state for the specified connection
rsqlUnlockTable | Free a read lock on a database table
rsqlUnlockTableAll | Unlock all read locked tables
rsqlUnpackDate | Unpack a binary DATE_VAL into a CAL_DATE structure
rsqlUnpackTime | Unpack a binary TIME_VAL into a CAL_TIME structure
rsqlUnpackTimestamp | Unpack a binary TIMESTAMP_VAL into a CAL_TIMESTAMP structure
rsqlUpdateCol | Update a column value of current row
rsqlUpdateRow | Store the updated column values for the current row
Comparing the ODBC API with the Native RSQL API
The following table maps the ODBC API functions to the RSQL API functions. Not all ODBC functions have an equivalent RSQL API function. Some (e.g., SQLTables, SQLColumns) are implemented in the RDM ODBC layer as select statements on built-in virtual system catalog tables, which are described later in this section. Also note that those functions that do have an RSQL API equivalent do not have the same function arguments. However, the basic operational approach (e.g., function calling sequence) that is used in an ODBC application is also needed in an RSQL application. ODBC API functions that are not listed do not have an RSQL API counterpart.
Table 2. ODBC to RDM SQL API Function Mapping
ODBC API Function | RSQL Function | Comments
SQLAllocHandle | rsqlAllocConn, rsqlAllocStmt | Allocation of connection and statement handles is made through separate functions. There is no environment handle.
SQLBindCol | n/a | Column result values are not bound but are returned by rsqlFetch or rsqlGetData.
SQLBindParameter | rsqlBindParam |
SQLCancel | n/a | Call rsqlCloseStmt to cancel statement processing.
SQLCloseCursor | rsqlCloseStmt |
SQLColAttribute | rsqlGetColDescr |
SQLColumns | n/a | Database meta-data information is available by executing select statements on the appropriate syscat virtual tables.
SQLConnect | n/a | Connections are initiated when rsqlAllocConn is called. Databases are opened through calls to rsqlOpenDB or rsqlOpenCat.
SQLDescribeCol | rsqlGetColDescr |
SQLDescribeParam | rsqlGetParamDescr |
SQLDescribeStmt | rsqlGetStmtDescr | SQLDescribeStmt is a Raima Inc. extension.
SQLDisconnect | n/a | Connections are closed when rsqlFreeConn is called.
SQLEndTran | rsqlTransCommit, rsqlTransRollback | We believe that separate calls represent a better API design than a single call with a control variable.
SQLExecDirect | rsqlExecDirect |
SQLExecute | rsqlExecute |
SQLExtendedTran | rsqlTransStart, rsqlTransSavepoint, rsqlTransRelease, rsqlTransCommit, rsqlTransRollback |
SQLFetch | rsqlFetch | Note that rsqlFetch returns the column result values; there are no bound columns.
SQLForeignKeys | n/a | Database meta-data information is available by executing select statements on the appropriate syscat virtual tables.
SQLFreeHandle | rsqlFreeConn, rsqlFreeStmt |
SQLGetConnectAttr | rsqlGetAutoCommit, rsqlGetDateFormat, rsqlGetDateSeparator, rsqlGetDeferBlobMode, rsqlGetReadOnlyTrmode | Not all ODBC connection attributes have an RDM equivalent. Not all RDM connection attributes have an ODBC equivalent.
SQLGetCursorName | rsqlGetCursorName |
SQLGetData | rsqlGetData |
SQLMoreResults | rsqlMoreResults |
SQLNumParams | rsqlGetNumParams |
SQLNumResultCols | rsqlGetNumResultCols |
SQLPrepare | rsqlPrepare |
SQLPrimaryKeys | n/a | Database meta-data information is available by executing select statements on the appropriate syscat virtual tables.
SQLProcedures | n/a | Database meta-data information is available by executing select statements on the appropriate syscat virtual tables.
SQLPutData | rsqlPutData |
SQLRowCount | rsqlGetRowCount |
SQLSetConnectAttr | rsqlSetAutoCommit, rsqlSetDateFormat, rsqlSetDateSeparator, rsqlSetDeferBlobMode, rsqlSetReadOnlyTrmode | Not all ODBC connection attributes have an RDM equivalent. Not all RDM connection attributes have an ODBC equivalent.
SQLSetCursorName | rsqlSetCursorName |
SQLSetError | rsqlSetErrorCallback | SQLSetError is a Raima Inc. extension.
SQLSpecialColumns | n/a | Database meta-data information is available by executing select statements on the appropriate syscat virtual tables.
SQLTables | n/a | Database meta-data information is available by executing select statements on the appropriate syscat virtual tables.
SQLTransactStatus | rsqlTransStatus |
The advantage of using the native API instead of ODBC is that it is simpler and more efficient with a smaller footprint. However, ODBC is available and can certainly be used if DBMS independence and/or use of a standard
SQL API is needed.
Connection Handles
Almost all of these functions require the use of either a connection handle or a statement handle. A connection
provides single-threaded access to the RDM SQL database engine. A connection handle is used to keep all of
the data used in all of the SQL calls for that connection thread safe. This means that each connection from a
given RDM SQL program can be executed in its own thread. A single connection typically connects to one or
more databases that are controlled by a single RDM Transactional File Server (TFS). However, a single connection can open a union of two or more instances of a database schema that are each running under a separate
TFS.
Statement Handles
A statement handle keeps track of all of the data involved in the compilation and execution of a single SQL statement. Each statement handle is associated with a single connection but a single connection can have multiple
statement handles.
The functions listed in Table 3 are those that deal with system-wide issues and, therefore, require neither a connection nor a statement handle.
Table 3. RDM SQL API Functions that Do Not Need a Handle
Usage | Function | Description
Startup | rsqlTFSInit | Initialize RDM SQL TFST or TFSS operation
Status | rsqlGetErrorMsg | Get error message for a specific error code
Shutdown | rsqlTFSTerm | Terminate RDM SQL TFST or TFSS operation
The functions that use a connection handle are listed below in Table 4 along with an indication as to how each
function is used.
Table 4. RDM SQL API Functions that Use a Connection Handle
Usage groups, in table order: Startup, Status, Operation, Shutdown.
Function | Description
rsqlAllocConn | Allocate a connection handle and open the connection
rsqlAllocStmt | Allocate a statement handle
rsqlDropDB | Drop (delete) a database
rsqlOpenDB | Open one or more databases by name
rsqlOpenCat | Open a database through the provided catalog
rsqlRegisterProc | Register a compiled stored procedure
rsqlRegisterUDFs | Register user-defined functions table
rsqlRegisterVirtualTables | Register virtual tables in databases to be opened
rsqlSetAutoCommit | Set auto-commit mode
rsqlGetTimeout | Get a connection's lock request timeout value
rsqlSetTimeout | Set a connection's lock request timeout value
rsqlSetDateFormat | Set the date constant format
rsqlSetDateSeparator | Set the current date constant separator character
rsqlSetReadOnlyTrmode | Set the current read only transaction mode
rsqlGetDBNames | Get a list of the names of currently opened databases
rsqlGetAutoCommit | Get the current auto-commit mode setting
rsqlTransStatus | Return the transaction state for the specified connection
rsqlGetReadOnlyTrmode | Get the current read only transaction mode
rsqlGetDateFormat | Get the current date format setting
rsqlGetDateSeparator | Get the current date separator character
rsqlLockTables | Issue lock request for one or more database tables
rsqlUnlockTable | Free a read lock on a database table
rsqlUnlockTableAll | Unlock all read locked tables
rsqlTransStart | Start a transaction
rsqlTransSavepoint | Mark a transaction savepoint
rsqlTransRelease | Release a transaction savepoint
rsqlTransRollback | Rollback to transaction savepoint or start
rsqlTransCommit | Commit a transaction
rsqlTransStartReadOnly | Start a read only transaction
rsqlTransEndReadOnly | End a read only transaction
rsqlGetErrorInfo | Get connection related error info
rsqlCloseDB | Close a database
rsqlCloseDBAll | Close all open databases
rsqlFreeConn | Free the connection handle
The functions that use a statement handle are shown below in Table 5 together with an indication of how each
function is used.
Table 5. RDM SQL API Functions that Use a Statement Handle
Usage groups, in table order: Setup, Compile, Execute, Errors, Shutdown.
Function | Description
rsqlAllocStmt | Allocate a statement handle
rsqlGetDeferBlobMode | Get the current deferred blob reading mode setting
rsqlSetDeferBlobMode | Set the current deferred blob reading mode setting
rsqlInitDB | Initialize a database
rsqlPrepare | Compile an RDM SQL statement
rsqlGetColDescr | Get result set column description
rsqlBindNamedParam | Bind value variables to a named parameter marker
rsqlBindParam | Bind value variables to a parameter marker
rsqlGetParamDescr | Get description of parameter
rsqlGetCursorName | Get statement's cursor name
rsqlSetCursorName | Set statement's cursor name
rsqlGetNumParams | Get number of parameter markers in statement
rsqlGetNumResultCols | Get number of select statement result columns
rsqlGetTableName | Get result column's table name
rsqlGetStmtString | Return the SQL statement string for a statement handle
rsqlGetStmtState | Get the statement handle's statement state
rsqlGetStmtType | Get statement type
rsqlShowPlan | Show a query's execution plan as a result set
rsqlCancelRow | Cancel (discard) column value changes to current row
rsqlExecute | Execute compiled SQL statement
rsqlExecDirect | Compile and execute SQL statement
rsqlExecProc | Execute stored procedure
rsqlFetch | Fetch next row from result set
rsqlGetData | Get data value for one select statement result column
rsqlParamData | Set up next data-at-exec parameter
rsqlPutData | Put a data value for a data-at-exec blob parameter
rsqlGetRowCount | Get # of rows affected by just executed statement
rsqlMoreResults | Execute next statement in stored procedure
rsqlCloseStmt | Close select statement cursor
rsqlUpdateCol | Update a column value of current row
rsqlUpdateRow | Store the updated column values for the current row
rsqlGetErrorInfo | Get statement's error information
rsqlFreeStmt | Free statement handle
Header Files
There is one standard header file that must be #include'd in each module of your application that calls an RDM SQL API function: rsql.h. It is contained in the standard RDM include directory. This file will itself include all other RDM header files that are needed. Of particular importance is header file rsqltypes.h, which includes all of the type and macro definitions used by the native RSQL API.
API Function Parameters
As noted above, most functions take either a connection handle or a statement handle. Other needed arguments
are specified in the reference manual entries for each function. A connection handle is declared as type HCONN.
A statement handle is declared as type HSTMT. The typedef for each is void * and is declared in header file
rsqltypes.h.
All character string arguments are assumed to be C-based, null-terminated character strings.
Output arguments are passed as pointers and, unless otherwise noted, can be NULL when there is no interest in
that particular result value.
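The convention for optional output arguments can be illustrated with a small self-contained C sketch. The function count_items below is hypothetical and is not part of the RDM API; it only shows the calling style in which each result is returned through a pointer and a NULL pointer means the caller does not want that result.

```c
#include <stddef.h>

/* Hypothetical function in the style described above: each output
   argument is a pointer, and NULL means "not interested". */
int count_items(const int *items, size_t n, size_t *positives, long *sum)
{
    size_t i, pos = 0;
    long total = 0;

    if (items == NULL && n > 0)
        return -1;                           /* invalid input */
    for (i = 0; i < n; ++i) {
        if (items[i] > 0)
            ++pos;
        total += items[i];
    }
    if (positives != NULL) *positives = pos; /* store only if requested */
    if (sum != NULL)       *sum = total;
    return 0;
}
```

A caller that only needs the status code can pass NULL for both output arguments, in the same way that the examples in this section pass NULL to rsqlFetch when the column count is not needed.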
SQL Data Types and Values
SQL data types are identified in the API functions by use of the SQL_T enumeration type declared in header file rsqltypes.h. The table below lists each of the SQL data types that are supported in RDM SQL along with its SQL_T value and its equivalent C data type (some of which, such as uint8_t, may be declared by RDM).
Table 6. SQL Data Type Values
SQL Data Type | SQL_T value | C Data Type
char | tCHAR | char
varchar | tVARCHAR | char
binary | tBINARY | uint8_t
varbinary | tVARBINARY | uint8_t
boolean | tBOOL | int8_t
tinyint | tTINYINT | int8_t
smallint | tSMALLINT | int16_t
integer | tINTEGER | int32_t
bigint | tBIGINT | int64_t
real | tREAL | float
float, double | tFLOAT, tDOUBLE | double
date | tDATE | int32_t
time | tTIME | int32_t
timestamp | tTIMESTAMP | int64_t
long varchar | tCLOB | char
long varbinary | tBLOB | uint8_t
Data values such as select statement result column values and stored procedure argument values are provided
in RSQL-specific generic data value containers of type RSQL_VALUE. The declaration for this struct type is
contained in header file rsqltypes.h as shown below.
/* container for blob (long var...) data values */
typedef struct {
    void     *buf;  /* ptr to blob data (VALUE.len==amount of blob data in buf) */
    uint32_t  pos;  /* current position==total bytes read so far */
} LONGVAR;

typedef union _value {
    int8_t         tv;    /* tTINYINT, tBOOL */
    int16_t        sv;    /* tSMALLINT */
    int32_t        lv;    /* tINTEGER */
    int64_t        llv;   /* tBIGINT */
    float          fv;    /* tREAL */
    double         dv;    /* tFLOAT, tDOUBLE */
    char          *cv;    /* tCHAR, tVARCHAR */
    void          *pv;    /* tBINARY, tVARBINARY */
    LONGVAR        lvv;   /* tCLOB, tWCLOB, tBLOB */
    TIMESTAMP_VAL  ts;    /* tDATE, tTIME, tTIMESTAMP */
    DB_ADDR        dbal;  /* tROWID (internal use only) */
} VALUE;

typedef enum _val_status {
    vsOKAY = 0,
    vsTRUNCATE = 1,  /* string truncation */
    vsNOVAL = 2
} VAL_STATUS;

/* general purpose SQL data value container */
typedef struct _rsql_value {
    SQL_T       type;    /* internal data type code */
    uint32_t    len;     /* # of bytes of var-length data (e.g., strlen+1) else 0 */
    VAL_STATUS  status;  /* operation status code */
    VALUE       vt;      /* generic data type container */
} RSQL_VALUE;
Since the TIMESTAMP_VAL struct is used by both the RSQL API and the RDM Core API, it is declared in a separate header (base.h) as given below.
/* Date, time, and timestamp definitions */
typedef uint32_t DATE_VAL;
typedef uint32_t TIME_VAL;
typedef struct {
    DATE_VAL date;
    TIME_VAL time;
} TIMESTAMP_VAL;
Functions rsqlFetch and rsqlGetData return select statement column result values using the RSQL_VALUE container. Stored procedure arguments must be specified using the RSQL_VALUE container when calling function rsqlExecProc. Access to the value in the RSQL_VALUE container is given in the table below for each possible data type.
HSTMT hstmt;
RSQL_VALUE *ResultRow;
uint16_t nocols, cno;

while ( rsqlFetch(hstmt, &ResultRow, &nocols) == errSUCCESS )
    for ( cno = 0; cno < nocols; ++cno )
        /* access the result column value as follows... */
Table 7. RSQL_VALUE Container Access
ResultRow[cno].type | ResultRow[cno].vt | ResultRow[cno].len | vt Field C Type
tCHAR | .cv | # of bytes (including null) | char *
tVARCHAR | .cv | # of bytes (including null) | char *
tBINARY | .pv | # of bytes | void *
tVARBINARY | .pv | # of bytes | void *
tBOOL | .tv | 0 | int8_t
tTINYINT | .tv | 0 | int8_t
tSMALLINT | .sv | 0 | int16_t
tINTEGER | .lv | 0 | int32_t
tBIGINT | .llv | 0 | int64_t
tREAL | .fv | 0 | float
tFLOAT | .dv | 0 | double
tDOUBLE | .dv | 0 | double
tDATE | .dtv | 0 | DATE_VAL
tTIME | .tmv | 0 | TIME_VAL
tTIMESTAMP | .tsv | 0 | TIMESTAMP_VAL
tCLOB | .lvv.buf | # of bytes | void *
tBLOB | .lvv.buf | # of bytes | void *
Note that the ResultRow[cno].len field only contains the length of variable-length data types and is zero for
scalar data types.
Basic access of the data values stored in RSQL_VALUE containers is illustrated in the example C program snippet below.
HSTMT hstmt;
uint16_t cno, nocols;
RSQL_VALUE *ResultRow;
...
while ( rsqlFetch(hstmt, &ResultRow, &nocols) == errSUCCESS ) {
    /* display result row values */
    for ( cno = 0; cno < nocols; ++cno ) {
        switch ( ResultRow[cno].type ) {
            case tCHAR:
            case tVARCHAR:  printf("%s", ResultRow[cno].vt.cv); break;
            case tBOOL:
                printf("%s", ResultRow[cno].vt.tv ? "True" : "False");
                break;
            case tSMALLINT: printf("%d", ResultRow[cno].vt.sv); break;
            ...
        }
    }
}
...
Note that the pointers to variable-length data returned from an SQL API function call (e.g., rsqlFetch) may not
survive the next call and so you may need to copy the data if it needs to survive the next call (e.g., to
rsqlFetch).
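The need to copy variable-length data can be shown with a self-contained C sketch. The function mock_fetch is a hypothetical stand-in, not the RDM API: it reuses one internal buffer on every call, which is one way a returned pointer can stop referring to the earlier row. The helper copy_value is likewise hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Stand-in for a fetch call (hypothetical, not the RDM API): every call
   overwrites the same internal buffer, so a pointer returned earlier no
   longer refers to the earlier row's data. */
const char *mock_fetch(int rowno)
{
    static char buf[32];
    static const char *rows[] = { "Hello", "World!" };

    if (rowno < 0 || rowno > 1)
        return NULL;
    strcpy(buf, rows[rowno]);
    return buf;
}

/* Copy `len` bytes (including the terminating null for strings) into
   memory owned by the caller. Returns NULL if allocation fails. */
char *copy_value(const char *data, size_t len)
{
    char *copy = malloc(len);

    if (copy != NULL)
        memcpy(copy, data, len);
    return copy;
}
```

In a real program the length to copy would come from the len field of the RSQL_VALUE container, and the copy must be released with free when it is no longer needed.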
It is important that you properly initialize all of the fields of the RSQL_VALUE structure when using it to pass values to the native RSQL API. For scalar types (that is, the non-char, non-binary types whose lengths never vary), the len field must be zero. The status field is ignored for input RSQL_VALUE arguments. Of course, the actual data value (or pointer) needs to be assigned to the proper field in the vt union. The SQL system makes its own copies of any variable-length data passed through a pointer field of an input RSQL_VALUE.
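The initialization rules above can be sketched in C as follows. The type declarations are trimmed stand-ins for those in rsqltypes.h so that the sketch compiles on its own (the SQL_T enum values in particular are assumed); a real program would include rsql.h instead. The helpers make_integer and make_char are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Trimmed stand-ins for the rsqltypes.h declarations shown earlier, so the
   sketch compiles on its own. In a real program, include rsql.h instead. */
typedef enum { tCHAR, tINTEGER } SQL_T;          /* enum values are assumed */
typedef enum { vsOKAY = 0, vsTRUNCATE = 1, vsNOVAL = 2 } VAL_STATUS;
typedef union { int32_t lv; char *cv; } VALUE;
typedef struct {
    SQL_T      type;
    uint32_t   len;
    VAL_STATUS status;
    VALUE      vt;
} RSQL_VALUE;

/* Scalar input value: len must be zero. */
RSQL_VALUE make_integer(int32_t n)
{
    RSQL_VALUE v;

    memset(&v, 0, sizeof v);      /* no field is left uninitialized */
    v.type = tINTEGER;
    v.vt.lv = n;
    return v;
}

/* String input value: len counts the bytes including the null. */
RSQL_VALUE make_char(char *s)
{
    RSQL_VALUE v;

    memset(&v, 0, sizeof v);
    v.type = tCHAR;
    v.len = (uint32_t)strlen(s) + 1;
    v.vt.cv = s;
    return v;
}
```

Clearing the whole structure first is a defensive choice in this sketch: it guarantees that len is zero for scalars and that status holds a defined value even though it is ignored on input.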
Other RSQL_VALUE usage issues are addressed in the remaining examples in this section as well as in the function description entries in the RDM SQL API Reference.
Structure of an RDM SQL Application
An RDM SQL C application program consists of a set of calls to the RDM SQL API functions in a particular
sequence as outlined below.
1. Set up and initialize your application's use of RDM SQL as follows.
a. Call rsqlTFSInit if you're using the directly-linked Transactional File Server (TFS).
b. Call rsqlAllocConn to allocate a connection handle and open the connection. All of the SQL calls for a given connection must be made from a single thread. Other threads can have their own connections as well.
c. Call rsqlSetErrorCallback if you want to have your own error handling routine automatically called by RDM SQL.
d. Call rsqlRegisterUDFs to register any user-defined functions for your application.
e. Call rsqlRegisterVirtualTables to register the virtual tables that are defined in the database(s) to be opened in the next step.
f. Open the needed database(s) by calling either rsqlOpenDB or rsqlOpenCat (alternatively, you can open the database(s) by executing the RDM SQL open database statement after step i below).
g. Call rsqlRegisterProc for each directly linked stored procedure C module (i.e., procname_ssp.c) that is used in your application.
h. Call any rsqlSet* functions (e.g., rsqlSetDateFormat, rsqlSetTimeout) to set up any needed operational parameters.
i. Call rsqlAllocStmt to allocate a statement handle that you will use to compile and execute SQL statements. Allocate as many statement handles as you will need. If you intend to do positioned updates and/or deletes then you will need at least two statement handles. Typically, you will need a statement handle for each statement that will be compiled once but potentially executed multiple times.
2. Prepare your application to execute SQL statements as follows.
a. Call rsqlPrepare to compile each of the statements that will need to be executed by your application.
b. Call rsqlBindParam to bind your application's variables to any parameter markers that were specified in the SQL statements prepared in the prior step.
3. At this point your application is execution ready. That means that your application will...
a. Call rsqlExecute to execute the appropriate statements that implement the database access needs for each particular function. Alternatively, you can call rsqlExecDirect to both compile and execute a statement in a single call. Usually, you would only do this for statements that only need to be executed once.
b. Possibly call rsqlParamData and rsqlPutData to process any needed data-at-exec blob parameters specified in insert and update statements.
c. Call transaction statements (e.g., rsqlTransStart, rsqlTransCommit) to encapsulate related database modifications within transactions.
d. Call rsqlFetch to retrieve the result rows from an executed select statement. You may also need to call rsqlGetData to retrieve blob data results a block at a time. Alternatively, if the select is updateable, you may need to call rsqlGetCursorName or rsqlSetCursorName associated with a related positioned update or delete statement to change the current row returned from the call to rsqlFetch. You will need to call rsqlCloseStmt on a select for which you do not call rsqlFetch through to the end of the result set.
e. Possibly call rsqlExecProc to execute any stored procedures.
4. When your application is ready to terminate you need to ...
a. Call rsqlFreeStmt for each statement handle allocated in step 1i.
b. Call rsqlFreeConn for each allocated connection. This automatically closes all open databases, terminates the connection, and frees the connection handle along with all of its associated dynamically allocated memory.
c. If you're using the directly-linked TFS, call rsqlTFSTerm to terminate TFS processing.
Hello World!
The most basic of the above steps are illustrated below in an RDM SQL version of the ubiquitous "Hello World!"
C program. Now, granted, this is a little bit more complex than a simple printf statement. But it should serve
well to show the basic approach needed to use the RDM SQL API.
In the first version of the program, the return values from the SQL API functions are mostly ignored. This is perfectly okay in this case because I know what I'm doing and I know that there are no errors or unusual statuses
that are going to be returned (of course, if you take this code and try it yourself and get errors then I am going to
be really embarrassed!).
By the way, all of the example programs referred to throughout this section are available under the GettingStarted\examples\sql_db directory.
Example Program: hello1Example_main.c
#include <stdio.h>
#include <stdlib.h>
#include "rsql.h"

/* =======================================================================
   Simple RDM SQL "Hello World!" Example #1
*/
int main()
{
    const RSQL_VALUE *row;
    RSQL_ERRCODE      stat;
    HCONN             hdbc;
    HSTMT             hstmt;

    rsqlAllocConn(&hdbc);
    rsqlAllocStmt(hdbc, &hstmt);

    /* create the database */
    rsqlExecDirect(hstmt, "create database hellodb");
    rsqlExecDirect(hstmt, "create table hellotab(txtln char(24))");
    stat = rsqlTransCommit(hdbc);
    if ( stat != errSUCCESS ) {
        printf("*** unable to connect to TFS\n");
        exit((int)stat);
    }

    /* insert a couple of rows into hellotab */
    rsqlExecDirect(hstmt, "insert into hellotab values \"Hello\"");
    rsqlExecDirect(hstmt, "insert into hellotab values \"World!\"");
    rsqlTransCommit(hdbc);

    /* retrieve and display the rows */
    rsqlExecDirect(hstmt, "select txtln from hellotab");
    while ( rsqlFetch(hstmt, &row, NULL) == errSUCCESS )
        printf("%s\n", row->vt.cv);

    rsqlFreeStmt(hstmt);
    rsqlFreeConn(hdbc);
    return 0;
}
Executing this program will produce the following output:
Hello
World!
In this example the program is creating the database that will be used and so the first TFS communication does
not occur until the call to rsqlTransCommit following the create statement calls to rsqlExecDirect. When
the database already exists (which will typically be the case), the startup calls would be as follows.
rsqlAllocConn(&hdbc);
/* open database hellodb in shared mode */
stat = rsqlOpenDB(hdbc, "hellodb", "s");
if ( stat != errSUCCESS ) {
    printf("*** unable to open the database\n");
    exit((int)stat);
}
rsqlAllocStmt(hdbc, &hstmt);
/* insert a couple of rows into hellotab */
...
Now, good programming means that one should not just go around ignoring the status codes returned from function calls. However, checking every function for an unpleasant status code and then doing something appropriate with it adds a lot of code to the program that is not directly related to the important work being performed.
For example, doing this to this program would make the code look something like the following snippet.
RSQL_ERRCODE stat;
...
/* create the database */
stat = rsqlExecDirect(hstmt, "create database hellodb");
if ( stat != errSUCCESS ) return report_error(NULL, hstmt, stat);
stat = rsqlExecDirect(hstmt, "create table hellotab(txtln char(24))");
if ( stat != errSUCCESS ) return report_error(NULL, hstmt, stat);
stat = rsqlExecDirect(hstmt, "commit");
if ( stat != errSUCCESS ) return report_error(NULL, hstmt, stat);
/* insert a couple of rows into hellotab */
stat = rsqlExecDirect(hstmt, "insert into hellotab values \"Hello\"");
if ( stat != errSUCCESS ) return report_error(NULL, hstmt, stat);
stat = rsqlExecDirect(hstmt, "insert into hellotab values \"World!\"");
if ( stat != errSUCCESS ) return report_error(NULL, hstmt, stat);
stat = rsqlExecDirect(hstmt, "commit");
if ( stat != errSUCCESS ) return report_error(NULL, hstmt, stat);
Isn't it just a little difficult to see what is really happening? We'll be discussing how to handle errors later on in this section. However, a short introduction to a simple technique that combines the RDM SQL rsqlSetErrorCallback function with C's setjmp and longjmp functions will illustrate how you can handle errors properly and keep the code readable at the same time.
The rsqlSetErrorCallback function arguments include the pointer to the callback function and a pointer to
an application data area. In our example, this is going to be a pointer to a struct of type ERR_DATA as shown
below.
/* error data structure */
typedef struct {
    jmp_buf errexit;
    HCONN   hdbc;
    HSTMT   hstmt;
    int     erractive;
} ERR_DATA;
The hdbc and hstmt handles will be saved in this struct so that the error handling function can use them in
calls to rsqlTransRollback and rsqlGetErrorInfo. The errexit jmp_buf will contain the setjmp
location that will be set by the main program prior to calling rsqlSetErrorCallback. The erractive
flag will prevent looping in case rsqlTransRollback generates an error (e.g., "transaction not active").
The complete program is given below.
Example Program: hello2Example_main.c
#include <stdio.h>
#include <setjmp.h>
#include "rsql.h"

/* error data structure */
typedef struct {
    jmp_buf  errexit;    /* setjmp location to return to on error */
    HCONN    hdbc;
    HSTMT    hstmt;
    int      erractive;  /* nonzero while an error is being handled */
} ERR_DATA;

/* =======================================================================
   Report error
*/
RSQL_ERRCODE EXTERNAL_FCN report_error(
    HRSQL         hrsql,
    RSQL_ERRCODE  stat,
    ERR_DATA     *errdata)
{
    char errmsg[133], *emsg = errmsg;

    /* an error raised by rsqlTransRollback below: do not loop */
    if ( errdata && errdata->erractive ) {
        errdata->erractive = 0;
        return stat;
    }
    if ( errdata && errdata->hstmt ) {
        errdata->erractive = 1;
        rsqlGetErrorInfo(errdata->hstmt, errmsg, 132);
        printf("*** error: %s\n", emsg);
        rsqlTransRollback(errdata->hdbc, NULL);
        longjmp(errdata->errexit, (int32_t)stat);
    }
    /* no statement handle available: report the generic message */
    rsqlGetErrorMsg(stat, &emsg);
    printf("*** error: %s\n", emsg);
    return stat;
}

/* =======================================================================
   Simple RDM SQL "Hello World!" Example #2
*/
int main()
{
    const RSQL_VALUE *row;
    RSQL_ERRCODE      stat;
    HCONN             hdbc = NULL;
    HSTMT             hstmt = NULL;
    ERR_DATA          errdata;

    /* initialize every field the callback may read */
    errdata.erractive = 0;
    errdata.hdbc = NULL;
    errdata.hstmt = NULL;
    if ( (stat = (RSQL_ERRCODE)setjmp(errdata.errexit)) != 0 )
        return stat;

    stat = rsqlAllocConn(&hdbc);
    if ( stat != errSUCCESS ) return report_error(NULL, stat, NULL);

    errdata.hdbc = hdbc;
    rsqlSetErrorCallback(hdbc, report_error, &errdata);

    rsqlAllocStmt(hdbc, &hstmt);
    errdata.hstmt = hstmt;

    /* create the database */
    rsqlExecDirect(hstmt, "create database hellodb");
    rsqlExecDirect(hstmt, "create table hellotab(txtln char(24))");
    rsqlTransCommit(hdbc);

    /* insert a couple of rows into hellotab */
    rsqlExecDirect(hstmt, "insert into hellotab values \"Hello\"");
    rsqlExecDirect(hstmt, "insert into hellotab values \"World!\"");
    rsqlTransCommit(hdbc);

    /* retrieve and display the rows */
    rsqlExecDirect(hstmt, "select txtln from hellotab");
    while ( rsqlFetch(hstmt, &row, NULL) == errSUCCESS )
        printf("%s\n", row->vt.cv);

    rsqlFreeStmt(hstmt);
    rsqlFreeConn(hdbc);
    return 0;
}
The call to rsqlSetErrorCallback passes in the address of function report_error along with a pointer
to the errdata struct variable. When any SQL error occurs, the RDM SQL system will call function
report_error, which will print the error message and then do a longjmp to the setjmp called at the beginning of the program. So, errors are properly caught without the need to pollute the important calls with a lot of status checking
code.
Initializing and Terminating TFS operation
If you are building your application to function as a server application that is integrated directly with the RDM
Transactional File Server (through use of the TFST configuration option), then you will need to include calls to
functions rsqlTFSInit and rsqlTFSTerm to initialize and terminate TFS operation. These calls are unnecessary if your application will only use the TFSR configuration in which one or more TFSs execute as separate
processes or if your application will only use the standalone TFS (TFSS).
Function rsqlTFSInit initializes the TFS. It takes two arguments. The first argument, docroot, is a string that
specifies the path name of the "root database directory" into which database directories will be stored. If docroot
is NULL then the root database directory will be the current directory. The second argument, tparams, is a
pointer to a struct variable containing elements that specify various TFS operational parameters. If tparams is
NULL then the system default values will be used for the TFS operational parameters. Note that even if both
arguments are NULL, this function must still be called when using the TFST configuration. The table below
describes the elements in the TFS_PARAMS struct that are relevant for RDM SQL.
Element   Declaration   Default   Description
port      uint16_t      21553     TCP/IP port number on which the TFS will be listening for remote connections.
no_disk   uint32_t      0         Set this flag to 1 to indicate that the TFS is to run diskless.
rd_only   uint32_t      0         Set this flag to 1 to indicate that the databases controlled by this TFS are read-only.
As the TFS_PARAMS struct has elements besides the ones described above, it is always best to clear your
TFS_PARAMS variable first (see example below). Refer to function d_tfsinit for more details about use of all
of the TFS_PARAMS struct elements.
The code fragment below shows the calls to rsqlTFSInit and rsqlTFSTerm.
#include <stdio.h>
#include <string.h>
#include "rsql.h"

int main()
{
    RSQL_ERRCODE stat;
    HCONN        hdbc = NULL;
    HSTMT        hstmt = NULL;
    TFS_PARAMS   tfs;

    /* clear the tfs params struct: this is necessary */
    memset(&tfs, 0, sizeof(tfs));

    /* assign the tfs param values */
    tfs.port = 21553;

    /* Initialize this program to be the TFS
       (the backslash must be doubled inside a C string literal) */
    stat = rsqlTFSInit("c:\\tfs_dbs", (const TFS_PARAMS *)&tfs);
    if ( stat != errSUCCESS ) {
        printf("unable to start TFS, status code = %d\n", stat);
        return stat;
    }
    stat = rsqlAllocConn(&hdbc);

    /* ... do the database stuff ... */

    rsqlFreeConn(hdbc);

    /* terminate TFS operation */
    rsqlTFSTerm();
    return 0;
}
Connecting to a TFS and Opening Databases
Opening a database and connecting to a TFS occurs when calling either rsqlOpenDB or rsqlOpenCat. Function rsqlOpenDB specifies one or more databases to be opened from the binary catalog files (e.g., bookshop.cat) stored in the database directory on the TFS. Function rsqlOpenCat specifies a database to open
using the catalog structure from the C catalog module (e.g., bookshop_cat.c). You need to call rsqlOpenCat for each database that is to be opened.
The database name(s) argument given in the call to rsqlOpenDB or rsqlOpenCat can specify the TFS on
which that particular database is located as given in the following syntax.
"dbname[@TFSComputerName[:port]]"
where:
dbname
the name of the database to be opened
TFSComputerName
the name of the computer on which the TFS is running (default is localhost),
port
the TCP/IP port number on which the TFS is listening (default is 21553)
More than one database can be specified in the rsqlOpenDB function call by separating each database specification with a semi-colon (";"). For example, the following code segment opens the bookshop and nsfawards databases each running on a separate TFS on different computers.
#include <stdio.h>
#include "rsql.h"

static char sel_acctmgr[] = "select mgrid, commission from acctmgr";
static char sel_sponsor[] = "select name, city from sponsor where state = 'WA'";

int main()
{
    HCONN        hdbc;
    HSTMT        hstmt;
    RSQL_ERRCODE stat;
    RSQL_VALUE  *row;

    rsqlAllocConn(&hdbc);
    rsqlOpenDB(hdbc, "bookshop@RaimaSrvr1:1650;nsfawards@RaimaSvr2:21553", "s");
    rsqlAllocStmt(hdbc, &hstmt);

    /* query the bookshop database */
    stat = rsqlExecDirect(hstmt, sel_acctmgr);
    if ( stat != errSUCCESS )
        return stat;
    printf("**** %s\n", sel_acctmgr);
    while ( rsqlFetch(hstmt, &row, NULL) == errSUCCESS ) {
        printf("%s, %f\n", row[0].vt.cv, row[1].vt.dv);
    }

    /* query the nsfawards database */
    stat = rsqlExecDirect(hstmt, sel_sponsor);
    if ( stat != errSUCCESS )
        return stat;
    printf("**** %s\n", sel_sponsor);
    while ( rsqlFetch(hstmt, &row, NULL) == errSUCCESS ) {
        printf("%s, %s\n", row[0].vt.cv, row[1].vt.cv);
    }
    rsqlFreeConn(hdbc);
    return 0;
}
Use of function rsqlOpenCat is shown in the following version of the previous example.
#include <stdio.h>
#include "rsql.h"
#include "bookshop_cat.h"
#include "nsfawards_cat.h"

static char sel_acctmgr[] = "select mgrid, commission from acctmgr";
static char sel_sponsor[] = "select name, city from sponsor where state = 'WA'";

int main()
{
    HCONN        hdbc;
    HSTMT        hstmt;
    RSQL_ERRCODE stat;
    RSQL_VALUE  *row;

    rsqlAllocConn(&hdbc);
    /* one rsqlOpenCat call per database */
    rsqlOpenCat(hdbc, &bookshop_cat, "@localhost:21553", "s");
    rsqlOpenCat(hdbc, &nsfawards_cat, "@localhost:21555", "s");
    rsqlAllocStmt(hdbc, &hstmt);

    stat = rsqlExecDirect(hstmt, sel_acctmgr);
    if ( stat != errSUCCESS )
        return stat;
    printf("**** %s\n", sel_acctmgr);
    while ( rsqlFetch(hstmt, &row, NULL) == errSUCCESS ) {
        printf("%s, %f\n", row[0].vt.cv, row[1].vt.dv);
    }

    stat = rsqlExecDirect(hstmt, sel_sponsor);
    if ( stat != errSUCCESS )
        return stat;
    printf("**** %s\n", sel_sponsor);
    while ( rsqlFetch(hstmt, &row, NULL) == errSUCCESS ) {
        printf("%s, %s\n", row[0].vt.cv, row[1].vt.cv);
    }
    rsqlFreeConn(hdbc);
    return 0;
}
Database Unions
A database union allows multiple instances of the same database running on different TFSs to be opened and
accessed as though they were just a single database. The database names can be different but they must all
have identical DDL schema definitions (hence, identical catalogs). Database unions allow you to partition a database among multiple TFSs running on separate computers (or as separate processes on the same multicore/multi-processor computer) in order to take advantage of the performance benefits from truly parallel database access.
You can call either rsqlOpenCat or rsqlOpenDB to open a union of two or more databases. The specification
for each database and TFS combination is separated using the vertical bar symbol, "|". The following examples
show the calls needed for the case where the NSF awards database was partitioned between three TFSs.
rsqlOpenCat(hdbc, &nsfawards_cat,
    "nsfawards@NSF1:21553|nsfawards@NSF2:21555|nsfawards@NSF3:21557", "s");
or,
rsqlOpenDB(hdbc,
    "nsfawards@NSF1:21553|nsfawards@NSF2:21555|nsfawards@NSF3:21557", "s");
Compiling and Executing SQL Statements
As SQL is a database language, statements coded in SQL need to be compiled before they can be executed. The function that compiles an SQL statement is rsqlPrepare. The function that
executes a compiled SQL statement is rsqlExecute. A statement can be compiled once
and executed multiple times. In fact, except for a few situations described later on in this section, it is best to compile most of your statements once when the program starts and then execute them as needed. You can also compile and execute a statement in a single call using function rsqlExecDirect.
The SQL statement to be compiled is passed to rsqlPrepare as a standard null-terminated string. The status
returned from the call to rsqlPrepare will indicate any error encountered during compilation. Several functions can be called in order to discover information about the compiled statement. You can call function
rsqlGetStmtType in order to discover the type of statement just compiled. Function rsqlGetNumResultCols can be called to retrieve the number of select statement result columns. Function rsqlGetColDescr can be called to retrieve information about a particular select statement result column.
Parameters are specified within an SQL statement string using a question mark character ('?') and can appear in
any context in which a literal constant value is allowed. Parameters are identified as ordinals beginning at 1 and
proceeding in left-to-right order in the statement string. Function rsqlBindParam must be called before the
statement is executed in order to provide to SQL the type and location information in the user application where a
parameter value can be found.
Once all of the specified parameter markers have been bound to the application variables containing their values,
function rsqlExecute can be called to execute the compiled SQL statement.
The following program shows the basic sequence of compiling and executing a simple SQL select statement
with parameter markers. Note that the checking of the status codes returned from most of the RSQL API function calls has been left out for readability. The lines referred to by number are discussed below.
Example Program: params1Example_main.c
  1  #include "rsql.h"
  2
  3  static void gettext(
  4      const char *prompt,
  5      char       *text,
  6      size_t      len)
  7  {
  8      printf("%s ", prompt);
  9      if (fgets(text, len, stdin) == NULL )
 10          text[0] = '\0';
 11      else {
 12          char *nl = strchr(text, '\n');
 13          if ( nl )
 14              *nl = '\0';
 15      }
 16  }
 17
 18  /* =======================================================================
 19     Simple RDM SQL parameter markers example 1
 20  */
 21  int main()
 22  {
 23      const RSQL_VALUE *row;
 24      RSQL_ERRCODE      stat;
 25      HCONN             hdbc;
 26      HSTMT             hstmt;
 27      char              buf[250];
 28      int16_t           lo_born = 0, hi_born = 0;
 29      char              gender[2] = "";
 30      char stmt[] = "select full_name, yr_born, yr_died from author "
 31                    "where gender = ? and yr_born between ? and ?";
 32
 33      rsqlAllocConn(&hdbc);
 34      rsqlAllocStmt(hdbc, &hstmt);
 35      stat = rsqlOpenDB(hdbc, "bookshop", "s");
 36      if ( stat != errSUCCESS ) {
 37          printf("unable to open bookshop database\n");
 38          rsqlFreeConn(hdbc);
 39          exit((int)stat);
 40      }
 41      rsqlPrepare(hstmt, stmt);
 42      rsqlBindParam(hstmt, 1, tCHAR, gender, NULL);
 43      rsqlBindParam(hstmt, 2, tSMALLINT, &lo_born, NULL);
 44      rsqlBindParam(hstmt, 3, tSMALLINT, &hi_born, NULL);
 45
 46      for ( ; ; ) {
 47          /* get parameter values from user */
 48          gettext("\nenter gender (M/F):", gender, sizeof(gender));
 49          if ( gender[0] != 'M' && gender[0] != 'F' ) {
 50              printf("gender must be a M or F\n");
 51              continue;
 52          }
 53
 54          gettext("\nenter low year born:", buf, sizeof(buf));
 55          lo_born = (int16_t)atoi(buf);
 56          if ( lo_born == 0 )
 57              break;
 58
 59          gettext("enter high year born:", buf, sizeof(buf));
 60          hi_born = (int16_t)atoi(buf);
 61          if ( hi_born == 0 )
 62              break;
 63          if ( lo_born > hi_born ) {
 64              printf("low year born must be less or equal to high!\n");
 65              continue;
 66          }
 67
 68          /* execute select statement */
 69          rsqlExecute(hstmt);
 70
 71          /* fetch result set */
 72          printf("NAME                                YR_BORN YR_DIED\n");
 73          printf("----------------------------------- ------- -------\n");
 74          while ( rsqlFetch(hstmt, &row, NULL) == errSUCCESS )
 75              printf("%-35.35s %4d    %4d\n",
 76                  row[0].vt.cv, row[1].vt.sv, row[2].vt.sv);
 77      }
 78      rsqlFreeStmt(hstmt);
 79      rsqlFreeConn(hdbc);
 80      exit(0);
 81  }
The select statement specified at lines 30 and 31 in stmt contains three parameters. The first is the comparison
value for the gender column of type char and the second and third specify the low and high comparison values
for the smallint column yr_born. The statement is compiled by the call to rsqlPrepare at line 41. The three
calls to rsqlBindParam associate each parameter with the local variable that will contain its value at execution
time. The final argument to rsqlBindParam is not used because it is only needed for parameters that need to
specify a length (e.g., tBINARY) or to indicate that a parameter value is to be specified at execution time (e.g., a
blob data-at-exec parameter).
The actual parameter values are assigned inside the for loop at line 48 for the gender parameter, line 55 for the
low yr_born parameter, and at line 60 for the high yr_born parameter. Note that while the gender column
was declared as a single character column (see bookshop.sql for the bookshop database DDL), the parameter value for it must be a null-terminated string. The C data type for the variable that is associated with a given
parameter must be as indicated in Table 6.
The call to rsqlExecute at line 69 executes the select statement with the specified parameter values and the
rsqlFetch while loop at line 74 retrieves all of the rows that satisfy the where clause with the current set of
parameter values.
RDM SQL also provides the ability to specify named parameter markers and then call rsqlBindNamedParam
to bind the parameter values. Named parameter markers are specified by a colon followed by an identifier that
serves as the parameter name. Referring to the above example, the following changes modify the program to
use named parameters.
 30      char stmt[] = "select full_name, yr_born, yr_died from author "
 31                    "where gender = :gen and yr_born between :lo and :hi";
 ...
 42      rsqlBindNamedParam(hstmt, "gen", tCHAR, gender, NULL, NULL);
 43      rsqlBindNamedParam(hstmt, "hi", tSMALLINT, &hi_born, NULL, NULL);
 44      rsqlBindNamedParam(hstmt, "lo", tSMALLINT, &lo_born, NULL, NULL);
Use of parameter markers with an insert statement is shown in the example program below which inserts new
rows into the author table of the bookshop database.
Example Program: params2Example_main.c
  1  #include "rsql.h"
  2
  3  static void gettext(
     ...
 17
 18  /* =======================================================================
 19     Simple RDM SQL parameter markers example 2 including blobs
 20  */
 21  int main()
 22  {
 23      HCONN    hdbc;
 24      HSTMT    hstmt;
 25
 26      char     last_name[14] = "";
 27      char     full_name[35] = "";
 28      int32_t  full_name_len = 0;
 29      char     gender[2] = " ";
 30      int32_t  gender_len = 0;
 31      int16_t  yr_born = 0;
 32      int32_t  yr_born_len = 0;
 33      int16_t  yr_died = 0;
 34      int32_t  yr_died_len = 0;
 35      char     year[5];
 36      char     bio[132] = "";
 37      int32_t  data_at_exec = -2;
 38      uint32_t short_bio_len;
 39
 40      char     stmt[] = "insert into author values ?, ?, ?, ?, ?, ?";
 41
 42      rsqlAllocConn(&hdbc);
 43      rsqlAllocStmt(hdbc, &hstmt);
 44      rsqlOpenDB(hdbc, "bookshop", "s");
 45
 46      rsqlPrepare(hstmt, stmt);
 47
 48      /* bind all 6 parameters */
 49      rsqlBindParam(hstmt, 1, tCHAR,     last_name, NULL);
 50      rsqlBindParam(hstmt, 2, tCHAR,     full_name, &full_name_len);
 51      rsqlBindParam(hstmt, 3, tCHAR,     gender,    &gender_len);
 52      rsqlBindParam(hstmt, 4, tSMALLINT, &yr_born,  &yr_born_len);
 53      rsqlBindParam(hstmt, 5, tSMALLINT, &yr_died,  &yr_died_len);
 54      rsqlBindParam(hstmt, 6, tCLOB,     bio,       &data_at_exec);
 55
 56      for ( ; ; ) {
 57          /* get parameter values from user */
 58          gettext("enter last_name:", last_name, sizeof(last_name));
 59          if ( !last_name[0] ) break;
 60
 61          gettext("enter full_name:", full_name, sizeof(full_name));
 62          full_name_len = full_name[0] ? 0 : -1;
 63
 64          gettext("enter gender (M/F):", gender, sizeof(gender));
 65          if ( !gender[0] )
 66              gender_len = -1;
 67          else if ( gender[0] == 'M' || gender[0] == 'F' )
 68              gender_len = 0;
 69          else {
 70              printf("gender must be a M or F\n");
 71              continue;
 72          }
 73
 74          gettext("enter year born:", year, sizeof(year));
 75          if ( year[0] ) {
 76              yr_born = (int16_t)atoi(year);
 77              yr_born_len = 0;
 78          }
 79          else
 80              yr_born_len = -1;
 81
 82          gettext("enter year died:", year, sizeof(year));
 83          if ( year[0] ) {
 84              yr_died = (int16_t)atoi(year);
 85              yr_died_len = 0;
 86          }
 87          else
 88              yr_died_len = -1;
 89
 90          rsqlTransStart(hdbc, NULL);
 91
 92          /* execute insert statement */
 93          if ( rsqlExecute(hstmt) != errNEEDDATA ) {
 94              printf("rsqlExecute did NOT return errNEEDDATA!!\n");
 95              break;
 96          }
 97          while ( rsqlParamData(hstmt, NULL, NULL) == errNEEDDATA ) {
 98              for ( ; ; ) {
 99                  gettext("enter short_bio:", bio, sizeof(bio));
100                  if ( !bio[0] )
101                      break;
102                  short_bio_len = (uint32_t)strlen(bio);
103                  rsqlPutData(hstmt, bio, short_bio_len);
104              }
105              /* add a null terminator */
106              rsqlPutData(hstmt, "", 1);
107          }
108          rsqlTransCommit(hdbc);
109      }
110      rsqlFreeStmt(hstmt);
111      rsqlFreeConn(hdbc);
112      exit(0);
113  }
The insert statement at line 40 (compiled at line 46) contains a parameter marker for each of the author table's
six columns. The author table's declaration is shown below for easy reference.
create table author(
    last_name   char(13) primary key,
    full_name   char(35),
    gender      char distinct values = 2,
    yr_born     smallint,
    yr_died     smallint,
    short_bio   long varchar,
    key yob_gender_key(yr_born, gender)
);
To specify a null column value for a parameter the parameter length variable pointed to by the pLenValue
(final) argument to rsqlBindParam must be set to -1 at the time rsqlExecute is called. Line 62 shows how
this is done for the full_name_len variable that was specified in the rsqlBindParam call at line 50. Nulls
are allowed for all of the author table columns except last_name. Hence, the pLenValue argument is not
needed (i.e., it is NULL) in its call to rsqlBindParam at line 49.
Data-at-exec parameters are designed to provide the ability to store blob data values (i.e., values for columns of type long
varchar or long varbinary) in sets of fixed-length blocks in order to minimize the amount of needed
memory. Data-at-exec parameters are parameter values that will be supplied by the application program after
rsqlExecute is called to execute the statement. A data-at-exec parameter is specified by setting the length
variable specified through the pLenValue argument to rsqlBindParam to -2 (see lines 37 and 54). When
executing an SQL statement for which one or more data-at-exec parameters have been specified, rsqlExecute will return status errNEEDDATA to indicate that it is ready for the application to supply the blob data
values. The program then calls rsqlParamData to set up the subsequent calls to rsqlPutData that store
the parameter's blob value. Lines 93 to 107 show how this is done for the long varchar column short_bio in
the author table.
It is important to note that character blob data is considered to be one long null-terminated string. If multiple calls
to rsqlPutData are used to store its value, the terminating null byte must only be included on the
final rsqlPutData call. Hence, short_bio_len is set to the string length at line 102, excluding the null byte,
for the intermediate rsqlPutData calls at line 103. The additional call at line 106 ensures that the blob is terminated by a null byte.
Retrieving Select Statement Results
Basic Retrieval
Retrieving the result set rows of a select statement is quite simple. After successfully compiling and executing a
select statement through calls to rsqlPrepare and rsqlExecute (or rsqlExecDirect), the program can
retrieve the result set one row at a time by calling rsqlFetch. After the last row has been fetched the next call
to rsqlFetch will return status errNOMOREDATA. A number of examples that do just that have already been
given.
Function rsqlFetch must be called to retrieve the next row of a select statement's result set. The values of
each result column are returned through the pResult argument. You can also access a column's result value
using function rsqlGetData. In fact, you can call rsqlFetch passing NULL for the pResult argument and then call
rsqlGetData to retrieve the value for a specific result column. For example, you could replace the rsqlFetch loop (lines 74-76) of
the params1Example_main.c example program given earlier with the following code to do the same thing.
while ( rsqlFetch(hstmt, NULL, NULL) == errSUCCESS ) {
    RSQL_VALUE *pColval;
    rsqlGetData(hstmt, 1, &pColval, 0, NULL);
    printf("%-35.35s ", pColval->vt.cv);
    rsqlGetData(hstmt, 2, &pColval, 0, NULL);
    printf("%4d    ", pColval->vt.sv);
    rsqlGetData(hstmt, 3, &pColval, 0, NULL);
    printf("%4d\n", pColval->vt.sv);
}
While you can use rsqlGetData to do this, it is primarily intended as a way to retrieve blob column values in
chunks, i.e., a block at a time. The basic approach for doing just that is shown in the following example program.
Retrieving Blob Data Values
Example Program: getdataExample_main.c
  1  #include "rsql.h"
  2
  3  static void gettext(
     ...
 17
 18  /* =======================================================================
 19     Simple RDM SQL example retrieving blob data using rsqlGetData
 20  */
 21  int main()
 22  {
 23      const RSQL_VALUE *pColval;
 24      RSQL_ERRCODE      stat;
 25      HCONN             hdbc;
 26      HSTMT             hstmt;
 27      char              last_name[40] = "";
 28      char              short_bio[81];
 29      uint32_t          remlen;
 30      char              stmt[] = "select full_name, short_bio from author"
 31                                 " where last_name like ? for read only";
 32
 33      rsqlAllocConn(&hdbc);
 34      rsqlAllocStmt(hdbc, &hstmt);
 35      stat = rsqlOpenDB(hdbc, "bookshop", "s");
 36      if ( stat != errSUCCESS ) {
 37          printf("unable to open bookshop database\n");
 38          rsqlFreeConn(hdbc);
 39          exit((int)stat);
 40      }
 41
 42      rsqlPrepare(hstmt, stmt);
 43      rsqlBindParam(hstmt, 1, tCHAR, last_name, NULL);
 44
 45      for ( ; ; ) {
 46          /* get parameter value from user */
 47          gettext("\nenter author's last_name:", last_name, sizeof(last_name)-1);
 48          if (!last_name[0]) break;
 49          strcat(last_name, "%");
 50          /* execute select statement */
 51          rsqlExecute(hstmt);
 52          stat = rsqlFetch(hstmt, NULL, NULL);
 53          if ( stat != errSUCCESS ) {
 54              printf("author %s not in database\n", last_name);
 55              continue;
 56          }
 57          /* author's full_name */
 58          rsqlGetData(hstmt, 1, &pColval, 0, NULL);
 59          printf("%s:\n", pColval->vt.cv);
 60
 61          /* fetch short_bio blob data */
 62          while ( rsqlGetData(hstmt, 2, &pColval, 80, &remlen) == errSUCCESS ) {
 63              if ( pColval->type == tNULL || remlen == 0 ) {
 64                  printf("No short_bio has been entered\n");
 65                  break;
 66              }
 67              /* copy blob data block and add null terminator */
 68              memcpy(short_bio, pColval->vt.lvv.buf, pColval->len);
 69              short_bio[pColval->len] = '\0';
 70              printf("%s\n", short_bio);
 71          }
 72          rsqlCloseStmt(hstmt);
 73      }
 74      rsqlFreeStmt(hstmt);
 75      rsqlFreeConn(hdbc);
 76      exit(0);
 77  }
The select statement is shown in lines 30-31. The code that retrieves the blob value for the short_bio long
varchar column is given in the while loop at lines 62 to 71. Because a NULL could have been stored for the blob value, that
is checked at line 63 (the test for remlen == 0 will probably never succeed, as that would mean that a zero-length
blob value was stored, but it doesn't hurt to check). The value containing pColval->len bytes is memcpy'd
from the blob data buffer pointer (pColval->vt.lvv.buf) into the local char array named short_bio (line
68) and a null string terminator byte is added at the end (line 69). Remember that character blobs are treated as
a single character string so there is only the one null-byte terminator as the last character stored in the blob.
Fetching Results From Retrieval Stored Procedures
Recall that a retrieval stored procedure is one that contains one or more select statements. To retrieve the
results from the select statements contained in a stored procedure you can either compile and execute an
execute statement that invokes the procedure or call function rsqlExecProc to directly execute the stored procedure. For example, the following script creates a stored procedure that returns the author name and list of titles
of books by that author.
create procedure books_by_author(name char) as
    select full_name, title from author natural join book
        where last_name like name
end procedure;
Note that the where clause uses the like operator so that you can issue the following execute to retrieve the
books written by both Bronte sisters:
execute books_by_author("Bront%");
FULL_NAME            TITLE
Bronte, Charlotte    Jane Eyre. An autobiography. Ed. by Currer Bell [pseud.]
Bronte, Charlotte    Villette.
Bronte, Charlotte    Jane Eyre.
Bronte, Emily        Wuthering Heights. A novel.
The example program given below prompts the user (lines 41-43) for the author's last name (wild cards
allowed), generates an execute statement string that passes that name into the books_by_author procedure
(line 46) and then calls rsqlExecDirect to compile and execute it (line 49). After that, the result set is
retrieved just as if the stored procedure's select statement was itself compiled and executed (lines 57-58).
Example Program: procs1Example_main.c
  1  #include "rsql.h"
  2
  3  static void gettext(
     ...
 17
 18  /* =======================================================================
 19     Simple RDM SQL stored proc execution example 1
 20  */
 21  int main()
 22  {
 23      const RSQL_VALUE *row;
 24      RSQL_ERRCODE      stat;
 25      HCONN             hdbc;
 26      HSTMT             hstmt;
 27      char              last_name[35];
 28      char              stmt[81];
 29
 30      rsqlAllocConn(&hdbc);
 31      rsqlAllocStmt(hdbc, &hstmt);
 32      stat = rsqlOpenDB(hdbc, "bookshop", "s");
 33      if ( stat != errSUCCESS ) {
 34          printf("unable to open bookshop database\n");
 35          rsqlFreeConn(hdbc);
 36          exit((int)stat);
 37      }
 38
 39      for ( ; ; ) {
 40          /* get parameter values from user */
 41          gettext("\nenter author's last_name:", last_name, sizeof(last_name));
 42          if ( !last_name[0] )
 43              break;
 44
 45          /* construct execute statement */
 46          sprintf(stmt, "execute books_by_author(\"%s\")", last_name);
 47
 48          /* execute the execute statement */
 49          stat = rsqlExecDirect(hstmt, stmt);
 50          if ( stat != errSUCCESS ) {
 51              printf("error in execute statement\n");
 52              continue;
 53          }
 54          /* fetch result set */
 55          printf("NAME                                TITLE\n");
 56          printf("----------------------------------- -----\n");
 57          while ( rsqlFetch(hstmt, &row, NULL) == errSUCCESS )
 58              printf("%-35.35s %s\n", row[0].vt.cv, row[1].vt.cv);
 59      }
 60      rsqlFreeStmt(hstmt);
 61      rsqlFreeConn(hdbc);
 62      exit(0);
 63  }
The second approach is actually a better solution because it does not incur the cost of recompiling an execute
statement each time. This is shown in the following example program.
Example Program: procs2Example_main.c
  1  #include "rsql.h"
  2
  3  static void gettext(
     ...
 17
 18  /* =======================================================================
 19     Simple RDM SQL stored proc execution example 2
 20  */
 21  int main()
 22  {
 23      const RSQL_VALUE *row;
 24      RSQL_VALUE        arg;
 25      RSQL_ERRCODE      stat;
 26      HCONN             hdbc;
 27      HSTMT             hstmt;
 28      char              last_name[35];
 29
 30      rsqlAllocConn(&hdbc);
 31      rsqlAllocStmt(hdbc, &hstmt);
 32      stat = rsqlOpenDB(hdbc, "bookshop", "s");
 33      if ( stat != errSUCCESS ) {
 34          printf("unable to open bookshop database\n");
 35          rsqlFreeConn(hdbc);
 36          exit((int)stat);
 37      }
 38      /* set up argument value container */
 39      arg.type = tCHAR;
 40      arg.status = vsOKAY;
 41      arg.len = 0;
 42      arg.vt.cv = last_name;
 43
 44      for ( ; ; ) {
 45          /* get parameter values from user */
 46          gettext("\nenter author's last_name:", last_name, sizeof(last_name));
 47          if ( !last_name[0] )
 48              break;
 49
 50          /* execute the execute statement */
 51          stat = rsqlExecProc(hstmt, "books_by_author", 1, &arg);
 52          if ( stat != errSUCCESS ) {
 53              printf("error attempting to execute proc\n");
 54              continue;
 55          }
 56          /* fetch result set */
 57          printf("NAME                                TITLE\n");
 58          printf("----------------------------------- -----\n");
 59          while ( rsqlFetch(hstmt, &row, NULL) == errSUCCESS )
 60              printf("%-35.35s %s\n", row[0].vt.cv, row[1].vt.cv);
 61      }
 62      rsqlFreeStmt(hstmt);
 63      rsqlFreeConn(hdbc);
 64      exit(0);
 65  }
Lines 39-42 set up the argument value container (declared at line 24) that is passed into the rsqlExecProc call at line 51,
which executes the books_by_author stored procedure. At that point, retrieval of the result set proceeds in the
usual manner.
Stored procedures can contain more than one select statement as shown in the following version of
books_by_author.
create procedure books_by_author(name char) as
    select full_name, yr_born, short_bio from author where last_name = name
    select title from book where last_name = name
end procedure;
Two select statements are contained in this procedure. After executing the stored procedure and fetching the
result rows from the first, in order to retrieve the results of the second the application needs to call function
rsqlMoreResults which will return status errSUCCESS when there is another select statement to be
executed or errNOMOREDATA after the last select has been processed. This is shown in the following example.
Example Program: procs3Example_main.c
  1  #include "rsql.h"
  2
  3  static void gettext(
     ...
 17
 18  /* =======================================================================
 19     Simple RDM SQL stored proc execution example 3
 20  */
 21  int main()
 22  {
 23      const RSQL_VALUE *row, *pColval;
 24      RSQL_VALUE        arg;
 25      RSQL_ERRCODE      stat;
 26      HCONN             hdbc;
 27      HSTMT             hstmt;
 28      uint32_t          remlen;
 29      char              short_bio[81];
 30      char              last_name[35];
 31
 32      rsqlAllocConn(&hdbc);
 33      rsqlAllocStmt(hdbc, &hstmt);
 34      stat = rsqlOpenDB(hdbc, "bookshop", "s");
 35      if ( stat != errSUCCESS ) {
 36          printf("unable to open bookshop database\n");
 37          rsqlFreeConn(hdbc);
 38          exit((int)stat);
 39      }
 40      /* set up argument value container */
 41      arg.type = tCHAR;
 42      arg.status = vsOKAY;
 43      arg.len = 0;
 44      arg.vt.cv = last_name;
 45
 46      /* turn on deferred blob reading mode */
 47      rsqlSetDeferBlobMode(hstmt, 1);
 48
 49      for ( ; ; ) {
 50          /* get parameter values from user */
 51          gettext("\nenter author's last_name:", last_name, sizeof(last_name));
 52          if ( !last_name[0] )
 53              break;
 54
 55          /* execute the execute statement */
 56          stat = rsqlExecProc(hstmt, "books_by_author", 1, &arg);
 57          if ( stat != errSUCCESS ) {
 58              printf("error attempting to execute proc\n");
 59              continue;
 60          }
 61          /* fetch 1st select's result set */
 62          while ( rsqlFetch(hstmt, &row, NULL) == errSUCCESS ) {
 63              printf("\nauthor       : %s\n", row[0].vt.cv);
 64              printf("year of birth: %d\n", row[1].vt.sv);
 65              printf("------------------------------------------------------\n");
 66
 67              /* fetch short_bio blob data */
 68              while (rsqlGetData(hstmt, 3, &pColval, 80, &remlen) == errSUCCESS) {
 69                  if ( pColval->type == tNULL || remlen == 0 ) {
 70                      printf("None\n");
 71                      break;
 72                  }
 73                  /* copy blob data block and add null terminator */
 74                  memcpy(short_bio, pColval->vt.lvv.buf, pColval->len);
 75                  short_bio[pColval->len] = '\0';
 76                  printf("%s\n", short_bio);
 77              }
 78          }
 79          /* execute and fetch 2nd select's result set */
 80          if ( rsqlMoreResults(hstmt) != errSUCCESS ) {
 81              printf("Second SELECT not in books_by_author\n");
 82              break;
 83          }
 84          printf("\ntitles in stock\n---------------\n");
 85          while ( rsqlFetch(hstmt, &row, NULL) == errSUCCESS )
 86              printf("%s\n", row[0].vt.cv);
 87      }
 88      rsqlFreeStmt(hstmt);
 89      rsqlFreeConn(hdbc);
 90      exit(0);
 91  }
The call to rsqlMoreResults in line 80 executes the second select statement and its result set is returned in
the rsqlFetch while loop at line 85.
This example also includes a call to rsqlSetDeferBlobMode to turn on deferred reading of blob data (line
47) which is performed by the rsqlGetData while loop at line 68 (identical to that shown earlier in getdataExample_main.c example). Note that without having made that call, the rsqlGetData loop would
never exit as it would be returning the entire blob value in the single call. In getdataExample_main.c
deferred blob mode was automatically set when rsqlFetch was called with a NULL second argument.
Positioned Update and Delete Statements
A positioned update/delete statement updates/deletes the current row of an updateable select statement that is
currently being fetched on a separate statement handle within the same connection. Executing a select opens
what is commonly referred to as a cursor which can be thought of as an indicator of the current row in the select
statement's result set. After calling rsqlExecute the cursor is positioned before the first row. A call to
rsqlFetch advances the cursor to the next row if one exists. Associated with each statement handle is a
unique cursor name. This can be set by a call to function rsqlSetCursorName to specify your own cursor
name or you can call function rsqlGetCursorName to get the name automatically assigned by RDM SQL.
Cursor names are not case-sensitive.
The syntax for an updateable select and positioned update and delete statements is shown below.
updateable_select:
    select { * | column_name [, column_name]...} from table_spec
        [where conditional_expr]
        for update [of column_name [, column_name]...]
positioned_update_stmt:
    update [db_name.]table_name
        set column_name = expression[, column_name = expression]...
        where current of cursor_name
positioned_delete_stmt:
    delete from [db_name.]table_name
        where current of cursor_name
Only an updateable select statement can be used with a positioned update/delete. An updateable select must
adhere to the following rules:
1. Only one table can be listed in the from clause.
2. Result columns must not contain any expressions.
3. No distinct, order by or group by is allowed.
4. The for update clause must be specified.
5. If an of clause is specified then each of the specified column names must also appear in the select result set.
For a positioned update the columns that can be assigned new values in the set clause must be specified in the
corresponding select statement's result set and, if specified, listed in the for update of clause. Any columns
declared in the table can be referenced in the update (i.e., used in the set assignment of one of the updateable
columns).
A simple example program which performs a positioned delete is shown below. A positioned update would be
done similarly.
Example Program: pos_delExample_main.c
 1  #include "rsql.h"
 2
 3  static void gettext(
    ...
17
18  /* =======================================================================
19      RDM SQL positioned delete example
20  */
21  int main()
22  {
23      RSQL_ERRCODE      stat;
24      HCONN             hdbc;
25      HSTMT             sel_hstmt, del_hstmt;
26      const RSQL_VALUE *row;
27      char              reply[30];
28
29      rsqlAllocConn(&hdbc);
30      stat = rsqlOpenDB(hdbc, "bookshop", "s");
31      if ( stat != errSUCCESS ) {
32          printf("unable to open bookshop database\n");
33          rsqlFreeConn(hdbc);
34          exit((int)stat);
35      }
36      /* set up select statement cursor */
37      rsqlAllocStmt(hdbc, &sel_hstmt);
38      rsqlSetCursorName(sel_hstmt, "book_cursor");
39      rsqlPrepare(sel_hstmt, "select bookid, last_name, title from book for update");
40
41      /* set up delete statement */
42      rsqlAllocStmt(hdbc, &del_hstmt);
43      rsqlPrepare(del_hstmt, "delete from book where current of book_cursor");
44
45      rsqlTransStart(hdbc, NULL);
46
47      rsqlExecute(sel_hstmt);
48
49      while ( rsqlFetch(sel_hstmt, &row, NULL) == errSUCCESS ) {
50          printf("bookid   : %s\n", row[0].vt.cv);
51          printf("last_name: %s\n", row[1].vt.cv);
52          printf("title    : %s\n", row[2].vt.cv);
53          gettext("do you want to delete this book (y|n)?", reply, sizeof(reply));
54          if ( reply[0] == 'y' )
55              rsqlExecute(del_hstmt);
56
57          gettext("continue (y|n)?", reply, sizeof(reply));
58          if ( reply[0] != 'y' )
59              break;
60      }
61
62      rsqlTransCommit(hdbc);
63
64      rsqlFreeStmt(sel_hstmt);
65      rsqlFreeStmt(del_hstmt);
66      rsqlFreeConn(hdbc);
67      exit(0);
68  }
Two statement handles are allocated on the same connection handle: sel_hstmt (line 37) is used for the
select statement and del_hstmt (line 42) is used for the delete. After allocating sel_hstmt, function
rsqlSetCursorName is called to set the cursor name to "book_cursor". This call could have been made after the
call to rsqlPrepare but must be made before the call to rsqlExecute. The select is compiled at line 39. Note
that the for update clause must be specified. The delete statement is compiled at line 43. The where current of
clause identifies this as a positioned delete. Function rsqlTransStart is called at line 45 before the select is
executed at line 47. The rsqlFetch while loop retrieves and displays each row and gives the user the option of
deleting that row. If the reply begins with 'y' (so "yes", "yo", "yea", "ya", "you better not", etc. will all delete
the book from the database) then that row is deleted. The process continues as long as the reply to the prompt at
lines 57-58 begins with 'y'.
When the loop exits, rsqlTransCommit commits the changes to the database. Note that rsqlCloseStmt is not
explicitly called. This is because rsqlFreeStmt will close the cursor automatically. However, if more processing
is to be done with sel_hstmt then rsqlCloseStmt must be called before proceeding. That's really all there is to
it. Of course, a real application would probably have a more user-friendly interface and properly handle the
return codes from the function calls!
User-Defined Functions (UDFs) in SQL
Civilization advances by extending the
number of important operations which
we can perform without thinking about them.
- Alfred North Whitehead, Introduction to Mathematics (1911)
A User-Defined Function (UDF) is an application-specific function used just like the RDM SQL scalar and aggregate functions as described in the Retrieving Data from a Database section, but developed to meet the specific
needs of your application. UDFs are created in a C program module that conforms to a pre-defined API that will
be called by the SQL runtime system whenever the specific function is used in an SQL statement.
Your UDF can be either a scalar or an aggregate function. A scalar UDF operates on a single row and returns
a single value. An aggregate function is used with the group by clause of a select statement and performs computations on sets of rows that result from the select statement.
This section will show you how to write an RDM SQL UDF in C through two simple example UDFs: a scalar UDF
that implements a soundex code for names, and an aggregate UDF that counts the number of occurrences of a
column (or expression) of type character that match a specified string.
The soundex function takes a single character string argument that should contain the name of a person beginning with the last name. It returns the 4 character soundex code based on the rules given in the Wikipedia article
"soundex" (http://en.wikipedia.org/wiki/Soundex). If the string does not conform to a name, the function returns
code "xERR". For example, the following query returns the name and soundex code for each row of the person
table in the nsfawards database.
select name, soundex(name) from person;
The example aggregate UDF is called matchcount and takes two character arguments. The first is a column
or string expression and the second is a character column or string expression that the first is to match. The function tracks the count of the number of matches that are encountered in each group. For example, the query
below returns the counts of the number of person table rows in the nsfawards database of male, female, and
unknown gender.
select matchcount(gender,"F"), matchcount(gender,"M"), matchcount(gender,"U")
from person;
matchcount(gender, "F") matchcount(gender, "M") matchcount(gender, "U")
                  17537                   57385                   10982
UDF Load Table Definition and Registration
A UDF implementation consists of the seven C functions described in the following table.
Table 1. UDF Implementation Functions
Function Entry  Description                                  When Called by SQL
--------------  -------------------------------------------  ---------------------------------------
udfCheck        Checks argument types and returns result     When SQL statement is compiled.
                data type.
udfInit         Initializes a given execution of the UDF,    When SQL statement is executed.
                usually needed to allocate memory for any
                needed UDF context data.
udfTerm         Performs any needed cleanup, usually to      When execution completes or when the
                free any memory allocated by the udfInit     cursor is closed (on a select
                or udfCall functions.                        statement).
udfScalarCall   Performs one execution of the scalar         When next row is processed.
                function.
udfAggCall      Performs one execution of the aggregate      When next row of group is processed.
                function for each row of the group.
udfAggResult    Called to return the aggregate computation   Either during or after aggregate
                value.                                       accumulation.
udfAggReset     Resets the aggregate calculation.            When group changes.
The entry points for these functions are provided through a UDF load table that is passed from your application to
the RDM SQL system by calling function rsqlRegisterUDFs. This table is an array of type UDFLOADTABLE
defined in header file rsqltypes.h (automatically included with header file rsql.h) and shown below.
typedef struct udfloadtable {
    char           udfName[NAMELEN]; /* name of user function */
    SQL_T          udfType;          /* data type of return value */
    PUDFCHECK      udfCheck;         /* address of arg type checking function */
    PUDFINIT       udfInit;          /* address of initialization function */
    PUDFINIT       udfTerm;          /* address of termination function */
    PUDFSCALARCALL udfScalarCall;    /* address of user function */
    PUDFAGGCALL    udfAggCall;       /* address of user function */
    PUDFAGGRESULT  udfAggResult;     /* address of user function */
    PUDFRESET      udfAggReset;      /* address of aggregate reset function */
} UDFLOADTABLE;
The first field in the table, udfName, is a char string containing the name of the UDF that will be used in SQL
statements. The second field, udfType, is the data type of the value returned by the function. If the return type
of the function depends on the type of its argument then this should be set to tNOVAL. In any case, the data type
returned by function udfCheck is the type that is used by SQL during compilation. The other fields in UDFLOADTABLE contain pointers to the functions that implement the UDF. Note that udfInit, udfTerm, udfScalarCall, udfAggCall, udfAggResult and udfAggReset can all be NULL. However,
udfScalarCall must be specified and all three udfAgg functions must be NULL for a scalar UDF. Similarly,
all three udfAgg functions must be specified and udfScalarCall must be NULL for an aggregate UDF. Each
of the seven implementation functions must conform to its prototype definition given in header file
rsqltypes.h as follows.
typedef RSQL_ERRCODE (EXTERNAL_FCN UDFCHECK)( /* udfCheck */
    HSTMT             hstmt,    /* in: statement handle */
    void             *pRegCtx,  /* in: ptr to registration context */
    uint16_t          noargs,   /* in: number of arguments */
    const RSQL_VALUE *pArgs,    /* in: ptr to array of arg values (types) */
    SQL_T            *pType,    /* out: result data type */
    int16_t          *pDeterm); /* out: deterministic fcn flag (0 or 1) */

typedef RSQL_ERRCODE (EXTERNAL_FCN UDFINIT)( /* udfInit */
    HSTMT             hstmt,    /* in: statement handle */
    void             *pRegCtx,  /* in: ptr to registration context */
    void             *pFcnCtx); /* in: ptr to fcn execution context data area */

typedef void (EXTERNAL_FCN UDFTERM)( /* udfTerm */
    HSTMT             hstmt,    /* in: statement handle */
    void             *pFcnCtx); /* in: ptr to fcn execution context data area */

typedef RSQL_ERRCODE (EXTERNAL_FCN UDFSCALARCALL)( /* udfScalarCall */
    HSTMT             hstmt,    /* in: statement handle */
    void             *pFcnCtx,  /* in: ptr to fcn execution context data area */
    uint16_t          noargs,   /* in: number of arguments */
    const RSQL_VALUE *pArgs,    /* in: ptr to array of argument values */
    RSQL_VALUE       *pResult); /* out: ptr to function result value */

typedef RSQL_ERRCODE (EXTERNAL_FCN UDFAGGCALL)( /* udfAggCall */
    HSTMT             hstmt,    /* in: statement handle */
    void             *pFcnCtx,  /* in: ptr to fcn execution context data area */
    uint16_t          noargs,   /* in: number of arguments */
    const RSQL_VALUE *pArgs);   /* in: ptr to array of argument values */

typedef RSQL_ERRCODE (EXTERNAL_FCN UDFAGGRESULT)( /* udfAggResult */
    HSTMT             hstmt,    /* in: statement handle */
    void             *pFcnCtx,  /* in: ptr to fcn execution context data area */
    RSQL_VALUE       *pResult); /* out: ptr to function result value */

typedef RSQL_ERRCODE (EXTERNAL_FCN UDFAGGRESET)( /* udfAggReset */
    HSTMT             hstmt,    /* in: statement handle */
    void             *pFcnCtx); /* in: ptr to fcn execution context data area */
The function names are italicized to indicate that they can be named whatever you like. Note that the first argument to each function is a statement handle. This is the statement handle of the SQL statement that contains the
reference to the UDF. You will only need to use this argument when your UDF needs to make calls to the RDM
SQL functions. Details on how to do this will be discussed later on in this section.
The code snippet below is from the example UDF C module udf.c (contained in the GettingStarted\examples\sqlUDF directory) and shows the definition of the UDFLOADTABLE for the soundex and matchcount functions. Each uses a predefined prototype (e.g., UDFCHECK) to ensure that the
arguments are properly defined.
/* UDF functions for soundex */
static UDFCHECK      SndxCheck;
static UDFSCALARCALL SndxCall;

/* user function for matchcount */
static UDFCHECK      CntCheck;
static UDFAGGCALL    CntCall;
static UDFAGGRESULT  CntResult;
static UDFAGGRESET   CntReset;

/*--------------------------------------------------------------------------
    Table of user-defined functions for this module
--------------------------------------------------------------------------*/
/* table of user functions callable from within an sql expression */
const UDFLOADTABLE UdfTable[] = {
    /*                                               Scalar    Aggregate----------------- */
    /* Name          Type     Check      Init  Term  Call      Call     Result     Reset  */
    /* ------------  -------  ---------  ----  ----  --------  -------  ---------  ------ */
    {"soundex",      tCHAR,   SndxCheck, NULL, NULL, SndxCall, NULL,    NULL,      NULL},
    {"matchcount",   tBIGINT, CntCheck,  NULL, NULL, NULL,     CntCall, CntResult, CntReset}
};
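The scalar/aggregate rule stated earlier (a scalar entry supplies udfScalarCall and leaves all three udfAgg entries NULL; an aggregate entry does the reverse) can be checked mechanically before a table is registered. The following is only a sketch: LOAD_ENTRY, FN, ENTRY_KIND, classify_entry and noop are hypothetical names introduced for illustration, and LOAD_ENTRY is a simplified stand-in for UDFLOADTABLE that keeps only the four pointers the rule depends on.

```c
#include <stddef.h>

/* Hypothetical stand-in for UDFLOADTABLE: generic function pointers replace
   the real PUDF... pointer types, and only the relevant fields are kept. */
typedef void (*FN)(void);

typedef struct {
    const char *name;        /* UDF name as used in SQL */
    FN          scalarCall;  /* role of udfScalarCall */
    FN          aggCall;     /* role of udfAggCall */
    FN          aggResult;   /* role of udfAggResult */
    FN          aggReset;    /* role of udfAggReset */
} LOAD_ENTRY;

typedef enum { ENTRY_SCALAR, ENTRY_AGGREGATE, ENTRY_INVALID } ENTRY_KIND;

/* Placeholder function used to fill non-NULL slots in examples */
static void noop(void) {}

/* Classify one table entry according to the rule in the text */
static ENTRY_KIND classify_entry(const LOAD_ENTRY *e)
{
    /* count how many of the three aggregate entry points are supplied */
    int nagg = (e->aggCall != NULL) + (e->aggResult != NULL)
             + (e->aggReset != NULL);

    if ( e->scalarCall != NULL && nagg == 0 )
        return ENTRY_SCALAR;        /* scalar call only */
    if ( e->scalarCall == NULL && nagg == 3 )
        return ENTRY_AGGREGATE;     /* all three aggregate calls, no scalar */
    return ENTRY_INVALID;           /* any other mix violates the rule */
}
```

Applied to entries shaped like the two rows of UdfTable, classify_entry reports the soundex-style entry as scalar and the matchcount-style entry as aggregate; an entry that supplies both a scalar call and an aggregate call is reported as invalid.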
RDM SQL is informed about the existence of these functions by the application through a call to function rsqlRegisterUDFs (which must occur before compiling/executing any SQL statement that references them).
The code snippet below shows how this is done.
extern const UDFLOADTABLE UdfTable[];
extern const size_t szUdfCtx;

MyApplication()
{
    HCONN hdbc;

    if ( rsqlAllocConn(&hdbc) == errSUCCESS ) {
        rsqlRegisterUDFs(hdbc, 2, UdfTable, NULL, szUdfCtx);
        ...
}
Five arguments are passed into function rsqlRegisterUDFs: the connection handle, the number of entries in
the UDF load table, the address of the UDF load table, a pointer to a user registration context data area (which
can be NULL if unnecessary), and the maximum size that is needed for a UDF execution context (e.g., aggregate
functions in particular will use this space to keep track of computationally important data from each detail row of
the set of rows comprising each aggregate). The prototype for rsqlRegisterUDFs is given below. Note that
only one call to this function is allowed for any given connection.
RSQL_ERRCODE EXTERNAL_FCN rsqlRegisterUDFs(
    HCONN               hConn,     /* in: connection handle */
    uint16_t            noudfs,    /* in: number of UDFs */
    const UDFLOADTABLE *udftab,    /* in: ptr to UDF load table */
    void               *pRegCtx,   /* in: ptr to user's registration context */
    const size_t        szFcnCtx); /* in: size of function context space to be alloc'd */
The pRegCtx can be used by the application program to pass in any application-specific, execution-independent data that will be needed by one or more UDFs. If no registration context is needed the pRegCtx argument should be NULL. The specified pRegCtx pointer is passed to the udfCheck and udfInit functions.
The szFcnCtx needs to be set to the largest context data area used for all of the UDFs. This space will be automatically allocated by the RDM SQL engine and passed to the execution-time UDF functions (all but
udfCheck). If no function context is needed then szFcnCtx should be 0.
UDF Type Checking Function: udfCheck
This function is called by SQL during compilation (i.e. rsqlPrepare) of a SQL statement that contains a reference to the UDF. Six arguments are passed into the udfCheck function as described in the following table.
Table 2. Function udfCheck Argument Descriptions
Argument  Type          Description
--------  ------------  ------------------------------------------------------------------------
hStmt     HSTMT         Statement handle of SQL statement referencing this UDF
pRegCtx   void *        Pointer to the user program allocated registration context data area that
                        was originally passed in through the call to rsqlRegisterUDFs.
noargs    uint16_t      Number of arguments specified in SQL statement's UDF call
args      RSQL_VALUE *  Array of noargs argument value entries. The first argument is contained in
                        args[0]. As this function is called during compilation, only the data type
                        specified in each args entry should be referenced as the actual data value
                        will only be present for literal constant arguments.
fcntype   SQL_T *       The data type of the value that will be returned by the UDF is returned in
                        this output variable.
pDeterm   int16_t *     Set to 1 to indicate that the function is deterministic otherwise set to 0. A
                        function is deterministic if it always returns the same value for the same
                        arguments. SQL will call deterministic functions at compile time when all of
                        the argument values are known (i.e., literals) and replace the call with the
                        result value in the compiled code.
If no errors are detected the function needs to return status errSUCCESS. If an error is detected, then the status
code associated with that particular error needs to be returned by the udfCheck function. The specific error
code that is returned can be any of the RDM SQL codes but it is recommended that the following codes be used.
Table 3. UDF Error Return Codes
Error Code    Description
------------  --------------------------------------
errUDFNOARGS  Incorrect number of function arguments
errUDFARG     Invalid function argument type
errUDF        Other UDF error
Most of the time only the data type from the args RSQL_VALUE array (e.g., args[0].type) needs to be
inspected, as the actual data value will only be present when a literal constant value is being passed to the
function. In order to know which arguments have a literal value, the status field of RSQL_VALUE can be checked
(e.g., args[0].status). When a value is present the status will be set to vsOKAY; if no value is present the
status will be set to vsNOVAL. You can use this, for example, when you want to define an argument for a
particular function that is only allowed to take a literal constant. If an argument was specified using a parameter
marker then its corresponding type will be tPARAMREF, or if the argument is a stored procedure argument the
type will be tPROCVAR. In either case, the actual type checking will need to be done at execution time by the
udfScalarCall/udfAggCall function.
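The checks described above can be combined as in the following sketch of a compile-time argument check for a hypothetical two-argument UDF whose second argument must be a literal character constant. This is an illustration under stated assumptions, not code from udf.c: the enumerations below are cut-down stand-ins that reuse the constant names from the text (the real definitions in rsqltypes.h have more members and possibly different values), and ARG_INFO, VAL_STATUS, check_args and the pRecheck flag are invented for the example.

```c
#include <stdint.h>
#include <stddef.h>

/* Stand-ins for rsqltypes.h definitions; names of the constants follow the
   text, everything else (values, type names) is assumed for illustration. */
typedef enum { tNOVAL, tCHAR, tVARCHAR, tPARAMREF, tPROCVAR } SQL_T;
typedef enum { vsOKAY, vsNOVAL } VAL_STATUS;
typedef enum { errSUCCESS, errUDFNOARGS, errUDFARG } RSQL_ERRCODE;

/* Hypothetical subset of RSQL_VALUE: only the fields inspected here */
typedef struct { SQL_T type; VAL_STATUS status; } ARG_INFO;

static int is_char_type(SQL_T t) { return t == tCHAR || t == tVARCHAR; }
static int is_deferred(SQL_T t)  { return t == tPARAMREF || t == tPROCVAR; }

/* Sets *pRecheck to 1 when the first argument's type is unknown at compile
   time, so the execution-time call function must validate it. */
static RSQL_ERRCODE check_args(uint16_t noargs, const ARG_INFO *args,
                               int *pRecheck)
{
    *pRecheck = 0;
    if ( !args || noargs != 2 )
        return errUDFNOARGS;

    /* second argument: must be a character literal, i.e. a value is present */
    if ( !is_char_type(args[1].type) || args[1].status != vsOKAY )
        return errUDFARG;

    /* first argument: character data, or a parameter marker / procedure
       argument whose type can only be checked at execution time */
    if ( is_deferred(args[0].type) )
        *pRecheck = 1;
    else if ( !is_char_type(args[0].type) )
        return errUDFARG;

    return errSUCCESS;
}
```

The literal-only requirement is enforced purely through the status field: a column reference of type tCHAR passes the type test but carries status vsNOVAL at compile time, so it is rejected as the second argument.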
The data type returned by the UDF is returned through the pType argument. The valid RDM SQL_T data type
values that can be returned by a UDF are specified in the table below.
Table 4. SQL Data Type Values
SQL Data Type  SQL_T value      C Data Type
-------------  ---------------  -----------
char           tCHAR            char
varchar        tVARCHAR         char
wchar          tWCHAR           wchar_t
wvarchar       tWVARCHAR        wchar_t
binary         tBINARY          uint8_t
varbinary      tVARBINARY       uint8_t
boolean        tBOOL            int8_t
tinyint        tTINYINT         int8_t
smallint       tSMALLINT        int16_t
integer        tINTEGER         int32_t
bigint         tBIGINT          int64_t
real           tREAL            float
float, double  tFLOAT, tDOUBLE  double
date           tDATE            int32_t
time           tTIME            int32_t
timestamp      tTIMESTAMP       int64_t
The udfCheck implementation for the soundex UDF is given below.
/* ======================================================================
    Soundex - type checking function (1 argument == name to be encoded)
*/
static RSQL_ERRCODE EXTERNAL_FCN SndxCheck(
    HSTMT             hStmt,    /* in: statement handle */
    void             *pRegCtx,  /* in: ptr to registration context */
    uint16_t          noargs,   /* in: number of arguments to function */
    const RSQL_VALUE *args,     /* in: array of argument values */
    SQL_T            *fcntype,  /* out: result data type */
    int16_t          *pDeterm)  /* out: = 1 deterministic */
{
    RSQL_ERRCODE status;

    UNREF_PARM(hStmt)
    UNREF_PARM(pRegCtx)

    if ( !args || noargs != 1 )
        status = errUDFNOARGS;
    else if ( args->type != tNOVAL && args->type != tCHAR && args->type != tVARCHAR )
        status = errUDFARG;
    else {
        status = errSUCCESS;
        *fcntype = tCHAR;
        *pDeterm = 1;
    }
    return status;
}
When an argument has been specified with a parameter marker, SQL will not know its data type at compilation
time. In those situations, the argument type will be tNOVAL and it is therefore a good idea to allow this by the
udfCheck function. So you can see that both tNOVAL and tCHAR/tVARCHAR are allowed in the soundex type
checking function. This also means that the udfScalarCall function will also need to validate the argument
type.
The soundex function is deterministic (i.e., always computes the same value for a particular set of argument
values), so it sets *pDeterm to 1. This means that when all of the argument values for a particular call are literals then SQL will call udfInit, udfScalarCall, and udfTerm when the statement that references the
UDF is compiled and then replace the call with the literal result value in the compiled statement code.
The udfCheck function for the matchcount UDF is as follows.
/* ======================================================================
    Type checking call, used for matchcount() UDF
*/
static RSQL_ERRCODE EXTERNAL_FCN CntCheck (
    HSTMT             hStmt,    /* in: system handle */
    void             *pRegCtx,  /* in: ptr to registration context */
    uint16_t          noargs,   /* in: number of arguments to function */
    const RSQL_VALUE *args,     /* in: array of argument values */
    SQL_T            *fcntype,  /* out: result data type */
    int16_t          *pDeterm)  /* out: = 0: not deterministic */
{
    RSQL_ERRCODE stat;

    UNREF_PARM(hStmt)
    UNREF_PARM(pRegCtx)

    if ( noargs != 2 )
        stat = errUDFNOARGS;
    /* reject the call if either argument has an unacceptable type */
    else if ( (args[0].type != tNOVAL
               && args[0].type != tCHAR && args[0].type != tVARCHAR)
           || (args[1].type != tNOVAL
               && args[1].type != tCHAR && args[1].type != tVARCHAR) )
        stat = errUDFARG;
    else {
        stat = errSUCCESS;
        *fcntype = tBIGINT;
        *pDeterm = 0;
    }
    return stat;
}
UDF Initialization Function: udfInit
The udfInit function is called by RDM SQL when the SQL statement containing the UDF call is executed
(rsqlExecute). This function is used to initialize data that needs to survive multiple calls to the udfScalarCall or udfAggCall functions during the processing of the SQL statement. The pointer to this allocated memory is called the function context pointer and is passed to the udfInit function (as well as each of
the other execution-time functions) through the pFcnCtx argument. If no initialization is needed then this function is unnecessary and its entry in the UDFLOADTABLE can be assigned to NULL (as is the case with both the
soundex and matchcount UDFs).
The three arguments that are passed to the udfInit function are described below.
Table 5. Function udfInit Argument Descriptions
Argument  Type    Description
--------  ------  ------------------------------------------------------------------------
hStmt     HSTMT   Statement handle of SQL statement referencing this UDF
pRegCtx   void *  Pointer to the user program allocated registration context data area that
                  was originally passed in through the call to rsqlRegisterUDFs.
pFcnCtx   void *  Pointer to the user function context data area.
The context data is typically defined as a struct type with fields defined for any of the data that needs to survive
the calls to the udfScalarCall or udfAggCall functions. For example, the context declarations for the soundex and matchcount functions are given below.
/* Soundex UDF data context packet */
typedef struct sndx_ctx {
    char sndx[5];       /* code buffer needs to survive each soundex() call */
} SNDX_CTX;

/* Matchcount UDF data context packet */
typedef struct count_cxt {
    RSQL_ERRCODE stat;  /* CntCall error status */
    int64_t      count; /* Current match count */
} COUNT_CTX;

const size_t szUdfCtx = RDM_MAX(sizeof(SNDX_CTX), sizeof(COUNT_CTX));
Note how the szUdfCtx variable is initialized to the maximum of the sizes of the two struct typedefs. This
is the variable that is passed in to rsqlRegisterUDFs to specify the amount of space the RDM SQL system
will allocate for the UDF function context.
The sndx field will contain the last soundex code returned by the udfScalarCall function. It is placed in the
UDF context so that repeated allocations for the code string do not have to occur on each call. The count field of
COUNT_CTX keeps track of the match count for the current aggregate set. The stat field is simply used by the
udfAggCall function to inform the udfAggResult function of an argument error.
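To make the roles of the count and stat fields concrete, the sketch below walks a context of the same shape through the reset, per-row call, and result steps of one aggregate group. It is a standalone illustration, not the udf.c implementation: DEMO_COUNT_CTX, demo_reset, demo_call and demo_result are hypothetical names, a plain int stands in for RSQL_ERRCODE, and C strings stand in for RSQL_VALUE arguments.

```c
#include <stdint.h>
#include <string.h>
#include <stddef.h>

/* Same shape as COUNT_CTX, with int standing in for RSQL_ERRCODE */
typedef struct {
    int     stat;   /* 0 = no error; nonzero = argument error seen in a row */
    int64_t count;  /* matches counted so far in the current group */
} DEMO_COUNT_CTX;

/* Role of udfAggReset: start a fresh group */
static void demo_reset(DEMO_COUNT_CTX *ctx)
{
    ctx->stat = 0;
    ctx->count = 0;
}

/* Role of udfAggCall: process one detail row of the group */
static void demo_call(DEMO_COUNT_CTX *ctx, const char *value, const char *pattern)
{
    if ( !value || !pattern )
        ctx->stat = 1;              /* remember the error for the result step */
    else if ( strcmp(value, pattern) == 0 )
        ++ctx->count;
}

/* Role of udfAggResult: report the group's value, or the remembered error.
   Returns 0 on success and stores the count in *pCount. */
static int demo_result(const DEMO_COUNT_CTX *ctx, int64_t *pCount)
{
    if ( ctx->stat != 0 )
        return ctx->stat;
    *pCount = ctx->count;
    return 0;
}
```

Because the context outlives each per-row call, the error detected while processing a row can be reported later by the result step, which is the purpose the text gives for the stat field.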
As initialization functions are not needed for the two example UDFs, a stub version is given below.
/* ======================================================================
    Initialization function for generic UDF
*/
static RSQL_ERRCODE EXTERNAL_FCN MyUdfInit (
    HSTMT  hStmt,    /* in: statement handle */
    void  *pRegCtx,  /* in: ptr to registration context */
    void  *pFcnCtx)  /* in: ptr to fcn execution context data area */
{
    MYUDF_CTX *pCtx = (MYUDF_CTX *)pFcnCtx;

    UNREF_PARM(hStmt)
    UNREF_PARM(pRegCtx)

    /* do needed initialization of pCtx */

    return errSUCCESS;
}
UDF Termination Function: udfTerm
The udfTerm function is called after the SQL statement containing the UDF reference has completed executing
which, in the case of a select, means when the cursor has been closed either through the call to rsqlFetch that
returns status errNOMOREDATA (automatically closing the cursor) or through a call to rsqlCloseStmt which
is used to close a cursor before having scrolled completely through it.
The two arguments that are passed to the udfTerm function are described below.
Table 6. Function udfTerm Argument Descriptions
Argument  Type    Description
--------  ------  ------------------------------------------------------
hStmt     HSTMT   Statement handle of SQL statement referencing this UDF
pFcnCtx   void *  Pointer to the user function context data area.
This function is called to perform any needed termination processing when the SQL statement containing the
UDF reference has completed its execution. For example, any memory allocated by the udfInit function
would be freed by udfTerm.
As termination functions are not needed for the two example UDFs, a stub version is given below.
/* ======================================================================
    Termination function for generic UDF
*/
static void EXTERNAL_FCN MyUdfTerm (
    HSTMT  hStmt,    /* in: statement handle */
    void  *pFcnCtx)  /* in: ptr to fcn execution context data area */
{
    MYUDF_CTX *pCtx = (MYUDF_CTX *)pFcnCtx;

    UNREF_PARM(hStmt)

    /* do needed termination from pCtx */
}
Scalar Call Function: udfScalarCall
The udfScalarCall function is called by RDM SQL during execution of the SQL statement containing the
UDF function reference to perform the desired calculation/evaluation. The five arguments to udfScalarCall
are described in the following table.
Table 7. Function udfScalarCall Argument Descriptions
Argument  Type                Description
--------  ------------------  ------------------------------------------------------------------------
hStmt     HSTMT               Statement handle of SQL statement referencing this UDF
pFcnCtx   void *              A pointer to the UDF function context pointer
noargs    uint16_t            Number of arguments (i.e., size of args array)
args      const RSQL_VALUE *  Pointer to an array of noargs argument value entries. The first argument is
                              contained in args[0]. The argument value is contained in the vt field of
                              RSQL_VALUE.
result    RSQL_VALUE *        Pointer to the output RSQL_VALUE variable that will contain the function
                              result value.
The udfScalarCall implementation for the soundex UDF is given below.
 1  /* ======================================================================
 2      Soundex() UDF - return soundex code for specified name
 3  */
 4  static RSQL_ERRCODE EXTERNAL_FCN SndxFunc (
 5      HSTMT             hStmt,    /* in: system handle */
 6      void             *cxtp,     /* in: UDF context pointer */
 7      uint16_t          noargs,   /* in: number of arguments to function */
 8      const RSQL_VALUE *args,     /* in: array of arguments */
 9      RSQL_VALUE       *result)   /* out: result value */
10  {
11      /* Soundex conversion table. See Wikipedia "Soundex" page */
12      static char *codes[] = {"bfpv", "cgjkqsxz", "dt", "l", "mn", "r", "hw", NULL};
13      static char sndxerr[] = "xERR";
14      int         cpos, cndx;
15      char        cur_c, last_c;
16      SNDX_CTX   *scp = (SNDX_CTX *)cxtp;
17      char       *sndx = &scp->sndx[0];
18      char       *name = args->vt.cv;
19
20      UNREF_PARM(hStmt)
21      UNREF_PARM(noargs)
22
23      result->type = tCHAR;
24      result->len  = 0;
25
26      if ( !name || !isalpha(*name)
27                 || (args->type != tCHAR && args->type != tVARCHAR) ) {
28          result->vt.cv = sndxerr;
29          return errSUCCESS;
30      }
31      sndx[0] = toupper(*name++);
32      strcpy(&sndx[1], "000");
33
34      for (last_c = 0, cpos = 1; cpos < 4 && isalpha(*name); ++name) {
35          for (cndx = 0; codes[cndx]; ++cndx) {
36              if ( strchr(codes[cndx], tolower(*name)) ) {
37                  if ( cndx < 6 ) { /* "hw" */
38                      cur_c = '1' + cndx;
39                      if ( cur_c != last_c ) {
40                          sndx[cpos++] = cur_c;
41                          last_c = cur_c;
42                      }
43                  }
44                  break;
45              }
46          }
47          if ( !codes[cndx] )
48              last_c = 0;
49      }
50      result->vt.cv = sndx;
51      return errSUCCESS;
52  }
Function SndxFunc will never be called by SQL without having executed a prior successful call to SndxCheck.
Hence it is certain that noargs is equal to 1 and does not need to be checked. However, it is possible that the
argument type is not equal to tCHAR (or tVARCHAR) because it may have been specified with a parameter
marker that was assigned to a non-tCHAR (or tVARCHAR) variable. Lines 26 to 30 contain a check of the argument types and if they are not correct, rather than returning an error code, SndxFunc returns a special code that
indicates that an error for that particular row occurred. If an actual error code is returned then SQL will abort the
processing at that point, returning the error to the application program. Of course, for many UDFs that will be
exactly the correct thing to do. Note that in this case, the type of the argument could be valid but if the character
string does not begin with a letter then it cannot be a name (the isalpha test at line 26).
The details of the soundex algorithm are not particularly important except to note that the code is a four character code where the first is the upper-case first letter of the name followed by three digits. The result is stored in
the context field, sndx (see lines 17, 31-32, and 40). The result type field is tCHAR (line 23) and the result len
field is zero (line 24) indicating that this is not an SQL allocated string. The pointer to the result string is assigned
to field vt.cv at line 50.
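The encoding loop described above can be tried outside of RDM with a small standalone sketch. It follows the same letter groups and duplicate-suppression logic as SndxFunc, but the function name demo_soundex and the caller-supplied output buffer (which plays the role of the sndx context field) are assumptions made for illustration only and are not part of the RDM API.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical stand-in for the sndx context field:
   4 code characters plus the terminating NUL. */
#define SNDX_LEN 5

/* Writes the soundex code of name into out and returns out.
   Returns NULL if name is NULL or does not begin with a letter. */
static const char *demo_soundex(const char *name, char out[SNDX_LEN])
{
    /* Letter groups for digits '1'..'6'; "hw" (index 6) emits no digit */
    static const char *const codes[] =
        {"bfpv", "cgjkqsxz", "dt", "l", "mn", "r", "hw", NULL};
    int  cpos, cndx;
    char cur_c, last_c = 0;

    assert(out != NULL);
    if (name == NULL || !isalpha((unsigned char)*name))
        return NULL;

    out[0] = (char)toupper((unsigned char)*name++);
    strcpy(&out[1], "000");

    for (cpos = 1; cpos < 4 && isalpha((unsigned char)*name); ++name) {
        int lc = tolower((unsigned char)*name);
        for (cndx = 0; codes[cndx]; ++cndx) {
            if (strchr(codes[cndx], lc)) {
                if (cndx < 6) {
                    cur_c = (char)('1' + cndx);
                    if (cur_c != last_c) {   /* skip adjacent duplicates */
                        out[cpos++] = cur_c;
                        last_c = cur_c;
                    }
                }
                break;
            }
        }
        if (!codes[cndx])
            last_c = 0;   /* vowel or other letter: a repeated digit is allowed again */
    }
    return out;
}
```

Because the caller owns the buffer, the returned pointer stays valid after the function returns, which is the same reason the UDF keeps its result in the context rather than in a local array.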
Note that a locally-declared 5 character array variable could not be used to contain the resulting soundex code and assigned to result->vt.cv because it would go out of scope when the function returns. This is why it is necessary to use the UDF function context to contain the buffer. Moreover, a global variable cannot be used because that is not thread safe should the function be called from another thread in the same program.
Aggregate UDF Call Function: udfAggCall
The udfAggCall function is called by RDM SQL for each detail row from the current set of aggregate rows to perform the detail calculations needed by the aggregate function. The four arguments to udfAggCall are described in the following table.
Table 8. Function udfAggCall Argument Descriptions
Argument   Type                 Description
hStmt      HSTMT                Statement handle of SQL statement referencing this UDF
pFcnCtx    void *               A pointer to the UDF function context pointer
noargs     uint16_t             Number of arguments (i.e., size of args array)
args       const RSQL_VALUE *   Pointer to an array of noargs argument value entries. The first argument is
                                contained in args[0]. The argument value is contained in the vt field of
                                RSQL_VALUE.
The udfAggCall implementation for the matchcount UDF is shown below.
 1  /* ======================================================================
 2     User function for matchcount() UDF
 3  */
 4  static RSQL_ERRCODE EXTERNAL_FCN CntCall (
 5      HSTMT             hStmt,    /* in: system handle */
 6      void             *cxtp,     /* in: UDF context pointer */
 7      uint16_t          noargs,   /* in: number of arguments to function */
 8      const RSQL_VALUE *args)     /* in: array of arguments */
 9  {
10      COUNT_CTX *ccp = cxtp;
11
12      UNREF_PARM(hStmt)
13      UNREF_PARM(noargs)
14
15      if ( args[0].type != tNOVAL && args[1].type != tNOVAL ) {
16          if (args[0].type != tNULL) {
17              if ( (args[0].type != tCHAR && args[0].type != tVARCHAR)
18                 ||(args[1].type != tCHAR && args[1].type != tVARCHAR) )
19                  ccp->stat = errUDFARG;
20              else {
21                  ccp->stat = errSUCCESS;
22                  if ( strstr(args[0].vt.cv, args[1].vt.cv) )
23                      ++ccp->count;
24              }
25          }
26      }
27      return errSUCCESS;
28  }
The count field of the UDF context COUNT_CTX is declared as type int64_t (the _t integer types are defined in the RDM header files). It is used to contain the count of the number of calls to CntCall in which the two arguments match. There are two points in this example to which you will want to pay particular attention.
First, notice the check for tNOVAL at line 15 and the check for tNULL at line 16. In the implementation of an aggregate function, the tNOVAL types will be passed in on the initial call to the function for each aggregate set, so they should not be considered erroneous, but no computation needs to occur. It is also possible that a null argument can be passed in, and this too needs to be allowed. Note that in standard SQL, aggregate computations are supposed to ignore nulls. In this example that has no effect on the result. However, it does matter with any computation that depends on the number of candidate rows.
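To see why ignoring nulls matters when a computation depends on the number of candidate rows, consider a simple average. The sketch below is not RDM code: DEMO_ARG, DEMO_AVG_CTX and the demo_avg_* functions are hypothetical stand-ins that only mimic the shape of an aggregate's context and call/result functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration only; these are not RDM types. */
typedef struct {
    int    is_null;   /* nonzero when the row's value is SQL NULL */
    double value;
} DEMO_ARG;

typedef struct {
    double  sum;      /* running sum of non-null values */
    int64_t count;    /* number of non-null rows seen   */
} DEMO_AVG_CTX;

static void demo_avg_reset(DEMO_AVG_CTX *ctx)
{
    ctx->sum = 0.0;
    ctx->count = 0;
}

/* Per-row step: a null row contributes to neither the sum nor the count */
static void demo_avg_call(DEMO_AVG_CTX *ctx, const DEMO_ARG *arg)
{
    if (arg->is_null)
        return;
    ctx->sum += arg->value;
    ++ctx->count;
}

/* Stores the mean in *out and returns 0, or returns -1 when no
   non-null rows have been seen (the SQL result would be NULL). */
static int demo_avg_result(const DEMO_AVG_CTX *ctx, double *out)
{
    if (ctx->count == 0)
        return -1;
    *out = ctx->sum / (double)ctx->count;
    return 0;
}
```

With the rows 10, NULL, 20 this yields 15. Had the null row been counted, the divisor would be 3 and the result 10, which is not what standard SQL prescribes.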
Second, lines 17-20 show how error handling from within the udfAggCall function needs to be done. It is not quite the same as in the udfCheck function, where a simple status code is returned. Two methods for returning an error can be used. In this example, CntCall saves the error code (errUDFARG) in the stat field of the context (line 19); the udfAggResult function shown in the next section then sets result->type to tSMALLINT and result->vt.sv to that code and returns status errSQLERROR. SQL will then return the specified status along with the name of the UDF to the application from the invoking function (either rsqlExecute or rsqlFetch). Another method is to set result->type to tCHAR and assign a pointer to a static char string error message to result->vt.cv. SQL will then return that message along with the UDF name in the error info buffer associated with that statement (retrievable through a call to function rsqlGetErrorInfo) and return error code errUDF to the application from the invoking function (rsqlExecute or rsqlFetch). This alternative approach could be coded for CntCall as follows.
17      if ( args[0].type != tCHAR || args[1].type != tCHAR ) {
18          result->type = tCHAR;
19          result->vt.cv = "invalid argument type";
20          return errSQLERROR;
Aggregate UDF Result Function: udfAggResult
The udfAggResult function is called by RDM SQL during execution of the SQL statement containing the UDF
function reference to perform and return the desired aggregate calculation result. This function is designed to be
called once after all of the detail rows have been processed. However, at this time, RDM SQL actually calls this
function after each detail row has been fetched and after the udfAggCall function has been called. So, this
function should never reset the aggregate computational value—that is the job of the udfAggReset function
described in the next section. The three arguments to udfAggResult are described in the following table.
Table 9. Function udfAggResult Argument Descriptions
Argument   Type           Description
hStmt      HSTMT          Statement handle of SQL statement referencing this UDF
pFcnCtx    void *         A pointer to the UDF function context pointer
result     RSQL_VALUE *   Pointer to the output RSQL_VALUE variable that will contain the function
                          result value.
The udfAggResult implementation for the matchcount UDF is given below.
/* ======================================================================
   User function for matchcount() UDF
*/
static RSQL_ERRCODE EXTERNAL_FCN CntResult (
    HSTMT       hStmt,    /* in: system handle */
    void       *cxtp,     /* in: UDF context pointer */
    RSQL_VALUE *result)   /* out: result value */
{
    RSQL_ERRCODE stat;
    COUNT_CTX *ccp = (COUNT_CTX *)cxtp;

    UNREF_PARM(hStmt)

    if ( ccp->stat != errSUCCESS ) {
        result->type = tSMALLINT;
        result->vt.sv = (int16_t) ccp->stat;
        stat = errSQLERROR;
    }
    else {
        result->type   = tBIGINT;
        result->vt.llv = ccp->count;
        stat = errSUCCESS;
    }
    return stat;
}
Aggregate UDF Reset Function: udfAggReset
The udfAggReset function is only used with aggregate UDFs. Its function is to reset the aggregated computational result to its initial value. The function is called by SQL each time the group by column values change.
The two arguments that are passed to the udfReset function are described below.
Table 10. Function udfReset Argument Descriptions
Argument   Type     Description
hStmt      HSTMT    Statement handle of SQL statement referencing this UDF
ctxp       void *   A pointer to the allocated UDF context pointer containing the aggregated
                    computational result value.
The udfReset implementation for the matchcount UDF is shown below. As it is quite trivial no further comment is needed.
/* ======================================================================
   Reset function for matchcount() UDF
*/
static RSQL_ERRCODE EXTERNAL_FCN CntReset(
    HSTMT  hStmt,    /* in: system handle */
    void  *cxtp)     /* in: UDF context pointer */
{
    COUNT_CTX *ccp = (COUNT_CTX *)cxtp;

    UNREF_PARM(hStmt)

    ccp->count = 0;
    return errSUCCESS;
}
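The way the three aggregate functions cooperate can be illustrated with a small simulation. This is only a sketch under stated assumptions: the driver demo_run stands in for the SQL engine, and every demo_* name and type is hypothetical. It resets the context whenever the group value changes, calls the per-row function for each row, and reads the result after every row, which is why the result function must leave the context untouched.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins; none of these are RDM declarations. */
typedef struct { int64_t count; } DEMO_COUNT_CTX;
typedef struct { int group; const char *text; } DEMO_ROW;

static void demo_reset(DEMO_COUNT_CTX *ctx) { ctx->count = 0; }

/* Per-row step: count rows whose text contains pat */
static void demo_call(DEMO_COUNT_CTX *ctx, const char *text, const char *pat)
{
    if (text && pat && strstr(text, pat))
        ++ctx->count;
}

/* Read-only: may be called after every row without disturbing the count */
static int64_t demo_result(const DEMO_COUNT_CTX *ctx) { return ctx->count; }

/* Simulated engine. rows must be ordered by group. Stores the latest
   result for each group in out[] and returns the number of groups. */
static size_t demo_run(const DEMO_ROW *rows, size_t norows, const char *pat,
                       int64_t *out, size_t maxgroups)
{
    DEMO_COUNT_CTX ctx = {0};
    size_t i, ngroups = 0;

    for (i = 0; i < norows; ++i) {
        if (i == 0 || rows[i].group != rows[i - 1].group) {
            if (ngroups == maxgroups)
                break;
            demo_reset(&ctx);            /* group by value changed */
            ++ngroups;
        }
        demo_call(&ctx, rows[i].text, pat);
        out[ngroups - 1] = demo_result(&ctx);   /* result after each row */
    }
    return ngroups;
}
```

If demo_result cleared the count, every group would report only its last row, which is the failure mode the text warns about.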
Calling RSQL API Functions from a UDF
If your UDF needs to make calls to the RDM SQL API functions there are some important things that you need to
know. The statement handle that is passed into each of the UDF implementation functions is the one associated
with the statement containing the call to the UDF. There are only a limited number of functions that can be safely
called using this statement handle as listed in the table below.
Table 11. Function Calls that Can Be Made Using hStmt
Function               Description
rsqlGetColDescr        Get description information for a select statement result column
rsqlGetConnHandle      Get connection handle associated with specified statement handle
rsqlGetCursorName      Get the cursor name associated for the specified statement handle
rsqlGetNumParams       Get the number of parameter markers in the compiled statement
rsqlGetNumResultCols   Get the number of result columns in the compiled select statement
rsqlGetParamDescr      Get description information for a SQL statement parameter marker
rsqlGetRowCount        Get the count of the # of rows affected by the executed statement
rsqlGetSelectType      Get the type of select statement
rsqlGetStmtState       Get the statement handle's statement state
rsqlGetStmtString      Get the SQL statement string
rsqlGetStmtType        Get the statement type of the prepared statement
rsqlGetTableName       Get result column’s table name
Calls to any other RDM SQL API function into which you pass hStmt will return error code errNOTINUDF.
Most often you will want to allocate a new statement handle to use within the UDF. Function rsqlGetConnHandle must be called to retrieve the connection handle associated with the calling statement handle.
You can then pass this into rsqlAllocStmt in order to allocate a statement handle for use within the UDF.
If the UDF is deterministic, it may be important to know whether the UDF is being called during compilation or
execution. This can be discovered via a call to function rsqlGetStmtState using the original statement handle. Note that when called during compilation, the locks that are needed by the invoking statement cannot be
guaranteed to be in place when the UDF is called. If the UDF relies on those locks then udfCheck needs to indicate that the UDF is not deterministic.
You can also use the connection handle returned from the call to rsqlGetConnHandle to call some, but not
all, connection-related RDM SQL API calls. The following table lists those functions which can be called.
Table 12. Function Calls that Can Be Made Using hStmt's Connection Handle
Function               Description
rsqlAllocStmt          Allocate a statement handle
rsqlCloseDB            Close a database
rsqlGetAutoCommit      Get the connection handle’s current auto commit status
rsqlGetDateFormat      Get the current date format setting
rsqlGetDateSeparator   Get the current date separator character
rsqlGetDBNames         Get a list of the names of the currently opened databases
rsqlGetDBTask          Get the RDM task handle associated with a connection handle
rsqlGetGenCFiles       Get the connection handle's "generate C files" mode
rsqlGetTimeout         Get lock wait timeout in seconds for the connection
rsqlLockTables         Issue an explicit lock request for one or more database tables
rsqlOpenCat            Open a database through its compiled catalog module
rsqlOpenDB             Open a database by name
rsqlSetDateFormat      Set the date constant format for the connection
rsqlSetDateSeparator   Set the current date constant separator character for the connection
rsqlSetTimeout         Set lock wait timeout in seconds for the connection
rsqlTransStatus        Return the current transaction state for the specified connection
rsqlUnlockTable        Free a read lock on a database table
Calls to any other RDM SQL API function into which you pass the connection handle associated with hStmt will
return error code errNOTINUDF.
All of the connection's open databases and locks are inherited by the UDF. You can call rsqlGetDBNames to get a semicolon-separated list of the names of the open databases. If rsqlOpenDB (or rsqlOpenCat) is called then the UDF needs to make sure that those databases are closed in udfCleanup. If you call rsqlAllocStmt to allocate a separate statement handle on the connection handle returned from the call to rsqlGetConnHandle, you can use it with any RDM SQL API call that takes a statement handle.
You can allocate a separate connection handle with no restrictions on the calls that can be made. Note, however, that the open databases and locks held by the original connection are not inherited, and you will need to be very careful not to attempt to lock a table that is blocked by a lock held by the original connection, because the original connection will not regain control (and free the lock) until the UDF returns. Because of this we recommend that you never call rsqlAllocConn from a UDF.
Using Virtual Tables to Access Any Data
'Virtual Reality' is a name being slapped on
almost anything these days, especially if it's lame.
- Mark Hamilton
A virtual table provides the ability to present any kind of data to SQL as a table. It is important to recognize that virtual tables do not behave like standard database tables. RDM SQL does not lock a virtual table. Virtual tables are not transactional: you cannot commit or roll back an insert statement. The data in a virtual table is not necessarily persistent. A virtual table's implementation of an insert statement may not actually store a new "row" into the table but might instead simply provide data that is used to control an embedded device. Some virtual tables may have an unlimited number of rows as in, for example, a virtual table that returns status data from sensors in an embedded system that varies over time. The virtual table implementation described in this section is quite basic, supporting only insert and select statements, yet that is sufficient to allow you to interface SQL with just about any kind of non-SQL data from your embedded systems application.
A virtual table is defined through a combination of the create virtual table DDL statement and a set of user-written C functions that conform to a pre-defined function call interface specification. A pointer to a pre-defined
structure array that contains an entry for each virtual table with the addresses of each of the virtual table interface
functions is passed into SQL before the database is opened by calling the rsqlRegisterVirtualTables
function. The virtual table interface functions are then called by SQL at the appropriate times during the
execution of any SQL statement that references the virtual table. This interaction is depicted in the figure below
which shows SQL calling the function in the application's virtual table function module to fetch a row of weather
data from a wireless sensor network (WSN).
Figure 1. Virtual Table Operation
This section will show you how to develop a virtual table implementation through the use of a simple example. Virtual tables are defined using the create virtual table SQL DDL statement described in the Defining a Database
section and implemented in a C program module that conforms to a pre-defined API that will be called by the
SQL runtime system in order to process any insert (or import) and select statements that access the virtual table
(note that at the present time update and delete statements are not allowed on a virtual table). The example virtual table is defined as follows in the vtabs example database DDL specification (file vtabs.sql).
create database vtabs;
create table stdtab(
pkey integer primary key,
name char(24) key,
addr char(32),
city char(24),
state char(2),
zip char(10)
);
create virtual table virtab(
pkey integer primary key,
name char(24),
addr char(32),
city char(24),
state char(2),
zip char(10)
);
Note that two identical tables are defined except for the defined keys. One is a standard table and one is a virtual
table. A database must contain at least one standard table. Of course, it is not required that you have an identical
standard table for each virtual table. The purpose of the example is to demonstrate how easy it is to load a standard table from a virtual table using the insert into table from select statement.
Virtual Table Load Table Definition and Registration
A virtual table implementation consists of the six C functions described in the table below.
Table 1. Virtual Table Implementation Functions
Function Entry   Description                                       When Called by SQL
vtInsert         Executes an insert statement which "inserts"      When SQL insert statement is executed
                 the specified data values.                        (rsqlExecute). Can be NULL.
vtRowCount       Returns an estimate of the current number of      When SQL statement is compiled
                 rows contained in the virtual table.              (rsqlPrepare).
vtSelectCount    Returns the actual current number of rows         When "select count(*)" is executed on
                 contained in the virtual table.                   the virtual table.
vtSelectOpen     Executes a select statement which performs        When SQL select statement is executed
                 any needed initialization for subsequent calls    (rsqlExecute).
                 to vtFetch.
vtFetch          Fetches the next row in the virtual table.        When rsqlFetch is called.
vtSelectClose    Performs any needed cleanup (e.g., to free        When select execution completes (e.g., when
                 any memory allocated by the vtSelectOpen or       the cursor is closed). Can be NULL.
                 vtFetch functions).
The entry points for these functions are provided through a virtual table load table that is passed from your application to the RDM SQL system by calling function rsqlRegisterVirtualTables before processing any SQL statements that reference a virtual table. This table is an array of type VTFLOADTABLE defined in header file rsqltypes.h (automatically included with header file rsql.h) and shown below.
typedef struct vtfloadtable {
    char           vtName[NAMELEN];  /* name of the virtual table */
    PVTINSERT      vtInsert;         /* ptr to INSERT execution function */
    PVTROWCOUNT    vtRowCount;       /* ptr to row count est. function */
    PVTSELECTCOUNT vtSelectCount;    /* ptr to actual row count function */
    PVTSELECTOPEN  vtSelectOpen;     /* ptr to SELECT init function */
    PVTFETCH       vtFetch;          /* ptr to fetch next row function */
    PVTSELECTCLOSE vtSelectClose;    /* ptr to SELECT term function */
} VTFLOADTABLE;
The first field in the table, vtName, is a char string containing the name of the virtual table and must be the same as that specified in its corresponding create virtual table statement (case insensitive). The remaining fields in VTFLOADTABLE contain pointers to the functions that implement the virtual table. Each of the six implementation functions must conform to its prototype definition given in header file rsqltypes.h as follows.
typedef RSQL_ERRCODE (EXTERNAL_FCN VTINSERT)( /* vtInsert() */
    HSTMT       hstmt,      /* in: statement handle */
    uint16_t    nocols,     /* in: no. of ref'd columns */
    VCOL_INFO  *colsvals,   /* in: array of ref'd column value containers */
    void       *pRegCtx);   /* in: ptr to user's registration context */

typedef RSQL_ERRCODE (EXTERNAL_FCN VTROWCOUNT)( /* vtRowCount() */
    HSTMT       hstmt,      /* in: statement handle */
    void       *pRegCtx,    /* in: ptr to user's registration context */
    uint64_t   *pNoRows);   /* out: ptr to row count value */

typedef RSQL_ERRCODE (EXTERNAL_FCN VTSELECTCOUNT)( /* vtSelectCount() */
    HSTMT       hstmt,      /* in: statement handle */
    void       *pRegCtx,    /* in: ptr to user's registration context */
    void       *pFetchCtx,  /* in: ptr to fetch context */
    uint64_t   *pNoRows);   /* out: ptr to row count value */

typedef RSQL_ERRCODE (EXTERNAL_FCN VTSELECTOPEN)( /* vtSelectOpen() */
    HSTMT       hstmt,      /* in: statement handle */
    uint16_t    nocols,     /* in: no. of ref'd columns */
    VCOL_INFO  *colsvals,   /* in: array of ref'd column value containers */
    void       *pRegCtx,    /* in: ptr to registration context */
    void       *pFetchCtx,  /* in: ptr to fetch context */
    RSQL_VALUE *pkeyval);   /* in: ptr to primary key value */

typedef RSQL_ERRCODE (EXTERNAL_FCN VTFETCH)( /* vtFetch() */
    HSTMT       hstmt,      /* in: statement handle */
    uint16_t    nocols,     /* in: no. of ref'd columns */
    VCOL_INFO  *colsvals,   /* in: array of ref'd column value containers */
    void       *pRegCtx,    /* in: ptr to registration context */
    void       *pFetchCtx); /* in: ptr to fetch context */

typedef void (EXTERNAL_FCN VTSELECTCLOSE)( /* vtSelectClose() */
    HSTMT       hstmt,      /* in: statement handle */
    void       *pRegCtx,    /* in: ptr to registration context */
    void       *pFetchCtx); /* in: ptr to fetch context */
The function names are italicized to indicate that they can be named whatever you like. Note that the first argument to each function is a statement handle. This is the statement handle of the SQL statement that contains the reference to the virtual table. In general you do not need to use this argument. If the implementation of your virtual table needs to make calls to the RDM SQL functions, you can use the statement handle to retrieve its associated connection handle by calling rsqlGetConnHandle, which can then be passed to rsqlAllocStmt to allocate a new statement handle for use by the virtual table implementation functions.
The code snippet below is from the example virtual table C module vtabfcns.c (contained in the GettingStarted\examples\sqlVT directory) and shows the definition of the VTFLOADTABLE for the virtab table.
static VTINSERT      vtInsert;
static VTROWCOUNT    vtRowCount;
static VTSELECTCOUNT vtSelectCount;
static VTSELECTOPEN  vtSelectOpen;
static VTFETCH       vtFetch;

const VTFLOADTABLE vtFcnTable[] = {
    {"virtab",vtInsert,vtRowCount,vtSelectCount,vtSelectOpen,vtFetch,NULL}
};
const size_t vtFetchSz = sizeof(VTAB_CTX);
RDM SQL is informed about the existence of these functions by the application through a call to function rsqlRegisterVirtualTables which must occur before opening the database in which they are declared. The code
snippet below shows how this is done.
extern const VTFLOADTABLE vtFcnTable[];
extern const size_t vtFetchSz;

MyApplication()
{
    HCONN hdbc;

    if ( rsqlAllocConn(&hdbc) == errSUCCESS ) {
        rsqlRegisterVirtualTables(hdbc, "vtabs", 1, vtFcnTable, NULL, vtFetchSz);
        if ( rsqlOpenDB(hdbc, "vtabs", "s") != errSUCCESS )
            ...
    }
Six arguments are passed to rsqlRegisterVirtualTables: the connection handle, the name of the database containing the declarations of the virtual tables, the number of virtual tables in the load table, the address of the virtual table load table, a pointer to a user registration context data area (which can be NULL if unnecessary), and the maximum size that is needed for the fetch context data area. The prototype for rsqlRegisterVirtualTables is given below.
RSQL_ERRCODE EXTERNAL_FCN rsqlRegisterVirtualTables(
    HCONN               hConn,       /* in: connection handle */
    const char         *dbname,      /* in: name of db */
    uint16_t            novts,       /* in: number of virtual tables */
    const VTFLOADTABLE *vtftab,      /* in: ptr to VTF load table */
    void               *pRegCtx,     /* in: ptr to user's registration context */
    const size_t        szFetchCtx)  /* in: size of fetch context to be alloc'd */
The pRegCtx can be used by the application program to allocate the space for the data to be manipulated by the virtual table interface so that the interface functions can operate reentrantly without having to use the synchronization functions described in the next section. Of course, this only works when the data to be accessed does not need to be shared by multiple connections; when it does, the technique described in the next section must still be used. The pRegCtx pointer is passed to all of the virtual table functions by the RDM SQL engine. If no registration context is needed, pRegCtx should be NULL.
The szFetchCtx needs to be set to the largest fetch context data area used for all the virtual tables in database
dbname. This space will be automatically allocated by the RDM SQL engine and passed to the execution-time
functions (all but vtRowCount) through the pFetchCtx argument. If no context is needed then szFetchCtx
should be 0.
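When a database declares more than one virtual table, one simple way to obtain the "largest fetch context" is to take the size of a union of all the per-table context types. The sketch below assumes two hypothetical context structs (the names DEMO_VIRTAB_FETCH_CTX and DEMO_SENSOR_FETCH_CTX are invented for illustration); the value it computes is what would be passed as szFetchCtx.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-table fetch contexts; the names are illustrative only. */
typedef struct {
    uint32_t next_row;            /* next row to return from a row array */
} DEMO_VIRTAB_FETCH_CTX;

typedef struct {
    uint32_t next_sensor;         /* next sensor to poll */
    double   last_reading[8];     /* cached readings     */
} DEMO_SENSOR_FETCH_CTX;

/* A union is at least as large as its largest member, so its size is
   sufficient for whichever table's context is in use. */
typedef union {
    DEMO_VIRTAB_FETCH_CTX virtab;
    DEMO_SENSOR_FETCH_CTX sensor;
} DEMO_ANY_FETCH_CTX;

/* Size to supply as the fetch context size at registration time */
static size_t demo_fetch_ctx_size(void)
{
    return sizeof(DEMO_ANY_FETCH_CTX);
}
```

Each table's functions would then cast the pFetchCtx pointer they receive to their own context type.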
Thread-safe Access to Global Data Used by a Virtual Table Interface
The virtual table example provided in this section stores its data in a global table. As such, access to that data
needs to be done in a safe manner when used in multi-threaded applications. RDM's platform support package
(PSP) includes a set of synchronization functions that can be used to serialize access to the shared data. These
functions are described in the table below.
Table 2. RDM PSP Synchronization Functions
Function            Description
psp_enterCritSec    Enter a process-wide critical section. This function blocks execution of all other threads
                    running in the application's process except the calling one until psp_exitCritSec is
                    called.
psp_exitCritSec     Exits the critical section started by the last call to psp_enterCritSec allowing other
                    threads to execute.
psp_syncCreate      Creates a semaphore that can be used with psp_syncEnterExcl to serialize access
                    to the shared data that is to be protected by that semaphore.
psp_syncEnterExcl   Enter exclusive, one-thread-at-a-time access controlled by the specified semaphore. The
                    calling thread will block until all other threads that have already called
                    psp_syncEnterExcl on that semaphore have exited.
psp_syncExitExcl    Exits the exclusive access section controlled by the specified semaphore.
The shared data used by the virtab table interface is declared in module vtabfcns.c and is shown below.
struct virtab {
    int32_t pkey;
    char    name[25];
    char    addr[33];
    char    city[25];    /* char(24) plus terminating NUL */
    char    state[3];
    char    zip[10];
    int8_t  is_null[6];
};

static PSP_SEM        vtsem = NO_PSP_SEM;
static const uint32_t maxrows = 1000;
static struct virtab *vtrows = NULL;
static uint32_t       norows = 0;
The PSP_SEM variable vtsem is the semaphore that will be used to serialize access to the vtrows array and
the norows variable. The two functions that are included in the vtabfcns.c module that encapsulate the calls
to the PSP synchronization functions are shown below.
 1  /* ========================================================================
 2     Enter serialized access to vtrows data
 3  */
 4  static void vtEnter()
 5  {
 6      if ( vtsem == NO_PSP_SEM ) {
 7          psp_enterCritSec();
 8          if ( vtsem == NO_PSP_SEM )
 9              vtsem = psp_syncCreate(PSP_MUTEX_SEM);
10          psp_exitCritSec();
11      }
12      psp_syncEnterExcl(vtsem);
13  }
14
15  /* ========================================================================
16     Exit serialized access to vtrows data
17  */
18  static void vtExit()
19  {
20      psp_syncExitExcl(vtsem);
21  }
Note that the call to psp_enterCritSec at line 7 is normally made only once, and that the recheck of the vtsem value at line 8 is a common method to guard against another thread having created the vtsem semaphore between this thread's test at line 6 and its successful return from the call at line 7. The call to psp_syncEnterExcl at line 12 will serialize access to the shared data. Hence, the virtual table functions will call vtEnter() before accessing vtrows and/or norows and then call vtExit() when the needed access is finished.
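For readers who want to experiment with this pattern outside of RDM, the following sketch reproduces the idea with POSIX threads standing in for the PSP functions. This substitution is an assumption made for illustration: a statically initialized mutex plays the role of the process-wide critical section, and a lazily created mutex plays the role of vtsem. Unlike vtEnter, the sketch performs the existence check only while holding the guard, which avoids relying on an unsynchronized first test. All demo_* names are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Guard used only while creating the data lock (critical-section role) */
static pthread_mutex_t  demo_guard  = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t *demo_lock   = NULL;  /* created on first use (vtsem role) */
static unsigned         demo_norows = 0;     /* shared data protected by demo_lock */

/* Enter serialized access; returns 0 on success, -1 on failure */
static int demo_enter(void)
{
    pthread_mutex_t *lk;

    pthread_mutex_lock(&demo_guard);
    if (demo_lock == NULL) {                 /* first caller creates the lock */
        lk = malloc(sizeof *lk);
        if (lk != NULL && pthread_mutex_init(lk, NULL) == 0)
            demo_lock = lk;
        else
            free(lk);
    }
    lk = demo_lock;
    pthread_mutex_unlock(&demo_guard);

    if (lk == NULL)
        return -1;
    return pthread_mutex_lock(lk) == 0 ? 0 : -1;
}

/* Exit serialized access; call only after a successful demo_enter() */
static void demo_exit(void)
{
    pthread_mutex_unlock(demo_lock);
}

/* Example of a protected update to the shared data */
static int demo_add_row(void)
{
    if (demo_enter() != 0)
        return -1;
    ++demo_norows;
    demo_exit();
    return 0;
}
```

Taking the guard on every entry costs one extra lock operation per call; the double-checked form in vtEnter trades that cost for the recheck at line 8.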
Virtual Table Execution Function: vtInsert
This function is called by SQL to execute the SQL insert statement that references the virtual table. Four arguments are passed into the vtInsert function as described in the following table.
Table 3. Function vtInsert Argument Descriptions
Argument   Type          Description
hStmt      HSTMT         Statement handle of SQL statement containing the virtual table reference.
nocols     uint16_t      Number of referenced columns (size of colsvals array).
colsvals   VCOL_INFO *   Array of referenced column value containers.
pRegCtx    void *        Pointer to the user program allocated context data area that was originally
                         passed in through the call to rsqlRegisterVirtualTables.
Each entry of the colsvals array contains information about a virtual table column that is referenced in the
SQL statement. This information is contained in the VCOL_INFO struct type whose fields are described in
the following table.
Table 4. VCOL_INFO Description
Field Name   Data Type   Description
colno        int16_t     Ordinal position of column in table declaration: 0 (first column) to # of columns in table – 1 (last column).
len          uint32_t    Column length in bytes.
is_null      int16_t *   Pointer to variable containing the null indicator flag: *is_null = 0 => not null,
                         *is_null = 1 => is null.
data         void *      Pointer to the buffer containing the column value.
Note that the is_null field is a pointer to the int16_t variable that is used by the SQL system to indicate that a column value is null. Assigning the indicator through the pointer eliminates the need for the SQL system to perform an extra loop through the colsvals array.
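As an illustration of how these fields are typically consumed, the sketch below looks up a column by its colno and reads its value only when the null indicator is clear. DEMO_VCOL_INFO is a local stand-in that mirrors the four fields listed in Table 4 so that the example compiles on its own; the real VCOL_INFO comes from rsqltypes.h, and the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Local stand-in mirroring the fields in Table 4 (not the RDM declaration) */
typedef struct {
    int16_t   colno;     /* ordinal position of column in the table */
    uint32_t  len;       /* column length in bytes                  */
    int16_t  *is_null;   /* ptr to null indicator flag              */
    void     *data;      /* ptr to buffer holding the column value  */
} DEMO_VCOL_INFO;

/* Returns the entry for table column colno, or NULL when the statement
   did not reference that column. */
static const DEMO_VCOL_INFO *demo_find_col(
    const DEMO_VCOL_INFO *cols, uint16_t nocols, int16_t colno)
{
    uint16_t i;

    for (i = 0; i < nocols; ++i)
        if (cols[i].colno == colno)
            return &cols[i];
    return NULL;
}

/* Copies the int32 value of column colno into *out. Returns 1 when a value
   was copied, 0 when the column is null, too short, or not referenced. */
static int demo_get_int32(
    const DEMO_VCOL_INFO *cols, uint16_t nocols, int16_t colno, int32_t *out)
{
    const DEMO_VCOL_INFO *c = demo_find_col(cols, nocols, colno);

    if (c == NULL || *c->is_null || c->len < sizeof(int32_t))
        return 0;
    memcpy(out, c->data, sizeof(int32_t));
    return 1;
}
```

Note that the array position of an entry says nothing about which table column it describes; only colno does.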
The values contained in the colsvals array are those specified in the values clause of the associated insert
statement. The vtInsert implementation for the virtab table is given below.
 1  /* ========================================================================
 2     Virtual table INSERT execution function
 3  */
 4  static RSQL_ERRCODE EXTERNAL_FCN vtInsert( /* vtInsert() */
 5      HSTMT       hstmt,      /* in: statement handle */
 6      uint16_t    nocols,     /* in: no. of ref'd columns */
 7      VCOL_INFO  *colsvals,   /* in: array of ref'd column value containers */
 8      void       *pRegCtx)    /* in: unused */
 9  {
10      int32_t      lv;
11      uint32_t     rowno;
12      int16_t      pkno = -1;
13      RSQL_ERRCODE stat = errSUCCESS;
14
15      UNREF_PARM(hstmt)
16      UNREF_PARM(pRegCtx)
17
18      vtEnter();
19
20      if ( !vtrows ) {
21          /* allocate virtab data area */
22          vtrows = calloc(maxrows, sizeof(struct virtab));
23      }
24      /* locate specified primary key value, if any */
25      for (pkno = 0; pkno < nocols; ++pkno) {
26          if ( colsvals[pkno].colno == 0 ) {
27              /* locate row with matching primary key */
28              memcpy(&lv, colsvals[pkno].data, sizeof(int32_t));
29              for ( rowno = 0; rowno < norows; ++rowno ) {
30                  if ( vtrows[rowno].pkey == lv ) {
31                      vtExit();
32                      return errDUPLICATE;
33                  }
34              }
35          }
36      }
37
38      stat = vtStoreRow(norows, nocols, colsvals);
39      if ( stat == errSUCCESS )
40          ++norows;
41
42      vtExit();
43      return stat;
44  }
The colsvals array contains the values of the table columns to be inserted. The nocols argument specifies
the number of entries in the colsvals array which could be less than the number of columns declared in the
table.
Since the virtab table has a primary key, the function needs to locate the primary key value in the colsvals array so that its uniqueness can be checked. This work is done at lines 24 to 36. Since the primary key is declared on the first column of the table, its value is located in the colsvals entry that has colno equal to 0 (line 26). Once found, the value is copied into the local int32_t variable lv. If a matching row is found the function returns status errDUPLICATE to indicate that an attempt was made to insert a row with a duplicate primary key value (lines 30-33).
If no duplicate is found, function vtStoreRow (shown below) is called to add the new row to the vtrows array.
 1  /* ========================================================================
 2     Store column values in specified row (0 = first row)
 3  */
 4  static RSQL_ERRCODE vtStoreRow(
 5      uint32_t         rowno,     /* in: row number into which store col vals */
 6      uint16_t         nocols,    /* in: no. of ref'd columns */
 7      const VCOL_INFO *colsvals)  /* in: array of ref'd column value containers */
 8  {
 9      uint16_t         cno;
10      const VCOL_INFO *pCol;
11      struct virtab   *pRow;
12
13      if ( rowno >= maxrows )
14          return errVTSPACE;
15
16      pRow = &vtrows[rowno];
17
18      for (pCol = colsvals, cno = 0; cno < nocols; ++cno, ++pCol ) {
19          if ( *pCol->is_null )
20              pRow->is_null[pCol->colno] = 1;
21          else {
22              pRow->is_null[pCol->colno] = 0;
23              switch (pCol->colno) {
24                  case 0: memcpy(&pRow->pkey, pCol->data, sizeof(int32_t)); break;
25                  case 1: strncpy(pRow->name,  (char *)pCol->data, 24);     break;
26                  case 2: strncpy(pRow->addr,  (char *)pCol->data, 32);     break;
27                  case 3: strncpy(pRow->city,  (char *)pCol->data, 24);     break;
28                  case 4: strncpy(pRow->state, (char *)pCol->data, 2);      break;
29                  case 5: strncpy(pRow->zip,   (char *)pCol->data, 9);      break;
30              } /*lint !e744 */
31          }
32      }
33      return errSUCCESS;
34  }
The rowno argument is the index into vtrows at which the row will be stored. The pRow pointer (assigned at line
16) is simply used to dereference that row in the code which follows. Lines 18-32 loop through the colsvals
array in order to assign the value of each individual column to its field in the vtrows struct array entry. It
is important to note that the table column number is not cno but pCol->colno (lines 20, 22, and 23). Also note
that in this example the len field of VCOL_INFO is not used, but it could (should!) have been used to, for
example, check for a possible truncation (i.e., where pCol->len is greater than the declared size of the column).
Virtual Table Row Count Function: vtRowCount
This function is called by SQL during compilation of a SQL select statement that contains a reference to the virtual table in order to fetch an estimate of the number of rows in the table. Three arguments are passed into the
vtRowCount function as described in the following table.
Table 5. Function vtRowCount Argument Descriptions
Argument   Type         Description
hStmt      HSTMT        Statement handle of SQL statement containing the virtual table reference.
pRegCtx    void *       Pointer to the user program allocated registration context data area that was originally passed in through the call to rsqlRegisterVirtualTables.
pNoRows    uint64_t *   Pointer to the output variable into which the estimate of the number of rows in the table is to be returned.
The vtRowCount implementation for the virtab table is provided below.
/* ========================================================================
   Virtual table row count function
*/
static void EXTERNAL_FCN vtRowCount( /* vtRowCount() */
    HSTMT     hstmt,    /* in: statement handle */
    void     *pRegCtx,  /* in: unused */
    uint64_t *pNoRows)  /* out: ptr to row count value */
{
    UNREF_PARM(hstmt)
    UNREF_PARM(pRegCtx)

    vtEnter();
    *pNoRows = (uint64_t)norows;
    vtExit();
}
The UNREF_PARM macro is provided in RDM to indicate that a particular argument is unused and to avoid the
associated compiler warning. Note the necessary absence of the terminating semi-colon (";").
Here you can clearly see how access to the norows variable is protected by the bracketing calls to functions
vtEnter and vtExit.
If an exact row count value cannot be determined at compilation time then the vtRowCount function should
return an estimate of the number of rows. It does not have to be an exact value.
Virtual Table Select Count Function: vtSelectCount
This function is only called by SQL during execution of a SQL "select count(*) from virtab" statement in order to
fetch the actual number of rows in the table. Four arguments are passed into the vtSelectCount function as
described in the following table.
Table 6. Function vtSelectCount Argument Descriptions
Argument   Type         Description
hStmt      HSTMT        Statement handle of SQL statement containing the virtual table reference.
pRegCtx    void *       Pointer to the user program allocated registration context data area that was originally passed in through the call to rsqlRegisterVirtualTables.
pFCtx      void *       Pointer to the fetch context data area.
pNoRows    uint64_t *   Pointer to the output variable into which the number of rows in the table is to be returned.
The vtSelectCount implementation for the virtab table is almost identical to the vtRowCount and is provided below.
/* ========================================================================
   Virtual table select count function
*/
static void EXTERNAL_FCN vtSelectCount( /* vtSelectCount() */
    HSTMT     hstmt,    /* in: statement handle */
    void     *pRegCtx,  /* in: unused */
    void     *pFCtx,    /* in: fetch context pointer */
    uint64_t *pNoRows)  /* out: ptr to row count value */
{
    UNREF_PARM(hstmt)
    UNREF_PARM(pRegCtx)
    UNREF_PARM(pFCtx)

    vtEnter();
    *pNoRows = (uint64_t)norows;
    vtExit();
}
Virtual Table Select Open Function: vtSelectOpen
This function is called by SQL to initialize execution of the SQL select statement that references the virtual table.
Six arguments are passed into the vtSelectOpen function as described in the following table.
Table 7. Function vtSelectOpen Argument Descriptions
Argument   Type           Description
hStmt      HSTMT          Statement handle of SQL statement containing the virtual table reference.
nocols     uint16_t       Number of referenced columns (size of colsvals array).
colsvals   VCOL_INFO *    Array of referenced column value containers.
pRegCtx    void *         Pointer to the user program allocated context data area that was originally passed in through the call to rsqlRegisterVirtualTables.
pFCtx      void *         Pointer to the fetch context data area.
pkeyval    RSQL_VALUE *   Pointer to specified primary key value (NULL if not specified).
Each entry of the colsvals array contains information about a virtual table column that is referenced in the
SQL statement. This information is contained in the VCOL_INFO struct type whose fields are described in
the following table.
Table 8. VCOL_INFO Description
Field Name   Data Type   Description
colno        int16_t     Ordinal position of column in table declaration: 0 (first column) to # of columns in table - 1 (last column).
len          uint32_t    Column length in bytes.
is_null      int16_t *   Pointer to variable containing the null indicator flag: *is_null = 0 => not null, *is_null = 1 => is null.
data         void *      Pointer to the buffer containing the column value.
Note that the is_null field is a pointer to the int16_t variable that is used by the SQL system to indicate that a
column value is null. Assigning the indicator through the pointer eliminates the need for the SQL system to
perform an extra loop through the colsvals array.
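Based solely on the fields listed in Table 8, the column value container can be pictured as the struct below. This is an illustrative sketch: the type name VCOL_INFO_SKETCH and the helper set_int32_col are hypothetical, and the authoritative declaration of VCOL_INFO is the one found in the RDM header files.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Sketch of a column value container with the fields listed in Table 8.
   Hypothetical stand-in for the real VCOL_INFO declaration. */
typedef struct {
    int16_t   colno;    /* ordinal position of the column, 0 = first */
    uint32_t  len;      /* column length in bytes */
    int16_t  *is_null;  /* points to the null indicator owned by SQL */
    void     *data;     /* points to the column value buffer */
} VCOL_INFO_SKETCH;

/* Hypothetical helper: return a 32-bit integer (or a null when pval is
   NULL) through the container, the way a fetch function would. */
static void set_int32_col(const VCOL_INFO_SKETCH *col, const int32_t *pval)
{
    if (pval == NULL) {
        *col->is_null = 1;      /* written directly into SQL's indicator */
    } else {
        *col->is_null = 0;
        memcpy(col->data, pval, sizeof(int32_t));
    }
}
```

Because the indicator is written through the pointer, the caller sees the null flag in its own variable as soon as the function returns, without having to scan the array of containers.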
The implementation of vtSelectOpen for the virtab virtual table example is given below. Note the calls to
vtEnter and the reciprocal call to vtExit. As stated above, this serializes thread access to the shared
vtrows and norows variables.
 1  /* ========================================================================
 2     Virtual table SELECT execution function
 3  */
 4  static RSQL_ERRCODE EXTERNAL_FCN vtSelectOpen( /* vtSelectOpen() */
 5      HSTMT       hstmt,      /* in: statement handle */
 6      uint16_t    nocols,     /* in: no. of ref'd columns */
 7      VCOL_INFO  *colsvals,   /* in: array of ref'd column value containers */
 8      void       *pRegCtx,    /* in: ptr to registration context */
 9      void       *pFCtx,      /* in: ptr to fetch context */
10      RSQL_VALUE *pkeyval)    /* in: ptr to primary key value */
11  {
12      RSQL_ERRCODE stat = errSUCCESS;
13      uint32_t     rowno;
14      VTAB_CTX    *pCtx = (VTAB_CTX *)pFCtx;
15
16      UNREF_PARM(hstmt)
17      UNREF_PARM(pRegCtx)
18
19      pCtx->rowcnt  = 0;
20      pCtx->rowno   = rowno = 0;
21      pCtx->pkeyval = pkeyval;
22
23      vtEnter();
24
25      if ( !vtrows ) {
26          vtrows = calloc(maxrows, sizeof(struct virtab));
27      }
28      else if ( pkeyval ) {
29          /* locate row with matching primary key */
30          for ( rowno = 0; rowno < norows; ++rowno ) {
31              if ( pkeyval->vt.lv == vtrows[rowno].pkey )
32                  break;
33          }
34          pCtx->rowno = rowno;
35      }
36      vtExit();
37
38      return stat;
39  }
It is important to note that any dynamic allocations that need to be made for any of the shared data will necessarily live for the life of the invoking process (unless, for some reason, they are explicitly freed in the vtSelectOpen function).
The operational requirements of the select statement are for the vtSelectOpen function to set the rowno variable to
the first row to be fetched.
The fetch context that is passed to vtSelectOpen must be used to save any information that will be used by
vtFetch to control the fetching of rows from the virtual table. The context used in the virtab example is
defined by the VTAB_CTX struct typedef declaration given below.
typedef struct vtab_ctx {
    uint64_t    rowcnt;     /* count of rows fetched */
    uint64_t    rowno;      /* number of next row to be fetched */
    RSQL_VALUE *pkeyval;    /* ptr to primary key's value */
} VTAB_CTX;
The rowno field contains the vtrows index of the next row to be returned by vtFetch. The rowcnt field and a non-NULL pkeyval are used to ensure that only one row is returned when the select statement includes the "where
pkey = value" clause.
If a primary key value is specified then vtSelectOpen needs to locate the row with that value (lines 30-34) and
set pCtx->rowno to it. If it is not found then pCtx->rowno is set to norows, which will cause vtFetch to
return errNOMOREDATA.
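The interplay of rowno, rowcnt and pkeyval can be seen in isolation in the following sketch. It is a simplified model rather than RDM code: the table is reduced to a plain array of key values, the status codes are replaced by 1 and 0, and the names FETCH_CTX, sketch_open and sketch_fetch are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NOROWS 3
static const int32_t pkeys[NOROWS] = { 10, 20, 30 };  /* stand-in for vtrows */

typedef struct {
    uint64_t       rowcnt;   /* count of rows fetched */
    uint64_t       rowno;    /* number of next row to be fetched */
    const int32_t *pkeyval;  /* primary key lookup value, or NULL */
} FETCH_CTX;

/* Position the context on the first row to be fetched. */
static void sketch_open(FETCH_CTX *ctx, const int32_t *pkeyval)
{
    ctx->rowcnt = 0;
    ctx->rowno = 0;
    ctx->pkeyval = pkeyval;
    if (pkeyval) {
        /* rowno ends up equal to NOROWS when no row matches */
        while (ctx->rowno < NOROWS && pkeys[ctx->rowno] != *pkeyval)
            ++ctx->rowno;
    }
}

/* Return 1 and the next key through pkey, or 0 when there is no more data. */
static int sketch_fetch(FETCH_CTX *ctx, int32_t *pkey)
{
    if (ctx->rowno == NOROWS || (ctx->pkeyval && ctx->rowcnt))
        return 0;
    *pkey = pkeys[ctx->rowno];
    ++ctx->rowcnt;
    ++ctx->rowno;
    return 1;
}
```

Opening with a NULL key lets the fetch function walk all three rows. Opening with a key that exists yields exactly one row, because the second fetch sees a non-NULL pkeyval together with a non-zero rowcnt. Opening with a key that does not exist leaves rowno equal to the row count, so the very first fetch reports that there is no more data.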
Virtual Table Fetch Function: vtFetch
This function is called by SQL to fetch the next row from the virtual table. Five arguments are passed into the
vtFetch function as described in the following table.
Table 9. Function vtFetch Argument Description
Argument   Type          Description
hStmt      HSTMT         Statement handle of SQL statement containing the virtual table reference.
nocols     uint16_t      Number of referenced columns (size of colsvals array).
colsvals   VCOL_INFO *   Array of referenced column value containers.
pRegCtx    void *        Pointer to the user program allocated context data area that was originally passed in through the call to rsqlRegisterVirtualTables.
pFCtx      void *        Pointer to the fetch context data area.
The fetch context pointer, pFCtx, references the fetch context data area containing any virtual table specific
data needed for processing the fetch (e.g., current row number). If a primary key lookup value was specified,
then only one row should be retrieved. If not, then all rows in the table should be retrieved with status errNOMOREDATA being returned on the first call after the last row has been fetched. The necessary programming logic
is best explained through the virtab example as shown below.
 1  /* ========================================================================
 2     Virtual table fetch function
 3  */
 4  static RSQL_ERRCODE EXTERNAL_FCN vtFetch( /* vtFetch() */
 5      HSTMT      hstmt,       /* in: statement handle */
 6      uint16_t   nocols,      /* in: no. of ref'd columns */
 7      VCOL_INFO *colsvals,    /* in: array of ref'd col value containers */
 8      void      *pRegCtx,     /* in: ptr to registration context */
 9      void      *pFCtx)       /* in: ptr to fetch context */
10  {
11      int16_t   cno;
12      VTAB_CTX *pCtx = (VTAB_CTX *)pFCtx;
13      uint32_t  rno = (uint32_t)pCtx->rowno;
14
15      vtEnter();
16
17      if ( rno == norows || (pCtx->pkeyval && pCtx->rowcnt) ) {
18          pCtx->rowno = 0;
19          vtExit();
20          return errNOMOREDATA;
21      }
22      for (cno = 0; cno < nocols; ++cno) {
23          const VCOL_INFO *pCVal = &colsvals[cno];
24          if ( vtrows[rno].is_null[pCVal->colno] )
25              *pCVal->is_null = 1;
26          else {
27              *pCVal->is_null = 0;
28              switch ( pCVal->colno ) {
29                  case 0:
30                      memcpy(pCVal->data, &vtrows[rno].pkey, sizeof(int32_t));
31                      break;
32                  case 1:
33                      strcpy(pCVal->data, vtrows[rno].name);
34                      break;
35                  case 2:
36                      strcpy(pCVal->data, vtrows[rno].addr);
37                      break;
38                  case 3:
39                      strcpy(pCVal->data, vtrows[rno].city);
40                      break;
41                  case 4:
42                      strcpy(pCVal->data, vtrows[rno].state);
43                      break;
44                  case 5:
45                      strcpy(pCVal->data, vtrows[rno].zip);
46                      break;
47              } /*lint !e744 */
48          }
49      }
50      ++pCtx->rowcnt;
51      ++pCtx->rowno;
52
53      vtExit();
54
55      return errSUCCESS;
56  }
As with vtSelectOpen, note here as well the call to vtEnter at line 15 and its reciprocal calls to vtExit at
lines 19 and 53 serializing access to the norows and vtrows variables. The if statement at line 17 tests the
two conditions under which an errNOMOREDATA status code is to be returned.
The loop at lines 22 to 49 is used to copy the fetched row's information for each column in the colsvals array.
This involves setting the correct null value indicator (lines 24-25) and, for the non-null columns, copying its value
into the column's data buffer pointed to by the VCOL_INFO data field (lines 30, 33, 36, 39, 42, and 45).
Finally, the row count and row number values are incremented (lines 50-51).
Virtual Table Select Close Function: vtSelectClose
This function is called by SQL when the application has completed its processing of the statement containing the
virtual table reference in order to terminate the select statement access to the virtual table. Any memory that was
allocated by vtSelectOpen for the vtFetch calls would need to be freed by this function. Three arguments
are passed into the vtSelectClose function as described in the following table.
Table 10. Function vtSelectClose Argument Descriptions
Argument   Type     Description
hStmt      HSTMT    Statement handle of SQL statement containing the virtual table reference.
pRegCtx    void *   Pointer to the user program allocated context data area that was originally passed in through the call to rsqlRegisterVirtualTables.
pFCtx      void *   Pointer to the fetch context data area.
No vtSelectClose function is needed for the virtab virtual table implementation. But an example stub is
shown below.
/* ========================================================================
   Virtual table close function
*/
static void EXTERNAL_FCN vtSelectClose(
    HSTMT  hstmt,       /* in: statement handle */
    void  *pRegCtx,     /* in: ptr to registration context */
    void  *pFetchCtx)   /* in: ptr to fetch context */
/*
   Called by SQL when SELECT statement containing virtual table reference
   completes execution (i.e., when cursor is closed).
   Use this function to do any needed cleanup and device termination actions.
*/
{
    /* code to free any allocated memory or, perhaps
       to power down virtual table device. */
}
Virtual Table Usage
Virtual Tables Are Not Transaction Sensitive
An insert on a virtual table cannot be committed nor can it be rolled-back. In fact, an insert doesn't even have to
do an "insert". It simply sends a set of data values to the vtInsert function for the specified virtual table. What
that function actually does with the data is up to it. For example, in a wireless sensor network (WSN) application
an insert could be used to send control settings to a sensor.
Some Virtual Tables May Have an Unlimited Number of Rows
Only a little imagination is needed to see that data from sources such as a WSN have no natural end. As long as
the sensors continue to operate, data will always be available. This presents a particularly difficult problem when
the data needs to be summarized over some aggregate collection. Consider the following two tables shown
below from the weather data WSN application database from the Defining a Database section.
create table weather_summary(
    longitude integer,
    latitude integer,
    rdg_date date,
    hour_of_day smallint,
    avg_temp smallint,
    avg_press smallint,
    avg_hum smallint,
    avg_lumens smallint,
    foreign key (longitude, latitude) references location
);
create virtual readonly table weather_data(
    sensor_id bigint primary key,
    loc_long integer,
    loc_lat integer,
    rdg_time timestamp,
    temperature smallint,
    pressure smallint,
    humidity smallint,
    light smallint,
    power integer
);
The weather_summary table contains the averages of the readings from each sensor as collected over each
hour of the day. In order to compute these aggregated values, SQL needs to sort the fetched rows by sensor_id
and rdg_time (the timestamp when the sensor data was read). But any sort needs to have a fixed number of
rows. How is this done when there is an unlimited number of rows?
To address this problem, the select statement includes a non-standard clause that can limit the number of rows
that are returned as specified in the following syntax.
select_stmt:
    select … from table where … limit( num limit_unit)
limit_unit:
    rows | hours | mins | secs | msecs
The limit clause limits either the number of rows that are returned or the amount of time the select statement is
allowed to run. The following example shows a select statement that stores the averages per hour from each
weather sensor in the weather_summary table.
insert into weather_summary
select loc_long, loc_lat, convert(rdg_time,date), hour(rdg_time),
avg(temperature), avg(pressure), avg(humidity), avg(light) from weather_data
group by 1,2,4 limit(4 hours);
Each row is fetched and sorted over each four hour span of time. At the end of that time, the sorted data is
scanned and the aggregate calculations performed and the resulting rows are then stored in the weather_summary table. The time limit can be shorter but, in this case, not any less than an hour as that is the smallest unit
over which the aggregation is made (of course, this assumes that the select is synchronized to execute at the
start of an hour).
It is important to note that even though the virtual table has no fixed number of rows, the vtRowCount function
still needs to return a value. Based on how you choose to limit the select statements that retrieve data from your
virtual table, simply have vtRowCount return an estimate of the average number of rows that will be returned from
any given execution of the select. It does not have to be an exact value.
Virtual Table Data Is Not Necessarily Persistent
The data contained in the example virtab virtual table is clearly not persistent. The stdtab table can be used
to save a persistent copy of the data as shown in the following SQL statements.
insert into stdtab from select * from virtab;
commit;
Then, when the application is restarted, virtab can be reloaded by simply doing the reverse (only without the
commit).
insert into virtab from select * from stdtab;
Accessing a Core (non-SQL) Database in RDM SQL
I am as vulnerable and fragile as it is
possible to be. I am shredded to the core.
I am at the point where I am stripped bare.
- Rachel Hunter,
New Zealand model (1968 - )
RDM SQL allows opening an RDM core database (i.e., a native, non-SQL database) in read-only mode. Besides
providing the ability to perform SQL queries using the native RDM SQL API, it also allows access to RDM core
databases from ODBC, JDBC, or ADO.NET clients.
A core database is one for which the schema was created using the core API instead of through SQL. SQL will
internally create a compatible catalog based on the database dictionary contents. However, RDM core databases have features that are not available through SQL. This section will describe how core databases are
mapped into a SQL database. Knowledge of both RDM native and SQL database definition is assumed throughout this section.
How Core Database Record Types are Mapped to SQL Tables
Each core record type will map directly into an SQL table that will have the same name. This includes the system
record even though it will not have any columns and is not used in SQL.
Each data field in a core record type will map into its equivalent SQL column. However, since SQL does not support unsigned integer types, unsigned integers map into the signed integer type of the same size. Grouped
(struct) fields, array fields and DB_ADDR fields will map into a SQL binary array of the appropriate size.
Note that meaningful access to the binary form can only occur when the computer on which the data is returned
through SQL has the same native architecture as the computer on which the database is stored, because of the byte
ordering and alignment differences that necessarily exist between different computers. A mismatch of this kind
can only arise when the database is accessed remotely through rdmsqlserver.
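When the client and the database do share the same architecture, a binary column that holds a core array field can be copied straight back into the corresponding C type. The sketch below assumes that the 40 bytes of a binary(40) column mapped from an int32_t[10] field have already been fetched into a buffer. The function name decode_int32_array is hypothetical and no RDM API calls are shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NELEMS 10

/* Hypothetical helper: reinterpret the raw bytes of a binary(40) column as
   the int32_t[10] array it was mapped from. Only meaningful when the bytes
   were produced on a machine with the same byte order. Returns 0 on
   success, -1 if the buffer does not have the expected size. */
static int decode_int32_array(const void *bin, size_t binlen,
                              int32_t out[NELEMS])
{
    if (binlen != NELEMS * sizeof(int32_t))
        return -1;
    memcpy(out, bin, binlen);   /* memcpy avoids alignment problems */
    return 0;
}
```

The size check is a cheap guard against alignment padding or a differently sized mapping; it cannot detect a byte order mismatch, which is why the same-architecture requirement still applies.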
Fields of type blob_id will map into SQL long varbinary columns.
The table below summarizes the core data type mappings into SQL.
Table 1. Core Data Type SQL Mappings
Core Data Type                         Mapped SQL Data Type
char                                   char
uint8_t                                tinyint
[unsigned] short, uint16_t, int16_t    smallint
[unsigned] int, uint32_t, int32_t      integer
[unsigned] long                        integer
uint64_t, int64_t                      bigint
float                                  real
double                                 float (double)
[unsigned] char[33]                    char(32) [1]
wchar, wchar_t                         wchar
varchar[256]                           varchar(255)
varwchar                               wvarchar
blob_id                                long varbinary
int32_t[10]                            binary(40) [2]
char[2][10]                            binary(20)
struct { int32_t, char[20]}            binary(24)
DB_ADDR                                binary(8)
Mapping Core Keys to SQL Keys
Key fields and compound keys map directly into SQL keys. Unique keys will map into a primary key. Where a record type has more than one unique key, SQL will identify which one will serve as the primary key based on the following criteria in order of priority.
1. The first declared hash key.
2. The smallest, single field key (i.e., not compound key).
3. The smallest key.
If two or more candidate keys have the same length then the first declared key is chosen as primary.
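The selection rules above can be sketched in a few lines of C. This is only an illustration of the stated criteria, not the code RDM uses internally; the UNIQUE_KEY struct and the pick_primary function are hypothetical, and the keys are assumed to be listed in declaration order.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical description of one unique key of a record type. */
typedef struct {
    int      is_hash;   /* non-zero if declared as a hash key */
    int      nfields;   /* 1 = single field, >1 = compound key */
    uint32_t length;    /* total key length in bytes */
} UNIQUE_KEY;

/* Return the index (in declaration order) of the unique key that would
   serve as the primary key under the three criteria, or -1 if nkeys is 0. */
static int pick_primary(const UNIQUE_KEY *keys, int nkeys)
{
    int i, best = -1;

    /* 1. the first declared hash key */
    for (i = 0; i < nkeys; ++i)
        if (keys[i].is_hash)
            return i;

    /* 2. the smallest single-field key; '<' keeps the first on a tie */
    for (i = 0; i < nkeys; ++i)
        if (keys[i].nfields == 1 &&
            (best < 0 || keys[i].length < keys[best].length))
            best = i;
    if (best >= 0)
        return best;

    /* 3. the smallest key of any kind */
    for (i = 0; i < nkeys; ++i)
        if (best < 0 || keys[i].length < keys[best].length)
            best = i;
    return best;
}
```

For a record type with a unique key on a char[25] field and a hash key on an int32_t field, the hash key is chosen. For one with a unique key on a char[5] field and a compound unique key, the single-field key is chosen.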
Table 2. Example Core Keys to SQL Mappings

Example 1, Core DDL Snippet:
record recname {
    unique key char name[25];
    hash[1000] int32_t code;
    char text[81];
}
Example 1, Mapped SQL DDL Snippet:
create table recname(
    name char(24) unique key,
    code integer primary key hash[1000],
    text char(80)
);

Example 2, Core DDL Snippet:
record recname {
    char name[25];
    int32_t code;
    unique key char soundex[5];
    compound unique key name_code {
        code; name;
    }
}
Example 2, Mapped SQL DDL Snippet:
create table recname(
    name char(24),
    code integer,
    soundex char(4) primary key,
    unique key name_code(code, name)
);
[1] Note that the core char array size includes the null byte whereas the SQL declared size does not (but internally it does).
The same is true for varchar, etc.
[2] The actual binary column size depends on computer alignment issues. This is true for all of the following binary mappings.
Since SQL does not support unsigned integer types, core keys on unsigned integer fields cannot be used except
for equality lookups, due to the potential problem that can occur should an unsigned value map into a signed negative value. If the values actually stored in the unsigned data field can never be that large, then simply removing
the unsigned attribute from the core DDL field declaration will allow SQL to use the key. Core unique keys on
unsigned integer fields are treated by SQL as if they were hash keys, which allows them to be used for equality
lookups.
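The underlying problem can be illustrated with a few lines of C. The helper as_signed is hypothetical; it simply reinterprets the bits of a 32-bit unsigned value as the signed value of the same size. Any value above 2,147,483,647 turns negative, so a signed comparison no longer agrees with the unsigned ordering of the original values, while equality comparisons are unaffected.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* View the bits of an unsigned 32-bit field value as the signed 32-bit
   integer that an SQL integer column would present. */
static int32_t as_signed(uint32_t u)
{
    int32_t s;

    memcpy(&s, &u, sizeof s);   /* same bits, reinterpreted */
    return s;
}
```

For example, 3,000,000,000 is greater than 100 as an unsigned value, but as_signed(3000000000u) yields -1,294,967,296, which sorts before 100.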
Mapping Core Sets to SQL Foreign Keys
Sets map into SQL foreign keys but only when the owner record type has a unique key. Foreign key columns are
added to the SQL table that corresponds to the set member record type. These columns match their primary key
counterparts in the SQL table that corresponds to the set owner record type. The values for foreign key columns
will be retrieved by SQL via the set from the primary key (i.e. set owner) table.
Each foreign key column is assigned the same name as its corresponding field in the owner record. However, if the member record already has a field with that name, then "$r" followed by a number is appended
to the name to make the column name unique.
Table 3 below gives two examples of how core sets map into SQL foreign keys.
Table 3. Example Core Set to Foreign Key Mappings

Example 1, Core DDL Snippet:
record info {
    unique key varchar id_code[48];
    varchar info_title[80];
    char publisher[32];
    char pub_date[12];
    int16_t info_type;
}
record key_word {
    unique key char kword[32];
}
record intersect {
    int16_t int_type;
}
set key_to_info {
    order last;
    owner key_word;
    member intersect;
}
set info_to_key {
    order last;
    owner info;
    member intersect;
}
Example 1, Mapped SQL DDL Snippet:
create table info(
    id_code char(47) primary key,
    info_title char(79),
    publisher char(31),
    pub_date char(11),
    info_type smallint
);
create table key_word(
    kword char(31) primary key
);
create table intersect(
    int_type smallint,
    kword char(31) references key_word,
    id_code char(47) references info
);

Example 2, Core DDL Snippet:
record ownrec {
    unique key char idcode[9];
    char title[33];
}
record memrec {
    key int32_t idcode;
    char txtln[81];
}
set notes {
    order last;
    owner ownrec;
    member memrec;
}
Example 2, Mapped SQL DDL Snippet:
create table ownrec(
    idcode char(8) primary key,
    title char(32)
);
create table memrec(
    idcode integer,
    txtln char(80),
    idcode$r1 char(8) references ownrec
);
Multi-Member Sets and Explicit Locking
Multi-member sets can be declared in the core level database. These present no problem for SQL except in the
event that explicit table locking is being used (see Locking in RDM SQL). If locks are being explicitly issued through use of
the lock table statement then it will be necessary to lock all of the tables that participate as a member of a set that
may be used to access one of the member tables. An errNOTLOCKED status will be returned when SQL
attempts to access the next member of a multi-member set that is a row from an alternate member table that has
not been explicitly locked.
Order of Columns in the Table
The fields declared in the core record type map directly into columns of its corresponding SQL table in exactly the
same order. These are followed by the virtual columns for each foreign key which are created in the order in
which the sets in which the record type is a member are declared in the core DDL specification (e.g., see
"create table intersect" above in Table 3).
Null Values
RDM core databases do not support null data field values. Note that this does not mean that null values can not
occur. Foreign key references can still be null and outer joins can produce null values.
Adding Column Information and Creating a Catalog
Two RDM-specific SQL statements can be used in conjunction with core databases. The set column statement
can be used to specify the SQL data type for certain core data fields that contain SQL-understandable data
(e.g., long varchar). It can also be used to specify the number of distinct values and/or the range values used by
the SQL query optimizer. Once all of the needed set column statements have been processed for a given core
database, the create catalog statement can be executed which will create and store the SQL catalog file for the
core database.
The syntax for the set column statement is given below.
set_column_stmt:
    set column [db_name.]table_name.column_name
        [type [to | =] {date | time | timestamp | long | {varchar | wvarchar}}]
        [distinct values = num]
        [range constant to constant]
  | set column stats [db_name.]table_name.column_name
        [distinct values = num]
        [range constant to constant]
The type clause can be used to specify an SQL-specific data type for a core database field. You can specify date
for a (32-bit) integer field but it must contain a valid DATE_VAL value (the number of elapsed days since Jan 1,
1 AD, which has the value 1). You can specify time for a (32-bit) integer field but it must contain a valid TIME_VAL
value (the number of elapsed seconds since midnight times 10,000). You can specify timestamp for a (64-bit)
bigint field but it must contain a valid TIMESTAMP_VAL value (DATE_VAL and TIME_VAL combined). Since
core databases do not differentiate between binary and character blob fields, you can also specify long varchar
or long wvarchar for a blob field.
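As a worked illustration of the two 32-bit encodings described above, the following sketch computes a day number and a time value from calendar fields. The function names date_val and time_val are hypothetical, not RDM functions. The sketch assumes that DATE_VAL counts days in the Gregorian calendar extended back to year 1 and that TIME_VAL is exactly the number of seconds since midnight multiplied by 10,000. The way the two are combined into a TIMESTAMP_VAL is not described here, so it is not shown.

```c
#include <assert.h>
#include <stdint.h>

/* Leap year test for the (proleptic) Gregorian calendar. */
static int is_leap(int32_t y)
{
    return (y % 4 == 0 && y % 100 != 0) || y % 400 == 0;
}

/* Number of days elapsed since Jan 1, 1 AD, counting that day as 1.
   Assumes year >= 1 and a valid month (1-12) and day. */
static int32_t date_val(int32_t year, int32_t month, int32_t day)
{
    static const int32_t before[12] =
        { 0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334 };
    int32_t y = year - 1;   /* complete years before this one */
    int32_t days = 365 * y + y / 4 - y / 100 + y / 400;

    days += before[month - 1] + day;
    if (month > 2 && is_leap(year))
        ++days;
    return days;
}

/* Seconds since midnight times 10,000. */
static int32_t time_val(int32_t hour, int32_t min, int32_t sec)
{
    return (hour * 3600 + min * 60 + sec) * 10000;
}
```

Under these assumptions Jan 1, 2000 has the day number 730,120 and 12:34:56 has the time value 452,960,000. The largest possible time value, 863,990,000 for 23:59:59, still fits comfortably in a signed 32-bit integer.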
Two types of statistics can be specified. The number of distinct values specifies the approximate number of different values stored in the column. For example, a column of type smallint can theoretically contain 65,536 different values. If, however, the actual number of different values is considerably smaller, then that can have an
important impact on the access choices the optimizer might be inclined to make. Similarly, the range clause is
used to identify the range of values that the column can contain. Note that specifying the range only affects the
optimizer. It does not mean that the SQL system will check to ensure that only those values are stored in the column. The values specified in these two clauses are understood to be estimates and no problems are created
when, for example, a column value actually falls outside the specified range. The database in which the table column is declared must be opened when set column is called.
The syntax for the create catalog statement is as follows.
create_catalog_stmt:
create catalog for dbname
The database must be opened in exclusive access mode in order to execute the create catalog statement.
For example, the following snippet shows a portion of a core DDL version of the bookshop database definition.
record author {
    unique key char last_name[14];
    char full_name[36];
    char gender[2];
    int16_t yr_born;
    int16_t yr_died;
    blob_id short_bio;
    compound key yob_gender_key {
        yr_born ascending;
        gender ascending;
    }
}
record book {
    unique key char bookid[15];
    key varchar title[256];
    char descr[62];
    varchar publisher[137];
    key int16_t publ_year;
    char lc_class[34];
    int32_t date_acqd;
    int32_t date_sold;
    double price;
    double cost;
}
The following SQL statement script shows how the set column statement is used to specify the needed data
types and stats as specified in its SQL DDL counterpart (see "Antiquarian Bookshop Database" in the "Example
Databases" section in the Defining a Database section).
open database bookshop in exclusive mode;
set column author.gender distinct values = 2;
set column author.short_bio type to long varchar;
set column book.publ_year range 1500 to 1980;
set column book.date_acqd type to date;
set column book.date_sold type to date;
create catalog for bookshop;
SQL Built-In Function Reference
RDM provides many built-in functions that you can use in queries to return data or perform operations on data.
Aggregate Functions
Aggregate functions perform a calculation on a set of values and return a single value. Except for COUNT, aggregate functions ignore null values. Aggregate functions are frequently used with the GROUP BY clause of the
SELECT statement.
Table 10. Built-in Aggregate Functions
Function   Description
count      Returns the number of (distinct) rows in the aggregate.
sum        Returns the sum of the (distinct) values of expression in the aggregate.
avg        Returns the average of the (distinct) values of expression in the aggregate.
min        Returns the minimum expression value in the aggregate.
max        Returns the maximum expression value in the aggregate.
Scalar Functions
Mathematical Functions
The following scalar functions perform a calculation, usually based on input values that are provided as arguments, and return a numeric value:
Table 6. Built-in Numeric Functions
Function         Description
abs              Returns the absolute value of an expression.
acos             Returns the arccosine of an expression.
asin             Returns the arcsine of an expression.
atan             Returns the arctangent of an expression.
atan2            Returns the arctangent of an x-y coordinate pair.
ceil | ceiling   Finds the upper bound for an expression.
cos              Returns the cosine of an angle.
cot              Returns the cotangent of an angle.
exp              Returns the value of an exponential function.
floor            Finds the lower bound for an expression.
ln | log         Returns the natural logarithm of an expression.
mod              Returns the remainder of arith_expr1/arith_expr2.
pi               Returns the value of pi.
rand             Returns next random floating-point number. Non-zero num is seed.
sign             Returns the sign of an expression (-1, 0, +1).
sin              Returns the sine of an angle.
sqrt             Returns the square root of an expression.
tan              Returns the tangent of an angle.
Date and Time Functions
The data type 'date' assumes the Gregorian calendar even for dates prior to the introduction of the Gregorian calendar. As a result, for databases that store historical dates from before that introduction, select statements with date ranges and the dayofweek and week functions may not produce correct results.
Table 7. Date/Time Functions
Function    Description
age         Calculate number of whole years from date_expr to current date
curdate     Retrieve the current date
curtime     Retrieve the current time
dayofmonth  Retrieve the day of the month
dayofweek   Retrieve the day of the week
dayofyear   Retrieve the day of the year
hour        Retrieve the hour
minute      Retrieve the minute
month       Retrieve the month
quarter     Retrieve the quarter
second      Retrieve the second
week        Retrieve the week
year        Retrieve the year
String Functions
The following scalar functions perform an operation on a string input value and return a string or numeric value:
Table 8. Built-in String Functions
Function    Description
ascii       Returns the numeric ASCII value of a character
char        Returns the ASCII character with numeric value num
concat      Concatenates two strings
convert     Convert an expression to a data type or a character string
insstr      Replace num2 chars from string_expr2 in string_expr1 beginning at position num1 (1st position is 1 not 0)
lcase       Converts a string to lowercase
left        Returns the leftmost num characters from the string
length      Returns the length of the string
locate      Locate string_expr1 from position num in string_expr2
ltrim       Removes all leading spaces from string
repeat      Repeats string num times
replace     Replace string_expr2 with string_expr3 in string_expr1
right       Returns the rightmost num characters from string
rtrim       Removes all trailing spaces from string
substring   Returns num2 characters from string_expr beginning at position num1.
ucase       Convert string to uppercase
unicode     Returns the numeric Unicode value of a character
wchar(num)  Returns a Unicode character with numeric value num.
abs
Retrieve the absolute value of an expression
Syntax
abs(arith_expr)
Parameters
arith_expr
An arithmetic expression.
Description
This scalar numeric function retrieves the absolute value of the specified arithmetic expression.
acos
Retrieve the arccosine of an expression
Syntax
acos(arith_expr)
Parameters
arith_expr
An arithmetic expression with a value between -1.0 and +1.0.
Description
This scalar numeric function retrieves the arccosine, in radians, of the specified arithmetic expression.
age
Returns the age (in full years)
Syntax
age(date_expr)
Parameters
date_expr
A date expression from which the age will be calculated
Description
Return the number of years from the date_expr to the current date.
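The "whole years" rule can be sketched in Python as follows. This is a model of the described behavior, not RDM's implementation; the function name age_in_years and the explicit today argument are hypothetical (the SQL function always uses the current date), and the handling of edge cases such as February 29 is an assumption.

```python
from datetime import date

def age_in_years(date_expr: date, today: date) -> int:
    """Whole years elapsed from date_expr to today."""
    years = today.year - date_expr.year
    # If this year's anniversary has not been reached yet, the last
    # year is incomplete and must not be counted.
    if (today.month, today.day) < (date_expr.month, date_expr.day):
        years -= 1
    return years

print(age_in_years(date(1951, 10, 23), date(2000, 10, 22)))  # day before the anniversary
print(age_in_years(date(1951, 10, 23), date(2000, 10, 23)))  # on the anniversary
```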
The data type 'date' assumes the Gregorian calendar even for dates prior to the introduction of the Gregorian calendar. As a result, for databases that store historical dates from before that introduction, select statements with date ranges and the dayofweek and week functions may not produce correct results.
asin
Retrieve the arcsine of an expression
Syntax
asin(arith_expr)
Parameters
arith_expr
An arithmetic expression with a value between -1.0 and +1.0.
Description
This scalar numeric function retrieves the arcsine, in radians, of the specified arithmetic expression.
atan
Retrieve the arctangent of an expression
Syntax
atan(arith_expr)
Parameters
arith_expr
An arithmetic expression.
Description
This scalar numeric function retrieves the arctangent, in radians, of the specified arithmetic expression.
atan2
Retrieve the arctangent of an x-y coordinate pair
Syntax
atan2(arith_expr_X, arith_expr_Y)
Parameters
arith_expr_X
An arithmetic expression providing the x coordinate.
arith_expr_Y
An arithmetic expression providing the y coordinate.
Description
This scalar numeric function retrieves the arctangent, in radians, of the specified x and y coordinates.
avg
Compute the average of the results for an aggregate result set
Syntax
avg(arith_expr)
Parameters
arith_expr
An arithmetic expression.
Description
This aggregate (calculation) function computes the average of the results of the specified expression for all rows
of an aggregate result set.
Example
select sale_name,
    convert(avg(amount), char, 10, "$#,#.##") "avg sale amt"
from salesperson natural join customer natural join sales_order
group by 1;

sale_name              avg sale amt
Flores, Bob            $19,233.56
Jones, Walter          $28,170.70
Kennedy, Bob           $61,362.11
McGuire, Sidney        $18,948.37
Nash, Gail             $34,089.70
Porter, Greg           $87,869.30
Robinson, Stephanie    $24,993.63
Stouffer, Bill         $3,631.66
Warren, Wayne          $21,263.85
Williams, Steve        $27,464.44
Wyman, Eliska          $23,617.38
ceiling
Find the upper bound for an expression
Syntax
ceiling(arith_expr)
Parameters
arith_expr
An arithmetic expression.
Description
This scalar numeric function retrieves an upper bound (ceiling) for the specified arithmetic expression. The ceiling is the smallest integer greater than or equal to the expression.
convert
Convert an expression to a data type or a character string
Syntax
convert(expression, convert_type)
convert(expression, {char | wchar}, width, convert_format)
convert_type:
        char | smallint | integer | real
    |   double | date | time | timestamp | tinyint | bigint
convert_format:
        numeric_format | datetime_format
numeric_format:
        "[<< | >> | ><]['text' | $][- | (][#,]#[.#[#]...][e | E]['text' | $ | %]"
datetime_format:
        "[<< | >> | ><]['text' | spchar | date_code | time_code]..."
date_code:
        m | mm | mmm | mon | mmmm | month
    |   d | dd | ddd | dddd | day
    |   yy | yyyy
time_code:
        h | hh | m | mm | s | ss | .f[f]... | [a/p | am/pm | A/P | AM/PM]
Parameters
expr
The expression to be converted.
arg_type
Specifies the data type into which the expression is to be converted.
char | wchar
Specifies the character type of the result when using the second form of the convert function specified above.
width
The maximum width, in characters, of the result string.
fmt
The specification of the format of the result character string into which the numeric or date/time values will be converted. The individual elements of the format specifiers are described in the Numeric Format Specifier and Date/Time Format Specifier tables below.
Description
This system function converts an expression to a different type or string representation. There are two forms of
this function.
The first form of this function, shown above, converts an expression to the specified data type. The second form
converts an expression to a character string in the specified format.
Numeric Format Specifier
The format specifier for numeric values is represented as shown in the box below. The minimum specifier that
must be used for a numeric format is "#". If the display field width (width parameter) is too small to contain a
numeric value, the convert function formats the value in exponential format (for example, 1.759263e08).
The elements for this specifier are explained in the following table.
Numeric Format Specifier Elements
Element
Description
[<< | >> | ><]
The justification specifier. You can specify left-justified text (<<), right-justified text (>>), or centered text (><). The default for numeric values is right-justified.
['text' | $]
A text character or string to use as a prefix for the result string. You must enclose the character or text with single quotation marks unless the prefix is one dollar sign.
[- | (]
The display specifier for negative values. You can show negative values with a minus sign or with parentheses around the value. If parentheses are used, positive values are shown with an ending space to ensure alignment of the decimal point.
[#,]#[.#[#]...]
The numeric format specifier. You can specify whether to show commas every third place before the decimal point. Also, you can specify how many digits (if any) to show after the decimal point.
[e | E]
Whether to use exponential format to show numeric values. If this option is omitted, exponential format is used only when the value is too large or small to be shown otherwise. You can specify display of a lowercase or uppercase exponent indicator.
['text' | $ | %]
A text character or string to use as a suffix for the result string. You must enclose the character or text with single quotation marks unless the suffix is one dollar or percent sign.
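As a rough mental model, several of these elements map onto ordinary fixed-point formatting. The Python sketch below is only an approximation of the described behavior, not the convert function itself: the helper name fmt and its keyword arguments are hypothetical, and it covers just right-justification, optional commas, a fixed number of decimals, and a prefix or suffix. It does not model the exponential fallback or the negative-value specifiers.

```python
def fmt(value, decimals, width, commas=False, prefix="", suffix=""):
    """Format value with fixed decimals, then right-justify it in width characters."""
    spec = f"{',' if commas else ''}.{decimals}f"  # e.g. ",.2f" for "#,#.##"
    return (prefix + format(value, spec) + suffix).rjust(width)

print(repr(fmt(14773.1234, 1, 10)))                             # models "#.#"
print(repr(fmt(736620.3795, 2, 12, commas=True, prefix="$")))   # models "$#,#.##"
print(repr(fmt(56.75, 2, 8, suffix="%")))                       # models "#.##%"
```

The repr calls make the padding spaces visible, which is the effect of the width parameter combined with the default right justification.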
Formatting Date/Time Values
The format specifier for date/time values is given in the above syntax box. The date/time format specifier can contain any number of text items or special characters that are interspersed with the date or time codes. You can
arrange these items in any order, but a time specifier must adhere to the ordering rules described in the syntax
under "time_code". For the minute codes to be interpreted as minutes (and not months) they must follow the
hour codes. You cannot specify the minutes of a time value without also specifying the hour. You can specify the
hour by itself. Similarly, you cannot specify the seconds without having specified minutes and you cannot specify
fractions of a second without specifying seconds. Thus, the order "hours, minutes, seconds, fractions" must be
preserved.
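For comparison, the sketch below renders a timestamp with Python's strftime, which avoids the month/minute ambiguity by using distinct codes (%m for month, %M for minute) instead of a positional rule. It is an analogy under that assumption, not a description of how convert is implemented; the function name render is hypothetical, and the English month and day names assume Python's default locale. The timestamp is the one used in this function's examples.

```python
from datetime import datetime

def render(ts: datetime) -> dict:
    """Map a few RDM-style format specifiers to the nearest strftime result."""
    return {
        "mmm dd, yyyy": ts.strftime("%b %d, %Y"),
        "dddd hh.mm.ss": ts.strftime("%A %H.%M.%S"),  # %M is minutes here
        "yyyy.mm.dd": ts.strftime("%Y.%m.%d"),        # %m is the month here
    }

moment = datetime(1951, 10, 23, 4, 42, 27, 175000)
for spec, text in render(moment).items():
    print(f"{spec:15} {text}")
```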
Date and Time Format Specifier Elements
General Formatting Elements
Element
Description
[<< | >> | ><]
The justification specifier. You can specify left-justified text (<<), right-justified text (>>), or centered text (><). The default for date/time values is left-justified.
'text' | spchar
A string or a special character (for example, "-", "/", or ".") to be copied into the result string. The special character is often useful in separating the entities within a date and time.
Date-Specific Formatting Elements
Element
Description
m
Month number (1-12) without a leading zero.
mm
Month number with a leading zero.
mmm
Three-character month abbreviation (e.g., "Jan").
mon
Same as mmm.
mmmm
Fully spelled month name (e.g., "January").
month
Same as mmmm.
d
Day of month (1-31) without leading zero.
dd
Day of month with leading zero.
ddd
Three character day of week abbreviation (e.g., "Wed").
dddd
Fully spelled day of week (e.g., "Wednesday").
day
Same as dddd.
yy
Two-digit year AD with leading zero if year between 1950 and 2049; otherwise same as yyyy.
yyyy
Year AD up to four digits without leading zero.
Time-Specific Formatting Elements
Element
Description
h
Hour of day (0-12 or 23) without leading zero.
hh
Hour of day with leading zero.
m
Minute of hour (0-59) without leading zero (only after h or hh).
mm
Minute of hour with leading zero (only after h or hh).
s
Second of minute (0-59) without leading zero (only after m or mm).
ss
Second of minute with leading zero (only after m or mm).
.f[f]...
Fraction of a second: four decimal place accuracy (only after s or ss).
a/p | am/pm | A/P | AM/PM
Hour of day is 0-12; AM or PM indicator will be output to result string (only after last time code element).
Example
The following examples show numeric format specifiers and their results.
Function                                        Result
convert(14773.1234, char, 10, "#.#")            " 14773.1"
convert(736620.3795, char, 12, "#,#.###")       "736,620.380"
convert(736620.3795, char, 12, "$#,#.##")       "$736,620.38"
convert(736620.3795, char, 12, "<<#.######e")   "7.366204e05"
convert(56.75, char, 8, "#.##%")                " 56.75%"
convert(56.75, char, 8, "#.##' percent'")       " 56.75 percent"
The examples below show date/time format specifiers and corresponding results. These examples show how
Tuesday, October 23, 1951 at 4:42:27.1750 a.m. can be returned. The format specifier, rather than the entire
function, is shown here in the left column.
Format Spec.                        Result
mmm dd, yyyy                        Oct 23, 1951
hh'hours' on ddd month dd, yyyy     04hours on Tue October 23, 1951
dd 'of' month 'of the year' yyyy    23 of October of the year 1951
dddd hh.mm.ss.ffff mm-dd-yyyy       Tuesday 04.42.27.1750 10-23-1951
'date:'yyyy.mm.dd 'at' hh:mm A/P    date:1951.10.23 at 04:42 AM
cos
Retrieve the cosine of an angle
Syntax
cos(arith_expr)
Parameters
arith_expr
An arithmetic expression.
Description
This scalar numeric function retrieves the cosine of the specified arithmetic expression. Cosine operations return
values between -1.0 and +1.0.
cot
Retrieve the cotangent of an angle
Syntax
cot(arith_expr)
Parameters
arith_expr
An arithmetic expression.
Description
This scalar numeric function retrieves the cotangent of the specified arithmetic expression.
count
Count the rows of an aggregate result set
Syntax
count({* | column_name})
Parameters
*
All columns of the result set.
column_name
A column name.
Description
This aggregate (calculation) function returns the total number of rows of an aggregate.
Example
select company, count(ord_num) from customer natural join sales_order
group by 1;
COMPANY                            COUNT(ORD_NUM)
"Bills We Pay" Financial Corp.     5
Bears Market Trends, Inc.          5
Bengels Imports                    5
Broncos Air Express                7
Browns Kennels                     7
Bucs Data Services                 4
Cardinals Bookmakers               5
Chargers Credit Corp.              3
Chiefs Management Corporation      5
Colts Nuts & Bolts, Inc.           8
Cowboys Data Services              3
Dolphins Diving School             2
Eagles Electronics Corp.           5
Falcons Microsystems, Inc.         3
Forty-Niners Venture Group         3
Giants Garments, Inc.              2
Jets Overnight Express             4
Lions Motor Company                5
Oilers Gas and Light Co.           3
Packers Van Lines                  4
Patriots Computer Corp.            6
Raiders Development Co.            4
Rams Data Processing, Inc.         8
Redskins Outdoor Supply Co.        4
Saints Software Support            3
Seahawks Data Services             6
Steelers National Bank             2
Vikings Athletic Equipment         6
curdate
Retrieve the current date
Syntax
curdate()
Description
This scalar date/time function retrieves the current date. You can also use today as a literal for the current date.
See Also
curtime
curtime
Retrieve the current time
Syntax
curtime()
Description
This scalar date/time function retrieves the current local (server) time.
See Also
curdate
dayofmonth
Retrieve the day of the month
Syntax
dayofmonth(date_expr)
Parameters
date_expr
A date expression from which the day of the month will be extracted.
Description
This scalar date/time function retrieves the day of the month in the specified date expression as a number
between 1 and 31.
The data type 'date' assumes the Gregorian calendar even for dates prior to the introduction of the Gregorian calendar. As a result, for databases that store historical dates from before that introduction, select statements with date ranges and the dayofweek and week functions may not produce correct results.
dayofweek
Retrieve the day of the week
Syntax
dayofweek(date_expr)
Parameters
date_expr
A date expression from which the day of week will be extracted.
Description
This scalar date/time function retrieves the day of the week in the specified date expression as a number
between 1 and 7, where 1 is Sunday.
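The numbering convention (1 is Sunday) can be modeled in Python as below. This is a sketch of the described mapping only; the Python function named dayofweek is hypothetical and is not RDM code. Python's datetime.date also extends the Gregorian calendar backward indefinitely, so it behaves like the 'date' type in that respect.

```python
from datetime import date

def dayofweek(date_expr: date) -> int:
    """Return 1 for Sunday through 7 for Saturday."""
    # isoweekday() yields 1 = Monday ... 7 = Sunday, so rotate by one day.
    return date_expr.isoweekday() % 7 + 1

print(dayofweek(date(1951, 10, 23)))  # a Tuesday
print(dayofweek(date(2023, 1, 1)))    # a Sunday
```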
The data type 'date' assumes the Gregorian calendar even for dates prior to the introduction of the Gregorian calendar. As a result, for databases that store historical dates from before that introduction, select statements with date ranges and the dayofweek and week functions may not produce correct results.
dayofyear
Retrieve the day of the year
Syntax
dayofyear(date_expr)
Parameters
date_expr
A date expression from which the day of the year will be extracted.
Description
This scalar date/time function retrieves the day of the year in the specified date expression as a number between
1 and 366.
The data type 'date' assumes the Gregorian calendar even for dates prior to the introduction of the Gregorian calendar. As a result, for databases that store historical dates from before that introduction, select statements with date ranges and the dayofweek and week functions may not produce correct results.
exp
Retrieve the value of an exponential function
Syntax
exp(arith_expr)
Parameters
arith_expr
An arithmetic expression.
Description
This scalar numeric function retrieves the value of an exponential function with the specified arithmetic expression as an exponent (that is, e raised to the power arith_expr).
floor
Find the lower bound for an arithmetic expression
Syntax
floor(arith_expr)
Parameters
arith_expr
An arithmetic expression.
Description
This scalar numeric function retrieves the lower bound (floor) for the specified arithmetic expression. The floor is
the largest integer less than or equal to the expression.
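The definitions of floor and ceiling differ most visibly for negative values. The short Python sketch below uses math.floor and math.ceil, which follow the same mathematical definitions; it is an illustration, not RDM output, and the sample values are arbitrary.

```python
import math

samples = [2.3, -2.3, 5.0]

# floor: largest integer less than or equal to the value
floors = [math.floor(x) for x in samples]
# ceiling: smallest integer greater than or equal to the value
ceilings = [math.ceil(x) for x in samples]

for x, lo, hi in zip(samples, floors, ceilings):
    print(f"x={x:5}  floor={lo:3}  ceiling={hi:3}")
```

Note that the floor of -2.3 is -3, not -2: rounding is toward negative infinity, not toward zero.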
hour
Retrieve the hour
Syntax
hour(time_expr)
Parameters
time_expr
An expression representing either a time or a timestamp value.
Description
This scalar date/time function retrieves the hour in the specified time expression as a number between 0 and 23.
if
Implement a conditional selection
Syntax
if(cond_expr,expression1,expression2)
Parameters
cond_expr
The conditional expression.
expression1
The expression to be evaluated and returned if the conditional expression evaluates to TRUE.
expression2
The expression to be evaluated and returned if the conditional expression evaluates to FALSE.
Description
This function conditionally evaluates one of two expressions for each row of the select statement in which it is
used. The expression to be evaluated and returned is based on the value of the specified conditional expression
for each row. If the conditional expression evaluates to TRUE, the function evaluates and returns the value of the first
expression (expression1). If the conditional expression evaluates to FALSE, the function evaluates and
returns the value of the second expression (expression2). Both expressions must return values of identical
data types.
Example
select quantity, prod_id, prod_desc,
    if(quantity > 20, .8*price, if(quantity > 5, .9*price, price)) "PRICE"
from item natural join product;

update sales_order
    set tax = if(state="WA", amount*0.085, if(state="CO", amount*0.062, 0.0))
    where state in ("CA","WA");

select
    sum(if(prod_id=10320, quantity, 0)) "386/20",
    sum(if(prod_id=10333, quantity, 0)) "386/33",
    sum(if(prod_id=10433, quantity, 0)) "486/33",
    sum(if(prod_id=10450, quantity, 0)) "486/50"
from item;
ifnull
Retrieve an expression if another expression is null
Syntax
ifnull(expr1, expr2)
Parameters
expr1
The expression to be evaluated and, if not null, returned.
expr2
The expression to be evaluated and returned if expr1 is null.
Description
This system function retrieves the value of the first specified expression (expr1) if it is not null. If expr1 is null,
the ifnull function returns the value of second expression (expr2). The two expressions must be of compatible
data types.
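The selection rule can be modeled in a few lines of Python, with None standing in for SQL null. This is a sketch of the described semantics only; the Python function is not part of RDM.

```python
def ifnull(expr1, expr2):
    """Return expr1 unless it is null (None), in which case return expr2."""
    return expr2 if expr1 is None else expr1

print(ifnull(None, 0))  # expr1 is null, so expr2 is returned
print(ifnull(5, 0))     # expr1 is not null
print(ifnull(0, 9))     # zero is a value, not null, so expr1 is returned
```

The last call shows the distinction that matters in practice: a zero or empty value is not null and is returned unchanged.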
log
Retrieve the natural logarithm of an expression
Syntax
log(arith_expr)
Parameters
arith_expr
An arithmetic expression.
Description
This scalar numeric function retrieves the natural logarithm of the specified arithmetic expression.
max
Compute the maximum of the results for an aggregate
Syntax
max(expression)
Parameters
expression
The expression from which the maximum value is to be determined.
Description
This aggregate (calculation) function computes the maximum value for the specified expression for all rows of an
aggregate.
Example
set double display(12, "#,#.##");
select month(ord_date), max(amount) from sales_order group by 1;

month(ord_date)    max(amount)
1                  274,375.00
2                  124,660.00
3                  143,375.00
4                  252,425.00
5                  39,675.95
6                  104,019.50
min
Compute the minimum of the results for an aggregate
Syntax
min(expression)
Parameters
expression
The expression from which the minimum value is to be determined.
Description
This aggregate (calculation) function computes the minimum value for the specified expression for all rows of an
aggregate.
Example
set double display(12, "#,#.##");
select month(ord_date), min(amount) from sales_order group by 1;

month(ord_date)    min(amount)
1                  408.00
2                  344.48
3                  631.78
4                  68.75
5                  2,673.75
6                  4,487.76
minute
Retrieve the minute
Syntax
minute(time_expr)
Parameters
time_expr
An expression representing either a time or a timestamp value.
Description
This scalar date/time function returns the minute in the specified time expression as a number between 0 and 59.
mod
Perform a modulo arithmetic operation
Syntax
mod(arith_expr1,arith_expr2)
Parameters
arith_expr1
The expression to divide.
arith_expr2
The expression that is used as the divisor.
Description
This scalar numeric function performs a modulo arithmetic operation of the form arith_expr1 modulo
arith_expr2. In other words, the function retrieves the remainder resulting from dividing arith_expr1 by
arith_expr2.
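The remainder relationship can be sketched in Python as below. The sketch is deliberately restricted to a non-negative dividend and a positive divisor, because the sign of the result for negative operands is not stated here and differs between languages; the Python function mod is a hypothetical model, not RDM code.

```python
def mod(arith_expr1: int, arith_expr2: int) -> int:
    """Remainder of arith_expr1 / arith_expr2 for non-negative operands."""
    if arith_expr1 < 0 or arith_expr2 <= 0:
        raise ValueError("sketch covers arith_expr1 >= 0 and arith_expr2 > 0 only")
    quotient, remainder = divmod(arith_expr1, arith_expr2)
    # Invariant: arith_expr1 == quotient * arith_expr2 + remainder
    return remainder

print(mod(17, 5))  # 17 = 3 * 5 + 2
print(mod(20, 5))  # divides evenly
```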
month
Retrieve the month
Syntax
month(date_expr)
Parameters
date_expr
A date expression.
Description
This scalar date/time function retrieves the number of the month in the specified date expression as a number
between 1 and 12.
Example
set double display(12, "#,#.##");
select month(ord_date), min(amount) from sales_order group by 1;

month(ord_date)    min(amount)
1                  408.00
2                  344.48
3                  631.78
4                  68.75
5                  2,673.75
6                  4,487.76
pi
Retrieve the value of pi
Syntax
pi()
Description
This scalar numeric function retrieves the value of pi as a double data type (3.14159...).
quarter
Retrieve the quarter
Syntax
quarter(date_expr)
Parameters
date_expr
A date expression.
Description
This scalar date/time function retrieves the number of the quarter in the specified date expression as a number
between 1 and 4.
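A minimal Python model of this mapping follows. It assumes calendar quarters, with quarter 1 covering January through March; the text does not spell that out, so treat it as an assumption. The Python function quarter is hypothetical and not RDM code.

```python
from datetime import date

def quarter(date_expr: date) -> int:
    """Calendar quarter 1..4, assuming quarter 1 is January through March."""
    return (date_expr.month - 1) // 3 + 1

print(quarter(date(2000, 3, 31)))   # last day of the first quarter
print(quarter(date(2000, 4, 1)))    # first day of the second quarter
print(quarter(date(1951, 10, 23)))
```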
query
Evaluate a single-row query
Syntax
query(select_stmt_str[, param_val]...)
Parameters
select_stmt_str
A string which specifies the select statement to be executed. The select statement must return at most one row. If no rows are returned then the function returns a null value. The select statement can contain parameter markers.
param_val
Provides the value for the corresponding parameter marker in select_stmt_str. For each parameter marker specified in the select statement there must be a param_val argument, and the param_val arguments must be listed in the same order as the parameter markers in the select statement.
Description
This scalar function executes the select statement specified in the select_stmt_str argument. The select
statement must select only one column and return only one row. Parameter markers (indicated by a '?') can be
specified in the select statement string. For each one that is specified, a param_val argument that supplies the
value of the parameter marker must be provided.
This function allows single-valued queries to be specified in expression evaluation contexts where normal subqueries are not allowed.
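The contract described above (one column, at most one row, positional '?' markers, null when no row matches) can be modeled with Python's built-in sqlite3 module, which also uses '?' as its positional placeholder. This is a stand-in sketch under that assumption, not RDM's implementation; the Python helper query, the explicit connection argument, and the sample rows are hypothetical.

```python
import sqlite3

def query(con, select_stmt_str, *param_vals):
    """Run a one-column select; return its single value, or None if no row."""
    # param_vals are bound to the '?' markers in the order given.
    rows = con.execute(select_stmt_str, param_vals).fetchall()
    if len(rows) > 1:
        raise ValueError("select statement returned more than one row")
    return rows[0][0] if rows else None

con = sqlite3.connect(":memory:")  # stand-in engine, not RDM
con.execute("create table sales_order (cust_id text, amount real)")
con.executemany(
    "insert into sales_order values (?, ?)",
    [("SEA", 100.0), ("SEA", 250.5), ("DEN", 40.0)],
)

total = query(con, "select sum(amount) from sales_order where cust_id=?", "SEA")
missing = query(con, "select amount from sales_order where cust_id=?", "XXX")
print(total, missing)
```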
Example
update customer set sales_tot =
query("select sum(amount) from sales_order where cust_id=?", cust_id);
select sale_name,
query("select city from outlet where loc_id=?", office) office
from salesperson;
rand
Retrieve a random floating-point number
Syntax
rand(num)
Parameters
num
An integer to use as the seed for the floating-point number.
Description
This scalar numeric function retrieves a random floating-point number (between 0.0 and 1.0) using the specified
integer as the seed. If 0 is specified, the rand function retrieves the next random floating-point number for the current seed.
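The seeding behavior can be modeled with Python's random.Random: a non-zero argument restarts the sequence from that seed, and 0 continues the current sequence. The class SeededRand below is a hypothetical model of those semantics only; the numbers it produces have no relation to the values RDM would return.

```python
import random

class SeededRand:
    """Model of rand(num): non-zero num reseeds, 0 continues the sequence."""

    def __init__(self):
        self._rng = random.Random()

    def rand(self, num: int) -> float:
        if num != 0:
            self._rng.seed(num)    # restart the sequence from this seed
        return self._rng.random()  # next value in the current sequence

gen = SeededRand()
first = gen.rand(42)   # seeds with 42, returns the first value
second = gen.rand(0)   # continues the same sequence
print(first, second)
```

Reseeding with the same non-zero value reproduces the same sequence, which is useful for repeatable test data.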
second
Retrieve the second
Syntax
second(time_expr)
Parameters
time_expr
An expression that is either a time or a timestamp value.
Description
This scalar date/time function returns the second in the specified time expression as a number between 0 and
59.
sign
Retrieve the sign of an expression
Syntax
sign(arith_expr)
Parameters
arith_expr
An arithmetic expression.
Description
This scalar numeric function returns -1 if arith_expr is less than 0, 0 if arith_expr equals 0, and 1 if
arith_expr is greater than 0.
sin
Retrieve the sine of an angle
Syntax
sin(arith_expr)
Parameters
arith_expr
An arithmetic expression.
Description
This scalar numeric function retrieves the sine of the specified arithmetic expression. Sine operations return
values between -1.0 and +1.0.
sqrt
Retrieve the square root of an expression
Syntax
sqrt(arith_expr)
Parameters
arith_expr
An arithmetic expression.
Description
This scalar numeric function retrieves the square root of the specified arithmetic expression.
sum
Compute the sum of the results for an aggregate
Syntax
sum(arith_expr)
Parameters
arith_expr
An arithmetic expression.
Description
This aggregate (calculation) function computes the sum of the results of the specified expression for all rows of an
aggregate.
Example
set double display(12, "#,#.##");
select cust_id, company, sum(amount) from customer natural join sales_order
group by 1;

cust_id  company                          sum(amount)
ATL      Falcons Microsystems, Inc.       113,659.75
BUF      'Bills We Pay' Financial Corp.   263,030.36
CHI      Bears Market Trends, Inc.        160,224.65
CIN      Bengels Imports                  120,800.56
CLE      Browns Kennels                   43,284.54
DAL      Cowboys Data Services            43,392.40
DEN      Broncos Air Express              498,952.76
DET      Lions Motor Company              439,346.50
GBP      Packers Van Lines                163,177.30
HOU      Oilers Gas and Light Co.         77,781.36
IND      Colts Nuts & Bolts, Inc.         29,053.30
KCC      Chiefs Management Corporation    141,535.34
LAA      Raiders Development Co.          167,411.68
LAN      Rams Data Processing, Inc.       172,936.31
MIA      Dolphins Diving School           29,481.99
MIN      Vikings Athletic Equipment       49,461.20
NEP      Patriots Computer Corp.          120,184.69
NOS      Saints Software Support          185,633.50
NYG      Giants Garments, Inc.            15,829.64
NYJ      Jets Overnight Express           124,487.78
PHI      Eagles Electronics Corp.         130,006.17
PHO      Cardinals Bookmakers             237,392.56
PIT      Steelers National Bank           15,386.04
SDC      Chargers Credit Corp.            34,556.48
SEA      Seahawks Data Services           60,756.36
SFF      Forty-niners Venture Group       112,345.66
TBB      Bucs Data Services               104,038.25
WAS      Redskins Outdoor Supply Co.      63,039.90
tan
Retrieve the tangent of an angle
Syntax
tan(arith_expr)
Parameters
arith_expr
An arithmetic expression.
Description
This scalar numeric function retrieves the tangent of the specified arithmetic expression.
week
Retrieve the week
Syntax
week(date_expr)
Parameters
date_expr
A date expression.
Description
This scalar date/time function retrieves the number of the week of the year in the specified date expression as a
number between 1 and 53.
The data type 'date' assumes the Gregorian calendar even for dates prior to the introduction of the Gregorian calendar. As a result, for databases that store historical dates from before that introduction, select statements with date ranges and the dayofweek and week functions may not produce correct results.
year
Retrieve the year
Syntax
year(date_expr)
Parameters
date_expr
A date expression.
Description
This scalar date/time function retrieves the number of the year in the specified date expression.
SQL Language Syntax Summary
The syntax for the SQL statements that are implemented in RDM SQL is given below. Note that those items in
red have not yet been implemented. Refer to "A Language for Describing a Language" for a description of how to
read the syntax specification. C-style comments are explanatory and not part of the syntax.
RDM_SQL:
RDM_ddl_stmts | RDM_dml_stmts | RDM_proc_stmts
RDM DDL Statements
RDM_ddl_stmts:
create_schema_stmt
{create_domain_stmt | create_table_stmt}...
{create_catalog_stmt}
create_schema_stmt:
create {schema | database} db_name
[pagesize = num] [inmemory [persistent | volatile | read]]
create_domain_stmt:
create domain domain_name [as] data_type
[default {constant | null}]
create_catalog_stmt:
create catalog for dbname
create_table_stmt:
standard_table | virtual_table
virtual_table:
create virtual [read only] table table_name (
vcolumn_def[, vcolumn_def]…
)
vcolumn_def:
column_name base_type
[distinct values = num] [range constant to constant]
[primary key]
standard_table:
create [circular] table table_name (
column_def[, column_def]...
[, key_def[, key_def]...]
) [pagesize = num] [inmemory [persistent | volatile | read]]
[maxpgs = num] [maxrows = num]
column_def:
column_name {type_spec | domain_name}
[distinct values = num] [range constant to constant]
[not null] [key_spec] [refs_spec]
type_spec:
data_type [default {constant | null}]
data_type:
base_type | blob_type
base_type:
{character | char } [(length)]
|
{{character | char} varying | varchar } (length)
|
binary [(length)]
|
{double [precision] | float | real }
|
{ tinyint | smallint | int | integer | long | bigint}
|
date | time | timestamp
blob_type:
{{character | char} large object | long varchar | clob} [(length)] file_option
|
{binary large object | large varbinary | blob} [(length)] file_option
file_option:
[pagesize = num] [inmemory [persistent | volatile | read]]
key_spec:
[primary | unique] key ['['keysize']']
|
{primary | unique} key [hash { (num) | of num rows}] ['['keysize']']
refs_spec:
references
table_name[.column_name] [triggered_action]
key_def:
[primary | unique] key [hash {(num) | of num rows}] ['['keysize']'] [key_name]
(column_name[asc | desc] [, column_name[asc | desc] ]...)
[pagesize = num] [inmemory [persistent | volatile | read]] [maxpgs = num]
|
foreign key [set_name] (column_name[, column_name]...)
references table_name[(column_name[, column_name]...)]
[triggered_action]
triggered_action:
on update action_spec [on delete action_spec]
|
on delete action_spec [on update action_spec]
action_spec:
cascade | restrict | set null
RDM DML Statements
RDM_dml_stmt:
db_stmt | select_stmt | mod_stmt
|
trans_stmt | lock_stmt | set_stmt
db_stmt:
open_db_stmt | close_db_stmt | init_db_stmt
mod_stmt:
insert_stmt | update_stmt | delete_stmt | import_stmt | export_stmt
trans_stmt:
start_stmt | savepoint_stmt | release_stmt
|
rollback_stmt | commit_stmt | end_trans_stmt
lock_stmt:
lock_stmt | unlock_stmt
open_db_stmt:
open [database] db_spec
[[in] {share | read only | exclusive} [mode] | as union of tfs_spec[, tfs_spec]...]
db_spec:
db_name | "[pathspec/]db_name"
close_db_stmt:
close [database] db_name
init_db_stmt:
initialize [database] db_name
dropdb_stmt:
drop database {db_name | "db_name@tfs_spec"}
tfs_spec:
"HostComputerName[:ddddd]"
select_stmt:
select [first] [all | distinct] {* | select_item[, select_item]...}
from table_ref[, table_ref]...
[where conditional_expr]
[grouping | sorting | grouping sorting]
[limit (num {rows | mins | secs | msecs})]
[for {read only | update [of
column_name[, column_name]...]}]
grouping:
group by sort_col[, sort_col]... [having conditional_expr]
sorting:
order by sort_col [asc | desc][, sort_col [asc | desc]]...
sort_col:
num | column_name
select_item:
expression [alias_name | "column heading"]
table_ref:
table_primary | table_join
table_primary:
table_spec | ( table_join )
table_spec:
[db_name.]table_name [[as] correlation_name]
table_join:
natural_join | qualified_join | cross_join
natural_join:
table_ref natural [inner | {left | right} [outer]] join table_primary
qualified_join:
table_ref [inner | {left | right} [outer]] join table_primary
[using (column_name[, column_name]...) | on conditional_expr]
cross_join:
table_ref cross join table_primary
arith_expr:
expression
/* involving only numeric operands and operations */
dt_expr:
expression
/* involving only date/time/timestamp operands and operations */
string_expr:
expression
/* involving only string operands and operations */
expression:
operand [arith_operator operand]...
operand:
constant | param_ref | column_ref | function | (expression)
param_ref:
? | :param_name
column_ref:
[{table_name | correlation_name}.]column_name
arith_operator:
+|-|*|/
function:
aggregate_fcn | scalar_fcn
aggregate_fcn:
{sum | avg | max | min} (expression)
|
count ({* | column_ref })
|
aggregate_udf_name ([expression][, expression]...)
scalar_fcn:
|
if (conditional_expr, expression, expression)
|
numeric_function | datetime_function | string_function
|
scalar_udf_name ([expression][, expression]...)
numeric_function:
abs(arith_expr)
|
acos(arith_expr)
|
asin(arith_expr)
|
atan(arith_expr)
|
atan2(arith_expr, arith_expr)
|
{ceil | ceiling}(arith_expr)
|
cos(arith_expr)
|
cot(arith_expr)
|
exp(arith_expr)
|
floor(arith_expr)
|
{ln | log}(arith_expr)
|
mod(arith_expr, arith_expr)
|
pi()
|
rand(num)
|
sign(arith_expr)
|
sin(arith_expr)
|
sqrt(arith_expr)
|
tan(arith_expr)
datetime_function:
age(dt_expr)
|
{curdate | current_date}()
|
{curtime | current_time}()
|
dayofmonth(dt_expr)
|
dayofyear(dt_expr)
|
hour(dt_expr)
|
minute(dt_expr)
|
month(dt_expr)
|
quarter(dt_expr)
|
second(dt_expr)
|
week(dt_expr)
|
year(dt_expr)
string_function:
ascii(string_expr)
|
char(num)
|
concat(string_expr, string_expr)
|
convert(expression, {convert_type | {char}, width, convert_format})
|
lcase(string_expr)
|
left(string_expr, num)
|
length(string_expr)
|
locate(string_expr, string_expr, num)
|
ltrim(string_expr)
|
repeat(string_expr, num)
|
replace(string_expr, string_expr, string_expr)
|
right(string_expr, num)
|
rtrim(string_expr)
|
substring(string_expr, num, num)
|
ucase(string_expr)
|
unicode(string_expr)
convert_type:
char | smallint | integer | real
|
double | date | time | timestamp | tinyint | bigint
convert_format:
numeric_format | datetime_format
numeric_format:
"[<< | >> | ><]['text' | $][- | (][#,]#[.#[#]...][e | E]['text' | $ | %]"
datetime_format:
"[<< | >> | ><]['text' | spchar | date_code | time_code]..."
date_code:
m | mm | mmm | mon | mmmm | month
|
d | dd | ddd | dddd | day
|
yy | yyyy
time_code:
h | hh | m | mm | s | ss | .f[f]... | [a/p | am/pm | A/P | AM/PM]
conditional_expr:
rel_expr [bool_oper rel_expr]...
rel_expr:
expression [not] rel_oper expression
|
expression [not] between constant and constant
|
expression [not] in (constant[, constant]...)
|
column_ref is [not] null
|
string_expr [not] like "string"
|
not rel_expr
|
( conditional_expr )
rel_oper:
= | ==
|
<
|
>
|
<=
|
>=
|
<> | != | /=
bool_oper:
& | && | and
|
"|" | "||" | or
insert_stmt:
insert into [db_name.]table_name [(column_name[, column_name]... )] data_source
data_source:
values value_expr[, value_expr]...
|
[from] select_stmt
value_expr:
value_operand [{+ | - | * | /} value_operand]…
value_operand:
constant | arg_name | column_name | ? | scalar_fcn | ( value_expr )
update_stmt:
update [db_name.]table_name
set column_name = expression[, column_name = expression]...
[where {conditional_expr | current of cursor_name}]
delete_stmt:
delete from [db_name.]table_name
[where {conditional_expr | current of cursor_name}]
import_stmt:
import into table_name
from [char | wchar | xml] file "filename"
export_stmt:
export into [char | wchar | xml] file "filename" from select_stmt
start_stmt:
{start trans[action] | begin [work] [trans[action]]} [read only]
savepoint_stmt:
savepoint savepoint_id
release_stmt:
release savepoint savepoint_id
rollback_stmt:
rollback [work] [[to savepoint] savepoint_id]
commit_stmt:
{commit [work] | end [trans[action]]}
end_trans_stmt:
end read only trans[action]
lock_stmt:
lock table [in db_name] table_lock[, table_lock]...
table_lock:
table_name [read | write | default]
unlock_stmt:
unlock table {[db_name.]table_name | all}
set_stmt:
set_option_stmt | set_column_stmt
set_option_stmt:
set timeout [to | =] constant
|
set autocommit [to | =] {on | off}
|
set read only trans[action] mode [to | =] {auto | manual}
|
set debug [to | =] {0 | 1}
set_column_stmt:
set column [db_name.]table_name.column_name
[type [to | =] {date | time | timestamp | long {varchar | wvarchar}}]
[distinct values = num]
[range constant to constant]
|
set column stats [db_name.]table_name.column_name
[distinct values = num]
[range constant to constant]
RDM Procedure Statements
RDM_proc_stmts:
create_proc_stmt | drop_proc_stmt | execute_stmt
create_proc_stmt:
create {proc | procedure} proc_name [(arg_name arg_type[, arg_name arg_type]...)] as
{select_stmt... |
[start_stmt] {insert_stmt | update_stmt | delete_stmt}... [commit_stmt]}
end {proc | procedure}
arg_type:
{character | char }
|
{double [precision] | float | real }
|
{tinyint | smallint | int | integer | long | bigint}
|
date | time | timestamp
drop_proc_stmt:
drop proc[edure] proc_name
execute_stmt:
[exec[ute] | run] proc_name [(constant[, constant]...)]
SQL Reserved Words for RDM
The table below lists reserved words that cannot be used when creating your SQL schema, except for their
intended purpose (e.g., the reserved word "DATABASE" cannot be used as your database name because
it is used in the SQL grammar "CREATE DATABASE ...").
Note: * Represents reserved words that are not reserved in the SQL Standard but are reserved in the underlying Native DDL.
ABS
ACOS
AGE
ALL
*ASC
*ASCENDING
ASCII
ASIN
ATAN
ATAN2
AVG
BEGIN
BIGINT
BIT
*BLOB
BOOLEAN
*BY
CEIL
CEILING
CHAR
CHARACTER
CHARACTER_LENGTH
*CIRCULAR
COMMIT
*COMPACT
*COMPOUND
CONCAT
*CONST
*CONTAINS
CONVERT
COS
COT
COUNT
CROSS
CURDATE
CURRENT_DATE
CURRENT_TIME
CURRENT_TIMESTAMP
CURTIME
DATA
DATABASE
DATE
DATETIME
DAYOFMONTH
DAYOFWEEK
DAYOFYEAR
DB_ADDR
DBA4
DBA8
DELETE
*DESC
*DESCENDING
DISTINCT
DOUBLE
END
EXP
EXPORT
FALSE
*FILE
FIRST
FLOAT
FLOOR
FOR
FOREIGN
FROM
FULL
GROUP
HASH
HOUR
IF
IFNULL
IMPORT
IN
INDEX
*INITIAL
INNER
INSERT
INSSTR
INT
*INT16_T
*INT32_T
*INT64_T
*INMEMORY
INTEGER
JOIN
KEY
*LAST
LCASE
LEFT
LENGTH
LIMIT
LN
LOCALTIME
LOCALTIMESTAMP
LOCATE
LOG
LONG
LOWER
LTRIM
MAX
*MAXPGS
*MAXSLOTS
*MEMBER
MIN
MINUTE
MOD
MONTH
NAT
NATURAL
*NEXT
NOT
NOW
NULL
*NULLABLE
OCTET_LENGTH
ON
*OPT
*OPTIONAL
*ORDER
*OWNER
*PAGESIZE
*PCTINCREASE
*PERSISTENT
PI
QUARTER
RAND
*READ
REAL
*RECORD(S)
REPEAT
REPLACE
RIGHT
ROLLBACK
ROUND
ROWID
RTRIM
SECOND
SELECT
*SET
SHORT
SIGN
SIN
SMALLINT
SQRT
*STATIC
*STRUCT
SUBSTRING
SUM
TAN
*THRU
TIME
TIMESTAMP
TINYINT
TRUE
TYPE
*TYPEDEF
TYPEOF
UCASE
UNICODE
UNIQUE
UNLOCK
*UNSIGNED
UPDATE
UPPER
USING
*VARDATA
*VOLATILE
WCHAR
WCHARACTER
WEEK
WHERE
WORK
YEAR
SQL Statement Reference
The primary purpose of the Data statement is to give names to constants;
instead of referring to pi as 3.141592653589793 at every appearance,
the variable Pi can be given that value with a Data statement and used
instead of the longer form of the constant. This also simplifies
modifying the program, should the value of pi change.
- Fortran manual for Xerox Computers
Each individual SQL statement is described in this section. The descriptions are listed in alphabetical order by
statement. Oh, and sorry, we don't have a data statement (but we do have pi; our version, however, requires that
it never change value!).
The following table summarizes each RDM SQL statement.
Table 23. RDM SQL Statement Summary
Statement                    Description
close                        Close an open database
commit / end                 Commit transaction's changes to the database
create catalog               Create a new catalog file
create database              Create a database definition
create domain                Create a column domain specification
create procedure             Create a stored procedure
create table                 Create a table definition
create virtual table         Create a virtual table for an external data source
delete                       Delete one or more rows from a table
drop database                Drop (delete) a database
drop procedure               Drop a stored procedure
end read only transaction    Terminate a read only transaction
execute                      Execute a stored procedure
export                       Export select results to an external file
import                       Import data into a table from an external file
initialize                   Initialize a database
insert                       Insert a row or rows into a table
lock table                   Explicitly lock one or more tables
open                         Open a database
release                      Release a transaction savepoint
rollback                     Rollback (undo) a transaction's changes
savepoint                    Mark a transaction savepoint
select                       Retrieve a set of rows of data from the database
set                          Set an SQL operational parameter value
set column                   Set column statistics or SQL type for core database column
start / begin                Start a transaction
unlock table                 Unlock (all) read-locked table(s)
update                       Update one or more rows in a table
close
Close an open database
Syntax
close_db_stmt:
close [database] db_name
Description
The close statement can be used to close any open database. Attempts to execute a close statement when a
transaction is active will result in an error.
Example
open bookshop;
...access bookshop database
close bookshop;
open database nsfawards;
...access nsfawards database
close database nsfawards;
See Also
open
commit
Commit transaction's changes to the database
Syntax
commit_stmt:
{commit [work] | end [trans[action]]}
Description
The commit statement causes all database modifications that have been made since the beginning of the transaction to be permanently written to the database. Upon successful return the transaction's changes are guaranteed to be in the database and all locks are freed.
A transaction is explicitly started through execution of a start transaction statement or implicitly through the
execution of the first database modification statement (insert, update, or delete). It is recommended that you
always use the start transaction statement to mark the beginning of a transaction.
RDM SQL also provides the ability to run in auto-commit mode in which each insert, update, and delete statement is automatically committed. This mode is made available to support some third-party ODBC tools. However, the use of auto-commit mode is not recommended as transactions are designed to allow the grouping of
related database changes and that is not possible when running with auto-commit enabled.
Execution of a commit statement when a transaction is not currently active will free all of the read locks held by
the connection.
Example
start transaction;
... insert, update, and/or delete statements
commit;
See Also
start
rollback
set autocommit
create catalog
Create a new catalog file
Syntax
create_catalog_stmt:
create catalog for dbname
Description
The create catalog statement is used to either create a catalog file for an RDM core (i.e., non-SQL) database or
to update the catalog of an RDM SQL database in order to store column statistics updated through prior calls to
the set column statement.
When a core database is opened in SQL, the RDM SQL engine creates an internal catalog from the core database dictionary. Once opened, since the database dictionary does not contain the range and distinct values that
in SQL can be specified for table columns, the set column statement can be used to provide this information.
Moreover, as core databases also do not distinguish between character and binary blob data, the set column
statement can be used to specify a blob column to be either a long varchar or long wvarchar. Having done so, a
catalog containing the SQL version of the core database along with the additional information provided in previously executed set column statements can be permanently stored in a catalog by executing the create catalog
statement.
For an SQL database, this statement can be used to update the column statistics specified in previously
executed set column statements contained in the catalog for the specified database.
Execution of this statement requires that the database has been opened in exclusive access mode. This statement is not transactional. Hence, once executed it cannot be undone.
Example
open database mycoredb in exclusive mode;
set column geosensor.descr to long varchar;
set column geosensor.type distinct values 20;
... other set column statements
create catalog for mycoredb;
See Also
open
set column
create database
Create a database definition
Syntax
create_schema_stmt:
create {schema | database} db_name
[pagesize = num] [inmemory [persistent | volatile | read]]
Description
The create database statement is used to introduce the database definition for a new database. The definition is
contained in the sequence of DDL statements (create domain or create table) that are submitted immediately
following this statement. The name of the database is specified by the db_name identifier.
The system stores the rows of each database table in a separate system file. The indexes associated with keys are likewise stored in separate system files. The default page size for the database files is 1024 bytes but
can be changed by the pagesize option. This will be the default page size used for each database file created for
the database. Specific page sizes for tables and keys that override the default can be specified in the create
table statement.
You can specify that all database files are to be stored in shared memory by including the inmemory option. The
read, persistent, and volatile options control whether the database files are read from disk when the database is
opened (read, persistent), and whether they are written to the disk when the database is closed (persistent).
The default is volatile meaning that the database is created empty each time it is opened. The read option
means that the entire database is read from the files when the database is opened, changes to the data are
allowed but are not written back to the files on closing. The persistent option means that the entire database is
read on opening and all changes that were made while the database was open are written when the database is
closed. As with the pagesize option, the create table statement allows specific tables and/or keys to be inmemory.
The database is automatically created and initialized upon the successful compilation of all of its subsequent
DDL statements and execution of the first non-DDL statement (usually commit) that follows the DDL statements. At that point, the database is open and ready for use.
Only one create database statement can be issued in a given connection, and no other databases can be open when the
create database statement is issued.
Example
create database bookshop pagesize=4096;
create table author(
last_name char(11) primary key,
full_name char(35),
gender char(1),
yr_born smallint,
yr_died smallint,
short_bio varchar(250)
);
... other DDL statements for the bookshop database
commit;
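The inmemory option described above can be sketched in the same way. The following is a minimal illustration only, built from the create database syntax shown in this section; the database name scratch, the table reading, and its columns are hypothetical and do not appear elsewhere in this guide.

```sql
-- Hypothetical database held in shared memory. With "persistent", the
-- files are read when the database is opened and written back when it
-- is closed; "read" would load them but discard changes, and "volatile"
-- (the default) would start empty on every open.
create database scratch pagesize = 2048 inmemory persistent;
create table reading(
    sensor_id smallint not null,
    taken_at timestamp not null,
    reading_val double
);
commit;
```

Replacing persistent with read or volatile changes only what happens at open and close time, as described above; the table definitions themselves stay the same.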
See Also
create domain
create table
create domain
Create a column domain specification
Syntax
create_domain_stmt:
create domain domain_name [as] data_type
[default {constant | null}]
data_type:
base_type | blob_type
base_type:
{character | char } [(length)]
|
{{character | char} varying | varchar } (length)
|
binary [(length)]
|
{double [precision] | float | real }
|
{ tinyint | smallint | int | integer | long | bigint}
|
date | time | timestamp
blob_type:
{{character | char} large object | long varchar | clob} [(length)] file_option
|
{binary large object | large varbinary | blob} [(length)] file_option
file_option:
[pagesize = num] [inmemory [persistent | volatile | read]]
Description
A "domain" is simply a user-defined and named data type which can then be specified as the data type for columns declared in a create table statement. The create domain statement must be submitted before any create
table statements that reference it.
The name of the domain is specified as the domain_name. The data_type specifies the base type for the domain.
A constant value or null can be specified as the default.
Example
create database bookshop;
create domain money as double default null;
create table book(
bookid char(14) primary key,
last_name char(11) references author,
title varchar(255),
descr char(61),
publisher char(136),
publ_year smallint,
lc_class char(33),
date_acqd date,
date_sold date,
price money,
cost money
);
See Also
create database
create table
create procedure
Create a stored procedure
Syntax
create_proc_stmt:
create {proc | procedure} proc_name [(arg_name arg_type[, arg_name arg_type]...)] as
{select_stmt... |
[start_stmt] {insert_stmt | update_stmt | delete_stmt}... [commit_stmt]}
end {proc | procedure}
arg_type:
{character | char }
|
{double [precision] | float | real }
|
{tinyint | smallint | int | integer | long | bigint}
|
date | time | timestamp
Description
Stored procedures that execute one or more basic SQL statements can be created with the create procedure
statement. A stored procedure can either contain one or more select statements (retrieval procedure) or a
sequence of insert, update, and/or delete statements (modification procedure) optionally enclosed in a transaction (transactional procedure). The name of the stored procedure is specified by the identifier proc_name, and the procedure
can be executed using the execute statement.
Any number of arguments can be declared with the stored procedure. Each arg_name must be an identifier that
is not an SQL reserved word or the name of any table or column in the database. The type of the argument must
also be specified as shown in the above syntax. Argument values of type char represent a (null-terminated) character string of any length. Each arg_name can be referenced simply by name in any of the stored procedure's
SQL statements in any context in which a value of that data type can be specified.
The additional result sets from a retrieval procedure that contains more than one select statement are accessed
by the application through a call to the rsqlMoreResults function after the last call to rsqlFetch on the
prior select statement has returned errNOMOREDATA. Function rsqlMoreResults itself will return errNOMOREDATA after the last row of the last result set has been returned.
It is recommended that you use transactional procedures for all of your transactions that involve the execution of
more than one insert, update, and/or delete statement involving modifications to more than one table. Execution
of a modification or transactional procedure will issue a single grouped lock request for all of the referenced
tables at the start of execution so that either all or none of the locks are granted. Grouped locking in this way guarantees that the application is deadlock free. Use of a transactional procedure ensures that either all or none of
the changes are committed to the database.
Execution of a modification procedure when auto-commit mode is enabled behaves the same as a transactional
procedure. This provides a way to ensure that the modifications from more than one statement are committed
together even in auto-commit mode.
An inherited read lock is a read lock that is active at the time a transaction begins (e.g., locks that may be held by
an active cursor on another statement handle in the same connection). In auto-commit mode, all inherited read
locks remain in place after the changes are committed (or rolled back, in the event that one of the modification (or
transactional) procedure's statements encounters an execution error such as a referential integrity violation).
When auto-commit is not active, all transaction commits (or rollbacks) free all locks.
The advantage of using stored procedures is that the cost of compiling the stored procedure statements is
incurred only once. Compiled stored procedures are stored in the referenced database's directory on the TFS in
a file named procname.ssp. An embeddable (through #include directives) C module containing statically initialized tables comprising the compiled form of the procedure is also created. This file along with a companion
header file is named procname_ssp.c (or .h). It can be compiled with your C application and executed directly
through a call to function rsqlExecProc.
Examples
create proc authors_books(lastnm char) as
select publ_yr, title from book where last_name = lastnm
end proc;
...
authors_books("PotterB");
PUBL_YR TITLE
1903 The Tailor of Gloucester
1903 The tale of Squirrel Nutkin
1904 The tale of Benjamin Bunny
1904 The tale of Peter Rabbit; thirty-one illustrations.
1905 The pie and the patty-pan.
1905 The tale of Mrs. Tiggy-Winkle
1906 The tale of Mr. Jeremy Fisher
1908 The tale of Jemima Puddle-Duck
1907 The tale of Tom Kitten
1911 The tale of Timmy Tiptoes
1912 The tale of Mr. Tod
1913 The tale of Pigling Bland
1918 The tale of Johnny Town-mouse
...
create procedure sold(pid char, bid char, offer double, sale_date char) as
start transaction
update book set price = offer, date_sold = sale_date where bookid = bid
insert into sale values bid, pid
commit
end proc;
...
execute sold("SMD", "potter08", 750.0, date "2011-04-03");
...
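As described above, a modification procedure run in auto-commit mode behaves like a transactional procedure. The sketch below illustrates that case under stated assumptions: the bookshop database is open, the book table is the one shown under create domain, and the procedure name reprice, its argument names, and the cost value 400.0 are hypothetical.

```sql
-- Assumes the bookshop database is already open.
set autocommit on;

-- A modification procedure: no explicit start transaction / commit.
-- Per the description above, both updates should still be committed
-- together when the procedure is executed in auto-commit mode.
create proc reprice(bid char, new_price double, new_cost double) as
    update book set price = new_price where bookid = bid
    update book set cost = new_cost where bookid = bid
end proc;

execute reprice("potter08", 750.0, 400.0);
```

Outside of auto-commit mode the same procedure would leave the commit to the caller, which is why the guide recommends the explicit transactional form for multi-statement changes.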
See Also
execute
rsqlExecProc
create table
Create a table definition
Syntax
standard_table:
create [circular] table table_name (
column_def[, column_def]...
[, key_def[, key_def]...]
) [pagesize = num] [inmemory [persistent | volatile | read]]
[maxpgs = num] [maxrows = num]
column_def:
column_name {type_spec | domain_name}
[distinct values = num] [range constant to constant]
[not null] [key_spec] [refs_spec]
type_spec:
data_type [default {constant | null}]
data_type:
base_type | blob_type
base_type:
{character | char } [(length)]
|
{{character | char} varying | varchar } (length)
|
binary [(length)]
|
{double [precision] | float | real }
|
{ tinyint | smallint | int | integer | long | bigint}
|
date | time | timestamp
blob_type:
{{character | char} large object | long varchar | clob} [(length)] file_option
|
{binary large object | large varbinary | blob} [(length)] file_option
file_option:
[pagesize = num] [inmemory [persistent | volatile | read]]
key_spec:
[primary | unique] key ['['keysize']']
|
{primary | unique} key [hash { (num) | of num rows}] ['['keysize']']
refs_spec:
references
table_name[.column_name] [triggered_action]
key_def:
[primary | unique] key [hash {(num) | of num rows}] ['['keysize']'] [key_name]
(column_name[asc | desc] [, column_name[asc | desc] ]...)
[pagesize = num] [inmemory [persistent | volatile | read]] [maxpgs = num]
|
foreign key [set_name] (column_name[, column_name]...)
references table_name[(column_name[, column_name]...)]
[triggered_action]
triggered_action:
on update action_spec [on delete action_spec]
|
on delete action_spec [on update action_spec]
action_spec:
cascade | restrict | set null
Description
The create table statement is used to define a table to be included in the database. Create table statements can
only be issued after the create database statement and before issuing any other non-DDL statements. Any
domain types that are used in column declarations included in the create table statement must have already
been declared through the issuance of a prior create domain statement.
The table_name is a user-specified identifier that names the table. The table's contents consist of the
columns that are declared within it. Columns are declared to be of a specific data type which is either explicitly
given or specified through use of a previously declared domain name. A default value can also optionally be specified unless the column was declared with a domain type.
The distinct values clause specifies the number of distinct values that will be stored in this column. The range
clause specifies the minimum and maximum values that will be stored in the column. These two clauses provide
information that is used only by the RDM SQL query optimizer to determine the best possible
execution plan for a query. Note that these clauses do not specify column validation checks: it is still possible
to store values that are outside of the specified range.
Columns can be specified with one or more constraints which declare the column to be:
• not null: null values are not allowed for the column,
• a primary/unique or non-unique key: an index on the column is created automatically,
• a foreign key that references the primary/unique key of the specified table.
Columns declared as not null will cause any insert or update statement that attempts to assign a null value to
that column to return an error.
Foreign key references are automatically implemented by RDM SQL for quick access and maintenance of referential integrity [1]. A triggered_action can be specified with foreign key columns in order to indicate what should
happen when the referenced row is updated or deleted. The default action is restrict, meaning that primary key
rows that have existing foreign key references cannot be updated/deleted. If on ... cascade is specified, then all
of the referencing rows are updated or deleted when the primary key row is updated (i.e., the primary key column
value) or deleted. Note that the referencing table may itself have a primary key declared that is referenced by foreign keys in other tables that may not have a cascade triggered action specified. Thus, a delete of the referenced
row of a cascade-delete-allowed table may be denied due to a restrict foreign key on a row of a referencing table.
If on ... set null is specified, then all of the referencing foreign key columns will be set to null. This option is not
allowed when the foreign key column has been declared as not null.
A key_def on a table is used to declare primary/unique/non-unique keys and foreign keys on one or more columns. The [primary | unique] key clause is used to identify the columns from the table on which a key is to be
formed. A table can have only one primary key. Keys that include the keysize clause will index a maximum of
only keysize number of bytes of the column values. By default keys are maintained in a B-tree index file which
maintains the keys in sorted order based on the data type of the columns comprising the key. You can also specify that a key be stored in a hash index which is designed for very fast lookups of specific keys but cannot be used
for sorting or range searches. The hash specification must include an estimate of the number of rows on which
the hash is to be based.
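The difference between a hash key and the default B-tree key can be sketched as follows. This is an illustration assembled from the key_spec and key_def syntax above; the table part, its columns, the key name part_by_bin, and the row estimate of 10000 are hypothetical.

```sql
create table part(
    -- Hash index: intended for fast exact-match lookups on part_no.
    -- The "of 10000 rows" clause supplies the required row estimate.
    part_no char(12) primary key hash of 10000 rows,
    descr varchar(80),
    bin_loc char(8),
    -- Default B-tree index: kept in sorted order, so it can serve
    -- sorting and range searches that a hash key cannot.
    key part_by_bin(bin_loc, part_no)
);
```

A lookup such as where part_no = "..." is the kind of access the hash key is designed for, while an order by bin_loc or a range condition on bin_loc would need the B-tree key.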
The contents (rows) of each table are contained in a separate RDM data file. Each key is contained in a separate
RDM key file. The values for each blob type column are stored in a separate RDM blob file.
A pagesize value that differs from the default pagesize (see create database) can be specified. You can also
specify that the table's file is inmemory. The read, persistent, and volatile options control whether the table is
read from disk when the database is opened (read, persistent), and whether changes to the table are written to
the disk when the database is closed (persistent). The default is volatile meaning that the table is created empty
each time it is opened. The read option means that the entire table is read from the file when the database is
opened, changes to the table are allowed but are not written back to the file on closing. The persistent option
means that the entire table is read on opening and all changes that were made while the database was open are
written back to the table's file when the database is closed.
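A table-level override of these file options might look like the sketch below. It follows the standard_table syntax above; the table lookup_code and its columns are hypothetical, and the page size value is an arbitrary choice for illustration.

```sql
-- Hypothetical reference table kept in memory. With "read", the rows
-- are loaded from the table's file when the database is opened;
-- changes are allowed but are not written back on close.
create table lookup_code(
    code char(4) primary key,
    meaning varchar(60)
) pagesize = 2048 inmemory read;
```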
A circular table is one which has a fixed number of rows as specified by the maxrows clause (which is required
when circular is specified). An insert into a circular table inserts the specified row into the next row position in the
table. Once maxrows rows have been inserted, the next row is written to the first row position in the table, overwriting the
original row value. Circular tables are useful for storing time-dependent information such as log entries, operational status records, and so on. Note that foreign key references to a circular table are not allowed.
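A circular table declaration can be sketched as follows, using only the standard_table syntax above. The table event_log, its columns, and the limit of 1000 rows are hypothetical choices for illustration.

```sql
-- Hypothetical log table that keeps only the most recent 1000 rows.
-- Per the description above, the 1001st insert is expected to
-- overwrite the row in the first row position.
create circular table event_log(
    logged_at timestamp not null,
    severity tinyint,
    message varchar(120)
) maxrows = 1000;
```

Because foreign key references to a circular table are not allowed, a table like this is best kept free-standing rather than used as the target of a references clause.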
The data type 'date' assumes the Gregorian calendar even for dates prior to the introduction of the Gregorian calendar. This means that databases that store historical dates prior to the introduction of the Gregorian calendar may not compute select with date ranges, dayofweek, and week correctly.
[1] Declared foreign and primary key relationships are implemented using RDM core-level sets.
Example
create database sales;
create domain money as double;
create table product
(
prod_id smallint primary key,
prod_desc char(39) not null,
price money range 11.95 to 12495.00,
cost money range 5.5 to 8800.00,
key prod_pricing(price, prod_id)
);
create table outlet
(
loc_id char(3) primary key,
city char(17) not null,
state char(2) distinct values = 11 range "AZ" to "WA" not null,
region smallint distinct values = 4 range 0 to 3 not null,
key loc_geo(state, city)
);
create table on_hand
(
loc_id char(3) not null
references outlet(loc_id),
prod_id smallint not null
references product,
quantity smallint not null,
primary key(loc_id, prod_id)
);
create table salesperson
(
sale_id char(3) primary key,
sale_name char(30) not null,
dob date,
commission double,
region smallint distinct values = 4 range 0 to 3 not null,
sales_tot money,
office char(3) distinct values = 12,
mgr_id char(3)
references salesperson on delete set null on update cascade,
key sales_region (region, office)
);
create table customer
(
cust_id char(3) primary key,
company char(30) not null,
contact char(30),
street char(30),
city char(17),
state char(2) distinct values = 50,
zip char(5),
orders_tot money,
sale_id char(3)
references salesperson on delete set null on update cascade
);
create table sales_order
(
cust_id char(3)
references customer on delete set null on update cascade,
ord_num smallint primary key,
ord_date date,
ord_time time,
amount money,
tax double default 0.0,
key order_ndx(ord_date, amount, ord_time)
);
create table item
(
ord_num smallint not null
references sales_order on delete cascade on update cascade,
prod_id smallint not null
references product on update cascade,
loc_id char(3) distinct values = 12 not null
references outlet on update cascade,
quantity smallint not null
);
create table note
(
note_id char(12) not null,
note_date date not null,
sale_id char(3) distinct values = 14 not null,
cust_id char(3)
references customer on delete cascade on update cascade,
unique key(sale_id, note_id, note_date)
);
create table note_line
(
note_id char(12) not null,
note_date date not null,
sale_id char(3) distinct values = 14 not null,
txtln char(81) not null,
foreign key(sale_id, note_id, note_date)
references note(sale_id, note_id, note_date)
on delete cascade on update cascade
);
See Also
create database
create virtual table
Create a virtual table for an external data source
Syntax
virtual_table:
create virtual [read only] table table_name (
vcolumn_def[, vcolumn_def]…
)
vcolumn_def:
column_name base_type
[distinct values = num] [range constant to constant]
[primary key]
base_type:
{character | char } [(length)]
|
{{character | char} varying | varchar } (length)
|
binary [(length)]
|
{double [precision] | float | real }
|
{ tinyint | smallint | int | integer | long | bigint}
|
date | time | timestamp
Description
An RDM SQL virtual table allows almost any kind of external data to be accessed as an SQL
table. It is defined through a combination of the create virtual table statement and a set of user-developed C functions that conform to a particular interface specification. A pointer to a pre-defined structure array, which contains
an entry for each virtual table holding the addresses of that table's virtual table interface functions, is passed into
SQL through a call to rsqlRegisterVirtualTables before the database is opened. SQL then calls these functions
at the appropriate times during the execution of any SQL statement that references the virtual
table.
The read only option indicates that the table can only be referenced in a select statement.
Only single-column primary keys are allowed and only one column in the table can be declared to be the primary
key. SQL will call the vtLookup virtual table interface function to handle single-valued lookups from a where conditional of the form "pkeycol = value".
In a DDL specification, all create virtual table statements must come after all standard create table statements
for the database have been submitted.
Example
create database weather_db;
create table sensor_location(
longitude integer,
latitude integer,
sensor_id bigint,
descr char(48),
county char(24),
state char(2),
primary key loc_id(longitude, latitude)
);
create table weather_summary(
longitude integer,
latitude integer,
rdg_date date,
hour_of_day smallint,
avg_temp smallint,
avg_press smallint,
avg_hum smallint,
avg_lumens smallint,
foreign key (longitude, latitude) references sensor_location
);
create virtual read only table weather_data(
sensor_id bigint primary key,
loc_long integer,
loc_lat integer,
rdg_time timestamp display(19, "yyyy-mon-dd hh:mm:ss"),
temperature smallint range -10 to 100,
pressure smallint,
humidity smallint,
light smallint,
power integer
);
See Also
rsqlRegisterVirtualTables
delete
Delete one or more rows from a table
Syntax
delete_stmt:
delete from [db_name.]table_name
[where {conditional_expr | current of cursor_name}]
conditional_expr:
rel_expr [bool_oper rel_expr]...
rel_expr:
expression [not] rel_oper expression
|
expression [not] between constant and constant
|
expression [not] in (constant[, constant]...)
|
column_ref is [not] null
|
string_expr [not] like "string"
|
not rel_expr
|
( conditional_expr )
rel_oper:
= | ==
|
<
|
>
|
<=
|
>=
|
<> | != | /=
bool_oper:
& | && | and
|
"|" | "||" | or
Description
This statement deletes one or more rows from table table_name. Two types of delete are supported. In a
searched delete, the delete statement deletes all rows of the table that satisfy the conditional expression (conditional_expr) specified in the where clause. In a positioned delete, the delete statement deletes the current row
associated with the specified cursor (cursor_name) in the where current of clause. The cursor_name must have
been established through a prior call to either rsqlGetCursorName or rsqlSetCursorName on a compiled, updateable select statement associated with a separate statement handle.
Deleting rows that are referenced by foreign key rows in other tables will either succeed or fail depending on the cascade or
restrict settings of the related foreign key specifications. If all of the referencing foreign keys specify cascade,
then all of the referencing rows are deleted in addition to the rows from this table. However, if the
restrict option is specified and referencing rows exist, the delete fails with a referential integrity error.
Note also that while a foreign key to this table may have cascade set, a foreign key to the referencing table may
itself have restrict set, in which case the cascaded deletion causes the delete to fail with a referential integrity
constraint violation.
A call to rsqlGetRowCount after a successful execution of delete will return the count of all rows from all
affected (i.e., cascaded) tables that were deleted.
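The difference between cascade and restrict can be tried out in any SQL engine that enforces foreign keys. The sketch below uses Python's built-in sqlite3 module as a stand-in, so the SQL is SQLite's dialect rather than RDM SQL, and the simplified tables are assumptions made for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    create table sales_order(ord_num integer primary key);
    create table item(
        ord_num integer not null references sales_order on delete cascade
    );
    create table product(prod_id integer primary key);
    create table on_hand(
        prod_id integer not null references product on delete restrict
    );
    insert into sales_order values (1);
    insert into item values (1);
    insert into item values (1);
    insert into product values (7);
    insert into on_hand values (7);
""")

# cascade: deleting the order also deletes its two item rows.
conn.execute("delete from sales_order where ord_num = 1")
items_left = conn.execute("select count(*) from item").fetchone()[0]

# restrict: the delete is rejected because an on_hand row still refers to it.
try:
    conn.execute("delete from product where prod_id = 7")
    restrict_blocked = False
except sqlite3.IntegrityError:
    restrict_blocked = True
products_left = conn.execute("select count(*) from product").fetchone()[0]

print(items_left, restrict_blocked, products_left)
```

The first delete removes the referencing rows along with the order, while the second fails and leaves the product row in place.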
Example
delete from book where date_sold < date "2003-01-01";
...
delete from sponsor where state < "A" or state > "Z";
...
delete from person where current of SQL_CUR_f3f0_08b0;
See Also
select
update
rsqlGetCursorName
rsqlSetCursorName
rsqlGetRowCount
drop database
Drop (delete) a database
Syntax
dropdb_stmt:
drop database {db_name | "db_name@tfs_spec"}
tfs_spec:
"HostComputerName[:ddddd]"
Description
The drop database statement can be used to drop (i.e., delete) the database named db_name. The string form
must be used if it is necessary to identify the TFS on which the database is located. The tfs_spec is a string
specifying the network location of the TFS, where HostComputerName is the name of the host computer and ddddd is the five
digit TCP/IP port number on which that TFS is listening (default is 21553).
If the database is open you only need to specify the db_name and then execution of the drop database statement will close it. The database remains closed even when the drop database statement fails (except for errTRACTIVE).
Status errNODB is returned if the database cannot be found. Status errDBINUSE is returned if another task or
user has the database open. Status errTFSFAILURE is returned when a connection to the specified TFS cannot be made.
Execution of a drop database completely deletes the database and is irrecoverable (i.e., a rollback statement
cannot undo a drop database).
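An application that builds or checks the string form before submitting the statement needs to split it into its parts. The helper below is a hypothetical sketch of that split, based only on the syntax and default port given above; parse_drop_target is not an RDM function.

```python
DEFAULT_TFS_PORT = 21553  # default port stated in the description


def parse_drop_target(spec):
    """Split 'db_name[@HostComputerName[:port]]' into (db_name, host, port)."""
    db_name, at_sign, tfs_spec = spec.partition("@")
    if not at_sign:
        return db_name, None, None          # plain db_name, no TFS given
    host, colon, port_text = tfs_spec.partition(":")
    port = int(port_text) if colon else DEFAULT_TFS_PORT
    return db_name, host, port


print(parse_drop_target("nsfawards@nsfTFS:21695"))
print(parse_drop_target("nsfawards@nsfTFS"))
print(parse_drop_target("bookshop"))
```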
Example
open bookshop;
drop database bookshop;
drop database "nsfawards@nsfTFS:21695";
See Also
initialize
drop procedure
Drop a stored procedure
Syntax
drop_proc_stmt:
drop proc[edure] proc_name
Description
This statement can be used to drop (delete) a stored procedure from its database's document directory on the
TFS.
Example
create procedure getacct(mid char) as
select * from acctmgr where mgrid = mid
end proc;
...
execute getacct("JOE");
...
drop proc getacct;
See Also
create procedure
end read only transaction
End a read only transaction
Syntax
end_trans_stmt:
end read only trans[action]
Description
This statement is used to terminate a read only transaction.
Example
start transaction read only;
select * from book;
end read only trans;
See Also
commit
rollback
start transaction
execute
Execute a stored procedure
Syntax
execute_stmt:
[exec[ute] | run] proc_name [(constant[, constant]...)]
Description
The execute statement will execute the stored procedure named proc_name. An argument value, constant, of
the proper data type must be specified for each argument that was declared in the create procedure statement
for proc_name. Specification of the execute keyword is optional. Thus, the procedure can be invoked simply by
specifying proc_name followed by the argument values enclosed in parentheses.
When executing a modification or transactional stored procedure, either all or none of the changes by the procedure's insert, update, and delete statements will be made. If an error occurs (e.g., a referential integrity error)
during execution of any one of the included statements then all changes made since the start of the procedure
will be discarded.
For retrieval stored procedures that contain more than one select statement, rsqlMoreResults must be
called to execute each subsequent select after the first. After the last select has returned errNOMOREDATA, a
call to rsqlMoreResults will also return errNOMOREDATA indicating that the last select has been executed.
Example
create proc authors_books(lastnm char) as
select publ_yr, title from book where last_name = lastnm
end proc;
...
authors_books("PotterB");
PUBL_YR TITLE
1903 The Tailor of Gloucester
1903 The tale of Squirrel Nutkin
1904 The tale of Benjamin Bunny
1904 The tale of Peter Rabbit; thirty-one illustrations.
1905 The pie and the patty-pan.
1905 The tale of Mrs. Tiggy-Winkle
1906 The tale of Mr. Jeremy Fisher
1908 The tale of Jemima Puddle-Duck
1907 The tale of Tom Kitten
1911 The tale of Timmy Tiptoes
1912 The tale of Mr. Tod
1913 The tale of Pigling Bland
1918 The tale of Johnny Town-mouse
...
create procedure sold(pid char, bid char, offer double, sale_date date) as
start transaction
update book set price = offer, date_sold = sale_date where bookid = bid
insert into sale values bid, pid
commit
end proc;
...
execute sold("SMD", "potter08", 750.0, date "2011-04-03");
...
See Also
create procedure
export
Export select statement result rows into a file
Syntax
export_stmt:
export into [char | wchar | xml] file "filename" from select_stmt
Description
The export statement is used to store the result rows from a select statement in a comma-delimited character file (file or char file), a comma-delimited wide character (Unicode) file (wchar file), or an XML-formatted file (xml file).
The file identified by filename will be created on the remote SQL server if the application is connected to a remote
SQL server. Otherwise it will be created locally.
In XML format (xml file) the result column values are identified using XML attributes or tags to identify the column name with which the tagged value is associated. The columns can be in any order but all necessary columns must be included (i.e., columns declared as not null without a default value or which are declared as a
primary or unique key). Each row is bracketed between pairs of <ROW> and </ROW> tags. For each row column values are specified between pairs of <column_name> and </column_name> tags. The file begins with
a <RAIMA-SQL> tag and ends with a </RAIMA-SQL> tag.
Exporting to a comma-separated file can be done for any select statement, including one in which columns are
reordered or expressions are used instead of columns. When such a file is imported, the order of the values in the file must
match the order of the columns in the table into which they are imported.
Exporting to an XML file can also be done for any select statement. However, where an expression is used instead of
a column, the column name will not be meaningful. Such files cannot be imported without manually editing the column names.
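The XML layout described above is simple enough to reproduce by hand. The Python sketch below builds a document with the same <RAIMA-SQL>, <ROW>, and per-column tags from a list of rows. It is an illustration of the layout only: rows_to_raima_xml is a hypothetical helper, the exact whitespace RDM writes is not reproduced, and the representation of null values is not covered.

```python
import xml.etree.ElementTree as ET


def rows_to_raima_xml(column_names, rows):
    """Wrap rows in <RAIMA-SQL>/<ROW> tags, one child tag per column."""
    root = ET.Element("RAIMA-SQL")
    for row in rows:
        row_elem = ET.SubElement(root, "ROW")
        for name, value in zip(column_names, row):
            ET.SubElement(row_elem, name).text = str(value)
    return ET.tostring(root, encoding="unicode")


xml_text = rows_to_raima_xml(
    ["name", "city", "state"],
    [("UNAVCO, Inc.", "Boulder", "CO"),
     ("UNIAX Corporation", "Santa Barbara", "CA")],
)
print(xml_text)
```

Because each value is tagged with its column name, the tag names are only useful when the select list consists of plain columns, which is the limitation noted above for expressions.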
Example
export into file "acctmgrs.txt" from select * from acctmgr;
export into xml file "books.xml" from select * from book;
See Also
import
import
Import rows into a table from a file
Syntax
import_stmt:
import into table_name
from [char | wchar | xml] file "filename"
Description
The import statement is used to insert new rows into table table_name in database db_name. If db_name is not
specified, then the first table named table_name found in the set of currently opened databases will be used. The
file identified by filename must exist and be accessible on the remote SQL server if the application is connected to
a remote SQL server. Otherwise it must exist and be accessible locally.
The data must either be stored in a comma-delimited or XML format. A comma-delimited format (file, char file,
or wchar file) requires that each column value be specified in the order in which the columns are declared in the
table. Absence of a column value is indicated by a blank or empty entry (e.g., ",,"). Specify wchar if the file is
stored with wide characters. If none of char, wchar, or xml is specified, the format defaults to char.
In XML format (xml file) the column values are identified using XML attributes or tags to identify the column
name with which the tagged value is associated. The columns can be in any order but all necessary columns
must be included (i.e., columns declared as not null without a default value or which are declared as a primary
or unique key). Each row is bracketed between pairs of <ROW> and </ROW> tags. For each row column values
are specified between pairs of <column_name> and </column_name> tags. The file begins with a <RAIMA-SQL> tag and ends with a </RAIMA-SQL> tag.
Exporting to a comma-separated file can be done for any select statement, including one in which columns are
reordered or expressions are used instead of columns. When such a file is imported, the order of the values in the file must
match the order of the columns in the table into which they are imported.
Exporting to an XML file can also be done for any select statement. However, where an expression is used instead of
a column, the column name will not be meaningful. Such files cannot be imported without manually editing the column names.
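Before importing an XML file it can be useful to check that every row carries the necessary columns. The following Python sketch reads a file in the layout described above and reports rows that lack a required column. The function read_rows and the choice of required columns are assumptions for the example; RDM performs its own validation during import.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<RAIMA-SQL>
<ROW>
<name>UNAVCO, Inc.</name>
<city>Boulder</city>
<state>CO</state>
</ROW>
<ROW>
<state>CA</state>
<name>UNIAX Corporation</name>
</ROW>
</RAIMA-SQL>"""


def read_rows(xml_text, required_columns):
    """Return each <ROW> as a dict, rejecting rows that lack a required column."""
    root = ET.fromstring(xml_text)
    if root.tag != "RAIMA-SQL":
        raise ValueError("file must begin with a <RAIMA-SQL> tag")
    rows = []
    for row_elem in root.findall("ROW"):
        # Column tags may appear in any order within a row.
        row = {col.tag: (col.text or "") for col in row_elem}
        missing = [c for c in required_columns if c not in row]
        if missing:
            raise ValueError(f"row is missing required columns: {missing}")
        rows.append(row)
    return rows


rows = read_rows(SAMPLE, required_columns=["name"])
print(rows)
```

The second sample row lists its columns in a different order and omits city, which is accepted here because only name is treated as required.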
Example
The following statements are used to load the sample data contained in comma-delimited text files into the bookshop example database.
open database bookshop exclusive;
import into author from file "c:\bookshop\authors.txt";
import into book from file "c:\bookshop\books.txt";
import into genres from file "c:\bookshop\genres.txt";
import into subjects from file "c:\bookshop\subjects.txt";
import into related_name from file "c:\bookshop\names.txt";
import into genres_books from file "c:\bookshop\bookgens.txt";
import into subjects_books from file "c:\bookshop\booksubs.txt";
import into acctmgr from file "c:\bookshop\acctmgrs.txt";
import into patron from file "c:\bookshop\patrons.txt";
import into note from file "c:\bookshop\bnotes.txt";
import into note_line from file "c:\bookshop\bnotelines.txt";
import into note from file "c:\bookshop\pnotes.txt";
import into note_line from file "c:\bookshop\pnotelines.txt";
import into sale from file "c:\bookshop\sales.txt";
commit;
A portion of file sponsors.xml which can be used to load the sponsor table in the nsfawards database is
shown below.
<RAIMA-SQL>
...
<ROW>
<name>UNAVCO, Inc.</name>
<addr>3360 Mitchell Lane</addr>
<city>Boulder</city>
<state>CO</state>
<zip>80301</zip>
</ROW>
<ROW>
<name>UNIAX Corporation</name>
<addr>6780 Cortona Drive</addr>
<city>Santa Barbara</city>
<state>CA</state>
<zip>93117</zip>
</ROW>
<ROW>
<name>UNIVERSITY OF MICHIGAN</name>
<addr>2455 Hayward Street</addr>
<city>Ann Arbor</city>
<state>MI</state>
<zip>48109</zip>
</ROW>
<ROW>
<name>UNIVERSITY OF WISCONSIN MA</name>
<addr></addr>
<city></city>
<state> </state>
<zip> / </zip>
</ROW>
<ROW>
<name>UNT Hlth Sci Ctr at Fort W</name>
<addr>Camp Bowie at Montgomery</addr>
<city>Fort Worth</city>
<state>TX</state>
<zip>76107</zip>
</ROW>
<ROW>
<name>URS Group, Inc.</name>
<addr>566 El Dorado Street - 2nd Floor</addr>
<city>Pasadena</city>
<state>CA</state>
<zip>91101</zip>
</ROW>
<ROW>
<name>US Army Corps of Engineers</name>
<addr>Transatlantic Programs Center</addr>
<city>Winchester</city>
<state>VA</state>
<zip>22601</zip>
</ROW>
...
</RAIMA-SQL>
See Also
export
initialize
Initialize database
Syntax
init_db_stmt:
initialize [database] db_name
Description
The initialize statement can be used to (re)initialize the database named db_name. Execution of this statement
requires that the database has been opened in exclusive access mode and that it is the only database that is
open.
Note that this statement deletes the entire contents of the specified database, so make sure that is what you intend before you execute it. The initialize statement is not transactional, i.e., you cannot roll back the changes made by this statement.
Example
open database bookshop exclusive;
initialize bookshop;
...import bookshop tables
See Also
open
insert
Insert a row or rows into a table
Syntax
insert_stmt:
insert into [db_name.]table_name [(column_name[, column_name]... )] data_source
data_source:
values value_expr[, value_expr]...
|
[from] select_stmt
value_expr:
value_operand [{+ | - | * | /} value_operand]…
value_operand:
constant | arg_name | column_name | ? | scalar_fcn | ( value_expr )
Description
The insert statement is used to insert new rows into table table_name in database db_name. If db_name is not
specified, then the first table named table_name found in the set of opened databases starting from the most
recently opened will be used.
If a column_name list is not specified, the values must be listed in the same order as the columns have been
declared in the create table statement for table_name.
Two forms of the insert statement are available. Use of the values clause specifies the values of the columns of
the single row to be inserted into table_name. If a select_stmt is specified, it must return the same number of result columns as either the specified column_name list or, if no list is given, the columns declared in the table. The data
type of each expression result in the values list or of each select statement result column must be compatible
with the corresponding table column's data type.
Column names can be referenced in a values expression but only one column reference in a value_expr is
allowed and the referenced column's value_expr itself cannot contain a column reference.
The arg_name value_operand only applies if the insert statement is part of a create procedure statement.
Example
insert into author values "BarrieJ", "Barrie, J. M. (James Matthew)", "M", 1860,
1937,
"Scottish author and dramatist, best remembered today as the creator of Peter
Pan.";
insert into book values "descartes01", "DescartesR", "Principia philosophiae",
"12 p.l., 310 p. illus., diagrs. 21 cm.", "Amstelodami, apud Ludovicum Elzevirium",
1644, "B1860 1644", date "2010-09-22", null, 1.20*cost, 12750.0;
...
insert into se_tfs.nsforg select * from ne_tfs.nsforg;
...
insert into person(name) values "Unknown, Manager";
See Also
delete
update
lock table
Explicitly lock one or more database tables
Syntax
lock_stmt:
lock table [in db_name] table_lock[, table_lock]...
table_lock:
table_name [read | write | default]
Description
The lock table statement can be used to explicitly lock one or more tables contained in any of the databases currently open in the connection in which this statement is executed. The in db_name clause can be specified to
identify the specific database that contains the listed tables in the event that more than one database is open that
have duplicate table names.
If neither read nor write is specified, then read is the default outside of a transaction and write is the default inside
a transaction. The request is all-or-none: either every lock in the statement is granted or none is. This can
be used to prevent a deadlock in which one process holds a lock on table A while requesting a lock on
table B, and a second process holds a lock on table B while requesting a lock on table A.
Write lock requests issued when a transaction is not active will return an error. If a read only transaction is active
then the lock request will also return an error.
The system will switch into explicit locking mode on execution of the first lock table statement. In this mode, all
tables that are accessed by any subsequent SQL statements must be explicitly locked. If not, SQL will return an
errNOTLOCKED status. Note that the values of foreign key columns are retrieved from the referenced row in the
primary key table (RDM SQL does not actually store them in the foreign key table). Hence, both the foreign and
primary key tables must be explicitly locked when accessing foreign key column values. Once all explicitly locked
tables have been freed, the system will switch back into implicit locking mode.
Read-locked tables can be freed by the unlock table statement. Write-locked tables can only be freed by a commit or rollback. Execution of a commit or rollback statement outside a transaction can also be used to free all
read-locked tables.
Explicit locking allows you to issue a single grouped lock request at the beginning of a transaction that involves
modifications to more than one table in order to ensure that the transaction will not cause a deadlock situation to
arise. With implicit locking, the lock requests are made by execution of each insert, update, and delete statement
which can potentially create a deadlock situation. Alternatively, you can use transactional stored procedures with
implicit locking to achieve the same deadlock free guarantee.
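The reason a grouped, all-or-none request avoids deadlock can be modeled with ordinary in-process locks. The Python sketch below is a conceptual model only: lock_all_or_none is a hypothetical function, the threading locks stand in for table locks, and the way RDM waits for or times out a lock request is not represented.

```python
import threading


def lock_all_or_none(locks):
    """Try to take every lock without waiting; give back all of them on any failure."""
    acquired = []
    for lock in locks:
        if lock.acquire(blocking=False):
            acquired.append(lock)
        else:
            # One lock is unavailable, so release everything taken so far.
            for held in reversed(acquired):
                held.release()
            return False
    return True


table_a = threading.Lock()
table_b = threading.Lock()

table_b.acquire()                       # another task already holds table B
first_try = lock_all_or_none([table_a, table_b])
a_left_free = not table_a.locked()      # table A is not held after the failure

table_b.release()                       # the other task finishes
second_try = lock_all_or_none([table_a, table_b])
print(first_try, a_left_free, second_try)
```

Because a failed request never leaves a partial set of locks behind, no task can end up holding table A while waiting for table B.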
NOTE: When using the Standalone TFS Configuration, lock requests are ignored because the database is opened exclusively.
Example
start trans;
lock table acctmgr, patron;
insert into patron values "RLM","Merilatt, Randy", ..., "KATE";
commit;
See Also
unlock table
create procedure
open
Open a database
Syntax
open_db_stmt:
open [database] db_spec
[[in] {share | read only | exclusive} [mode] | as union of tfs_spec[, tfs_spec]...]
db_spec:
db_name | "[pathspec/]db_name"
tfs_spec:
"HostComputerName[:ddddd]"
Description
Databases are normally intended to be opened through calls to the RDM SQL API function rsqlOpenDB. The
open statement provides an alternative that can be helpful when doing ad hoc testing using a utility such as
rdmsql. The database to be opened is specified by the identifier db_name. The string form of the db_spec can
have a path (subdirectory or IP address) prefixed to the db_name. If no other options are specified, the database
is opened in shared mode on the default Transaction File Server (TFS). The open mode can be explicitly specified as share, read only, or exclusive. If exclusive is specified, the open succeeds only when no other task has the database
open. If read only is specified, the database can only be accessed by select statements, and any attempt to start a transaction or execute an insert, update, or delete statement will return an error.
Different instances of database db_name that are stored on separate TFSs can be opened as a union by specifying the host computer and port number of each TFS. The tfs_spec is a string specifying the network location of the TFS, where HostComputerName is the name of the host computer and ddddd is the TCP/IP port number on which
that TFS is listening. Each database is opened in read-only mode. Access to the content of the databases must
be made through normal select statements that are executed inside a read-only transaction. Note that a database union is a union of different instances of the same database schema (i.e., definition) contained on separate
TFSs. This is not to be confused with the standard SQL union of select statements operation.
NOTE: If the pathspec or HostComputerName is specified, the database specification must be quoted.
Example
open bookshop exclusive;
insert into author values "BarrieJ", "Barrie, J. M. (James Matthew)", "M", 1860,
1937,
"Scottish author and dramatist, best remembered today as the creator of Peter
Pan.";
...
open nsfawards as union of "Northeast_TFS:1650", "Southeast_TFS:1650",
"Midwest_TFS:1650", "West_TFS:1650";
start read only transaction;
select state, sum(amount) from award join sponsor on sponsor_nm = name group by
state;
See Also
start transaction
release
Release a transaction savepoint
Syntax
release_stmt:
release savepoint savepoint_id
Description
The release statement is used to release a transaction savepoint identified by savepoint_id that was established
by a prior execution of a savepoint statement. Once a savepoint is released, all of the changes made since that
savepoint can only be discarded by a rollback of the entire transaction.
Of course, this statement requires that a transaction has been started and that a savepoint has been executed
for the specified savepoint_id.
Savepoints are also discarded through execution of a rollback to a prior savepoint, or a rollback or commit of
the transaction.
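The effect of releasing a savepoint can be observed in any engine that supports savepoints. The sketch below uses Python's built-in sqlite3 module as a stand-in; the statements are SQLite's dialect, not RDM SQL, and the single-column patron table is an assumption made for the example.

```python
import sqlite3

# isolation_level=None leaves transaction control to the explicit statements below.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("create table patron(name text)")

conn.execute("begin")
conn.execute("savepoint new_patron")
conn.execute("insert into patron values ('Merilatt, Randy')")
conn.execute("release savepoint new_patron")

# After the release the savepoint can no longer be a rollback target.
try:
    conn.execute("rollback to savepoint new_patron")
    savepoint_still_exists = True
except sqlite3.OperationalError:
    savepoint_still_exists = False

# The insert can now only be discarded by rolling back the whole transaction.
conn.execute("rollback")
rows_after_rollback = conn.execute("select count(*) from patron").fetchone()[0]
print(savepoint_still_exists, rows_after_rollback)
```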
Example
start trans;
insert into acctmgr ... new account manager
savepoint new_patron;
insert into patron ... new patron for new acct manager
insert into patron ... another for the new acct manager
... no problems encountered
release savepoint new_patron;
... other changes
commit;
See Also
savepoint
rollback
Rollback (undo) a transaction's changes
Syntax
rollback_stmt:
rollback [work] [[to savepoint] savepoint_id]
Description
The rollback statement discards (undoes) all changes that have been made to any open databases since the
most recent start transaction statement or, if no start was issued, since the last commit or rollback statement
was executed, or, if no start, commit, or rollback has been issued, since the start of the session.
This statement can also be used to roll back the changes that have been made since the savepoint specified by savepoint_id was issued.
This statement is also used to terminate a read only transaction.
Example
start transaction;
... /* make some changes to the database */
... /* system detects invalid data */
rollback;
See Also
commit
start transaction
rsqlTransRollback
rsqlTransEndReadOnly
savepoint
Mark a transaction savepoint
Syntax
savepoint_stmt:
savepoint savepoint_id
Description
The savepoint statement is used to mark a transaction savepoint identified by savepoint_id that can be the target
of a subsequently executed rollback [to savepoint] savepoint_id statement which will cause all of the database
modifications made after this savepoint to be discarded while keeping intact all changes made in the transaction
prior to this savepoint.
Of course, this statement requires that a transaction has been started.
Savepoints are discarded through execution of a release savepoint statement, a rollback to a prior savepoint,
or a rollback or commit of the transaction.
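A partial rollback keeps the work done before the savepoint and discards the work done after it. The sketch below demonstrates this with Python's built-in sqlite3 module as a stand-in; the SQL is SQLite's dialect rather than RDM SQL, and the simplified acctmgr and patron tables are assumptions made for the example.

```python
import sqlite3

# isolation_level=None leaves transaction control to the explicit statements below.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("create table acctmgr(mgrid text)")
conn.execute("create table patron(name text)")

conn.execute("begin")
conn.execute("insert into acctmgr values ('KATE')")      # before the savepoint
conn.execute("savepoint new_patron")
conn.execute("insert into patron values ('first new patron')")
conn.execute("insert into patron values ('second new patron')")
conn.execute("rollback to savepoint new_patron")         # discard both patrons
conn.execute("commit")

acctmgr_rows = conn.execute("select count(*) from acctmgr").fetchone()[0]
patron_rows = conn.execute("select count(*) from patron").fetchone()[0]
print(acctmgr_rows, patron_rows)
```

The account manager row survives the commit, while both patron rows inserted after the savepoint are gone.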
Example
start trans;
insert into acctmgr ... new account manager
savepoint new_patron;
insert into patron ... new patron for new acct manager
insert into patron ... another for the new acct manager
... discover problem with new patrons
rollback to savepoint new_patron;
commit;
See Also
release
rollback
select
Retrieve a set of rows of data from the database
Syntax
select_stmt:
select [first] [all | distinct] {* | select_item[, select_item]...}
from table_ref[, table_ref]...
[where conditional_expr]
[grouping | sorting | grouping sorting]
[limit (num {rows | mins | secs | msecs})]
[for {read only | update [of column_name[, column_name]...]}]
grouping:
group by sort_col[, sort_col]... [having conditional_expr]
sorting:
order by sort_col [asc | desc][, sort_col [asc | desc]]...
sort_col:
num | column_name
select_item:
expression [alias_name | "column heading"]
table_ref:
table_primary | table_join
table_primary:
table_spec | ( table_join )
table_spec:
[db_name.]table_name [[as] correlation_name]
table_join:
natural_join | qualified_join | cross_join
natural_join:
table_ref natural [inner | {left | right} [outer]] join table_primary
qualified_join:
table_ref [inner | {left | right} [outer]] join table_primary
[using (column_name[, column_name]...) | on conditional_expr]
cross_join:
table_ref cross join table_primary
arith_expr:
expression /* involving only numeric operands and operations */
dt_expr:
expression /* involving only date/time/timestamp operands and operations */
string_expr:
expression /* involving only string operands and operations */
expression:
operand [arith_operator operand]...
operand:
constant | param_ref | column_ref | function | (expr)
param_ref:
? | :param_name
column_ref:
[{table_name | correlation_name}.]column_name
arith_operator:
+ | - | * | /
function:
aggregate_fcn | scalar_fcn
aggregate_fcn:
{sum | avg | max | min} (expression)
|
count ({* | column_ref })
|
aggregate_udf_name ([expression][, expression]...)
scalar_fcn:
|
if (conditional_expr, expression, expression)
|
numeric_function | datetime_function | string_function
|
scalar_udf_name ([expression][, expression]...)
numeric_function:
abs(arith_expr)
|
acos(arith_expr)
|
asin(arith_expr)
|
atan(arith_expr)
|
atan2(arith_expr)
|
{ceil | ceiling}(arith_expr)
|
cos(arith_expr)
|
cot(arith_expr)
|
exp(arith_expr)
|
floor(arith_expr)
|
{ln | log}(arith_expr)
|
mod(arith_expr)
|
pi()
|
rand(num)
|
sign(arith_expr)
|
sin(arith_expr)
|
sqrt(arith_expr)
|
tan(arith_expr)
datetime_function:
age(dt_expr)
|
{curdate | current_date}()
|
{curtime | current_time}()
|
dayofmonth(dt_expr)
|
dayofyear(dt_expr)
|
hour(dt_expr)
|
minute(dt_expr)
|
month(dt_expr)
|
quarter(dt_expr)
|
second(dt_expr)
|
week(dt_expr)
|
year(dt_expr)
string_function:
ascii(string_expr)
|
char(num)
|
concat(string_expr, string_expr)
|
convert(expression, {convert_type | {char}, width, convert_format})
|
lcase(string_expr)
|
left(string_expr, num)
|
length(string_expr)
|
locate(string_expr, string_expr, num)
|
ltrim(string_expr)
|
repeat(string_expr, num)
|
replace(string_expr, string_expr, string_expr)
|
right(string_expr, num)
|
rtrim(string_expr)
|
substring(string_expr, num, num)
|
ucase(string_expr)
|
unicode(string_expr)
convert_type:
char | smallint | integer | real
|
double | date | time | timestamp | tinyint | bigint
convert_format:
numeric_format | datetime_format
numeric_format:
"[<< | >> | ><]['text' | $][- | (][#,]#[.#[#]...][e | E]['text' | $ | %]"
datetime_format:
"[<< | >> | ><]['text' | spchar | date_code | time_code]..."
date_code:
m | mm | mmm | mon | mmmm | month
|
d | dd | ddd | dddd | day
|
yy | yyyy
time_code:
h | hh | m | mm | s | ss | .f[f]... | [a/p | am/pm | A/P | AM/PM]
conditional_expr:
rel_expr [bool_oper rel_expr]...
rel_expr:
expression [not] rel_oper expression
|
expression [not] between constant and constant
|
expression [not] in (constant[, constant]...)
|
column_ref is [not] null
|
string_expr [not] like "string"
|
not rel_expr
|
( conditional_expr )
rel_oper:
= | ==
|
<
|
>
|
<=
|
>=
|
<> | != | /=
bool_oper:
& | && | and
|
"|" | "||" | or
Description
The select statement retrieves a subset of data (the result set) from a table or tables. The result set contains
rows that satisfy a conditional expression (where clause). If there is no condition for the where clause, the select
statement retrieves all rows from the table or tables. If the select statement includes a group by clause, only
rows that satisfy the where clause are reflected in grouping calculations.
A select first only returns the first row of the result set. A select distinct will eliminate duplicate rows from the
result set. Note that this requires the rows to be sorted first, which can be an expensive (i.e., time consuming)
operation, so select distinct should be avoided unless absolutely necessary. The default behavior is select
all which returns all of the rows of the result set.
The select_item expressions can optionally be given an alias or alternate column heading.
The natural join specification indicates that the join is to be performed based on the common columns (names
and types) from the two tables. The join is based on the columns from the table (or tables) specified on the left
side of "natural … join" with those columns from the table (or tables) on the right side that have the same name. A
natural left (right) outer join includes the results of the inner join plus those rows of the left (right) table that do not
have a corresponding matching row in the joined table. An inner join is the default so that the specification of "natural join" produces a natural inner join. For outer joins, "outer" does not need to be specified.
A qualified join is like a natural join except that it requires that the columns on which the join is to be formed be
explicitly specified. Two specification methods are provided. The using clause requires you to name the common
column names between the joined tables which are to be used to form the join allowing you to choose only the
matching columns on which you want the join formed. The on clause requires you to specify the join predicates
as conditional expressions exactly as they would be specified in the where clause. The on clause is necessary
whenever the join is to be performed between columns that do not have the same name.
A cross join is simply a cross product of the two tables where each row of the left table is joined with each row of
the right table so that the cardinality of the result (i.e., the number of result rows) is equal to the product of the cardinalities of the two tables. An on clause cannot be specified with a cross join. However, there is nothing that
restricts including join conditions in the where clause. In practice, there are very few times when a cross join is
needed and since it can be a very expensive operation that can potentially produce huge result sets, its use
should be avoided.
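The cardinality rule for a cross join can be illustrated outside of SQL with a minimal C sketch. The function name cross_join_rows is hypothetical and is not part of RDM; it simply pairs every left row with every right row and counts the pairs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of RDM: counts the rows a cross join of two
   tables would produce by pairing every left row with every right row. */
static size_t cross_join_rows(size_t left_rows, size_t right_rows)
{
    size_t count = 0;
    size_t l, r;

    for (l = 0; l < left_rows; ++l)
        for (r = 0; r < right_rows; ++r)
            ++count;            /* one result row per (left, right) pair */

    /* the loop total always equals the product of the two cardinalities */
    assert(count == left_rows * right_rows);
    return count;
}
```

For example, a 3-row table crossed with a 4-row table yields 12 rows, which is why even moderately sized tables can produce very large result sets.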
Parentheses are sometimes needed to group joins when more than two tables are involved in the
from clause. They are required when one table needs to be joined with two or more tables.
The group by clause defines a set of aggregate rows upon which computations are to be made. An aggregate
consists of those rows that have identical values in the columns that are named in the group by specification.
Each of the other selected columns should either have a unique value within each aggregate or be a computation
that uses one or more aggregate functions (sum, avg, min, max, count, or an aggregate UDF). Only one row
is reported for each aggregate resulting from the select.
The having clause is similar to the where clause in that it is used to conditionally select which resultant rows will
be reported. However, the having conditional expression is not evaluated until after the group by processing has
been performed. The conditional expression will include comparisons that typically involve the aggregate functions in the select column list.
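The difference in evaluation order between the where and having clauses can be sketched in C. This is only an illustration under stated assumptions, not how RDM SQL is implemented: the AWARD_ROW type and the groups_reported function are hypothetical, and the rows are assumed to be already ordered by the grouping column so that each aggregate is contiguous.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical detail row: one award amount for a named sponsor. */
typedef struct {
    const char *name;
    double      amount;
} AWARD_ROW;

/* Counts the groups that would be reported by a query shaped like
     select name, sum(amount) from ... where amount >= min_amount
     group by name having sum(amount) >= min_total
   Rows must already be ordered by name so that each group is contiguous. */
static size_t groups_reported(const AWARD_ROW *rows, size_t nrows,
                              double min_amount, double min_total)
{
    size_t i = 0, reported = 0;

    assert(rows != NULL || nrows == 0);
    while (i < nrows) {
        const char *group = rows[i].name;
        double sum = 0.0;
        int in_group = 0;   /* becomes 1 when a row passes the where test */

        /* walk every detail row of the current group */
        for (; i < nrows && strcmp(rows[i].name, group) == 0; ++i) {
            if (rows[i].amount >= min_amount) {   /* where: per detail row */
                sum += rows[i].amount;
                in_group = 1;
            }
        }
        /* having: evaluated once per group, after aggregation */
        if (in_group && sum >= min_total)
            ++reported;
    }
    return reported;
}
```

Rows rejected by the where test never reach the sum, while the having test sees only the finished total for each group.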
The limits clause can be specified to limit either the number of rows that are returned or the amount of time the
select statement is allowed to run. This feature is particularly useful when retrieving data from a virtual table
which may represent a never-ending source of data (such as from a weather sensor network).
The for read only clause will cause RDM SQL to execute the select statement within its own read only transaction which accesses a static, transaction-consistent version of the database at the time the select statement
executes and does not require any locking to be performed.
The for update clause indicates that the select statement is updateable by a positioned update on a separate
statement handle in the same connection that references the cursor name associated with this select. An
updateable select is one for which the select result expressions are only simple column names, only one table is
listed in the from clause, and no order by clause is specified. If an of column name list clause is specified then
only those select result columns can be updated. If the of column name list clause is not specified then any of the
select result columns can be updated. Any columns declared in the table can be referenced in the associated
update (i.e., used in the set assignment of one of the updateable columns). The cursor name associated with the
select statement can be set by a call to function rsqlSetCursorName or the system-generated cursor name
can be retrieved through a call to rsqlGetCursorName. The cursor name needs to be specified in the where
current of clause of the related positioned update statement.
Example
select name, sum(amount) from sponsor join award on sponsor_nm = name
group by name order by 2 desc;
...
select sum(if(gender="M",1,0)) men, sum(if(gender="F",1,0)) women
from award natural join investigator natural join person;
...
select loc_long, loc_lat, convert(rdg_time,date), hour(rdg_time),
avg(temperature), avg(pressure), avg(humidity), avg(light) from weather_data
group by 1,2,4
limit(4 hours);
...
select bookid, publ_year, last_name, title from book where publ_year < 1800;
...
select aucid, count(*) from auction natural join bid where start_date = curdate()
group by 1;
...
See Also
set read only transaction mode
update
set
Set an SQL operational parameter value
Syntax
set_option_stmt:
set timeout [to | =] constant
|
set autocommit [to | =] {on | off}
|
set read only trans[action] mode [to | =] {auto | manual}
|
set debug [to | =] {0 | 1}
Description
The set statement is used to set a variety of different RDM SQL operational parameters. The set currency, thousands, and decimal statements set the currency, thousands separator, and decimal symbols to be used in the
format_spec of the display clause of the create domain and create table statements and the convert string function. All of the parameter settings apply to the connection handle and, thus, all of the statement handles that have
been allocated on that connection.
The set timeout statement sets the number of seconds to wait for a locked table to become available. The default is 30 seconds. Setting timeout to -1 disables timeouts, which is not recommended. A timeout value of 0 will
cause lock requests to time out immediately when the requested lock is not available.
The set autocommit can be used to turn on or off autocommit mode. When autocommit is on, each insert,
update, and delete statement will automatically issue a transaction commit at the end of the statement unless a
transaction was explicitly started by the application prior to the statement's execution.
The read only transaction mode is set to manual by default. In manual mode, each select statement will issue
read lock requests on the tables to be accessed. In this mode, execution of a select statement can return an
errTIMEOUT status. When read only transaction mode is set to auto, select statements that are executed outside of a transaction will automatically execute a start transaction read only marking the beginning of a group of
related database reads in which the data being read has been "frozen" to its state at the time the transaction was
started. Changes made after this by other connections are not blocked but they are also not visible. When the
select statement completes (i.e., the cursor is closed), the read only transaction is automatically terminated.
The set debug statement can be used to enable the writing of files named "debug.ddd" into the current directory where ddd begins with "000" and increases monotonically. Each file contains information for a single compiled SQL select, update, or delete that is used by the RDM SQL query optimizer. At this time, this information is
only of particular use to Raima support engineers and its use is, therefore, discouraged.
Example
set read only transaction mode to auto;
set timeout to 5;
See Also
create table
start transaction
rsqlSetAutoCommit
rsqlSetReadOnlyTrmode
set column
Set column statistics or SQL type for core database column
Syntax
set_column_stmt:
set column [db_name.]table_name.column_name
[type [to | =] {date | time | timestamp | long | {varchar | wvarchar}}]
[distinct values = num]
[range
constant
to
constant]
|
set column stats [db_name.]table_name.column_name
[distinct values = num]
[range
constant
to
constant]
Description
The set column statement is used to specify an SQL-specific data type for a core (non-SQL) database and/or
specify table column statistics that can be used by the RDM SQL optimizer to make better access method
choices. (Note that the set column stats syntax is provided for compatibility with the earlier version of RDM
SQL.)
Two types of statistics can be specified. The number of distinct values specifies the approximate number of different values stored in the column. For example, a column of type smallint (16 bits) can theoretically contain 65,536 different values. If, however, the actual number of different values is considerably smaller, then that can have an
important impact on the access choices the optimizer might be inclined to make. Similarly, the range clause is
used to identify the range of values that the column can contain. Note that specifying the range only affects the
optimizer. It does not mean that the SQL system will check to ensure that only those values are stored in the column. The values specified in these two clauses are understood to be estimates and no problems are created
when, for example, a column value actually falls outside the specified range. The database in which the table column is declared must be opened when set column is called. The assigned values are only active for the duration
of the connection. However, you can use the create catalog statement to update the catalog with the new values.
The type clause can be used to specify an SQL-specific data type for a core database field. You can specify date
for a (32-bit) integer field but it must contain a valid DATE_VAL value (the number of elapsed days since Jan 1,
1 AD, which has the value 1). You can specify time for a (32-bit) integer field but it must contain a valid TIME_VAL
value (the number of elapsed seconds since midnight times 10,000). You can specify timestamp for a (64-bit)
bigint field but it must contain a valid TIMESTAMP_VAL value (DATE_VAL and TIME_VAL combined). Since
core databases do not differentiate between binary and character blob fields, you can also specify long varchar
or long wvarchar for a blob field.
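The DATE_VAL and TIME_VAL encodings described above can be sketched in C as follows. This is a minimal illustration, not the RDM implementation: the helper names date_val and time_val are hypothetical, and the day count assumes the proleptic Gregorian calendar, which the text above does not state. In an application, the packing functions listed under See Also are presumably the supported way to produce these values.

```c
#include <assert.h>
#include <stdint.h>

/* Gregorian leap year rule (ASSUMPTION: proleptic Gregorian calendar). */
static int is_leap_year(int year)
{
    return (year % 4 == 0 && year % 100 != 0) || year % 400 == 0;
}

/* Hypothetical helper: day number with Jan 1, 1 AD counted as day 1. */
static int32_t date_val(int year, int month, int day)
{
    static const int days_before_month[12] =
        { 0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334 };
    int32_t prev = year - 1;    /* number of complete years before 'year' */
    int32_t days;

    assert(year >= 1 && month >= 1 && month <= 12 && day >= 1 && day <= 31);
    days = prev * 365 + prev / 4 - prev / 100 + prev / 400;
    days += days_before_month[month - 1];
    if (month > 2 && is_leap_year(year))
        ++days;                 /* Feb 29 of the current year has passed */
    return days + day;
}

/* Hypothetical helper: seconds since midnight multiplied by 10,000. */
static int32_t time_val(int hour, int minute, int second)
{
    assert(hour >= 0 && hour < 24 && minute >= 0 && minute < 60
           && second >= 0 && second < 60);
    return (int32_t)((hour * 3600 + minute * 60 + second) * 10000);
}
```

The largest time of day, 23:59:59, encodes as 863,990,000, which fits comfortably in a 32-bit integer field. How the two values are combined into a TIMESTAMP_VAL is not specified here, so it is left out of the sketch.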
Example
open nsfawards;
set column nsfawards.person.gender distinct values = 3;
set column nsfawards.person.jobclass distinct values = 2;
...
open mycoredb;
set column coretab.blobfield type to long varchar;
See Also
create table
create catalog
rsqlPackDate
rsqlPackTime
rsqlPackTimestamp
start
Start a transaction
Syntax
start_stmt:
{start trans[action] | begin [work] [trans[action]]} [read only]
Description
The start transaction statement does just that: it begins a transaction. A transaction is defined as a group of
related database changes that are either committed (made permanent) or rolled-back (discarded) as a group.
This is necessary in order to maintain the logical consistency of the database content in case the system fails
(e.g., power failure) in the middle of the transaction. All database changes (insert, update, delete statement
executions) made after start are written in a single atomic operation upon execution of the commit statement.
The changes made after start can be discarded (e.g., in the event of a user input error) upon execution of the rollback statement.
Note that SQL will automatically start a transaction upon execution of the first insert, update, or delete statement where a start transaction has not already been executed.
The read only option extends the transaction concept beyond being just "a group of related database changes"
to being "a group of related database operations." A read only transaction marks the beginning of a group of
related database reads in which the data being read has been "frozen" to its state at the time the transaction was
started. Changes made by other connections are not blocked but they are also not visible to the connection issuing the start transaction read only statement until it is terminated by an end read only transaction, commit or
rollback (any of which can be used to end a read only transaction) statement. Read only transactions improve
total system throughput because they do not block (i.e., by issuing locks) database writers. However, it is important that read only transactions be short-lived as, due to implementation necessities, performance can degrade
over time.
Issuing a start transaction when a transaction is already active is not allowed.
If autocommit is enabled, the execution of a start transaction will disable autocommit until the next commit or
rollback is executed.
Example
...connection alpha...
start trans read only;
... issue a series of select statements
...meanwhile, over at connection omega...
start trans;
... issue a series of related insert, update, and delete statements
commit; -- alpha cannot see omega's changes
...back at alpha...
commit; -- ends alpha's read only transaction
... subsequent reads can now see omega's changes
See Also
commit
rollback
end read only transaction
unlock table
Explicitly unlock one or all read-locked database tables
Syntax
unlock_stmt:
unlock table {[db_name.]table_name | all}
Description
This statement will free the read lock on table table_name or will free all read locks from previously executed lock
table statements. This statement can only be executed outside of a transaction. The locks held within a transaction can only be freed through a transaction commit or rollback.
Example
lock table acctmgr, patron;
select * from acctmgr;
unlock table acctmgr;
select * from patron;
unlock table patron;
See Also
lock table
update
Update one or more rows in a table
Syntax
update_stmt:
update [db_name.]table_name
set column_name = expression[, column_name = expression]...
[where {conditional_expr | current of cursor_name}]
conditional_expr:
rel_expr [bool_oper rel_expr]...
rel_expr:
expression [not] rel_oper expression
|
expression [not] between constant and constant
|
expression [not] in (constant[, constant]...)
|
column_ref is [not] null
|
string_expr [not] like "string"
|
not rel_expr
|
( conditional_expr )
rel_oper:
= | ==
|
<
|
>
|
<=
|
>=
|
<> | != | /=
bool_oper:
& | && | and
|
"|" | "||" | or
Description
The update statement modifies the column values in one or more rows from the specified table table_name. The
statement sets the column values to the results of the specified expressions or null. Table columns that are referenced in the conditional_expr and in each expression can only come from table_name.
The update statement is capable of two types of updates: searched updates and positioned updates. In a
searched update, the update statement modifies all rows of the table that satisfy the specified conditional expression. A positioned update is specified using the where current of cursor_name clause. The cursor_name must
be that associated with an updateable select statement on another statement handle in the same connection
that has been compiled, executed, and fetched so that it is positioned on a valid row of its result set when the positioned update is executed. The columns that can be updated are only those that are specified in the select statement's for update clause. If no of column name list was specified there, then any of the select statement result
columns can be updated. Any columns declared in the table can be referenced in the associated update (i.e.,
used in the set assignment of one of the updateable columns). The cursor name associated with the select statement can be retrieved by a call to rsqlGetCursorName or set by the application through a call to rsqlSetCursorName in the RDM SQL API.
If a primary or unique key is referenced by foreign keys, the behavior of the update statement is determined
based on the on update clause specified in the create table. The default action (no on update clause specified) is
to restrict (i.e., disallow) updates on a primary or unique key column for which there exist one or more rows in the
referencing table with matching foreign key values. The on update restrict option explicitly specifies this same
behavior. If the foreign key is declared with on update cascade then the values of all matching foreign key rows
will be changed to the new primary or unique key value. Note that in RDM SQL this happens automatically with
very little negative performance impact.
Example
start trans;
update author set last_name = "BronteE" where last_name = "Bronte";
insert into author values "BronteC", "Bronte, Charlotte", "F", 1816, 1855,
"English novelist, one of the 3 sisters whose novels are English lit. standards.";
commit;
See Also
create table
select
rsqlGetCursorName
rsqlSetCursorName
SQL UDF Reference
Function        Description
udfInit         Initialize execution of a user-defined function
udfTerm         Terminate execution of a user-defined function
udfCheck        Check user-defined function argument types and return result type
udfScalarCall   Process call to a scalar user-defined function
udfAggCall      Process call to an aggregate user-defined function
udfAggResult    Fetch aggregate user-defined function result calculation
udfAggReset     Reset aggregate user-defined function grouping calculations
udfAggCall
Process call to an aggregate user-defined function
Prototype
RSQL_ERRCODE EXTERNAL_FCN udfAggCall(
    HSTMT             hStmt,
    void             *pFcnCtx,
    uint16_t          noargs,
    const RSQL_VALUE *pArgs)
Arguments
hStmt     (input)   Statement handle of SQL statement referencing this UDF.
pFcnCtx   (input)   Pointer to the user program allocated registration context data area.
noargs    (input)   Number of arguments specified in SQL statement's UDF call.
pArgs     (input)   Array of noargs argument value entries.
Description
The udfAggCall function is called by RDM SQL for each detail row from the current set of aggregate rows to perform the detail calculations needed by the aggregate function.
The hStmt argument is the statement handle of the SQL statement that contains a reference to the UDF. It can
be used by any of the UDF implementation functions to discover any needed information about the invoking statement (e.g., rsqlGetStmtType).
The pArgs argument is a pointer to an RSQL_VALUE array of noargs elements containing the value for each
argument. The first argument value is contained in pArgs[0]. Refer to the SQL Data Types and Values section
for details on the use of the RSQL_VALUE struct.
Example
#include <string.h>   /* strstr */
#include "rsql.h"
...
/* ======================================================================
   User function for matchcount() UDF
*/
static RSQL_ERRCODE EXTERNAL_FCN CntCall (
    HSTMT             hStmt,    /* in: system handle */
    void             *cxtp,     /* in: UDF context pointer */
    uint16_t          noargs,   /* in: number of arguments to function */
    const RSQL_VALUE *args)     /* in: array of arguments */
{
    COUNT_CTX *ccp = cxtp;

    UNREF_PARM(hStmt)
    UNREF_PARM(noargs)

    /* skip rows whose argument values are not available or null */
    if ( args[0].type != tNOVAL && args[1].type != tNOVAL ) {
        if (args[0].type != tNULL) {
            if ( (args[0].type != tCHAR && args[0].type != tVARCHAR)
              || (args[1].type != tCHAR && args[1].type != tVARCHAR) )
                ccp->stat = errUDFARG;   /* reported later by the result function */
            else {
                ccp->stat = errSUCCESS;
                /* count the row when the second string occurs in the first */
                if ( strstr(args[0].vt.cv, args[1].vt.cv) )
                    ++ccp->count;
            }
        }
    }
    return errSUCCESS;
}
Return Codes
Error Code   Enum Identifier   SQL State   Description
0            errSUCCESS        00000       no error was detected
83           errUDF            RX011       user-defined function error
86           errUDFARG         21000       invalid function argument type
See Also
rsqlRegisterUDFs
udfCheck
udfInit
udfTerm
udfScalarCall
udfAggResult
udfAggReset
udfAggReset
Reset aggregate user-defined function grouping calculations
Prototype
RSQL_ERRCODE EXTERNAL_FCN udfAggReset(
    HSTMT  hStmt,
    void  *pFcnCtx)
Arguments
hStmt     (input)   Statement handle of SQL statement referencing this UDF.
pFcnCtx   (input)   Pointer to the user program allocated registration context data area.
Description
The udfAggReset function is only used with aggregate UDFs. Its function is to reset the aggregated computational result to its initial value. The function is called by SQL at the beginning of execution and each time the
group by column values change.
Example
#include "rsql.h"
...
/* ======================================================================
   Reset function for matchcount() UDF
*/
static RSQL_ERRCODE EXTERNAL_FCN CntReset(
    HSTMT  hStmt,   /* in: system handle */
    void  *cxtp)    /* in: UDF context pointer */
{
    COUNT_CTX *ccp = (COUNT_CTX *)cxtp;

    UNREF_PARM(hStmt)

    ccp->count = 0;
    return errSUCCESS;
}
Return Codes
Error Code   Enum Identifier   SQL State   Description
0            errSUCCESS        00000       no error was detected
See Also
rsqlRegisterUDFs
udfCheck
udfInit
udfTerm
udfScalarCall
udfAggCall
udfAggResult
udfAggResult
Fetch aggregate user-defined function result calculation
Prototype
RSQL_ERRCODE EXTERNAL_FCN udfAggResult(
    HSTMT       hStmt,
    void       *pFcnCtx,
    RSQL_VALUE *pResult)
Arguments
hStmt     (input)    Statement handle of SQL statement referencing this UDF.
pFcnCtx   (input)    Pointer to the user program allocated registration context data area.
pResult   (output)   Pointer to the RSQL_VALUE variable to contain the result value.
Description
The udfAggResult function is called by RDM SQL during execution of the SQL statement containing the UDF
function reference to perform and return the desired aggregate calculation result. This function is designed to be
called once after all of the detail rows have been processed. However, at this time, RDM SQL actually calls this
function after each detail row has been fetched and after the udfAggCall function has been called. So, this function should never reset the aggregate computational value—that is the job of the udfAggReset function.
The result value needs to be returned in the RSQL_VALUE variable pointed to by the pResult output argument.
Note that for tCHAR/tVARCHAR result values the pResult->vt.cv is assigned to a pointer to a null-terminated char array for a character string result value. The memory containing the string must not be local to the
udfAggResult function as it will go out of scope as soon as the function returns. The memory needed for results
that are dynamic (e.g., character strings, binary arrays, etc.) will normally be contained or managed in the function context data area (pFcnCtx). Refer to the SQL Data Types and Values section for details on the use of the
RSQL_VALUE struct.
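The call order described above can be mimicked with a small, self-contained C sketch. It uses no RDM headers: COUNT_CTX here is a stand-in, and cnt_reset, cnt_call, cnt_result, and run_groups are hypothetical names that play the roles of udfAggReset, udfAggCall, udfAggResult, and the SQL engine's driver loop. It assumes rows arrive ordered by group.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for a UDF function context (hypothetical, no RDM headers used). */
typedef struct { int64_t count; } COUNT_CTX;

static void cnt_reset(COUNT_CTX *ctx)             { ctx->count = 0; }
static void cnt_call(COUNT_CTX *ctx, int matched) { if (matched) ++ctx->count; }
/* Only reports the running value; it must not clear ctx->count. */
static int64_t cnt_result(const COUNT_CTX *ctx)   { return ctx->count; }

/* Mimics the described call order for rows already ordered by group:
   reset at the start and on each group change, then call + result per row.
   Stores the last result seen for each group in totals[] and returns the
   number of groups. */
static size_t run_groups(const int *group, const int *matched, size_t nrows,
                         int64_t *totals, size_t max_groups)
{
    COUNT_CTX ctx;
    size_t i, ngroups = 0;

    for (i = 0; i < nrows; ++i) {
        if (i == 0 || group[i] != group[i - 1]) {
            assert(ngroups < max_groups);
            cnt_reset(&ctx);                        /* udfAggReset role */
            ++ngroups;
        }
        cnt_call(&ctx, matched[i]);                 /* udfAggCall role */
        totals[ngroups - 1] = cnt_result(&ctx);     /* udfAggResult role */
    }
    return ngroups;
}
```

Because the result function is invoked after every detail row, a version of cnt_result that zeroed the counter would report at most 1 for any group. Keeping the reset in its own function avoids that.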
Example
#include "rsql.h"
. . .
/* ======================================================================
   User function for matchcount() UDF
*/
static RSQL_ERRCODE EXTERNAL_FCN CntResult (
    HSTMT       hStmt,    /* in: system handle */
    void       *cxtp,     /* in: UDF context pointer */
    RSQL_VALUE *result)   /* out: result value */
{
    RSQL_ERRCODE stat;
    COUNT_CTX *ccp = (COUNT_CTX *)cxtp;

    UNREF_PARM(hStmt)

    if ( ccp->stat != errSUCCESS ) {
        result->type = tSMALLINT;
        result->vt.sv = (int16_t) ccp->stat;
        stat = errSQLERROR;
    }
    else {
        result->type = tBIGINT;
        result->vt.llv = ccp->count;
        stat = errSUCCESS;
    }
    return stat;
}
Return Codes
Error Code   Enum Identifier   SQL State   Description
0            errSUCCESS        00000       no error was detected
-2           errSQLERROR       RX002       internal SQL error
See Also
rsqlRegisterUDFs
udfCheck
udfInit
udfTerm
udfScalarCall
udfAggCall
udfAggReset
udfCheck
Check user-defined function argument types and return result type
Prototype
RSQL_ERRCODE EXTERNAL_FCN udfCheck(
    HSTMT             hStmt,
    void             *pRegCtx,
    uint16_t          noargs,
    const RSQL_VALUE *pArgs,
    SQL_T            *pType,
    int16_t          *pDeterm)
Arguments
hStmt     (input)    Statement handle of SQL statement referencing this UDF.
pRegCtx   (input)    Pointer to the user program allocated registration context data area.
noargs    (input)    Number of arguments specified in SQL statement's UDF call.
pArgs     (input)    Array of noargs argument value entries.
pType     (output)   Pointer to variable to contain the data type of the UDF result value.
pDeterm   (output)   Pointer to int16_t variable to contain the deterministic UDF indicator flag.
Description
This function is called by SQL during compilation (i.e. rsqlPrepare) of a SQL statement that contains a reference to the user-defined function (UDF) for which this particular udfCheck function has been associated in the
UDFLOADTABLE specified in a prior call to the rsqlRegisterUDFs function. The function can have any name
you choose.
The hStmt argument is the statement handle of the SQL statement that contains a reference to the UDF. It can
be used by any of the UDF implementation functions to discover any needed information about the invoking statement (e.g., rsqlGetStmtType).
The pRegCtx is the registration context pointer that was passed by the application to the rsqlRegisterUDFs
function. This can be used to pass any necessary application-specific control information that may be needed by
any of the UDFs (e.g., a random number seed for any function that generates random numbers).
The pArgs argument is a pointer to an RSQL_VALUE array of noargs elements. The first argument is contained in pArgs[0]. Most of the time, only the data type from the pArgs RSQL_VALUE array (e.g., args[0].type) needs to be inspected, as the actual data value will only be present when a literal constant value is
being passed to the function. To find out which arguments have a literal value, check the status field of RSQL_VALUE (e.g., args[0].status). When a value is present the status will be set to vsOKAY; if
no value is present the status will be set to vsNOVAL. You can use this, for example, when you want to define an
argument for a particular function that is only allowed to take a literal constant. If an argument was specified
using a parameter marker or the argument is a stored procedure argument, the type will be tNOVAL, in which
case the actual type checking will need to be done at execution time by the udfScalarCall/udfAggCall function.
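The type and status tests described above might be combined as in the following self-contained C sketch for an argument that is only allowed to be a literal string. The DEMO_ types are stand-ins so that the sketch compiles without rsql.h; the real RSQL_VALUE layout and the numeric values of the t... and vs... codes may differ, and the check_literal_string function and its CHECK_ results are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for the rsql.h definitions so the sketch compiles on its own;
   the real header's layout and code values may differ. */
typedef enum { tNOVAL, tCHAR, tVARCHAR, tINTEGER } DEMO_TYPE;
typedef enum { vsNOVAL, vsOKAY } DEMO_STATUS;
typedef struct { DEMO_TYPE type; DEMO_STATUS status; } DEMO_VALUE;

/* Hypothetical outcomes of the compile-time check. */
typedef enum { CHECK_OK, CHECK_DEFER, CHECK_BAD_ARG } CHECK_RESULT;

/* Compile-time check for one argument that must be a literal string. */
static CHECK_RESULT check_literal_string(const DEMO_VALUE *arg)
{
    assert(arg != NULL);
    if (arg->type == tNOVAL)
        return CHECK_DEFER;     /* type unknown: recheck at execution time */
    if (arg->type != tCHAR && arg->type != tVARCHAR)
        return CHECK_BAD_ARG;   /* wrong data type */
    if (arg->status != vsOKAY)
        return CHECK_BAD_ARG;   /* no value present, so not a literal */
    return CHECK_OK;
}
```

Deferring the tNOVAL case is a design choice made for this sketch. A stricter check function could reject parameter markers outright when only a literal makes sense.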
The data type returned by the UDF is returned through the pType argument. The valid RDM SQL_T data type
values that can be returned by a UDF are specified in the table below.
Table 1. SQL Data Type Values
SQL Data Type    SQL_T value        C Data Type
char             tCHAR              char
varchar          tVARCHAR           char
wchar            tWCHAR             wchar_t
wvarchar         tWVARCHAR          wchar_t
binary           tBINARY            uint8_t
varbinary        tVARBINARY         uint8_t
boolean          tBOOL              int8_t
tinyint          tTINYINT           int8_t
smallint         tSMALLINT          int16_t
integer          tINTEGER           int32_t
bigint           tBIGINT            int64_t
real             tREAL              float
float, double    tFLOAT, tDOUBLE    double
date             tDATE              int32_t
time             tTIME              int32_t
timestamp        tTIMESTAMP         int64_t
The pDeterm argument is returned from udfCheck to indicate whether or not the function is deterministic. Setting *pDeterm to 1 indicates that the function is deterministic. Setting *pDeterm to 0 indicates that it is not. A
deterministic function always returns the same value for all calls that pass the same argument values. This
means that when all of the argument values for a particular call are literals then SQL will call udfInit, udfScalarCall, and udfTerm when the statement that references the UDF is compiled and then replace the call with
the literal result value in the compiled statement code.
Example
#include "rsql.h"
...
/* ======================================================================
   Soundex - type checking function (1 argument == name to be encoded)
*/
static RSQL_ERRCODE EXTERNAL_FCN SndxCheck(
    HSTMT             hStmt,     /* in: statement handle */
    void             *pRegCtx,   /* in: ptr to registration context */
    uint16_t          noargs,    /* in: number of arguments to function */
    const RSQL_VALUE *args,      /* in: array of argument values */
    SQL_T            *fcntype,   /* out: result data type */
    int16_t          *pDeterm)   /* out: = 1 deterministic */
{
    RSQL_ERRCODE status;

    UNREF_PARM(hStmt)
    UNREF_PARM(pRegCtx)

    if ( !args || noargs != 1 )
        status = errUDFNOARGS;
    else if ( args->type != tNOVAL && args->type != tCHAR && args->type != tVARCHAR )
        status = errUDFARG;
    else {
        status = errSUCCESS;
        *fcntype = tCHAR;
        *pDeterm = 1;
    }
    return status;
}
Return Codes
Error Code   Enum Identifier   SQL State   Description
0            errSUCCESS        00000       no error was detected
83           errUDF            RX011       user-defined function error
86           errUDFARG         21000       invalid function argument type
See Also
rsqlRegisterUDFs
udfInit
udfTerm
udfScalarCall
udfAggCall
udfAggResult
udfAggReset
udfInit
Initialize execution of a user-defined function
Prototype
RSQL_ERRCODE EXTERNAL_FCN udfInit(
    HSTMT  hStmt,
    void  *pRegCtx,
    void  *pFcnCtx)
Arguments
hStmt     (input)   Statement handle of SQL statement referencing this UDF.
pRegCtx   (input)   Pointer to the user program allocated registration context data area.
pFcnCtx   (input)   Pointer to the function context data area.
Description
The udfInit function is called by RDM SQL when the SQL statement containing the UDF call is executed
(rsqlExecute). This function is used to initialize data that needs to survive multiple calls to the udfScalarCall or
udfAggCall functions during the processing of the SQL statement. The pointer to this allocated memory is called
the function context pointer and is passed to the udfInit function (as well as each of the other execution-time functions) through the pFcnCtx argument. If no initialization is needed then this function is unnecessary and its entry
in the UDFLOADTABLE can be assigned to NULL.
The hStmt argument is the statement handle of the SQL statement that contains a reference to the UDF. It can
be used by any of the UDF implementation functions to discover any needed information about the invoking statement (e.g., rsqlGetStmtType).
The pRegCtx argument is the registration context pointer that was passed by the application to the rsqlRegisterUDFs function. This can be used to pass any necessary application-specific control information that may
be needed by any of the UDFs (e.g., a random number seed for any function that generates random numbers).
The pFcnCtx argument is a pointer to the function context data area and is typically defined as a struct type with
fields defined for any of the data that needs to survive the calls to the udfScalarCall or udfAggCall functions.
RDM SQL will allocate and clear this memory based on the size (in bytes) specified in the call to rsqlRegisterUDFs (argument szFcnCtx).
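The lifecycle described above can be sketched outside of RDM SQL as follows. This is only an illustration under stated assumptions: DEMO_ERRCODE, DEMO_CTX, demoInit, demoCall and demoRunStatement are hypothetical names that are not part of rsql.h, and calloc stands in for RDM SQL allocating and clearing the szFcnCtx bytes of function context.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins; the real types and codes come from rsql.h. */
typedef int DEMO_ERRCODE;
#define DEMO_SUCCESS 0

/* Function context: data that must survive between calls. */
typedef struct demo_ctx {
    uint32_t seed;     /* copied from the registration context */
    uint32_t ncalls;   /* number of calls made for this statement */
} DEMO_CTX;

/* Plays the role of udfInit: set up the (already zeroed) context. */
static DEMO_ERRCODE demoInit(void *pRegCtx, void *pFcnCtx)
{
    DEMO_CTX *pCtx = (DEMO_CTX *)pFcnCtx;

    assert(pCtx != NULL);
    pCtx->seed = pRegCtx ? *(const uint32_t *)pRegCtx : 1u;
    return DEMO_SUCCESS;
}

/* Plays the role of udfScalarCall: uses and updates the context. */
static uint32_t demoCall(void *pFcnCtx)
{
    DEMO_CTX *pCtx = (DEMO_CTX *)pFcnCtx;

    ++pCtx->ncalls;
    return pCtx->seed + pCtx->ncalls;
}

/* Simulates one statement execution: allocate a cleared context,
   initialize it, call the function n times, return the last result. */
static uint32_t demoRunStatement(uint32_t seed, uint32_t n)
{
    uint32_t  result = 0;
    uint32_t  i;
    DEMO_CTX *pCtx = calloc(1, sizeof(DEMO_CTX)); /* done by the engine in RDM SQL */

    if (pCtx == NULL)
        return 0;
    if (demoInit(&seed, pCtx) == DEMO_SUCCESS) {
        for (i = 0; i < n; ++i)
            result = demoCall(pCtx);
    }
    free(pCtx);
    return result;
}
```

Because the context is cleared before the initialization function runs, only the fields that need a non-zero starting value have to be assigned there.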
Example
#include "rsql.h"
...
/* ======================================================================
   Initialization function for generic UDF
*/
static RSQL_ERRCODE EXTERNAL_FCN MyUdfInit (
    HSTMT  hStmt,     /* in: statement handle */
    void  *pRegCtx,   /* in: ptr to registration context */
    void  *pFcnCtx)   /* in: ptr to fcn execution context data area */
{
    MYUDF_CTX *pCtx = (MYUDF_CTX *)pFcnCtx;

    UNREF_PARM(hStmt)
    UNREF_PARM(pRegCtx)

    /* do needed initialization of pCtx */
    . . .
    return errSUCCESS;
}
Return Codes
Error Code   Enum Identifier   SQL State   Description
0            errSUCCESS        00000       no error was detected
See Also
rsqlRegisterUDFs
udfCheck
udfTerm
udfScalarCall
udfAggCall
udfAggResult
udfAggReset
udfScalarCall
Process call to a scalar user-defined function
Prototype
RSQL_ERRCODE EXTERNAL_FCN udfScalarCall(
    HSTMT             hStmt,
    void             *pFcnCtx,
    uint16_t          noargs,
    const RSQL_VALUE *pArgs,
    RSQL_VALUE       *pResult)
Arguments
hStmt     (input)    Statement handle of SQL statement referencing this UDF.
pFcnCtx   (input)    Pointer to the function context data area.
noargs    (input)    Number of arguments specified in SQL statement's UDF call.
pArgs     (input)    Array of noargs argument value entries.
pResult   (output)   Pointer to the RSQL_VALUE variable to contain the result value.
Description
The udfScalarCall function is called by RDM SQL (usually) during execution of the SQL statement containing the
user-defined function (UDF) reference to perform the desired calculation. It can also be called at compilation
time when 1) the function is deterministic (as indicated by the pDeterm output argument from a prior call to the
udfCheck function), and 2) when all of the argument values are literal constants.
The hStmt argument is the statement handle of the SQL statement that contains a reference to the UDF. It can
be used by any of the UDF implementation functions to discover any needed information about the invoking statement (e.g., rsqlGetStmtType).
The pArgs argument is a pointer to an RSQL_VALUE array of noargs elements containing the value for each
argument. The first argument value is contained in pArgs[0]. The result value needs to be returned in the
RSQL_VALUE variable pointed to by the pResult output argument. Note that for tCHAR/tVARCHAR result
values, pResult->vt.cv must be assigned a pointer to a null-terminated char array holding the result
string. The memory containing the string must not be a local variable of the udfScalarCall function, because a local goes out of
scope as soon as the function returns. The memory needed for results that are dynamic (e.g., character strings,
binary arrays, etc.) will normally be contained or managed in the function context data area (pFcnCtx). Refer to
the SQL Data Types and Values section for details on the use of the RSQL_VALUE struct.
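The lifetime rule for string results can be illustrated with a small sketch. The names DEMO_VALUE, DEMO_TCHAR, GREET_CTX and demoGreetCall are hypothetical stand-ins (the real RSQL_VALUE declaration lives in rsql.h and has more members); the point is only that the result pointer refers to memory in the function context, not to a local array.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins for the rsql.h declarations. */
typedef struct demo_value {
    int type;                        /* stand-in for the value type code */
    union { const char *cv; } vt;    /* stand-in for the vt.cv string member */
} DEMO_VALUE;
#define DEMO_TCHAR 1

/* Function context holding the buffer for the string result. */
typedef struct greet_ctx {
    char buf[64];
} GREET_CTX;

/* Builds "Hello, <name>" in the context buffer and points the result at it.
   The buffer outlives this call because it belongs to the context. */
static int demoGreetCall(void *pFcnCtx, const char *name, DEMO_VALUE *pResult)
{
    GREET_CTX *pCtx = (GREET_CTX *)pFcnCtx;

    assert(pCtx != NULL && pResult != NULL);
    snprintf(pCtx->buf, sizeof(pCtx->buf), "Hello, %s", name ? name : "?");
    pResult->type  = DEMO_TCHAR;
    pResult->vt.cv = pCtx->buf;   /* not a local: still valid after return */
    return 0;
}
```

Had the string been built in a local char array, the pointer stored in the result would dangle as soon as the function returned.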
Example
#include <ctype.h>
#include <string.h>
#include "rsql.h"
...
/* ======================================================================
   Soundex() UDF - return soundex code for specified name
*/
static RSQL_ERRCODE EXTERNAL_FCN SndxCall (
    HSTMT             hStmt,    /* in: system handle */
    void             *cxtp,     /* in: UDF context pointer */
    uint16_t          noargs,   /* in: number of arguments to function */
    const RSQL_VALUE *args,     /* in: array of arguments */
    RSQL_VALUE       *result)   /* out: result value */
{
    /* Soundex conversion table. See Wikipedia "Soundex" page */
    static const char *const codes[] = {"bfpv","cgjkqsxz","dt","l","mn","r","hw",
                                        NULL};
    static const char *const sndxerr = "xERR";
    int32_t     cpos = 1;
    int32_t     cndx;
    char        cur_c;
    char        last_c = '\0';
    SNDX_CTX   *scp = cxtp;
    char       *sndx = &scp->sndx[0];
    const char *name = args->vt.cv;

    UNREF_PARM(hStmt)
    UNREF_PARM(noargs)

    result->type = tCHAR;
    result->len  = 0;

    if ( !name || !isalpha(*name)
         || (args->type != tCHAR && args->type != tVARCHAR) ) {
        result->vt.cv = sndxerr;
        return errSUCCESS;
    }
    sndx[0] = (char) toupper(*name++);
    strcpy(&sndx[1], "000");

    for ( ; cpos < 4 && isalpha(*name); ++name) {
        for (cndx = 0; codes[cndx] && cpos < 4; ++cndx) {
            if ( strchr(codes[cndx], tolower(*name)) ) {
                if ( cndx < 6 ) { /* entry 6 is "hw", which is not coded */
                    cur_c = (char) ('1' + cndx);
                    if ( cur_c != last_c ) {
                        sndx[cpos++] = cur_c;
                        last_c = cur_c;
                    }
                }
                break;
            }
        }
        if ( !codes[cndx] )
            last_c = 0;   /* letter not in table (vowel): reset repeat check */
    }
    result->vt.cv = sndx;
    return errSUCCESS;
}
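To try the encoding loop without a running SQL system, it can be lifted into a plain C function. The sketch below follows the same table and loop as SndxCall; demoSoundex is a hypothetical name, and the casts to unsigned char are an addition made here to keep the ctype calls well defined for all input bytes.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Standalone version of the encoding loop used in SndxCall.
   out must hold at least 5 bytes. Returns 0 on success, -1 if name is
   NULL or does not start with a letter. */
static int demoSoundex(const char *name, char out[5])
{
    static const char *const codes[] = {"bfpv","cgjkqsxz","dt","l","mn","r","hw",
                                        NULL};
    int  cpos = 1;
    int  cndx;
    char cur_c;
    char last_c = '\0';

    assert(out != NULL);
    if (!name || !isalpha((unsigned char)*name))
        return -1;

    out[0] = (char)toupper((unsigned char)*name++);
    strcpy(&out[1], "000");

    for ( ; cpos < 4 && isalpha((unsigned char)*name); ++name) {
        for (cndx = 0; codes[cndx]; ++cndx) {
            if (strchr(codes[cndx], tolower((unsigned char)*name))) {
                if (cndx < 6) {          /* 'h' and 'w' are not coded */
                    cur_c = (char)('1' + cndx);
                    if (cur_c != last_c) {
                        out[cpos++] = cur_c;
                        last_c = cur_c;
                    }
                }
                break;
            }
        }
        if (!codes[cndx])
            last_c = '\0';               /* letter not in table: reset repeat check */
    }
    return 0;
}
```

With this loop "Robert" encodes to R163 and "Ashcraft" to A261, since 'h' does not reset the repeat check between 's' and 'c'.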
Return Codes
Error Code   Enum Identifier   SQL State   Description
0            errSUCCESS        00000       no error was detected
83           errUDF            RX011       user-defined function error
86           errUDFARG         21000       invalid function argument type
See Also
rsqlRegisterUDFs
udfCheck
udfInit
udfTerm
udfAggCall
udfAggResult
udfAggReset
udfTerm
Terminate execution of a user-defined function
Prototype
void EXTERNAL_FCN udfTerm(
    HSTMT  hStmt,
    void  *pFcnCtx)
Arguments
hStmt     (input)   Statement handle of SQL statement referencing this UDF.
pFcnCtx   (input)   Pointer to the function context data area.
Description
The udfTerm function is called after the SQL statement containing the UDF reference has completed
executing. In the case of a select, this means when the cursor has been closed, either through the call to
rsqlFetch that returns status errNOMOREDATA (automatically closing the cursor) or through a call to
rsqlCloseStmt, which is used to close a cursor before having scrolled completely through it.
The hStmt argument is the statement handle of the SQL statement that contains a reference to the UDF. The
pFcnCtx argument is a pointer to the function context data area and is typically defined as a struct type with
fields defined for any of the data that needs to survive the calls to the udfScalarCall or udfAggCall functions.
RDM SQL will allocate and clear this memory based on the size (in bytes) specified in the call to rsqlRegisterUDFs (argument szFcnCtx).
Example
/* ======================================================================
   Termination function for generic UDF
*/
static void EXTERNAL_FCN MyUdfTerm (
    HSTMT  hStmt,     /* in: statement handle */
    void  *pFcnCtx)   /* in: ptr to fcn execution context data area */
{
    MYUDF_CTX *pCtx = (MYUDF_CTX *)pFcnCtx;

    UNREF_PARM(hStmt)

    /* do needed termination from pCtx */
    . . .
}
See Also
rsqlRegisterUDFs
udfCheck
udfInit
udfScalarCall
udfAggCall
udfAggResult
udfAggReset
SQL Virtual Table Function Reference
Function        Description
vtFetch         Fetch the next row in the virtual table
vtInsert        Process execution of an insert statement into a virtual table
vtRowCount      Return estimate of number of rows in virtual table
vtSelectClose   Close select statement execution access to virtual table
vtSelectCount   Return actual number of rows in virtual table
vtSelectOpen    Process execution of SQL statement access to virtual table
vtFetch
Fetch the next row in the virtual table
Prototype
RSQL_ERRCODE EXTERNAL_FCN vtFetch(
    HSTMT      hstmt,
    uint16_t   nocols,
    VCOL_INFO *colsvals,
    void      *pRegCtx,
    void      *pFetchCtx)
Arguments
hstmt       (input)   Statement handle of SQL statement containing the virtual table reference.
nocols      (input)   Number of referenced columns (size of colsvals array).
colsvals    (input)   Array of referenced column value containers.
pRegCtx     (input)   Pointer to the user program allocated context data area that was originally passed in
                      through the call to rsqlRegisterVirtualTables.
pFetchCtx   (input)   Pointer to the fetch context data area.
Description
This function is called by SQL to fetch the next row from the virtual table. The fetch context pointer, pFetchCtx (named pFCtx in the example), references the fetch context data area containing any virtual table specific data needed for processing the fetch
(e.g., current row number). If a primary key lookup value was specified, then only one row should be retrieved. If
not, then all rows in the table should be retrieved, with status errNOMOREDATA being returned on the first call
after the last row has been fetched. The necessary programming logic is best explained through the virtab
example as shown below.
Example
 1  /* ========================================================================
 2     Virtual table fetch function
 3  */
 4  static RSQL_ERRCODE EXTERNAL_FCN vtabFetch( /* vtFetch() */
 5      HSTMT      hstmt,     /* in: statement handle */
 6      uint16_t   nocols,    /* in: no. of ref'd columns */
 7      VCOL_INFO *colsvals,  /* in: array of ref'd col value containers */
 8      void      *pRegCtx,   /* in: ptr to registration context */
 9      void      *pFCtx)     /* in: ptr to fetch context */
10  {
11      int16_t    cno;
12      VTAB_CTX  *pCtx = (VTAB_CTX *)pFCtx;
13      uint32_t   rno = (uint32_t)pCtx->rowno;
14
15      vtabEnter();
16
17      if ( rno == norows || (pCtx->pkeyval && pCtx->rowcnt) ) {
18          pCtx->rowno = 0;
19          vtabExit();
20          return errNOMOREDATA;
21      }
22      for (cno = 0; cno < nocols; ++cno) {
23          const VCOL_INFO *pCVal = &colsvals[cno];
24          if ( vtabrows[rno].is_null[pCVal->colno] )
25              *pCVal->is_null = 1;
26          else {
27              *pCVal->is_null = 0;
28              switch ( pCVal->colno ) {
29                  case 0:
30                      memcpy(pCVal->data, &vtabrows[rno].pkey, sizeof(int32_t));
31                      break;
32                  case 1:
33                      strcpy(pCVal->data, vtabrows[rno].name);
34                      break;
35                  case 2:
36                      strcpy(pCVal->data, vtabrows[rno].addr);
37                      break;
38                  case 3:
39                      strcpy(pCVal->data, vtabrows[rno].city);
40                      break;
41                  case 4:
42                      strcpy(pCVal->data, vtabrows[rno].state);
43                      break;
44                  case 5:
45                      strcpy(pCVal->data, vtabrows[rno].zip);
46                      break;
47              } /*lint !e744 */
48          }
49      }
50
51      ++pCtx->rowcnt;
52      ++pCtx->rowno;
53      vtabExit();
54      return errSUCCESS;
55  }
Note the call to vtabEnter at line 15 and its reciprocal calls to vtabExit at lines 19 and 53 serializing access
to the norows and vtabrows variables. The if statement at line 17 tests the two conditions under which an
errNOMOREDATA status code is to be returned.
The loop at lines 22 to 49 is used to copy the fetched row's information for each column in the colsvals array.
This involves setting the correct null value indicator (lines 24-25) and, for the non-null columns, copying its value
into the column's data buffer pointed to by the VCOL_INFO data field (lines 30, 33, 36, 39, 42, and 45).
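The interplay between the open and fetch callbacks can be tried in isolation with a small model. Everything named demo or DEMO below is a hypothetical stand-in (the real status codes and context types come from the RDM SQL headers and the virtab example); the sketch keeps only the row positioning and the two end-of-data conditions, and leaves out the column copying and the vtabEnter/vtabExit serialization.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the status codes in rsql.h. */
#define DEMO_SUCCESS     0
#define DEMO_NOMOREDATA (-1)

static const int32_t  demoKeys[] = { 10, 20, 30 };   /* in-memory "table" */
static const uint32_t demoNoRows = 3;

typedef struct demo_fetch_ctx {
    uint64_t       rowcnt;    /* count of rows fetched */
    uint64_t       rowno;     /* index of next row to be fetched */
    const int32_t *pkeyval;   /* non-NULL for a primary key lookup */
} DEMO_FETCH_CTX;

/* Plays the role of vtSelectOpen: position on the first row to return. */
static void demoOpen(DEMO_FETCH_CTX *pCtx, const int32_t *pkeyval)
{
    uint32_t rowno = 0;

    assert(pCtx != NULL);
    if (pkeyval) {
        while (rowno < demoNoRows && demoKeys[rowno] != *pkeyval)
            ++rowno;          /* not found leaves rowno == demoNoRows */
    }
    pCtx->rowcnt  = 0;
    pCtx->rowno   = rowno;
    pCtx->pkeyval = pkeyval;
}

/* Plays the role of vtFetch: return one row per call. */
static int demoFetch(DEMO_FETCH_CTX *pCtx, int32_t *pKeyOut)
{
    /* end of table, or the single key-lookup row was already returned */
    if (pCtx->rowno == demoNoRows || (pCtx->pkeyval && pCtx->rowcnt)) {
        pCtx->rowno = 0;
        return DEMO_NOMOREDATA;
    }
    *pKeyOut = demoKeys[pCtx->rowno];
    ++pCtx->rowcnt;
    ++pCtx->rowno;
    return DEMO_SUCCESS;
}

/* Counts the rows a caller sees for a scan (pkeyval == NULL) or a lookup. */
static uint32_t demoCountFetched(const int32_t *pkeyval)
{
    DEMO_FETCH_CTX ctx;
    int32_t        key;
    uint32_t       n = 0;

    demoOpen(&ctx, pkeyval);
    while (demoFetch(&ctx, &key) == DEMO_SUCCESS)
        ++n;
    return n;
}
```

A full scan returns every row, a lookup on an existing key returns exactly one row, and a lookup on a missing key returns none because the open step leaves the row index at the row count.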
Return Codes
Error Code   Enum Identifier   SQL State   Description
0            errSUCCESS        00000       no error was detected
-1           errNOMOREDATA     02000       no more data
See Also
rsqlRegisterVirtualTables
vtRowCount
vtSelectCount
vtSelectOpen
vtSelectClose
vtInsert
Process execution of an insert statement into a virtual table
Prototype
RSQL_ERRCODE EXTERNAL_FCN vtInsert(
    HSTMT      hstmt,
    uint16_t   nocols,
    VCOL_INFO *colsvals,
    void      *pRegCtx)
Arguments
hstmt      (input)   Statement handle of SQL statement containing the virtual table reference.
nocols     (input)   Number of referenced columns (size of colsvals array).
colsvals   (input)   Array of referenced column value containers.
pRegCtx    (input)   Pointer to the user program allocated context data area that was originally passed in
                     through the call to rsqlRegisterVirtualTables.
Description
This is a callback function, implemented by you, that is called by SQL to execute the SQL insert statement that
references the virtual table. The name of the function can be anything as the RDM SQL system only calls this
function through a pointer to it contained in the VTFLOADTABLE struct entry for its associated virtual table.
Each entry of the colsvals array contains information about a virtual table column that is referenced in the
SQL statement. This information is contained in the VCOL_INFO struct type whose fields are described in
the following table.
Table 4. VCOL_INFO Description
Field Name   Data Type   Description
colno        int16_t     Ordinal position of column in table declaration: 0 (first column) to # of columns in table – 1 (last column).
len          uint32_t    Column length in bytes.
is_null      int16_t *   Pointer to variable containing the null indicator flag: *is_null = 0 => not null,
                         *is_null = 1 => is null.
data         void *      Pointer to the buffer containing the column value.
Note that the is_null field is a pointer to the int16_t variable that is used by the SQL system to indicate that a
column value is null. Assigning the indicator through this pointer eliminates the need for the SQL system to perform an
extra loop through the colsvals array.
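How a callback works with the null indicator pointer can be sketched as follows. DEMO_VCOL_INFO mirrors the four fields listed in Table 4 but is a stand-in, not the real VCOL_INFO declaration; demoSetStringCol is a hypothetical helper, and treating len as the size of the data buffer is an assumption made for this illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Stand-in mirroring the VCOL_INFO fields listed in Table 4;
   the real declaration comes from the RDM SQL headers. */
typedef struct demo_vcol_info {
    int16_t   colno;     /* ordinal position of column in table */
    uint32_t  len;       /* column length in bytes (assumed: buffer size) */
    int16_t  *is_null;   /* ptr to null indicator owned by the caller */
    void     *data;      /* ptr to column value buffer */
} DEMO_VCOL_INFO;

/* Reports a string value (or SQL null when value == NULL) through a
   column container, the way a fetch function returns a column. */
static void demoSetStringCol(const DEMO_VCOL_INFO *pCol, const char *value)
{
    assert(pCol != NULL && pCol->is_null != NULL && pCol->data != NULL);
    if (value == NULL) {
        *pCol->is_null = 1;            /* written through the pointer */
        return;
    }
    *pCol->is_null = 0;
    if (pCol->len > 0) {
        /* bounded copy that always leaves a terminated string */
        strncpy((char *)pCol->data, value, pCol->len - 1);
        ((char *)pCol->data)[pCol->len - 1] = '\0';
    }
}
```

The container itself can be const because only the variables it points to are modified.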
All of the information needed to do the insert is provided in the vtInsert arguments. The colsvals array contains the values of the table columns to be inserted. The nocols argument specifies the number of entries in the
colsvals array which could be less than the number of columns declared in the table.
If the associated virtual table has a primary key then it is the responsibility of this function to ensure that any specified primary key column value is unique. If a duplicate entry is found then the function needs to return status
errDUPLICATE.
Example
 1  /* ========================================================================
 2     Virtual table INSERT execution function
 3  */
 4  static RSQL_ERRCODE EXTERNAL_FCN vtabInsert( /* vtInsert() */
 5      HSTMT      hstmt,     /* in: statement handle */
 6      uint16_t   nocols,    /* in: no. of ref'd columns */
 7      VCOL_INFO *colsvals,  /* in: array of ref'd column value containers */
 8      void      *pRegCtx)   /* in: unused */
 9  {
10      int32_t       lv;
11      uint32_t      rowno;
12      int16_t       pkno = -1;
13      RSQL_ERRCODE  stat = errSUCCESS;
14
15      UNREF_PARM(hstmt)
16      UNREF_PARM(pRegCtx)
17
18      vtabEnter();
19
20      if ( !vtabrows ) {
21          /* allocate virtab data area */
22          vtabrows = calloc(maxrows, sizeof(struct virtab));
23      }
24      /* locate specified primary key value, if any */
25      for (pkno = 0; pkno < nocols; ++pkno) {
26          if ( colsvals[pkno].colno == 0 ) {
27              /* locate row with matching primary key */
28              memcpy(&lv, colsvals[pkno].data, sizeof(int32_t));
29              for ( rowno = 0; rowno < norows; ++rowno ) {
30                  if ( vtabrows[rowno].pkey == lv ) {
31                      vtabExit();
32                      return errDUPLICATE;
33                  }
34              }
35          }
36      }
37      stat = vtabStoreRow(norows, nocols, colsvals);
38      if ( stat == errSUCCESS )
39          ++norows;
40
41      vtabExit();
42      return stat;
43  }
Since the virtab table has a primary key, the function needs to locate the primary key value in the colsvals
array so that its uniqueness can be checked. This work is done at lines 24 to 36. Since the primary key is
declared on the first column of the table, its value is located in the colsvals entry that has colno equal to 0
(line 26). Once found, the value is copied into the local int32_t variable lv. If a matching row is found the function returns status errDUPLICATE to indicate that an attempt was made to insert a row with a duplicate primary
key value (lines 30-33).
 1  /* ========================================================================
 2     Store column values in specified row (0 = first row)
 3  */
 4  static RSQL_ERRCODE vtabStoreRow(
 5      uint32_t         rowno,     /* in: row number into which store col vals */
 6      uint16_t         nocols,    /* in: no. of ref'd columns */
 7      const VCOL_INFO *colsvals)  /* in: array of ref'd column value containers */
 8  {
 9      uint16_t         cno;
10      const VCOL_INFO *pCol;
11      struct virtab   *pRow;
12
13      if ( rowno >= maxrows )
14          return errVTSPACE;
15
16      pRow = &vtabrows[rowno];
17
18      for (pCol = colsvals, cno = 0; cno < nocols; ++cno, ++pCol ) {
19          if ( *pCol->is_null )
20              pRow->is_null[pCol->colno] = 1;
21          else {
22              pRow->is_null[pCol->colno] = 0;
23              switch (pCol->colno) {
24                  case 0: memcpy(&pRow->pkey, pCol->data, sizeof(int32_t)); break;
25                  case 1: strncpy(pRow->name,  (char *)pCol->data, 24);     break;
26                  case 2: strncpy(pRow->addr,  (char *)pCol->data, 32);     break;
27                  case 3: strncpy(pRow->city,  (char *)pCol->data, 24);     break;
28                  case 4: strncpy(pRow->state, (char *)pCol->data, 2);      break;
29                  case 5: strncpy(pRow->zip,   (char *)pCol->data, 9);      break;
30              } /*lint !e744 */
31          }
32      }
33      return errSUCCESS;
34  }
The rowno argument is the index into vtabrows at which the row will be stored. The pRow pointer (assigned at
line 16) is simply used to dereference that row in the code which follows. Lines 18-32 loop through the colsvals
array in order to assign the values for each individual column into its field in the vtabrows struct array entry.
It is important to note that the table column number is not cno but pCol->colno (lines 20, 22, and 23). Also
note that in this example the len field of VCOL_INFO is not used but it could (should!) have been used to, for
example, check for a possible truncation (i.e., where pCol->len is greater than the declared size of the column).
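One way the truncation check mentioned above might look is sketched here. demoCopyCol is a hypothetical helper, not part of the virtab example; its src_len parameter plays the role of the VCOL_INFO len field, and the sketch assumes the destination field has room for a terminating null, which the declared sizes in the example may or may not provide.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Copies a string column value into a fixed-size field.
   Returns 1 if the value had to be truncated to fit, else 0.
   The destination is always null-terminated. */
static int demoCopyCol(char *dst, size_t dst_size, const void *src, uint32_t src_len)
{
    const char *s = (const char *)src;
    size_t      n = 0;
    int         truncated = 0;

    assert(dst != NULL && dst_size > 0);

    /* length of the value: stop at a terminator or after src_len bytes */
    while (s != NULL && n < src_len && s[n] != '\0')
        ++n;

    if (n >= dst_size) {       /* does not fit with its terminator */
        n = dst_size - 1;
        truncated = 1;
    }
    if (n > 0)
        memcpy(dst, s, n);
    dst[n] = '\0';
    return truncated;
}
```

A store function could call such a helper in place of the bare strncpy calls and report an application-defined error, or just a warning, when it returns 1.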
Return Codes
Error Code   Enum Identifier   SQL State   Description
0            errSUCCESS        00000       no error was detected
90           errDUPLICATE      42000       duplicate primary/unique key value
See Also
rsqlRegisterVirtualTables
vtRowCount
vtSelectCount
vtSelectOpen
vtFetch
vtSelectClose
vtRowCount
Return estimate of number of rows in virtual table
Prototype
RSQL_ERRCODE EXTERNAL_FCN vtRowCount(
    HSTMT     hstmt,
    void     *pRegCtx,
    uint64_t *pNoRows)
Arguments
hstmt     (input)    Statement handle of SQL statement containing the virtual table reference.
pRegCtx   (input)    Pointer to the user program allocated context data area that was originally passed in
                     through the call to rsqlRegisterVirtualTables.
pNoRows   (output)   Pointer to the variable to contain the number of rows.
Description
This is a callback function, implemented by you, that is called by SQL during compilation of a SQL select statement that contains a reference to the virtual table in order to fetch an estimate of the number of rows in the table.
The name of the function can be anything as the RDM SQL system only calls this function through a pointer to it
contained in the VTFLOADTABLE struct entry for its associated virtual table.
The function is always called during compilation of a select statement. The returned number of rows does not
need to be exact as it is only being used by the query optimizer to get an estimate of the number of rows in the
table.
Some virtual tables (e.g., those that map to real-time sensors) may have an unlimited number of rows. Nevertheless, a value does need to be returned so you can set it to whatever makes the most sense for your application.
The pRegCtx pointer is that which was given in the call to rsqlRegisterVirtualTables which registered
the VTFLOADTABLE for the database containing the definition for this particular virtual table.
The function must return status code errSUCCESS unless some application-dependent error has occurred
which needs to be reported.
Example
/* ========================================================================
   Virtual table 'virtab' row count function
*/
static RSQL_ERRCODE EXTERNAL_FCN vtabRowCount( /* vtRowCount() */
    HSTMT     hstmt,     /* in: statement handle */
    void     *pRegCtx,   /* in: unused */
    uint64_t *pNoRows)   /* out: ptr to row count value */
{
    UNREF_PARM(hstmt)
    UNREF_PARM(pRegCtx)

    vtabEnter();
    *pNoRows = (uint64_t)norows;
    vtabExit();
    return errSUCCESS;
}
Return Codes
Error Code   Enum Identifier   SQL State   Description
0            errSUCCESS        00000       no error was detected
See Also
rsqlRegisterVirtualTables
vtSelectCount
vtSelectOpen
vtFetch
vtSelectClose
vtSelectClose
Close select statement execution access to virtual table
Prototype
RSQL_ERRCODE EXTERNAL_FCN vtSelectClose(
    HSTMT  hstmt,
    void  *pRegCtx,
    void  *pFetchCtx)
Arguments
hstmt       (input)   Statement handle of SQL statement containing the virtual table reference.
pRegCtx     (input)   Pointer to the user program allocated context data area that was originally passed in
                      through the call to rsqlRegisterVirtualTables.
pFetchCtx   (input)   Pointer to the fetch context data area.
Description
This is a callback function, implemented by you, that is called by SQL when execution of the select statement that contains a reference to the virtual table is closed. The name of the function can be anything as the RDM SQL system
only calls this function through a pointer to it contained in the VTFLOADTABLE struct entry for its associated
virtual table.
The pRegCtx pointer is that which was given in the call to rsqlRegisterVirtualTables which registered
the VTFLOADTABLE for the database containing the definition of this particular virtual table.
The pFetchCtx points to the fetch context data area. Any additional allocated memory contained in pointers
stored in this data area to support processing of the select statement referencing the virtual table should be freed
by this function.
The function must return status code errSUCCESS unless some application-dependent error has occurred
which needs to be reported.
Example
/* ========================================================================
   Virtual table close function
*/
static RSQL_ERRCODE EXTERNAL_FCN vtabSelectClose( /* vtSelectClose() */
    HSTMT  hstmt,      /* in: statement handle */
    void  *pRegCtx,    /* in: ptr to registration context */
    void  *pFetchCtx)  /* in: ptr to fetch context */
/*
   Called by SQL when SELECT statement containing virtual table reference
   completes execution (i.e., when cursor is closed).
   Use this function to do any needed cleanup and device termination actions.
*/
{
    /* code to free any allocated memory or, perhaps
       to power down virtual table device. */
    return errSUCCESS;
}
Return Codes
Error Code   Enum Identifier   SQL State   Description
0            errSUCCESS        00000       no error was detected
See Also
vtRowCount
vtSelectCount
vtSelectOpen
vtFetch
vtSelectCount
Return actual number of rows in virtual table
Prototype
RSQL_ERRCODE EXTERNAL_FCN vtSelectCount(
    HSTMT     hstmt,
    void     *pRegCtx,
    void     *pFetchCtx,
    uint64_t *pNoRows)
Arguments
hstmt       (input)    Statement handle of SQL statement containing the virtual table reference.
pRegCtx     (input)    Pointer to the user program allocated context data area that was originally passed in
                       through the call to rsqlRegisterVirtualTables.
pFetchCtx   (input)    Pointer to the fetch context data area.
pNoRows     (output)   Pointer to the variable to contain the number of rows.
Description
This is a callback function, implemented by you, that is called by SQL to fetch the count of the actual number of rows in the
virtual table. The name of the function can be anything as the RDM SQL system only calls this function through a
pointer to it contained in the VTFLOADTABLE struct entry for its associated virtual table.
It is only called during the execution of a "select count(*) from virtab" statement in order to return the current
actual number of rows in the virtual table.
Some virtual tables (e.g., those that map to real-time sensors) may have an unlimited number of rows. Nevertheless, a value does need to be returned. For the "select count(*)" the value returned still needs to be a fixed
value so you can set it to whatever makes the most sense for your application.
The pRegCtx pointer is that which was given in the call to rsqlRegisterVirtualTables which registered
the VTFLOADTABLE for the database containing the definition of this particular virtual table.
The function must return status code errSUCCESS unless some application-dependent error has occurred
which needs to be reported.
Example
/* ========================================================================
   Virtual table 'virtab' select count function
*/
static RSQL_ERRCODE EXTERNAL_FCN vtabSelectCount( /* vtSelectCount() */
    HSTMT     hstmt,     /* in: statement handle */
    void     *pRegCtx,   /* in: unused */
    void     *pCtx,      /* in: unused */
    uint64_t *pNoRows)   /* out: ptr to row count value */
{
    vtabEnter();
    *pNoRows = (uint64_t)norows;
    vtabExit();
    return errSUCCESS;
}
Return Codes
Error Code   Enum Identifier   SQL State   Description
0            errSUCCESS        00000       no error was detected
See Also
rsqlRegisterVirtualTables
vtRowCount
vtSelectOpen
vtFetch
vtSelectClose
vtSelectOpen
Process execution of SQL statement access to virtual table
Prototype
RSQL_ERRCODE EXTERNAL_FCN vtSelectOpen(
    HSTMT       hstmt,
    uint16_t    nocols,
    VCOL_INFO  *colsvals,
    void       *pRegCtx,
    void       *pFetchCtx,
    RSQL_VALUE *pkeyval)
Arguments
hstmt       (input)   Statement handle of SQL statement containing the virtual table reference.
nocols      (input)   Number of referenced columns (size of colsvals array).
colsvals    (input)   Array of referenced column value containers.
pRegCtx     (input)   Pointer to the user program allocated context data area that was originally passed in
                      through the call to rsqlRegisterVirtualTables.
pFetchCtx   (input)   Pointer to the fetch context data area.
pkeyval     (input)   Pointer to specified primary key value. Non-NULL only when executing "select ...
                      from virtab where pkey = value" statement.
Description
This is a callback function, implemented by you, that is called by SQL to execute a select statement that references the virtual table. The name of the function can be anything as the RDM SQL system only calls this function through a pointer to it contained in the VTFLOADTABLE struct entry for its associated virtual table.
Each entry of the colsvals array contains information about a virtual table column that is referenced in the
SQL statement. This information is contained in the VCOL_INFO struct type whose fields are described in
the following table.
Table 4. VCOL_INFO Description
Field Name   Data Type   Description
colno        int16_t     Ordinal position of column in table declaration: 0 (first column) to # of columns in table – 1 (last column).
len          uint32_t    Column length in bytes.
is_null      int16_t *   Pointer to variable containing the null indicator flag: *is_null = 0 => not null,
                         *is_null = 1 => is null.
data         void *      Pointer to the buffer containing the column value.
Note that the is_null field is a pointer to the int16_t variable that is used by the SQL system to indicate that a
column value is null. Assigning the indicator through this pointer eliminates the need for the SQL system to perform an
extra loop through the colsvals array.
The fetch context pointer contains the address of a data area that is used by vtFetch to control the fetching
of rows from the virtual table. The context used in the virtab example is defined by the VTAB_CTX struct
typedef declaration given below.
typedef struct vtab_ctx {
    uint64_t    rowcnt;   /* count of rows fetched */
    uint64_t    rowno;    /* number of next row to be fetched */
    RSQL_VALUE *pkeyval;  /* ptr to primary key's value */
} VTAB_CTX;
The rowno field contains the vtabrows index of the next row to be returned by vtFetch. The rowcnt field and a non-NULL pkeyval are used to ensure that only one row is returned when the select statement includes the "where
pkey = value" clause.
If a primary key value is specified then vtSelectOpen needs to locate the row with that value (lines 30-34) and set
pCtx->rowno to it. If it is not found then pCtx->rowno is set to norows, which will cause vtFetch to return
errNOMOREDATA.
Example
 1  /* ========================================================================
 2     Virtual table SELECT execution function
 3  */
 4  static RSQL_ERRCODE EXTERNAL_FCN vtabSelectOpen( /* vtSelectOpen() */
 5      HSTMT       hstmt,     /* in: statement handle */
 6      uint16_t    nocols,    /* in: no. of ref'd columns */
 7      VCOL_INFO  *colsvals,  /* in: array of ref'd column value containers */
 8      void       *pRegCtx,   /* in: ptr to registration context */
 9      void       *pFCtx,     /* in: ptr to fetch context */
10      RSQL_VALUE *pkeyval)   /* in: ptr to primary key value */
11  {
12      RSQL_ERRCODE stat = errSUCCESS;
13      uint32_t     rowno;
14      VTAB_CTX    *pCtx = (VTAB_CTX *)pFCtx;
15
16      UNREF_PARM(hstmt)
17      UNREF_PARM(pRegCtx)
18
19      pCtx->rowcnt  = 0;
20      pCtx->rowno   = rowno = 0;
21      pCtx->pkeyval = pkeyval;
22
23      vtabEnter();
24
25      if ( !vtabrows ) {
26          vtabrows = calloc(maxrows, sizeof(struct virtab));
27      }
28      else if ( pkeyval ) {
29          /* locate row with matching primary key */
30          for ( rowno = 0; rowno < norows; ++rowno ) {
31              if ( pkeyval->vt.lv == vtabrows[rowno].pkey )
32                  break;
33          }
34          pCtx->rowno = rowno;
35      }
36      vtabExit();
37
38      return stat;
39  }
Return Codes
Error Code   Enum Identifier   SQL State   Description
0            errSUCCESS        00000       no error was detected
See Also
rsqlRegisterVirtualTables
vtRowCount
vtSelectCount
vtFetch
vtSelectClose
Glossary
B
B-tree
Also called a multiway tree, a B-tree is a fast data-indexing method that organizes the index
into a multi-level set of nodes. Each node contains a sorted array of key values (the indexed
data). Two important properties of a B-tree are that all nodes are at least half-full and that the
tree is always balanced (that is, an identical number of nodes must be read in order to
locate all keys at any given level in the tree). A well-organized B-tree will have only three or
four levels.
buffer
An in-memory store of data read from a disk file, in which database operations are performed.
C
cache
A set of buffers used to optimize database input and output operations. All RDM Embedded
database input and output is performed using a cache.
combine
The concatenation of the members of two or more set types into one set type.
commit
The point at which database changes made during a single transaction are actually written
to the database files.
compound key
A key field composed of any combination of fields (not necessarily contiguous) from a record. Each field of a compound key may be stored in ascending or descending order.
connect
The process of inserting a member record occurrence into a set occurrence.
currency tables
A table of database addresses maintained by the RDM Embedded runtime system for controlling record access and set navigation. The currency tables consist of the current member
table, current owner table, and the current record.
current database
The database that is currently accessible by the RDM Embedded runtime functions when
multiple databases have been opened. The current database is changed by the database
number function argument or by function d_setdb.
current member
Contains, for each set, the database address of a record occurrence that is a valid member
of that set. Usually, the current member of a set is the last record accessed using a set navigation function (d_findfm, d_findlm, d_findnm, or d_findpm).
current owner
Contains, for each set, the database address of a record occurrence that is a valid owner of
that set. Usually, the current owner of a set is established using the set navigation function
d_findco or by using a currency manipulation function.
current record
Contains the database address of the most recently accessed record instance.
D
data field
A field represents the basic unit of information storage in a database and is always defined
to be an element of a record. A field has associated with it attributes such as name, type (for
example, char or int), and length. Other terms used for field include: attribute, entity, or column.
data file
An RDM Embedded file defined in a DDL specification that contains occurrences of one or
more record types.
database
An organized collection of related files.
database address
The location in the database of a record occurrence, frequently referred to as a DB_ADDR.
Composed of two numbers: the file index and the slot within the file. Either 4 or 8 bytes long.
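As an illustration of how a file index and a slot number can share one fixed-size value, here is a minimal C sketch. The bit layout (8 bits of file index above 24 bits of slot in a 4-byte value) and every `sketch_` name are assumptions made for this example; the glossary does not state RDM Embedded's actual DB_ADDR encoding.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 4-byte address: NOT the real DB_ADDR layout.
   Upper 8 bits hold the file index, lower 24 bits hold the slot. */
typedef uint32_t sketch_db_addr;

#define SKETCH_SLOT_BITS 24u
#define SKETCH_SLOT_MASK ((1u << SKETCH_SLOT_BITS) - 1u)
#define SKETCH_FILE_MAX  0xFFu

/* Pack a file index and a slot number into one address value. */
sketch_db_addr sketch_encode(uint32_t file_index, uint32_t slot)
{
    assert(file_index <= SKETCH_FILE_MAX);
    assert(slot <= SKETCH_SLOT_MASK);
    return (file_index << SKETCH_SLOT_BITS) | slot;
}

/* Recover the file index from an address. */
uint32_t sketch_file_index(sketch_db_addr addr)
{
    return addr >> SKETCH_SLOT_BITS;
}

/* Recover the slot number from an address. */
uint32_t sketch_slot(sketch_db_addr addr)
{
    return addr & SKETCH_SLOT_MASK;
}
```

Under this assumed layout, `sketch_encode(3, 1000)` yields an address from which `sketch_file_index` returns 3 and `sketch_slot` returns 1000. An 8-byte variant would follow the same pattern with wider fields.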
database definition language
A programming-like language used to define the structure and content of a database. RDM
Embedded's Database Definition Language has been designed to be used with the C programming language.
DDL
A programming-like language used to define the structure and content of a database. RDM
Embedded's Database Definition Language has been designed to be used with the C programming language.
deadlock
A situation in which multiple processes accessing the same database each hold locks
needed by the other processes in such a way that none of the processes can proceed.
Sometimes called deadly embrace.
delete chain
A linked list containing deleted records or nodes to be reused when a new record or node is
created.
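The reuse behavior of a delete chain can be sketched in C as follows. This is an in-memory model under stated assumptions, not RDM Embedded's on-disk format: the slot array, the `sketch_` names, and the choice to push deleted slots onto the front of the chain are all hypothetical.

```c
#include <assert.h>

#define SKETCH_SLOTS 8
#define SKETCH_NONE  (-1)

/* Hypothetical stand-in for one slot of a data file. */
typedef struct {
    int in_use;        /* 1 while the slot holds a live record */
    int next_deleted;  /* next slot on the delete chain, or SKETCH_NONE */
} sketch_slot_t;

typedef struct {
    sketch_slot_t slots[SKETCH_SLOTS];
    int chain_head;    /* first reusable (deleted) slot */
    int next_unused;   /* first slot that has never been used */
} sketch_file_t;

void sketch_file_init(sketch_file_t *f)
{
    f->chain_head = SKETCH_NONE;
    f->next_unused = 0;
    for (int i = 0; i < SKETCH_SLOTS; i++) {
        f->slots[i].in_use = 0;
        f->slots[i].next_deleted = SKETCH_NONE;
    }
}

/* Create a record: reuse a deleted slot if the chain is non-empty,
   otherwise take the next never-used slot. Returns SKETCH_NONE if full. */
int sketch_create(sketch_file_t *f)
{
    int slot;
    if (f->chain_head != SKETCH_NONE) {
        slot = f->chain_head;
        f->chain_head = f->slots[slot].next_deleted;
    } else if (f->next_unused < SKETCH_SLOTS) {
        slot = f->next_unused++;
    } else {
        return SKETCH_NONE;
    }
    f->slots[slot].in_use = 1;
    f->slots[slot].next_deleted = SKETCH_NONE;
    return slot;
}

/* Delete a record: link its slot onto the front of the delete chain. */
void sketch_delete(sketch_file_t *f, int slot)
{
    assert(slot >= 0 && slot < SKETCH_SLOTS && f->slots[slot].in_use);
    f->slots[slot].in_use = 0;
    f->slots[slot].next_deleted = f->chain_head;
    f->chain_head = slot;
}
```

In this model, creating three records, deleting the second, and creating another returns the deleted slot before any never-used slot is touched.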
derived revision
A revision that can be derived from a comparison of the source and destination database dictionary files.
destination database
The db_REVISE-created database that stores the specified revisions.
dictionary
A repository containing a definition of the content and structure of a database. It is used by
the RDM Embedded runtime library functions for accessing and manipulating information
from that database.
disconnect
The process of removing a member record from a set occurrence.
document root
The path to the directory under which all files will be stored. Within the domain of one TFS,
no files outside of this path may be accessed.
domain name
The "name" of a computer which has visibility to another computer. This may be a published
name available on DNS servers and across the Internet, or an internal network name visible
only within a workgroup. The "ping" utility must be able to locate the IP address associated
with this name. In RDM Embedded, a server (tfserver, dbmirror, dbrep, or dbrepsql) may be
located through the domain name of the computer it is running on, together with the port on
which it is listening. A special domain name, "localhost", always refers to the computer on
which the application is running (its IP address is always 127.0.0.1).
E
environment variable
A programmer-specified operating system parameter that is used to identify configuration
information to the runtime system.
F
field
A field represents the basic unit of information storage in a database and is always defined
to be an element of a record. A field has associated with it attributes such as name, type (for
example, char or int), and length. Other terms used for field include: attribute, entity, or column.
file
The primary physical storage unit into which a database is organized. In RDM Embedded,
files are used to store records and keys.
H
hierarchical database model
A data representation in which the relationships between record types are formed from parent-child structures, such that a record type may have many child relationships but only one
parent relationship.
I
index
A set of key values through which rapid retrieval of a record is provided, similar to the index
of a book. The term is often used synonymously with key file.
J
join
The creation of one record type from a hierarchy of record types.
K
key
A field through which rapid and/or sorted access to a record is desired.
key file
A file that only contains keys. It may, in fact, contain more than one index because multiple
key types can be contained in a single RDM Embedded key file.
key scan
The process of performing an ordered traversal through all (or a subset of all) occurrences of
a given key field.
L
localhost
A special Domain Name that always refers to the computer on which the application software is running. It is the default domain name used by RDM Embedded utilities and runtime
library.
lock
A multi-user database synchronization mechanism, used to prevent simultaneous updates
to shared data. Locks can be applied to the entire database or to files.
logging
The process of making a copy of the database changes made during a transaction prior to a
commit. Logging is used to support the ability to perform a recovery in the event a failure
occurs during a commit.
M
many-to-many relationship
A relationship between two record types, A and B, such that for each occurrence of type A,
there are many related occurrences of type B and, for each occurrence of type B, there are
many related occurrences of type A. In RDM Embedded, many-to-many relationships can
be implemented using two one-to-many sets through a third, intersection record type.
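The intersection-record idea can be illustrated with a deliberately simplified C sketch. Each intersection record here is a plain pair of identifiers in an array rather than a member of two RDM Embedded sets; the struct, the field names, and the counting helpers are hypothetical and only show how one link record relates one occurrence of A to one occurrence of B.

```c
#include <assert.h>

/* Hypothetical intersection record: one link between an A and a B. */
typedef struct {
    int a_id;  /* identifies the owning occurrence of type A */
    int b_id;  /* identifies the owning occurrence of type B */
} sketch_intersection;

/* Count the B occurrences related to a given A (A's one-to-many side). */
int sketch_count_b_for_a(const sketch_intersection *links, int n, int a_id)
{
    int count = 0;
    assert(n >= 0);
    for (int i = 0; i < n; i++) {
        if (links[i].a_id == a_id) {
            count++;
        }
    }
    return count;
}

/* Count the A occurrences related to a given B (B's one-to-many side). */
int sketch_count_a_for_b(const sketch_intersection *links, int n, int b_id)
{
    int count = 0;
    assert(n >= 0);
    for (int i = 0; i < n; i++) {
        if (links[i].b_id == b_id) {
            count++;
        }
    }
    return count;
}
```

With links (1,10), (1,11), and (2,10), occurrence 1 of type A is related to two B occurrences, and occurrence 10 of type B is related to two A occurrences, which is the many-to-many shape built from two one-to-many relationships.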
member of set
Specifies a one-to-many relationship between record types. One occurrence of the owner
record type is related to many occurrences of a member record type. Also called a set type.
member pointer
Stores set membership linkage information. There is one member pointer stored with a record per set for which the record is a member. Each one contains the database addresses of
the owner record, previous member in the set, and next member in the set.
N
navigation
The process of retrieving records from a database by moving through various navigational
methods. Methods include set navigation, key scanning, and record-type scanning.
network database model
A data representation in which the relationships are explicitly defined and maintained
through sets of owner/members, where any given record type may be the owner of multiple
types of sets and the member of multiple types of sets. Multiple set membership distinguishes the Network database model from the Hierarchical database model.
node
A component of a B-tree, consisting of a page of sorted keys stored in a key file.
normalize
The elimination of redundant record instances that own a new set, resulting in a one-to-many relationship.
O
occurrence
One record instance within a record type, specifically associated with record type scanning
(d_recfrst, d_recnext, d_recprev, d_reclast), where the current occurrence of a record type is
used to bookmark the position on a record type scan. Record occurrences are ordered by
their physical appearance in a data file. The current occurrence is not the same as the current record, although the current record will also be set by the scanning functions.
owner of set
Specifies a one-to-many relationship between record types. One occurrence of the owner
record type is related to many occurrences of a member record type. Also called a set type.
P
page
Files are blocked into contiguous fixed-length segments called pages. A page is the unit of
database I/O performed in RDM Embedded.
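Because pages are contiguous and fixed-length, locating the page that holds a given byte of a file is simple arithmetic. The following C sketch shows that calculation; the function and type names are hypothetical, and the page size is a caller-supplied parameter rather than any value taken from RDM Embedded.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type: which page, and where inside it. */
typedef struct {
    uint64_t page_no;         /* zero-based page number within the file */
    uint32_t offset_in_page;  /* byte offset from the start of that page */
} sketch_page_loc;

/* Map a byte offset in a file to its page and in-page offset. */
sketch_page_loc sketch_locate(uint64_t byte_offset, uint32_t page_size)
{
    sketch_page_loc loc;
    assert(page_size > 0);
    loc.page_no = byte_offset / page_size;
    loc.offset_in_page = (uint32_t)(byte_offset % page_size);
    return loc;
}

/* Byte offset at which a given page starts. */
uint64_t sketch_page_start(uint64_t page_no, uint32_t page_size)
{
    assert(page_size > 0);
    return page_no * (uint64_t)page_size;
}
```

For example, with an assumed page size of 1024 bytes, byte offset 5000 falls in page 4 (which starts at byte 4096) at in-page offset 904.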
path name
The sequence of directories in a hierarchical file system that must be traversed to locate a
particular file.
pointer
In a database, a pointer is data stored in a record occurrence that provides the necessary
information for locating related record occurrences. In a C program, a pointer is a variable
that contains a memory address.
port
Together with an IP address, a port number uniquely identifies an endpoint by which a
TCP/IP connection can be made to another program. In RDM Embedded, each server
(tfserver, dbmirror, dbrep or dbrepsql) identifies the port number that should be used to
locate it. The IP address is normally obtained through a domain name lookup (e.g.
tfs.raima.com is a domain name, and its IP address is 198.168.140.200).
process
An independently executing task or program. An individual execution of an RDM Embedded
application program.
projection
The placement of fields from one record type into one or more new record types.
Q
queue
A first-in-first-out waiting list. Lock requests for a locked resource will be placed at the end of
a queue. When the locked resource becomes available, the first lock request on the queue
will be granted.
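The first-in-first-out granting of lock requests can be sketched in C with a small circular buffer. This is a single-threaded model for illustration only: the `sketch_lock` type, the integer task identifiers, and the fixed queue capacity are assumptions, not part of the RDM Embedded lock manager.

```c
#include <assert.h>

#define SKETCH_QUEUE_CAP 8
#define SKETCH_NO_TASK   (-1)

/* Hypothetical lockable resource with a FIFO queue of waiting tasks. */
typedef struct {
    int holder;                     /* task holding the lock, or SKETCH_NO_TASK */
    int waiting[SKETCH_QUEUE_CAP];  /* circular buffer of queued task ids */
    int head;                       /* index of the oldest queued request */
    int count;                      /* number of queued requests */
} sketch_lock;

void sketch_lock_init(sketch_lock *l)
{
    l->holder = SKETCH_NO_TASK;
    l->head = 0;
    l->count = 0;
}

/* Request the lock. Returns 1 if granted at once, 0 if placed at the
   end of the queue, and -1 if the queue is full. */
int sketch_lock_request(sketch_lock *l, int task)
{
    if (l->holder == SKETCH_NO_TASK) {
        l->holder = task;
        return 1;
    }
    if (l->count == SKETCH_QUEUE_CAP) {
        return -1;
    }
    l->waiting[(l->head + l->count) % SKETCH_QUEUE_CAP] = task;
    l->count++;
    return 0;
}

/* Release the lock and grant it to the first queued request, if any.
   Returns the new holder, or SKETCH_NO_TASK if nobody was waiting. */
int sketch_lock_release(sketch_lock *l)
{
    assert(l->holder != SKETCH_NO_TASK);
    if (l->count == 0) {
        l->holder = SKETCH_NO_TASK;
        return SKETCH_NO_TASK;
    }
    l->holder = l->waiting[l->head];
    l->head = (l->head + 1) % SKETCH_QUEUE_CAP;
    l->count--;
    return l->holder;
}
```

If task 1 holds the lock and tasks 2 and 3 request it in that order, successive releases grant the lock to task 2 and then to task 3.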
R
record
Used synonymously with record type or record occurrence depending on the context in
which the term is used.
record occurrence
One individual instance in a database of a record of a particular type. A database consists of
many occurrences of many different record types. For example, an employee record type
may consist of the fields name, employee_id, job_title, and pay. An employee record occurrence could be "name: Jones, Jim; employee_id: c87101; job_title: engr; pay: 3400".
recovery
The process of completing the transaction of a process that failed during a commit.
redundant data
Identical data that is stored in multiple locations in a database. Typically used to form relationships between tables in a relational database management system.
relational database model
A data representation in which a database is viewed as consisting of two-dimensional
tables, each composed of one or more columns. Inter-table relationships are defined
through use of common column names and data. Tables and columns are analogous to
RDM Embedded records and fields, respectively.
remote procedure call
A programming mechanism that makes a library call appear to operate in the program space
of an application, even though the actual function exists in the program space of another program (called a "server"). A client application places a function identifier and parameter contents into a packet that is first transferred to the server, with results (return code, return
parameter values) transferred back to the caller.
Revision Definition Language
The RDL supplies information to db_REVISE that cannot be derived from a comparison of
the source and destination dictionary files.
root node
The top or start node of a B-tree.
RPC
A programming mechanism that makes a library call appear to operate in the program space
of an application, even though the actual function exists in the program space of another program (called a "server"). A client application places a function identifier and parameter contents into a packet that is first transferred to the server, with results (return code, return
parameter values) transferred back to the caller.
runtime system
The RDM Embedded C language library functions that perform all of the database access
required by an application program while it is executing.
S
schema
A conceptual model of the structure of a database that defines the data contents and relationships. A database definition language specification is an implementation of a particular
schema.
set
Specifies a one-to-many relationship between record types. One occurrence of the owner
record type is related to many occurrences of a member record type. Also called a set type.
set occurrence
An individual instance of a set in which one owner record occurrence has one or more
member record occurrences connected to it.
set pointer
Stores set ownership linkage information. There is one set pointer stored with a record per
set for which the record is an owner. Each one contains a count of the number of members
in the set, the database address of the first member record occurrence, and the database
address of the last member record occurrence in the set.
set scan
The process of performing an ordered traversal through all (or a subset of all) member record occurrences of a given set occurrence.
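The set pointer and member pointer entries describe a doubly linked structure, and a set scan is a walk along it. The C sketch below models that structure in memory. It is an assumption-laden illustration: array indexes stand in for database addresses, new members are always appended at the end of the set, and none of the `sketch_` names belong to the RDM Embedded API (the real navigation functions are the d_find* calls named elsewhere in this glossary).

```c
#include <assert.h>

#define SKETCH_NULL_ADDR (-1)

/* Set pointer: stored with a record for a set it owns. */
typedef struct {
    int count;  /* number of members in the set */
    int first;  /* address of the first member, or SKETCH_NULL_ADDR */
    int last;   /* address of the last member, or SKETCH_NULL_ADDR */
} sketch_set_ptr;

/* Member pointer: stored with a record for a set it belongs to. */
typedef struct {
    int owner;  /* address of the owner record */
    int prev;   /* address of the previous member */
    int next;   /* address of the next member */
} sketch_mem_ptr;

typedef struct {
    sketch_set_ptr set;
    sketch_mem_ptr mem;
    int value;  /* hypothetical data field */
} sketch_record;

void sketch_init(sketch_record *db, int n)
{
    for (int i = 0; i < n; i++) {
        db[i].set.count = 0;
        db[i].set.first = SKETCH_NULL_ADDR;
        db[i].set.last = SKETCH_NULL_ADDR;
        db[i].mem.owner = SKETCH_NULL_ADDR;
        db[i].mem.prev = SKETCH_NULL_ADDR;
        db[i].mem.next = SKETCH_NULL_ADDR;
        db[i].value = 0;
    }
}

/* Connect: insert a member at the end of the owner's set occurrence. */
void sketch_connect(sketch_record *db, int owner, int member)
{
    sketch_set_ptr *sp = &db[owner].set;
    assert(db[member].mem.owner == SKETCH_NULL_ADDR);
    db[member].mem.owner = owner;
    db[member].mem.prev = sp->last;
    db[member].mem.next = SKETCH_NULL_ADDR;
    if (sp->last != SKETCH_NULL_ADDR) {
        db[sp->last].mem.next = member;
    } else {
        sp->first = member;
    }
    sp->last = member;
    sp->count++;
}

/* Set scan: visit every member in order and sum its data field. */
int sketch_set_scan_sum(const sketch_record *db, int owner)
{
    int sum = 0;
    for (int m = db[owner].set.first; m != SKETCH_NULL_ADDR; m = db[m].mem.next) {
        sum += db[m].value;
    }
    return sum;
}
```

The scan starts from the owner's first-member address and follows each member's next pointer until it reaches the null address, so it touches only the members of that one set occurrence.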
slot
A position in a data or key file for storage of a single record or key occurrence.
source database
The database containing the data that is to be revised. This database is used in a read-only
manner.
specified revision
A revision requiring specification by an RDL statement.
split
The separation of a multiple-member set type into two or more set types.
static revision
A revision that can be performed without changing the existing database content or structure.
synchronization
The process of ensuring that, in a multi-user database environment, updates to shared data
are performed serially, one user at a time.
system record
A special record type used to define the "top" record in a network database. There is only
one occurrence of the system record in a database. It is defined by naming "system" as a set
owner in one or more set definitions in the DDL. When a database is opened, the system
record, if it exists, is set as the current owner of all sets for which it is named as owner. It
may not be a set member.
T
task
In an RDM Embedded Application, a task is a block of allocated memory that stores the complete database context for a thread of execution. It must be allocated through the d_opentask function and closed through the d_closetask function. A task represents one user in a
multi-user environment. A task can also represent one database transaction, with all locks
and database updates associated with the transaction.
TFS
A software component within the RDM Embedded system that maintains safe multi-user
transactional updates to a set of files, and responds to page requests. The tfserver utility
links to the TFS to allow it to run as a separate utility. The TFS may also be linked directly
into an application in order to avoid the RPC overhead of calling a separate server.
thread
An independent flow of control within a computer operating system. Differentiated from a
process in that a process may contain one or more threads. Threads within the same process share common (or global) data but have their own stacks, which keep track of the
thread's context. In RDM Embedded Applications, each thread must be associated with its
own task variable, and is treated as a separate user in a multi-user environment.
timeout
An event that occurs when a lock request has waited on a queue longer than a predetermined amount of time. It is used to avoid deadlock.
transaction
A group of related database changes that are written to the database as a single unit during
a commit. The logical consistency of a database is maintained by placing all related
updates within transactions.
transactional file server
A software component within the RDM Embedded system that maintains safe multi-user
transactional updates to a set of files, and responds to page requests. The tfserver utility
links to the TFS to allow it to run as a separate utility. The TFS may also be linked directly
into an application in order to avoid the RPC overhead of calling a separate server.
W
working database
A temporary database created by db_REVISE for use only during the database revision
process. db_REVISE removes the working database when the revision process is complete.
Index
S
SQL
begin 285
close 238
commit 239
create catalog 240
create database 241
create domain 243
create procedure 245
create table 247
create virtual table 252
delete 254
drop database 256
drop procedure 257
end 239, 258
end read only transaction 258
exec 259
execute 259
export 261
import 262
initialize 265
insert 266
lock table 268
open 270
release 272
rollback 273
run 259
savepoint 274
select 275
set 281
set column 283
start 285
unlock table 287
update 288
U
udfAggCall 148, 291
udfAggReset 151, 293
udfAggResult 150, 295
udfCheck 142, 297
udfInit 145, 300
udfScalarCall 147, 302
udfTerm 146, 305
V
vtFetch 168, 308
vtInsert 160, 311
vtRowCount 164, 315
vtSelectClose 170, 317
vtSelectCount 164, 319
vtSelectOpen 166, 321