Using SAS/ACCESS® Software with PC Files and Databases
Forrest Boozer, SAS Institute Inc., Cary, NC

ABSTRACT

This paper discusses the SAS/ACCESS® software products available
under OS/2® and Microsoft Windows and describes the way the products
work in both environments. The paper explains the differences between
the interfaces for DIF files, DBF files, Database Manager, and
AS/400® Data and demonstrates how to use them efficiently. Plans for
future enhancements to the SAS/ACCESS products for PCs are also
discussed.

INTRODUCTION

The explosive use of PCs in today's workplace has been a mixed
blessing. Everyone from administrative assistants to chairmen of the
board has a PC on their desk. Often these users need to access data
from the corporate, departmental, workgroup and personal levels of
the organization. Frequently these levels of data are organized and
administered by different people within the business organization.
For example, corporate data may reside on an AS/400 that is
administered by ISD; departmental data may be managed by the
department's "PC Expert" and reside in OS/2 Database Manager or
dBASE IV; personal data such as sales leads or expenses may be
maintained by the individual in a spreadsheet such as Lotus 1-2-3.

At some point, an individual or organization within the business may
need to collect and analyze data from these disjoint database systems
or write an application that allows users seamless access to the
data. The SAS® System provides applications programmers and users
with a vast number of development and analysis tools for manipulating
data, while the SAS/ACCESS products allow these tools to be used on
data stored in non-SAS files or database management systems (DBMSs).

There are basically two types of SAS/ACCESS interfaces available on
the PC platform. The first interface type simply reads a file of a
specific format. The second type actually makes calls to a DBMS,
which in turn directly manipulates the data under its control and
returns results of queries to the interface.

The advantages and disadvantages of each interface and type of
interface will be discussed in this paper. However, some things are
common regardless of the interface you are using. This paper will
take advantage of the common thread between the interfaces. First, a
model of a full featured SAS/ACCESS product will be presented at a
functional level. Then the properties that are common within a
specific interface type will be presented, followed by a discussion
of the unique properties of each interface of that type.

EXAMPLE DATA

Each example given in this paper illustrates a single feature or
attribute of an interface. Because the illustrations are narrow in
scope, the sample data will be used repeatedly wherever appropriate.
For example, the EMPLOYEE table may be a SAS data file in one example
and a DIF file in the next. The following tables are used for the
examples in this paper.

EMPLOYEE Table

   EMPID  FNAME      LNAME   DEPT
   001    Henrietta  Hacker  ISD
   020    Sammy      Slick   MKT
   003    Justin     Doit    CEO

SALARY Table

   EMPID  SALARY   BONUSCOD
   001    35,000
   020    30,000   1
   003    120,000  10

The EMPLOYEE table simply contains the names, employee IDs and
departments for all employees in a company. Because of the
confidentiality of salary information, it is kept in a separate
table, SALARY, so that access to the information may be restricted.

In the EMPLOYEE table, the EMPID column contains numeric values
representing employee IDs. The FNAME, LNAME and DEPT columns are
character and contain the first names, last names and departments of
the employees.

In the SALARY table all columns are numeric. As in the EMPLOYEE
table, the EMPID column contains employee IDs. The SALARY and
BONUSCOD columns contain the salary and the bonus code that the
employees are to receive.

FEATURES OF A SAS/ACCESS INTERFACE

A SAS/ACCESS interface may have any number of features and abilities.
In Release 6.08 of the SAS System, a full featured SAS/ACCESS product
will have the following components:

• ACCESS Procedure
  o non-interactive descriptor creation
  o interactive, window based descriptor creation
  o data extraction
• DBLOAD Procedure
  o non-interactive table creation
  o interactive, window based table creation
  o access descriptor creation
  o SQL statement
• Interface View Engine
  o read-write data access
  o SQL Procedure Pass-Through Facility

Each of the major components above may support the features listed
below it. However, a component for a particular interface may not
support all of the possible features. Every interface consists of at
least the ACCESS Procedure and Interface View Engine components.

ACCESS Procedure

The basic function of the ACCESS procedure is to provide the SAS
engine supervisor and the interface I/O engine with information
concerning the location and type of data to be retrieved from an
external source. This information is stored in SAS descriptor files.
There are two types of descriptor files, access and view.

All descriptors contain identifying information such as database and
table name or filename, column names and types, and the SAS variable
names and formats that correspond to the columns in the table.
However, access and view descriptors differ in usage.

An access descriptor can be considered a master descriptor containing
a complete description of a single data source. It is created by
first identifying to the ACCESS procedure where and how the data is
stored. The procedure then looks at the data to determine the names
of the columns or fields and the type of data that they contain. It
may also generate default SAS variable names and formats
corresponding to each column. You can then further modify the access
descriptor by changing SAS names and formats and by dropping columns
from the descriptor so that they cannot be accessed by the SAS
System.

A view descriptor contains a subset of the information in an access
descriptor. These descriptors are created based on the information
stored in an access descriptor, not by directly querying the source
of the data. You can select all of the information that is made
available to the SAS System in the access descriptor or select only a
few of the columns. In addition to selecting columns, specific rows
may be selected by specifying a WHERE clause to be stored in the
view. Each time the view is referenced in a SAS program, this WHERE
clause is used to filter the records being returned to the SAS
System.

The ACCESS procedure for all PC based SAS/ACCESS interfaces has the
ability to create these descriptors using interactive windows and to
extract external data into a SAS data file. Optionally, some
SAS/ACCESS products have the ability to create descriptors
non-interactively using procedure statements.

DBLOAD Procedure

The primary function of the DBLOAD procedure is to create tables or
files in an external data source and load them with the data from a
SAS data file or view descriptor. The creation of these tables or
files may be done interactively or non-interactively using either
DBLOAD procedure windows or statements.

In addition to creating external data, the DBLOAD procedure is able
to create access descriptors for the tables and files that it
creates. Optionally, PROC DBLOAD for a particular SAS/ACCESS product
may be able to pass non-query SQL statements to the DBMS. These
statements may be useful for dropping tables, creating indexes on
newly loaded tables or granting database permissions. Not all
SAS/ACCESS products support the DBLOAD procedure.

Interface View Engine

All SAS data sets and views access their data through engines. Each
SAS/ACCESS interface provides an engine that allows the SAS System to
directly access data in an external source without first extracting
it into a SAS data file.

Use of the engine is transparent in a SAS program. Any time a view
descriptor is used with a data step or procedure, the interface
engine is loaded and used to access the actual data. All interface
view engines provide at least read access to external data. Most view
engines provide write access to the data as well.

In addition to providing data access via view descriptors, some
interface engines support the SQL Procedure Pass-Through Facility.
This facility allows you to specify both query and non-query SQL
statements that are passed directly to a DBMS. The interface engine
and SQL procedure are then able to dynamically process the
information returned from the DBMS.

CHARACTERISTICS OF A PC FILE INTERFACE

A PC file interface is simply an interface that reads a specific file
format. The products that create these files typically have no method
by which the SAS System can call them to retrieve data. The only
access to the data is by accessing the file directly without regard
for the program creating the data. This method of accessing the data
is especially useful when the data has been exported from a
"non-DBMS" program such as a spreadsheet or personal scheduler that
does not use DBF or DIF files as the native method of data storage.

The lack of the ability to call a programming interface to retrieve
data presents a problem for storing subsetting information in the
view descriptor. Typically in a SAS/ACCESS interface, DBMS specific
clauses such as WHERE and ORDER BY may be entered into the Selection
Criteria Entry Window. The clauses entered into this window are then
passed down to the DBMS to filter and order records which will be
returned to the SAS System. It is impossible for a PC file interface
to do this because the engine reads the data directly from the file
itself. However, the ability to subset the records returned from a
view descriptor is very desirable. To accommodate this ability, a SAS
WHERE clause may be entered into the Selection Criteria Entry Window
and stored in the view descriptor. When the view descriptor is used
in a SAS program, the WHERE clause is given to the SAS engine
supervisor (which makes the calls into the interface view engines) so
that it will filter records as they are returned from the engine
before giving them to a SAS procedure or data step. The figure below
illustrates the flow of data between a SAS procedure and PC file.

[Figure: SAS Procedure, Engine Supervisor and Interface Engine
(within the SAS System), View Descriptor, PC File]

Data flow between a SAS Procedure and PC file

Unfortunately there is no method by which ordering information can be
stored in the view descriptor of a PC file interface.

PC File DBLOAD Procedures

The DBLOAD procedure for PC files contains all of the features of the
standard DBLOAD procedure described earlier with one exception.
Because this interface operates at the file level, with no DBMS to
handle SAS System requests for data manipulation, the PROC DBLOAD SQL
statement is not supported.
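
The way a stored SAS WHERE clause is applied for PC files can be
pictured with a small sketch. The Python fragment below is only an
illustration, not SAS code: `file_engine`, `supervisor` and
`stored_where` are hypothetical stand-ins for the interface view
engine, the engine supervisor and the WHERE clause kept in the view
descriptor. It assumes nothing about the real internals beyond what
is described above: the engine hands back every record in file order,
and the filtering happens afterwards, one record at a time.

```python
from typing import Callable, Dict, Iterable, Iterator

Record = Dict[str, object]

# Sample rows from the EMPLOYEE table used in this paper (DEPT omitted).
EMPLOYEE = [
    {"EMPID": 1, "FNAME": "Henrietta", "LNAME": "Hacker"},
    {"EMPID": 20, "FNAME": "Sammy", "LNAME": "Slick"},
    {"EMPID": 3, "FNAME": "Justin", "LNAME": "Doit"},
]


def file_engine(rows: Iterable[Record]) -> Iterator[Record]:
    """Hypothetical PC file engine: returns every record, in file order."""
    for row in rows:
        yield row


def supervisor(engine: Iterator[Record],
               where: Callable[[Record], bool]) -> Iterator[Record]:
    """Hypothetical engine supervisor: filters records after the engine
    has read them, before they reach a procedure or data step."""
    for record in engine:
        if where(record):
            yield record


def stored_where(record: Record) -> bool:
    """Stand-in for a WHERE clause stored in the view descriptor."""
    return record["EMPID"] < 10


selected = list(supervisor(file_engine(EMPLOYEE), stored_where))
print([record["FNAME"] for record in selected])
```

Because the engine in this sketch never sees the predicate, it cannot
skip records or return them in a different order, which is consistent
with the limitation on stored ordering information noted above.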

THE DBF FILE INTERFACE

The SAS System interface to DBF files is available in the SAS/ACCESS
Interface to PC File Formats. This interface contains PROC ACCESS,
PROC DBLOAD and an interface view engine.

DBF files are database files created by several PC based database
products, the most notable of which is dBASE. The format of a DBF
file is fairly structured. It contains a header section containing
descriptive information about the data within the file. Data records
within the file are organized in a rectangular manner following the
header information. All data, including numeric values, are stored as
the ASCII character representation of the value. A number of data
types including date, numeric, float, character and logical are
supported in the SAS/ACCESS interface. However, because the data for
a memo type field does not actually exist within the DBF file, it is
not supported.

Another property of DBF files is that each data record is preceded by
a single byte indicating whether or not the record has been marked
for deletion. Marking a record for deletion does not actually remove
it from the file. A separate utility must be executed to remove the
"deleted" records.

DBF File ACCESS Procedure

The ACCESS procedure for DBF files is fairly uncomplicated because
the header information describes the fields contained in the file.
All that is required for the ACCESS procedure to obtain a description
of the file is the name of the file itself. The ACCESS procedure
reads the header information of the file and may create default SAS
names and formats for each of the fields in the file.

In addition to the fields defined in the header of the DBF file, the
ACCESS procedure also treats the field that marks records for
deletion as a data field for the file. This gives you the ability to
associate a WHERE clause with the view descriptor that can filter
"deleted" records so that they are not used by SAS procedures.

DBF File Interface View Engine

The interface view engine for DBF files supports reading, writing and
updating of records. Another feature of this engine that may be
advantageous in some applications is its ability to randomly access
records within the DBF file based on record number. In other words,
if you are using the FSEDIT procedure and want to read observation
2045, the engine can calculate where in the file that record begins
and then position itself on it without reading any other records in
the file.

There are several other things that should be noted about the engine
that are due largely to the lack of a callable programming interface.

Multiple users are able to open the same DBF file for read access
using the engine. However, when the file is being opened for update
access, an exclusive lock is placed on the file and only the user
opening the file for update will be able to access the file.

dBASE and other products using DBF files for data storage can create
index files that their products use to quickly access specific rows
of data or to return rows in a specific order. Because the DBF files
themselves do not contain information concerning the existence or
location of the index files created on them, the interface engine
cannot take advantage of these files.

THE DIF FILE INTERFACE

The SAS System interface to DIF files is available in the SAS/ACCESS
Interface to PC File Formats. This interface contains PROC ACCESS,
PROC DBLOAD and an interface view engine.

The structure of the DIF file itself is somewhat free form and
typically carries very little information about the data it contains
other than the data itself. There is a header portion of the file
which typically only contains the number of vectors (columns or
variables) and tuples (rows or observations) that are contained in
the file. Only two types of data, character and numeric, are
supported in the file. These data types may be intermixed within the
same vector just as they might appear in a spreadsheet. All data
values in a DIF file are stored in ASCII format with each individual
value represented using two variable length records. For example, the
figure below displays a single record for the employee SAMMY SLICK
from the EMPLOYEE.DIF file.

   -1,0
   BOT
   0,2.000000000000000E+01
   V
   1,0
   "Sammy"
   1,0
   "Slick"
   1,0
   "MKT"

DIF File ACCESS Procedure

The process for describing DIF file data is unique among PC based
SAS/ACCESS products. These differences are based on the DIF file's
association with spreadsheets. Frequently both character and numeric
data are intermixed in the same column; this mixing is reflected in
the DIF file as well. Additionally, column names (labels) are not
part of some global header information, but rather are part of the
data itself. Thus, column names are indistinguishable from column
data. To accommodate these file characteristics, the ACCESS procedure
for DIF files gives you the ability to direct and customize the
creation of access descriptors beyond providing the filename and
assigning SAS variable attributes.

You may provide the ACCESS procedure with two pieces of information
that determine the default names that will be associated with the
columns in the file and the row in the file that will be considered
the first row of valid data. By default, the ACCESS procedure will
generate column names in the form of COL0, COL1, COL2, ... for each
of the columns represented in the DIF file and will treat the first
row in the file as the first observation. Typically though, the first
row of these files will contain a row of character values labeling
each column. This row of labels may also be separated from the actual
data values by one or more rows that somehow further distinguish them
from the data. On the .DIF File Access Descriptor Identification
Window you may indicate that the first row of the DIF file should be
treated as column names and how many rows to skip before the actual
data begins. For example, you have a spreadsheet with column labels
in the first row followed by a blank row with actual data beginning
in the third row. To get a proper description of this file, you
should indicate that column names should be generated from the first
row and that two rows should be skipped before the actual data
begins.

An additional enhancement to the .DIF File Access Descriptor Display
window in Version 6.08 of the SAS System is the ability to change the
column types from that determined by the ACCESS procedure. The ACCESS
procedure will read the first row of data (after skipping the number
of rows indicated on the .DIF File Access Descriptor Identification
Window) and use the data type of the values in that row as the
expected type for the values in following rows. This method for
determining column types is at best an educated guess for several
reasons. First, DIF files allow the intermixing of data types within
the same column. The column type in the first row may not be the same
as the predominant (or desired) type. Also, if no data has been
entered for several columns at the end of the first row, the DIF file
may contain absolutely no information concerning those columns for
that row. When this occurs on the first row of data, the ACCESS
procedure will by default expect the values in those columns to be
characters. You may change the column type from that determined by
the ACCESS procedure simply by changing "C" (character) to "N"
(numeric) or vice versa for the individual columns that are
incorrect.
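
The first-row type guess described above can be sketched in a few
lines. The Python below is an illustration under stated assumptions,
not the ACCESS procedure's actual logic: `parse_dif_data` and
`guess_column_types` are hypothetical names, the parser handles only
the data part of a DIF file, and the reading of the value records
(type 0 for numeric, type 1 for a quoted string, type -1 for the BOT
and EOD markers) comes from the published DIF layout rather than from
this paper. The fifth column in the usage lines is invented to show
the default for a column with no value in the first row.

```python
from typing import List, Union

Value = Union[float, str]


def parse_dif_data(text: str) -> List[List[Value]]:
    """Parse the data part of a DIF file into rows of values.

    Every value occupies two lines: "type,number" and a string line.
    """
    lines = [line.strip() for line in text.strip().splitlines()]
    rows: List[List[Value]] = []
    for i in range(0, len(lines) - 1, 2):
        kind, _, number = lines[i].partition(",")
        second = lines[i + 1]
        if kind == "-1":
            if second == "EOD":      # end of data
                break
            if second == "BOT":      # beginning of a new row (tuple)
                rows.append([])
        elif kind == "0":            # numeric value
            rows[-1].append(float(number))
        elif kind == "1":            # character value, stored in quotes
            rows[-1].append(second.strip('"'))
    return rows


def guess_column_types(first_row: List[Value], n_columns: int) -> List[str]:
    """Guess "N" or "C" per column from the first data row only.

    Columns with no value in the first row default to "C".
    """
    types = ["N" if isinstance(value, float) else "C" for value in first_row]
    return types + ["C"] * (n_columns - len(types))


SAMPLE = """\
-1,0
BOT
0,2.000000000000000E+01
V
1,0
"Sammy"
1,0
"Slick"
1,0
"MKT"
-1,0
EOD
"""

rows = parse_dif_data(SAMPLE)
types = guess_column_types(rows[0], n_columns=5)
print(rows[0], types)
```

Since the guess looks at one row only, a later row that stores a
string in the first column would not change the "N" chosen here,
which is why the guess may need to be corrected by hand.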

DIF File DBLOAD Procedure

By default the DBLOAD procedure begins loading data directly in the
first row of the DIF file. No information about SAS variable names or
labels is written to the file. You may indicate that the SAS variable
names or labels be written to the first row of the DIF file on the
.DIF File Load Identification Window or by using the DIFLABEL
procedure statement. When this feature is used, a blank row is
automatically written after the labels to differentiate them from the
actual data, which will begin in the third row. The tables below
illustrate the difference in using the DIFLABEL feature when loading
the employee data set.

   001    Henrietta  Hacker  ISD
   020    Sammy      Slick   MKT
   003    Justin     Doit    CEO

Data loaded into the EMPLOYEE.DIF file without DIFLABEL.

   EMPID  FNAME      LNAME   DEPT

   001    Henrietta  Hacker  ISD
   020    Sammy      Slick   MKT
   003    Justin     Doit    CEO

Data loaded into the EMPLOYEE.DIF file with DIFLABEL.

DIF File Interface View Engine

The unstructured nature of DIF files forces some restrictions on the
interface view engine. The most notable restriction is that the
engine is read-only. Also, the engine is not able to calculate where
a particular record will begin or end. This prevents the engine from
being able to jump directly to any requested observation or to the
previous record. Instead, the engine must position on a known record
prior to the requested record and read forward until the desired
record is found. This problem can be quite obvious when scrolling
backward using the FSBROWSE procedure on a large DIF file. Each time
that you scroll backward, the engine must reposition on the first
record and then read each record until the previous record is read.
This repositioning can cause a noticeable delay, especially if you
are positioned somewhere well beyond the beginning of the file.

Any time the engine encounters a data value whose type does not match
that expected in the view descriptor, that value is treated as a
missing value. For example, if a numeric value is found in a column
that is expected to contain character values only, it will become a
missing value. However, because the product that produced the DIF
file allowed the intermixing of data, it may be advantageous for you
to view the data in a SAS procedure without having these numeric
values converted to missing. In Release 6.08 of the SAS System this
can be accomplished by setting an operating system environment
variable DIFNUMS to YES in your config.sas file. When DIFNUMS is YES,
all numeric values encountered in a character column will be returned
to the SAS System as the character representation of the number
rather than a missing value.

DIF File Date & Time Values

Date values are a special case in DIF files because there is no date
data type, only character and numeric. However, Lotus 1-2-3 software
does allow you to enter and display date values on a spreadsheet.
These values are stored as a numeric representation of the date and
are written to the DIF file as a numeric value. However, the Lotus
1-2-3 numeric representation of a date differs from that of a SAS
software date or datetime value. So, the numeric date value must be
converted between the SAS System representation and the Lotus
representation in order to be meaningful. The DBLOAD procedure and
view interface engines do this automatically for all numeric
variables which have a date, time or datetime SAS variable format.

CHARACTERISTICS OF A DBMS INTERFACE

The second type of SAS/ACCESS interface available on the PC platform
is an interface to a database management system. In these interfaces,
the data is completely controlled by the DBMS. The SAS System makes
requests of the DBMS, which in turn accesses data and returns
information. The relationship between the interface and the DBMS
provides for a number of additional features in these SAS/ACCESS
interfaces. The figure below illustrates the flow of information
between the elements of the SAS System and a DBMS data source.

[Figure: SAS Procedure, Engine Supervisor and Interface Engine
(within the SAS System), View Descriptor, DBMS, Data, External Data
Source]

Relationship between SAS Procedure, SAS/ACCESS Interface, DBMS and
the data

Currently on the PC platform all of the DBMSs that are being accessed
are relational databases. Typically relational data is stored in rows
and columns of a table within a database. The actual file structures
that the DBMS uses for storing and organizing this data are unknown
to applications other than the DBMS itself. These products use SQL as
their primary data manipulation language. Generally, this allows an
interface to build SQL statements for selecting, updating and
inserting data. Additionally, statements for administering the
database such as granting permissions and creating indexes may be
passed to the DBMS.

Selecting and Ordering DBMS Data

The SQL SELECT statement allows a WHERE clause to be used to filter
the rows that are going to be returned from a table. Additionally, an
ORDER BY clause may be appended to it to cause the rows to be
returned in a specific order. The SAS/ACCESS interfaces allow you to
take advantage of these clauses in several ways.
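
The effect of these two clauses can be tried with any SQL database.
The sketch below uses Python's built-in `sqlite3` module purely as a
stand-in for a DBMS; it is not Database Manager or AS/400 Data, and
`select_rows` is a hypothetical helper, not part of any SAS/ACCESS
interface. It shows a WHERE clause and an ORDER BY clause being
appended to a generated SELECT statement so that the database itself
filters and orders the rows before anything is returned.

```python
import sqlite3
from typing import List, Optional, Tuple

# In-memory stand-in for a DBMS table holding the SALARY sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE SALARY (EMPID INTEGER, SALARY INTEGER)")
conn.executemany(
    "INSERT INTO SALARY VALUES (?, ?)",
    [(1, 35000), (20, 30000), (3, 120000)],
)


def select_rows(where: Optional[str] = None,
                order_by: Optional[str] = None) -> List[Tuple[int, int]]:
    """Build a SELECT and append optional WHERE / ORDER BY clauses.

    Plain string concatenation is used only to keep the sketch short;
    it is not a safe way to handle untrusted input.
    """
    sql = "SELECT EMPID, SALARY FROM SALARY"
    if where:
        sql += " WHERE " + where
    if order_by:
        sql += " ORDER BY " + order_by
    return conn.execute(sql).fetchall()


filtered = select_rows(where="SALARY > 30000", order_by="SALARY")
print(filtered)
```

Only the two rows that satisfy the WHERE clause leave the database,
already in the requested order; the caller never sees the third row.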

First, whenever a SAS WHERE or BY statement is used in association
with a view descriptor, the statement is translated into a DBMS
specific WHERE or ORDER BY clause and appended to the SELECT
statement generated by the engine. The translation of the SAS
statement typically requires changing SAS variable names to DBMS
column names and, in WHERE statements, changing SAS specific
expressions to logically equivalent DBMS expressions.

For example, the view descriptor VIEWLIB.SALARY is based on the
Database Manager table SASDEMO.SALARY. The SAS variable name in the
view descriptor for the BONUSCOD column is BONUS and the variable
name for the EMPID column is EMPID. If you want to print a list of
all employees who have a bonus code sorted by EMPID, submit the
following SAS statements.

   PROC PRINT DATA=VIEWLIB.SALARY;
      WHERE BONUS IS NOT MISSING;
      BY EMPID;
   RUN;

In this case the following clauses:

   WHERE ("BONUSCOD" IS NOT NULL)
   ORDER BY EMPID

are added to the SELECT statement generated by the engine. When the
SELECT statement is passed to the DBMS, it can use indexes and any
other means available to it to optimize the retrieval of the data.
Also, only the rows that meet the WHERE clause criteria are passed
back to the SAS interface. This cuts down on the amount of I/O that
takes place between the interface engine and the DBMS as well as the
amount of data conversion that the engine must perform.

The second place that you can take advantage of these DBMS specific
SQL clauses is by storing them in the view descriptor. When creating
or editing a view descriptor you can enter these clauses in the
Selection Criteria Entry Window. The clauses that you enter must be
DBMS specific, using the column names, not SAS variable names. Each
time that the view descriptor is used in a SAS program, the stored
clauses are added to the SELECT statement generated by the engine.

[Display: WHERE and ORDER BY clauses entered as selection criteria
for the view descriptor VIEWLIB.BONUS]

For example, the display above shows WHERE and ORDER BY clauses being
entered to create the view descriptor VIEWLIB.BONUS. Each time that
this view is referenced, only the data where the BONUSCOD value
exists will be returned and it will always be returned in sorted
order. No SAS WHERE or BY statements are required on the procedure or
data step to make this happen.

In cases where a SAS WHERE clause is used in combination with a view
descriptor that already contains a DBMS WHERE clause, the two clauses
are logically ANDed together so that the data returned must meet both
criteria.

Accessing Specific DBMS Rows

When a DBMS processes a SELECT statement, it determines the rows that
meet the WHERE criteria and returns those rows to the application.
Rows are not guaranteed to be returned in any specific order unless
an ORDER BY clause is specified on the SELECT statement. In addition,
there is no way to indicate to the DBMS that the Nth row satisfying
the statement should be returned. This situation presents a problem
for DBMS view engines similar to that of the DIF file engine. There
is no way for the engine to access a specific row number without
fetching each record that satisfies the SELECT statement until the
desired row number is found. For this reason, scrolling backward in a
full screen procedure like FSEDIT may encounter a delay when compared
to scrolling forward. Moving forward simply causes the engine to
fetch the next row from the DBMS. Moving backward means that the
current SELECT statement must be closed and reopened, and each row
fetched until the previous record number is found.

DBLOAD Procedure SQL Statement

In addition to creating and loading tables, the DBLOAD procedure
enables you to pass non-select statements to the DBMS. This feature
is extremely useful in the DBLOAD procedure for dropping old tables
and issuing administrative commands on newly created tables.

   PROC DBLOAD DBMS=DBMGR DATA=SASUSER.SALARY;
      IN 'SAMPLE';
      TABLE=SASDEMO.SALARY;
      SQL DROP TABLE SASDEMO.SALARY;
      LIST ALL;
      LOAD;
   RUN;

   PROC DBLOAD DBMS=DBMGR DATA=SASUSER.SALARY;
      IN 'SAMPLE';
      TABLE=SASDEMO.SALARY;
      SQL CREATE UNIQUE INDEX SASDEMO.EMPID_NDX
          ON SASDEMO.SALARY (EMPID);
      SQL GRANT SELECT ON SASDEMO.SALARY
          TO PUBLIC;
   RUN;

In the example above, the DBLOAD procedure for Database Manager is
used twice: once to create the SASDEMO.SALARY table in the SAMPLE
database and once to define an index and grant permissions on the
table. These operations must take place in separate batch mode
invocations of the DBLOAD procedure because SQL statements are
immediately passed to the DBMS by the procedure, but tables are not
actually created until the RUN statement is encountered. So, although
you can drop and recreate a table in one invocation of the DBLOAD
procedure, indexes cannot be created and permissions cannot be
granted on the new table until after the RUN statement because the
table will not exist until that time.

SQL Procedure Pass-Through Facility

An additional engine feature that is available on some interfaces is
the SQL Procedure Pass-Through Facility. This feature is used
exclusively with the SQL procedure to allow both query and non-query
SQL statements to be passed to a database management system for
execution.

The pass-through feature is especially useful for joining data from
multiple tables in a single database. This is because a single SQL
SELECT statement that joins the data may be passed to the DBMS. The
DBMS, not the SAS System, then performs the join using any
optimizations available to it and returns only the data that
satisfies the SELECT statement.
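
The difference between letting the DBMS perform a join and joining
after the fact can be sketched outside the SAS System. The Python
below uses the built-in `sqlite3` module as a stand-in for a DBMS; it
is an illustration under that assumption, and `join_in_dbms` and
`join_in_client` are hypothetical names rather than anything provided
by the SQL procedure. The first function sends one joined SELECT to
the database, as pass-through would. The second fetches each table in
full and joins the rows itself, which is roughly the position a
client is in when it can only read whole tables.

```python
import sqlite3
from typing import List, Tuple

Row = Tuple[str, str, int]

# In-memory stand-in for a DBMS holding the two sample tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EMPLOYEE (EMPID INTEGER, FNAME TEXT, LNAME TEXT);
    CREATE TABLE SALARY (EMPID INTEGER, SALARY INTEGER);
    INSERT INTO EMPLOYEE VALUES
        (1, 'Henrietta', 'Hacker'),
        (20, 'Sammy', 'Slick'),
        (3, 'Justin', 'Doit');
    INSERT INTO SALARY VALUES (1, 35000), (20, 30000), (3, 120000);
""")


def join_in_dbms() -> List[Row]:
    """One SELECT is sent; the database joins, sorts and returns rows."""
    sql = """
        SELECT FNAME, LNAME, SALARY.SALARY
          FROM EMPLOYEE, SALARY
         WHERE EMPLOYEE.EMPID = SALARY.EMPID
         ORDER BY SALARY.SALARY
    """
    return conn.execute(sql).fetchall()


def join_in_client() -> List[Row]:
    """Both tables are fetched in full and joined outside the database."""
    employees = conn.execute(
        "SELECT EMPID, FNAME, LNAME FROM EMPLOYEE").fetchall()
    salaries = dict(conn.execute(
        "SELECT EMPID, SALARY FROM SALARY").fetchall())
    joined = [(fname, lname, salaries[empid])
              for empid, fname, lname in employees
              if empid in salaries]
    return sorted(joined, key=lambda row: row[2])


print(join_in_dbms())
print(join_in_client() == join_in_dbms())
```

Both functions give the same answer on three rows. The client-side
version, however, always transfers every row of both tables, whatever
the join keeps, and cannot use any index the database may hold.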
Database Manager Pass-Through Facility
An experimental feature, the SOL Procedure Pass-Through Facility,
was added to the Database Manager interface with the production
release of Release 6.08 of the SAS System. This feature was not
announced or documented at the time of the production release.
Initial reports from sites testing the interface are very favorable and
indicate that SOL Pass~ Through for Database Manager is fairly
stable. To acquire a copy of the documentation simply call SAS
Institute Technical Support Division and request the preliminary
documentation for SAS/ACCESs«' Interface for Database Manager:
PROC SQL;
TITLE "EMPLOYEE SALARIES';
SELECT * FROM CONNECTION TO AS400
(SELECT FNAME, LNAME, SALARY
FROM SASDEMO/EMPLOYEE,
SASDEMO/SALARY
WHERE EMPLOYEE.EMPID = SALARY.EMPID
ORDER BY SALARY);
SOL Procedure Pass- Through Facility.
In the example above, the AS/400 tables SASDEMO/EMPLOYEE
and SASDEMO/SALARY were joined to produce a listing of
employees and their salaries sorted by salary. The data was joined
and sorted on the AS/400 before being returned to the SOL
procedure. The output below shows the results produced by these
statements.
Setting Up the Database Manager Interface
Aside from actually installing Database Manager and defining
databases, there are several things that must be done prior to using
the interface.
All users using the Database Manager must be defined through
User Profile Management Services. The userids maintained in this
service are used to maintain the security of the data stored in
Database Manager. Individual users must begranted specific
permissions to access data in tables created by other users. If ,you
do not have any permissions on a specific table, Database Manager
will not allow you any access to the data.
EMPLOYEE SALARIES
FNAME
LNAME
SALARY
Sammy
Henrietta
Justin
Sli,ck
Hacker
Ooit
30000
35000
120000
Before the SAS System can access Database Manager tables, the
database must know that the SAS engine may try to access -it arid
what the interface may try to do. To do this an application must be
"bound" to each individual database that it needs to access. The
SAS/ACCESS interface to Database Manager supplies both a bind
module and a command file (ACCESS.CMD) for binding the
interface to Database Manager databases.
Several methods are also available with the SQL procedure to
create a SAS data file or a PROC SQL view reflecting the results
of the query. These methods allow other SAS procedures to
operate on the joined data.
Without pass-through, the SQL procedure has to use view
descriptors to access the DBMS data. This means that all data
accessible to each individual view is returned to the SQL
procedure, which then performs the join. Using view
descriptors for joins has several drawbacks: DBMS
optimizations such as indexes are not available to the SQL
procedure, and sometimes large amounts of data not needed for the
join are returned from the DBMS just to be discarded.
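The cost difference described above can be illustrated outside the SAS System. The following Python sketch is only an analogy: it assumes an in-memory SQLite database standing in for the DBMS, and the function names are hypothetical. The EMPLOYEE rows come from the sample table in this paper, the pairing of salaries to EMPID values is inferred from the sample output, and the two extra SALARY rows (900 and 901) are invented to show data that a client-side join fetches only to discard.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory stand-in for the DBMS holding the two sample tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE employee (empid TEXT PRIMARY KEY, fname TEXT, lname TEXT);
        CREATE TABLE salary (empid TEXT PRIMARY KEY, salary INTEGER);
        """
    )
    conn.executemany(
        "INSERT INTO employee VALUES (?, ?, ?)",
        [("001", "Henrietta", "Hacker"),
         ("020", "Sammy", "Slick"),
         ("003", "Justin", "Doit")],
    )
    conn.executemany(
        "INSERT INTO salary VALUES (?, ?)",
        [("001", 35000), ("020", 30000), ("003", 120000),
         ("900", 1), ("901", 2)],  # hypothetical rows with no matching employee
    )
    return conn


def join_in_dbms(conn):
    """Pass-through style: the DBMS joins and sorts, only result rows come back."""
    rows = conn.execute(
        "SELECT e.fname, e.lname, s.salary "
        "FROM employee e JOIN salary s ON e.empid = s.empid "
        "ORDER BY s.salary"
    ).fetchall()
    return rows, len(rows)


def join_in_client(conn):
    """View-descriptor style: fetch every row of both tables, then join locally."""
    employees = conn.execute("SELECT empid, fname, lname FROM employee").fetchall()
    salaries = conn.execute("SELECT empid, salary FROM salary").fetchall()
    transferred = len(employees) + len(salaries)
    names = {empid: (fname, lname) for empid, fname, lname in employees}
    joined = [names[empid] + (amount,) for empid, amount in salaries if empid in names]
    joined.sort(key=lambda row: row[2])
    return joined, transferred


if __name__ == "__main__":
    db = build_demo_db()
    print(join_in_dbms(db))
    print(join_in_client(db))
```

Both functions return the same joined rows; the second value in each result is the number of rows that had to cross from the database to the caller, which is larger for the client-side join.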
To perform the actual bind, you must have created the SASDEMO
userid in User Profile Management Services and currently be
logged in as that user. Then enter the command:
ACCESS database-name
-CONFIG configuration-file-name
where database-name is the name of the database to which the
"bind" is being performed and configuration-file-name is
the name of the config.sas file that you use to run the SAS System.
The ACCESS command will locate the bind module and issue the
SQLBIND command to Database Manager. Once the bind has been
successfully accomplished, you must issue the following SQL
statement to permit other users to access the database using the
SAS/ACCESS interface:
GRANT EXECUTE ON PROGRAM SASDEMO.SASOII TO PUBLIC
THE OS/2 DATABASE MANAGER INTERFACE
The SAS/ACCESS Interface to OS/2 Database Manager contains
the ACCESS and DBLOAD procedures, an interface view engine
and most of the features that may be found in these components.
The most notable feature that is currently not available is the set of
ACCESS procedure statements needed to create access and view
descriptors non-interactively.
OS/2 Database Manager is the IBM® relational database
management system that runs under OS/2. The latest release of
Database Manager is packaged in the IBM Extended Services for
OS/2 and runs under OS/2 1.31 & 2.0 as a 16 bit application.
Database Manager databases defined on other machines
connected by a network may be accessed on a local machine using
the Remote Data Services component of Database Manager.
THE AS/400 DATA INTERFACE
The SAS/ACCESS Interface to AS/400 Data is composed of the
ACCESS procedure and an interface view engine. The ACCESS
procedure for this interface does not support statements that allow
the non-interactive creation of access and view descriptors.
However, the interface view engine supports the SQL Procedure
Pass-Through Facility, allowing SQL statements and queries to be
sent directly to the AS/400 for execution.
Although Release 6.08 of the SAS System is a 32 bit application
running under OS/2 2.0, the SAS/ACCESS interface is still able to
use the 16 bit Database Manager at a cost. The cost is that each
call made to a Database Manager function must be "thunked".
Thunking is the process by which program control and parameters
are passed between 32 bit and 16 bit applications. During the
transfer of control, the program stack must be manipulated and all
pointers converted between the 32 bit flat memory model and the
16 bit segmented memory model. These conversions must take
place while calling into a function and on returning from it.
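To give a feel for the pointer conversion involved, the Python sketch below models one direction of the arithmetic. It is an illustration under an assumption this paper does not state: that 16 bit selector:offset pairs map onto the flat address space through the "tiled" selector layout commonly described for OS/2 2.x, in which each 64K segment corresponds to one selector. The function names are hypothetical, and a real thunk also has to switch stacks and copy parameters, which is not modeled here.

```python
from typing import Tuple

SEGMENT_BITS = 16              # each 16 bit segment spans 64K
TILED_LIMIT = 1 << 29          # 8192 selectors x 64K (assumed tiled region)


def far16_to_flat(selector: int, offset: int) -> int:
    """Convert a 16:16 selector:offset pair to a 0:32 flat address."""
    if not (0 <= selector <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("selector and offset must each fit in 16 bits")
    # Drop the three low selector bits (table indicator and privilege level),
    # leaving the descriptor index, which selects the 64K tile.
    return ((selector >> 3) << SEGMENT_BITS) | offset


def flat_to_far16(flat: int) -> Tuple[int, int]:
    """Convert a 0:32 flat address to a 16:16 selector:offset pair."""
    if not 0 <= flat < TILED_LIMIT:
        raise ValueError("address lies outside the assumed tiled region")
    # Rebuild the selector from the tile number and set the three low bits.
    selector = ((flat >> SEGMENT_BITS) << 3) | 0x7
    return selector, flat & 0xFFFF


if __name__ == "__main__":
    sel, off = flat_to_far16(0x00123456)
    print(hex(sel), hex(off), hex(far16_to_flat(sel, off)))
```

Every pointer argument and every pointer returned by a 16 bit function needs a conversion of this kind, which is why the cost is paid on each call into Database Manager.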
The AS/400 provides a computing environment where hardware
and software are highly integrated. The operating system,
OS/400®, has many built-in features including a relational database
management system. Data in this relational database may be
managed natively via non-SQL means or in an SQL collection. The
SAS/ACCESS interface may access data managed by either
means. The database functions of the OS/400 operating system are
integrated to the extent that SQL queries for non-SQL managed
data are supported.
The AS400MSG environment variable is used to indicate the
library/file name of the compiled SASFMSG CL program. If the
name of the program is changed to anything other than SASFMSG,
or if the library where it resides is not in your *LIBL library list, then
this variable should be set. The format of the value should be
library/filename. If this environment variable is not set, a default
of *LIBL/SASFMSG will be used.
The SAS/ACCESS Interface to AS/400 Data is able to access data
on an AS/400 by using the Remote SQL Application Program
Interface (API) of the IBM PC Support/400 software. Using this
interface, SAS/ACCESS software is able to pass SQL statements
and queries to the PC Support product, which in turn sends the
request to the AS/400 via Communications Manager and returns
the results to the PC.
Like OS/2 Database Manager, PC Support/400 is a 16 bit
application. As discussed with the Database Manager interface, all
calls made into its API functions must be thunked. The figure below
illustrates the flow of information between the SAS procedure and
the AS/400 database file.
[Figure: SAS Procedure, SAS System (Engine Supervisor), Interface Engine (with View Descriptor), PC Support/400, AS/400 Host (PCSIPP09), Database File]
Data Flow between the SAS System and AS/400
Options Affecting the AS/400 Data Interface
AS400CMT
Indicates the level of commitment control and locking that the
interface should operate under. Valid values are:
NONE   No commitment control. Default locking is used such
       that read or update locks are acquired as a row is read
       and released when the next row is read.
CHG    Change control. Update locks for updateable cursors
       only. Update locks for unchanged records are
       released as the cursor moves to the next row. Locks
       for changed rows are released at commit or rollback.
       Otherwise no locks are acquired.
CS     Cursor Stability control. Always uses a read or update
       lock, as appropriate. Read locks and update locks for
       unchanged records are released as the cursor moves
       to the next row. Locks for changed rows are released
       at commit or rollback.
ALL    Full commitment control and locking. All locks acquired
       are maintained until a commit or rollback occurs.
If a level of commitment control is specified (any value other than
NONE), then commitment control is used. However, use of
commitment control prohibits the use of non-journaled database
files. If no commitment control is used (i.e. NONE), then both
journaled and non-journaled database files can be accessed. The
default value for this variable is NONE.
If you are using PC Support/400 Version 2 Release 1, the only valid
values for this variable are NONE and ALL.
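The rules for AS400CMT can be summarized as a small check. The Python sketch below is a hypothetical helper, not part of the SAS/ACCESS interface; the function name and both flag parameters are assumptions made for illustration. It encodes only what is stated above: the four valid values, the restriction to NONE and ALL under PC Support/400 Version 2 Release 1, and the fact that any level other than NONE rules out non-journaled database files.

```python
VALID_LEVELS = ("NONE", "CHG", "CS", "ALL")
V2R1_LEVELS = ("NONE", "ALL")


def check_commit_level(level, pc_support_v2r1=False, uses_nonjournaled_files=False):
    """Return the normalized AS400CMT value, or raise ValueError if it cannot be used."""
    level = level.strip().upper()
    if level not in VALID_LEVELS:
        raise ValueError("AS400CMT must be one of " + ", ".join(VALID_LEVELS))
    if pc_support_v2r1 and level not in V2R1_LEVELS:
        raise ValueError("PC Support/400 Version 2 Release 1 accepts only NONE and ALL")
    if level != "NONE" and uses_nonjournaled_files:
        raise ValueError("commitment control prohibits non-journaled database files")
    return level


if __name__ == "__main__":
    print(check_commit_level("chg"))
    print(check_commit_level("NONE", uses_nonjournaled_files=True))
```

A site that must read non-journaled files is therefore limited to NONE, whatever release of PC Support/400 it runs.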
For the features supported, the abilities of the AS/400 interface
follow very closely those mentioned earlier in the
"CHARACTERISTICS OF A DBMS INTERFACE" section.
However, these abilities may be altered and performance improved
by using a number of options that have been provided by the AS/400
interface. These options are passed to the interface by using OS/2
environment variables. Values may be assigned to these variables
by using the -SET system option in your config.sas file. For
example, the line:
-SET AS400CMT CHG
may be added to your config.sas file to assign the value CHG to
the AS400CMT environment variable. The following environment
variables are used with the AS/400 interface:
AS400DEC
Indicates the numeric decimal separator to be used. Valid values
are a period (.) or comma (,). The default value is period (.).
AS400BUF
Defines the size of the buffer used for data transfers. Valid values
are integers from 1 to 31 and represent the size of the buffer in
kilobytes. The default value is SK.
AS400UPD
Indicates if rows returned from a SELECT statement may be updated.
A value of YES indicates that SELECT statements should be
invoked so that rows can be modified. NO would not allow updates
of selected rows.
In addition to the ability to update rows, this variable affects I/O with
the AS/400. When a value of YES is used, records are returned from
the AS/400 one at a time and rows can be fetched from several
different tables at the same time. However, if a value of NO is used,
rows are returned from the AS/400 in blocks and rows can only be
fetched from one table at a time. If you only intend to read data from
one table during the execution of a procedure, then you may want to
set this environment variable to NO to increase performance.
If this environment variable is not set, a default of YES is used.
AS400PLU
The Partner Logical Unit name by which Communications Manager
recognizes the AS/400 host. If this environment variable is not set,
a default of 5250PLU is used.
AS400MSG
Messages for SQL errors that occur on the AS/400 are not available
to the interface through PC Support's remote SQL API. In order for
the interface to get text for these error messages, a Command
Language (CL) program, SASFMSG, must be uploaded and
compiled on the AS/400 when the interface is installed.
When the interface encounters an SQL error during an operation,
information concerning the error is collected and passed to the
SASFMSG program on the AS/400. This program will take the
information and fetch the appropriate message from the AS/400
SQL message file, sending it back to the interface on the PC.
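As a rough picture of how -SET lines and defaults combine, the Python sketch below collects the settings from the text of a config.sas file. It is not how the SAS System reads its configuration; the function name is hypothetical, the parsing is deliberately simplistic, and the library name MYLIB in the usage example is made up. Only defaults that this paper states unambiguously (AS400CMT, AS400DEC and AS400UPD) are filled in.

```python
# Defaults stated in this paper; other variables are left unset unless given.
DEFAULTS = {"AS400CMT": "NONE", "AS400DEC": ".", "AS400UPD": "YES"}


def read_set_options(config_text):
    """Collect '-SET name value' lines into a dict layered over the defaults."""
    settings = dict(DEFAULTS)
    for line in config_text.splitlines():
        parts = line.split(None, 2)   # option keyword, variable name, value
        if len(parts) == 3 and parts[0].upper() == "-SET":
            settings[parts[1].upper()] = parts[2].strip()
    return settings


if __name__ == "__main__":
    sample = (
        "-SET AS400CMT CHG\n"
        "-SET AS400UPD NO\n"
        "-SET AS400MSG MYLIB/SASFMSG\n"   # MYLIB is a hypothetical library
    )
    for name, value in sorted(read_set_options(sample).items()):
        print(name, "=", value)
```

In this example AS400UPD is set to NO, the choice suggested above for a procedure that only reads from a single table.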
AS400DAT & AS400TIM
Data types for dates, times and timestamps were added to OS/400
in Version 2 Release 1.1. To accommodate these data types, two
OS/2 environment variables are used to determine the format of the
date and time values returned to the engine from the AS/400. The
name of the environment variable used for the date format is
AS400DAT; for time format it is AS400TIM. The following table
shows the possible values for these variables and which values are
valid for a particular format.
Variable Value    Valid for AS400DAT    Valid for AS400TIM
YMD               Y                     N
MDY               Y                     N
DMY               Y                     N
JUL               Y                     N
HMS               N                     Y
ISO               Y                     Y
USA               Y                     Y
EUR               Y                     Y
JIS               Y                     Y
By default, date values will be returned from the AS/400 in the format
specified by the QDATFMT system value on the AS/400. This same
value should be specified as the value for the AS400DAT
environment variable. In OS/400 Version 2 Release 1.1 there is no
system value for the time format; the default is HMS. However, if
the time format is modified by another means (such as modifying
the time format associated with the PC Support job that is
communicating with the engine), then that time format should
be specified for the AS400TIM environment variable.
FUTURE ENHANCEMENTS TO PC BASED SAS/ACCESS
INTERFACES
To a large extent, enhancements to SAS/ACCESS interfaces are
driven by demand from you. The first three interfaces discussed in
this paper, DIF Files, DBF Files and Database Manager, are
complete with the exception of providing ACCESS procedure
statements allowing non-interactive creation of descriptors. The
AS/400 interface also lacks this feature as well as the DBLOAD
procedure. All of these features are under consideration to become
enhancements; however, availability has not been determined. If
you have a specific need for an enhancement to these interfaces,
please contact your SAS Software Representative.
A new interface is currently under development to be an add-on
product under Release 6.08 of the SAS System for Microsoft
Windows. The SAS/ACCESS Interface to SYBASE and SQL Server
will be a full feature interface providing ACCESS and DBLOAD
procedures with windowed and non-interactive modes of operation.
The view engine of this new interface will also support the SQL
Procedure Pass-Through Facility. The SQL Server interface is
being built against Release 4.2 of the SQL Server Client and will
support stored procedures and triggers.
SUMMARY OF SAS/ACCESS AVAILABILITY
The SAS System currently has Release 6.08 production releases
running under both OS/2 2.0 and Microsoft Windows 3.1, and
Release 6.09 is under development for Microsoft Windows
NT. The interfaces discussed in this paper are either available with
the Release 6.08 production release or will be released as an
add-on product some time after the initial release of the SAS
System.
All of the following interfaces, except the interface to SYBASE and
SQL Server, are available with the production release of Release
6.08 of the SAS System.
• SAS/ACCESS interface to PC File Formats
• SAS/ACCESS interface to OS/2 Database Manager
• SAS/ACCESS interface to AS/400 Data
• SAS/ACCESS interface to SYBASE and SQL Server
The DBF and DIF file interfaces discussed in this paper are
packaged together into the SAS/ACCESS Interface to PC File
Formats product.
Although each of the interfaces discussed in this paper is available
on the PC platform, they are not always supported on both the OS/2
and Microsoft Windows operating environments. The table below
reflects the availability of the SAS/ACCESS products in these
environments.
SAS/ACCESS Product       OS/2    Microsoft Windows
PC File Formats          Y       Y
Database Manager         Y       N
AS/400 Data              Y       N
SYBASE and SQL Server    N       Y
CONCLUSION
This paper has discussed the components and features of the
SAS/ACCESS products that are available on the PC platform. Each
interface was discussed with special emphasis on the
characteristics and options that make them unique. You should
have gained an understanding of the abilities of each interface and
the principles upon which they are based. Using this understanding
you should be able to use these interfaces efficiently, avoiding the
pitfalls that may exist in some.
SAS and SAS/ACCESS are registered trademarks or trademarks
of SAS Institute Inc. in the USA and other countries. IBM, AS/400,
OS/2 and OS/400 are registered trademarks or trademarks of
International Business Machines Corporation. ® indicates USA
registration.
Other brand and product names are registered trademarks or
trademarks of their respective companies.