Download A Hierarchical Database Engine for Version 6 of the SAS' System

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Oracle Database wikipedia , lookup

IMDb wikipedia , lookup

Entity–attribute–value model wikipedia , lookup

Microsoft Access wikipedia , lookup

Open Database Connectivity wikipedia , lookup

Ingres (database) wikipedia , lookup

Extensible Storage Engine wikipedia , lookup

Concurrency control wikipedia , lookup

Functional Database Model wikipedia , lookup

Relational model wikipedia , lookup

Database wikipedia , lookup

Microsoft Jet Database Engine wikipedia , lookup

Clusterpoint wikipedia , lookup

ContactPoint wikipedia , lookup

Database model wikipedia , lookup

Transcript
A Hierarchical Database Engine for the Version 6 SAS· System
Barbara A. Barrett, SAS Institute Inc., Austin, TX
Using a Hierarchy with the SAS System
ABSTRACT
The challenge for a view Interface engine for a hierarchical database
Is that the SAS procedures expect to process taibuJar data. A hlerar·
chy contains not one typo of record, but many, and the records
repeat in groups that V81Y in dimension. Furthermore. while each
type of record is always related to at least one other type, not every
typo of record is related to every other typo. A number of issues
arise. What is an observation? What side effects does updating a
single record cause? What does it mean to delete an observation?
How do you qualify groups of records? What happans if portions
of the relationships are missing? The SASIACCESS Interface to
SYSTEM 2000 software answers these questions in ways that sur...
face the power of the database to the SAS System in a simplle and
straightforward manner. Consider the following examplle.
The first Version 6 SAS engine for hierarchical databases was built
for SYSTEM 2000" Data Management Software. SYSTEM 2000
software Is a hierarchical data management system for mainframe
computer systems ruMing under IBM· MVS and CMS host systems. A hierarchical database represents one-to-many relationships
among data records at different levels. This paper describes how
to use Version 6 of the SAS System to define, load, query, and
update a SYSTEM 2000 database.
INTRODUCTION
An engine Is a component of the Version 6 SAS System that reads
from or writes to a file. Each engine allows the SAS System to
aocess files with a particular fonnat. An interface view engine reads
and writes to files that have not been formatted by the base SAS
engine. The SASIACCESS· ilterface to SYSTEM 2000 software
i1cIudes an interface view engine for reading and writing SYSTEM
2000 databases. It also includes tha ACCESS procedure for definIng a new typo of SAS file caJled a descrtptor file, the DBLOAD pr0cedure for creating and loading databases, and the QUEST
procedure for using the SYSTEM 2000 languages directly.
A QUALITY CONTROL APPUCATION
This examplle is a quality control application. A manufactuf1ng company assembtes small power too!s. Technicians in the test area
monitor a number of variables, as well as track defects. Engineers
use the SASlQC" product to analyze whether or not the process
IS m control. Dunng each shHt, data are collected in SAS data files
on microcomputers in the test area. Of course, much of the data
are redundant, such as the data that JcJentHy the process or product.
Once an hour, the collected data are moved to a SYSTEM 2000 hier·
archical database on a mainframe.
SYSTEM 2000 DATABASES
A SYSTEM 2000 database Is hierarchical in that you store and
access data according to defined retationships between groups of
related data. In addition, SYSTEM 2000 databases include indexes,
qualification and sorting capabilities, passwords for securing
access, both Multi-User"' and single-user execution enviroMlents,
transaction joumaling, and rollback recovery.
A quality control analysis looks at ~her attributes or variables. Attributes are characteristics that either exist or do not, such as whether
a behander has a good plug or a bad plug. Variables are character·
istics that can be measured, such as the current drawn by an electric
drill. This application looks at both. Figure 1 shows an outDne of
the database.
Advantages of Hierarchical Databases
Product type
Status
The main reason to choose a hierarchy to represent Information is
to gain a porformance benef~. The typo of application that benef~s
from using a hierarchical database is one that contams several types
of records, each related to the other in some predictable way. As
separate fiies, the different record typos contain duplicate informa·
tIon, and are often quite large. But when you put these files in a hierarchical database, you can reduce both the amount of storage
required to hold the records and the processing time required to
access them. All data management software systems that handle
hierarchies are alike in that the software stores Information about
the data relationships in order to provide the performance advantage. Beyond that, different packages offer different benefits. For
example, SYSTEM 2000 software provides item-level security,
shared update access, and more.
I
Serial number
I
Lot number
Defect
Hour measured
Measurement type
Measurement value
Figure 1 Outline for Qualify Control Database
Defining the Hierarchy
The outline of the database reflects the hierarchical structure. Product type and status are at the top of the hierarchy, at level zero.
Each product typo, ike an electric screwdriver, has many serial
numbers that are tested, so serial number is at level one. Eadl serial
number may have several defects, so defect data is at level two.
The combination of information from level zero to level two is a path
in the database. Each product typo is measured many times, so
measurement data is at level one. Product typo plus measurement
data Is another path in the database, separate from the defect path.
Some data co.llections are Intrinsically hierarchical. Departments
have employees, experiments have resultS, accounts have transactions, ,and so on. These are all examples of one-to-many relationships occuning naturaUy in the data. While all these collections can
be stored In multiple relational tables, a hierarchical database can
be a very convenient alternative because you do not have to request
a valu&-based Join every time you need to hook tables together. The
hieran:hical database does ~ automatically.
429
Creating the Database
Loading Additional Records
There are two ways to initially set up the SYSTEM 2000 database
with the structure shown in Figure 1. The two methods are the
DBLOAO procedure and the QUEST procedure. Either procedure
can be run in batch or display manager mode.
You can load records into an existing database either with the
DBLOAD procedure or the APPEND procedure. In the following
example, collected data are loaded every hour. Both procedures
load data through the SYSTEM 2000 view interface engine. All
engines operate from special SAS files called view deSCriptors. A
view descriptor contains a list of database items that you want to
process with the SAS System. When you define a new database,
the DBLOAO procedure automatically creates a view descriptor. Or,
you can define your own view descriptor using the ACCESS procedure, described later. The following statements load
SASDATA.DEFECT into a database using a view descriptor called
S2KDATADEFECT. The defect records may be for existing serial
numbers or for new ones, or both.
With the DBLOAD procedure, you start with a SAS data file, for
example, SASDATA.DEFECT. which contains one hour's worth of
defect information. The new database is patterned after the input
file. You can define a single-path, multilevel database. The DBLOAD
procedure loads the input data immediately after creating the database, unless you choose otherwise. Submit the following statements in batch mode to define a three-level database. Other
statements are avaHable to rename variables, drop variables, define
indexes, and more.
proc dbload dbms=s2k Ifata=sasdata.defect;
viewdesc = s2kdata.defect;
load;
run;
proc db10ad dbms=s2k data2sasdata.defect;
dbn=quality;
s2kpw e demo;
level serial>d lot=2 defect.,2;
load;
run;
Creating Descriptor Files
An access descriptor file is a special type of SAS file created by the
ACCESS procedure. The access descriptor is a fist of all the items
in the database, along with information to access the database.
Because SYSTEM 2000 software supports password security, you
can also set up different access descriptors for each password,
where each password limits the set of items in the descriptor.
In display manager mode, the DBLOAD procedure displays two windows. In the first, you identify the database and provide a password.
In the second, you indicate those variables on SASDATA.DEFECT
that you want included in the database. You also indicate the structure of the database by typing in the hierarchical level of each field.
You can define indexes now, or later with the QUEST procedure.
Screen 1 shows the SYSTEM 2000 Load Display Window.
A view descriptor is another special type of file created by the
ACCESS procedure. View descriptors are the means by which the
interface view engine reads and writes to a database. You create
a view descriptor using the access deSCriptor as a blueprint. A view
descriptor contains a list of database items, as well as criteria for
subsetting database records by value. The person who uses the
Quality control database does not need to know anything about
how the database is organized. He just needs to know the name
of a view descriptor. You choose whether you require the person
to enter a database password.
+DB!.OAll, DATABASI! QUALITY- _______________________ . _____ ._
I Co .... and
.~.>
SYS'lEK 2000(r) !.oad. Display Kindo ..
Database, g1JJU.ITY
.=< CO,
- "
,"
-
-
Component HaOle
PRODUCT
STATUS
SSI\IAL
, '"
DEFECT
Illd.e~
SAS llama
format
PRODUCT
STATUS
SERIAL
".
'"
DUECT
(-- input
(-- output
no.
USTI2.
US712.
$20.
The DBLOAD procedure creates a default access descriptor when
you create a new database. However, you will probably find reasons
to create your own, such as to limit access to fields. To create an
access descriptor, use the ACCESS procedure.
proc access accdesc*s2kd:ata.qual1ty funct1on=c;
run;
Screen 1
If you have more than one SASIACCESS product, an Engine Selection Window prompts you to select the database management system you want. Put your cursor on SYSTEM 2000 and press ENTER.
The Access Descriptor Identification Window, shown in Screen 2,
appears next. Fill in the database name, your password, and
whether the database is under single-user or Multi-User access,
then press ENTER.
SYSTEM 2000 Load Display Window
The QUEST procedure is an alternative way to create a database.
You can define many paths with the QUEST procedure. You can
also use the DBLOAD procedure to define one path, and the QUEST
procedure to define others. The follOwing statements define two
paths in one database, one with three levels, the other with two
levels. No data are loaded.
proc quest;
user,demo; new data base is quality;
define;
1. product (char x(30));
2. status (lion-key char X(SJJ;
100. products (record);
101* serial (int 9(S) in 100};
200. defects (record in 100);
201. lot (non-key int 9999 in 2(0);
202' defect (char 1(20) in 200);
300' measurements (record in 0);
301. hour (non-key int 9(6) in 300);
302. mtype (ctlar x(20) in 300);
303. mvalue (oon-key dec 9(8).99 in 300);
map;
430
+ACC!SS: Create Descriptor ________________
I C_lId
~
________________________ +
SYSTI!:lI 20001r) Access Descriptor Identification Window
Descriptor,
Library, S2KDA'I'A
Membar, QUALITY
'I'ype, "CCESS
,
,
I
,
,
I
I
Descriptor:
Library: s2kdata
OlltPllt SAS data set: Library,
,.,
I1lIlti-uuqt..): YES
Databue, quality
CJ
Lvl
••
• ""
Cl00
C1Dl
c200
c201
C2Q2
C30G
C301
C302
C102
Screen 2
SYSTEM 2000 Access Descriptor Identification
Window
Screen 4
,
,
,I
Type: Vin I
Member: measure
Member:
Type, DATA I
Passvord to save:
ItIIlti-U8et(tm.), yU
• • items, 0 sele<::t"d *
COIIpODent Name
SAS Bame Format
INTRY
_RECORDO OJl;ECOIlOO
••.•••. $]0.
PRODUCT
STATUS
••.•.••. IS.
PRODUCTS
.1l~COIlD* *IECOIIO_
SERIAL
.•..•••• 5.
0RICOROO .IlECOIlDO
DEFECTS
CO,
•••••••• 4.
DEfECT
IIEASUREItEIITS
•••••••• $2G.
*RECORD_ *RECORe_
,
,
,
,
,
..
m~
MTYPE
$20.
•.•••••. 11.2
~""~
SYSTEM 2000 View Descriptor Display Window
You can also enter subsetting criteria by entering the command
SUBSET. An edit Window appears. Enter a WHERE clause, such
The Access Descriptor Display Window, shown in Screen 3,
appears next. You can save it as is, or drop fields that you do not
want view descriptors to have access to.
+ACCEBS, Creat. nucrlptor
I C...... nd ~~.>
I
SYSTIM 2000ln Vil'l" D<>scriptor Disphy windov
Oatabu", DUALITY
Password:
----------------+
+ACCISS: Crnt" Descriptor. ------------ ___________
I co ...... nd .R~>
I
~_~>
as
where status = production
--------------------------
You can also enter an ordering clause.
SYSTEM 2000tr) ACCess Dascrlptor Display Wl"do"
Library,
S210ATA
Mellber, QUALITY
USING YOUR DATABASE WITH THE SAS SYSTEM
Type, ACCESS
Database, QUALITY
,~O
,., ,.,
• ""
""
Cl00
cl0l
C200
C201
C202
C100
C301
C302
C102
Screen 3
• 8 items, 8 selected *
COIIPO"""t Mau."
lAS II ...... PQrI... t
!H'I'1It
.RECORD. _RiCOtlD_
PRODUCT
•••• 130.
STATUS
•••••••• IS.
PRODUCTS
SnrAL
0RECORDO 0IlICORO"
DEPICTS
The rest is easy. The DATA step and all the procedures of the SAS
System look at your view descriptor as if it were a SAS data file.
Not even the SAS procedures know where the data come from. You
simply reference a view descriptor with the DATA = option of the
SAS procedure. The SAS supervisor and the SYSTEM 2000 inter·
face view engine provide the transparent access. Here is how simple
it is to produce a Shewhart chart for our measurement variables.
,.
•••••••• 4.
DEFECT
IIUSURENEMTS
,,~
MTYPH
IIV.r.J,UE
$20.
*RECOliD* *IIECQRD.
..
title 'Wohhle Measured on Belt Sanders';
proc shewhart data=s2kdata ,measure;
J:[chart mvalue * hour I Doxlahel noleqend;
run;
••• $20.
.••.••• 11.2
Output 1 shows the output.
SYSTEM 2000 Access Descriptor Display Window
wobble Maarured
all lIe1t sanders
To create a view descriptor for measurement data, enter
1 Siqa>a. Lillits
For n~]:
proc access accdesc=s2kdata.quality;
r1l1l;
1. 125 +
,
,
,
,
The View Descriptor Display Window, shown in Screen 4, appears
next. it shows all the items in the access descriptor that were not
dropped. Provide a descriptor name, like S2KDATA.MEASURE.
Select those items that you want to use in SAS procedures". Enter
a database password only if you do not want to require your user
to do so. Data set options are availabte to supply or override various
fields.
1. 1 +
+ +
I ~~~~~~.~. ___ • ___ .~ ••• h~ • • • +~ •••••••••••••• u
1.075 +
••••
,
M 1.05 +
I
A
++---~~_+_
I
~~_~
1 utI. • 1.0681
,
I
____ +____ ____ +___ ++-++ __________ I l\ • 1. 0385
~
++
L 1.G25 +
,
++
+++
++1
I
M
1__________________________ .o • • • • • • • • u . n • • • _ • • • • 1 LCL. 1.0090
,.
,
~-~+-----+-----+-----+-----+-~---+-----+
.08 +
I ••••••••••••• _•• __ •••••• __ .~.~._ •••••••••••••
R •••
.06 +
,
~
.014
I
,
.04 +
_ _
I H-------+ ______ H ____ +++++++H+-------+ _______ H I II •• 029
+++ ++
++ +++
I
H
++
__
__
__ __
1 I.CI.
0
.02 +
o
I
I UeL
+_~
._~~~~~_~
~_.~~
~
~~_~~1~_~~
~---+------+-----+---~-+
~~~~_~_~~~
~
~---+-----+----~+
Output 1 Output from PROe SHEWHART
The database can be modified with any of the SAS update procedures, including the APPEND, DBlOAD, FSEDiT, FSViEW, and
431
SQL procedures. and SAS/AP software. The only SAS programs
that cannot update a database are those that create a new output
file. The DATA step and the SORT procedure are two examples.
Deleting Obeervationa
Wan update can affect more than one observation, can a delete also
do so? The answar is no. Consider Table 2, which shows sample
measurement data.
Delivering Up Observation.
The first job of arry Interfaoe view engine Is to deliver data that look
like SAS observations to the procedures. In the case of a hierarchical database, that can be complex. You may choose to define a view
desaiptor that contains data at several levels in the database. The
view descriptor S2KDATA.DEFECT contains the Items product
type, status, serial number, and defect from three separate database records. The measurement data portion of the hierarchy lies
on a separate path, and Is not included. Figure 2 shows a sample
data population.
electric drill
prototype
I
77521
I
---.-
I
I
----.-
broken switch
uneven paint
loose
screws
(no
defects)
Fogure 2 Sample Population for S2KDATA.DEFECT
w~h
Table 1 shows the five observations produced from this process.
Table 1 Observations In the V"w S2KDATA.DEFECT
Serial
Defect
1
2
3
4
5
electric drill
prototype
protolype
prototype
production
production
56399
97123
97123
77521
81146
dented case
electric drill
electric drill
circular saw
circular saw
1
2
3
4
grinder
grinder
grinder
grinder
circular saw
circular saw
10
10
10
10
11
11
motor torque
motor torque
motor torque
motor torque
current drawn
current drawn
7.11
7.29
7.11
7.29
3.39
3.27
But Inserts can also be ambiguous. Data above the object record
may have changed from the prior Observation. The SYSTEM 2000
engine provides an optional facility to resolve ambiguous Inserts and
still minimize redundant data. It Is called a BY key, and you define
H in the ACCESS procedure. Screen 5 shows how you designate
Items In a BY key.
3. Move to the next object record, broken switch, and repeat
step 2.
Status
Mvalue
When you Insert a new observation Into a hierarchical database, the
challenge for the engine is figuring out how many database records
should be inserted, and where. For example, a new defect record
may represent a new serial number, or It may be another defect on
a serial number you are already testing. The Interface view engine
determines Its actions In several ways. Arst, it tries the obvk>us solution. If you are looking at, or positioned to, one observation, does
the new observation have arry data In common with It? If only !he
data in the objact record have changed, then the insert Is unambiguous. A single new object record is Inserted, and Is anchored to the
ancestor that was already there. This action emphasizes the h"rarchical tendency to reduce redundancy whenever possible.
2. Starting with the first object record In the Ust, dented case,
retrieve the hierarchical ancestors of the object record up
through the levels contained in the view descriptor. These
are serial number 56399 and electric drill. Present these
values as one observation.
Product
Mtype
Inserting New Record.
the
1. Apply any subsettlng criteria to provide a list of qualified
object records. The obieCt record In this example is the
defect record, that Is, the record representing the deepest
level of detail in the database. The number of object
records defines the number of SAS observations.
Obs
Hour
An engineer notices a sample size on a SASIQC char1 doubled, and
upon investigation discovers that the 10:00 data were accidentally
sent to the mainframe twice. Unfortunately, the duplicate data are
already In the database. Using the FSVtEW procedure with
S2KDATA.MEASURE, the engineer deletes observations 3 and 4.
The display Immed'IBtely shows missing values for all the variables.
However, the l1inder product type record Is stiflln the datalbase,
because other records are anchored to it. If no other records were
anchored to grinder, It too would be deleted.
81146
The SYSTEM 2000 engine composes a SAS observation
four variables using the following logic:
Product
6
I
97123
Sample Measurement Oata
Obs
5
circular saw
production
56399
dented
case
Table 2
------------t
'_lid ...)
...CCII1, Crut. De.erlptDr--- --- ------------- ---I
I
I
ITS't&II10OO(rl Yiew D.,n-lv\:or III.phy "lAdow (11
I
broken switch
uneven paint
loose screws
(missing)
1 ~."rlptor:
Librar:r. I2~TA KeMler, DIrICT
T~' n ....
1 O"tptllt
d.h.••t, UbrarJ'1
. . . .r:
1'JPe1 DAT.
I
.....ord to ..... ,
..lll_UI•.rlu), n •
I n..tabl.", QUALITI
I
• ,,1eelld •
I rUII'"
Illfonaet
IDdu IT itf),
I
'.ICOU'
I
no.
I
I
I
CIOO
'UCOaD'
I
I
Clot
I
C200
I
C202
UO.
I
'AI'
. , '"• ""
•• "
"
,,
Updating Ob.ervationo
Engineers have auth~ to update the database Wa test technician
finds a reporting error, or If a process changes status. For example,
a tester determiiles that he entered the wrong serial number for the
second electric drill, serial number 97123. " should have been
79123. This value appears In two observations. However, the hierarchical database stores ~ only once. The engineer wants to update
the database using the FSEDIT procedure. Which obsarvatioo
should he change, 2 or 3, or both? The answer is that It does not
matter. Since there Is only one record In the database for serial number 97123, a change to obsarvation 2 in the FSEDIT procedure will
be reflected in observation 3.
...
..
,.,-,
it._.
•"
'.KOJJI'
"
+--------------------------------------------------------------+
Screen 5 Designating
432
"ems In a BY Key
Taking Care with Missing Values and NOT Operators
A BY key is like a super-match value. It is typically composed of
Hems from each level in the view descriptor, such that the concatenation of the items uniquely identifies a path in the database. When
you define a BY key. the SYSTEM 2000 engine goes to the database to search for matches. When successful, one or more new
records are attached to the matching ancestors. Again, this action
aims for minimum redundancy in the database. Without a BY key.
results are less predictable.
One thing to be aware of when you are qualifying records in a hierarchical database is the way the database handles missing values,
and how that may differ from the SAS System. SYSTEM 2000 software assumes that missing values, called null values, are outside
the domain of values for an item. in contrast, the SAS System
assumes missing values are values that compare less than any others. in a hierarchy, missing values arise not only from items that
have never been valued, but also from record holes in a path. For
example, the view S2KDATA.DEFECT may include observations for
a serial number that was tested but had no defects. A defect record
was never inserted for that serial number.
Locking the Database
locking is a way of holding infonnation constant so that it does not
change unexpectedly. A SAS procedure can request locks on individual records. on library members, and other types of locks.
The differences between the SYSTEM 2000 and SAS System interpretations of missing values can be important. A view WHERE
clause such as
The SYSTEM 2000 engine responds to a record lock request by
locking all the records that compose one observation. However, the
locks do not always succeed. In a hierarchy, records fan out into
other records, which fan out into even more records. With MultiUser access, there can be great contention for the records at the
top of the fan-out. In order to reduce the contention. the SYSTEM
2000 engine does not require that aU the individual record locks succeed for an observation. it does, however, require that a record lock
be obtained if an update to the record occurs. if the lock failed when
the observation was first retrieved, the lock is retried and the data
are compared to what was initially retrieved. The lock must succeed
and the data must match before the update occurs. All this happens
in a transparent fashion.
where defect"
or
where not mvdue < 12
can produce different results than if the same WHERE clause is
stated as a SAS WHERE clause. it can also matter whether the
WHERE clause was sent to SYSTEM 2000 software. For example,
the strict SAS System answer to WHERE MVAlUE < 12 includes
answers with missing values, but the view WHERE clause does not.
To guarantee that the hierarchical interpretation of missing values
and NOT operators are applied, put these types of conditions in the
view.
For any request stronger than a record lock, for example a library
member lock, the SYSTEM 2000 engine puts an exclUsive hold on
the entire database. An exclusive hold on a database locks out all
other users on all paths in the database.
COMBINING VIEWS, DATABASES, OR ENGINES
Subsetting Your Database
You can use the Sal procedure to combine a SYSTEM 2000 view
with other views or data files. For example, you can combine two
SYSTEM 2000 views from the same or different databases. You
can combine a SYSTEM 2000 view with a view descriptor created
by another SAS/ACCESS interface, such as the SAS/ACCESS
interface to DB2~ You can combine a SYSTEM 2000 view with a
SAS data file. Any combination of views or files is also called a view,
a PROC Sal view. This capability allows you to create very powerful applications. For example, you can join your product defect information to a supplier or customer database. You can compare
process statistics from different manufacturing sites. The only
restriction is that you cannot update a PROC Sal view, regardless
of the origin of the data sets.
You can enter a subsetting WHERE clause either in your view
descriptor or when you execute a SAS procedure. If you enter the
WHERE clause in both places, the two clauses are connected with
an AND and processed as one. in the view WHERE clause. you can
enter any valid SYSTEM 2000 WHERE clause syntax. in the SAS
WHERE clause. you can enter any valid SAS WHERE clause syntax.
The syntaxes are similar, but not identical. Each has capabilities the
other does not.
The SYSTEM 2000 syntax has the capability to express conditions
about the hierarchical structure itself. Although you can only select
items from one path, you can qualify on items in any path. For example, qualify those product types that have both a scratched case
defect and a case hardness measurement below 25. You cannot
express that request with a SAS WHERE clause because it references a variable (MTYPE) that is not known to the SAS procedure.
On the other hand, the SAS syntax includes extensive pattern
matching capabilities and arithmetic expressions that are not available in SYSTEM 2000 syntax. For example, qualify those product
types where the measurement variable type is 'plating thickness'
and the measurement value multiplied by 1.5 is greater than 10.
Secondly, because SAS data views and SAS data files are alike as
far as SAS procedures are concerned, you can develop one program that works on either a view or a file. You can write one SAS
program for both techniCians and engineers. TechniCians can use
it to do real·time monitoring of quality data as it is being collected
on the microcomputer SAS data files. Engineers can use the same
program for analysis on the mainframe SYSTEM 2000 hierarchical
database.
Fortunately, you can take advantage of the capabilities of both systems. The two WHERE clauses are connected with AND, then the
SYSTEM 2000 engine analyzes the entire clause. It resolves minor
syntax differences, such as, for range conditions, where SYSTEM
2000 software uses SPANS and the SAS System uses BETWEEN.
It sends to SYSTEM 2000 software only those conditions that
SYSTEM 2000 software will understand. Any conditions that are
sent to SYSTEM 2000 software take full advantage of the indexes
in the database. A performance benefit results because fewer
records are retrieved from the database into the SAS System. if
there are any conditions left. the SYSTEM 2000 engine flags the
SAS System to double-check all the records in the subset before
giving them to the SAS procedure.
Finally, you can use the DBlOAD procedure to create or load a
SYSTEM 2000 database from another database. The input view
descriptor can be a SAS file, a view descriptor for another SYSTEM
2000 database, or a view descriptor from another SAS/ACCESS
database interface product.
EXTRACTING DATA
The Version 5 SAS interfaces to databases included a SAS procedure to extract data into a SAS file. That capability still exists with
Version 6, and can be accomplished in a number of ways. The
ACCESS procedure can extract data. A SAS DATA step that reads
433
a view descriptor and sets it (using a SET statement) to an output
SAS data file performs an extract. The OUTPUT command in the
FSVIEW procedure accomplishes the same. SCL applications can
read views and write files. There is a performance issue to consider.
Reusing a view descriptor in consecutive SAS programs implies repetitious qualification, sorting, retrieval. and refonTlatting of data. If
you plan to access the same view descriptor repeatedly in a Iongrunning batch report job. it is usually cheaper to extract the data
than to reuse the view descriptor. Of course, once you extract the
data, they no longer reflect the actual database or any updates that
occur.
cal database, whose structures and hidden pOinters are complex,
was made to function transparently under the SAS System. The
SYSTEM 2000 interface view engine chose specific alternatives for
handling observation composition, updates, and data qualification.
These choices made it possible to provide additional benefits to SAS
applications: rapid access, security, Multi-User access, and automatic error recovery.
SAS, SAS/ACCESS, SAS/QC, SAS/AF and SYSTEM 2000 are registered trademarks of SAS Institute Inc., Cary, NC, USA. Multi-User
is a trademark of SAS Institute Inc., Cary, NC, USA.
IBM is a registered trademark of International Business Machines
Corporation. DB2 is a trademark of International Business Machines
Corporation.
CONCLUSION
The introduction of interface engines in Version 6.06 of the SAS
System means greater freedom from database details. A hierarchi-
434