Using SAS/ACCESS® Software with PC Files and Databases

Forrest Boozer, SAS Institute Inc., Cary, NC

ABSTRACT

This paper discusses the SAS/ACCESS® software products available under OS/2® and Microsoft Windows and describes the way the products work in both environments. The paper explains the differences between the interfaces for DIF files, DBF files, Database Manager, and AS/400® Data and demonstrates how to use them efficiently. Plans for future enhancements to the SAS/ACCESS products for PCs are also discussed.

INTRODUCTION

The explosive use of PCs in today's workplace has been a mixed blessing. Everyone from administrative assistants to chairmen of the board has a PC on the desk. Often these users need to access data from the corporate, departmental, workgroup and personal levels of the organization. Frequently these levels of data are organized and administered by different people within the business organization. For example, corporate data may reside on an AS/400 that is administered by ISD; departmental data may be managed by the department's "PC Expert" and reside in OS/2 Database Manager or dBASE IV; personal data such as sales leads or expenses may be maintained by the individual in a spreadsheet such as Lotus 1-2-3.

EMPLOYEE Table

   EMPID  FNAME      LNAME   DEPT
   001    Henrietta  Hacker  ISD
   020    Sammy      Slick   MKT
   003    Justin     Doit    CEO

SALARY Table

   EMPID  SALARY   BONUSCOD
   001    35,000
   020    30,000   1
   003    120,000  10

The EMPLOYEE table simply contains the names, employee IDs and departments for all employees in a company. Because of the confidentiality of salary information, it is kept in a separate table, SALARY, so that access to the information may be restricted. In the EMPLOYEE table, the EMPID column contains numeric values representing employee IDs. The FNAME, LNAME and DEPT columns are character and contain the first names, last names and departments of the employees.
In the SALARY table all columns are numeric. As in the EMPLOYEE table, the EMPID column contains employee IDs. The SALARY and BONUSCOD columns contain the salary and the bonus code that the employees are to receive.

At some point, an individual or organization within the business may need to collect and analyze data from these disjoint database systems or write an application that allows users seamless access to the data. The SAS® System provides applications programmers and users with a vast number of development and analysis tools for manipulating data, while the SAS/ACCESS products allow these tools to be used on data stored in non-SAS files or database management systems (DBMSs).

There are basically two different types of SAS/ACCESS interfaces available on the PC platform. The first interface type simply reads a file of a specific format. The second type actually makes calls to a DBMS which in turn directly manipulates the data under its control and returns results of queries to the interface.

The advantages and disadvantages of each interface and type of interface will be discussed in this paper. However, some things are common regardless of the interface you are using. This paper will take advantage of the common thread between the interfaces. First a model of a full featured SAS/ACCESS product will be presented at a functional level. Then the properties that are common within a specific interface type will be presented, followed by a discussion of the unique properties of each interface of that type.

FEATURES OF A SAS/ACCESS INTERFACE

A SAS/ACCESS interface may have any number of features and abilities. In Release 6.08 of the SAS System, a full featured SAS/ACCESS product will have the following components:
• ACCESS Procedure
   o non-interactive descriptor creation
   o interactive, window based descriptor creation
   o data extraction
• DBLOAD Procedure
   o non-interactive table creation
   o interactive, window based table creation
   o access descriptor creation
   o SQL statement
• Interface View Engine
   o read-write data access
   o SQL Procedure Pass-Through Facility

Each of the major components above may support the features listed below them. However, a component for a particular interface may not support all of the possible features. Every interface consists of at least the ACCESS Procedure and Interface View Engine components.

EXAMPLE DATA

The examples given in this paper will be used to illustrate a single feature or attribute of an interface. Because the illustrations are narrow in scope, the sample data will be used repeatedly wherever appropriate. For example, the EMPLOYEE table may be a SAS data file in one example and a DIF file in the next. The EMPLOYEE and SALARY tables are used for the examples in this paper.

ACCESS Procedure

The basic function of the ACCESS procedure is to provide the SAS engine supervisor and the interface I/O engine with information concerning the location and type of data to be retrieved from an external source. This information is stored in SAS descriptor files. There are two types of descriptor files, access and view.

CHARACTERISTICS OF A PC FILE INTERFACE

A PC file interface is simply an interface that reads a specific file format. The products that create these files typically have no method by which the SAS System can call them to retrieve data. The only access to the data is by accessing the file directly without regard for the program creating the data.
This method of accessing the data is especially useful when the data has been exported from a "non-DBMS" program such as a spreadsheet or personal scheduler that does not use DBF or DIF files as the native method of data storage.

All descriptors contain identifying information such as database and table name or filename, column names and types, and the SAS variable names and formats that correspond to the columns in the table. However, access and view descriptors differ in usage.

An access descriptor can be considered a master descriptor containing a complete description of a single data source. It is created by first identifying to the ACCESS procedure where and how the data is stored. The procedure then looks at the data to determine the names of the columns or fields and the type of data that they contain. It may also generate default SAS variable names and formats corresponding to each column. You can then further modify the access descriptor by changing SAS names and formats and by dropping columns from the descriptor so that they cannot be accessed by the SAS System.

The lack of the ability to call a programming interface to retrieve data presents a problem for storing subsetting information in the view descriptor. Typically in a SAS/ACCESS interface, DBMS specific clauses such as WHERE and ORDER BY may be entered into the Selection Criteria Entry Window. The clauses entered into this window are then passed down to the DBMS to filter and order records which will be returned to the SAS System. It is impossible for a PC file interface to do this because the engine reads the data directly from the file itself. However, the ability to subset the records returned from a view descriptor is very desirable. To accommodate this ability, a SAS WHERE clause may be entered into the Selection Criteria Entry Window and stored in the view descriptor.
When the view descriptor is used in a SAS program, the WHERE clause is given to the SAS engine supervisor (which makes the calls into the interface view engines) so that it will filter records as they are returned from the engine before giving them to a SAS procedure or DATA step. The figure below illustrates the flow of data between a SAS procedure and PC file.

[Figure: SAS Procedure, Engine Supervisor and Interface Engine within the SAS System]

A view descriptor contains a subset of the information in an access descriptor. These descriptors are created based on the information stored in an access descriptor, not by directly querying the source of the data. You can select all of the information that is made available to the SAS System in the access descriptor or select only a few of the columns. In addition to selecting columns, specific rows may be selected by specifying a WHERE clause to be stored in the view. Each time the view is referenced in a SAS program this WHERE clause is used to filter the records being returned to the SAS System.

The ACCESS procedure for all PC based SAS/ACCESS interfaces has the ability to create these descriptors using interactive windows and to extract external data into a SAS data file. Optionally, some SAS/ACCESS products have the ability to create descriptors non-interactively using procedure statements.

DBLOAD Procedure

The primary function of the DBLOAD procedure is to create tables or files in an external data source and load them with the data from a SAS data file or view descriptor. The creation of these tables or files may be done interactively or non-interactively using either DBLOAD procedure windows or statements.

In addition to creating external data, the DBLOAD procedure is able to create access descriptors for the tables and files that it creates. Optionally, PROC DBLOAD for a particular SAS/ACCESS product may be able to pass non-query SQL statements to the DBMS.
These statements may be useful for dropping tables, creating indexes on newly loaded tables or granting database permissions. Not all SAS/ACCESS products support the DBLOAD procedure.

Interface View Engine

All SAS data sets and views access their data through engines. Each SAS/ACCESS interface provides an engine that allows the SAS System to directly access data in an external source without first extracting it into a SAS data file.

Use of the engine is transparent in a SAS program. Any time a view descriptor is used with a DATA step or procedure, the interface engine is loaded and used to access the actual data. All interface view engines provide at least read access to external data. Most view engines provide write access to the data as well.

In addition to providing data access via view descriptors, some interface engines support the SQL Procedure Pass-Through Facility. This facility allows you to specify both query and non-query SQL statements that are passed directly to a DBMS. The interface engine and SQL procedure are then able to dynamically process the information returned from the DBMS.

[Figure: Data flow between a SAS Procedure and PC file (View Descriptor, PC File)]

Unfortunately there is no method by which ordering information can be stored in the view descriptor of a PC file interface.

PC File DBLOAD Procedures

The DBLOAD procedure for PC files contains all of the features of the standard DBLOAD procedure described earlier with one exception. Because this interface operates at the file level, with no DBMS to handle SAS System requests for data manipulation, the PROC DBLOAD SQL statement is not supported.

The structure of the DIF file itself is somewhat free form and typically carries very little information about the data it contains other than the data itself. There is a header portion of the file which typically only contains the number of vectors (columns or variables) and tuples (rows or observations) that are contained in the file. Only two types of data, character and numeric, are supported in the file.
These data types may be intermixed within the same vector just as they might appear in a spreadsheet. All data values in a DIF file are stored in ASCII format with each individual value represented using two variable length records. For example, the figure below displays a single record for the employee SAMMY SLICK from the EMPLOYEE.DIF file.

   -1,0
   BOT
   0,2.000000000000000E+01
   V
   1,0
   "Sammy"
   1,0
   "Slick"
   1,0
   "MKT"

THE DBF FILE INTERFACE

The SAS System interface to DBF files is available in the SAS/ACCESS Interface to PC File Formats. This interface contains PROC ACCESS, PROC DBLOAD and an interface view engine.

DBF files are database files created by several PC based database products, the most notable of which is dBASE. The format of a DBF file is fairly structured. It contains a header section containing descriptive information about the data within the file. Data records within the file are organized in a rectangular manner following the header information. All data, including numeric values, are stored as the ASCII character representation of the value. A number of data types including date, numeric, float, character and logical are supported in the SAS/ACCESS interface. However, because the data for a memo type field does not actually exist within the DBF file, it is not supported.

Another property of DBF files is that each data record is preceded by a single byte indicating whether or not the record has been marked for deletion. Marking a record for deletion does not actually remove it from the file. A separate utility must be executed to remove the "deleted" records.

DBF File ACCESS Procedure

The ACCESS procedure for DBF files is fairly uncomplicated because the header information describes the fields contained in the file. All that is required for the ACCESS procedure to obtain a description of the file is the name of the file itself. The ACCESS procedure reads the header information of the file and may create default SAS names and formats for each of the fields in the file.
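The header-driven description of a DBF file can be made concrete with a small sketch. The Python code below is an illustration only, not the SAS/ACCESS implementation: it assumes the commonly documented dBASE III byte layout (record count at offset 4, header length at offset 8, record length at offset 10, 32-byte field descriptors ended by 0x0D, and a one-byte deletion flag of "*" or blank), none of which is spelled out in this paper. The helper names `build_dbf`, `read_header` and `read_record` are hypothetical.

```python
import struct

def build_dbf(fields, rows, deleted=()):
    """Build a tiny dBASE III-style image in memory (demo data, not a full writer)."""
    record_len = 1 + sum(length for _, _, length in fields)  # +1 for the deletion flag
    header_len = 32 + 32 * len(fields) + 1                   # header, descriptors, 0x0D
    out = struct.pack("<4BIHH20x", 0x03, 93, 1, 1, len(rows), header_len, record_len)
    for name, ftype, length in fields:
        out += struct.pack("<11sc4xBB14x", name.encode("ascii"),
                           ftype.encode("ascii"), length, 0)
    out += b"\r"                                             # descriptor terminator
    for number, row in enumerate(rows, start=1):
        out += b"*" if number in deleted else b" "           # deletion flag byte
        for value, (_, ftype, length) in zip(row, fields):
            text = value.rjust(length) if ftype == "N" else value.ljust(length)
            out += text.encode("ascii")                      # all values stored as ASCII
    return out + b"\x1a"

def read_header(image):
    """Return (record count, header length, record length, field list)."""
    n_records, header_len, record_len = struct.unpack_from("<IHH", image, 4)
    fields, pos = [], 32
    while image[pos] != 0x0D:
        name, ftype, length = struct.unpack_from("<11sc4xB", image, pos)
        fields.append((name.rstrip(b"\x00").decode("ascii"),
                       ftype.decode("ascii"), length))
        pos += 32
    return n_records, header_len, record_len, fields

def read_record(image, header, n):
    """Return (is_deleted, values) for 1-based record n without reading other records."""
    _, header_len, record_len, fields = header
    start = header_len + (n - 1) * record_len                # fixed-length records
    raw = image[start:start + record_len]
    values, pos = [], 1
    for _, _, length in fields:
        values.append(raw[pos:pos + length].decode("ascii").strip())
        pos += length
    return raw[0:1] == b"*", values

FIELDS = [("EMPID", "N", 3), ("FNAME", "C", 10), ("LNAME", "C", 10), ("DEPT", "C", 3)]
ROWS = [("001", "Henrietta", "Hacker", "ISD"),
        ("020", "Sammy", "Slick", "MKT"),
        ("003", "Justin", "Doit", "CEO")]

image = build_dbf(FIELDS, ROWS, deleted={3})
header = read_header(image)
print(header[:3])
print(read_record(image, header, 2))
print(read_record(image, header, 3))
```

Because every record has the same length, the start of record n is a simple calculation from the header values, and the deletion flag can be surfaced as an ordinary field that a WHERE clause could test.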
In addition to the fields defined in the header of the DBF file, the ACCESS procedure also treats the field that marks records for deletion as a data field for the file. This gives you the ability to associate a WHERE clause with the view descriptor that can filter "deleted" records so that they are not used by SAS procedures.

DIF File ACCESS Procedure

The process for describing DIF file data is unique among PC based SAS/ACCESS products. These differences are based on the DIF file's association with spreadsheets. Frequently both character and numeric data are intermixed in the same column, and this mixing is reflected in the DIF file as well. Additionally, column names (labels) are not part of some global header information, but rather are part of the data itself. Thus, column names are indistinguishable from column data. To accommodate these file characteristics, the ACCESS procedure for DIF files gives you the ability to direct and customize the creation of access descriptors beyond providing the filename and assigning SAS variable attributes.

You may provide the ACCESS procedure with two pieces of information that determine the default names that will be associated with the columns in the file and the row in the file that will be considered the first row of valid data. By default, the ACCESS procedure will generate column names in the form of COL0, COL1, COL2, ... for each of the columns represented in the DIF file and will treat the first row in the file as the first observation. Typically though, the first row of these files will contain a row of character values labeling each column. This row of labels may also be separated from the actual data values by one or more rows that somehow further distinguish them from the data. On the .DIF File Access Descriptor Identification Window you may indicate that the first row of the DIF file should be treated as column names and how many rows to skip before the actual data begins.
For example, you have a spreadsheet with column labels in the first row followed by a blank row with actual data beginning in the third row. To get a proper description of this file, you should indicate that column names should be generated from the first row and that two rows should be skipped before the actual data begins.

DBF File Interface View Engine

The interface view engine for DBF files supports reading, writing and updating of records. Another feature of this engine that may be advantageous in some applications is its ability to randomly access records within the DBF file based on record number. In other words, if you are using the FSEDIT procedure and want to read observation 2045, the engine can calculate where in the file that record begins and then position itself on it without reading any other records in the file.

There are several other things that should be noted about the engine that are due largely to the lack of a callable programming interface. Multiple users are able to open the same DBF file for read access using the engine. However, when the file is being opened for update access, an exclusive lock is placed on the file and only the user opening the file for update will be able to access the file. dBASE and other products using DBF files for data storage can create index files that their products use to quickly access specific rows of data or to return rows in a specific order. Because the DBF files themselves do not contain information concerning the existence or location of the index files created on them, the interface engine cannot take advantage of these files.

An additional enhancement to the .DIF File Access Descriptor Display window in Version 6.08 of the SAS System is the ability to change the column types from that determined by the ACCESS procedure.
The ACCESS procedure will read the first row of data (after skipping the number of rows indicated on the .DIF File Access Descriptor Identification Window) and use the data type of the values in that row as the expected type for the values in following rows. This method for determining column types is at best an educated guess for several reasons. First, DIF files allow the intermixing of data types within the same column. The column type in the first row may not be the same as the predominant (or desired) type. Also, if no data has been entered for several columns at the end of the first row, the DIF file may contain absolutely no information concerning those columns for that row. When this occurs on the first row of data, the ACCESS procedure will by default expect the values in those columns to be characters. You may change the column type from that determined by the ACCESS procedure simply by changing "C" (character) to "N" (numeric) or vice versa for the individual columns that are incorrect.

THE DIF FILE INTERFACE

The SAS System interface to DIF files is available in the SAS/ACCESS Interface to PC File Formats. This interface contains PROC ACCESS, PROC DBLOAD and an interface view engine.

DIF File Date & Time Values

Date values are a special case in DIF files because there is no date data type, only character and numeric. However, Lotus 1-2-3 software does allow you to enter and display date values on a spreadsheet. These values are stored as a numeric representation of the date and are written to the DIF file as a numeric value. However, the Lotus 1-2-3 numeric representation of a date differs from that of a SAS software date or datetime value.
So, the numeric date value must be converted between the SAS System representation and the Lotus representation in order to be meaningful. The DBLOAD procedure and view interface engines do this automatically for all numeric variables which have a date, time or datetime SAS variable format.

CHARACTERISTICS OF A DBMS INTERFACE

The second type of SAS/ACCESS interface available on the PC platform is an interface to a database management system. In these interfaces, the data is completely controlled by the DBMS. The SAS System makes requests of the DBMS which in turn accesses data and returns information. The relationship between the interface and the DBMS provides for a number of additional features in these SAS/ACCESS interfaces. The figure below illustrates the flow of information between the elements of the SAS System and a DBMS data source.

[Figure: SAS Procedure, Engine Supervisor, View Descriptor and Interface Engine within the SAS System]

DIF File DBLOAD Procedure

By default the DBLOAD procedure begins loading data directly in the first row of the DIF file. No information about SAS variable names or labels is written to the file. You may indicate that the SAS variable names or labels be written to the first row of the DIF file on the .DIF File Load Identification Window or by using the DIFLABEL procedure statement. When this feature is used, a blank row is automatically written after the labels to differentiate them from the actual data, which will begin in the third row. The tables below illustrate the difference in using the DIFLABEL feature when loading the employee data set.

   001    Henrietta  Hacker  ISD
   020    Sammy      Slick   MKT
   003    Justin     Doit    CEO

Data loaded into the EMPLOYEE.DIF file without DIFLABEL.

   EMPID  FNAME      LNAME   DEPT
   (blank row)
   001    Henrietta  Hacker  ISD
   020    Sammy      Slick   MKT
   003    Justin     Doit    CEO

Data loaded into the EMPLOYEE.DIF file with DIFLABEL.
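The date conversion described earlier for DIF files can be sketched in a few lines of Python. This is a minimal illustration under assumptions that are not stated in the paper: SAS date values count days from January 1, 1960, and Lotus 1-2-3 serial numbers count days from the start of 1900 while treating 1900 as a leap year, so the fixed offset below holds only for serials of 61 and above (March 1, 1900 onward). Time and datetime values are left out, and the function names are hypothetical.

```python
from datetime import date, timedelta

SAS_EPOCH = date(1960, 1, 1)           # SAS date value 0 (assumed convention)
LOTUS_DAY_ZERO = date(1899, 12, 30)    # effective day zero for Lotus serials >= 61
LOTUS_SERIAL_OF_SAS_EPOCH = (SAS_EPOCH - LOTUS_DAY_ZERO).days

def lotus_to_sas(serial):
    """Convert a Lotus 1-2-3 date serial to a SAS date value."""
    return serial - LOTUS_SERIAL_OF_SAS_EPOCH

def sas_to_lotus(sas_date):
    """Convert a SAS date value to a Lotus 1-2-3 date serial."""
    return sas_date + LOTUS_SERIAL_OF_SAS_EPOCH

def sas_to_calendar(sas_date):
    """Show which calendar day a SAS date value stands for."""
    return SAS_EPOCH + timedelta(days=sas_date)

serial = 33970                         # a numeric date value as found in a DIF file
sas_value = lotus_to_sas(serial)
print(sas_value, sas_to_calendar(sas_value), sas_to_lotus(sas_value))
```

The same number means different days in the two systems, which is why a numeric column is only converted when its SAS variable format marks it as a date, time or datetime value.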
DIF File Interface View Engine

The unstructured nature of DIF files forces some restrictions on the interface view engine. The most notable restriction is that the engine is read-only. Also, the engine is not able to calculate where a particular record will begin or end. This prevents the engine from being able to jump directly to any requested observation or to the previous record. Instead, the engine must position on a known record prior to the requested record and read forward until the desired record is found. This problem can be quite obvious when scrolling backward using the FSBROWSE procedure on a large DIF file. Each time that you scroll backward, the engine must reposition on the first record and then read each record until the previous record is read. This repositioning can cause a noticeable delay, especially if you are positioned somewhere well beyond the beginning of the file.

[Figure: Relationship between SAS Procedure, SAS/ACCESS Interface, DBMS and the data (DBMS, Data, External Data Source)]

Currently on the PC platform, all of the DBMSs that are being accessed are relational databases. Typically relational data is stored in rows and columns of a table within a database. The actual file structures that the DBMS uses for storing and organizing this data are unknown to applications other than the DBMS itself. These products use SQL as their primary data manipulation language. Generally, this allows an interface to build SQL statements for selecting, updating and inserting data. Additionally, statements for administering the database such as granting permissions and creating indexes may be passed to the DBMS.

Any time the engine encounters a data value whose type does not match that expected in the view descriptor, that value is treated as a missing value. For example, if a numeric value is found in a column that is expected to contain character values only, it will become a missing value.
However, because the product that produced the DIF file allowed the intermixing of data, it may be advantageous for you to view the data in a SAS procedure without having these numeric values converted to missing. In Release 6.08 of the SAS System this can be accomplished by setting an operating system environment variable DIFNUMS to YES in your config.sas file. When DIFNUMS is YES, all numeric values encountered in a character column will be returned to the SAS System as the character representation of the number rather than a missing value.

Selecting and Ordering DBMS Data

The SQL SELECT statement allows a WHERE clause to be used to filter the rows that are going to be returned from a table. Additionally, an ORDER BY clause may be appended to it to cause the rows to be returned in a specific order. The SAS/ACCESS interfaces allow you to take advantage of these clauses in several ways.

First, whenever a SAS WHERE or BY statement is used in association with a view descriptor, the statement is translated into a DBMS specific WHERE or ORDER BY clause and appended to the SELECT statement generated by the engine. The translation of the SAS statement typically requires changing SAS variable names to DBMS column names and, in WHERE statements, changing SAS specific expressions to logically equivalent DBMS expressions.

In cases where a SAS WHERE clause is used in combination with a view descriptor that already contains a DBMS WHERE clause, the two clauses are logically ANDed together so that the data returned must meet both criteria.

Accessing Specific DBMS Rows

When a DBMS processes a SELECT statement, it determines the rows that meet the WHERE criteria and returns those rows to the application. Rows are not guaranteed to be returned in any specific order unless an ORDER BY clause is specified on the SELECT statement. In addition, there is no way to indicate to the DBMS that the Nth row satisfying the statement should be returned. This situation presents a problem for DBMS view engines similar to that of the DIF File engine. There is no way for the engine to access a specific row number without fetching each record that satisfies the SELECT statement until the desired row number is found.
For this reason, scrolling backwards in a full screen procedure like FSEDIT may encounter a delay when compared to scrolling forward. Moving forward simply causes the engine to fetch the next row from the DBMS. Moving backward means that the current SELECT statement must be closed, reopened and each row fetched until the previous record number is found.

DBLOAD Procedure SQL Statement

In addition to creating and loading tables, the DBLOAD procedure enables you to pass non-select statements to the DBMS. This feature is extremely useful in the DBLOAD procedure for dropping old tables and issuing administrative commands on newly created tables.

   PROC DBLOAD DBMS=DBMGR DATA=SASUSER.SALARY;
      IN 'SAMPLE';
      TABLE=SASDEMO.SALARY;
      SQL DROP TABLE SASDEMO.SALARY;
      LIST ALL;
      LOAD;
   RUN;

For example, the view descriptor VIEWLIB.SALARY is based on the Database Manager table SASDEMO.SALARY. The SAS variable name in the view descriptor for the BONUSCOD column is BONUS and the variable name for the EMPID column is EMPID. If you want to print a list of all employees who have a bonus code, sorted by EMPID, submit the following SAS statements.

   PROC PRINT DATA=VIEWLIB.SALARY;
      WHERE BONUS IS NOT MISSING;
      BY EMPID;
   RUN;

In this case the following clauses:

   WHERE ("BONUSCOD" IS NOT NULL) ORDER BY EMPID

are added to the SELECT statement generated by the engine. When the SELECT statement is passed to the DBMS, it can use indexes and any other means available to it to optimize the retrieval of the data. Also, only the rows that meet the WHERE clause criteria are passed back to the SAS interface. This cuts down on the amount of I/O that takes place between the interface engine and the DBMS as well as the amount of data conversion that the engine must perform.

The second place that you can take advantage of these DBMS specific SQL clauses is by storing them in the view descriptor. When creating or editing a view descriptor you can enter these clauses in the Selection Criteria Entry Window.
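How stored DBMS clauses and translated SAS statements might end up in one SELECT statement can be sketched as follows. This is a toy illustration, not the engine's actual logic: the functions `translate_where` and `build_select` are hypothetical, the translator handles only variable renaming and the IS MISSING / IS NOT MISSING tests, and the exact text of the SELECT statement generated by a real interface engine is an assumption.

```python
import re

# SAS variable name -> DBMS column name, as recorded in the view descriptor
NAME_MAP = {"BONUS": "BONUSCOD", "EMPID": "EMPID"}

def translate_where(sas_where, name_map):
    """Translate a very small subset of SAS WHERE syntax into SQL."""
    sql = re.sub(r"\bIS\s+NOT\s+MISSING\b", "IS NOT NULL", sas_where, flags=re.IGNORECASE)
    sql = re.sub(r"\bIS\s+MISSING\b", "IS NULL", sql, flags=re.IGNORECASE)
    if name_map:
        names = re.compile(r"\b(" + "|".join(map(re.escape, name_map)) + r")\b",
                           re.IGNORECASE)
        # Single pass, so a replaced column name is never renamed a second time.
        sql = names.sub(lambda m: '"' + name_map[m.group(1).upper()] + '"', sql)
    return sql

def build_select(table, columns, name_map, stored_where=None, sas_where=None, sas_by=None):
    """Combine a stored DBMS WHERE clause with translated SAS WHERE and BY statements."""
    conditions = []
    if stored_where:                      # already DBMS specific, used as entered
        conditions.append("(" + stored_where + ")")
    if sas_where:                         # SAS syntax, translated first
        conditions.append("(" + translate_where(sas_where, name_map) + ")")
    sql = "SELECT " + ", ".join(columns) + " FROM " + table
    if conditions:
        sql += " WHERE " + " AND ".join(conditions)   # both criteria must hold
    if sas_by:
        sql += " ORDER BY " + ", ".join(name_map.get(v.upper(), v) for v in sas_by)
    return sql

columns = ["EMPID", "SALARY", "BONUSCOD"]
print(build_select("SASDEMO.SALARY", columns, NAME_MAP,
                   sas_where="BONUS IS NOT MISSING", sas_by=["EMPID"]))
print(build_select("SASDEMO.SALARY", columns, NAME_MAP,
                   stored_where="SALARY > 32000",
                   sas_where="BONUS IS NOT MISSING", sas_by=["EMPID"]))
```

The first call reproduces the WHERE and ORDER BY clauses shown for the PROC PRINT example; the second shows a stored clause and a SAS WHERE clause joined with AND. The stored value `SALARY > 32000` is made up for the illustration.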
The clauses that you enter must be DBMS specific, using the column names, not SAS variable names. Each time that the view descriptor is used in a SAS program, the stored clauses are added to the SELECT statement generated by the engine.

[Display: Selection Criteria Entry Window for the view descriptor VIEWLIB.BONUS]

For example, the display above shows WHERE and ORDER BY clauses being entered to create the view descriptor VIEWLIB.BONUS. Each time that this view is referenced, only the data where the bonus code exists will be returned and it will always be returned in sorted order. No SAS WHERE or BY statements are required on the procedure or DATA step to make this happen.

   PROC DBLOAD DBMS=DBMGR DATA=SASUSER.SALARY;
      IN 'SAMPLE';
      TABLE=SASDEMO.SALARY;
      SQL CREATE UNIQUE INDEX SASDEMO.EMPID_NDX ON SASDEMO.SALARY (EMPID);
      SQL GRANT SELECT ON SASDEMO.SALARY TO PUBLIC;
   RUN;

In the example above, the DBLOAD procedure for Database Manager is used twice: once to create the SASDEMO.SALARY table in the SAMPLE database and once to define an index and grant permissions on the table. These operations must take place in separate batch mode invocations of the DBLOAD procedure because SQL statements are immediately passed to the DBMS by the procedure, but tables are not actually created until the RUN statement is encountered. So, although you can drop and recreate a table in one invocation of the DBLOAD procedure, indexes cannot be created and permissions cannot be granted on the new table until after the RUN statement because the table will not exist until that time.

SQL Procedure Pass-Through Facility

An additional engine feature that is available on some interfaces is the SQL Procedure Pass-Through Facility. This feature is used exclusively with the SQL procedure to allow both query and non-query SQL statements to be passed to a database management system for execution. The pass-through feature is especially useful for joining data from multiple tables in a single database.
This is because a single SQL SELECT statement that joins the data may be passed to the DBMS. The DBMS, not the SAS System, then performs the join using any optimizations available to it and returns only the data that satisfies the SELECT statement.

   PROC SQL;
      TITLE 'EMPLOYEE SALARIES';
      SELECT *
         FROM CONNECTION TO AS400
         (SELECT FNAME, LNAME, SALARY
             FROM SASDEMO/EMPLOYEE, SASDEMO/SALARY
             WHERE EMPLOYEE.EMPID = SALARY.EMPID
             ORDER BY SALARY);

In the example above, the AS/400 tables SASDEMO/EMPLOYEE and SASDEMO/SALARY were joined to produce a listing of employees and their salaries sorted by salary. The data was joined and sorted on the AS/400 before being returned to the SQL procedure. The output below shows the results produced by these statements.

Database Manager Pass-Through Facility

An experimental feature, the SQL Procedure Pass-Through Facility, was added to the Database Manager interface with the production release of Release 6.08 of the SAS System. This feature was not announced or documented at the time of the production release. Initial reports from sites testing the interface are very favorable and indicate that SQL Pass-Through for Database Manager is fairly stable. To acquire a copy of the documentation, simply call SAS Institute Technical Support Division and request the preliminary documentation for SAS/ACCESS® Interface for Database Manager: SQL Procedure Pass-Through Facility.

Setting Up the Database Manager Interface

Aside from actually installing Database Manager and defining databases, there are several things that must be done prior to using the interface. All users using the Database Manager must be defined through User Profile Management Services. The userids maintained in this service are used to maintain the security of the data stored in Database Manager.
Individual users must be granted specific permissions to access data in tables created by other users. If you do not have any permissions on a specific table, Database Manager will not allow you any access to the data.

Before the SAS System can access Database Manager tables, the database must know that the SAS engine may try to access it and what the interface may try to do. To do this an application must be "bound" to each individual database that it needs to access. The SAS/ACCESS interface to Database Manager supplies both a bind module and a command file (ACCESS.CMD) for binding the interface to Database Manager databases.

   EMPLOYEE SALARIES

   FNAME      LNAME   SALARY
   Sammy      Slick    30000
   Henrietta  Hacker   35000
   Justin     Doit    120000

Several methods are also available with the SQL procedure to create a SAS data file or a PROC SQL view reflecting the results of the query. These methods will allow other SAS procedures to operate on the joined data. Without pass-through, the SQL procedure will have to use view descriptors to access the DBMS data. This would mean that all data accessible to each individual view would be returned to the SQL procedure, which would then perform the join. Using view descriptors for joins has several drawbacks in that DBMS optimizations such as indexes are not available to the SQL procedure and sometimes large amounts of data not needed for the join may be returned from the DBMS just to be discarded.

To perform the actual bind, you must have created the SASDEMO userid in User Profile Management Services and currently be logged in as that user. Then enter the command:

   ACCESS database-name -CONFIG configuration-file-name

where database-name is the name of the database to which the "bind" is being performed and configuration-file-name is the name of the config.sas file that you use to run the SAS System.

The ACCESS command will locate the bind module and issue the SQLBIND command to Database Manager.
-Once the bind h~_ been successfully accomplishe,~, you musti$-sue the foUowing Sal statement to permit other users to acceS$ the database using the SAS/ACCESS interface: The SAS/ACCESS Interface to OS/2 Database Manager contains the ACCESS and DBLOAD procedures, an interface view engine and most of the features that may be found in these components. The most notable feature that is currently not available is the ACCESS procedure statements needed to create access and view descriptors non":'interactively. GRANT OS/2 Database Manager is the IBM® relational databas'e management system that runs under OS/2. The latest release of database manager is packaged in the IBM Extended Services for OS/2 and runs under OS/2 1.31 & 2.0 as a 16 bit application. Database Manager databases define<;l on other machines connected by a network may be accessed on a local machine using the Remote Data Services component of Database Manager. EXECUTE ON PROGRAM SASOEMO.SASOII TO PUBLIC THE AS/400 DATA INTERFACE The SAS/ACCESS Interface to AS/400 Data is composed of the ACCESS procedure and an interface view engine. The ACCESS procedure for this interface does not support statements that allow the non-interactive creation of access and view descriptors. However. the interface view engine supports the SOL Procedure Pass-Through Facility, allowing Sal s~tements and querie,s to be sent directly to the AS/400 for execution. Although the Release 6.08 of the SAS System is a 32 bit applicetion running under 0512 2.0, the SAS/ACCESS interface is still able to use the 16 bit Database Man/ilger at a cost. The,cost is that each call made to a Database Manager function must be "Thunked". Thunking is the process by which program control and parameters are passed between 32 bit and 16 bit applications. During the transfer of control, the program stack must be manipulated and all pointers converted between the 32 bit flat memory model and the 16 bit segmented memory model. 
These conversions must take place while calling into a function and on returning from it. The AS/400 provides a computing environment where hardware and software are highly integrated. The operating system, OS/400®, has many builtin features including a relational database management system. Data in this relational database may be managed natively via non-Sal means or in an SOL collection. The SAS/ACCESS interface may access data managed by eith-er means. The database functions of OS/400 operating system are integrated to the extent that Sal queries for non-S'Ol managed 184 data are supported. The AS400MSG environment variable is use.d to indicate the librarylfile name of the compiled SASFMSG CL program. If the name of the program is changed to anything other than SASFMSG, or if the library where it resides is not in your *UBl library list then this variable should be set. The format of value shOUld be library/filename. If this environment variable is ·not,set, a default of "L1BUSASFMSG will be used. The SASIACCESS Interface to ASI400 Data is able to access data on an AS/400 by using the Remote SOL Application Program Interlace (API) of the IBM PC SUPPORTI400 software. Using this interface, SASIACCESS software is able to pass SOL statements and queries to the PC Support product which in turn sends the request to· the AS/400 via Communications Manager and returns the results to the PC. AS400CMT Indicates the level of commitment control and locking that the interface should operate under. Valid values are: Like OS/2 Database Manager, PC Supportl400 is a 16 bit application. As discussed with the Database Manager interface, all calls made into its API functions must.be thunked. The figure below illustrates the flow of information between the SAS procedure and the AS/400 database file. 1 SAS Procedure 1 II NONE No commitment control. Default locking is used such that read or update locks are acquired as a row is read and released when the next row is read. 
CHG Change control. Update locks for updateable CUr;., .Ji"$ only. Update locks for unchanged records released as the cursor moves to the next row. Lo :r; for changed rows are released at Commit C:, ','0!P); Otherwise no locks are acquired . CS Cursor Stability control. Always uses a read or up(latp lock. as appropriate. Read locks and Upd::F'" i, ,'"._, 'r""' unchanged records are released as th~ cursor moves to the next row. Locks for changed rows '~i;: rele':"JSed at commit or rollback. All Full commitment control and locking. All locks acquireri are maintained until a commit or rollback " '~':.:urs SAS System .J.T E.g;•• SUP.~",o~ .J,. T v;.w 1 Interlace Engine Descriptor J I I. PC Support/400 1 I I AS/400Host » ~PCSIPP09 Database File I If a level of commitment control is specified (any va~u", othe; than NONE), then commitment control is used. However, IJse of commitment control prohibits the use of non·journaled dat',v:;ase files. If no commitment control is used (i.e. NON!:.:; :·,en >-.-·ath joumaled and non·journaled database files can be accessed. The default value for this variable is NONE. Data Flow between the SAS System and ASI400 Options Affecting the ASI400 Data Interface If you are using PC Supporv400 Version 2 Release 1 the only valid values for this variable are: NONE and ALL. For the features supported, the abilities of the AS/400 interface follow very closely with those mentioned earlier in the "CHARACTERISTICS OF A DBMS INTERFACE" section. However, these abilities may be altered and performance improved by using a number of options that have been provided by the AS/400 interface. These options are passed to the interface by using OS/2 environment variables. Values may be assigned to these variables by using the -SET system option in your config.sas file. For example, the line: AS400DEC Indicates the numeric decimal separator to be used. Valid values are a period (.) or comma (,). The default value is period (.). 
AS400BUF Defines the size of tt:le buffer used for data transfers. Valid values are integers from 1 to 31' and represent the size of the buffer in kilobytes. The default value is SK. -SET AS400CMT CHG may be added to your contig.sas file to assign the value CHG to the AS400CMT environment variable. Thefollowing environment variables are used with the AS/400 interface: AS400UPD Indicates if rows returned from a select statement may be updated. A value of YES indicates that SELECT statements should be invoked so that rows can be modified. NO would not allow updates of selected rows. AS400PLU The Partner logical Unit name by which Communications Manager recognizes the AS/400 host. If this environment variable is not set. a default of 52S0PlU is used. In addition to the ability to update.rows, this variable affects I/O with the AS/400. When a value of YES is used, records are returned from the AS/400 one ata time and rows can be fetched from severat different tables at the same time. However, if a value of NO is used,. rows are returned from the AS/400 in blocks and rows can only be fetched from one table at a time. If you only intend to read data.from one table during the execution of a proc, then you may want to set this environment variable to NO to increase performance. AS400MSG Messages for Sal errors that occur on the AS/400 are not available to the interface through PC Support's remote Sal API. In order for the interface to get text for these error messages a Command Language (CL) program, SASFMSG, must be uploaded and compiled on the AS/400 when the interface is installed. If this environment variable is not set, a default of YES is used. When the interface encounters an Sal error during an operation, information concerning the error is collected and passed to the SASFMSG program on the AS1400. This program will take the information and fetch the appropriate message from the AS/400 Sal message file sending it back to the interface on the PC. 
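The value ranges described above for AS400CMT, AS400DEC, AS400BUF and AS400UPD can be checked before they are written to config.sas. The following is a minimal Python sketch of such a check, not part of SAS/ACCESS software: the function names check_option and build_set_lines are hypothetical, and the output simply imitates the form of the -SET AS400CMT CHG line shown earlier. Whether every value can be written unquoted in config.sas is an assumption.

```python
# Hypothetical helper, not part of SAS/ACCESS: validates a few AS/400 interface
# option values against the ranges given in the text and renders them in the
# same form as the "-SET AS400CMT CHG" example line.

VALID_CMT = ("NONE", "CHG", "CS", "ALL")
VALID_CMT_V2R1 = ("NONE", "ALL")  # PC Support/400 Version 2 Release 1 only


def check_option(name, value, pc_support_v2r1=False):
    """Return the cleaned value for a known variable or raise ValueError."""
    text = str(value).strip()
    if name == "AS400CMT":
        text = text.upper()
        allowed = VALID_CMT_V2R1 if pc_support_v2r1 else VALID_CMT
        if text not in allowed:
            raise ValueError(f"AS400CMT must be one of {allowed}, got {value!r}")
    elif name == "AS400DEC":
        if text not in (".", ","):
            raise ValueError(f"AS400DEC must be '.' or ',', got {value!r}")
    elif name == "AS400BUF":
        # Buffer size in kilobytes, an integer from 1 to 31.
        if not text.isdigit() or not 1 <= int(text) <= 31:
            raise ValueError(f"AS400BUF must be an integer 1..31, got {value!r}")
    elif name == "AS400UPD":
        text = text.upper()
        if text not in ("YES", "NO"):
            raise ValueError(f"AS400UPD must be YES or NO, got {value!r}")
    # Other variables (for example AS400PLU) are passed through unchecked.
    return text


def build_set_lines(options, pc_support_v2r1=False):
    """Render validated options as '-SET name value' lines."""
    return [
        f"-SET {name} {check_option(name, value, pc_support_v2r1)}"
        for name, value in options.items()
    ]


if __name__ == "__main__":
    for line in build_set_lines({"AS400CMT": "CHG", "AS400BUF": 16, "AS400UPD": "NO"}):
        print(line)
```

Passing pc_support_v2r1=True narrows AS400CMT to NONE and ALL, mirroring the restriction noted for PC Support/400 Version 2 Release 1. Rejecting a bad value early is only a convenience; the interface itself remains the authority on what it accepts.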
AS400DAT & AS400TIM
   Data types for dates, times and timestamps were added to OS/400 in Version 2 Release 1.1. To accommodate these data types, two OS/2 environment variables are used to determine the format of the date and time values returned to the engine from the AS/400. The name of the environment variable used for the date format is AS400DAT; for time format it is AS400TIM. The following table shows the possible values for these variables and which values are valid for a particular format.

   Variable Value   Valid for AS400DAT   Valid for AS400TIM
   YMD              Y                    N
   MDY              Y                    N
   DMY              Y                    N
   JUL              Y                    N
   HMS              N                    Y
   ISO              Y                    Y
   USA              Y                    Y
   EUR              Y                    Y
   JIS              Y                    Y

   By default, date values will be returned from the AS/400 in the format specified by the QDATFMT system value on the AS/400. This same value should be specified as the value for the AS400DAT environment variable. In OS/400 Version 2 Release 1.1 there is no system value for the time format; the default is HMS. However, if the time format is modified by another means (such as modifying the time format associated with the PC Support job that is communicating with the engine), then that time format should be specified for the AS400TIM environment variable.

FUTURE ENHANCEMENTS TO PC BASED SAS/ACCESS INTERFACES

To a large extent, enhancements to SAS/ACCESS interfaces are driven by demand from you. The first three interfaces discussed in this paper, DIF Files, DBF Files and Database Manager, are complete with the exception of providing ACCESS procedure statements allowing non-interactive creation of descriptors. The AS/400 interface also lacks this feature as well as the DBLOAD procedure. All of these features are under consideration to become enhancements; however, availability has not been determined. If you have a specific need for an enhancement to these interfaces, please contact your SAS Software Representative.

A new interface is currently under development to be an add-on product under Release 6.08 of the SAS System for Microsoft Windows. The SAS/ACCESS Interface to SYBASE and SQL Server will be a full feature interface providing ACCESS and DBLOAD procedures with windowed and non-interactive modes of operation. The view engine of this new interface will also support the SQL Procedure Pass-Through Facility. The SQL Server interface is being built against Release 4.2 of the SQL Server Client and will support stored procedures and triggers.

SUMMARY OF SAS/ACCESS AVAILABILITY

The SAS System currently has Release 6.08 production releases running under both OS/2 2.0 and Microsoft Windows 3.1, and a Release 6.09 release is under development for Microsoft Windows NT. The interfaces discussed in this paper are either available with the Release 6.08 production release or will be released as an add-on product some time after the initial release of the SAS System.

All of the following interfaces, except the interface to SYBASE and SQL Server, are available with the production release of Release 6.08 of the SAS System.

   • SAS/ACCESS interface to PC File Formats
   • SAS/ACCESS interface to OS/2 Database Manager
   • SAS/ACCESS interface to AS/400 Data
   • SAS/ACCESS interface to SYBASE and SQL Server

The DBF and DIF file interfaces discussed in this paper are packaged together into the SAS/ACCESS Interface to PC File Formats product.

Although each of the interfaces discussed in this paper is available on the PC platform, they are not always supported on both the OS/2 and Microsoft Windows operating environments. The table below reflects the availability of the SAS/ACCESS products in these environments.

   SAS/ACCESS Product       OS/2   Microsoft Windows
   PC File Formats          Y      Y
   Database Manager         Y      N
   AS/400 Data              Y      N
   SYBASE and SQL Server    N      Y

CONCLUSION

This paper has discussed the components and features of the SAS/ACCESS products that are available on the PC platform. Each interface was discussed with special emphasis on the characteristics and options that make them unique. You should have gained an understanding of the abilities of each interface and the principles upon which they are based. Using this understanding you should be able to use these interfaces efficiently, avoiding the pitfalls that may exist in some.

SAS and SAS/ACCESS are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. IBM, AS/400, OS/2 and OS/400 are registered trademarks or trademarks of International Business Machines Corporation. ® indicates USA registration.

Other brand and product names are registered trademarks or trademarks of their respective companies.