A Comparison of Cross-Database Access Techniques Using SAS® with SAS/ACCESS®
Version 6.12 and SAS with SAS/ACCESS Version 8.1
Karen Glenn, netRegulus Inc., Denver, CO
ABSTRACT
SAS has been and remains the industry standard for
pharmaceutical data analysis, yet frequently these data reside in
database products other than SAS, such as Microsoft SQL Server,
Microsoft Access, or dBASE™. The SAS/ACCESS Version 6
generation software provided useful tools to access such data,
and the Version 8 generation now brings tremendous simplicity
and flexibility to data access, making cross-database access
nearly invisible. Under Version 6, many lines of code were
required to read in outside data sources. In addition, because SAS
variable names were limited to 8 characters, longer variable
names were truncated and numbered to ensure short but unique
names. These translated variable names could complicate
communication between statisticians and database owners. In
Version 8.1, the process is executed simply through specific
access engines in the libname statement. The increased length
of variable names in Version 8 also results in few, if any,
renamed variables in SAS.
INTRODUCTION
SAS software has been the accepted pharmaceutical industry
standard for data analysis for some time. At times, though, and
for a variety of reasons, other software packages are used to
collect or store data. As a result, statisticians and data
managers can be confronted with the need to access data in
other software packages. In some cases, the need to access a
particular store of data may be a one-time event, so exporting
from the original software package and converting the
data to SAS may be an option. Often, though, access to these
data will recur. When repetitive data access is called for, a
programming approach is not only time efficient but also
ensures accuracy, consistency, and standardization in the
process and outcome.
DATA ACCESS UNDER SAS VERSION 6.12
SAS/ACCESS has long provided cross-package data access.
Under SAS Version 6.12, the process was effective but rather
cumbersome. With the SQL procedure, you connected to the
SQL server and used a create table statement with a one- or
two-level SAS-compatible name to create the SAS data set. If you
wanted to include all variables from the original SQL table in the
resulting SAS data set, SAS would automatically read all
variables through a select * statement. The drawback to this
shortcut was that if any SQL variable name
exceeded the SAS limit of 8 characters, SAS applied
an automatic truncation and numbering convention. The resulting
names were SAS-compatible but could be confusing, and they
required an additional data dictionary mapping them back to the
SQL names. To prevent automatic renaming for such databases,
you could skip the select * statement and instead
explicitly list each SQL variable
name and assign it a SAS variable name. Depending on the
size of the database, this process could be long and tedious.
In the following example code, the table subject_stats in the SQL
database hospital_data is read into a permanent SAS data set.
SAMPLE SAS VERSION 6.12 CODE
/*****************************************************************
*******Establish a connection to SQL server and
*******SQL database hospital_data
******************************************************************/
proc sql;
connect to sqlservr as statdata
(server=servername database=hospital_data
user=username password=password);
/*****************************************************************
*****read in all variables from SQL table subject_stats
*****and save to a permanent SAS data set. Assign
*****SAS-compatible variable names to each variable
*****violating the 8 character limit to prevent any automatic
*****renaming, otherwise keep the SQL variable name
******************************************************************/
create table libraryname.enroll as
   select * from connection to statdata
      (select
          subject_id as subj_id,
          screening_date as screendt,
          enrollment_date as enrolldt,
          study_start_date as studydt,
          date_of_birth as dob,
          randomization_code as randcode,
          height_inches as hght_in,
          weight_pounds as wt_lbs,
          gender as gender,
          event_code as eventcod,
          event_classification as eventcls,
          center_code as center,
          attending_physician_code as attendmd,
          diagnosis_at_admit as diag_adm,
          admitting_physician as admit_md,
          diagnosis_at_discharge as diag_dis,
          discharge_physician as disch_md,
          discharge_medications as disc_med,
          procedure_count as proc_ct,
          followup_consent as followup
       from subject_stats);
/* close the connection and end the SQL procedure */
disconnect from statdata;
quit;
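For comparison, the select * shortcut described earlier might look roughly like the following under Version 6.12. This is only a sketch: it assumes the same placeholder connection options as the sample code, and the output data set name enroll_all is hypothetical. With this form, any SQL variable name longer than 8 characters is truncated and numbered automatically.

```sas
proc sql;
   /* same placeholder connection options as in the sample code */
   connect to sqlservr as statdata
      (server=servername database=hospital_data
       user=username password=password);

   /* read every column; SAS derives the variable names itself */
   create table libraryname.enroll_all as
      select * from connection to statdata
         (select * from subject_stats);

   disconnect from statdata;
quit;
```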
DATA ACCESS UNDER SAS VERSION 8.1
SAS/ACCESS for PC File Formats in Version 8.1 provides more
direct access to outside data sources. As long as the original
data source is ODBC compliant, SAS/ACCESS for PC File
Formats provides a driver that allows you direct access to any
configured ODBC data source. Once you configure an ODBC
data source through the ODBC Administrator on your system,
this SAS engine directly accesses the original data. The COPY
procedure then copies the tables named in its select statement
into SAS data sets. If you omit the select statement, all
SQL tables within the referenced SQL database are copied and
become SAS data sets. Version 8.1 also
supports longer variable names, up to 32 characters,
which reduces or even eliminates the need to rename variables.
In the following example code, the table subject_stats in the same
SQL database (hospital_data) is read into a permanent SAS data set.
SAMPLE SAS VERSION 8.1 CODE
/*****************************************************************
*****establish a connection to SQL server servername through
*****an ODBC data source.
******************************************************************/
libname statdata odbc
   noprompt="DSN=data source name;SERVER=servername;UID=username;PWD=password;DATABASE=hospital_data"
   schema=dbo;
/*****************************************************************
*****read in all variables from SQL table subject_stats
*****within database hospital_data and save to permanent
***** SAS data set.
******************************************************************/
proc copy in=statdata out=libraryname;
select subject_stats;
run;
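As noted above, leaving out the select statement copies every table that the libref can see. A minimal sketch, assuming the statdata libref and the libraryname library from the sample code are already assigned; the PROC CONTENTS step is an optional check added here for illustration and is not part of the original example.

```sas
/* Copy every table visible through the statdata libref into libraryname. */
proc copy in=statdata out=libraryname;
run;

/* List the copied data sets and their variables to confirm that the
   long SQL column names were carried over. */
proc contents data=libraryname._all_;
run;
```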
COMPARISON
Both generations of SAS software give you the ability to access
outside data sources. When the SQL tables you are interested in
contain few variables, when you need only a few of the
variables, or when you need only a small number of tables within
a SQL database, the
two methods of data access are roughly equivalent. However,
when you want to make a SAS data set from each
SQL table within a SQL database, or you want to include all
variables from a SQL table in the SAS data set, the ODBC engine
within SAS/ACCESS for PC File Formats provides the
quickest and easiest solution.
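When only a few variables are needed, the libname engine can also be used from a DATA step. The following is a sketch under stated assumptions: it reuses the statdata libref from the Version 8.1 sample and column names from the Version 6.12 sample, and the output data set name enroll_subset is hypothetical.

```sas
/* Read only three columns from the SQL table through the ODBC libref. */
data libraryname.enroll_subset;
   set statdata.subject_stats(keep=subject_id enrollment_date gender);
run;
```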
CONCLUSION
A small time investment is required to upgrade existing code
written to access SQL data under Version 6.12 to Version 8.1.
The advantages gained through the new features in Version 8.1
include the simplicity and flexibility provided by the ODBC engine,
as well as the longer variable names, which eliminate the
need to rename variables and to create a SQL-to-SAS data
translation dictionary. These advantages are well worth the
time investment.
ACKNOWLEDGMENTS
The author wishes to thank William Kastner for teaching her
how to access other data sources using SAS and
SAS/ACCESS Version 6.12, and Becki Bucher Bartelson and Lisa
Garnsey Ensign for reviewing the manuscript and encouraging
the author to present this paper.
CONTACT INFORMATION
Your comments and questions are welcomed. Contact the
author at:
Karen Glenn
netRegulus, Inc.
11755 East Peakview Avenue
Englewood, CO 80111
Work Phone: 303-925-7733
Fax: 303-662-9320
Email: [email protected]
Web: www.netRegulus.com
TRADEMARK INFORMATION
SAS and all other SAS Institute product and service names are
trademarks or registered trademarks of SAS Institute Inc., Cary,
NC, in the USA and other countries.
Other brand and product names are registered trademarks of
their respective companies.
® indicates USA registration.