A Comparison of Cross-Database Access Techniques Using SAS with SAS/ACCESS Version 6.12 and SAS with SAS/ACCESS Version 8.1

Karen Glenn, netRegulus Inc., Denver, CO

ABSTRACT

SAS has been and remains the industry standard for pharmaceutical data analysis, yet the data frequently reside in database products other than SAS, such as Microsoft SQL Server, Microsoft Access, or dBASE™. The SAS/ACCESS Version 6 generation of software provided useful tools to access such data. The Version 8 generation brings considerable simplicity and flexibility to data access, making cross-database access nearly invisible. Under Version 6, many lines of code were required to read in outside data sources. In addition, because SAS variable names were limited to 8 characters, longer variable names were truncated and numbered to ensure short but unique names. These translated variable names could complicate communication between statisticians and database owners. In Version 8.1, the process is executed simply through specific access engines in the libname statement. The increased length of variable names in Version 8 also results in fewer, if any, renamed variables in SAS.

INTRODUCTION

SAS software has been the accepted pharmaceutical industry standard for data analysis for some time. At times, though, and for a variety of reasons, other software packages are used to collect or store data. As a result, statisticians and data managers can be confronted with the need to access data in other software packages. In some cases, the need to access a particular store of data is a one-time event, so exporting from the original software package and converting the data to SAS may be an option. Often, though, access to these data will recur. When repeated data access is called for, a programming approach is not only time-efficient but also ensures accuracy, consistency, and standardization in the process and outcome.

DATA ACCESS UNDER SAS VERSION 6.12

SAS/ACCESS has long provided cross-package data access. Under SAS Version 6.12, the process was effective but rather cumbersome. With the SQL Procedure, you connected to the SQL server and used a create table statement with a one- or two-level SAS-compatible name to create the SAS data set. If you wanted to include all variables from the original SQL table in the resulting SAS data set, you could read them all with a select * statement. The drawback of this shortcut was that if any SQL variable name exceeded the SAS limit of 8 characters, SAS applied an automatic truncation and numbering convention. The resulting names were SAS-compatible, but they could be confusing and required an additional data dictionary to map them back to the SQL names.

To prevent automatic renaming, you could instead avoid select * and explicitly specify each SQL variable name, assigning it a SAS variable name. Depending on the size of the database, this process could be long and tedious. In the following example code, the table subject_stats in the SQL database hospital_data is read into a permanent SAS data set.

SAMPLE SAS VERSION 6.12 CODE

    /*****************************************************************
    *******Establish a connection to SQL server and
    *******SQL database hospital_data
    ******************************************************************/
    proc sql;
    connect to sqlservr as statdata
       (server=servername database=hospital_data
        user=username password=password);

    /*****************************************************************
    *****read in all variables from SQL table subject_stats
    *****and save to a permanent SAS data set. Assign
    *****SAS-compatible variable names to each variable
    *****violating the 8 character limit to prevent any automatic
    *****renaming, otherwise keep the SQL variable name
    ******************************************************************/
    create table libraryname.enroll as
    select * from connection to statdata
       (select subject_id as subj_id,
               screening_date as screendt,
               enrollment_date as enrolldt,
               study_start_date as studydt,
               date_of_birth as dob,
               randomization_code as randcode,
               height_inches as hght_in,
               weight_pounds as wt_lbs,
               gender as gender,
               event_code as eventcod,
               event_classification as eventcls,
               center_code as center,
               attending_physician_code as attendmd,
               diagnosis_at_admit as diag_adm,
               admitting_physician as admit_md,
               diagnosis_at_discharge as diag_dis,
               discharge_physician as disch_md,
               discharge_medications as disc_med,
               procedure_count as proc_ct,
               followup_consent as followup
        from subject_stats);
    disconnect from statdata;
    quit;

DATA ACCESS UNDER SAS VERSION 8.1

SAS/ACCESS for PC File Formats in Version 8.1 provides more direct access to outside data sources. As long as the original data source is ODBC-compliant, SAS/ACCESS for PC File Formats provides a driver that allows you direct access to any configured ODBC data source. Once you configure an ODBC data source through the ODBC Administrator on your system, this SAS engine directly accesses the original data. If you omit the select statement from the COPY Procedure, all SQL tables within the referenced SQL database are copied and become SAS data sets.

In addition, SAS Version 8.1 added support for longer variable names, up to 32 characters. This reduces or even eliminates the need to rename variables. In the following example code, the same table subject_stats in the SQL database hospital_data is read into a permanent SAS data set.

SAMPLE SAS VERSION 8.1 CODE

    /*****************************************************************
    *****establish a connection to SQL server Servername through
    *****ODBC data source.
    ******************************************************************/
    libname statdata odbc
       noprompt="DSN=data source name; SERVER=servername; UID=username; PWD=password; DATABASE=hospital_data"
       schema=dbo;

    /*****************************************************************
    *****read in all variables from SQL table subject_stats
    *****within database hospital_data and save to permanent
    *****SAS data set.
    ******************************************************************/
    proc copy in=statdata out=libraryname;
       select subject_stats;
    run;

COMPARISON

Both generations of SAS software give you the ability to access outside data sources. The two methods are roughly equivalent when most of the SQL tables you are interested in contain few variables, when you need only a few variables, or when you need only a small number of tables within a SQL database. However, when you want to make a SAS data set from each SQL table within a SQL database, or you want to include all variables from a SQL table in the SAS data set, the ODBC engine within SAS/ACCESS for PC File Formats provides the quickest and easiest solution.

CONCLUSION

A small time investment is required to upgrade existing code written to access SQL data under Version 6.12 to Version 8.1. The advantages gained through the new features in Version 8.1 include the simplicity and flexibility provided by the ODBC engine, as well as the longer variable names, which eliminate the need to rename variables and to create a SQL-to-SAS data translation dictionary. These advantages are well worth the time investment.
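
As an illustration of the two Version 8.1 cases discussed in the comparison (copying every table, and reading only a few variables), the following is a minimal sketch rather than code from the original paper. It assumes the libref statdata has already been assigned with the ODBC engine as in the Version 8.1 sample. The name libraryname is the same placeholder used in the samples and would be replaced by a real libref of 8 characters or fewer. The data set name enroll_subset and the choice of three variables are hypothetical.

```sas
/* Sketch only: assumes libref statdata was assigned with the ODBC   */
/* engine as in the Version 8.1 sample, and that libraryname stands  */
/* for a real, previously assigned output libref.                    */

/* Case 1: copy every table in the database. With no SELECT          */
/* statement, PROC COPY copies all members of the input library.     */
proc copy in=statdata out=libraryname;
run;

/* Case 2: read only a few variables from one table. The KEEP= data  */
/* set option limits the variables read; enroll_subset and the       */
/* variable list are hypothetical choices for illustration.          */
data libraryname.enroll_subset;
   set statdata.subject_stats (keep=subject_id enrollment_date gender);
run;
```

Because Version 8 allows variable names of up to 32 characters, the SQL column names can be used directly in the KEEP= list without renaming.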

REFERENCES

SAS Procedure Guide
New Features in Version 8 of the SAS System Course Notes
SAS OnlineDoc documentation

ACKNOWLEDGMENTS

The author wishes to thank William Kastner for teaching her how to access other data sources using SAS and SAS/ACCESS Version 6.12, and Becki Bucher Bartelson and Lisa Garnsey Ensign for reviewing the manuscript and encouraging the author to present this paper.

CONTACT INFORMATION

Your comments and questions are welcomed. Contact the author at:

Karen Glenn
netRegulus, Inc.
11755 East Peakview Avenue
Englewood, CO 80111
Work Phone: 303-925-7733
Fax: 303-662-9320
Email: [email protected]
Web: www.netRegulus.com

TRADEMARK INFORMATION

SAS and all other SAS Institute product and service names are trademarks or registered trademarks of SAS Institute Inc., Cary, NC, in the USA and other countries. Other brand and product names are registered trademarks of their respective companies. ® indicates USA registration.