From Database to your Desktop: How to almost completely automate reports in SAS, with the power of Proc SQL

Kirtiraj Mohanty, Department of Mathematics and Statistics, San Diego State University, San Diego, CA
Trinh Nguyen, Department of Mathematics and Statistics, San Diego State University, San Diego, CA

ABSTRACT

SAS has many varied applications, and data analysts often use it to automate reporting and data summarization. Proc SQL is a powerful procedure that can pull data from databases and manipulate and summarize it within SAS as required; the result can then be emailed or exported as an Excel or CSV file to the end user. The SAS connection to ODBC can be used to connect to any popular database server (e.g. Teradata, Oracle, MS SQL Server, MS Access) and bring data into SAS as SAS datasets, where Proc SQL can bring it into the desired format for further analysis or reporting. Macro variables can be used to dynamically generate values that change over time, such as date ranges, for use in SQL queries. Within Proc SQL, SQL statements run against SAS datasets to perform operations such as count, sum, average, join (merging multiple datasets), filter, insert and delete. We found that Proc SQL can be used in almost all scenarios to bring the data into a desired format. The final dataset can then be sent to the end user via email or exported to a hard drive. The whole process can be fully automated and scheduled as a SAS job in Windows Task Scheduler. This paper provides a step-by-step process, with examples, for connecting to a database, summarizing data, exporting or emailing the final dataset, and scheduling a batch job in Windows 7.
KEY WORDS: Proc SQL, Macro variables, Proc Export, Batch Jobs

INTRODUCTION

To perform advanced data analysis or modeling, the first requirement is to bring the data into the right format. For data analysts and statisticians, an essential skill is knowing how to pull data from a database and format or summarize it so that analysis or modeling can be carried out. SAS is a very powerful tool in this regard. Many companies (especially online retail, traditional retail and credit card companies) generate huge amounts of transactional data every day. Business executives frequently want to see that transactional data in a summarized, readable format to help them make data-driven decisions. When reports, charts or dashboards need to be updated periodically, SAS can be used to automate the whole process with very little manual intervention.

CONFIGURING ODBC FOR MS ACCESS

Create an Access database and store the weekly transactions dataset as a .mdb file. In our case, we created a database called Database1.mdb. In that database we created a table called Daily_transactions and stored some made-up data for demonstration purposes. One of the most important tasks is to add the data source to the ODBC connections. Without this, your code will simply not work. The steps to add the data source to ODBC are as follows:

1. Open ODBC Data Source Administrator (shown in Display 1).
   Display 1: ODBC Data Source Administrator
2. Click on MS Access Database and then click on Configure.
3. Click on Select and browse to the .mdb file you just created.
4. Restart your computer if your SAS code gives an error saying that the database was not found.

EXTRACTING DATA FROM A DATABASE

Most companies store transactional data in databases such as Oracle, Teradata, DB2, MS SQL Server and MS Access.
Proc SQL can be used to connect to these databases and extract the data into SAS dataset format. Once the data is in SAS dataset format, any SAS procedure can be used on it to carry out data summarization, analysis or modeling. Proc SQL is essentially a SAS procedure that lets a user run SQL commands inside SAS. Hence, to use Proc SQL effectively, the user first needs to be trained in SQL. In this paper we show how reports can be created from a database (here MS Access) to your desktop.

Take the example of XYZ company, which sells product P at an average price of $100. The daily transaction records of product P are stored in an MS Access database, as shown in Display 2 below.

Display 2. Daily Transaction of Product P (XYZ Company) as stored in a MS ACCESS Database table

Txn_dt is the date of the transactions, Txns is the number of transactions and Sales_Amt_USD is the sales amount of those transactions in US dollars. The table is named Daily_transactions and resides in the database Database1. In this example, we extract the sum of transactions and sales for the week starting 01-JUL-2013 and for the week starting 02-JUL-2012 (the corresponding week in 2012, for comparison). The SAS code for this step calculates the start and end dates and extracts the required data. The output dataset (i.e. xyz.transac) is shown in Display 3.

Display 3. Output dataset which summarizes transaction and sales for the 2 weeks

Here Transactions and Sales are for the week starting 01-JUL-2013, and Transactionsly and Salesly are for the week starting 02-JUL-2012 (NOTE: ly stands for last year).

Based on our experience, we found that Proc SQL can be used in almost all scenarios to bring the data into a desired format.
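The extraction step described above can be sketched as follows. This is a minimal illustration, not the authors' original code: the DSN name, the libref acc, the output folder and the macro variable names are assumptions made for the example. It also assumes that SAS/ACCESS Interface to ODBC is licensed and that Txn_dt arrives in SAS as a datetime value, which is common for ODBC timestamp columns; if it arrives as a SAS date, the datepart() calls should be dropped.

```sas
/* Sketch only; names marked "hypothetical" do not come from the paper. */

libname xyz 'C:\Reports';                    /* hypothetical output folder      */
libname acc odbc dsn='MS Access Database';   /* assumed DSN from the ODBC setup */

/* Compute last week's Monday-to-Sunday range and the comparable week  */
/* 52 weeks (364 days) earlier, then store them as macro variables.    */
data _null_;
  start_ty = intnx('week.2', today(), -1, 'b');  /* week.2 = weeks starting Monday */
  end_ty   = start_ty + 6;
  start_ly = start_ty - 364;                     /* same weekday one year back     */
  end_ly   = end_ty - 364;
  call symputx('start_ty', put(start_ty, date9.));
  call symputx('end_ty',   put(end_ty,   date9.));
  call symputx('start_ly', put(start_ly, date9.));
  call symputx('end_ly',   put(end_ly,   date9.));
run;

/* One pass over the table: conditional sums split the two weeks. */
proc sql;
  create table xyz.transac as
  select
    sum(case when datepart(Txn_dt) between "&start_ty"d and "&end_ty"d
             then Txns else 0 end)          as Transactions,
    sum(case when datepart(Txn_dt) between "&start_ty"d and "&end_ty"d
             then Sales_Amt_USD else 0 end) as Sales,
    sum(case when datepart(Txn_dt) between "&start_ly"d and "&end_ly"d
             then Txns else 0 end)          as Transactionsly,
    sum(case when datepart(Txn_dt) between "&start_ly"d and "&end_ly"d
             then Sales_Amt_USD else 0 end) as Salesly
  from acc.Daily_transactions
  where datepart(Txn_dt) between "&start_ty"d and "&end_ty"d
     or datepart(Txn_dt) between "&start_ly"d and "&end_ly"d;
quit;
```

If the job runs during the week of 08-JUL-2013, the first range starts on 01-JUL-2013 and the second on 02-JUL-2012, matching the example weeks. Subtracting 364 days rather than one calendar year keeps both ranges aligned on the same weekday.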
Proc SQL gives you the power to use the SQL language in SAS and carry out all the typical data manipulation and summarization techniques, such as count, sum, average, join (merging multiple datasets), filter, insert and delete. Note also that hitting the database through Proc SQL should be kept to a minimum, because extracting data from a database can be much slower than working with SAS dataset files.

EXPORTING/SENDING THE DATASET

Once the desired dataset has been obtained, it is time to export or send it to the end user. In this example, the weekly numbers were appended to a master dataset in which all previous weekly numbers were stored. Before appending, xyz.transac must be brought into the same format as the master dataset. The weekly transaction master dataset (before the append) is shown in Display 4.

Display 4. Weekly Transactions master dataset before the Append statement

The format of xyz.transac is modified to match, and then we execute the Proc Append statement to append xyz.transac to the Weekly Transactions master dataset. After the Proc Append statement, the Weekly Transactions master dataset looks as shown in Display 5.

Display 5. Weekly Transactions master dataset after the Append statement

The above dataset is the desired dataset, which needs to be either exported to a hard drive or sent to a distribution list via email, to generate the final report in Excel. Both operations are handled in SAS code.

SCHEDULING A SAS BATCH JOB IN WINDOWS 7

The entire process described above can be fully automated and run periodically (in this case weekly) by using Windows Task Scheduler. The steps are:

1. Open Task Scheduler and click on Create Task.
2. In the General tab, enter the job name.
3. In the Triggers tab, schedule the job as per your requirement (see Display 6).
   Display 6. Setting the trigger for the batch job
4. Under Action, enter the following in the Program/script text field (see Display 7):
   C:\Program Files\SAS Institute\SAS\V9\Sas.exe -sysin c:\Batch Jobs\WUSS.sas
   Display 7. Entering the file path
5. Click OK to schedule your task. You can also check your task in the active tasks list (see Display 8).
   Display 8. Checking the list of active jobs

THE FINAL REPORT

Once the CSV file has been received via email or from the hard drive, the analyst needs to create the final report. Many company executives prefer to see the final result in Excel format. Hence we present the final report in Excel format, where the data pulled from the database is used to show weekly transaction and sales trends, along with Year over Year (YoY) changes. Display 9 shows the final report.

Display 9: The final Report

This part of the Excel processing could be automated by using Excel techniques and/or VBA for Excel. Demonstrating these techniques is beyond the scope of this paper.

CONCLUSION

SAS is a powerful tool for extracting, summarizing and presenting data in a desired format in a fully automated way. These SAS skills, along with statistical data analysis and modeling, equip the data analyst or statistician to perform end-to-end processes, i.e. from extracting raw data from a source to presenting advanced statistical inference in PowerPoint, enabling data-driven decisions and strategies.

CONTACT INFORMATION

Your comments and questions are valued and encouraged.
Contact the author at:

Name: Raj Mohanty
Enterprise: n/a
E-mail: [email protected]
Web: www.linkedin.com/in/kirtiraj/

SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration. Other brand and product names are trademarks of their respective companies.