Corporate Data: Bringing Together the Islands of Information with SAS/ACCESS® Software

Emily P. Wallace, SAS Institute Inc., Cary, NC

ABSTRACT

Since many companies have more than one database management system, an important concern is how to bring together the data from these islands of information in the most efficient and effective manner. SAS/ACCESS software provides a transparent link to your database software. These interfaces allow everyone in the company, from end users to database administrators, to use the tools that suit their activities to reach the database data and to perform their analysis, reporting, or querying functions. This paper examines the ways that SAS/ACCESS software integrates into SAS® applications to deliver database information transparently.

INTRODUCTION

A typical corporate computer installation today consists of a mixture of hardware platforms, operating systems, telecommunications products and database systems. Oftentimes this is a result of events such as corporate mergers or acquisitions and data center consolidations or decentralization. Frequently it results in machines in one location exchanging data with machines of another hardware type in a remote location. Corporations are finding that consolidating data from the many different data sources can be a time-consuming task and that the need for up-to-the-minute information makes the task of consolidating this data even more challenging. This paper discusses the ways that SAS/ACCESS software can be used along with other SAS products to bring together the data in these islands of information to produce reports, build applications or perform ad hoc queries.

SAS/ACCESS software is a set of individually licensed products that provide interfaces between SAS software and popular database management systems. They provide transparent access to your database data so that you can use one set of commands and applications to get to all of your corporate data. The interface products currently available are for DB2, IMS, ADABAS, CA-DATACOM/DB and SYSTEM 2000 software on MVS; SQL/DS and SYSTEM 2000 software on CMS; ORACLE on VMS, AOS/VS, and PRIMOS; and Rdb/VMS and INGRES on VMS. There are also interface products under OS/2 for dBASE files, DIF files and the OS/2 Database Manager. All of these interfaces are available with the current production releases of SAS software for their hosts. Other interfaces are under development for platforms such as Microsoft Windows, VSE and some UNIX operating systems, so many new interfaces will be available within the next year.

SAS/ACCESS software consists of the ACCESS procedure, the interface engine and, in many products, the DBLOAD procedure. The ACCESS procedure is used to create access and view descriptors for the databases and tables of the database system. The descriptors contain the information required to communicate with the database system and to retrieve the database data for a SAS procedure or the DATA step.

The interface engine uses the information in the view descriptor to translate requests for data from a SAS application into calls that can be processed by the database system. With the exception of the DIF engine, all of the interface engines also update database data. When a SAS application references a view descriptor, it is the interface engine that works behind the scenes to retrieve the database data and translate it into a SAS observation that can be used by the SAS application.

The DBLOAD procedure is supported by the DB2, SQL/DS, ORACLE, Rdb/VMS, INGRES, dBASE, DIF and OS/2 Database Manager products. The DBLOAD procedure creates and loads a database table or file using a SAS data set as input. Because the view descriptors used by the SAS/ACCESS products appear to SAS procedures as a SAS data set, data from one database system can be quickly migrated to another database system using PROC DBLOAD.

In Release 6.07 the DB2, SQL/DS, ORACLE, and Rdb/VMS interfaces have been enhanced to support the SQL procedure's Pass-Through feature. This feature enables you to pass database-specific SQL statements directly to the database system for processing. You can pass both SELECT statements and non-SELECT statements to the database. The Pass-Through facility allows you to optimize your PROC SQL queries by letting the database system handle the portions that it can optimize, such as joining two tables from the same database system, and by letting PROC SQL handle the portions that the database system cannot do, such as joining two tables from different database systems.

THE APPLICATION

The parent corporation purchased a large conference hotel in the Bahamas several years ago. Recently they acquired the casino next to the hotel and decided to merge the casino and the hotel into the same business unit. They consolidated the two data centers into one and are rewriting applications to combine the hotel and the casino information.

Figure 1 (Combined Data Center: Mainframe Running MVS)

Two Database Systems on the Same Host

The hotel has always kept its corporate data in DB2 tables, and the casino keeps its data in IMS databases. Both database systems are now running on the same CPU. The parent corporation wants to plan an advertising campaign for the hotel and casino and has requested a report of the 1991 profit figures by month to use in the planning process. The data processing staff asks their SAS wizard to write a program to produce the report.

The SAS wizard codes the following statements:

    proc access dbms=db2;
       create sasuser.hotel.access;
       table=hotel.accounts;
       create sasuser.hotel.view;
       select all;
       rename total htotal;
       format month 2. year 2. rooms dollar14.2 rest_bar dollar14.2
              giftshop dollar14.2 htotal dollar16.2;
       subset where year = 91;
    run;

    proc access dbms=ims;
       create sasuser.casino.access;
       dbd=finance dbtype=hdam;
       record=profit sg=profit sl=100;
       item=month lv=3 dbf=2. se=month key=y sn=month;
       item=year lv=3 dbf=2. se=year key=y sn=year;
       item='slot machines' lv=3 dbf=14.2 se=slots sn=slots;
       item=blackjack lv=3 dbf=14.2 se=blkjack sn=blkjack;
       item=baccarat lv=3 dbf=14.2 se=baccarat sn=baccarat;
       item=poker lv=3 dbf=14.2 se=poker sn=poker;
       item='sports bets' lv=3 dbf=14.2 se=sportbet sn=sportbet;
       item=total lv=3 dbf=16.2 se=total sn=ctotal;
       create sasuser.casino.view psb=cfinance;
       select all;
       subset where year = 91;
    run;

    proc sql;
       select hotel.month, hotel.year, rooms, rest_bar, giftshop, htotal,
              sum(slots, blkjack, baccarat, poker, sportbet)
                 as gambling format=dollar14.2,
              sum(restaurt, bars) as food format=dollar14.2,
              ctotal,
              sum(htotal, ctotal) as profit format=dollar16.2
          from sasuser.hotel, sasuser.casino
          where hotel.month = casino.month and hotel.year = casino.year;
    quit;

These SAS statements produce a report that displays the profit for each month in 1991 for the hotel, the casino, and their combined profit.

The first PROC ACCESS step uses the statement syntax that is available in Release 6.07 to create an access and view descriptor for the hotel's accounting table. The first CREATE statement names the access descriptor, and the TABLE statement following it names the DB2 table that will be described by the access descriptor. Because there are no other statements before the next CREATE statement, the access descriptor is created with the default values of no dropped columns, no SAS names and the default formats. The second CREATE statement creates the view descriptor for the accounting table. The SELECT ALL statement causes all columns to be included in the view descriptor, and the RENAME and FORMAT statements specify names and formats to be used by the SAS procedures or DATA step when displaying the information. The SUBSET clause specifies criteria to use in selecting rows and in most cases is passed to DB2 for processing.

The second PROC ACCESS step creates both access and view descriptors for the casino's accounting data, which are stored in an IMS database. The access descriptor definition process is more complicated than it is for DB2 because IMS does not store descriptive information about the contents of its databases. This information must be supplied at access descriptor definition time in order to correctly process the IMS data. If the access descriptor is stored in a permanent SAS data library, it can be created once and used many times to create views, so that all of the information does not have to be supplied every time a new view is needed. The DBD statement following the first CREATE statement supplies the name of the IMS database as it is defined in the IMS Database Description (DBD) control block, and the type field must also match the value contained in the DBD. There is one RECORD statement for each segment in the database, and it supplies the segment name and length as defined in the DBD. Following the RECORD statement are ITEM statements for each field in the segment. They supply the 32-character name of the field, the level number to indicate groups, the format that describes the way the data are stored on disk, the name of any search fields that are defined in the DBD, whether the item is a key, and a SAS name to use for the field. It is very important to supply this information correctly because it is used by the IMS engine to generate its calls.

Once the access descriptor is created, one or more views can be created. The CREATE statement for the view descriptor must contain the name of the IMS Program Specification Block (PSB) that will be used by the IMS engine in its calls to the database system. The SUBSET clause specifies criteria to use in selecting records, and the IMS engine turns it into Segment Search Arguments (SSAs) to be used with the IMS calls when it is possible.

The third step in this program uses PROC SQL to join the IMS and DB2 data together using the two view descriptors that were just created. It uses the SUM function to aggregate some of the data and uses the AS keyword to supply names and formats for the summed columns. The WHERE clause supplies the joining information to relate the IMS data and DB2 data. Output 1 shows the summary information from the two database systems.

    MONTH  YEAR          ROOMS       REST_BAR   GIFTSHOP         HTOTAL
        1    91  $5,223,714.30    $856,799.48  $4,822.30  $6,085,336.08
        2    91  $7,910,042.33  $1,423,947.09  $9,290.44  $9,343,279.86
       12    91  $7,910,042.33  $1,423,947.09  $9,290.44  $9,343,279.86

    MONTH        GAMBLING           FOOD          CTOTAL          PROFIT
        1  $17,653,394.45  $3,933,044.98  $21,586,439.43  $27,671,775.51
        2  $14,858,948.15  $2,831,494.56  $17,690,442.71  $27,033,722.57
       12  $14,858,948.15  $2,831,494.56  $17,690,442.71  $27,033,722.57

Output 1

The SAS wizard decides to produce the report with a DATA step program because the report can be tailored. The program statements follow:

    proc access dbms=db2;
       /* same access statements */
    run;

    data _null_;
       merge sasuser.hotel sasuser.casino;
       by month year;
       if year=91;
       if month = 1 then do;
          put @20 '1991 Profit for Hotel and Casino';
          put @24 '(in millions of dollars)';
          put / /;
          put @9 '------- Hotel -------' @42 '---- Casino ----' @67 'Total';
          put 'Mon' @6 'Rooms' @14 'Food' @21 'Gifts' @30 'Total'
              @38 'Gambling' @48 'Food' @56 'Total' @66 'Profit';
       end;
       gambling = (slots + blkjack + baccarat + poker + sportbet)/1000000;
       food = (restaurt + bars)/1000000;
       profit = (htotal + ctotal)/1000000;
       zrooms = rooms/1000000;
       zfood1 = rest_bar/1000000;
       zgifts = giftshop/1000000;
       ztotal1 = htotal/1000000;
       ztotal2 = ctotal/1000000;
       put month 2.0 zrooms 8.3 zfood1 8.3 zgifts 8.3 ztotal1 8.3
           gambling 10.3 food 8.3 ztotal2 9.3 profit 10.3;
    run;

Output 2 shows the report produced by the program statements.

                    1991 Profit for Hotel and Casino
                        (in millions of dollars)

            ------- Hotel -------          ---- Casino ----      Total
    Mon   Rooms    Food   Gifts   Total   Gambling    Food   Total   Profit
      1   5.224   0.857   0.005   6.085     17.653   3.933  21.586   27.672
      2   7.910   1.424   0.009   9.343     14.859   2.831  17.690   27.034
     12   7.910   1.424   0.009   9.343     14.859   2.831  17.690   27.034

Output 2

The first two steps in this example are the same as in the previous example. If the view descriptors have already been created, the first two steps do not need to be run again. The third step is a DATA step that combines the IMS and DB2 data using the MERGE statement, matching rows on the values of the month and year columns of each. After the merge is done, it is a simple matter to pick out the 1991 information and format a report. Only one statement is needed to combine the IMS and DB2 data, and the rest of the program just specifies formatting information for the report.

Combining data from two different databases on the same host becomes very easy with SAS/ACCESS software. Once the view descriptors are built, you use the same SAS code to combine them as you would use for combining two SAS data sets. These same techniques can also be used to build full-screen applications using SAS/AF and SAS/FSP software.

Database Systems on Two Hosts

The hotel is adding a tour desk for its guests to use in booking sightseeing, sailing, diving, and other types of excursions. They are purchasing an AS/400 machine and a travel desk software package for it. They use the AS/400 to handle the reservations, payments and simple accounting tasks. The hotel needs to transfer the accounting data to their mainframe in order to incorporate them into the financial information sent to the parent company. They want to automate the transfer process so that it can be done easily every night. Their SAS wizard suggests that they use the new SAS/ACCESS interface to AS/400 Data in conjunction with SAS/CONNECT software to move the financial data from the AS/400 to the mainframe.

Figure 2 (Hotel and Casino Data Center: MVS mainframe and PS/2)

The SAS/ACCESS Interface to AS/400 Data is a product that runs under base SAS software on OS/2 and uses IBM Corporation's PC Support/400 software to execute queries on a remote AS/400. A user running a SAS session under OS/2 can use the AS/400 interface product to retrieve AS/400 data for use by procedures or the DATA step. The data are fetched directly into the SAS procedure without storing them in an intermediate SAS file, and the interface allows both reading and updating of the AS/400 data. There is currently a test version of the product available with Releases 6.06 and 6.07 under OS/2.

In order to bring data to the MVS system for analysis, the SAS wizard writes a program that uses the DOWNLOAD procedure, executing on an OS/2 system, to retrieve data from an AS/400 table and copy it into a SAS data set on the MVS system.

    options comamid=APPC;
    signon os2;
    rsubmit os2;
       proc download data=travel.receipts out=sasuser.tourdesk;
       run;
    endrsubmit;
    signoff os2;

The program uses SAS/CONNECT software to download the AS/400 data to the SAS session running on MVS. The OPTIONS statement specifies the name of the communications access method that will be used, and the SIGNON command initiates the process of logging in to the OS/2 session and starting a SAS execution. Once the SAS session has started, the DOWNLOAD procedure is submitted to the remote OS/2 host for execution. This causes PROC DOWNLOAD to run under OS/2 to retrieve the AS/400 table described by the travel.receipts view descriptor and put the data into a SAS data set on MVS.

Once the SAS data set has been created on MVS, it can be combined with the other financial data from IMS and DB2 and used for reports, forecasting, graphs, or other types of analysis. The combination of SAS/ACCESS software and SAS/CONNECT software makes it easy for you to move data from one host to another so that you can build applications and reports that summarize all of your corporate information.
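
As a rough illustration of that last step, the downloaded data set could be joined to the hotel view descriptor in the same way the IMS and DB2 views were joined earlier. This is only a sketch under stated assumptions: the paper does not show the layout of sasuser.tourdesk, so the columns month, year, and tourtot used for it below are hypothetical names chosen for the example, as is the output column hoteltot.

```sas
/* Sketch only: month, year, and tourtot are assumed column names */
/* for sasuser.tourdesk; the paper does not list its columns.     */
proc sql;
   select h.month, h.year, h.htotal,
          t.tourtot format=dollar14.2,
          /* row-wise sum of the two totals, as in the earlier PROC SQL step */
          sum(h.htotal, t.tourtot) as hoteltot format=dollar16.2
      from sasuser.hotel h, sasuser.tourdesk t
      where h.month = t.month and h.year = t.year;
quit;
```

Because the view descriptor and the downloaded data set both look like SAS data sets to PROC SQL, the join needs no database-specific syntax; only the assumed column names would have to be adjusted to the real table.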
COMBINING DATA FROM ONE HOST WITH DATABASE DATA ON ANOTHER HOST

The parent company keeps all of its financial data in Rdb/VMS tables on its CPU at the corporate headquarters. It needs to merge the hotel and casino financial data with the data from its other business units. The corporate SAS wizard suggests that each business unit can use SAS/CONNECT software to upload the data to the corporate headquarters. Once the SAS data sets are on the corporate CPU, the wizard can develop applications to analyze this data along with the corporate financial data stored in the Rdb/VMS tables. The wizard decides to build a full-screen application to report on the corporate financial information and to integrate all of the data from the various business units into one application.

The following menu displays a list of financial reports and graphs that are in the application.

    +Financial Reporting System--------------------------------+
    | Command ===>                                             |
    |                                                          |
    |      Parent Corporation Financial Reporting System       |
    |                                                          |
    |   Select one report                                      |
    |                                                          |
    |      Month to Date Receipts                              |
    |      Year to Date Receipts                               |
    |      Month to Date Expenses                              |
    |      Year to Date Expenses                               |
    |      YTD Profit - Pie Chart                              |
    +----------------------------------------------------------+

If the chief financial officer selects the pie chart of year-to-date profit, he is invoking an application to upload the financial data from the various business centers, join it with the Rdb/VMS financial data and produce a pie chart of the profit figures.

The Screen Control Language program that joins the data and produces the pie chart is:

    INIT:
       /* do upload from business units */
       call display('upload.program');
       submit sql continue;
          create view ytdprf as select a.month,
             sum(ctotal, cprofit) as profit format=dollar15.2
             from connection to rdb ( select month, ctotal from
             corp_finance where year=92) a, sasuser.hcprofit b
             where a.month = b.month and b.year=1992;
       endsubmit;
       submit continue;
          title color=white '1992 Year to Date Profit';
          pattern1 value=solid color=yellow;
          pattern2 value=solid color=cyan;
          pattern3 value=solid color=pink;
          proc gchart data=ytdprf;
             pie month / discrete sumvar=profit
                 coutline=white noheading matchcolor;
          run;
       endsubmit;
    MAIN:
       return;
    TERM:
       return;

This program submits statements to SQL for processing using the SUBMIT SQL CONTINUE command. The SQL statements create a temporary view called ytdprf that joins data from the uploaded SAS data set sasuser.hcprofit with data in the Rdb/VMS table corp_finance that are obtained using the Pass-Through feature of SQL. When the GCHART procedure references the ytdprf view, the SELECT statement following the keywords CONNECTION TO RDB is sent directly to Rdb/VMS for processing. The data that are returned from the SELECT statement are formatted as if they were coming from a SAS file so that the SQL join can occur. The pie chart produced from the combined Rdb/VMS and SAS data is then displayed for the chief financial officer.

Figure 3

CONCLUSION

SAS software offers you the flexibility to access your data even if they are stored on a variety of different hardware platforms and in a variety of different database systems and files. The powerful analytical, reporting, and graphics tools of SAS software can be used to build applications that access your corporate data wherever they reside. You use the same set of programming statements and application tools to work with your data whether they are in a SAS data set or in a database system. Because SAS applications are portable, the applications and programs can be moved from one machine to another as your application needs and database systems change. The SAS System makes it easy to connect your islands of information.

SAS, SAS/ACCESS, SAS/AF, SAS/FSP, SYSTEM 2000, and SAS/CONNECT are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration.

DB2, AS/400, OS/2, and SQL/DS are registered trademarks or trademarks of International Business Machines Corporation.

Other brand and product names are registered trademarks or trademarks of their respective companies.