DB2™ and the SAS® System - Information Delivery for the 1990s

Darius S. Baer, IBM Corporation

Abstract

As IBM's strategic product for relational databases, Database 2 (DB2™) excels by handling very large volumes of data, interfacing with on-line programming languages such as CICS, and ensuring data integrity through automatic backup and recovery techniques. No other database product offers the benefits inherent in DB2. SAS® is recognized for its superb data management and reporting capabilities that facilitate enduser information delivery. SAS also offers a well-designed, easy-to-use interface between DB2 and SAS, SAS/DB2, which expedites movement of data from DB2 into desired reports with minimal time expense and maximal product delivered.

By emphasizing the strengths of DB2 and SAS, successful information delivery can result. The DB2 database must be structured and designed to take advantage of the relational aspects of DB2 by using indexes and enforcing unique row identification. SAS code must be written to operate efficiently in the manipulation and summarization of data. Most importantly, the SAS/DB2 interface must be constructed as efficiently as possible, taking into account requirements imposed by the DB2 tables as well as needs defined by the SAS analysis programs.

Information delivery for the 1990s is the marriage between a relational database and a fourth generation data processing language. DB2 and SAS make strong partners in meeting this definition.

Introduction

Information delivery is the method by which raw data is converted into a cohesive, coherent set of information or answers to questions concerning those data, as illustrated in figure 1. Information delivery depends on the ability of the system to deliver dynamic output from a syntactic-free interface in order to meet enduser requirements to manage their environment better. These environments range from business and academic to personal and recreational. With the current wealth of tools, we truly are in the information age.

[Figure 1. Diagram of Information Delivery (labels: RAW DATA, DB2, SET OF INFORMATION)]

In order for a fourth generation language (4GL) like SAS to function most effectively, it is important that it be provided with a reliable source of data. Data can be stored in a wide variety of ways. These fall into three basic methods: flat, sequential files without field-defining attributes; sequential files with field-defining attributes; and relational databases with field-defining attributes. DB2 is a good choice because it enforces a structure around the data which is consistent and reliable.

In order for a database language such as DB2 to be of greatest use, it is important that it be provided with a powerful, easy-to-use data processing language and interface. The interface can use existing DB2 access methods, but the processing language can range from 2GL assembler to 3GLs like PL/I to 4GLs like SAS. The advantage of using SAS to process the data is that the enduser can quickly get the desired information in a usable output format, and the application can quickly be modified as needed. SAS/DB2 is the preferred interface that takes advantage of existing DB2 access methods.

The Information Age was to begin with the 1980's. However, the required tools had not yet matured. These tools now exist to build information delivery systems in the 1990's which will greatly enhance business efficiencies and productivities. By interfacing relational tables of almost unlimited data with enduser menuing systems, we can deliver information in a dynamic, syntactic-free environment. Endusers need know only which information is needed and how to make decisions using that information. Endusers need not know how to get that information, nor from where the data came. DB2 and SAS are premier strategic tools to help provide information delivery for the 1990's.

DB2 Strengths

As defined by Howe [3], "A database is a collection of non-redundant data shareable between different application systems." IBM's DB2 is the best all-around relational database language available on the market today. The many strengths of DB2 reinforce this belief. These strengths include:

• Data relationship capability and enforcement
• Data integrity
• Data security and backup
• Data interface capabilities
• Data formats
• Data performance
• Data management

DB2 runs under the MVS operating system on IBM mainframe computers. The mainstay of DB2 is its ability to provide enforceable relationships among the different data placed in the database. This is accomplished through the use of separate tables containing nonredundant data.

Data integrity is possible because the structures for columns (fields), tables, and indexes are defined prior to input of data. The use of field-defining attributes is the essence of data integrity. Uniqueness checking occurs with the implementation of a primary index for each table, which ensures that exactly duplicate rows in a DB2 table cannot occur.

Data security and backup are accomplished automatically after defining the parameters necessary to implement them. This ensures that data will NOT be lost.

Data interface capabilities are available for both input and output. Because DB2 is a realtime, dynamic database, data may be put into and extracted from DB2 by programs that are continually running. The CICS language is a perfect match with DB2 for those applications which need to input data to DB2 based on the realtime input of data to a frontend application by endusers. CICS is a communications language that can interface from and to many different formats. The information delivery example which will be presented later uses CICS based PL/I programs to move data into DB2.

Through the use of the Structured Query Language (SQL), which is the programming language that provides access to DB2 tables, users can easily build and restructure tables, input data into tables, and extract data from the tables. Many languages such as PL/I and SAS have taken advantage of the SQL interface by integrating that interface into their languages.

As DB2 has matured from its first release in 1985 to the present, it has improved in all areas, but particularly in formatting and performance. There are a number of numeric and character storage formats available. The date and time formatting now in DB2, however, is extremely useful for many applications. Although DB2 has been able to manage large quantities of data since its early releases, it did so with limited performance. With the latest release, version 2.2, DB2 is offering increased performance of as much as an order of magnitude, specifically in the area of data access. This suggests that report generation and querying of multimillion row tables may now be possible in a timely manner.

All of the characteristics mentioned contribute to the definition of DB2 as a multifaceted data management resource and tool.

SAS Strengths

The SAS programming language is one of the premier 4GL data processing tools available. SAS satisfies the criteria that were specified in a paper presented at SUGI 14 titled "Expectations for a Fourth Generation Language" [1]. These criteria include:

• Data management
• Data analysis
• End-user interfacing (including the use of windows, default screens, etc.)
• Data input
• Graphics
• Data output (including reporting)

The degree to which SAS has met the standards is covered in great detail elsewhere. The important point is that SAS offers a fast, efficient methodology for processing data. The word fast refers to the following:

• Fast data manipulation
• Fast application development
• Fast application modification
• Fast data access
• Fast reporting techniques

In spite of all these strengths, SAS would be unusable in a DB2 environment were it not for the SAS/DB2 interface.

Combining the Strengths

The SAS/DB2 interface, first available in version 5 of SAS, takes advantage of the SQL interface available in DB2. SAS/DB2 allows the user to structure queries in either a TSO or batch environment and place the extracted data directly into SAS datasets. We have shown in a few tests that extracting data from DB2 through SAS/DB2 is faster and easier than using the DB2 unload utility and inputting the resultant flat file data into SAS.

Although the SAS/DB2 interface allows for quick and easy availability of data for the SAS system, there are certain requirements that the programmer or information analyst must adhere to. These include an understanding of the following:

• The table structures in the database
• The relationships between fields in separate tables
• The use and need for primary and non-primary indexes
• The relationship between the information needed by the enduser and the way the database is structured

Only if these four items are addressed will you succeed with the information delivery begun by transferring data from DB2 into the SAS system.

SAS/DB2 offers the end user who is willing to use a software tool the ability to extract data directly from DB2. The interactive interface in SAS/DB2 is very easy to use and leads the user through the required panels. The desired interface can be saved as a map between DB2 and SAS and then reused. This interface facilitates program development as well as ad-hoc querying.

With appropriate techniques, the nonprogrammer could be provided with menus that, interfacing with SAS/DB2, dynamically retrieve user-designated data realtime from the database. Providing the enduser with a non-programmer interface can be accomplished with point-and-click and windowing technology soon to be available in Version 6 of SAS. With these advanced techniques, the end user need only have a conceptual understanding of the data and be able to choose which data are needed and how that data might be processed and presented. Sorting, summarizing, reports, and graphics can all be automatically made available.

The best method for learning how to implement the strengths of DB2 and SAS and produce effective information delivery is to develop the experience. In our shop, we have some new programmers (less than 1 year programming experience) who were able to successfully write DB2 extract programs in SAS and process the extracted data. This statement is given to provide an example that the process is neither overly difficult nor excessively time consuming. Large applications of 15 to 20 DB2 tables and 5000 lines of SAS code can be designed, developed, and delivered within a six month window or less, depending on the number of programmers and their experience level.

The skills for accomplishing these tasks can be learned from textbooks and/or courses on SAS and SAS/DB2. The SAS Institute provides detailed documentation on the SAS/DB2 interface. The IBM corporation supplies volumes on the maintenance and structuring of DB2 as well as the use and methodologies available with SQL. Any user of SAS/DB2 should obtain the SQL User's manual and Reference Text. There is also an excellent text on the SAS/DB2 interface written by Diane Brown titled "Guide to SAS/DB2" [2]. To have a successful information delivery system, use DB2, SAS, SQL, and SAS/DB2. Learn the system while you are building your first application.

Information Delivery Example

In our environment at IBM, we maintain a DB2 database which contains customer service data. By organizing these data properly and providing needed reports, service managers can make better decisions about how to manage and improve customer service.

The database to which I refer has more than 20 tables, some large, some small, all with unique indexes, though.
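As a minimal sketch of what a unique index on each table provides, the following Python fragment uses SQLite (through the standard sqlite3 module) purely as a stand-in for DB2. The table, column, and index names are hypothetical and are not taken from the database described here; the point is only that a unique index rejects a row whose key already exists, and that a summarizing extract can then rely on every row being distinct.

```python
import sqlite3

# SQLite stands in for DB2 here purely for illustration; the table,
# column, and index names are hypothetical, not the paper's schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE service_call ("
    " call_id INTEGER NOT NULL,"
    " region TEXT NOT NULL,"
    " hours REAL NOT NULL)"
)
# The unique index plays the role of a primary index:
# no two rows may share the same call_id.
conn.execute("CREATE UNIQUE INDEX service_call_ix ON service_call (call_id)")

rows = [(1, "WEST", 2.5), (2, "WEST", 1.0), (3, "EAST", 4.0)]
conn.executemany("INSERT INTO service_call VALUES (?, ?, ?)", rows)


def try_insert(row):
    """Return True if the row was stored, False if the unique index rejected it."""
    try:
        conn.execute("INSERT INTO service_call VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


# Re-sending an existing call_id is refused by the index.
duplicate_accepted = try_insert((1, "WEST", 2.5))

# A summarizing extract, as a reporting program might issue it.
summary = conn.execute(
    "SELECT region, COUNT(*), SUM(hours) FROM service_call"
    " GROUP BY region ORDER BY region"
).fetchall()
print(duplicate_accepted, summary)
```

In a DB2 and SAS/DB2 setting the same idea would be expressed through DB2's own index definitions and SAS extract programs; the sketch only mirrors the behavior, not the syntax of either product.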
Three of these tables are very large, containing between five and ten million rows each with as many as 50 columns per table. This volume of data might only be somewhat difficult to manage if it were static and never changed. On the contrary, these tables are constantly being updated by CICS transactions sent from a system that produces the data.

We chose DB2 as the repository of the data because it fulfilled our requirements for realtime updating of the database. We chose CICS as the communications vehicle because it facilitates realtime database updating between systems. With the CICS interface, we have successfully implemented a vehicle that performs well while satisfying the need for a realtime database.

Our problems arose, though, when we looked at the requirements for producing management reports from the data in the database. We wrote some extraction applications using PL/I, and although they produced the required output, they were slow to develop, slow in run time, and slow to modify.

During the past five months, we developed a new set of extraction programs to produce requested flat files and reports. We used SAS/DB2 and SAS to extract and process the data. The development cycle was greatly reduced from what we might have anticipated using PL/I. We are also able to respond quickly to users' changing requirements and quickly modify the existing programs. Furthermore, we developed these programs with one experienced SAS programmer and four programmers with less than one year each of non-SAS programming experience. This last statement is intended to emphasize the ease with which the development of applications can be accomplished.

Whenever possible, use database and programming tools that enforce a structure that minimizes error and facilitates fast and accurate development. DB2 and SAS meet these criteria to produce fast, accurate information delivery.

Conclusions

There are certain products in the software arena that are referred to as strategic products or industry leaders. Programmers and analysts who make use of these strategic products are more likely to succeed in the long run. Strategic products provide better support and a higher probability of continuance in the software arena. One of the purposes of this paper is to promote the use of DB2 and SAS for information delivery. This promotion can be justified on the basis that DB2 and SAS have been identified as strategic software products for relational database languages and data processing languages, respectively.

This paper was intended to emphasize the service of Information Delivery as a new, modern endeavor to be distinguished from data analysis. Data analysis involves separate programs, each producing its own output. Information Delivery focuses on the synergy between a relational database and a 4GL data processing language wherein the structures of the components enforce that synergy. As DB2 and SAS work together, we can see that the whole (information delivery) is greater than the sum of the parts.

Conceptualizing the form in which information might be delivered is not easy. However, conceptualization of the presentation form is the major chore. Multimedia presentation techniques provide a plethora of opportunity for information presentation. It is equally important that the information for presentation be easily available through tools that facilitate information delivery. Information delivery includes both the storage and maintenance as well as the processing of the data and presentation of the derived information.

As the information age matures, we will see a greater formalization of information delivery methodologies. The task at hand is to design information delivery systems. The better the tools we have are at assisting in the task, the better will be the information delivery systems. DB2 and SAS have proven, and as strategic information delivery products will continue to prove, to be premier tools for information delivery for the 1990s and beyond.

Bibliography

1. Baer, Darius (1989). Expectations for a Fourth Generation Language. In SAS® Users Group International Proceedings of the Fourteenth Annual Conference, SAS Institute, Inc., Cary, NC.
2. Brown, Diane (1989). Guide to SAS/DB2, McGraw-Hill, New York.
3. Howe, D. R. (1989). Data Analysis for Data Base Design, Edward Arnold, London.

For more information, contact:
Darius S. Baer, Ph.D.
Department 77K
Support Delivery Systems
NSD Boulder
IBM Corporation
5600 N. 63rd Street
Boulder, CO 80314
(303) 924-2108