DB2™ and the SAS® System - Information Delivery
for the 1990s
Darius S. Baer, IBM Corporation
Abstract
As IBM's strategic product for relational databases,
Database 2 (DB2™) excels by handling very large
volumes of data, interfacing with on-line programming
languages such as CICS, and ensuring data integrity
through automatic backup and recovery techniques.
No other database product offers the benefits inherent
in DB2. SAS® is recognized for its superb data
management and reporting capabilities that facilitate
enduser information delivery. SAS also offers a well-designed, easy-to-use interface between DB2 and SAS,
SAS/DB2, which expedites movement of data from DB2
into desired reports with minimal time expense and
maximal product delivered.
These environments range from business and
academic to personal and recreational. With the current wealth of tools, we truly are in the information age.
In order for a fourth generation language (4GL) like
SAS to function most effectively, it is important that it be
provided with a reliable source of data. Data can be
stored in a wide variety of ways, which fall into three
basic methods: flat sequential files without field-defining attributes, sequential files
with field-defining attributes, and relational databases
with field-defining attributes. DB2 is a good choice
because it enforces a structure around the data which is
consistent and reliable.
By emphasizing the strengths of DB2 and SAS,
successful information delivery can result. The DB2
database must be structured and designed to take
advantage of the relational aspects of DB2 by using
indexes and enforcing unique row identification. SAS
code must be written to operate efficiently in the
manipulation and summarization of data. Most
importantly, the SAS/DB2 interface must be constructed
as efficiently as possible, taking into account
requirements imposed by the DB2 tables as well as
needs defined by the SAS analysis programs.
In order for a database language such as DB2 to be of
greatest use, it is important that it be provided with a
powerful easy-to-use data processing language and
interface. The interface can use existing DB2 access
methods, but the processing language can range from
2GL assembler to 3GLs like PL/I to 4GLs like SAS. The
advantage of using SAS to process the data is that the
enduser can quickly get the desired information in a
usable output format, and the application can quickly be
modified as needed. SAS/DB2 is the interface referred to
above; it takes advantage of existing DB2 access
methods.
Information delivery for the 1990s is the marriage
between a relational database and a fourth generation
data processing language. DB2 and SAS make strong
partners in meeting this definition.
The Information Age was to begin with the 1980s.
However, the required tools had not yet matured.
These tools now exist to build information delivery
systems in the 1990's which will greatly enhance
business efficiencies and productivities. By interfacing
relational tables of almost unlimited data with enduser
menuing systems, we can deliver information in a
dynamic, syntactic-free environment. Endusers need
know only which information is needed and how to
make decisions using that information. Endusers need
not know how to get that information, nor from where
the data came. DB2 and SAS are premier
tools to help provide information delivery for the 1990's.
DB2 Strengths
As defined by Howe [3], "A database is a collection of
non-redundant data shareable between different
application systems." IBM's DB2 is the best all-around
relational database language available on the market
today. The many strengths of DB2 reinforce this belief.
These strengths include:
• Data relationship capability and enforcement
• Data integrity
• Data security and backup
• Data interface capabilities
• Data formats
• Data performance
• Data management
Introduction
Information delivery is the method by which raw data is
converted into a cohesive, coherent set of information
or answers to questions concerning those data as
illustrated in figure 1. Information delivery depends on
the ability of the system to deliver dynamic output from
a syntactic-free interface in order to meet enduser
requirements to manage their environment better.
[Figure 1. Diagram of Information Delivery]
DB2 runs under the MVS operating system on IBM
mainframe computers. The mainstay of DB2 is its ability
to provide enforceable relationships among the different
data placed in the database. This is accomplished
through the use of separate tables containing non-redundant data. Data integrity is possible because the
structures for columns (fields), tables, and indexes are
defined prior to input of data. The use of field-defining
attributes is the essence of data integrity. Uniqueness
checking occurs with the implementation of a primary
index for each table, which ensures that exactly
duplicate rows in a DB2 table cannot occur. Data
security and backup are accomplished automatically after
defining the parameters necessary to implement them.
This ensures that data will NOT be lost.
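The uniqueness checking described above can be sketched in a few lines. This is only an illustration under stated assumptions: it uses SQLite through Python's standard sqlite3 module as a stand-in for DB2, and the table name service_call, its columns, and the index name are hypothetical, not taken from the database described in this paper.

```python
import sqlite3

# SQLite (in memory) stands in for DB2 here; all names are hypothetical.
conn = sqlite3.connect(":memory:")

# Column structures are defined before any data is input.
conn.execute("""
    CREATE TABLE service_call (
        call_id     INTEGER NOT NULL,
        customer_id INTEGER NOT NULL,
        opened_on   TEXT    NOT NULL
    )
""")

# A unique index plays the role of the primary index.
conn.execute("CREATE UNIQUE INDEX service_call_px ON service_call (call_id)")
conn.execute("INSERT INTO service_call VALUES (1, 100, '1990-01-15')")


def insert_call(row):
    """Try to insert a row; report whether the database accepted it."""
    try:
        conn.execute("INSERT INTO service_call VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        # The unique index rejected a row whose key already exists.
        return False


duplicate_accepted = insert_call((1, 100, "1990-01-15"))
new_accepted = insert_call((2, 100, "1990-01-16"))
row_count = conn.execute("SELECT COUNT(*) FROM service_call").fetchone()[0]
```

The duplicate row is rejected by the database itself rather than by application code, which is the point of defining the index before data is loaded.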
The SAS programming language is one of the premier
4GL data processing tools available. SAS satisfies the
criteria that were specified in a paper presented at SUGI
14 titled "Expectations for a Fourth Generation
Language" [1]. These criteria include:
• Data management
• Data analysis
• End-user interfacing (including the use of windows,
default screens, etc.)
• The table structures in the database
• The relationships between fields in separate tables
• The use and need for primary and non-primary
indexes
• The relationship between the information needed by
the enduser and the way the database is structured
SAS/DB2 offers the end user who is willing to use a
software tool the ability to extract data directly from
DB2. The interactive interface in SAS/DB2 is very easy
to use and leads the user through the required panels.
The desired interface can be saved as a map between
DB2 and SAS and then reused. This interface
facilitates program development as well as ad-hoc
querying. With appropriate techniques, the non-programmer could be provided with menus that,
interfacing with SAS/DB2, dynamically retrieve user-designated data in real time from the database. Providing
the enduser with a non-programmer interface can be
accomplished with point-and-click and windowing
technology soon to be available in Version 6 of SAS.
With these advanced techniques, the end user need
only have a conceptual understanding of the data and
• Data input
• Graphics
• Fast data manipulation
The best method for learning how to implement the
strengths of DB2 and SAS and produce effective
information delivery is to develop the experience. In our
shop, we have some new programmers (less than 1
year programming experience) who were able to
successfully write DB2 extract programs in SAS and
process the extracted data. This statement is given to
provide an example that the process is neither overly
difficult nor excessively time consuming. Large
applications of 15 to 20 DB2 tables and 5000 lines of
SAS code can be designed, developed, and delivered
within a six-month window or less depending on the
number of programmers and their experience level.
SAS Strengths
• Fast application modification
Only if these four items are addressed will you succeed
with the information delivery begun by transferring data
from DB2 into the SAS system.
All of the characteristics mentioned contribute to the
definition of DB2 as a multifaceted data management
resource and tool.
• Data output (including reporting)
The SAS/DB2 interface, first available in version 5 of
SAS, takes advantage of the SQL interface available in
DB2. SAS/DB2 allows the user to structure queries in
either a TSO or batch environment and place the
extracted data directly into SAS datasets. We have
shown in a few tests that extracting data from DB2
through SAS/DB2 is faster and easier than using the
DB2 unload utility and inputting the resultant flat file
data into SAS. Although the SAS/DB2 interface allows
for quick and easy availability of data for the SAS
system, there are certain requirements that the
programmer or information analyst must adhere to.
These include an understanding of the following:
As DB2 has matured from its first release in 1985 to the
present, it has improved in all areas, but particularly in
formatting and performance. There are a number of
numeric and character storage formats available. The
date and time formatting now in DB2, however, is
extremely useful for many applications. Although DB2
has been able to manage large quantities of data since
its early releases, it did so with limited performance.
With the latest release, version 2.2, DB2 is offering
increased performance of as much as an order of magnitude, specifically in the area of data access. This
suggests that report generation and querying of multimillion-row tables may now be possible in a timely
manner.
• Fast application development
Combining the Strengths
Through the use of the Structured Query Language
(SQL), which is the programming language that provides
access to DB2 tables, users can easily build and
restructure tables, input data into tables, and extract
data from the tables. Many languages such as PL/I and
SAS have taken advantage of the SQL interface by
integrating that interface into their languages.
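As a rough illustration of the SQL operations just listed (build, restructure, input, extract), the sketch below embeds SQL in a host language. It assumes SQLite through Python's standard sqlite3 module as a stand-in for DB2; the table service_call and its columns are hypothetical, and DB2's own SQL dialect may differ in detail.

```python
import sqlite3

# SQLite (in memory) stands in for DB2 here; all names are hypothetical.
conn = sqlite3.connect(":memory:")

# Build a table.
conn.execute("""
    CREATE TABLE service_call (
        call_id INTEGER PRIMARY KEY,
        region  TEXT NOT NULL,
        hours   REAL NOT NULL
    )
""")

# Restructure the table by adding a column with a default value.
conn.execute("ALTER TABLE service_call ADD COLUMN status TEXT DEFAULT 'OPEN'")

# Input data into the table.
conn.executemany(
    "INSERT INTO service_call (call_id, region, hours) VALUES (?, ?, ?)",
    [(1, "WEST", 2.5), (2, "WEST", 1.5), (3, "EAST", 4.0)],
)

# Extract summarized data: number of calls and total hours per region.
summary = conn.execute("""
    SELECT region, COUNT(*), SUM(hours)
    FROM service_call
    GROUP BY region
    ORDER BY region
""").fetchall()

# Every row picked up the default status from the restructured table.
statuses = conn.execute("SELECT DISTINCT status FROM service_call").fetchall()
```

Here the host language only issues SQL and receives the result rows; the selection and summarization are done by the database, which is the same division of labor an SQL-based interface offers a language such as PL/I or SAS.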
• Fast data access
• Fast reporting techniques
In spite of all these strengths, SAS would be unusable
in a DB2 environment were it not for the SAS/DB2
interface.
Data interface capabilities are available for both input
and output. Because DB2 is a realtime, dynamic
database, data may be put into and extracted from DB2
by programs that are continually running. The CICS
language is a perfect match with DB2 for those
applications which need to input data to DB2 based on
the realtime input of data to a frontend application by
endusers. CICS is a communications language that
can interface from and to many different formats. The
information delivery example which will be presented
later uses CICS-based PL/I programs to move data into
DB2.
The degree to which SAS has met the standards is
covered in great detail elsewhere. The important point
is that SAS offers a fast, efficient methodology for
processing data. The word fast refers to the following:
be able to choose which data were needed and how
that data might be processed and presented. Sorting,
summarizing, reports, and graphics can all be
automatically made available. The skills for
accomplishing these tasks can be learned from
textbooks and/or courses on SAS and SAS/DB2.
The SAS Institute provides detailed documentation on
the SAS/DB2 interface. The IBM Corporation supplies
volumes on the maintenance and structuring of DB2 as
well as the use and methodologies available with SQL.
Any user of SAS/DB2 should obtain the SQL User's
manual and Reference Text. There is also an excellent
text on the SAS/DB2 interface written by Diane Brown
titled "Guide to SAS/DB2" [2]. To have a successful
information delivery system, use DB2, SAS, SQL, and
SAS/DB2. Learn the system while you are building
your first application.
Information Delivery Example
In our environment at IBM, we maintain a DB2
database which contains customer service data. By
organizing these data properly and providing needed
reports, service managers can make better decisions
about how to manage and improve customer service.
The database to which I refer has more than 20 tables,
some large, some small, all with unique indexes,
though. Three of these tables are very large, containing
between five and ten million rows each with as many as
50 columns per table.
This volume of data might only be somewhat difficult to
manage if it were static and never changed. On the
contrary, these tables are constantly being updated by
CICS transactions being sent from a system that
produces the data. We chose DB2 as the repository of
the data because it fulfilled our requirements for realtime updating of the database. We chose CICS as the
communications vehicle because it facilitates realtime
database updating between systems.
With the CICS interface, we have successfully
implemented a vehicle that performs well while
satisfying the need for a realtime database.
Our problems arose, though, when we looked at the
requirements for producing management reports from
the data in the database. We wrote some extraction
applications using PL/I, and although they produced the
required output, they were slow to develop, slow in run
time, and slow to modify.
During the past five months, we developed a new set of
extraction programs to produce requested flat files and
reports. We used SAS/DB2 and SAS to extract and
process the data. The development cycle was greatly
reduced from what we might have anticipated using
PL/I. We are also able to quickly respond to users'
changing requirements and quickly modify the existing
programs.
Furthermore, we developed these programs with one
experienced SAS programmer and four programmers
with less than one year each of non-SAS programming
experience. This last statement is intended to
emphasize the ease with which the development of
applications can be accomplished. Whenever possible,
use database and programming tools that enforce a
structure that minimizes error and facilitates fast and
accurate development. DB2 and SAS meet these
criteria to produce fast, accurate information delivery.
Conclusions
There are certain products in the software arena that
are referred to as strategic products or industry leaders.
Programmers and analysts who make use of these
strategic products are more likely to succeed in the long
run. Strategic products provide better support and a
higher probability of continuance in the software arena.
One of the purposes of this paper is to promote the use
of DB2 and SAS for information delivery. This promotion can be justified on the basis that DB2 and SAS
have been identified as strategic software products for
relational database languages and data processing
languages, respectively.
This paper was intended to emphasize the service of
Information Delivery as a new, modern endeavor to be
distinguished from data analysis. Data analysis
involves separate programs, each producing its own
output. Information Delivery focuses on the synergy
between a relational database and a 4GL data
processing language wherein the structures of the
components enforce that synergy. As DB2 and SAS
work together, we can see that the whole (information
delivery) is greater than the sum of the parts.
Conceptualizing the form in which information might be
delivered is not easy. However, conceptualization of
the presentation form is the major chore. Multimedia
presentation techniques provide a plethora of
opportunity for information presentation. It is equally
important that the information for presentation be easily
available through tools that facilitate information
delivery. Information delivery includes both the storage
and maintenance as well as the processing of the data
and presentation of the derived information.
As the information age matures, we will see a greater
formalization of information delivery methodologies.
The task at hand is to design information delivery
systems. The better the tools we have are at assisting
in the task, the better will be the information delivery
systems. DB2 and SAS have proven, and as strategic
information delivery products will continue to prove, to
be premier tools for information delivery for the 1990s
and beyond.
Bibliography
1. Baer, Darius (1989). Expectations for a Fourth
Generation Language. In SAS® Users Group
International Proceedings of the Fourteenth Annual
Conference, SAS Institute, Inc., Cary, NC.
2. Brown, Diane (1989). Guide to SAS/DB2,
McGraw-Hill, New York.
3. Howe, D. R. (1989). Data Analysis for Data Base
Design, Edward Arnold, London.
For more information, contact:
Darius S. Baer, Ph.D.
Department 77K
Support Delivery Systems
NSD Boulder
IBM Corporation
5600 N. 63rd Street
Boulder, CO 80314
(303) 924-2108