WHY STANDARD UTILITY PROGRAMS ARE GOOD,
AND WHY YOU SHOULD USE THE SAS® SYSTEM TO CREATE THEM
HENRY B. WINSOR, TAYLOR MANAGEMENT SYSTEMS, INC.
PK SHARMA, TAYLOR MANAGEMENT SYSTEMS, INC.

ABSTRACT
This paper discusses the advantages of developing and using standard
utility programs for programmer support, points out reasons for using
the SAS System as the primary development environment for these
utilities, and demonstrates a technique that is useful in utility
program development. The code needed to implement an error-location
utility on two different operating systems is used for illustrative
purposes.

INTRODUCTION
In every programmer's bag of tricks, there should be the concept of
creating a special program for her/his own use, one that performs
certain actions, normally of a repetitive nature. This program is not an
application in the usual sense, since it is not designed to be used by a
non-programmer. Instead, it is used by the programmer to assist her/him
in some task of program development and/or maintenance. At one time or
another, every real programmer has written utility programs; she/he just
may not have thought of them in those terms. In this paper, we intend to
discuss not whether utility programs are useful, but why the effort
should be made to develop standard utilities, and why the SAS System
makes an excellent choice for the development environment.

WHY STANDARD UTILITY PROGRAMS?
The advantages of standard utility programs can be summed up in one
phrase: standardization and consistency. If a programming group
standardizes on one tool to provide a specific function, such as
FINDERRS (the error-locating utility presented later in this paper),
they have only one program to modify as their needs evolve, one program
to validate, and one program to train new employees to use.

Think what life would be like if every programmer in a group were using
a different language. The disparate languages would make it very
difficult for programmers to work together and assist each other as
needed. It would be impossible to transfer work from one programmer to
another without either staffing the group with programmers who are
fluent in every language, or having them spend much of their time
learning other languages. The cost advantages of using only one language
should be obvious.

In a similar way, letting (or forcing) programmers to develop their own
utility programs independently of each other will tend to waste
resources and discourage cooperative efforts within the group. As an
example, let's take a look at the error conditions defined in FINDERRS.

    B='ERROR: ';
    C='WARNING';
    D='W.D FORMAT';
    E=' ENDSAS';

ERROR: is the minimum string that uniquely identifies error messages
generated by the SAS System, and everybody will agree that these
messages are of sufficient importance to be brought to someone's
attention. WARNING identifies system warning messages, which are not
necessarily as crucial and may not require fixing. W.D FORMAT identifies
when the program has been forced to shift a format from the one
specified in the program, which may be of no importance. ENDSAS is
executable code, not any sort of message at all. These four phrases will
indicate different things to different programmers, but, at our current
site, they are all treated as unrecoverable errors: messages not allowed
to be present in finalized programs.
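
As a rough illustration only, and not the FINDERRS program itself (which is written in SAS), the idea of keeping every string the site treats as an error in a single place can be sketched as a small shell function. The function name scan_log and the file name sample.log are hypothetical; the four search strings are the ones discussed above, and the case-insensitive match is an assumption standing in for an upper-case comparison.

```shell
#!/bin/sh
# Hypothetical sketch: list every log line that contains one of the four
# strings the site treats as errors. This is an illustration of the
# "one shared list of conditions" idea, not the authors' SAS code.
scan_log() {
    # -n  prefix each hit with its physical line number in the file
    # -i  ignore case
    # -F  treat patterns as fixed strings, so the "." in "W.D FORMAT"
    #     is a literal period
    # -e  one option per search string
    grep -n -i -F \
        -e 'ERROR: ' \
        -e 'WARNING' \
        -e 'W.D FORMAT' \
        -e ' ENDSAS' \
        "$1"
}

# Example use (sample.log is a hypothetical file name):
#   scan_log sample.log || echo "no flagged lines in sample.log"
```

If site policy changes, only the list of -e strings in this one function has to change, which is the same argument made here for a single shared utility.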

By having these different checks in one program, it is easier to enforce
the departmental policy. A programmer has one tool that tells him what
messages need to be cleaned up, and his supervisor has the same tool to
check that the work has been done properly. There's no question whether
all the checks were performed, since the task of checking is as simple
and straightforward as running a single program. If policy changes, the
change can be implemented by modifying a single program, not many. For
the same reasons that people choose a single environment for
programming, they should also provide standard utility programs. While
utility programs make a programmer's job easier, standard utilities make
the job of their leaders easier.

WHY USE SAS?
The major reason for using the SAS System is that programmers are
already familiar with the environment. They would be using the language
in which they probably have the most experience, and one in which their
company has already made an investment. The group will have capable
programmers on staff, people who are already developing utility programs
on their own. If they are used to develop the standard utilities, there
will be no need to seek outside (expensive) assistance. Also, by having
them work on something a little different from their normal programming
work, their capabilities will be stretched in a way that can only
enhance their level of training and the abilities of the group.

If you are in the process of changing or adding operating systems, you
can take advantage of the inherent ability of the SAS System to perform
and appear in a virtually identical manner, no matter what the
underlying operating system. At our current site, we are in the process
of moving most of our work from an IBM® mainframe using CMS® to a
network of Hewlett-Packard workstations using HP-UX (a proprietary
dialect of UNIX). We also have people working on personal computers
using MS-DOS, WINDOWS, and OS/2®, using VMS on a VAX, and even an
occasional use for MVS® on the mainframe.

If you haven't had the pleasure of working with the script languages
available in each of these environments, we can assure you that each
environment has a marvelous scripting language that allows one to do
almost anything that can be thought of, can do anything the SAS System
can, and in some instances, do it in a more efficient fashion. These
languages also have almost nothing in common with each other, and each
possesses its own little eccentricities, rendering any knowledge gained
in one worse than useless in another.

So, if you are in a circumstance where you are either changing or adding
platforms (and if you currently aren't, you should expect to have to do
so sometime in the future), you should consider using the SAS System to
perform the meat of the work, while using the scripting language to
perform the bare minimum of action that the operating system requires,
normally just file access. For example, we tend to use the operating
system-specific programs (EXEC, script, etc.) to identify the file used
for input and invoke the SAS System in batch mode, passing the filename
as a parameter. This allows us to concentrate on working with SAS, not
learning the intricacies of a scripting language.

THE FINDERRS UTILITY PROGRAM
Let's take a look at the complete code for FINDERRS. It uses the log
file from a SAS program as input. It scans each line of the log file,
looking for errors or warnings that will require programmer intervention
to correct. If it finds an error condition, it writes to the output file
the page and line number of the log file where the error is located, and
copies the first line of the error message. If no error is located, the
utility will write a line to that effect.

If you write programs that generate a two- or three-page log file, this
may seem like ridiculous overkill, the result of a time when programmers
had far too much time on their hands. On the contrary, even if you don't
have thousand-page log files to peruse, there are still advantages to
using an error-locating utility.

If a programmer is working in a group where she/he can't print out a
paper copy of the log all the time, and her/his terminal screen is
limited to 22 lines of display, it can still take some time to scan two
or three pages, especially if the mainframe is busy and it takes 45
seconds to display the next 22 lines. Instead, the FINDERRS program will
group all the error messages together, eliminating the need to scan code
that is working correctly.

If the programmer normally prints the log files to scan for errors, you
can reduce your consumption of paper by printing out only those log
files that contain errors, even editing the file to a smaller size
before printing.

If she/he uses text search methods for ERROR or WARNING in an editor,
she/he will have to search multiple times for many individual messages
to make sure to cover every possible condition that the group treats as
an error. The FINDERRS program does just that, without anyone having to
remember every unique string.

If a good programmer is certain that a program is working correctly,
she/he will still visually check the log, just in case something
unexpected cropped up. The FINDERRS program will verify that a program
works correctly in less time than any person could ever visually scan a
log file, and will not miss an error. If you work in an environment that
displays file line size, the output from a log file with no errors will
always be smaller than the output from one that has errors. Otherwise, a
visual check of the output file contents will verify that the file run
was errorless.

The code for FINDERRS follows.

    /* FINDERRS.SAS */
    OPTIONS NOCENTER;

(The sysstuff macro uses the &SYSSCP system macro variable to determine
the operating system in use, and sets the appropriate carriage control
character and FILENAME statements for the operating system. This macro
locates all system-specific code in one place, and will abort if it is
used on an unknown system.)

    %macro sysstuff;
    %GLOBAL CC;
    %IF &SYSSCP=CMS %THEN %DO;
      %let CC='1';
      FILENAME INLOG DISK "&SYSPARM A";
      FILENAME OUTLOG DISK "FINDERRS LISTING A"
               LRECL=133 BLKSIZE=266 RECFM=FB;
    %END;
    %ELSE %IF &SYSSCP=HP 800 %THEN %DO;
      %let CC='0C'X;
      FILENAME INLOG "finderrs.dat";
      FILENAME OUTLOG "finderrs.lst";
    %END;
    %ELSE %DO;
      DATA _NULL_;
        PUT // "&SYSSCP IS AN UNKNOWN OS" /;
        ABORT RETURN;
      RUN;
    %END;
    %mend sysstuff;
    %sysstuff;

(The DATA _NULL_ step does all the work.)

    DATA _NULL_;
      INFILE INLOG END=EOF PAD;

      /* need three fields from each
         record, CC=carriage control,
         PAGE=page number, CARD=record */
      INPUT @1 CC $1. @1 PAGE $CHAR5.
            @1 CARD $CHAR133.;
      FILE OUTLOG;

      /* remove carriage control char */
      IF CC=&CC
        THEN PAGE=SUBSTR(PAGE,2);

      /* convert to upper case, just to
         simplify the comparison effort. */
      A=UPCASE(CARD);

      /* sample error conditions */
      B='ERROR: ';
      C='WARNING';
      D='W.D FORMAT';
      E=' ENDSAS';

      /* page number conditions */
      R='THE SAS SYSTEM';
      S='NOTE: THE SAS SYSTEM';

      /* get page number */
      RETAIN FLAG RETPAGE SAMEPAGE 0;
      IF CC=&CC &
         (INDEX(A,R) GT 0) &
         (INDEX(A,S) EQ 0)
        THEN RETPAGE=INPUT(COMPRESS(PAGE),BEST.);

      /* indexes check for condition */
      IF (INDEX(A,B) GT 0) OR
         (INDEX(A,C) GT 0) OR
         (INDEX(A,D) GT 0) OR
         (INDEX(A,E) GT 0) THEN DO;

        /* change page number header */
        IF RETPAGE NE SAMEPAGE
          THEN PUT @2 ">>> PROBLEMS ON "
                      "PAGE " RETPAGE
                      "OF &SYSPARM <<<" /;

        /* line no. will always be _N_ */
        LINENUM=PUT(_N_,BEST.);
        PUT @8 '>> ON LINE '
               LINENUM '<<' /
            @1 CARD $CHAR133. /;

        /* set flag for error found */
        FLAG=1;
        SAMEPAGE=RETPAGE;
      END;

      /* if no error */
      IF EOF & NOT FLAG
        THEN PUT @2 ">>> CONGRATULA"
                    "TIONS!! NO PRO"
                    "BLEMS WERE FOUND "
                    "IN &SYSPARM <<<" /;
      RETURN;

A sample program was written to show most of the error conditions.

    DATA A B;
      INPUT X1 X2;
      IF X1 LT 0 THEN OUTPUT B;
      ELSE OUTPUT A;
    CARDS;
    1 23
    2 37
    ;
    RUN;

    /* Warns of no data in data set */
    PROC SORT DATA=B;
      BY X1;
    RUN;

    DATA _NULL_;
      SET A;
      FILE PRINT NOTITLES;
      IF _N_=1
        THEN PUT @10 'FIRST DRAFT' /;
      /* bad format */
      PUT @10 X1 2. @20 X2 1.;
      RETURN;

    /* stopped before final step */
    ENDSAS;

    /* the good step */
    DATA _NULL_; SET A;
      FILE PRINT NOTITLES;
      IF _N_=1
        THEN PUT @10 'FINAL REPORT' //;
      PUT @10 X1 1. @20 X2 2.;
      RETURN;

On CMS, you get the following output, shifted to fit in the column.

    >>> PROBLEMS ON PAGE 1 OF SAMPLE SASLOG <<<

           >> ON LINE 30 <<
    WARNING: Input data set is empty.

           >> ON LINE 43 <<
    22   ENDSAS;

           >> ON LINE 46 <<
    NOTE: At least one W.D format was too small for the
    number to be printed. The decimal may be shifted by
    the "BEST" format.

On HP-UX, you get the following output, again shifted to fit in the
column. You will note the only differences are in the name of the LOG
file, and the actual line numbers.

    >>> PROBLEMS ON PAGE 1 OF sample.log <<<

           >> ON LINE 39 <<
    WARNING: Input data set is empty.

           >> ON LINE 54 <<
    22   ENDSAS;

           >> ON LINE 57 <<
    NOTE: At least one W.D format was too small for the
    number to be printed. The decimal may be shifted by
    the "BEST" format.

One of the above error lines (the ENDSAS line) has a line number of its
own, 22, which is the line number from the program, not the actual line
number in the log file. If you were to go directly to line 22 in the log
file, you would be surprised to not find an error message there, when
your utility apparently did. If you use the physical line number for
location, you should use the number supplied by the utility program.
Alternatively, you could text search for 22, and probably locate the
line in a few hits, assuming there were not a lot of numbers in your
program code.

Finally, we wish to present first the CMS EXEC, and then the Korn shell
script used to run this utility program.

    /* FINDERRS EXEC */
    PARSE UPPER ARG FILENAME
    IF FILENAME='' THEN DO
      SAY
      SAY 'USAGE EXAMPLE: SASLOG FILENAME'
      SAY
      EXIT
    END
    "SAS FINDERRS (SYSPARM='"FILENAME" SASLOG'"
    'ERASE FINDERRS SASLOG *'

It's just a simple little exec. All it does is make sure there is a
program name for input, call the batch EXEC, and clean up after itself.
A novice unfamiliar with REXX can still follow what this EXEC is doing.

The shell script is slightly more obscure, but is still a simple script
to follow.

    #!/bin/ksh
    rm -f finderrs.lst
    if [ "$1" != "" ]
    then
      clear
      echo 'Thank you! Just a moment..'
      # need to insert a formfeed
      echo '\f\c' > finderrs.dat
      cat "$1" >> finderrs.dat
      sas -sysparm "$1" finderrs.sas
      rm finderrs.log
      rm finderrs.dat
    else
      clear
      echo 'You must specify a file...'
    fi

The only unusual action performed by the script is the placement of the
formfeed character as the first character in the input file. As per UNIX
convention, the HP-UX listing file does not start with a formfeed, which
the FINDERRS program uses to identify those lines that contain page
numbers. This difference is system-specific, and concatenating the
formfeed in front of the file seems to be the easiest solution.

Note that both of these EXEC/script files do exactly the same thing, in
two very different ways. You can imagine what it's like to write a
program to perform such a complicated process in two such dissimilar
languages, much less have to support two libraries of many utility
programs across both platforms. Fortunately, we don't have to: we let
the operating system do file management, and let the SAS System do the
really complicated stuff.

CONCLUSION
Using the technique of considering each line of a file as a single
record has other applications as well. Our current work site has an
extensive library of programs that perform a variety of burden-reducing
tasks, such as table of contents printing, all based on this technique.
We feel confident in your ability to identify many currently mundane
tasks at your workplace that can be replaced by an extension of this
method.

We hope that we have made a case for why you should support the use and
development of standard utility programs, and why the SAS System has
many useful features for developing these programs. The SAS System isn't
just for the needs of end-users.

ACKNOWLEDGEMENTS
The authors wish to thank some of the other people that have developed
and maintained the code presented in this paper, including Robert W.
(Tad) Braden, David G. Hall, James A. Pecho, and Michael C. Rebecca. We
would also like to thank Brian L. Bournival and Helen Bruce Winsor for
taking the time to review and recommend changes to this paper.

SAS is a registered trademark or trademark of SAS Institute Inc. in the
USA and other countries. CMS, IBM, MVS, and OS/2 are registered
trademarks or trademarks of International Business Machines Corporation
in the USA and other countries. ® indicates USA registration.

Other brand and product names are registered trademarks or trademarks of
their respective companies.