WHY STANDARD UTILITY PROGRAMS ARE GOOD, AND WHY YOU SHOULD USE THE SAS® SYSTEM TO CREATE THEM

HENRY B. WINSOR, TAYLOR MANAGEMENT SYSTEMS, INC.
PK SHARMA, TAYLOR MANAGEMENT SYSTEMS, INC.

ABSTRACT

This paper discusses the advantages of developing and using standard utility programs for programmer support, points out reasons for using the SAS System as the primary development environment for these utilities, and demonstrates a technique that is useful in utility program development. The code needed to implement an error location utility on two different operating systems is used for illustrative purposes.

INTRODUCTION

In every programmer's particular bag of tricks, there should be the concept of creating a special program for her/his own use, one that performs certain actions, normally of a repetitive nature. This program is not an application in the usual sense, since the program is not designed to be used by a non-programmer. Instead, it is used by the programmer to assist her/him in some task of program development and/or maintenance.

At one time or another, every real programmer has written utility programs; she/he just may not have thought of it in those terms. In this paper, we intend to discuss not that utility programs are useful, but why the effort should be made to develop standard utilities, and why the SAS System makes an excellent choice for the development environment.

WHY STANDARD UTILITY PROGRAMS?

The advantages of standard utility programs can be summed up in one phrase: standardization and consistency. If a programming group standardizes on one tool to provide a specific function, such as FINDERRS (the error-locating utility to be presented later in this paper), they only have one program to modify as their needs evolve, one program to validate, and one program to train new employees how to use.

Think what life would be like if every programmer in a group were using a different language. The disparate languages would make it very difficult for programmers to work together and assist each other as needed. It would be impossible to transfer work from one programmer to another without either staffing the group with programmers who are fluent in every language, or having them spend much of their time learning other languages. The cost advantages of using only one language should be obvious.

In a similar context, letting (or forcing) programmers to develop their own utility programs independently of each other will tend to waste resources and discourage cooperative efforts within the group.

As an example, let's take a look at the error conditions defined in FINDERRS.

    B='ERROR: ';
    C='WARNING';
    D='W.D FORMAT';
    E=' ENDSAS';

ERROR: is the minimum string that uniquely identifies error messages generated by the SAS System, and everybody will agree that these messages are of sufficient importance to be brought to someone's attention. WARNING identifies system warning messages, which are not necessarily as crucial, and may not require fixing. W.D FORMAT identifies when the program has been forced to shift a format from the one specified in the program, which may be of no importance. ENDSAS is executable code, not any sort of message at all. These four phrases will indicate different things to different programmers, but, at our current site, they are all treated as unrecoverable errors, messages not allowed to be present in finalized programs.

By having these different checks in one program, it is easier to enforce the departmental policy. A programmer has one tool that tells him what messages need to be cleaned up, and his supervisor has the same tool to check that the work has been done properly. There's no question whether all the checks were performed, since the task of checking is as simple and straightforward as running a single program. If policy changes, the change can be implemented by modifying a single program, not many.

For the same reasons that people choose a single environment for programming, they should also provide standard utility programs. While utility programs make a programmer's job easier, standard utilities make the job of their leaders easier.

WHY USE SAS?

The major reason for using the SAS System is that programmers already have familiarity with the environment; they would be using a language in which they've probably got the most experience, one in which their company has already made an investment. The group will have capable programmers on staff, people who are already developing utility programs on their own. If they are used to develop the standard utilities, there will be no need to seek outside (expensive) assistance. Also, by having them work on something a little bit different than their normal programming work, their capabilities will be stretched in an area that will only enhance their level of training and the abilities of the group.

If you are in the process of having to change or add operating systems, you can take advantage of the inherent ability of the SAS System to perform and appear in a virtually identical manner, no matter what the underlying operating system. At our current site, we are in the process of moving most of our work from an IBM® mainframe using CMS® to a network of Hewlett-Packard workstations using HP-UX (a proprietary dialect of UNIX). We also have people working on personal computers using MS-DOS, WINDOWS, and OS/2®, using VMS on a VAX, and even an occasional use for MVS® on the mainframe.

If you haven't had the pleasure of working with the script languages available in each of these environments, we can assure you that each environment has a marvelous scripting language that allows one to do almost anything that can be thought of, can do anything the SAS System can, and in some instances, do it in a more efficient fashion. These languages also have almost nothing in common with each other, and each possesses its own little eccentricities, rendering any knowledge gained in one worse than useless in another.

So, if you are in a circumstance where you are either changing or adding platforms (and if you currently aren't, you should expect to have to do so sometime in the future), you should consider using the SAS System to perform the meat of the work, while using the scripting language to perform the bare minimum of action that the operating system requires, normally just file access. For example, we tend to use the operating system-specific programs (EXEC, script, etc.) to identify the file used for input, and invoke the SAS System in a batch mode, passing the filename as a parameter. This allows us to concentrate on working with SAS, not learning the intricacies of a scripting language.

THE FINDERRS UTILITY PROGRAM

Let's take a look at the complete code for FINDERRS. It uses the log file from a SAS program as input. It scans each line of the log file, looking for errors or warnings that will require programmer intervention to correct. If it finds an error condition, it writes to the output file the page and line number of the log file where the error is located, and copies the first line of the error message. If no error is located, the utility will write a line to that effect.

If you work in an environment that displays file size in lines, the output from a log file with no errors will always be smaller than the output from one that has errors. Otherwise, a visual check of the output file contents will verify that the run was error-free.

If you write programs that generate a two or three page log file, this may seem like ridiculous overkill, the result of a time when programmers had far too much time on their hands. On the contrary, even if you don't have thousand-page log files to peruse, there are still advantages to using an error-locating utility.

If a programmer is working in a group where she/he can't print out a paper copy of the log all the time, and her/his terminal screen is limited to 22 lines of display, it can still take some time to scan two or three pages, especially if the mainframe is busy and it takes 45 seconds to display the next 22 lines. Instead, the FINDERRS program will group all the error messages together, eliminating the need to scan code that is working correctly.

If the programmer normally prints the log files to scan for errors, you can reduce your consumption of paper by printing out only those log files that contain errors, even editing the file to a smaller size before printing.

If she/he uses text search methods for ERROR or WARNING in an editor, she/he will have to search multiple times for many individual messages to make sure to cover every possible condition that the group treats as an error. The FINDERRS program does just that, without anyone having to remember every unique string.

If a good programmer is certain that a program is working correctly, she/he will still visually check the log, just in case something unexpected cropped up. The FINDERRS program will verify that a program works correctly in less time than any person could ever visually scan a log file, and not miss an error.

The code for FINDERRS follows.

    /* FINDERRS.SAS */
    OPTIONS NOCENTER;

(The sysstuff macro uses the &SYSSCP system macro variable to determine the operating system in use, and sets the appropriate carriage control character and FILENAME statements for the operating system. This macro locates all system-specific code in one place, and will abort if it is used on an unknown system.)

    %macro sysstuff;
      %GLOBAL CC;
      %IF &SYSSCP=CMS %THEN %DO;
        /* ASA carriage control '1' marks a new page */
        %let CC='1';
        FILENAME INLOG DISK "&SYSPARM A";
        FILENAME OUTLOG DISK "FINDERRS LISTING A"
                 LRECL=133 BLKSIZE=266 RECFM=FB;
      %END;
      %ELSE %IF &SYSSCP=HP 800 %THEN %DO;
        /* formfeed character marks a new page */
        %let CC='0C'X;
        FILENAME INLOG "finderrs.dat";
        FILENAME OUTLOG "finderrs.lst";
      %END;
      %ELSE %DO;
        DATA _NULL_;
          PUT // "&SYSSCP IS AN UNKNOWN OS" /;
          ABORT RETURN;
        RUN;
      %END;
    %mend sysstuff;
    %sysstuff;

(The DATA _NULL_ step does all the work.)

    DATA _NULL_;
      INFILE INLOG END=EOF PAD;

      /* need three fields from each    */
      /* record, CC=carriage control,   */
      /* PAGE=page number, CARD=record  */
      INPUT @1 CC   $1.
            @1 PAGE $CHAR5.
            @1 CARD $CHAR133.;
      FILE OUTLOG;

      /* remove carriage control char */
      IF CC=&CC THEN PAGE=SUBSTR(PAGE,2);

      /* convert to upper case, just to */
      /* simplify the comparison effort */
      A=UPCASE(CARD);

      /* sample error conditions */
      B='ERROR: ';
      C='WARNING';
      D='W.D FORMAT';
      E=' ENDSAS';

      /* page number conditions */
      R='THE SAS SYSTEM';
      S='NOTE: THE SAS SYSTEM';

      /* get page number */
      RETAIN FLAG RETPAGE SAMEPAGE 0;
      IF CC=&CC & (INDEX(A,R) GT 0) & (INDEX(A,S) EQ 0)
        THEN RETPAGE=INPUT(COMPRESS(PAGE),BEST.);

      /* indexes check for condition */
      IF (INDEX(A,B) GT 0) |
         (INDEX(A,C) GT 0) |
         (INDEX(A,D) GT 0) |
         (INDEX(A,E) GT 0) THEN DO;

        /* change page number header */
        IF RETPAGE NE SAMEPAGE THEN
          PUT @2 ">>> PROBLEMS ON PAGE " RETPAGE "OF &SYSPARM <<<" /;

        /* line no. will always be _N_ */
        LINENUM=PUT(_N_,BEST.);
        PUT @8 '>> ON LINE ' LINENUM '<<' /
            @1 CARD $CHAR133. /;

        /* set flag for error found */
        FLAG=1;
        SAMEPAGE=RETPAGE;
      END;

      /* if no error */
      IF EOF & ^FLAG THEN
        PUT @2 ">>> CONGRATULATIONS!! NO PROBLEMS WERE FOUND IN &SYSPARM <<<" /;
      RETURN;

A sample program was written to show most of the error conditions.

    DATA A B;
      INPUT X1 X2;
      IF X1 LT 0 THEN OUTPUT B;
      ELSE OUTPUT A;
    CARDS;
    1 23
    2 37
    ;
    RUN;

    /* Warns of no data in data set */
    PROC SORT DATA=B;
      BY X1;
    RUN;

    DATA _NULL_;
      SET A;
      FILE PRINT NOTITLES;
      IF _N_=1 THEN PUT @10 'FIRST DRAFT' /;
      /* bad format */
      PUT @10 X1 2.
          @20 X2 1.;
      RETURN;

    /* stopped before final step */
    ENDSAS;

    /* the good step */
    DATA _NULL_;
      SET A;
      FILE PRINT NOTITLES;
      IF _N_=1 THEN PUT @10 'FINAL REPORT' //;
      PUT @10 X1 1.
          @20 X2 2.;
      RETURN;

On CMS, you get the following output, shifted to fit in the column.

    >>> PROBLEMS ON PAGE 1 OF SAMPLE SASLOG <<<

    >> ON LINE 30 <<
    WARNING: Input data set is empty.

    >> ON LINE 43 <<
    22   ENDSAS;

    >> ON LINE 46 <<
    NOTE: At least one W.D format was too small for
    the number to be printed. The decimal may be
    shifted by the "BEST" format.

On HP-UX, you get the following output, again shifted to fit in the column. You will note the only differences are in the name of the log file, and the actual line numbers.

    >>> PROBLEMS ON PAGE 1 OF sample.log <<<

    >> ON LINE 39 <<
    WARNING: Input data set is empty.

    >> ON LINE 54 <<
    22   ENDSAS;

    >> ON LINE 57 <<
    NOTE: At least one W.D format was too small for
    the number to be printed. The decimal may be
    shifted by the "BEST" format.

One of the above error lines has a line number, which is the line number from the program, not the actual line number in the log file. If you were to go directly to line 22 in the log file, you would be surprised to not find an error message there, when your utility apparently did. If you use the physical line number for location, you should use the number supplied by the utility program. Alternatively, you could text search for 22, and probably locate the line in a few hits, assuming there were not a lot of numbers in your program code.

Finally, we wish to present first the CMS EXEC, and then the Korn shell script used to run this utility program.

    /* FINDERRS EXEC */
    PARSE UPPER ARG FILENAME
    IF FILENAME='' THEN DO
      SAY
      SAY 'USAGE EXAMPLE: SASLOG FILENAME'
      SAY
      EXIT
    END
    "SAS FINDERRS (SYSPARM='" FILENAME" SASLOG'"
    'ERASE FINDERRS SASLOG *'

It's just a simple little EXEC. All it does is make sure there is a program name for input, call the batch EXEC, and clean up after itself. A novice unfamiliar with REXX can still follow what this EXEC is doing.

The shell script is slightly more obscure, but is still a simple script to follow.

    #!/bin/ksh
    rm -f finderrs.lst
    # quote $1 so the test also works when no argument is given
    if [ "$1" != "" ]
    then
      clear
      echo 'Thank you! Just a moment ..'
      # need to insert a formfeed
      echo '\f\c' > finderrs.dat
      cat "$1" >> finderrs.dat
      sas -sysparm "$1" finderrs.sas
      rm finderrs.log
      rm finderrs.dat
    else
      clear
      echo 'You must specify a file ...'
    fi

The only unusual action performed by the script is the placement of the formfeed character as the first character in the input file. As per UNIX convention, the HP-UX listing file does not start with a formfeed, which the FINDERRS program uses to identify those lines that contain page numbers. This difference is system-specific, and concatenating the formfeed in front of the file seems to be the easiest solution.

Note that both of these EXEC/script files do exactly the same thing, in two very different ways. You can imagine what it's like to write a program to perform such a complicated process in two so dissimilar languages, much less have to support two libraries of many utility programs across both platforms.

Fortunately, we don't have to; we let the operating system do file management, and let the SAS System do the really complicated stuff.

CONCLUSION

Using the technique of considering each line of a file as a single record has other applications as well. Our current work site has an extensive library of programs that perform a variety of burden-reducing tasks, such as table of contents printing, all based on this technique. We feel confident in your ability to identify many currently mundane tasks at your workplace that can be replaced by an extension of this method.

We hope that we have made a case for why you should support the use and development of standard utility programs, and why the SAS System has many useful features for developing these programs. The SAS System isn't just for the needs of end-users.

ACKNOWLEDGEMENTS

The authors wish to thank some of the other people who have developed and maintained the code presented in this paper, including Robert W. (Tad) Braden, David G. Hall, James A. Pecho, and Michael C. Rebecca. We also wish to thank Brian L. Bournival and Helen Bruce Winsor for taking the time to review and recommend changes to this paper.

SAS is a registered trademark or trademark of SAS Institute Inc. in the USA and other countries. CMS, IBM, MVS, and OS/2 are registered trademarks or trademarks of International Business Machines Corporation in the USA and other countries. ® indicates USA registration.

Other brand and product names are registered trademarks or trademarks of their respective companies.
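As a supplementary illustration of the line-as-record technique discussed in the Conclusion, the sketch below restates the FINDERRS scan in Python rather than SAS. It is not the authors' code: the names `PAGE_MARKERS`, `page_marker_for`, and `find_problems` are hypothetical, the log is assumed to arrive as a list of text lines, and the page header is assumed to carry its page number in columns 2 through 5, as in the FINDERRS listing. The two platform keys mirror the &SYSSCP values tested in the paper, and an unknown platform is rejected in the same spirit as the sysstuff macro.

```python
# Strings treated as problems, mirroring B, C, D and E in FINDERRS.
PROBLEM_STRINGS = ("ERROR: ", "WARNING", "W.D FORMAT", " ENDSAS")

# Page-marker character per platform (assumed mapping, after the sysstuff macro).
PAGE_MARKERS = {"CMS": "1", "HP 800": "\f"}


def page_marker_for(system):
    """Return the page marker for a platform, rejecting unknown ones."""
    if system not in PAGE_MARKERS:
        raise ValueError(f"{system} IS AN UNKNOWN OS")
    return PAGE_MARKERS[system]


def find_problems(log_lines, log_name, page_marker="\f"):
    """Scan log lines one record at a time and return the report lines."""
    report = []
    page = 0              # page number from the most recent page header
    reported_page = None  # page whose heading was last written to the report
    for line_number, raw in enumerate(log_lines, start=1):
        line = raw.rstrip("\n")
        upper = line.upper()  # upper-case once to simplify the comparisons
        # A page header starts with the marker and carries the system title,
        # but is not a NOTE line that merely mentions the title.
        if (line.startswith(page_marker)
                and "THE SAS SYSTEM" in upper
                and "NOTE: THE SAS SYSTEM" not in upper):
            digits = line[1:5].strip()
            if digits.isdigit():
                page = int(digits)
        if any(text in upper for text in PROBLEM_STRINGS):
            if page != reported_page:
                report.append(f">>> PROBLEMS ON PAGE {page} OF {log_name} <<<")
                reported_page = page
            report.append(f">> ON LINE {line_number} <<")
            report.append(line)
    if not report:
        report.append(
            f">>> CONGRATULATIONS!! NO PROBLEMS WERE FOUND IN {log_name} <<<")
    return report


# A tiny made-up log: one page header, one clean line, two flagged lines.
sample_log = [
    "\f1   The SAS System\n",
    "1    DATA A B;\n",
    "WARNING: Input data set is empty.\n",
    "22   ENDSAS;\n",
]
for row in find_problems(sample_log, "sample.log", page_marker_for("HP 800")):
    print(row)
```

Run on the made-up log above, the sketch reports the warning on physical line 3 and the ENDSAS statement on physical line 4 under a single page 1 heading. Keeping the problem strings and the platform markers in two small tables reflects the paper's argument: a policy change or a new platform touches one place only.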