Interactive Systems

DYNAMIC DATA SELECTION LISTS IN SAS/AF PROGRAM SCREENS,
or "PICK TWO FROM COLUMN A AND ONE FROM COLUMN B"

John M. Rinehart

Abstract

SAS applications often involve reporting or graphing a subset of data from a large database. Generally, the user is asked to provide data selection or screening criteria used to identify the desired subset. In addition to fixed choices ("this year / last year", "Division A / Division B", "sales / profits", etc.), these criteria may be open-ended and based on actual data values contained in one or more columns of the database ("report for these SKU's ..."). The easiest and most reliable way to obtain such values is to have the user select them from a list of available values.

This paper presents one method for implementing data selection lists in a SAS/AF program screen with SCL, by querying the source dataset for available values. The result is similar to the "pull-down list" or "combo box" of popular GUI databases. In addition, dynamic linking of lists for multiple database columns is demonstrated, such that the available values from one column reflect the screening of any criteria already entered for other columns.

The Problem: Data Selection

The ultimate point of most information systems is to return stored information to the user in a useful form (report, graph, etc.) when requested. Usually, however, only a subset of the available information is needed (or wanted) to fulfill any given request. So there must be some way to apply selection or screening criteria to one or more columns of the database to filter out the information of interest.

Sometimes, useful subsets of information can be anticipated in the system design, "hard-coded" into the application, and presented to the user in a multiple choice menu. More often, data selection must be left more open-ended to accommodate changing user needs and to reflect the availability of data. For example, a user may need to summarize inventory and sales data for a variable range or list of SKU's, or to plot water quality results for particular chemical parameters at certain specified sampling locations over some range of time.

A simple, common (and unfriendly) approach would be to give the user a "fill in the blank" screen to enter range endpoints or a list of discrete values as data screening criteria. This has many disadvantages for the user, including remembering (or even knowing in the first place) what data values to enter, and spelling them all correctly. Then if a blank report is returned, there is uncertainty: was a mistake made in entering the values, is there no data matching these criteria, or did the programmer screw up again?

A better solution would present the user with a list of unique values available in the database, and allow simple selection of one or more for use as discrete values or range endpoints. Furthermore, if screening criteria may be specified for more than one database column, the value list presented for any one column should reflect the screening of any criteria already given for all other columns. In this way, no values need to be remembered or typed (especially important with long lists or esoteric values, such as SKU's), and data coding for these purposes is unnecessary. The user sees exactly what data is available and knows what to expect in the output.

A similar problem was discussed in the Input/Output department of the second quarter 1995 issue of Observations, the SAS Institute technical journal. The solution given there, using structured SCL lists, is somewhat limited. Only a single value may be selected for each database column, and the columns must be addressed in a fixed, hierarchical order. The technique discussed in this paper is more flexible on these points, at the expense of repeatedly accessing the source dataset. Multiple discrete values may be specified for any database column, and in any order.

The sample demonstration given here uses the following trivial source dataset with only three columns, all used for data selection. Hopefully, extension to a real application will be obvious, where larger source datasets will have many more columns used for reporting after a subset of rows is selected.

    DATASET WORK.SAMPLE

    OBS   SHAPE      COLOR    SIZE
      1   TRIANGLE   RED      LARGE
      2   TRIANGLE   RED      SMALL
      3   TRIANGLE   YELLOW   LARGE
      4   TRIANGLE   YELLOW   SMALL
      5   TRIANGLE   BLUE     LARGE
      6   SQUARE     RED      LARGE
      7   SQUARE     YELLOW   LARGE
      8   SQUARE     BLUE     LARGE
      9   CIRCLE     RED      LARGE
     10   CIRCLE     RED      SMALL
     11   CIRCLE     BLUE     LARGE
     12   CIRCLE     BLUE     SMALL

The Basic Approach

The selection list method described here is based on a SAS/AF PROGRAM entry and SCL (screen control language), suitable for SAS versions 6.07 and later, on MVS/TSO and other operating platforms. The method could likely also be implemented in more recent versions using the SAS/AF FRAME entry.

To display a list of values from a database column, the SCL function DATALISTC() is used on a temporary dataset containing the unique values. This dataset is generated from the source dataset by SQL code in a SUBMIT block, each time the list is displayed. The flexibility of the SQL query used here also permits other information to be added if needed, such as a description field to be displayed alongside coded values in the selection list.

The DATALISTC function will display the list of unique values (and additional descriptors) in a window, and permit the user to select any number of them, up to a fixed maximum. That maximum is determined in the system design, and set high enough to satisfy most reasonable user needs. The GETNITEMC function is then used to copy the selected values into window field variables for review and subsequent use in subsetting the source dataset.

Consider the following window, which allows the user to enter data screening criteria on three columns of the sample source dataset: SHAPE, COLOR, and SIZE. The design allows up to three discrete values each for SHAPE and COLOR, and only one value for SIZE. SCL window variable names are shown in this BUILD display:

    BUILD: DISPLAY SELECT.PROGRAM
    Command ===>

                 Data Selection Screen

    Shape: &S1______  &S2______  &S3______
    Color: &C1______  &C2______  &C3______
    Size:  &SIZE____

    Rows selected: &ROWSEL

The user is instructed to leave blank all fields for a given database column to select all available data, or to enter screening criteria values as required. To display a selection list of available values from which to choose, the user will press the <ENTER> key with the cursor positioned on a blank field. The selected values are then copied into the window fields, as shown in this partial SCL source code listing:

    INIT:
       CONTROL ALWAYS;
       SOURCEDS = 'WORK.SAMPLE';
    RETURN;

    MAIN:
       FIELD=CURFLD();
       IF (FIELD='S1' AND S1=_BLANK_) OR
          (FIELD='S2' AND S2=_BLANK_) OR
          (FIELD='S3' AND S3=_BLANK_) THEN DO;
          SUBMIT CONTINUE SQL;
             CREATE TABLE U_SHAPE AS
                SELECT UNIQUE SHAPE
                FROM &SOURCEDS;
          ENDSUBMIT;
          DSID=OPEN('U_SHAPE');
          IF DSID THEN DO;
             LID=MAKELIST();
             RC=CURLIST(LID);
             CVAR=DATALISTC(DSID,'SHAPE',
                  'Select up to 3 shapes','N',3);
             IF GETNITEMN(LID,'COUNT') GT 0 THEN DO;
                S1=GETNITEMC(LID,'SHAPE',1,1,' ');
                S2=GETNITEMC(LID,'SHAPE',2,1,' ');
                S3=GETNITEMC(LID,'SHAPE',3,1,' ');
             END;
             RC=DELLIST(LID);
             RC=CLOSE(DSID);
             RC=DELETE('U_SHAPE');
          END;
       END;

Of course, other methods may be used to trigger the selection list display, such as separate pushbutton fields, or function keys. Since the source dataset is queried each time a selection list is requested, the system response time can be impacted, especially for large datasets.
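The unique value query behind each selection list can be tried outside SAS. What follows is a minimal sketch, not the author's SCL: it assumes Python with the standard sqlite3 module, a hypothetical in-memory table named sample that holds the twelve rows of WORK.SAMPLE, and a hypothetical helper named unique_values. SELECT DISTINCT stands in for the SELECT UNIQUE used in the SUBMIT block.

```python
import sqlite3

# The twelve rows of the paper's sample dataset (SHAPE, COLOR, SIZE).
ROWS = [
    ("TRIANGLE", "RED", "LARGE"), ("TRIANGLE", "RED", "SMALL"),
    ("TRIANGLE", "YELLOW", "LARGE"), ("TRIANGLE", "YELLOW", "SMALL"),
    ("TRIANGLE", "BLUE", "LARGE"), ("SQUARE", "RED", "LARGE"),
    ("SQUARE", "YELLOW", "LARGE"), ("SQUARE", "BLUE", "LARGE"),
    ("CIRCLE", "RED", "LARGE"), ("CIRCLE", "RED", "SMALL"),
    ("CIRCLE", "BLUE", "LARGE"), ("CIRCLE", "BLUE", "SMALL"),
]
COLUMNS = ("shape", "color", "size")  # hypothetical column names


def unique_values(conn, column):
    """Return the sorted unique values of one screening column."""
    # Column names cannot be bound as SQL parameters, so only names
    # from a fixed list are allowed into the query text.
    if column not in COLUMNS:
        raise ValueError(f"unknown column: {column}")
    sql = f"SELECT DISTINCT {column} FROM sample ORDER BY {column}"
    return [row[0] for row in conn.execute(sql)]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sample (shape TEXT, color TEXT, size TEXT)")
conn.executemany("INSERT INTO sample VALUES (?, ?, ?)", ROWS)

shape_list = unique_values(conn, "shape")
print(shape_list)  # ['CIRCLE', 'SQUARE', 'TRIANGLE']
```

As in the SCL version, the list is rebuilt from the source table on every request, so it always reflects the data actually present.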
The displayed selection list might look something like this:

    [Screen: the Data Selection Screen with all Shape, Color, and Size fields blank and "Rows selected: 12"; an overlaid "Select Data" window reads "Select up to 3 shapes:" and lists CIRCLE, SQUARE, TRIANGLE.]

Once the user accepts the selected values with an END key or command, they are copied back into the window field variables, via the current environment list. If the user CANCELs or ENDs the selection list without selecting any values, the window fields are left unchanged.

    [Screen: the Data Selection Screen with Shape: SQUARE, TRIANGLE; Color and Size blank; "Rows selected: 8".]

Similar sections of conditional SCL code in the MAIN section control the generation and display of selection lists for the other database columns COLOR and SIZE. Since there is only one window field for a SIZE criterion, the DATALISTC function allows only one selection, and can be set up to close immediately, without the need for an END command. The maximum number of discrete selection values for each column must be fixed when the window is designed.

Because each selection list requires a fresh query, performance might be improved by creating an index on the database columns used for screening. When there are many records for each set of values in the screening columns, response should be snappier if the value lists are generated from a summary of the source dataset, containing just the unique combinations of values, rather than the source itself:

    INIT:
       CONTROL ALWAYS;
       SOURCEDS = 'WORK.SAMPLE';
       SUMMRYDS = 'WORK.SUMMRY';
       **N is the number of source rows for each unique combination**;
       SUBMIT CONTINUE SQL;
          CREATE TABLE &SUMMRYDS AS
             SELECT UNIQUE SHAPE, COLOR, SIZE,
                    COUNT(SHAPE) AS N
             FROM &SOURCEDS
             GROUP BY SHAPE, COLOR, SIZE;
       ENDSUBMIT;
    RETURN;

Logically Linking Multiple Columns

The selection list datasets are repeatedly queried in order to allow a dynamic logical linking of the screening columns. In this way, each selection list includes only the values which appear in the source dataset as already screened on the other columns. Column relationships might be hierarchical (e.g. STATE > COUNTY > CENSUSTRACT), allowing the user systematically to reduce the size of a potentially long list, but they need not be, as in the present example.

The linking is achieved by using the window variable fields associated with each database column to generate a character string which can be used as a WHERE expression on that column. The unique value query which generates the selection list values for a particular column is then screened by the current WHERE expressions for all other columns.

The following partial SCL source listing shows the WHERE expressions defined as SCL nonwindow variables in a separate labeled section, so they can be recalculated as needed with a LINK statement:

    SLCTVARS:
       ** Define character strings based on window variables, to be
          used as WHERE expressions when generating unique value
          selection lists.**;
       IF S1!!S2!!S3 = _BLANK_ THEN
          SELSHAPE = '(SHAPE NE '' '')';
       ELSE
          SELSHAPE = '(SHAPE IN ('''!!S1!!''','''!!S2!!''','''!!S3!!
                     ''') AND SHAPE NE '' '')';
    RETURN;

    MAIN:
       LINK SLCTVARS;
       FIELD=CURFLD();
       **Show selection list for SHAPE **;
       IF (FIELD='S1' AND S1=_BLANK_) OR
          (FIELD='S2' AND S2=_BLANK_) OR
          (FIELD='S3' AND S3=_BLANK_) THEN DO;
          SUBMIT CONTINUE SQL;
             CREATE TABLE U_SHAPE AS
                SELECT UNIQUE SHAPE
                FROM &SOURCEDS
                WHERE &SELCOLOR AND &SELSIZE;
          ENDSUBMIT;

In use, column linking would look something like this. Referring to the sample dataset listing, suppose the user has requested triangles and squares, and has also selected small sizes. The color selection list will be:

    [Screen: the Data Selection Screen with Shape: SQUARE, TRIANGLE and Size: SMALL; the "Select Data" window reads "Select up to 3 colors:" and lists RED, YELLOW. "Rows selected: 2".]

because there are no blue, small triangles or squares in the database.
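The linked query in this example can be reproduced outside SAS. The sketch below is an illustration under stated assumptions, not the paper's SCL: it uses Python's standard sqlite3 module, a hypothetical table named sample with the twelve rows of WORK.SAMPLE, and hypothetical helpers where_clause, linked_list, and rows_selected. It also swaps in bound parameters for the quoted WHERE strings that the SCL builds, and it omits the paper's default test that excludes blank values.

```python
import sqlite3

# The twelve rows of the paper's sample dataset (SHAPE, COLOR, SIZE).
ROWS = [
    ("TRIANGLE", "RED", "LARGE"), ("TRIANGLE", "RED", "SMALL"),
    ("TRIANGLE", "YELLOW", "LARGE"), ("TRIANGLE", "YELLOW", "SMALL"),
    ("TRIANGLE", "BLUE", "LARGE"), ("SQUARE", "RED", "LARGE"),
    ("SQUARE", "YELLOW", "LARGE"), ("SQUARE", "BLUE", "LARGE"),
    ("CIRCLE", "RED", "LARGE"), ("CIRCLE", "RED", "SMALL"),
    ("CIRCLE", "BLUE", "LARGE"), ("CIRCLE", "BLUE", "SMALL"),
]
COLUMNS = ("shape", "color", "size")  # hypothetical column names


def where_clause(criteria, skip=None):
    """Build WHERE text and parameters from the entered criteria.

    criteria maps a column name to the values typed into its fields;
    blank entries are ignored. The column named by skip is left out,
    so a list for one column is screened only by the other columns.
    """
    parts, params = [], []
    for col in COLUMNS:
        values = [v for v in criteria.get(col, []) if v.strip()]
        if col == skip or not values:
            continue
        marks = ", ".join("?" for _ in values)
        parts.append(f"{col} IN ({marks})")
        params.extend(values)
    text = " WHERE " + " AND ".join(parts) if parts else ""
    return text, params


def linked_list(conn, column, criteria):
    """Unique values of column, screened by criteria on other columns."""
    if column not in COLUMNS:
        raise ValueError(f"unknown column: {column}")
    where, params = where_clause(criteria, skip=column)
    sql = f"SELECT DISTINCT {column} FROM sample{where} ORDER BY {column}"
    return [row[0] for row in conn.execute(sql, params)]


def rows_selected(conn, criteria):
    """Number of rows matching the criteria on all columns."""
    where, params = where_clause(criteria)
    sql = f"SELECT COUNT(*) FROM sample{where}"
    return conn.execute(sql, params).fetchone()[0]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sample (shape TEXT, color TEXT, size TEXT)")
conn.executemany("INSERT INTO sample VALUES (?, ?, ?)", ROWS)

criteria = {"shape": ["SQUARE", "TRIANGLE"], "size": ["SMALL"]}
color_list = linked_list(conn, "color", criteria)
print(color_list)                      # ['RED', 'YELLOW']
print(rows_selected(conn, criteria))   # 2
```

Leaving the requested column out of its own WHERE clause is what lets the user revisit a column and still see every value compatible with the other criteria.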
For the same reason, selecting the color blue would limit the SIZE value selection list:

    [Screen: the Data Selection Screen with Shape: SQUARE, TRIANGLE; the "Select Data" window reads "Select 1 size:" and lists LARGE. "Rows selected: 2".]

Likewise, first selecting just small objects would eliminate squares from the SHAPE selection list, because the database contains no small squares:

    [Screen: the Data Selection Screen with Size: SMALL; the "Select Data" window reads "Select up to 3 shapes:" and lists CIRCLE, TRIANGLE. "Rows selected: 4".]

On the other hand, selecting colors red and yellow will allow all the shapes in the shape selection list, even though there are no yellow circles:

    [Screen: the Data Selection Screen with Color: RED, YELLOW; the "Select Data" window reads "Select up to 3 shapes:" and lists CIRCLE, SQUARE, TRIANGLE. "Rows selected: 8".]

With SQUARE and TRIANGLE entered in the shape fields, as in several of these examples, SELSHAPE would be defined as the character string

    (SHAPE IN ('SQUARE ','TRIANGLE ',' ') AND SHAPE NE ' ')

which would be used in a WHERE clause when generating lists of color and size values, eliminating records with shape='CIRCLE'. Similar code is used to define variables SELCOLOR and SELSIZE for the other screening columns (see the complete example listing at the end of this paper). Expressions for numeric-valued columns would have to be defined appropriately.

Because of the way these strings are used in WHERE clauses, they must be valid WHERE expressions even when no screening values are given (S1!!S2!!S3 = _BLANK_). The default expression used here excludes missing values (' '), as they are also excluded when just some of the screening value fields are left empty. Thus, it is assumed that the database columns used for screening contain no missing values. The technique could be modified if missing values must be included.

Counting The Results

Careful readers may have noticed the changing number by "Rows selected" on these examples. It indicates the number of rows contained in a subset of the source dataset defined by the current screening values. This feature is another aid to the user in evaluating the entered screening criteria and determining if the subset will be a reasonable size. The protected field is calculated at the end of the SCL MAIN section by submitting another SQL query on the source dataset (or a unique value summary) using the WHERE expressions for all of the screening columns. The selected record count is passed back to the SCL window variable through a macro variable, &ROWSEL:

    MAIN:
       **Determine number of rows selected**;
       LINK SLCTVARS;
       SUBMIT CONTINUE SQL;
          RESET NOPRINT;
          SELECT COUNT(SHAPE) INTO :ROWSEL
             FROM &SOURCEDS
             WHERE &SELSHAPE AND &SELCOLOR AND &SELSIZE;
          RESET PRINT;
       ENDSUBMIT;
       ROWSEL = SYMGETN('ROWSEL');
    RETURN;

Making the Final Subset

When the user is finally satisfied with the given screening criteria and is ready to proceed, it is time to make use of the values in the window variables to actually extract the subset from the source dataset. The simplest method is to once again submit an SQL query in the TERM section, and create a copy of the source dataset using the defined WHERE expressions for all of the screening columns:

    TERM:
       IF _STATUS_ NE 'C' THEN
       SUBMIT CONTINUE SQL;
          CREATE TABLE &SUBSETDS AS
             SELECT * FROM &SOURCEDS
             WHERE &SELSHAPE AND &SELCOLOR AND &SELSIZE;
       ENDSUBMIT;
    RETURN;

Alternately, the values of the window variables can be SYMPUT to macro variables in the TERM section, to be used elsewhere in the application for subsetting the source dataset. Either way, the selected subset is then used to provide the user with the requested information in a suitable form.

Complete SCL source code for the sample selection list application.

    INIT:
       CONTROL ALWAYS;
       SOURCEDS = 'WORK.SAMPLE';
       SUBSETDS = 'WORK.SUBSET';
    RETURN;

    SLCTVARS:
       ** Define character strings based on window variables, to be
          used as WHERE expressions when generating unique value
          selection lists.**;
       IF S1!!S2!!S3 = _BLANK_ THEN
          SELSHAPE = '(SHAPE NE '' '')';
       ELSE
          SELSHAPE = '(SHAPE IN ('''!!S1!!''','''!!S2!!''','''!!S3!!
                     ''') AND SHAPE NE '' '')';
       IF C1!!C2!!C3 = _BLANK_ THEN
          SELCOLOR = '(COLOR NE '' '')';
       ELSE
          SELCOLOR = '(COLOR IN ('''!!C1!!''','''!!C2!!''','''!!C3!!
                     ''') AND COLOR NE '' '')';
       IF SIZE = _BLANK_ THEN
          SELSIZE = '(SIZE NE '' '')';
       ELSE
          SELSIZE = '(SIZE='''!!SIZE!!''')';
    RETURN;

    MAIN:
       LINK SLCTVARS;
       FIELD=CURFLD();

       **Show selection list for SHAPE **;
       IF (FIELD='S1' AND S1=_BLANK_) OR
          (FIELD='S2' AND S2=_BLANK_) OR
          (FIELD='S3' AND S3=_BLANK_) THEN DO;
          SUBMIT CONTINUE SQL;
             CREATE TABLE U_SHAPE AS
                SELECT UNIQUE SHAPE
                FROM &SOURCEDS
                WHERE &SELCOLOR AND &SELSIZE;
          ENDSUBMIT;
          DSID=OPEN('U_SHAPE');
          IF DSID THEN DO;
             LID=MAKELIST();
             RC=CURLIST(LID);
             CVAR=DATALISTC(DSID,'SHAPE',
                  'Select up to 3 shapes','N',3);
             IF GETNITEMN(LID,'COUNT') GT 0 THEN DO;
                S1=GETNITEMC(LID,'SHAPE',1,1,' ');
                S2=GETNITEMC(LID,'SHAPE',2,1,' ');
                S3=GETNITEMC(LID,'SHAPE',3,1,' ');
             END;
             RC=DELLIST(LID);
             RC=CLOSE(DSID);
             RC=DELETE('U_SHAPE');
          END;
       END;

       **Show selection list for COLOR **;
       IF (FIELD='C1' AND C1=_BLANK_) OR
          (FIELD='C2' AND C2=_BLANK_) OR
          (FIELD='C3' AND C3=_BLANK_) THEN DO;
          SUBMIT CONTINUE SQL;
             CREATE TABLE U_COLOR AS
                SELECT UNIQUE COLOR
                FROM &SOURCEDS
                WHERE &SELSHAPE AND &SELSIZE;
          ENDSUBMIT;
          DSID=OPEN('U_COLOR');
          IF DSID THEN DO;
             LID=MAKELIST();
             RC=CURLIST(LID);
             CVAR=DATALISTC(DSID,'COLOR',
                  'Select up to 3 colors','N',3);
             IF GETNITEMN(LID,'COUNT') GT 0 THEN DO;
                C1=GETNITEMC(LID,'COLOR',1,1,' ');
                C2=GETNITEMC(LID,'COLOR',2,1,' ');
                C3=GETNITEMC(LID,'COLOR',3,1,' ');
             END;
             RC=DELLIST(LID);
             RC=CLOSE(DSID);
             RC=DELETE('U_COLOR');
          END;
       END;

       **Show selection list for SIZE **;
       IF (FIELD='SIZE' AND SIZE=_BLANK_) THEN DO;
          SUBMIT CONTINUE SQL;
             CREATE TABLE U_SIZE AS
                SELECT UNIQUE SIZE
                FROM &SOURCEDS
                WHERE &SELSHAPE AND &SELCOLOR;
          ENDSUBMIT;
          DSID=OPEN('U_SIZE');
          IF DSID THEN DO;
             LID=MAKELIST();
             RC=CURLIST(LID);
             CVAR=DATALISTC(DSID,'SIZE',
                  'Select 1 Size','Y',1);
             IF GETNITEMN(LID,'COUNT') GT 0 THEN
                SIZE=GETNITEMC(LID,'SIZE',1);
             RC=DELLIST(LID);
             RC=CLOSE(DSID);
             RC=DELETE('U_SIZE');
          END;
       END;

       **Determine number of rows selected**;
       LINK SLCTVARS;
       SUBMIT CONTINUE SQL;
          RESET NOPRINT;
          SELECT COUNT(SHAPE) INTO :ROWSEL
             FROM &SOURCEDS
             WHERE &SELSHAPE AND &SELCOLOR AND &SELSIZE;
          RESET PRINT;
       ENDSUBMIT;
       ROWSEL = SYMGETN('ROWSEL');
    RETURN;

    TERM:
       IF _STATUS_ NE 'C' THEN
       SUBMIT CONTINUE SQL;
          CREATE TABLE &SUBSETDS AS
             SELECT * FROM &SOURCEDS
             WHERE &SELSHAPE AND &SELCOLOR AND &SELSIZE;
       ENDSUBMIT;
    RETURN;

Summary: Pros and Cons

This technique is intended to improve user
interactions when specifying screening criteria to define a subset of a source dataset. It does so by allowing the user to select from lists of available values.

Strengths:

• The user sees exactly what data values are available in the source dataset - no guessing.
• Typographical errors in entering screening criteria are avoided.
• The need for data coding as a mnemonic aid to retrieval is eliminated.
• Value lookups may be provided to help identify coded data.
• Data selection criteria may include multiple discrete values for each column.
• Linking of value selection lists on multiple columns supports "drill down" through hierarchical relationships.
• The size of the requested subset can be reported.

Limitations:

• The source dataset (or at least a unique valued subset) is repeatedly queried, which may affect response time.
• Columns used for screening should not have missing values.
• The number of discrete screening values available to the user for each column must be hard-coded.

No technique is perfect. Learn from this what you can, modify as necessary, and use to advantage.

Literature

SAS Institute, Inc., "Input/Output", Observations, Second Quarter 1995, Vol. 3 No. 4, pp. 63-65.

Author Contact

John M. Rinehart
3280 Skytop Trail
Dover, PA 17315
(717) 292-1636
email: [email protected]

SAS and SAS/AF are registered trademarks of SAS Institute, Inc. in the USA and other countries. ® indicates USA registration.