PROC ARKMAP: A Low Cost Method of Presenting Geographical Data in an Educational Environment

Dennis W. King, University of Arkansas
Richard M. Smith, University of Arkansas

INTRODUCTION

The ARKMAP procedure is a means of creating county-level choropleth maps of the State of Arkansas (Figure 1). It is in current use primarily by researchers, graduate students and classroom students in the College of Agriculture for presenting geographical data. There are several reasons why ARKMAP is an attractive procedure: (1) the maps can be produced using limited equipment (a CRT and a line printer); (2) it is easy to use because of its limited command set and its parallels to standard SAS procedure coding; (3) few map design decisions are required on the part of the user. The selection of class intervals (levels) and print symbols is built in and is consistent with the most current cartographic research.

[Figure 1. Sample ARKMAP output: a line-printer choropleth map with a legend of patterns and class limits.]

ARKMAP is written in PL/I and runs in a 1 MB virtual machine under VM/CMS SAS 82.3. ARKMAP will process both numeric variables and character variables of length 8 or less.

In order to use ARKMAP, a SAS data set must be created for input. Each observation in the data set must be identified by a number which represents a map unit area. ARKMAP uses standard FIPS code numbers to identify counties. The numeric variable whose values are FIPS codes is indicated in the ID statement. The VARIABLES statement provides a list of variables in the SAS data set to be processed by ARKMAP. ARKMAP produces a separate set of summary statistics and a separate map for each variable in the VARIABLES statement. Additionally, it calculates the mean or the sum (depending on the use or exclusion of the MEANS option) for each numeric variable by county.

Current cartographic research indicates that the average map reader is unable to reliably differentiate more than 5 to 7 gray tone map symbols. This perceptual limitation is even more pronounced in the case of map patterns produced with a computer line printer. For this reason, ARKMAP limits the user to a maximum of four classes on any single map. Line printer produced map patterns have been selected to maximize gray tone contrast between symbols and are automatically assigned.

Although a user may not create more than a four class map with ARKMAP, it is possible to make a map with fewer than four classes. This is done by selecting either the Quantile or Range Method and indicating the number of classes with the GROUP parameter. The GROUP parameter is ignored for all other grouping methods, which create only four class maps.

After numeric variables are processed, ARKMAP establishes class intervals for the map according to the algorithm specified in the CLASSINT parameter. The mapmaker may either accept the recommended default method for selecting class intervals or select one of the alternative methods. The default Optimization Method (OPTIMIZE) is an adaptation of Fisher's algorithm which groups data values for maximum homogeneity.
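
The idea behind the optimization grouping described above can be illustrated with a short Python sketch. It is not ARKMAP's PL/I code, which is not reproduced in this paper. The sketch assumes that "maximum homogeneity" means the smallest total within-class sum of squared deviations over contiguous groups of the sorted values, in the spirit of Fisher (1958); the function name `optimal_classes` and the dynamic-programming formulation are illustrative assumptions.

```python
from itertools import accumulate


def optimal_classes(values, k=4):
    """Split the sorted values into k contiguous classes so that the total
    within-class sum of squared deviations is as small as possible.
    Hypothetical helper for illustration; not part of ARKMAP."""
    xs = sorted(values)
    n = len(xs)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of values")

    # Prefix sums let us compute the cost of any slice in constant time.
    s1 = [0.0] + list(accumulate(xs))
    s2 = [0.0] + list(accumulate(x * x for x in xs))

    def ssd(i, j):
        """Sum of squared deviations from the mean for xs[i:j]."""
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    inf = float("inf")
    # cost[c][j]: best cost of splitting the first j values into c classes.
    # cut[c][j]: start index of the last class in that best split.
    cost = [[inf] * (n + 1) for _ in range(k + 1)]
    cut = [[0] * (n + 1) for _ in range(k + 1)]
    cost[0][0] = 0.0
    for c in range(1, k + 1):
        for j in range(c, n + 1):
            for i in range(c - 1, j):
                candidate = cost[c - 1][i] + ssd(i, j)
                if candidate < cost[c][j]:
                    cost[c][j] = candidate
                    cut[c][j] = i

    # Walk the cut table backwards to recover the classes.
    classes = []
    j = n
    for c in range(k, 0, -1):
        i = cut[c][j]
        classes.append(xs[i:j])
        j = i
    classes.reverse()
    return classes


# Made-up county values, grouped into the four classes ARKMAP allows.
print(optimal_classes([50, 1, 11, 2, 100, 3, 10, 51, 12], k=4))
# prints [[1, 2, 3], [10, 11, 12], [50, 51], [100]]
```

The triple loop is cubic in the number of values in the worst case, which is acceptable for a county-level map with at most a few dozen distinct values per variable.
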
The result of optimization grouping is a set of choropleth map classes having minimum variation within classes and maximum variation between classes. From a practical standpoint, this means that each pattern which appears on the choropleth map will represent a group of data values which are as alike as possible. The default method need not be coded to produce class intervals and should be used in all but the most unusual circumstances.

The Natural Breaks Method of grouping (NATBRKS) arranges the data in ascending order, locates the three largest intervals between data values, and then assigns class boundaries at these locations. The Quantile Method (QUANTILE) assigns an equal number of data values to each class interval. The Equal Interval Method (RANGE) determines class intervals by sorting the data in ascending order, finding the range between the smallest and largest value, and dividing the range into equal intervals. The Standard Deviation Method (STDDEV) first calculates the mean and standard deviation of the data set to be grouped. Class boundaries are then established at a distance of one or two standard deviations on either side of the mean.

Complete versatility of class interval selection is achieved through the use of character variables. In the Data Step the user may create character variable values which correspond to any set of class interval values he or she might choose. ARKMAP then assigns a choropleth map symbol to each unique variable value. For example, one may wish to divide acreage data into three specific groups as indicated in the following code:

    DATA TWO;
      SET ONE;
      * Declare length 8 so that the longest label is not truncated;
      LENGTH ACREAGE $ 8;
      IF ACRE LT 100 THEN ACREAGE='LT 100';
      IF 100 LE ACRE LE 5220 THEN ACREAGE='100-5220';
      IF ACRE GT 5220 THEN ACREAGE='GT 5220';

One may also wish simply to assign a label of some sort to a map unit area:

    DATA TWO;
      SET ONE;
      IF FIPS LT 39 THEN REGION='NORTHERN';
      IF 39 LE FIPS LE 79 THEN REGION='SOUTHERN';
      IF FIPS GT 79 THEN REGION='WESTERN';

ARKMAP will produce up to a four class map, depending on the values of the character variables. In this case, there should be only one observation per county in the input data set. In the case of multiple observations, only the last observation for each county will be used. Also, a maximum of four classes may be defined by the character variable value; further classes will be ignored. The four class limit has been established because of map design considerations. Counties for which there are no observations in the data set are left blank.

The ID statement
The ID statement indicates the numeric variable in the data set which identifies the county to which the observation is to be attributed. The value of the ID variable must be a valid FIPS code for Arkansas. The ID statement must be present for the procedure to run successfully.

The VARIABLES statement
The VARIABLES statement identifies the variables in the data set to be processed by ARKMAP. A separate map will be created for each variable in the VARIABLES statement. If the statement is omitted, ARKMAP produces a map for each variable not listed in an ID or BY statement. The VARIABLES statement may contain a mix of character and numeric variables.

The PROC ARKMAP statement

    PROC ARKMAP options-and-parameters;

The options and parameters that can appear are as follows:

DATA=data-set-name
The DATA parameter tells ARKMAP which SAS data set is to be analyzed. If it is omitted, the last data set created will be used.

CLASSINT=OPTIMIZE | NATBRKS | QUANTILE | RANGE | STDDEV
The CLASSINT parameter tells ARKMAP which algorithm will be used to generate the class intervals for the map. OPTIMIZE requests Fisher's algorithm. NATBRKS requests that intervals be located at the largest gaps in the data array.
QUANTILE requests that the data be divided equally among classes. RANGE requests that intervals be set by dividing the range into equal parts. STDDEV requests that interval selection be based on the mean and standard deviation of the data set. OPTIMIZE is the default method. All of these methods are discussed in the introduction.

GROUP=no. of intervals
The GROUP parameter is a means of specifying the number of groups (classes) into which ARKMAP will divide the data set. GROUP is valid only for the RANGE and QUANTILE methods as specified in the CLASSINT parameter; it is ignored for all other methods. Omission of this parameter will cause ARKMAP to create a map with four class intervals.

MEANS
The MEANS option indicates that the mean of the response data will be calculated for each county. The means are then used for interval generation. The MEANS option is ignored for character data.

PROCEDURE INFORMATION STATEMENTS

The BY statement
If a BY statement is used, the data set must be sorted by the variables in the BY statement. ARKMAP will produce a new map and summary table for each value of the BY statement variables.

TREATMENT OF MISSING VALUES

If a value of the ID variable or the variable being processed is missing, that observation is ignored in the calculations.

REFERENCES

Fisher, W.D. "On Grouping for Maximum Homogeneity", Journal of the American Statistical Association. Vol. 53, pp. 789-798, 1958.

Hartigan, J.A. Clustering Algorithms. New York: John Wiley & Sons, 1975.

IBM CMS Command and Macro Reference. San Jose, California: IBM Corp., SC19-6209-1.

Jenks, G.F. Optimal Data Classification for Choropleth Maps. Lawrence, Kansas: Occasional Paper No. 2, Department of Geography, University of Kansas, 1977.

Jenks, G.F. and Caspall, F.C. "Error on Choroplethic Maps: Definition, Measurement, Reduction", Annals of the Association of American Geographers. Vol. 61, No. 2, pp. 177-184, 1976.

OS PL/I Checkout and Optimizing Compilers: Language Reference Manual. San Jose, California: IBM Corp., GC26-3977-0.

SAS Programmer's Guide. Cary, N.C.: SAS Institute, Inc., 1981.

Smith, R.M. "Improved Areal Symbols for Computer Line-printed Maps", The American Cartographer. Vol. 7, No. 1, pp. 51-57.