INTRODUCTION TO MAPPING & GIS:
EXERCISE 8 - CLASS INTERVALING TECHNIQUES
NAME: ___________________________________________________
OBJECTIVES: This exercise gives you "hands-on" experience with the basic class intervaling
techniques used most commonly in thematic mapping. When you have finished, you will know the
fundamental principles of class intervaling and you will be able to calculate the mean and standard
deviation. You will also be able to calculate break points using the following methods: [1] equal value
range, [2] mean and standard deviation, [3] nested mean, [4] quantile, and [5] geometric progression.
PRINCIPLES: Class intervaling is a classification process used to reduce a large number of
quantitative values to a smaller number of ordered categories. In thematic mapping we class the data so we
can see the forest rather than be overwhelmed by the trees. Classing is one step in creating maps that enable
the clear depiction of data that in their raw form would be nearly incomprehensible. It is one means of
carrying out the KISS principle.
There are many techniques from which you can choose in setting up data classes, but in all cases you
must observe two fundamental principles:
1. Each of the original [unclassed] data values must fall into one of the classes.
2. None of the original [unclassed] data values may fall into more than one class.
A short way of putting this is to say that the classes must be mutually exclusive and
exhaustive.
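The two principles can be checked mechanically. The sketch below is only an illustration: the function name `assign_class`, the class limits, and the data values are hypothetical, and the convention that a value equal to a shared limit belongs to the upper class is an assumption, not something this exercise prescribes.

```python
def assign_class(value, breaks):
    """Return the 1-based class number for value, or None if no class holds it.

    breaks is an ascending list [b0, b1, ..., bk] that defines k classes.
    Assumed convention: class j covers b(j-1) <= value < bj, and the last
    class also includes its upper limit bk.
    """
    k = len(breaks) - 1
    for j in range(1, k + 1):
        lower, upper = breaks[j - 1], breaks[j]
        is_last = (j == k)
        if lower <= value < upper or (is_last and value == upper):
            return j  # first match is the only match, so classes never overlap
    return None  # value lies outside every class: the classes are not exhaustive


breaks = [0, 10, 20, 30]        # hypothetical limits for three classes
values = [0, 4, 10, 19.5, 30]   # hypothetical data values
classes = [assign_class(v, breaks) for v in values]
print(classes)  # [1, 1, 2, 2, 3]
```

A result of None for any value signals that the classes are not exhaustive. Because each value is tested against half-open intervals, no value can land in two classes.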
MEAN To complete this exercise you will need to know how to calculate the mean or average of a set
of numbers. There should be no mystery involved here. Certainly you are familiar with concepts like grade
point average, mean income level, the average height of members of the Philadelphia Seventy Sixers
basketball team, and the like.
The mean is simply one of several measures that express, in a single number, the "central tendency" of a
set of data. To calculate the mean, first count how many numbers there are. Call this number "N." Next,
sum all of the values and divide this sum by N. The result of the division is the mean. In doing all of this
you might find certain conventions useful. Often we will find that it is convenient to refer to any member
of a set of numbers X, as Xi. For instance, the set of numbers X has the following members:
{ X1, X2, . . . , Xi, . . . , XN }
We refer to any one of those numbers as Xi. The other convention you need to know about is the large
or uppercase Greek letter sigma, which looks like this: Σ and means "take the sum of whatever follows."
The formula for the mean is:
$$ \bar{X} = \frac{1}{N} \sum_{i=1}^{N} X_i $$
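As a quick check on the procedure, here is a minimal sketch of the mean calculation. The function name `mean` and the eight data values are hypothetical and are not taken from the tables in this exercise.

```python
def mean(values):
    """Count the values (N), sum them, and divide the sum by N."""
    n = len(values)        # N, the number of values
    total = sum(values)    # the sum of all Xi
    return total / n


data = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical data, N = 8, sum = 40
print(mean(data))  # 5.0
```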
STANDARD DEVIATION One way in which to summarize a set of data is to report a measure of
central tendency. The mean is an example of such a measure. Common sense will tell you that there is
much more to a set of data than the mean value. Two sets of numbers can have the same mean and yet be
very different from each other in the amount of dispersion or divergence from the mean. Measures of
variability provide a statistical indication of dispersion in a set of data. The standard deviation is one such
measure. Others include the range, the interquartile range, and the variance. For now, focus on how to
calculate the standard deviation.
1. To begin, you calculate the mean [see above].
2. Next, subtract the mean from each of the original data values, Xi. This will give you a number of
differences equal to N.
3. Now, square each of these N differences.
4. After you square the differences, sum them and then divide the sum by N.
5. The result of this division is a common measure of dispersion called the variance. To get the
standard deviation simply use a calculator, computer, or square root table to take the square root of the
variance.
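The formula for the standard deviation, where \bar{X} is the mean of the N values, is:
$$ SD = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left( X_i - \bar{X} \right)^2} $$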
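The five steps above can be sketched in code as follows. The function name `standard_deviation` and the data values are hypothetical. Note that the steps divide by N, which gives the population standard deviation; some calculators and software divide by N - 1 instead and will report a slightly larger number.

```python
import math


def standard_deviation(values):
    """Population standard deviation, following the five steps in the text."""
    n = len(values)
    mean = sum(values) / n                      # step 1: the mean
    differences = [x - mean for x in values]    # step 2: N differences from the mean
    squared = [d * d for d in differences]      # step 3: square each difference
    variance = sum(squared) / n                 # step 4: sum, then divide by N
    return math.sqrt(variance)                  # step 5: square root of the variance


data = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical data with mean 5
print(standard_deviation(data))  # 2.0
```

For this data the squared differences sum to 32, the variance is 32 / 8 = 4, and the standard deviation is 2.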
PRACTICE Just to make sure that you understand and to help you get a feel for the standard deviation
let's calculate the mean and standard deviation for two simple sets of data both of which have the same
mean value, but which vary in their dispersion about the mean. The data are hypothetical incomes for
residents of two towns: one is Sameville and the other is Variton [see the table]. Calculate the mean and
standard deviation for the income of each place and briefly discuss your results in the space provided.
TABLE 1.1. INCOME DATA FOR THE RESIDENTS OF SAMEVILLE AND VARITON
[NUMBERS IN THOUSANDS OF DOLLARS]
SAMEVILLE    VARITON
7.0          2.0
7.5          3.0
8.0          5.0
8.5          7.0
9.0          9.0
11.0         11.0
11.5         13.0
12.0         17.0
12.5         20.0
Show the results of your calculations in the spaces provided in Table 1.2. Also, show your calculations
in the space provided below Table 1.2.
TABLE 1.2. INCOME DATA FOR THE RESIDENTS OF SAMEVILLE AND VARITON
[NUMBERS IN THOUSANDS OF DOLLARS]
        SAMEVILLE     VARITON
MEAN    __________    __________
SD      __________    __________
SHOW YOUR CALCULATIONS IN THE SPACE BELOW AND COMMENT
INTERVAL CALCULATION: This section requires that you use the classing techniques discussed
in the readings and in class to create class limits for some real world data. There are two data sets:
1. Data on total population for the counties of New Jersey. The column "sorted population"
will be used in calculating quantile classes. [Table 1.3]
2. Data on percentage of population male for counties in Kansas. [Table 1.4]
Each classing problem will refer you to the appropriate table.
TABLE 1.3 TOTAL POPULATION FOR NEW JERSEY COUNTIES
COUNTY                POPULATION    SORTED POPULATION
SUSSEX COUNTY         144166        884118
PASSAIC COUNTY        489049        793633
BERGEN COUNTY         884118        750162
WARREN COUNTY         102437        615301
MORRIS COUNTY         470212        608975
ESSEX COUNTY          793633        522541
HUDSON COUNTY         608975        510916
HUNTERDON COUNTY      121989        508932
SOMERSET COUNTY       297490        489049
UNION COUNTY          522541        470212
MIDDLESEX COUNTY      750162        423394
MONMOUTH COUNTY       615301        350761
MERCER COUNTY         350761        297490
BURLINGTON COUNTY     423394        254673
OCEAN COUNTY          510916        252552
CAMDEN COUNTY         508932        146438
GLOUCESTER COUNTY     254673        144166
SALEM COUNTY          64285         121989
ATLANTIC COUNTY       252552        102437
CUMBERLAND COUNTY     146438        102326
CAPE_MAY COUNTY       102326        64285
Average = 400683
Standard deviation = 245685
At this point review your notes on each of the classing techniques that we discussed in class. For most of
these techniques you will also find some discussion in the readings.
EQUAL VALUE RANGE CLASSES: Divide the data in Table 1.3 into five classes of equal value
range. To provide better looking classes use an artificial minimum of zero and an artificial maximum of
900000. Show your results in the space below.
CLASS    LOWER LIMIT    UPPER LIMIT
1        __________     ___________
2        __________     ___________
3        __________     ___________
4        __________     ___________
5        __________     ___________
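As a general illustration of equal value range classing, the sketch below divides the distance between a chosen minimum and maximum into classes of identical width. The function name `equal_range_breaks` and the numbers used (0 to 100 in four classes) are hypothetical and deliberately differ from the problem above.

```python
def equal_range_breaks(minimum, maximum, n_classes):
    """Return the n_classes + 1 class limits for classes of equal value range."""
    width = (maximum - minimum) / n_classes   # value range covered by each class
    return [minimum + i * width for i in range(n_classes + 1)]


# Hypothetical example: four classes between 0 and 100, each 25 units wide.
print(equal_range_breaks(0, 100, 4))  # [0.0, 25.0, 50.0, 75.0, 100.0]
```

Each pair of neighboring limits gives the lower and upper limit of one class. You still need a rule for values that fall exactly on a shared limit so that the classes stay mutually exclusive.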
SHOW YOUR WORK IN THE SPACE BELOW:
MEAN AND STANDARD DEVIATION CLASSES: Using the real minimum and maximum as
lower and upper limits, divide the data in Table 1.4 [Percent of the population male] into four classes using
mean and standard deviation class breaks.
CLASS    LOWER LIMIT    UPPER LIMIT
1        __________     ___________
2        __________     ___________
3        __________     ___________
4        __________     ___________
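One common way to get four classes from the mean and standard deviation is to place breaks at the mean minus one standard deviation, the mean, and the mean plus one standard deviation, with the real minimum and maximum as the outer limits. The sketch below assumes that scheme; check your class notes for the scheme expected here. The function name `mean_sd_breaks` and the data are hypothetical.

```python
import math


def mean_sd_breaks(values):
    """Class limits [min, mean - SD, mean, mean + SD, max] for four classes.

    Assumes one-standard-deviation breaks and a population SD (divide by N).
    """
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in values) / n)
    return [min(values), mean - sd, mean, mean + sd, max(values)]


data = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical data: mean 5, SD 2
print(mean_sd_breaks(data))  # [2, 3.0, 5.0, 7.0, 9]
```

If the data are strongly skewed, mean minus one standard deviation can fall below the real minimum, which leaves the lowest class empty.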
SHOW YOUR WORK IN THE SPACE BELOW:
QUANTILE CLASSES: As you can see, the data in Table 1.3 are quite skewed. Quantile classes are
sometimes useful for skewed data. Establish the break points for quintile class limits. To help
you, the data are already sorted [see the sorted population column in Table 1.3].
CLASS    LOWER LIMIT    UPPER LIMIT
1        __________     ___________
2        __________     ___________
3        __________     ___________
4        __________     ___________
5        __________     ___________
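Quantile classing puts an equal number of observations, or as close to equal as possible, into each class. The sketch below shows one way to do that. The function name `quantile_classes` and the ten data values are hypothetical, and giving the extra observations to the lowest classes when the count does not divide evenly is an assumption, not a rule from this exercise.

```python
def quantile_classes(values, n_classes):
    """Return (lowest, highest) observed value for each quantile class.

    Assumes len(values) >= n_classes. When the count does not divide evenly,
    the lowest classes each receive one extra observation (an assumption).
    """
    ordered = sorted(values)
    base, extra = divmod(len(ordered), n_classes)
    classes = []
    start = 0
    for j in range(n_classes):
        size = base + (1 if j < extra else 0)   # observations in this class
        group = ordered[start:start + size]
        classes.append((group[0], group[-1]))
        start += size
    return classes


data = [12, 3, 25, 7, 18, 30, 1, 9, 21, 15]   # hypothetical data, N = 10
print(quantile_classes(data, 5))
# [(1, 3), (7, 9), (12, 15), (18, 21), (25, 30)]
```

With ten values and five classes, each class holds two observations. Tied values that straddle a class boundary need special care, because splitting them would put the same value into two classes.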
SHOW YOUR WORK IN THE SPACE BELOW:
Table 1.4 Kansas Data on Percentage of Population Male
FIPS     %MALE    SORT
20001    48.09    55.08
20003    48.16    55.04
20005    48.76    51.96
20007    47.99    51.76
20009    48.25    51.3
20011    47.05    50.9
20013    47.92    50.85
20015    49.12    50.81
20017    49.55    50.76
20019    48.63    50.55
20021    47.67    50.54
20023    47.89    50.48
20025    48.47    50.46
20027    48.45    50.46
20029    46.98    50.27
20031    49.49    49.95
20033    47.86    49.78
20035    48.46    49.78
20037    48.02    49.71
20039    48.57    49.68
20041    48.15    49.66
20043    49.08    49.64
20045    49.78    49.63
20047    48.27    49.56
20049    47.82    49.55
20051    48.99    49.52
20053    51.76    49.49
20055    50.76    49.43
20057    50.46    49.39
20059    48.43    49.35
20061    51.3     49.32
20063    49.64    49.3
20065    49.08    49.22
20067    49.95    49.2
20069    49.09    49.17
20071    49.66    49.14
20073    48.53    49.12
20075    47.49    49.09
20077    48.18    49.09
20079    48.46    49.08
20081    50.46    49.08
20083    49.56    49.06
20085    49.06    49.06
20087    50.27    49.02
20089    49.52    48.99
20091    48.33    48.99
20093    50.81    48.96
20095    48.6     48.89
20097    47.9     48.85
20099    47.97    48.82
20101    50.48    48.77
20103    55.08    48.76
20105    47.66    48.74
20107    49.06    48.68
20109    49.3     48.66
20111    48.82    48.64
20113    48.64    48.63
20115    47.89    48.6
20117    48.85    48.57
20119    49.09    48.56
20121    49.17    48.53
20123    48.3     48.52
20125    47.2     48.47
20127    48.77    48.46
20129    49.2     48.46
20131    49.39    48.45
20133    47.96    48.43
20135    49.22    48.42
20137    51.96    48.41
20139    48.68    48.33
20141    47.85    48.31
20143    48.42    48.3
20145    50.55    48.27
20147    48.56    48.26
20149    49.78    48.25
20151    48.41    48.25
20153    49.35    48.25
20155    49.14    48.22
20157    47.81    48.18
20159    47.77    48.16
20161    55.04    48.15
20163    48.25    48.15
20165    48.26    48.09
20167    48.25    48.03
20169    48.31    48.02
20171    49.63    47.99
20173    48.96    47.97
20175    50.54    47.96
20177    48.22    47.92
20179    50.9     47.91
20181    48.89    47.9
20183    48.03    47.89
20185    48.15    47.89
20187    49.68    47.86
20189    48.99    47.85
20191    48.66    47.82
20193    48.74    47.81
20195    49.32    47.77
20197    49.43    47.69
20199    50.85    47.67
20201    49.02    47.66
20203    49.71    47.49
20205    47.91    47.2
20207    48.52    47.05
20209    47.69    46.98
MEAN = 49.00047619
SD = 1.297314722
Part II - Mapping Various Classing Techniques
After working out these classing techniques by hand, you now have a really good idea what is taking place
when different classing techniques are utilized in mapping. Now you can create maps of New Jersey’s
population by county using the techniques demonstrated above.
Copy the exercise8 folder to your local C: drive and open up the map document file exercise8.mxd. You
will see four maps of New Jersey municipalities. Your objective is to color the maps by four classing
methods – natural breaks, quantile, equal interval and standard deviation.
Right-click on the layer name of one of the layers in the table of contents and open up the properties dialog
box. Click on the Symbology tab. In the “show” window select Quantities and Graduated colors. In the
Value field choose POP2000. Click on the Classify button to open up the classify window. This window
is where you can change the classify method and any parameters.
Spend some time playing around with the classification functions of ArcGIS. Each time you change a
parameter, click OK and then Apply to see how it changes the way the map is drawn. See if you can figure
out how the chart that opens in the classification window depicts the categories. Try moving the blue lines
by dragging them. How does that change the map?
Once you are comfortable with the classification functions, color each of your four maps with a different
classification method. Add a legend. Write a source statement (NJ DEP & US CENSUS). Be sure to add
titles for each classification type. Also add your name and any other necessary or desired graphic elements,
print out the map and hand in with the rest of this lab.