Chapter 3, McGrew and Monroe

Chapter 3
An Introduction to Statistical
Problem Solving in Geography
Summarized by Lana Hesler
Learning Objectives
▪ Understand the basic descriptive measures of central tendency
▪ Understand the basic descriptive measures of dispersion
▪ Understand the concept of relative variability
▪ Determine the value of measuring shape or relative position
▪ Realize potential effects of location data on descriptive statistics

Summarizing Data Sets

Measures of central tendency
◦ Numbers that represent the center or typical
value of a frequency distribution
▪ Includes mode, median, and mean

Measures of dispersion
◦ Numbers that depict the amount of spread or
variability in a data set
▪ Includes range, interquartile range, standard deviation, variance, and coefficient of variation
Summarizing Data Sets (cont.)

Measures of shape or relative position
◦ Numbers that further describe the nature or
shape of a frequency distribution
▪ Includes skewness – symmetry of a distribution
▪ Includes kurtosis – degree of flatness or peakedness in a distribution
Descriptive Statistics

Mode
◦ Value that occurs most
frequently

Median
◦ Middle value from a set of ranked observations: the value with an equal number of data units above and below it

Mean
◦ The arithmetic average of
the values
Graphics provided by: http://www.transtutors.com/statistics-homework-help/numerical-measures
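The three measures of central tendency can be compared on one small data set. The sketch below uses Python's standard `statistics` module; the precipitation values are hypothetical and chosen only for illustration.

```python
from statistics import mean, median, multimode

# Hypothetical sample: annual precipitation (inches) at nine stations.
precipitation = [28, 31, 31, 34, 36, 38, 41, 47, 74]

modes = multimode(precipitation)   # most frequently occurring value(s)
middle = median(precipitation)     # middle value of the ranked observations
average = mean(precipitation)      # arithmetic average of the values

print("mode(s):", modes)
print("median:", middle)
print("mean:", average)
```

In this made-up sample the single large value (74) pulls the mean above the median, while the mode and median are unaffected by it.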
Weighted Mean

Weighted Mean defined
◦ Arithmetic average calculated from class intervals
and class frequencies

Assumptions
◦ Without information to the contrary, data are
distributed evenly within the interval
◦ Best summary representation of the values in
each interval is the class midpoint
▪ Class midpoint – value that is exactly midway between the extreme values that identify the class interval
http://www.transtutors.com/statistics-homework-help/numerical-measures/weighted-mean.aspx
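As a minimal sketch of the weighted mean under the assumptions above, each class midpoint is multiplied by its class frequency, and the sum is divided by the total frequency. The class intervals and frequencies below are hypothetical, as is the helper name `weighted_mean`.

```python
# Hypothetical grouped data: ((lower bound, upper bound), class frequency).
classes = [((0, 10), 4), ((10, 20), 9), ((20, 30), 5), ((30, 40), 2)]

def weighted_mean(classes):
    """Weighted mean from class intervals, using class midpoints."""
    total = 0.0
    count = 0
    for (lower, upper), frequency in classes:
        midpoint = (lower + upper) / 2   # best summary value for the interval
        total += midpoint * frequency    # midpoint weighted by frequency
        count += frequency
    return total / count

print(weighted_mean(classes))
```

Because the individual values inside each interval are unknown, the result is an estimate whose quality depends on how evenly the data are spread within each class.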
Relative Variability


Defined as the amount of spread in a set of values
Spread can be measured in different ways
◦ Simplest measure of variability is the range: the difference between the largest and smallest values
◦ Quantiles divide ranked data into equal portions, such as quartiles or percentiles
◦ Interquartile range – data are divided into 4 equal portions; the difference between the 75th and 25th percentiles is the interquartile range
http://www.mathsisfun.com/definitions/range-statistics-.html
http://faculty.uncfsu.edu/dwallace/lesson%205.pdf
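The range and interquartile range can be sketched with the standard `statistics` module. The data are hypothetical, and the `"inclusive"` quartile method is an assumption: textbooks and software use several slightly different quartile conventions, so results may differ from a hand calculation.

```python
from statistics import quantiles

# Hypothetical sample values.
values = [28, 31, 31, 34, 36, 38, 41, 47, 74]

# Range: difference between the largest and smallest values.
data_range = max(values) - min(values)

# Quartiles split the ranked data into four equal portions.
q1, q2, q3 = quantiles(values, n=4, method="inclusive")

# Interquartile range: 75th percentile minus 25th percentile.
iqr = q3 - q1

print("range:", data_range)
print("interquartile range:", iqr)
```

The range here is driven by the single extreme value, while the interquartile range describes only the middle half of the data.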
Standard Deviation and Variance

Standard Deviation
◦ The most common measure of variability or dispersion
◦ Builds on the least squares property of the mean: the sum of squared deviations is smaller around the mean than around any other value

Variance
◦ The square of the standard deviation
Formula provided by: http://en.wikipedia.org/wiki/Standard_deviation
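A short sketch of these measures, using Python's standard `statistics` module and hypothetical values. The slide does not say whether the population or the sample form is intended, so both are shown; the coefficient of variation (standard deviation divided by the mean), listed earlier among the measures of dispersion, is computed here from the population form as an assumption.

```python
from statistics import mean, pstdev, pvariance, stdev

# Hypothetical sample values.
values = [2, 4, 4, 4, 5, 5, 7, 9]

center = mean(values)
pop_variance = pvariance(values)   # mean squared deviation from the mean
pop_sd = pstdev(values)            # square root of the population variance
sample_sd = stdev(values)          # sample form, divides by n - 1

# Coefficient of variation: spread relative to the size of the mean.
cv = pop_sd / center

print("variance:", pop_variance)
print("standard deviation:", pop_sd)
print("sample standard deviation:", round(sample_sd, 3))
print("coefficient of variation:", cv)
```

Because the coefficient of variation is unitless, it allows the spread of variables measured in different units or on different scales to be compared.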
Standard Normal Distribution

68-95-99.7 rule
◦ In a normal distribution, about 68%, 95%, and 99.7% of values lie within 1, 2, and 3 standard deviations of the mean
http://www.oswego.edu/~srp/stats/z.htm
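The 68-95-99.7 percentages can be checked numerically. For a normal distribution, the share of values within k standard deviations of the mean equals erf(k / √2); the helper name `share_within` below is hypothetical.

```python
import math

def share_within(k):
    """Share of a normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {share_within(k) * 100:.2f}%")
```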
Measures of Shape or Relative
Position

Skewness
◦ measures the degree of
symmetry in a
frequency distribution
◦ determines the extent
to which the values are
evenly or unevenly
distributed on either
side of the mean

Kurtosis
◦ measures flatness or
peakedness of a data
set
Graphics provided by:
http://www.itl.nist.gov/div898/handbook/eda/section3/eda35b.htm
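Skewness and kurtosis can be sketched with simple moment-based formulas. This is an assumption: the textbook or a statistics package may use a sample-corrected variant, so values can differ slightly. The function names and data are hypothetical.

```python
from statistics import mean, pstdev

def skewness(values):
    """Moment-based skewness: 0 for a symmetric distribution."""
    m, s = mean(values), pstdev(values)
    return sum(((x - m) / s) ** 3 for x in values) / len(values)

def excess_kurtosis(values):
    """Moment-based kurtosis minus 3, so a normal distribution scores 0."""
    m, s = mean(values), pstdev(values)
    return sum(((x - m) / s) ** 4 for x in values) / len(values) - 3

symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 1, 2, 2, 3, 10]

print("skewness (symmetric):", skewness(symmetric))
print("skewness (right-skewed):", skewness(right_skewed))
print("excess kurtosis (symmetric):", excess_kurtosis(symmetric))
```

Positive skewness indicates a longer tail of high values; negative excess kurtosis indicates a distribution flatter than the normal curve.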
Spatial Data and Descriptive
Statistics

Boundary delineation
◦ Idea that the location of study-area boundaries can affect various descriptive statistics
For example:
The watershed area is highlighted
in yellow as the area that will be
covered in this watershed study.
The other colored areas are
watershed areas that will not be
covered.
http://proceedings.esri.com/library/userconf/cahinvrug09/papers/user-presentations/watershed_boundary_delineation.pdf
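The boundary effect can be illustrated with a toy example: the same set of sites yields a different mean depending on where the study-area boundary is drawn. The coordinates, values, and the rectangular boundary are all hypothetical and are not taken from the watershed study.

```python
from statistics import mean

# Hypothetical sample sites: (x, y, measured value).
sites = [(1, 1, 10), (2, 3, 12), (4, 2, 14), (6, 5, 30), (7, 1, 34)]

def mean_inside(sites, x_min, x_max, y_min, y_max):
    """Mean of the values at sites falling inside a rectangular boundary."""
    inside = [v for x, y, v in sites
              if x_min <= x <= x_max and y_min <= y <= y_max]
    return mean(inside)

narrow = mean_inside(sites, 0, 5, 0, 5)   # boundary excludes the two eastern sites
wide = mean_inside(sites, 0, 8, 0, 5)     # boundary includes all five sites

print("mean with narrow boundary:", narrow)
print("mean with wide boundary:", wide)
```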
Spatial Data and Descriptive
Statistics (cont.)

Modifiable areal units
◦ Idea that using alternative subdivision or regionalization schemes within the same overall study area can influence descriptive statistics
For example:
These aggregated districts are modifiable areal units. The study area has been subdivided in several ways to show Indiana's crop data aggregated east-west in figure C and north-south in figure D.
http://www.agriculture.purdue.edu/ssmc/Frames/SSMC_newsletter11_2006.pdf
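A toy sketch of the modifiable areal unit idea: the same four cell values are grouped into two zones in two different ways, and the zone-level statistics change even though the underlying data do not. The grid values are hypothetical and unrelated to the Indiana example.

```python
from statistics import mean, pstdev

# Hypothetical 2 x 2 grid of cell values covering one study area.
grid = [[10, 20],
        [30, 80]]

# Scheme 1: each row of cells forms one zone.
row_zones = [mean(row) for row in grid]

# Scheme 2: each column of cells forms one zone.
col_zones = [mean(col) for col in zip(*grid)]

print("row zones:", row_zones, "spread:", pstdev(row_zones))
print("column zones:", col_zones, "spread:", pstdev(col_zones))
```

Both schemes keep the same overall mean, but the zone values and the spread between zones depend on how the area was subdivided.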
Spatial Data and Descriptive
Statistics (cont.)

Spatial Aggregation
◦ Idea that descriptive statistics can vary with the spatial level, or scale, at which the data are summarized
For example:
The first image shows the unemployment statistics by region. The second image shows the unemployment statistics by state. The same information is presented in two different images, depending on the scale at which the data are portrayed.
http://www.nationalatlas.gov/articles/people/a_unemployment.html#one
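The scale effect can be sketched by summarizing the same values at two levels. The rates and region names below are hypothetical, and the regional figures are simple unweighted averages of the state rates, which is a simplification: real unemployment rates would be weighted by labor force size.

```python
from statistics import mean

# Hypothetical unemployment rates (%) for states grouped into two regions.
regions = {
    "North": [3.0, 5.0, 4.0],
    "South": [8.0, 6.0, 10.0],
}

# State level: every state is its own observation.
state_rates = [rate for rates in regions.values() for rate in rates]

# Region level: each region is reduced to one averaged observation.
region_rates = [mean(rates) for rates in regions.values()]

state_range = max(state_rates) - min(state_rates)
region_range = max(region_rates) - min(region_rates)

print("range at state level:", state_range)
print("range at region level:", region_range)
```

Aggregating to the coarser level smooths out local extremes, so the measured dispersion shrinks even though the underlying data are unchanged.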
Lesson Review
▪ Median, mode, and mean are used to measure central tendency
▪ Range, interquartile range, standard deviation, variance, and coefficient of variation are used to measure dispersion
▪ Skewness and kurtosis are measures of shape or relative position
▪ Boundary delineation, modifiable areal units, and spatial aggregation can all affect descriptive statistics computed from spatial data
