Download Political Research and Statistics

Document related concepts

Central limit theorem wikipedia , lookup

Transcript
Measures of Dispersion
9/26/2013
Readings
• Chapter 2 Measuring and Describing Variables
(Pollock) (pp.37-44)
• Chapter 6. Foundations of Statistical Inference
(128-133) (Pollock)
• Chapter 3 Transforming Variables (Pollock
Workbook)
OPPORTUNITIES TO DISCUSS
COURSE CONTENT
Office Hours For the Week
• When
– Friday 10-12
– Monday 10-12
– And by appointment
Homework
• Chapter 2
– Question 1: A, B, C, D, E
– Question 2: B, D, E (this requires a printout)
– Question 3: A, B, D
– Question 5: A, B, C, D
– Question 7: A, B, C, D
– Question 8: A, B, C
Course Learning Objectives
1. Students will learn the basics of research
design and be able to critically analyze the
advantages and disadvantages of different
types of design.
2. Students Will be able to interpret and explain
empirical data.
MEASURES OF DISPERSION
What are They?
• these measure the
uniformity of the data
• they measure how
closely or widely cases
are separated on a
variable.
The Standard Deviation
• A More accurate and
precise measure than
dispersion and
clustering
• Is the average distance
of values in a
distribution from the
mean
What it tells us
• When the value of the standard deviation is small,
values are clustered around the mean.
• When the value of the standard deviation is high,
values are spread far away from the mean.
About the Standard Deviation
• its based on the mean
• the larger the standard deviation, the more
spread out the values are and the more
different they are
• if the standard deviation =0 it means there is
no variability in the scores. They are all
identical.
From 2008
Who was more divisive?
The Standard Deviation
• It is a standardized measure…. So what?
• This means it has ratio ( the actual value)and
ordinal properties (the number of standard
deviations 0,1,2,3.. From the mean
• This means we can compare different means (e.g.
test scores)
The Standard Deviation and Outliers
• Any case that is more
than 2 standard
deviations away from
the mean
• These cases often
provide valuable
insights about our
distribution
2011 Baseball Salaries
How to determine the value of a
standard deviation
How to determine the value of a
standard deviation
• The value of +/- 1 s.d. = mean + value of s.d
– e.g. if the mean is 8 and the s.d is 2, the value of -1
s.d's is 6, and + 1 s.d.'s is 10
• The value of +/- 2 s.d. = mean + (value of s.d. *2)
– e.g. if the mean is 8 and the s.d is 2, the value of -2
s.d's is 4, and + 2 s.d.'s is 12
• Any value in the distribution lower than 4 and
higher than 12 is an outlier
ECU POL Sci
An Example from 2008
• States Database
• What is the Value of +/- 1 S.D?. (mean+ 1.s.d)
• What is the Value of +/-2 S.D? (mean +/- 2 s.d)
Unwrapping The Results
• Which are Outliers
• How did that shape the 2012 campaign
THE NORMAL CURVE
Different Kinds of Distributions
Rectangular
Camel Humps
Dromedary (one hump)
Bactrian (bi-modal)
The Normal/Bell Shaped curve
• Symmetrical around the
mean
• It has 1 hump, it is
located in the middle,
so the mean, median,
and mode are all the
same!
Why we use the normal curve
• To determine skewness
• The Normal Distribution
curve is the basis for
hypothesis/significance
testing
SKEWNESS
What is skewness?
• an asymmetrical
distribution.
• Skewness is also a
measure of symmetry,
• Most often, the median
is used as a measure of
central tendency when
data sets are skewed.
How to describe skewness
The Mean or the Median?
• In a normal distribution, the mean is the
preferred measure
• In a skewed distribution, you go with the
median
Deviate from the norm?
1. Divide the skewness
value
2. By the std. error of
skewness
A distribution is said to be skewed if
the magnitude of
(Skewness value/ St. Error of Skew) is
greater than 2 (in absolute value)
If the Value is Two or More
2 or
More
Skewness Value
Standard error of Skewness
Use the
Median
If the Value Is Two or Less
Less
Than 2
Mean
Skewness Value
Standard error of Skewness
Baseball Salaries again
• Divide the Skewness by
its standard error
– .800/.427
= 1.87
• This value is less than 2
so we use the mean (92
million)
• What does the positive
skew value mean???
Lets Try another One (Per Capita
income in the states)
• Divide the Skewness by its
standard error
.817/.337
= 2.42
• The value is greater than
two, and the skewness
value is positive
• What is the better
measure and what might
cause this distribution
shape?
CO2 Emissions by State
Percent Hispanic
World Urban Population
STATISTICAL SIGNIFICANCE
Testing
• Causality
• Statistical Significance
• Practical Significance
Statistical Significance
• A result is called statistically significant if it is unlikely to
have occurred by chance
• You use these to establish parameters, so that you can
state probability that a parameter falls within a
specified range called the confidence interval (chance
or not).
• Practical significance says if a variable is important or
useful for real-world. Practical significance is putting
statistics into words that people can use and
understand.
Curves & Significance Testing
What this Tells us
• Roughly 68% of the scores
in a sample fall within one
standard deviation of the
mean
• Roughly 95% of the scores
fall 2 standard deviations
from the mean (the exact
# for 95% is 1.96 s.d)
• Roughly 99% of the scores
in the sample fall within
three standard deviations
of the mean
A Practice Example
• Assuming a normal curve
compute the age (value)
– For someone who is +1 s.d,
from the mean
– what number is -1 s.d. from the
mean
• With this is assumption of
normality, what % of cases
should roughly fall within this
range (+/-1 S.D.)
• What about 2 Standard
Deviations, what percent
should fall in this range?
Life Expectancy in Latin America and
Caribbean
• Compute the estimated
values for Average Life
Expectancy for +/- 2
standard deviations
from the mean.
• With this is assumption
of normality, what % of
cases should fall within
this range (+/-2 s.d).
If you find this amusing or annoying,
you get the concept
STANDARD DEVIATION AND
CHARTS IN SPSS
Standard Deviation
For Ratio Variables
Step 2
Step 1
Step 4
Step 3
Standard Deviation in SPSS
• Open up the States.Sav
dataset and use the
union07 variable.
• Analyze
– Descriptive Statistics
• Descriptives
– Select your options
Testing for Skewness
In the Descriptives Command
Click
Here
In the Frequencies Command
Simple Bar Charts
• In SPSS
• OPEN GSS 2008
• Analyze
– Descriptive Statistics
• Frequencies