PS 366
3
Measurement
• Related to reliability, validity:
• Bias and error
– Is something wrong with the instrument?
– Is something up with the thing being measured?
Measurement
• Bias & error with the instrument
– Random?
– Systematic?
Measurement
• Bias & error with the thing being measured
– Random?
• failure to understand a survey question
– Systematic?
• does person have something to hide?
Measurement
• Example:
– Reliability, validity, error & bias in measuring unemployment
– Census survey [also hiring reports, claims filed w/ government, state data to feds...]
– What sources of bias?
Measurement
• Unemployment [employment status]:
– Fully employed
– Part time
– looking for work, + part time
– looking for work, no job
– lost job, not looking for work
– retired
Measurement
• Example:
– Reliability, validity, error & bias in measuring victims of violent crime
– Census surveys, police records, FBI UCR
– What sources of bias?
Measurement
• How do we ask people questions about attitudes and behavior that aren’t socially accepted?
– Prejudice
– Racism
– Feelings toward gays & lesbians
– Shoplifting
Measurement: Item Count Technique
• Here are 3 things that sometimes make people angry or upset. After reading these, record how many of them upset you. Not which ones, just how many.
– federal govt increasing the gas tax
– professional athletes getting million dollar salaries
– large corporations polluting the environment
Measurement: Item Count Technique
• 3-item list:
– federal govt increasing the gas tax
– professional athletes getting million dollar salaries
– large corporations polluting the environment
• 4-item list:
– federal govt increasing the gas tax
– professional athletes getting million dollar salaries
– large corporations polluting the environment
– a black family moving next door
Measurement: Item Count Technique
• Randomly assign ½ of subjects to the 3-item list
• Randomly assign ½ of subjects to the 4-item list
• Difference in mean # of responses between groups = % upset by sensitive item
– (4-item list mean – 3-item list mean) * 100 = %
Item Count
               Control   Treatment   % upset
• Non South    2.28      2.24        0
• South        1.95      2.37        42
• South: 2.37 – 1.95 = 0.42; 0.42 * 100 = 42%
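The South calculation can be sketched in a few lines of Python. The function name `item_count_estimate` and its parameter names are illustrative assumptions, not part of the course materials. The sketch also assumes the larger South mean (2.37) belongs to the 4-item list, which is what the subtraction on the slide implies.

```python
def item_count_estimate(control_mean: float, treatment_mean: float) -> float:
    """Estimated percent of respondents upset by the sensitive item.

    control_mean:   mean count reported by the group shown the 3-item list
    treatment_mean: mean count reported by the group shown the 4-item list
    """
    # The two lists differ only by the sensitive item, so the gap in means
    # is read as the share of respondents who counted that item.
    return (treatment_mean - control_mean) * 100


# South means from the slide (assumed: 1.95 = 3-item list, 2.37 = 4-item list)
south_pct = item_count_estimate(control_mean=1.95, treatment_mean=2.37)
print(round(south_pct))
```

Because respondents report only a count, no individual answer reveals whether the sensitive item was among those chosen; the estimate exists only at the group level.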
Item Count – Using poll information
• 3-item list:
– 1) The candidate graduated from a prestigious college
– 2) The candidate ran a business
– 3) The candidate’s family background
• 4-item list:
– 1) The candidate graduated from a prestigious college
– 2) The candidate ran a business
– 3) The candidate’s family background
– 4) The candidate is ahead in polls
Use poll info
            Control   Treatment   % use poll
• All       1.36      1.39        3.2
• Young     1.04      1.46        41
• Is it significant?
– Depends... how well does the mean reflect the group? How much variation is there around the mean?
Central Tendency
• Statistics that describe the ‘average’ or ‘typical’ value of a variable
– Mean
– Median
– Mode
Central Tendency
• Why median vs. mean?
– Household income
– Home prices
Median vs Mean HH Income
     median    mean
•    60,667    63,809
•    49,847    61,187
•    66,875    74,653
•    67,005    71,443
•    45,735    66,662
•    63,472    73,648
•    44,891    60,250
•    50,262    59,688
•    39,930    60,495
•    65,885    80,581
•    76,917    85,837
•    61,146    64,526
•    56,815    59,781
•    62,244    78,289
Median vs Mean Price
• Seattle median: $400K
• Seattle mean: higher!
Central Tendency
• Mean
125
92
72
126
120
99
130
100
sum = 864
• mean = sum of X / N
• = 864 / 8
• mean = 108
• Is this repetitive?
Central Tendency
• Mean
125
92
300
126
120
99
130
100
sum = 1092
• mean = sum of X / N
• = 1092 / 8
• mean = 136.5
• Is this repetitive?
Central Tendency
• Median
300
130
126
125
120
100
99
92
• position of median = (N + 1) / 2
– (8 + 1) / 2
– 9 / 2
– 4.5th value
– halfway between 125 and 120 = 122.5
• Is this repetitive?
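A short Python sketch makes the contrast between the two measures concrete. It uses the two score lists from the mean slides (the second replaces 72 with 300) and the standard-library `statistics` module. The variable names are illustrative, and the median of the first list is computed here rather than taken from the slides.

```python
from statistics import mean, median

scores = [125, 92, 72, 126, 120, 99, 130, 100]           # original scores
scores_outlier = [125, 92, 300, 126, 120, 99, 130, 100]  # 72 replaced by 300

mean_plain = mean(scores)
mean_outlier = mean(scores_outlier)

# With an even number of cases, median() averages the two middle values,
# which is the "4.5th value" rule from the slide.
median_plain = median(scores)
median_outlier = median(scores_outlier)

print("mean:  ", mean_plain, "->", mean_outlier)
print("median:", median_plain, "->", median_outlier)
```

One extreme score moves the mean by 28.5 points but the median by only 12.5, which is the usual argument for reporting medians for household income and home prices.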
Central Tendency
• Example
$120,000
$60,000
$40,000
$40,000
$30,000
$30,000
$30,000
• Mean = $50,000
• Median = $40,000
• Mode = $30,000
• Which is most representative?
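The three statistics for this example can be checked with Python's standard-library `statistics` module. This is a minimal sketch; it assumes the first income is $120,000, the only reading consistent with the stated mean of $50,000.

```python
from statistics import mean, median, mode

# Incomes from the example slide (first value assumed to be $120,000)
incomes = [120_000, 60_000, 40_000, 40_000, 30_000, 30_000, 30_000]

income_mean = mean(incomes)      # sum of values / number of values
income_median = median(incomes)  # middle (4th) value of the 7 sorted values
income_mode = mode(incomes)      # most frequent value

print(income_mean, income_median, income_mode)
```

The single high income pulls the mean above the median, and the mode sits lowest, the ordering expected in a right-skewed distribution.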
The Distribution
• Where is mean, median, mode if
– Normal
– Left skew
– Right skew
Variation
• How are observations distributed around the central point?
• Is there one central point, or more than one?
– unimodal
– bimodal
Variation
• Which is unimodal, which is bimodal:
– Mass public ideology
• Very conservative, conservative, moderate, liberal, very liberal
– Members of Congress ideology
– What does the mean mean?
Distribution
• How spread out are the observations?
• Single peak
– not much variation
• Flat?
– lots of variation; what does mean mean?
Variation
• Standard deviation
• Information about variation around the mean
• 1
Variation
• Mean
125
92
72
126
120
99
130
100
mean = 108
Variance = sum of squared distances of each observation from the mean, divided by the # of observations
Variance
x      (x - mean)
125    125 - 108
92     92 - 108
72     72 - 108
126    126 - 108
120    120 - 108
99     99 - 108
130    130 - 108
100    100 - 108
mean = 108
Variance
x      (x - mean)   (x - mean)^2
125     17           289
92     -16           256
72     -36          1296
126     18           324
120     12           144
99      -9            81
130     22           484
100     -8            64
mean = 108
sum sqs = 2938
Variance & Std. deviation
• Variance is in squared units, so it does not tell us much on its own
• Standard deviation = square root of variance
• mean = 108
• variance = 2938 / 8
• = 367.25
• sd = sqrt(367.25)
• = 19.2
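The steps on the variance slides can be traced in Python as a minimal sketch. It divides by N, as the slides do, which matches `statistics.pvariance`. Dividing by N − 1 instead gives the sample variance (`statistics.variance`), which the slides do not use.

```python
from math import sqrt
from statistics import pvariance

scores = [125, 92, 72, 126, 120, 99, 130, 100]

score_mean = sum(scores) / len(scores)           # 864 / 8
deviations = [x - score_mean for x in scores]    # (x - mean)
squared = [d ** 2 for d in deviations]           # (x - mean)^2
sum_sq = sum(squared)                            # sum of squares

variance = sum_sq / len(scores)                  # divide by N, as on the slide
sd = sqrt(variance)                              # back in the original units

print(sum_sq, variance, round(sd, 1))
print(pvariance(scores))                         # library cross-check
```

Taking the square root returns the measure to the units of the original scores, which is why the standard deviation is easier to read than the variance.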
Variation
• Range (lo – hi)
• Variance: (sum of squared distances from mean) / n
• Standard deviation: expresses variation around the mean in ‘standardized’ units
• Bigger # for each = more variation
• Standardized units allow us to compare apples to oranges
Standard Deviation
• Total convictions
– mean = 178, s.d. = 199.7
• Per capita convictions (per 10,000 people)
– mean = .357, s.d. = .197
Standard Deviation
Low s.d. relative to mean
High s.d. relative to mean
Standard Deviation
Distribution of total convictions: mean 187; s.d. 199
Standard Deviation
Mean .357, s.d. .197
Standard Deviation
Turnout by state: mean = .62 ; s.d. = .07
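One way to make "s.d. relative to mean" concrete is to divide each standard deviation by its mean. The sketch below does that for the three distributions, taking the total-convictions figures from the first Standard Deviation slide (mean 178, s.d. 199.7). The helper name `sd_to_mean_ratio` is an illustrative assumption, and the ratio itself is an added illustration rather than something the slides compute.

```python
def sd_to_mean_ratio(mean: float, sd: float) -> float:
    """Standard deviation expressed as a fraction of the mean."""
    return sd / mean


# (mean, s.d.) pairs as reported on the slides
distributions = {
    "total convictions": (178, 199.7),
    "per capita convictions": (0.357, 0.197),
    "turnout by state": (0.62, 0.07),
}

ratios = {name: sd_to_mean_ratio(m, s) for name, (m, s) in distributions.items()}

for name, ratio in ratios.items():
    print(f"{name}: s.d. is {ratio:.2f} times the mean")
```

Total convictions has a standard deviation larger than its mean (high variation), while turnout's is about a tenth of its mean (low variation).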
Standard Deviation
• Tells even more if distribution ‘normal’
• If data interval
• What about a state that has 50% turnout, and
.7 corruption convictions per 10,000?
• Where are they in each distribution?
Standard Deviation
[Figure: distribution of per capita convictions; X marks the state’s position]
Mean .357, s.d. .197
Standard Deviation
[Figure: distribution of turnout by state; X marks the state’s position]
Turnout by state: mean = .62 ; s.d. = .07
Standard Deviation & z-scores
• State’s position on turnout = z
– z = (score – mean) / s.d.
– = (.50 – .59) / .07
– = –.09 / .07 = –1.28
• 1.28 standard deviations below mean on turnout
Standard Deviation & z-scores
• State’s position on corruption = z
– z = (score – mean) / s.d.
– = (.70 – .35) / .19
– = +.35 / .19 = +1.84
• 1.84 standard deviations above mean on corruption
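The z-score formula translates directly into a small Python function. This is a minimal sketch with an illustrative function name, checked against the corruption figures used on the slide (score .70, mean .35, s.d. .19).

```python
def z_score(score: float, mean: float, sd: float) -> float:
    """Distance of a score from the mean, measured in standard deviations."""
    return (score - mean) / sd


# Corruption convictions per 10,000 people, using the slide's rounded values
corruption_z = z_score(score=0.70, mean=0.35, sd=0.19)
print(round(corruption_z, 2))
```

A positive z means the case sits above the mean and a negative z below it, so the sign carries the direction and the size carries the distance.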
Std Dev & Normal Curve
Standard Deviation & z-scores
• Apples: Turnout –1.28
• Oranges: Corruption +1.84
• Z = 0 is mean
• Z = 3 is very rare
Z scores and Normal Curve
• How many states between mean & z = +1.84?
• How many above z = +1.84?
• See Appendix C in text
– below mean = 50%
– between mean and z = 1.84 = 46.7%
– beyond z = 1.84 = 3.3% [about 1.6 of 50 states if normal]
Z scores and Normal Curve
• How many states between mean & z = –1.28?
• How many below z = –1.28?
• See Appendix C in text
– above mean = 50%
– between mean and z = –1.28 = 39.9%
– beyond z = –1.28 = 10.1% [about 5 of 50 states if normal]
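The Appendix C lookups can be approximated in Python with the standard library's `statistics.NormalDist` (Python 3.8+). This is a sketch standing in for the printed table, not a reproduction of it. The helper name `normal_areas` and the figure of 50 states are assumptions made for illustration.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal curve: mean 0, s.d. 1


def normal_areas(z: float) -> tuple[float, float]:
    """Return (share between the mean and z, share beyond z) as proportions."""
    between = abs(std_normal.cdf(z) - 0.5)  # area from the mean out to z
    beyond = 0.5 - between                  # remaining tail on that side
    return between, beyond


between_184, beyond_184 = normal_areas(1.84)    # corruption z-score
between_128, beyond_128 = normal_areas(-1.28)   # turnout z-score

N_STATES = 50  # assumed number of cases
for z, (between, beyond) in {1.84: (between_184, beyond_184),
                             -1.28: (between_128, beyond_128)}.items():
    print(f"z = {z:+.2f}: between = {between:.1%}, beyond = {beyond:.1%}, "
          f"expected states beyond = {beyond * N_STATES:.1f}")
```

The two shares always add to 50% because each z-score splits one half of the curve into the part nearer the mean and the tail beyond it.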