PS 366 3

Measurement
• Related to reliability, validity:
• Bias and error
  – Is something wrong with the instrument?
  – Is something up with the thing being measured?

Measurement
• Bias & error with the instrument
  – Random?
  – Systematic?

Measurement
• Bias & error with the thing being measured
  – Random?
    • failure to understand a survey question
  – Systematic?
    • does the person have something to hide?

Measurement
• Example:
  – Reliability, validity, error & bias in measuring unemployment
  – Census survey [also hiring reports, claims filed w/ government, state data to feds...]
  – What sources of bias?

Measurement
• Unemployment [employment status]:
  – Fully employed
  – Part time
  – looking for work, + part time
  – looking for work, no job
  – lost job, not looking for work
  – retired

Measurement
• Example:
  – Reliability, validity, error & bias in measuring victims of violent crime
  – Census surveys, police records, FBI UCR
  – What sources of bias?

Measurement
• How do we ask people questions about attitudes, behavior that isn't socially accepted?
  – Prejudice
  – Racism
  – Feelings toward gays & lesbians
  – Shoplifting

Measurement: Item Count Technique
• Here are 3 things that sometimes make people angry or upset. After reading these, record how many of them upset you. Not which ones, just how many.
  – federal govt increasing the gas tax
  – professional athletes getting million dollar salaries
  – large corporations polluting the environment

Measurement: Item Count Technique
• 3 item list
  – federal govt increasing the gas tax
  – professional athletes getting million dollar salaries
  – large corporations polluting the environment
• 4 item list
  – federal govt increasing the gas tax
  – professional athletes getting million dollar salaries
  – large corporations polluting the environment
  – a black family moving next door

Measurement: Item Count Technique
• Randomly assign ½ of subjects to the 3 item list (control)
• Randomly assign ½ of subjects to the 4 item list (treatment)
• Difference in mean # of responses between groups, times 100 = % upset by sensitive item
  – (treatment mean - control mean) * 100 = %

Item Count
                Control   Treatment   % upset
• Non South      2.28       2.24         0
• South          1.95       2.37        42
  – 2.37 - 1.95 = 0.42; 0.42 * 100 = 42%

Item Count – Using poll information
• 3 item list
  1) The candidate graduated from a prestigious college
  2) The candidate ran a business
  3) The candidate's family background
• 4 item list
  1) The candidate graduated from a prestigious college
  2) The candidate ran a business
  3) The candidate's family background
  4) The candidate is ahead in polls

Use poll info
                Control   Treatment   % use poll
• All            1.36       1.39         3.2
• Young          1.04       1.46        41
• Is it significant?
  – Depends... how well does the mean reflect the group? How much variation is there around the mean?

Central Tendency
• Statistics that describe the 'average' or 'typical' value of a variable
  – Mean
  – Median
  – Mode

Central Tendency
• Why median vs. mean?
  – Household income
  – Home prices

Median vs Mean HH Income
  Median     Mean
  60,667    63,809
  49,847    61,187
  66,875    74,653
  67,005    71,443
  45,735    66,662
  63,472    73,648
  44,891    60,250
  50,262    59,688
  39,930    60,495
  65,885    80,581
  76,917    85,837
  61,146    64,526
  56,815    59,781
  62,244    78,289

Median vs Mean Price
• Seattle Median $400K
• Seattle Mean higher!
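Returning to the item count slides above, the estimate can be sketched in a few lines of Python. This is only an illustration built on the group means shown in the Item Count table; the function name `item_count_estimate` and its parameter names are made up for this sketch. It assumes the estimate is the treatment (4 item) mean minus the control (3 item) mean, which is the direction that reproduces the 42% for the South. It also assumes that a small negative difference, as in the Non South row, is read as zero.

```python
# Sketch of the item count estimate, using the group means from the slides.
# Function and variable names are illustrative, not from the course.

def item_count_estimate(control_mean, treatment_mean):
    """Return the estimated % of respondents upset by the sensitive item.

    control_mean:   mean number of items reported on the 3 item list
    treatment_mean: mean number of items reported on the 4 item list
    """
    return (treatment_mean - control_mean) * 100


south = item_count_estimate(control_mean=1.95, treatment_mean=2.37)
non_south = item_count_estimate(control_mean=2.28, treatment_mean=2.24)

print(f"South:     {south:.0f}%")      # 42%
print(f"Non South: {non_south:.0f}%")  # -4%, which the slide reports as 0
```

Because the two groups are randomly assigned, the only systematic difference between them is the extra item, so the gap in means can be attributed to it. Whether a gap of a given size is meaningful still depends on the variation around each mean.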
Central Tendency
• Mean
  125  92  72  126  120  99  130  100    sum = 864
• mean = sum X / N
• = 864 / 8
• mean = 108
• Is this representative?

Central Tendency
• Mean
  125  92  300  126  120  99  130  100    sum = 1092
• mean = sum X / N
• = 1092 / 8
• = 136.5
• Is this representative?

Central Tendency
• Median
  300  130  126  125  120  100  99  92
• position of median = (N + 1) / 2
  – (8 + 1) / 2
  – 9 / 2
  – 4.5th value: midway between the 4th and 5th (125, 120)
• median = 122.5
• Is this representative?

Central Tendency
• Example
  $120,000  $60,000  $40,000  $40,000  $30,000  $30,000  $30,000
• Mean = $50,000   Mdn = $40,000   Mo = $30,000
• Which is most representative?

The Distribution
• Where is mean, median, mode if
  – Normal
  – Left skew
  – Right skew

Variation
• How are observations distributed around the central point?
• Is there one central point, or more than one?
  – unimodal
  – bimodal

Variation
• Which is unimodal, which is bimodal:
  – Mass public ideology
    • V. con, con, moderate, lib, v. lib
  – Members of Congress ideology
  – What does the mean mean?

Distribution
• How spread out are the observations?
• Single peak – not much variation
• Flat? – lots of variation; what does the mean mean?

Variation
• Standard deviation
• Information about variation around the mean

Variation
• Mean
  125  92  72  126  120  99  130  100    mean = 108
• Variance = sum of squared distances of each obsv from mean, over # of observations

Variance
• Mean
  125  92  72  126  120  99  130  100    mean = 108
  (x - mean)
  125 - 108
   92 - 108
   72 - 108
  126 - 108
  120 - 108
   99 - 108
  130 - 108
  100 - 108

Variance
• Mean
  125  92  72  126  120  99  130  100    mean = 108
  (x - mean)   (x - mean)^2
      17           289
     -16           256
     -36          1296
      18           324
      12           144
      -9            81
      22           484
      -8            64
                sum sqs = 2938

Variance & Std. deviation
• Variance does not tell us much
• Standard deviation = square root of variance
• mean = 108
• variance = 2938 / 8
• = 367.25
• sd = sqrt 367.25
• = 19.2

Variation
• Range (lo – hi)
• Variance: (sum of squared distances from mean) / n
• Standard Deviation: expresses variation around the mean in 'standardized' units

Standard Deviation
• Bigger # for each = more variation
• Allows us to compare apples to oranges

Standard Deviation
• Total convictions
  – mean = 178, s.d. = 199.7
• Per capita convictions (per 10,000 people)
  – mean = .357, s.d. = .197

Standard Deviation
[Figure: low s.d. relative to mean vs. high s.d. relative to mean]

Standard Deviation
[Figure] Distribution of total convictions: mean 187; s.d. 199

Standard Deviation
[Figure] Mean .357, s.d. .197

Standard Deviation
[Figure] Turnout by state: mean = .62; s.d. = .07

Standard Deviation
• Tells even more if distribution 'normal'
• If data interval
• What about a state that has 50% turnout, and .7 corruption convictions per 10,000?
• Where are they in each distribution?

Standard Deviation
[Figure: state marked with X] Mean .357, s.d. .197

Standard Deviation
[Figure: state marked with X] Turnout by state: mean = .62; s.d. = .07

Standard Deviation & z-scores
• State's position on turnout = z
  – z = (score - mean) / s.d.
  – = (.50 - .61) / .07
  – = -.09 / .07 = -1.28
• 1.28 standard deviations below mean on turnout

Standard Deviation & z-scores
• State's position on corruption = z
  – z = (score - mean) / s.d.
  – = (.70 - .35) / .19
  – = +.35 / .19 = +1.84
• 1.84 standard deviations above mean on corruption

Std Dev & Normal Curve
[Figures]

Standard Deviation & z-scores
• Apples: Turnout -1.28
• Oranges: Corruption +1.84
• Z = 0 is the mean
• Z = 3 is 3 s.d. above the mean: very rare

Z scores and Normal Curve
• How many states between mean & +1.84?
• How many above 1.84?
• See Appendix C in text
  – below mean = 50%
  – between mean and z = 1.84 = 46.7%
  – beyond z = 1.84 = 3.3% [about 1.6 of 50 states if normal]

Z scores and Normal Curve
• How many states between mean & -1.28?
• How many below z = -1.28?
• See Appendix C in text
  – above mean = 50%
  – between mean and z = -1.28 = 39.97%
  – beyond z = -1.28 = 10.03% [about 5 of 50 states if normal]
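The worked numbers on the central tendency, variance, and z-score slides can be checked with a short Python sketch. It is offered as an illustration, not as the course's own method. The helper names (`variance`, `std_dev`, `z_score`, `area_between_mean_and`) are invented here. The variance divides by N, as the slides do; many texts divide by N - 1 for a sample. Python's `statistics.NormalDist` (Python 3.8+) stands in for the Appendix C table.

```python
import math
from statistics import NormalDist, median

# The eight scores from the slides, and the same list with 72 replaced by 300.
scores = [125, 92, 72, 126, 120, 99, 130, 100]
with_outlier = [125, 92, 300, 126, 120, 99, 130, 100]


def mean(xs):
    """mean = sum X / N"""
    return sum(xs) / len(xs)


def variance(xs):
    """Sum of squared distances from the mean, divided by N (as on the slides)."""
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)


def std_dev(xs):
    """Standard deviation = square root of the variance."""
    return math.sqrt(variance(xs))


def z_score(score, mu, sd):
    """z = (score - mean) / s.d."""
    return (score - mu) / sd


def area_between_mean_and(z):
    """Share of a normal distribution lying between the mean and z."""
    return abs(NormalDist().cdf(z) - 0.5)


print(mean(scores))                              # 108.0
print(variance(scores))                          # 367.25
print(round(std_dev(scores), 1))                 # 19.2
print(mean(with_outlier), median(with_outlier))  # 136.5 122.5

# Corruption example, using the rounded mean and s.d. from the z-score slide.
z_corruption = z_score(0.70, 0.35, 0.19)
print(round(z_corruption, 2))                    # 1.84

inside = area_between_mean_and(1.84)
print(f"{inside:.1%} between mean and z, {0.5 - inside:.1%} beyond")
# 46.7% between mean and z, 3.3% beyond
```

The single outlier moves the mean from 108 to 136.5, while the median of the same eight values is 122.5. That is the reason to prefer the median for skewed variables such as household income or home prices.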