Statistical Analyses: t-tests
Psych 250, Winter, 2013

Hypothesis: People will give longer sentences when the victim is female.
Independent Variable: Gender of the Victim
Dependent Variable: Length of Sentence

Types of Measures / Variables
• Nominal / categorical – Gender, major, blood type, eye color
• Ordinal – Rank-order of favorite films; Likert scales?
• Interval / scale – Time, money, age, GPA

Main Analysis Techniques
Variable Type         Example                 Commonly-used Statistical Method
Nominal by Nominal    blood type by gender    Chi-square
Scale by Nominal      GPA by gender           t-test
                      GPA by major            Analysis of Variance
Scale by Scale        weight by height        Regression
                      GPA by SAT              Correlation

Stat Analysis / Hypothesis Testing
1. Form of the relationship
2. Statistical significance

Variables: Scale by Categorical
• Form of the relationship: Means of each category (M & F victim)
• Statistical Significance: Independent samples t-test

Means observed in Sample
Victim Gender    Average Sentence
Male             6 months
Female           16 months

Statistical Significance
• Q: Is this a “statistically significant” difference?
• Can the “null hypothesis” be rejected?
Null hypothesis: there are NO differences in sentencing for male vs. female victims

[Figure: a sample (n = 40; M victim: 6 months, F victim: 16 months) drawn from the universe (n = ∞); inference runs from the sample back to the universe]

Logic of Statistical Inference
• What is the probability of drawing the observed sample (M = 6 months vs. F = 16 months) from a universe with no differences?
• If the probability is very low, then the differences in the sample likely reflect differences in the universe
• Then the null hypothesis can be rejected; the difference in the sample is statistically significant

Strategy
• Draw an infinite number of samples of n = 40, and graph the distribution of their male victim / female victim differences

[Figure: samples of n = 40 drawn from a universe (n = ∞) under the null hypothesis M = 11 months, F = 11 months. Example samples: M: 13, F: 9; M: 6, F: 16; M: 11, F: 11; M: 8, F: 14]

T-test
Sampling distribution: Mean difference
Function of:
1) difference in means
2) variance (dispersion around mean)

[Figure: Possible Sample -- 1 and Possible Sample -- 2, each plotting Male Victim and Female Victim sentences along a sentence-length axis]

Frequency Distribution: lengthofsentave11
Value    Frequency    Percent    Valid Percent    Cumulative Percent
0        12           25.0       25.0             25.0
1        1            2.1        2.1              27.1
2        1            2.1        2.1              29.2
3        4            8.3        8.3              37.5
4        1            2.1        2.1              39.6
6        4            8.3        8.3              47.9
8        1            2.1        2.1              50.0
10       1            2.1        2.1              52.1
12       8            16.7       16.7             68.8
15       2            4.2        4.2              72.9
16       1            2.1        2.1              75.0
18       4            8.3        8.3              83.3
20       1            2.1        2.1              85.4
24       2            4.2        4.2              89.6
27       2            4.2        4.2              93.8
36       2            4.2        4.2              97.9
60       1            2.1        2.1              100.0
Total    48           100.0      100.0
Mean = 11

Variance
\[ \text{Variance} = s^2 = \frac{\sum (x_i - \text{Mean})^2}{N} \]
but:
\[ s^2 = \frac{\sum (x_i - \text{Mean})^2}{N - 1} \]
\[ \text{Standard Deviation} = s = \sqrt{\text{variance}} \]

Calculating Variance
Statistics: lengthofsentave11
N Valid           48
N Missing         0
Mean              11.02
Std. Deviation    12.109
Variance          146.617
Minimum           0
Maximum           60

t distribution
• Sampling distribution of a difference in means
• Function of mean difference & “pooled” variance (of both samples)
\[ t = \frac{\text{mean}_1 - \text{mean}_2}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} \]

[Figure: samples of n = 40 drawn from a universe (n = ∞) under the null hypothesis M = 11 months, F = 11 months; each sample yields a mean difference and variance, and from those a t value]

[Figure: t distribution with 2.5% of the area in each tail]

Statistical Significance
• If the probability is less than 5 in 100 (p < .05), the null hypothesis can be rejected, and it can be concluded that the difference also exists in the universe.
• The finding from the sample is statistically significant

SPSS t-test Output
1. Read means

Group Statistics: lengthofsentave11
victim gender    N     Mean     Std. Deviation    Std. Error Mean
female           24    16.04    12.723            2.597
male             24    6.00     9.227             1.883

2. Read Levene’s Test

Independent Samples Test: lengthofsentave11
Levene's Test for Equality of Variances: F = .824, Sig. = .369

3. Read p value

t-test for Equality of Means
                               t        df        Sig. (2-tailed)    Mean Difference    Std. Error Difference    95% CI Lower    95% CI Upper
Equal variances assumed        3.130    46        .003               10.042             3.208                    3.584           16.499
Equal variances not assumed    3.130    41.951    .003               10.042             3.208                    3.567           16.516

Report Findings
• “Assailants were given an average sentence of 16 months when the victims were female, compared to 6 months when the victims were male (df = 46, t = 3.13, p < .005).”
• “Respondents gave longer sentences when the victims were female (16 months) than when they were male (6 months), a difference that was statistically significant (df = 46, t = 3.13, p < .005).”

Statistical Analyses: analysis of variance (ANOVA)
Psych 250, Winter, 2011

Analysis of Variance
Scale by Nominal with more than two categories (e.g., GPA by major): Analysis of Variance

Dep Var: Length of Sentence
Indep var: Major

Frequency Distribution: length of sentence
Value    Frequency    Percent    Valid Percent    Cumulative Percent
0        12           25.0       25.0             25.0
1        1            2.1        2.1              27.1
2        4            8.3        8.3              35.4
3        5            10.4       10.4             45.8
4        1            2.1        2.1              47.9
5        1            2.1        2.1              50.0
6        6            12.5       12.5             62.5
8        2            4.2        4.2              66.7
12       6            12.5       12.5             79.2
15       1            2.1        2.1              81.3
16       1            2.1        2.1              83.3
18       2            4.2        4.2              87.5
24       1            2.1        2.1              89.6
27       1            2.1        2.1              91.7
36       1            2.1        2.1              93.8
42       1            2.1        2.1              95.8
60       1            2.1        2.1              97.9
66       1            2.1        2.1              100.0
Total    48           100.0      100.0

Statistics: length of sentence
N Valid           48
N Missing         0
Mean              9.98
Std. Deviation    14.573
Variance          212.361

Mean = 9.98
Std. Deviation = 14.6
Variance = 212.4

Form of Relationship (differences seen in sample)
Length of Sentence by Major

Descriptives: lengthofsentave11
                       N     Mean     Std. Deviation    Std. Error    95% CI Lower    95% CI Upper    Minimum    Maximum
natural science        19    14.26    15.183            3.483         6.94            21.58           0          60
social science         14    7.43     8.474             2.265         2.54            12.32           0          24
arts and humanities    15    10.27    10.067            2.599         4.69            15.84           0          36
Total                  48    11.02    12.109            1.748         7.50            14.54           0          60

• Nat sci: 14.3
• Soc sci: 7.4
• Art & Hum: 10.3

Statistical Inference (generalize from sample to universe?)

[Figure: a sample (n = 40; Nat sci = 14.3, Soc sci = 7.4, A & H = 10.3) drawn from the universe (n = ∞); inference runs from the sample back to the universe]

[Figure: Possible Sample -- 1 and Possible Sample -- 2, each plotting Social Science, Art & Human, and Natural Science sentences along a sentence-length axis]

ANOVA Logic
1. Calculate the ratio of “between-groups” variance to “within-groups” variance
2. Estimate the sampling distribution of that ratio: the F distribution
3. If the probability that the ratio in the sample could come from a universe with no differences in group means is < .05, the null hypothesis can be rejected and it can be inferred that mean differences exist in the universe

ANOVA Logic
• Between groups:
\[ \frac{n_{socsci}(\text{Mean}_{socsci} - \text{Mean})^2 + n_{arthum}(\text{Mean}_{arthum} - \text{Mean})^2 + n_{natsci}(\text{Mean}_{natsci} - \text{Mean})^2}{df} \]
• Within groups (each sum runs over the scores x_i in that group):
\[ \frac{\sum (x_i - \text{Mean}_{socsci})^2 + \sum (x_i - \text{Mean}_{arthum})^2 + \sum (x_i - \text{Mean}_{natsci})^2}{df} \]

F ratio
\[ F = \frac{\text{between groups mean squares}}{\text{within groups mean squares}} \]

[Figure: samples of n = 40 drawn from a universe (n = ∞) under the null hypothesis Nat sci = 11 months, Soc sci = 11 months, Art-Hum = 11 months; each sample yields an F value]

[Figure: F distributions]

ANOVA: sentence by major

ANOVA: lengthofsentave11
                  Sum of Squares    df    Mean Square    F        Sig.
Between Groups    388.933           2     194.467        1.346    .271
Within Groups     6502.046          45    144.490
Total             6890.979          47

Write Findings
“Social science majors assigned sentences averaging 7.4 months, arts and humanities students 10.3 months, and natural science students 14.3 months, but these differences were not statistically significant (df = 2, 45, F = 1.35, p < .30).”
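
Worked check for the Calculating Variance slide: a minimal sketch that recomputes the mean, variance, and standard deviation of lengthofsentave11 from its frequency table (the 17-value table with Mean = 11). The variable names are illustrative and do not come from the slides. Both divisors from the Variance slide are computed, since only the N - 1 version should match the SPSS Statistics output.

```python
import math

# Frequency table for lengthofsentave11: sentence length in months -> frequency.
freq = {0: 12, 1: 1, 2: 1, 3: 4, 4: 1, 6: 4, 8: 1, 10: 1, 12: 8,
        15: 2, 16: 1, 18: 4, 20: 1, 24: 2, 27: 2, 36: 2, 60: 1}

n = sum(freq.values())                               # number of respondents
mean = sum(x * f for x, f in freq.items()) / n       # weighted mean of the values

# Sum of squared deviations from the mean, weighted by frequency.
sum_sq = sum(f * (x - mean) ** 2 for x, f in freq.items())

var_divide_by_n = sum_sq / n          # first formula on the Variance slide
var_sample = sum_sq / (n - 1)         # "but:" formula, the one SPSS reports
sd_sample = math.sqrt(var_sample)     # standard deviation = sqrt(variance)

print(f"N = {n}, mean = {mean:.2f}")
print(f"variance (N) = {var_divide_by_n:.3f}, variance (N - 1) = {var_sample:.3f}")
print(f"standard deviation = {sd_sample:.3f}")
```

The sum of squared deviations computed here is also the Total sum of squares in the ANOVA table, which is one way to see that ANOVA partitions the same overall variation.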
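
Worked check for the SPSS t-test output: a minimal sketch that applies the pooled-variance t formula from the t distribution slide to the Group Statistics (female: N = 24, mean 16.04, SD 12.723; male: N = 24, mean 6.00, SD 9.227). The function names `pooled_t` and `welch_df` are hypothetical helpers written for this illustration. The p value is not computed, because the Python standard library has no t distribution; a statistics package would be needed for that step.

```python
import math

def pooled_t(n1, mean1, sd1, n2, mean2, sd2):
    """Independent-samples t with pooled variance (equal variances assumed)."""
    df = n1 + n2 - 2
    # Pooled variance: each group's variance weighted by its degrees of freedom.
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    # Standard error of the difference: s_p * sqrt(1/n1 + 1/n2).
    se_diff = math.sqrt(pooled_var) * math.sqrt(1 / n1 + 1 / n2)
    return (mean1 - mean2) / se_diff, df, se_diff

def welch_df(n1, sd1, n2, sd2):
    """Degrees of freedom for the 'equal variances not assumed' row."""
    a, b = sd1 ** 2 / n1, sd2 ** 2 / n2
    return (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))

# Values read from the Group Statistics table.
t, df, se = pooled_t(24, 16.04, 12.723, 24, 6.00, 9.227)
df_unequal = welch_df(24, 12.723, 24, 9.227)

print(f"t = {t:.2f}, df = {df}, std. error of difference = {se:.3f}")
print(f"df with equal variances not assumed = {df_unequal:.3f}")
```

Because the inputs are the rounded means from the table, the t value can differ from the SPSS figure in the third decimal place.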
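
Worked check for the ANOVA output: a minimal sketch that follows the ANOVA Logic steps using only the Descriptives table (N, mean, and standard deviation for each major). The names are illustrative. Two assumptions are worth flagging: the inputs are rounded summary statistics, so the sums of squares only approximate the SPSS values, and the closed-form p value used here holds only when the between-groups df equals 2, as it does with three groups.

```python
# Group summaries from the Descriptives table: (N, mean, standard deviation).
groups = {
    "natural science":     (19, 14.26, 15.183),
    "social science":      (14, 7.43, 8.474),
    "arts and humanities": (15, 10.27, 10.067),
}

n_total = sum(n for n, _, _ in groups.values())
grand_mean = sum(n * m for n, m, _ in groups.values()) / n_total

# Between groups: group size times squared distance of group mean from grand mean.
ss_between = sum(n * (m - grand_mean) ** 2 for n, m, _ in groups.values())
# Within groups: each group's squared deviations, recovered as (n - 1) * sd^2.
ss_within = sum((n - 1) * sd ** 2 for n, _, sd in groups.values())

df_between = len(groups) - 1
df_within = n_total - len(groups)

ms_between = ss_between / df_between
ms_within = ss_within / df_within
f_ratio = ms_between / ms_within

# Upper-tail probability of F; this closed form is valid ONLY for df_between == 2.
assert df_between == 2
p_value = (1 + 2 * f_ratio / df_within) ** (-df_within / 2)

print(f"SS between = {ss_between:.1f}, SS within = {ss_within:.1f}")
print(f"F({df_between}, {df_within}) = {f_ratio:.2f}, p = {p_value:.3f}")
```

With p well above .05, the sketch reaches the same decision as the SPSS table: the null hypothesis of equal group means is not rejected.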