Why do we need statistics?
A. To confuse students
B. To torture students
C. To put the fear of the almighty in them
D. To ruin their GPA, so that they don't get into grad school, have to bus tables, and move back in with their parents
E. All of the above
F. All of the above (and other tragic outcomes)

A positive, optimistic view…
• It is a tool that could help you succeed and move out of your parents' house
• There is nothing to fear but fear itself
• You need a passing grade of C
• It can help you get into grad school
• It's important to understand so you don't get scammed…

The Caveat: Remember…
• There are lies
• There are d#$m (darn) lies, and then
• There are statistics
Magic

Statistics
• The science of collecting, displaying, and analyzing data
• Based on quantitative measurements of samples
• Allows us to objectively evaluate data
• Descriptive
• Inferential

Defining variability
• Amount of change or fluctuation
• Some variability is expected
• Is the observed variability due to the usual variability among subjects from the population?
• Or is the observed variability greater than the usual variability?

[Figure: Frequency distribution from the population, with Sample 1, Sample 2, and Sample 3. x-axis: dependent variable (0 to highest score); y-axis: frequency (# of subjects).]

[Figure: Frequency distribution, untreated groups of an experiment (Experimental and Control). x-axis: dependent variable; y-axis: frequency (# of subjects).]

[Figure: Frequency distribution, treated groups (Control and Experimental). x-axis: dependent variable; y-axis: frequency (# of subjects).]

Beginning steps of an Experiment
• Sample from population
• Hypothesis
• Define variables
• Assign subjects to conditions
• Measure performance
• Calculate means
• Calculate variability

Heading Error: Calculating Variance

Sex     SUBJECT$  HEADGdeg  |HEADGdeg|  $(X_i - \bar{X})$  $(X_i - \bar{X})^2$
Female  rat1        -4.4       4.4          -4.68               21.86
Female  rat3        11        11             1.93                3.71
Female  rat5         2.3       2.3          -6.78               45.90
Female  rat7         8.5       8.5          -0.58                0.33
Female  rat9         6.9       6.9          -2.18                4.73
Female  rat11      -10.8      10.8           1.73                2.98
Female  rat13      -10.9      10.9           1.83                3.33
Female  rat15       17.8      17.8           8.73               76.13
Male    rat2        29.6      29.6           6.86               47.09
Male    rat4       -18.5      18.5          -4.24               17.96
Male    rat6        14.5      14.5          -8.24               67.86
Male    rat8        58.2      58.2          35.46             1257.59
Male    rat10      -18.7      18.7          -4.04               16.30
Male    rat12      -17.3      17.3          -5.44               29.57
Male    rat14      -14.8      14.8          -7.94               63.00
Male    rat16       10.3      10.3         -12.44              154.69

Deviation from the mean for each subject: $(X_i - \bar{X})$
Square the deviation from the mean for each subject: $(X_i - \bar{X})^2$
Add the squared deviations together: $\sum (X_i - \bar{X})^2$
Female = 158.96
Male = 1654.06

Compute the variance: $s^2 = \frac{\sum (X_i - \bar{X})^2}{n - 1}$
$s^2_{Female} = 22.71$
$s^2_{Male} = 236.29$

Heading Error: Calculating standard deviation
Standard deviation: $s$ or $SD = \sqrt{s^2}$
$s_{Female} = 4.77$
$s_{Male} = 15.37$

Heading Error: Calculating standard error of the Mean
Standard error of the mean: $SEM = \frac{SD}{\sqrt{n}}$
$SEM_{Female} = 1.68$
$SEM_{Male} = 5.43$

Heading Error: Group Means with SEM
[Figure: Group means with SEM for Male and Female. x-axis: Group; y-axis: Absolute Heading Error (deg), 0 to 30.]

Heading Error: Group Means with 95% Confidence Interval
Confidence intervals (CI) represent a range of values above and below our sample mean that is likely to contain the population mean; i.e., the true mean of the population is likely (we're 95% confident) to fall somewhere within the CI range.
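The variance, standard deviation, and SEM steps from the heading error slides can be sketched in Python as follows. This is only an illustrative sketch: the function name `describe` and the dictionary keys are assumptions made for this example, and the scores are the absolute heading errors copied from the data table.

```python
import math

# Absolute heading errors (deg), copied from the slide's data table.
female = [4.4, 11, 2.3, 8.5, 6.9, 10.8, 10.9, 17.8]
male = [29.6, 18.5, 14.5, 58.2, 18.7, 17.3, 14.8, 10.3]


def describe(scores):
    """Hypothetical helper: mean, sum of squares, variance, SD, and SEM."""
    n = len(scores)
    mean = sum(scores) / n
    # Sum of squared deviations from the mean.
    ss = sum((x - mean) ** 2 for x in scores)
    # Sample variance divides by n - 1, as on the slide.
    variance = ss / (n - 1)
    sd = math.sqrt(variance)
    # Standard error of the mean: SD divided by the square root of n.
    sem = sd / math.sqrt(n)
    return {"mean": mean, "ss": ss, "variance": variance, "sd": sd, "sem": sem}


female_stats = describe(female)
male_stats = describe(male)

for label, stats in (("Female", female_stats), ("Male", male_stats)):
    print(label, {key: round(value, 2) for key, value in stats.items()})
```

Rounded to two decimals, the variance, SD, and SEM values should agree with the slide figures for both groups.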
$CI = \bar{X} \pm t_{crit} \left( \frac{SD}{\sqrt{n}} \right)$

[Figure: Group means with 95% CI for Male and Female. x-axis: Group; y-axis: Heading Error, 0 to 40.]

Heading Error: Group Means with SEM
variance ($s^2$) - average squared deviation of scores from their mean: $s^2 = \frac{\sum (X_i - \bar{X})^2}{n - 1}$
standard deviation (SD) - average deviation of scores about the mean: $SD = \sqrt{s^2}$
standard error of the mean (SEM) - dispersion of the distribution of sample means: $SEM = \frac{SD}{\sqrt{n}}$
[Figure: Group means with SEM for Male and Female. x-axis: Group; y-axis: Absolute Heading Error (deg), 0 to 30.]

Choosing a significance level
Significance level - A criterion for deciding whether to reject the null hypothesis or not.
• What is the convention? p < .05 ($\alpha$ level)
• A stricter criterion may be required if the risk of making a wrong decision (a Type I error) is greater than usual: p < .01 or p < .001.
• But there is a trade-off in using a stricter criterion.

Choosing a significance level
Type II error - Failure to reject the null hypothesis when it is really false ($\beta$).
• Concluding that the difference is due to chance variation when it is really due to the independent variable
• Power of the statistical test ($1 - \beta$)

Summary Chart: Reality Check
Decision based on statistical results

               Fail to reject Ho                Reject Ho
Ho is true     Correct ($p = 1 - \alpha$)       Type I error ($p = \alpha$)
Ho is false    Type II error ($p = \beta$)      Correct ($p = 1 - \beta$)

Time estimation experiment
Time will go faster for people having fun than for those not having fun.
Two-group design:
Fun - views cartoons with the captions for 10 min.
No Fun - views cartoons without captions for 10 min.
Ho = the time estimates of the two groups will be the same.
H1 = the fun group will have shorter estimates than the control group.

Table 13-2 Possible errors in the time estimation experiment (p. 381, 6th ed.)
What type of errors were made in the two descriptions? Type I or Type II?
1. We conclude that there was no difference in the time estimates made by the "fun" and "no fun" groups even though the treatments did produce an effect.
2. We conclude that there was a difference in the time estimates made by the "fun" and "no fun" groups even though the treatments produced little or no effect at all.
Type I = Reporting an effect that doesn't really exist
Type II = Missing an effect that does really exist
Type III = Failure to accurately identify a Type I or Type II error
Note - The error has been corrected in the 7th ed., p. 390.

Questions to ask when selecting a test statistic
Table 14-1 The parameters of data analysis
1. How many independent variables are there?
2. How many treatment conditions are there?
3. Is the experiment run between or within subjects?
4. Are the subjects matched?
5. What is the level of measurement of the dependent variable?

Answers based on the water maze study
Table 13-1 The parameters of data analysis
1. How many independent variables are there? one
2. How many treatment conditions are there? one
3. Is the experiment run between or within subjects? between
4. Are the subjects matched? no
5. What is the level of measurement of the dependent variable? ratio

Levels of Measurement
Ratio - a measure of magnitude having equal intervals between values and having an absolute zero point.
Interval - same as ratio except that there is no true zero point.
Ordinal - a measure of magnitude in the form of ranks (equal intervals are not assured and there is no absolute zero).
Nominal - items are classified into categories that have no quantitative relationship to one another.

Choosing a test statistic
TABLE 14-2 Selecting a possible statistical test by number of independent variables and level of measurement of the dependent variable

One independent variable, two treatments
Level of measurement   Two independent groups          Two matched groups (or within subjects)
Interval or ratio      t test for independent groups   t test for matched groups
Ordinal                Mann-Whitney U test             Wilcoxon test
Nominal                Chi square test

One independent variable, more than two treatments
Level of measurement   Multiple independent groups     Multiple matched groups (or within subjects)
Interval or ratio      One-way ANOVA                   One-way ANOVA (repeated measures)
Ordinal                Kruskal-Wallis test             Friedman test
Nominal                Chi square test

Two independent variables (factorial designs)
Level of measurement   Independent groups   Matched groups (or within subjects)   Independent groups and matched groups (or between subjects and within subjects)
Interval or ratio      Two-way ANOVA        Two-way ANOVA (repeated measures)     Two-way ANOVA (mixed)
Nominal                Chi square test

Heading Error: Statistical Analysis
t test for Independent Groups
1) Lay out the formula

$t_{obs} = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\left( \frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2} \right) \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}}$

2) Plug in the values

$t_{obs} = \frac{22.74 - 9.08}{\sqrt{\left( \frac{(8 - 1) 236.29 + (8 - 1) 22.71}{8 + 8 - 2} \right) \left( \frac{1}{8} + \frac{1}{8} \right)}}$

8) Divide the numerator by the denominator.

$t_{obs} = \frac{13.66}{5.69} = 2.40$

Determining significance
1. Was the hypothesis directional or nondirectional?
2. What was the significance level?
3. How many degrees of freedom do we have?
Degrees of freedom (df) - the number of members in a set of data that can vary or change value without changing the value of a known statistic for those data.

Answers to the questions
1. Was the hypothesis directional or nondirectional? Nondirectional, so two-tailed.
2. What was the significance level? p < .05
3. How many degrees of freedom do we have?
14 ($df = n_1 + n_2 - 2 = 8 + 8 - 2$)
Look on page 531 of Myers & Hansen to find the critical value of t…

Answers to the questions
Or you could just go on-line… e.g., http://www.psychstat.missouristate.edu/introbook/tdist.htm

Heading Error: Statistical Analysis
t test for Independent Groups
8) Divide the numerator by the denominator.
$t_{obs} = \frac{13.66}{5.69} = 2.40$
p < .05, two-tailed
$t_{crit} = 2.145$

Compare to our computer output from SPSS
$t_{obs} = 2.40$, p < .05, two-tailed, $t_{crit} = 2.145$

Conclusion
Decision: Reject the null hypothesis
Are we done?
- How much importance should we attach to this finding?
- Was the effect just barely significant (p < .05)?
- What if the significance level was p < .0001? Would this be a larger effect?

Answers to the questions
Assess the quality of the experiment
1) Were control procedures adequate?
2) Were variables defined appropriately?
3) Is a Type I error likely?
The t test is a robust statistic… This means that its assumptions can be violated without changing the rate of Type I or Type II error.

Effect size
Convert t to a correlation coefficient:

$r = \sqrt{\frac{t^2}{t^2 + df}} = \sqrt{\frac{(2.40)^2}{(2.40)^2 + 14}} = .54$

$r^2 = .29$

According to Cohen (1988), r ≥ .50 is considered a large effect (.30 is a moderate effect and below .30 is a small effect). The $r^2$ of .29 indicates that the IV accounts for 29% of the variability observed in the DV.
Online site for effect size calculator: http://web.uccs.edu/lbecker/Psy590/escalc3.htm
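
As a rough check on the hand calculation, the independent-groups t test and the conversion from t to r can be sketched in Python from the summary statistics on the slides. The function names `pooled_t` and `t_to_r` are assumptions made for this example, and the critical value is simply the table value quoted on the slide, not something the code looks up.

```python
import math


def pooled_t(mean1, mean2, var1, var2, n1, n2):
    """Hypothetical helper: t for two independent groups with pooled variance."""
    df = n1 + n2 - 2
    # Pooled variance weights each group's variance by its degrees of freedom.
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    # Standard error of the difference between the two means.
    se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se_diff, df


def t_to_r(t, df):
    """Hypothetical helper: convert t to a correlation-type effect size r."""
    return math.sqrt(t ** 2 / (t ** 2 + df))


# Summary statistics from the slides: group 1 = male, group 2 = female.
t_obs, df = pooled_t(22.74, 9.08, 236.29, 22.71, 8, 8)

# Two-tailed critical value for df = 14 at p < .05, as quoted on the slide.
T_CRIT = 2.145
significant = abs(t_obs) > T_CRIT

r = t_to_r(t_obs, df)
r_squared = r ** 2

print(f"t({df}) = {t_obs:.2f}, significant at .05: {significant}")
print(f"r = {r:.2f}, r squared = {r_squared:.2f}")
```

With the slide values this gives a t of about 2.40 on 14 degrees of freedom and an r of about .54. Working from the summary statistics rather than the raw scores keeps the sketch aligned with the numbers plugged into the formula on the slides.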