Student’s t-Distribution
The t-Distribution, t-Tests, Measures of Effect Size, & Managing Violations of Assumptions

Sampling Distributions Redux
• Chapter 7 opens with a return to the concept of sampling distributions from chapter 4
  – Sampling distributions of the mean

Sampling Distribution of the Mean
• Because the sampling distribution of the mean (SDotM) is so important in statistics, you should understand it
• The SDotM is governed by the Central Limit Theorem:
  Given a population with a mean μ and a variance σ², the sampling distribution of the mean (the distribution of sample means) will have a mean equal to μ, a variance equal to σ²/n, and a standard deviation equal to $\sqrt{\sigma^2/n}$. The distribution will approach the normal distribution as n, the sample size, increases. (p. 178)

Sampling Distribution of the Mean: Translation
1. For any population with a given mean and variance, the sampling distribution of the mean will have:
   • $\mu_{\bar{x}} = \mu$
   • $\sigma_{\bar{x}}^2 = \sigma^2 / n$
   • $\sigma_{\bar{x}} = \sigma / \sqrt{n}$
2. As n increases, the sampling distribution of the mean approaches a normal curve

Sampling Distribution of the Mean: Analysis
• Although $\mu_{\bar{x}}$ equals µ…
• The relationships between…
  – $\sigma_{\bar{x}}^2$ and σ²
  – $\sigma_{\bar{x}}$ and σ
• …will differ as a function of the sample size
• We saw this in our sampling distribution of the mean example from chapter 4…

So, you wanna test a hypothesis, do ya?
• Our understanding of sampling and sampling distributions now allows us to test hypotheses
• How we test a hypothesis depends on the information we have available

Choosing a Test
1. Which variables are available?
   – µ?
   – σ?
   – s?
2. How many data sets are you presented with?
   – 1
   – 2
3. Do your data sets come from 1 or 2 groups?
   – 1
   – 2

Testing Hypotheses about Means: The Rare Case of Knowing σ
• So far, to test the probability of finding a particular score, we’ve used the Standard Normal Distribution
  – IQ = 122
  – µ = 100
  – σ = 15

$$z = \frac{x - \mu}{\sigma} = \frac{122 - 100}{15} = \frac{22}{15} = 1.47$$

• −1.96 < z < 1.96, so we fail to reject H0

How the z-Test Works
• How does our test change when we test group means, not just individual scores?
  – We use the central limit theorem: the denominator becomes $\sigma / \sqrt{n}$

n = 1:
$$z = \frac{122 - 100}{15 / \sqrt{1}} = \frac{22}{15 / 1} = \frac{22}{15} = 1.47$$

n = 2:
$$z = \frac{122 - 100}{15 / \sqrt{2}} = \frac{22}{15 / 1.41} = \frac{22}{10.64} = 2.07$$

n = 100:
$$z = \frac{122 - 100}{15 / \sqrt{100}} = \frac{22}{15 / 10} = \frac{22}{1.5} = 14.67$$

How the z-Test Works
• Large samples reduce the amount of random variance (sampling error)
  – More confidence that the sample mean is close to the population mean
• Larger samples improve our ability to detect differences between samples and populations
• For n = 1, the test of a mean reduces to the test of a single score:

$$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \frac{x - \mu}{\sigma}$$

Testing Hypotheses: When σ Is Unknown
• Generally, the population standard deviation, σ, is unknown to us
• Occasionally, we will know the population mean, µ, when we don’t know σ
• In these situations, the standard normal distribution no longer meets our needs
• Knowing µ…
  – We can produce an estimate of σ from s
  – Using s changes the nature of the test we are conducting, as s is not distributed in the same fashion as σ
• The sampling distribution of the sample standard deviation is NOT normally distributed
  – Strong positive skew

[Figure: sampling distribution of s compared with sampling distribution of σ]

So How Does s Estimate σ?
• Given the differences in distribution shape, it is easy to conclude that s ≠ σ
  – s² is an unbiased estimator of σ² over repeated samplings
  – However, because of the positive skew, a SINGLE value of s is likely to underestimate σ
• Because of this fact, small samples will systematically underestimate σ when it is estimated by s
  – With s in the denominator, any given statistic calculated this way tends to be > the comparable value of z
  – We cannot use z any longer → t

t and the t-Distribution
• Developed by “Student” (the pen name of William Sealy Gosset) while he was working for the Guinness Brewing Co.
1. The shape of the t-distribution is a direct function of the size of the sample we are examining
2. For small samples, the t-distribution is somewhat flatter than the standard normal distribution, with a lower peak and fatter tails
3. As sample size increases:
   • The t-distribution approaches a normal distribution
   • Theoretically, the closer our sample size comes to infinity, the more the t-distribution looks like a normal distribution
   • Practically, the two are nearly indistinguishable when n ≈ 100–120
4. Identifying values of t associated with a given rejection region depends on:
   – α
   – the number of tails associated with the test
   – the degrees of freedom available in the analysis
   – For this one-sample test, df = n − 1, because we used one degree of freedom calculating s using the sample mean and not the population mean

One-Sample t-Test

$$t = \frac{\bar{x} - \mu}{s_{\bar{x}}} \quad \text{or} \quad t = \frac{\bar{x} - \mu}{s_x / \sqrt{n}} \quad \text{or} \quad t = \frac{\bar{x} - \mu}{\sqrt{s_x^2 / n}}$$

z-Test vs. One-Sample t-Test

$$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} \qquad t = \frac{\bar{x} - \mu}{s_x / \sqrt{n}}$$

Note the similarities between these tests: ONLY the source of “variance” and the distribution you test against have changed!

Using the One-Sample t-Test
• You are on the admissions board for a graduate school of Psychology.
• You are attempting to determine if the GRE scores for the students applying to your program are competitive with the national average.
  – µVerbal = 569
• SPSS output from your data:

Descriptive Statistics
                     N    Range    Mean       Std. Deviation
GRE                  24   310.00   659.7917   86.43267
Valid N (listwise)   24

Using the One-Sample t-Test
• Research Hypothesis:
  – The GRE scores from your applicants differ from the population norms
  – H1: µa ≠ µp or ES > 0
• Null Hypothesis:
  – The GRE scores from your applicants do not differ from the population norms
  – H0: µa = µp or ES = 0
• Evaluate the students’ GRE-V scores
• Select:
  – Rejection region: α = .05
  – “Tail” or directionality: we don’t know exactly how the students will score; we just expect them to show scores differing from the population values (we might predict higher scores…), so the test is two-tailed
• Generate the sampling distribution of the mean assuming H0 is true
  – One-Sample t-test
• Given our sampling distribution:
  – Conduct the statistical test

Using the One-Sample t-Test
• µVerbal = 569, x-bar = 659.79, s = 86.43, n = 24

$$t = \frac{\bar{x} - \mu}{s_x / \sqrt{n}} = \frac{659.79 - 569}{86.43 / \sqrt{24}} = \frac{90.79}{86.43 / 4.90} = \frac{90.79}{17.64} = 5.15$$

• This numerical value is called tobt: tobt(23) = 5.15
• SPSS Output:

One-Sample Test (Test Value = 569, i.e., µ)
                                                       95% Confidence Interval of the Difference
       t       df   Sig. (2-tailed)   Mean Difference   Lower      Upper
GRE    5.146   23   .000              90.79167          54.2943    127.2890

Evaluating Statistical Significance of the t-Test
• First note:
  – α = .05
  – Tail or directionality: two-tailed
  – t-Value = 5.15
  – Degrees of freedom (df)
    • For the One-Sample t-Test, df = n − 1 (24 − 1 = 23)
    • Estimating s from x-bar (not σ from µ)
• In the past you…
  – Identified a tabled value of tcrit
  – Compared tcrit to your tobt value
  – If tobt falls into the rejection region identified by tcrit, then we reject H0
  – If tobt does not fall into the rejection region identified by tcrit, then we fail to reject H0
• SPSS simplifies matters by calculating p exactly for us
  – tobt(23) = 5.15, p < .05
  – Exact probability ≈ .00003

[Figure: t distribution centered at 0 with tcrit = −2.069 and tcrit = 2.069 marked; tobt = 5.15 lies beyond the upper critical value]

Because tobt falls within the rejection region identified by tcrit, we reject H0

Testing Hypotheses: Two Matched (Repeated) Samples
• Sometimes, we’re interested in how a single set of scores changes over time
  – Psychotherapy tx influences depression
  – Patients respond to medication
  – Consumer attitudes before and after an advertisement
  – Changes in citizen attitudes following the State of the Union address
• When we look at two sets of scores collected from a single sample at different time points, we need to use a matched samples test

Matched Samples
• Matched samples
  – Use the same participants at two or more different time points to collect similar data
• MUST BE THE SAME SAMPLE!
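Returning briefly to the one-sample GRE example worked above, the hand calculation can be sketched in Python. This is a minimal illustration using only the summary statistics reported in the SPSS output; the function name `one_sample_t` and the variable names are hypothetical, and the critical value 2.069 is the tabled two-tailed value for df = 23 quoted in the slides rather than something the code computes.

```python
import math


def one_sample_t(sample_mean, mu, sd, n):
    """Return (t, df, standard error) for a one-sample t-test from summary statistics."""
    se = sd / math.sqrt(n)          # estimated standard error of the mean, s / sqrt(n)
    t = (sample_mean - mu) / se     # distance of the sample mean from mu in standard errors
    return t, n - 1, se


# Summary statistics from the GRE example (SPSS Descriptive Statistics table)
GRE_MEAN, GRE_SD, GRE_N = 659.7917, 86.43267, 24
MU_VERBAL = 569

t_obt, df, se = one_sample_t(GRE_MEAN, MU_VERBAL, GRE_SD, GRE_N)

# Tabled two-tailed critical value for alpha = .05, df = 23 (taken from the slides)
T_CRIT = 2.069
reject_h0 = abs(t_obt) > T_CRIT

# 95% confidence interval of the mean difference, built from the same critical value
mean_diff = GRE_MEAN - MU_VERBAL
ci = (mean_diff - T_CRIT * se, mean_diff + T_CRIT * se)

print(f"t({df}) = {t_obt:.2f}, reject H0: {reject_h0}")
print(f"95% CI of the difference: {ci[0]:.1f} to {ci[1]:.1f}")
```

Because the critical value is hard-coded from a table, the sketch only covers this df; a statistics library would be needed to look up tcrit or an exact p for other sample sizes.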
[Diagram: Time 1: BDI-II → wait 30 days → Time 2: BDI-II]

Matched Samples Test
• With a matched samples test, you are testing the change in scores between the two administrations of the test
  – H0: µ1 = µ2
  – H0: µ1 − µ2 = 0 or ES = 0
    • This is truly the null hypothesis for the matched samples test
• Essentially, the group means at each time point mean little to us
  – Change in scores is the key
  – Conduct this test by obtaining the average difference score between the two time points

$$t = \frac{\bar{D} - 0}{s_D / \sqrt{n}}$$

• D-bar represents the average difference score between time points
• sD is the standard deviation of the difference scores
• The “− 0” may seem redundant, but isn’t: it is the difference expected under H0

Calculating the Matched Samples t-Test
• You are a researcher examining the impact of a new therapy intervention on the incidence of self-injurious behavior (SIB)
• You collect a measure of the frequency of self-injurious acts when clients enter your treatment (time 1)
• You collect a measure of the frequency of self-injurious acts two weeks later (time 2)
• Research Hypothesis:
  – The new treatment will change SIB scores
  – H1: µ1 ≠ µ2 or ES > 0
• Null Hypothesis:
  – The SIB scores at time 2 will be the same as the scores at time 1 (no change)
  – H0: µ1 = µ2
  – H0: µ1 − µ2 = 0 or ES = 0
• Evaluate SIB at time 1 & time 2
• Select:
  – Rejection region: α = .05
  – “Tail” or directionality: we don’t know exactly how the treatment will work, so we’d better use a two-tailed test
• Generate the sampling distribution of the mean assuming H0 is true
  – Matched Samples t-test
• Given our sampling distribution:
  – Conduct the statistical test

Calculating the Matched Samples t-Test

Time 1   Time 2   D    D²
13       8        5    25
14       10       4    16
8        4        4    16
10       7        3    9
11       10       1    1
13       9        4    16
15       11       4    16
16       9        7    49
19       17       2    4
10       6        4    16
7        2        5    25

∑D = 43   D-bar = 3.91   ∑D² = 193   (∑D)² = 1849

Descriptive Statistics
                     N    Minimum   Maximum   Mean      Std. Deviation
time1                11   7.00      19.00     12.3636   3.58532
time2                11   2.00      17.00     8.4545    3.93354
Valid N (listwise)   11

$$s_D^2 = \frac{\sum D^2 - \frac{(\sum D)^2}{n}}{n - 1} = \frac{193 - \frac{43^2}{11}}{11 - 1} = \frac{193 - \frac{1849}{11}}{10} = \frac{193 - 168.09}{10} = \frac{24.91}{10} = 2.49$$

$$s_D = \sqrt{2.49} = 1.58$$

$$t = \frac{\bar{D} - 0}{s_D / \sqrt{n}} = \frac{3.91 - 0}{1.58 / \sqrt{11}} = \frac{3.91}{1.58 / 3.32} = \frac{3.91}{.48} = 8.15$$

tobt = 8.15

Evaluating Statistical Significance of the t-Test
• First note:
  – α = .05
  – Tail or directionality: two-tailed
  – t-Value = 8.15
  – Degrees of freedom (df)
    • For the Matched Samples t-Test:
      – df = number of PAIRS of scores − 1
      – df = 11 − 1 = 10
  – Again, we can calculate p exactly with SPSS
• SPSS Output:

Paired Samples Correlations
                         N    Correlation   Sig.
Pair 1   time1 & time2   11   .916          .000

Paired Samples Test
                         Paired Differences
                                                                   95% Confidence Interval of the Difference
                         Mean      Std. Deviation   Std. Error Mean   Lower     Upper     t       df   Sig. (2-tailed)
Pair 1   time1 - time2   3.90909   1.57826          .47586            2.84880   4.96938   8.215   10   .000

• tobt(10) = 8.15, p < .05 (SPSS reports t = 8.215; the hand calculation differs only because of rounding)
• p ≈ .000009

[Figure: t distribution centered at 0 with tcrit = −2.228 and tcrit = 2.228 marked; tobt = 8.15 lies beyond the upper critical value]

Because tobt falls within the rejection region identified by tcrit, we reject H0

Testing Hypotheses: Two Independent Samples
• Probably the most common use of the t-Test and the t-distribution
• Compare the mean scores of two groups on a single variable
  – IV: Groups
  – DV: Variable of interest
• Groups must be independent of one another
  – Scores in one group cannot influence scores in the other group

Independent Samples t-Test

$$t = \frac{\bar{X}_1 - \bar{X}_2}{s_{\bar{x}_1 - \bar{x}_2}} \quad \text{or} \quad t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$$

This test is calculated by dividing the mean difference between two groups by the “dispersion” or “variation” observed between the two groups (the standard error of the difference between means)

Independent Samples t-Test: Degrees of Freedom
• 1 df is lost for each σ estimated by s using x-bar
• Since there are two independent groups in this analysis, we must estimate σ twice
• df = (n1 + n2) − 2

Independent Samples t-Test: Example
• Let’s return to the example used for the matched samples test
• As a competent researcher, you realize that simply showing a change over time is not enough to prove the efficacy of your treatment
  – People spontaneously change over time
• Show that an untreated control group does not change over the same period of time that your treatment group does change

[Diagram: study timeline (Time 1, Time 2, Time 3) showing SIB scores for the Tx group and the Ctrl group]

• At time 1, the control and treatment groups have equal SIB scores
• Administer the treatment for 2 weeks to the Tx group
  – The Control group receives no intervention during these two weeks
• Compare SIB scores of the Tx and Control groups after 2 weeks
• Provide the Control group with the intervention if desired

Independent Samples t-Test: Example
• Research Hypothesis:
  – Your treatment for SIB will reduce SIB scores in the Tx group after 2 weeks
  – H1: µt < µc
• Null Hypothesis:
  – Your treatment for SIB will have no effect
  – H0: µt = µc
• Evaluate the efficacy of your treatment

Time 2 Data
Control:  12 13 10 9 11 8 16 13 15 16 12
Tx:       8 10 4 7 10 9 11 9 17 6 2

          Ctrl Group   Tx Group
∑x        135          93
∑x²       1729         941
(∑x)²     18225        8649
x-bar     12.27        8.45
s²        7.22         15.47
s         2.69         3.93
n         11           11

Descriptive Statistics
                     N    Minimum   Maximum   Mean      Std. Deviation
ctrl                 11   8.00      16.00     12.2727   2.68667
tx                   11   2.00      17.00     8.4545    3.93354
Valid N (listwise)   11

• Select:
  – Rejection region: α = .05
  – “Tail” or directionality: we have evidence that the treatment probably works, so we make a one-tailed hypothesis here (scores for the Tx group will be lower than the Control group at time 2)
• Generate the sampling distribution of the mean assuming H0 is true
  – Independent Samples t-Test
• Given our sampling distribution:
  – Conduct the statistical test

$$t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} = \frac{8.45 - 12.27}{\sqrt{\frac{15.47}{11} + \frac{7.22}{11}}} = \frac{-3.82}{\sqrt{1.41 + .66}} = \frac{-3.82}{\sqrt{2.07}} = \frac{-3.82}{1.44} = -2.65$$

tobt(20) = −2.65

Evaluating Statistical Significance of the t-Test
• First note:
  – α = .05
  – Tail or directionality: one-tailed
  – t-Value = −2.65
  – Degrees of freedom (df)
    • For the Independent Samples t-Test:
      – (n1 + n2) − 2
      – (11 + 11) − 2
      – 22 − 2 = 20
• SPSS Output:

Independent Samples Test (Self-Injurious Behavior)
                              Levene's Test for Equality of Variances   t-test for Equality of Means
                              F       Sig.                              t       df       Sig. (2-tailed)   Mean Difference   Std. Error Difference
Equal variances assumed       .518    .480                              2.658   20       .015              3.81818           1.43625
Equal variances not assumed                                             2.658   17.663   .016              3.81818           1.43625

• tobt(20) = −2.65, p < .05 (SPSS subtracts the groups in the opposite order, so its t is positive)
• p ≈ .015 (two-tailed, as reported by SPSS)

[Figure: t distribution centered at 0 with tcrit = −1.725 marked; tobt = −2.65 lies beyond the critical value]

Because tobt falls within the rejection region identified by tcrit, we reject H0

Independent Samples t-Test: One Complication
• There is a slight problem with the form of the equation we used…
  – It can ONLY be applied to groups with equal sample sizes
  – A major limitation in real-world research

$$t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$$

Pooled Variance Estimate
• This equation permits tests with different sample sizes
• Generates a single estimate of the variance, weighted by the size of each group
  – Therefore, larger samples have a greater impact on the variance estimate
  – Vice-versa for small samples

$$s_p^2 = \frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2}$$

Using the Pooled Variance Estimate

$$t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} \;\rightarrow\; t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\frac{s_p^2}{n_1} + \frac{s_p^2}{n_2}}} = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{s_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}}$$

Using the Pooled Variance Estimate: Example

Time 2 Data
Control:  11 16 13 15 16 12 (no data for the other five control participants)
Tx:       8 10 4 7 10 9 11 9 17 6 2

          Ctrl Group   Tx Group
∑x        83           93
∑x²       1171         941
(∑x)²     6889         8649
x-bar     13.83        8.45
s²        4.57         15.47
s         2.14         3.93
n         6            11

Descriptive Statistics
                     N    Minimum   Maximum   Mean      Std. Deviation
ctrl                 6    11.00     16.00     13.8333   2.13698
tx                   11   2.00      17.00     8.4545    3.93354
Valid N (listwise)   6

With group 1 = Tx and group 2 = Ctrl:

$$s_p^2 = \frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2} = \frac{(11 - 1) 15.47 + (6 - 1) 4.57}{11 + 6 - 2} = \frac{154.7 + 22.85}{15} = \frac{177.55}{15} = 11.84$$

$$t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{s_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}} = \frac{8.45 - 13.83}{\sqrt{11.84 \left( \frac{1}{11} + \frac{1}{6} \right)}} = \frac{-5.38}{\sqrt{11.84 (.0909 + .1667)}} = \frac{-5.38}{\sqrt{11.84 (.2576)}} = \frac{-5.38}{\sqrt{3.05}} = \frac{-5.38}{1.75} = -3.07$$

tobt(15) = −3.07

Evaluating Statistical Significance of the t-Test
• First note:
  – α = .05
  – Tail or directionality: one-tailed
  – t-Value = −3.07
  – Degrees of freedom (df)
    • For the Independent Samples t-Test:
      – (n1 + n2) − 2
      – (11 + 6) − 2
      – 17 − 2 = 15
• SPSS Output:

Independent Samples Test (Self-Injurious Behavior)
                              Levene's Test for Equality of Variances   t-test for Equality of Means
                              F       Sig.                              t       df       Sig. (2-tailed)   Mean Difference   Std. Error Difference
Equal variances assumed       .714    .411                              3.080   15       .008              5.37879           1.74614
Equal variances not assumed                                             3.653   14.979   .002              5.37879           1.47232

• tobt(15) = −3.07, p < .05
• p ≈ .0076 (two-tailed, as reported by SPSS)

[Figure: t distribution centered at 0 with tcrit = −1.753 marked; tobt = −3.07 lies beyond the critical value]

Because tobt falls within the rejection region identified by tcrit, we reject H0

Effect Size of the Independent Samples t-Test

$$d = \frac{\mu_1 - \mu_2}{\sigma} \quad \text{or} \quad d = \frac{\bar{X}_1 - \bar{X}_2}{s_p}$$

We use the same effect size conventions we identified for the Matched Samples test

Note that the denominator is sp, the square root of the pooled variance: $s_p = \sqrt{11.84} = 3.44$

$$d = \frac{\bar{X}_1 - \bar{X}_2}{s_p} = \frac{8.45 - 13.83}{3.44} = \frac{-5.38}{3.44} = -1.56$$

An effect size well beyond the convention for a large effect

t-test Assumptions
• Although the t-test is generally a robust test, it can be affected by violations of underlying test assumptions
  – Normality – the sampling distribution is normally distributed
  – Sample size – samples for each group should be of roughly equal size
  – Homogeneity of variance – σ1 = σ2
• One sample t-test
  – Normality: ✓
  – Sample size: ✗
  – Homogeneity of variance: ✗
• Matched & Independent samples t-test(s)
  – Normality: ✓
  – Sample size: ✓
  – Homogeneity of variance: ✓

Impact of Violated Assumptions
• For equal sample sizes…
  – …violating homogeneity of variance…
    • Minimal impact (α = .05 ± .02)
  – …with minor normality violations…
    • Similar results as above
  – …with major normality violations…
    • Severe skew (particularly in opposite directions) can lead to significant problems unless variances are fairly equal
• Unequal sample sizes…
  – Much more difficult to interpret
  – Unequal sample sizes + heterogeneity of variance = distortions in p
    • Possibly increased risk of Type I error
• Risk of error increases as more assumptions are violated

Coping with Violated Assumptions
• What can we do to prevent or cope with violated assumptions?
  1. Maintain equal sample sizes
  2. Use trimmed samples…
  3. Use a distribution-free (i.e., non-parametric) test
  4. Apply a statistical correction to t
• In the SPSS output for the pooled variance example (Levene's F = .714, Sig. = .411), both rows are reported:
  – Equal variances assumed: t = 3.080, df = 15, Sig. (2-tailed) = .008
  – Equal variances not assumed: t = 3.653, df = 14.979, Sig. (2-tailed) = .002
• If pF < .05 (Levene's test is significant), use the “Equal variances not assumed” row

Statistical Tests We Have Learned
1. z-Test
   • 1 group
   • 1 set of data
   • µ & σ known
2. One-Sample t-Test
   • 1 group
   • 1 set of data
   • µ known
   • Estimate σ with s using x-bar
3. Matched Samples t-Test
   • 1 group
   • 2 sets of data
   • µ & σ unknown
   • Estimate σD with sD using D-bar
4. Independent Samples t-Test
   • 2 groups
   • 2 sets of data
   • µ & σ unknown
   • Estimate σ twice with s using x-bar

Choosing the Best Test
• Flow-chart available on the website:
  – http://www.personal.kent.edu/~marmey
• Also refer to the diagram on p. 11 of your Howell text
• Try the review problems on the website for an example of the types of questions I might ask on an exam!
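The matched-samples and pooled-variance calculations from the SIB examples can also be checked from the raw scores. The sketch below is a minimal illustration using only the Python standard library; the function names `matched_t` and `pooled_t` are hypothetical, and the effect size follows the formula d = (X̄1 − X̄2)/sp, dividing by the square root of the pooled variance. It computes test statistics only; critical values or exact p values would still come from a table or a statistics package.

```python
import math
from statistics import mean, stdev, variance


def matched_t(first, second):
    """Matched-samples t: mean difference score over its standard error."""
    diffs = [a - b for a, b in zip(first, second)]   # D = time 1 - time 2 for each pair
    n = len(diffs)
    t = (mean(diffs) - 0) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1                                  # df = number of pairs - 1


def pooled_t(group1, group2):
    """Independent-samples t with a pooled variance estimate, plus Cohen's d."""
    n1, n2 = len(group1), len(group2)
    # Pooled variance: each group's variance weighted by its degrees of freedom
    sp2 = ((n1 - 1) * variance(group1) + (n2 - 1) * variance(group2)) / (n1 + n2 - 2)
    diff = mean(group1) - mean(group2)
    t = diff / math.sqrt(sp2 * (1 / n1 + 1 / n2))
    d = diff / math.sqrt(sp2)                        # effect size uses s_p, not s_p squared
    return t, n1 + n2 - 2, d


# Raw SIB scores from the slides
time1 = [13, 14, 8, 10, 11, 13, 15, 16, 19, 10, 7]
time2 = [8, 10, 4, 7, 10, 9, 11, 9, 17, 6, 2]       # also the Tx group at time 2
ctrl = [11, 16, 13, 15, 16, 12]                     # reduced control group (n = 6)

t_matched, df_matched = matched_t(time1, time2)
t_pooled, df_pooled, d = pooled_t(time2, ctrl)

print(f"matched samples: t({df_matched}) = {t_matched:.3f}")
print(f"pooled variance: t({df_pooled}) = {t_pooled:.3f}, d = {d:.2f}")
```

Working from unrounded values, the results line up with the SPSS t values (8.215 and 3.080 in magnitude) rather than the slightly rounded hand calculations.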