BIOL 582 Lecture Set 4
Two Sample Hypothesis Tests

Review
• We have already done a few two-sample hypothesis tests, where we have generated distributions like this one
[Figure: histogram of Frequency (y-axis) against mean 1 - mean 2 (x-axis). Two samples (n1 = 20), (n2 = 30), mean 1 = mean 2, sd 1 = sd 2, 1000 random permutations]

Review
• And we learned about theoretical probability distributions
• We know that both empirical and theoretical distributions are used as proxies for distributions of test statistics under a null condition.
[Figure: Density (y-axis) against mean 1 - mean 2 (x-axis). Two samples (n1 = 20), (n2 = 30), mean 1 = mean 2, sd 1 = sd 2, 1000 random permutations]

Review
• Finally, we know how hypothesis testing works
  • State Null hypothesis
  • State Alternative hypothesis
  • Define acceptable type I error rate
  • Determine P-value of observed statistic
  • Reject/Accept (or fail to reject) null hypothesis
  • Arrive at a reasonable conclusion
• One thing we have not emphasized is that the process above requires use of a proper test statistic!
• For example, if our null hypothesis is no difference in the proportion of individuals with trait y less than x, comparing means would not make sense.

Two-Sample approaches for comparison of means
• Here are some example test statistics. The first is most intuitive; the rest build on the first.

$$d_{12} = \bar{x}_1 - \bar{x}_2$$

$$d_{12} = \left| \bar{x}_1 - \bar{x}_2 \right|$$

$$z_{12} = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{\sigma_1^2}{N_1} + \dfrac{\sigma_2^2}{N_2}}}$$

The z stat scales the difference between sample means by the square root of the pooled population variance (stuff in denominator). This stat converts the difference in means to a “standard deviate”, which can be evaluated with a standard normal distribution. It requires that population variances are known (rather unlikely).

The two d12 stats are the same, but the absolute value indicates that a two-tailed assessment of a distribution is used (i.e., the alternative hypothesis is that means are not equal).
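The permutation distribution of d12 described in the review can be sketched in a few lines. This is a minimal illustration and not the course's own code: the data are simulated, the function name `permutation_test` is hypothetical, and counting the observed value as one of the permutations when computing the P-value is an assumed (though common) convention.

```python
import random
from statistics import mean


def permutation_test(sample1, sample2, n_perm=1000, seed=0):
    """Two-tailed permutation test on the difference in sample means."""
    rng = random.Random(seed)
    observed = mean(sample1) - mean(sample2)
    pooled = list(sample1) + list(sample2)
    n1 = len(sample1)
    null_diffs = []
    for _ in range(n_perm):
        # Shuffle group labels: the first n1 values play the role of sample 1
        rng.shuffle(pooled)
        null_diffs.append(mean(pooled[:n1]) - mean(pooled[n1:]))
    # Two-tailed: compare absolute values, i.e. use |d12|
    extreme = sum(abs(d) >= abs(observed) for d in null_diffs)
    # Assumption: the observed value counts as one permutation
    p_value = (extreme + 1) / (n_perm + 1)
    return observed, null_diffs, p_value


# Simulated (hypothetical) data matching the figure set-up:
# n1 = 20, n2 = 30, equal means, equal sd
data_rng = random.Random(1)
sample1 = [data_rng.gauss(0, 1) for _ in range(20)]
sample2 = [data_rng.gauss(0, 1) for _ in range(30)]

d12, null_diffs, p_value = permutation_test(sample1, sample2)
print(f"d12 = {d12:.3f}, P = {p_value:.3f}")
```

The list `null_diffs` is the empirical distribution; a histogram of it would resemble the one shown in the review.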
As we have seen, the d12 stats can be evaluated with empirical distributions.

$$t_{12} = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{(n_1 - 1) + (n_2 - 1)}} \times \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$$

The t stat scales the difference between sample means by the square root of the pooled sample variance (first stuff in denominator), which is converted to a standard error (by multiplying by the second stuff). This stat converts the difference in means to a “t stat”, which can be evaluated with a t distribution. Recall that the t distribution is like a standard normal distribution, corrected for small sample sizes. There are different t distributions for different sample sizes (degrees of freedom). The degrees of freedom are always the sum of the sample sizes, minus 2, unless a “correction” is made (more later).

Two-Sample approaches for comparison of means
• Here are some more test statistics

SS12: the “Sums of Squares” between samples 1 and 2. This stat needs some explanation. Recall that sample variance is equal to

$$s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$$

The numerator is the sums of squares. Thus

$$s^2 = \frac{SS}{n - 1}$$

However, the subscripts indicate that two samples are used. So in this case, the sums of squares are calculated from sample means, not for them. See below:

$$SS_{12} = (\bar{x}_1 - \bar{\bar{x}})^2 + (\bar{x}_2 - \bar{\bar{x}})^2$$

where the double bar indicates that this value is the “grand” mean (calculated from values of both samples). See the proof of interchangeability between SS and the contrast in means on the last page. One must use an empirical distribution to evaluate this stat!

Two-Sample approaches for comparison of means
• Here are some more test statistics

$$F = \frac{SS_{12}}{SSE / (n_1 + n_2 - 2)}$$

The F value is a ratio of “between group” variance to “within group” variance. The sums of squares are first divided by the degrees of freedom (1 for between groups, n1 + n2 - 2 for within groups) to create variances. Then the between-group variance is “standardized” by the within-group variance.
The sum of squared error, SSE, is found as

$$SSE = \sum_{i} \sum_{j} (x_{ij} - \bar{x}_i)^2$$

meaning the mean of each group is subtracted from every subject within that group, and the differences are squared and summed. This is essentially finding the variance of each sample, but before dividing the SS by the df, the components are first added (pooled). The F stat can be evaluated with an F distribution, with 1 and n1 + n2 - 2 df. When the null hypothesis compares two groups, F = t².

Two-Sample approaches for comparison of means
• Thought-provoking question
  • d12, SS require generating empirical distributions
  • z, t, F do not
  • Why would one bother using d12, SS?
• The other test stats have rigid assumptions about the data
  1. Data are sampled from normally distributed populations
  2. Populations have equal variances (although there are some ways to deal with this)
  3. All data are independent observations (there are also ways to deal with this)

Two-Sample approaches for comparison of means
• Thought-provoking question
  • d12, SS require generating empirical distributions
  • z, t, F do not
  • Why would one bother using stats that have strict assumptions?
• Parametric stats have a multitude of uses, whereas each empirical distribution requires an appropriate (and often unique) method to generate it.

Two-Sample approaches for comparison of means
• For example, t and F stats can be used to
  • test the slopes of linear regressions
  • measure effects in “paired” designs
  • test differences in proportions between populations
  • compare linear models
• We will investigate the following “by hand” and with “canned” functions in R
  • Two sample means contrast
  • Paired designs
• The t stat has different variants for different tests, and for violations in assumptions. For example, when variances are unequal, one can use an alternative calculation of t, which penalizes the degrees of freedom. This means using a t distribution with fatter tails, which makes a type II error more likely in exchange for making a type I error less likely.
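A short numerical sketch may help tie the t and F statistics together. It rests on several assumptions: the data are made up, the function names are hypothetical, and the between-group sums of squares is weighted by group size (the usual ANOVA definition), which is what makes F = t² hold exactly. The unequal-variance variant shown is Welch's t with Satterthwaite degrees of freedom; the lecture does not name which variant it has in mind, so treat that choice as an illustration only.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(x1, x2):
    """Pooled-variance t statistic and its degrees of freedom (n1 + n2 - 2)."""
    n1, n2 = len(x1), len(x2)
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * variance(x1) + (n2 - 1) * variance(x2)) / df
    std_err = sqrt(pooled_var) * sqrt(1 / n1 + 1 / n2)
    return (mean(x1) - mean(x2)) / std_err, df


def f_ratio(x1, x2):
    """Between-group variance (1 df) over within-group variance (n1 + n2 - 2 df)."""
    n1, n2 = len(x1), len(x2)
    m1, m2 = mean(x1), mean(x2)
    grand = mean(list(x1) + list(x2))
    # Assumption: squared deviations of group means are weighted by group size
    ss_between = n1 * (m1 - grand) ** 2 + n2 * (m2 - grand) ** 2
    sse = sum((x - m1) ** 2 for x in x1) + sum((x - m2) ** 2 for x in x2)
    return (ss_between / 1) / (sse / (n1 + n2 - 2))


def welch_t(x1, x2):
    """Welch's t (no pooling) with Satterthwaite-adjusted degrees of freedom."""
    n1, n2 = len(x1), len(x2)
    v1, v2 = variance(x1) / n1, variance(x2) / n2
    t = (mean(x1) - mean(x2)) / sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# Made-up data for illustration only
x1 = [4.1, 5.3, 6.0, 5.5, 4.8]
x2 = [6.2, 7.1, 5.9, 6.8, 7.4, 6.5]

t, df = pooled_t(x1, x2)
f = f_ratio(x1, x2)
t_w, df_w = welch_t(x1, x2)

print(f"pooled t = {t:.4f} on {df} df; t squared = {t * t:.4f}; F = {f:.4f}")
print(f"Welch t = {t_w:.4f} on {df_w:.2f} df")
```

The Welch degrees of freedom never exceed n1 + n2 - 2, which is the sense in which the correction "penalizes" them.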
We will not dwell on these, but realize they exist. We will spend time, however, learning to concern ourselves with assumptions.

Dealing with assumptions
• Normality
• How to evaluate? Goodness of fit tests, normal probability plots
[Figure: two normal quantile plots, labeled “Good” and “Bad”; sd observed (y-axis) against sd expected (x-axis)]

    Two-sample Kolmogorov-Smirnov test
    data: a and b
    D = 0.04, p-value = 0.4005
    alternative hypothesis: two-sided

    Two-sample Kolmogorov-Smirnov test
    data: a and b
    D = 0.078, p-value = 0.004558
    alternative hypothesis: two-sided

Dealing with assumptions
• Equal variance (and a hint of normality)
• How to evaluate? Box plots
[Figure: three box plots of standard deviates (y-axis) by Group 1 and 2 (x-axis), labeled “Good”, “Bad”, and “Really Bad”]

Proof of interchangeability between squared mean contrast and SS

$$(\bar{x}_1 - \bar{x}_2)^2 = \bar{x}_1^2 - 2\bar{x}_1\bar{x}_2 + \bar{x}_2^2$$

$$
\begin{aligned}
(\bar{x}_1 - \bar{x}_2)^2 &= \left[ (\bar{x}_1 - \bar{\bar{x}}) - (\bar{x}_2 - \bar{\bar{x}}) \right]^2 \\
&= (\bar{x}_1 - \bar{\bar{x}})^2 - 2(\bar{x}_1 - \bar{\bar{x}})(\bar{x}_2 - \bar{\bar{x}}) + (\bar{x}_2 - \bar{\bar{x}})^2 \\
&= SS - 2(\bar{x}_1 - \bar{\bar{x}})(\bar{x}_2 - \bar{\bar{x}}) \\
&= SS - 2\bar{x}_1\bar{x}_2 + 2\bar{x}_1\bar{\bar{x}} + 2\bar{x}_2\bar{\bar{x}} - 2\bar{\bar{x}}^2
\end{aligned}
$$

Thus

$$
\begin{aligned}
\bar{x}_1^2 - 2\bar{x}_1\bar{x}_2 + \bar{x}_2^2 &= SS - 2\bar{x}_1\bar{x}_2 + 2\bar{x}_1\bar{\bar{x}} + 2\bar{x}_2\bar{\bar{x}} - 2\bar{\bar{x}}^2 \\
\bar{x}_1^2 - 2\bar{x}_1\bar{x}_2 + \bar{x}_2^2 + 2\bar{x}_1\bar{x}_2 - 2\bar{x}_1\bar{\bar{x}} - 2\bar{x}_2\bar{\bar{x}} + 2\bar{\bar{x}}^2 &= SS \\
\bar{x}_1^2 + \bar{x}_2^2 - 2\bar{x}_1\bar{\bar{x}} - 2\bar{x}_2\bar{\bar{x}} + 2\bar{\bar{x}}^2 &= SS \\
\bar{x}_1^2 - 2\bar{x}_1\bar{\bar{x}} + \bar{\bar{x}}^2 + \bar{x}_2^2 - 2\bar{x}_2\bar{\bar{x}} + \bar{\bar{x}}^2 &= SS \\
(\bar{x}_1 - \bar{\bar{x}})^2 + (\bar{x}_2 - \bar{\bar{x}})^2 &= SS
\end{aligned}
$$

[Figure: ratio of squared mean contrast to SS (y-axis, 1.0 to 2.0) against n1/n2 (x-axis, 0 to 100)]
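The relationship in the proof can be checked numerically. The sketch below uses made-up data and a hypothetical function name. It confirms the middle step of the derivation (squared contrast = SS minus the cross term) and compares the ratio of squared contrast to SS against (n1 + n2)² / (n1² + n2²). That closed form is not stated in the slides; it follows from the grand mean being the size-weighted average of the two sample means, and it is consistent with the 1.0 to 2.0 range of the figure. Because the ratio depends only on the sample sizes, it stays constant across permutations, which would explain why the two statistics are interchangeable for testing.

```python
from statistics import mean


def contrast_and_ss(x1, x2):
    """Return squared mean contrast, SS from the sample means, and the cross term."""
    m1, m2 = mean(x1), mean(x2)
    grand = mean(list(x1) + list(x2))  # grand mean from the values of both samples
    d_squared = (m1 - m2) ** 2
    ss = (m1 - grand) ** 2 + (m2 - grand) ** 2
    cross = -2 * (m1 - grand) * (m2 - grand)
    return d_squared, ss, cross


# Made-up data with unequal sample sizes (n1 = 4, n2 = 6)
x1 = [2.0, 3.5, 4.0, 2.5]
x2 = [5.0, 6.5, 4.5, 7.0, 5.5, 6.0]
n1, n2 = len(x1), len(x2)

d_squared, ss, cross = contrast_and_ss(x1, x2)
ratio = d_squared / ss
expected_ratio = (n1 + n2) ** 2 / (n1 ** 2 + n2 ** 2)

print(f"squared contrast = {d_squared:.4f}, SS + cross term = {ss + cross:.4f}")
print(f"ratio = {ratio:.4f}, (n1 + n2)^2 / (n1^2 + n2^2) = {expected_ratio:.4f}")
```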