Testing means, part III: The two-sample t-test

One-sample t-test
• Null hypothesis: the population mean is equal to $\mu_0$
• Test statistic, computed from the sample:
\[ t = \frac{\bar{Y} - \mu_0}{s / \sqrt{n}} \]
• Null distribution: t with n-1 df
• How unusual is this test statistic? P < 0.05: reject H0. P > 0.05: fail to reject H0.

Paired t-test
• Null hypothesis: the mean difference is equal to $\mu_{d0}$
• Test statistic, computed from the sample:
\[ t = \frac{\bar{d} - \mu_{d0}}{SE_{\bar{d}}} \]
• Null distribution: t with n-1 df (n is the number of pairs)
• How unusual is this test statistic? P < 0.05: reject H0. P > 0.05: fail to reject H0.

Comparing means
• Tests with one categorical and one numerical variable
• Goal: to compare the mean of a numerical variable for different groups.

Paired vs. 2 sample comparisons

2 Sample Design
• Each of the two samples is a random sample from its population
• The data cannot be paired

2 Sample Design - assumptions
• Each of the two samples is a random sample
• In each population, the numerical variable being studied is normally distributed
• The standard deviation of the numerical variable in the first population is equal to the standard deviation in the second population

Estimation: Difference between two means
• The estimate is $\bar{Y}_1 - \bar{Y}_2$
• Normal distribution, with standard deviation $\sigma_1 = \sigma_2 = \sigma$
• Since both $\bar{Y}_1$ and $\bar{Y}_2$ are normally distributed, their difference will also follow a normal distribution
• Confidence interval:
\[ (\bar{Y}_1 - \bar{Y}_2) \pm SE_{\bar{Y}_1 - \bar{Y}_2} \, t_{\alpha(2),\,df} \]
with df = df1 + df2 = n1 + n2 - 2

Standard error of difference in means
\[ SE_{\bar{Y}_1 - \bar{Y}_2} = \sqrt{s_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)} \]
• $s_p^2$ = pooled sample variance
• $n_1$ = size of sample 1
• $n_2$ = size of sample 2

Pooled variance:
\[ s_p^2 = \frac{df_1 s_1^2 + df_2 s_2^2}{df_1 + df_2} \]
• $df_1$ = degrees of freedom for sample 1 = $n_1 - 1$
• $df_2$ = degrees of freedom for sample 2 = $n_2 - 1$
• $s_1^2$ = sample variance of sample 1
• $s_2^2$ = sample variance of sample 2

Costs of resistance to disease
2 genotypes of lettuce: Susceptible and Resistant
Do these differ in fitness in the absence of disease?

Data, summarized

                        Susceptible   Resistant
Mean number of buds     720           582
SD of number of buds    223.6         277.3
Sample size             15            16

Both distributions are approximately normal.

Calculating the standard error
df1 = 15 - 1 = 14; df2 = 16 - 1 = 15
\[ s_p^2 = \frac{df_1 s_1^2 + df_2 s_2^2}{df_1 + df_2} = \frac{14 (223.6)^2 + 15 (277.3)^2}{14 + 15} = 63909.9 \]
\[ SE_{\bar{Y}_1 - \bar{Y}_2} = \sqrt{s_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)} = \sqrt{63909.9 \left( \frac{1}{15} + \frac{1}{16} \right)} = 90.86 \]

Finding t
df = df1 + df2 = n1 + n2 - 2 = 15 + 16 - 2 = 29
\[ t_{0.05(2),\,29} = 2.05 \]

The 95% confidence interval of the difference in the means
\[ (\bar{Y}_1 - \bar{Y}_2) \pm SE_{\bar{Y}_1 - \bar{Y}_2} \, t_{\alpha(2),\,df} = (720 - 582) \pm 90.86 \, (2.05) = 138 \pm 186 \]

Testing hypotheses about the difference in two means

2-sample t-test
The two-sample t-test compares the means of a numerical variable between two populations.

Test statistic:
\[ t = \frac{\bar{Y}_1 - \bar{Y}_2}{SE_{\bar{Y}_1 - \bar{Y}_2}} \]
where $SE_{\bar{Y}_1 - \bar{Y}_2}$ is computed from the pooled variance $s_p^2$, exactly as for the confidence interval.

Hypotheses
H0: There is no difference between the number of buds in the susceptible and resistant plants. ($\mu_1 = \mu_2$)
HA: The resistant and the susceptible plants differ in their mean number of buds. ($\mu_1 \neq \mu_2$)

Null distribution
$t_{\alpha(2),\,df}$ with df = df1 + df2 = n1 + n2 - 2

Calculating t
\[ t = \frac{\bar{Y}_1 - \bar{Y}_2}{SE_{\bar{Y}_1 - \bar{Y}_2}} = \frac{720 - 582}{90.86} = 1.52 \]

Drawing conclusions...
Critical value: $t_{0.05(2),\,29} = 2.05$
t < 2.05, so we cannot reject the null hypothesis.
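The lettuce calculation above (pooled variance, standard error, t statistic and 95% confidence interval) can be checked numerically. The following is a minimal sketch, not part of the original slides: the function name `pooled_two_sample` is hypothetical, and the critical value 2.05 is simply copied from the slides rather than computed from the t-distribution.

```python
import math


def pooled_two_sample(mean1, sd1, n1, mean2, sd2, n2):
    """Pooled-variance two-sample t from summary statistics.

    Hypothetical helper for illustration; returns
    (pooled variance, SE of the difference, t statistic, df).
    """
    df1, df2 = n1 - 1, n2 - 1
    # Pooled variance: df-weighted average of the two sample variances.
    sp2 = (df1 * sd1**2 + df2 * sd2**2) / (df1 + df2)
    # Standard error of the difference between the two sample means.
    se = math.sqrt(sp2 * (1 / n1 + 1 / n2))
    t = (mean1 - mean2) / se
    return sp2, se, t, df1 + df2


# Summary data from the slides: susceptible vs. resistant lettuce.
sp2, se, t, df = pooled_two_sample(720, 223.6, 15, 582, 277.3, 16)

T_CRIT = 2.05  # t_0.05(2),29 as quoted on the slides (assumed, not computed here)
diff = 720 - 582
half_width = se * T_CRIT
ci = (diff - half_width, diff + half_width)

print(f"sp2 = {sp2:.1f}, SE = {se:.2f}, t = {t:.2f}, df = {df}")
print(f"95% CI: {diff} +/- {half_width:.0f}")
print("reject H0" if abs(t) > T_CRIT else "fail to reject H0")
```

The interval 138 ± 186 includes zero, which agrees with the test: |t| is below the critical value, so the null hypothesis is not rejected.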
These data are not sufficient to say that there is a cost of resistance.

Assumptions of two-sample t tests
• Both samples are random samples.
• Both populations have normal distributions.
• The variance of both populations is equal.

Two-sample t-test
• Null hypothesis: the two populations have the same mean ($\mu_1 = \mu_2$)
• Test statistic, computed from the sample:
\[ t = \frac{\bar{Y}_1 - \bar{Y}_2}{SE_{\bar{Y}_1 - \bar{Y}_2}} \]
• Null distribution: t with n1+n2-2 df
• How unusual is this test statistic? P < 0.05: reject H0. P > 0.05: fail to reject H0.

Quick reference summary: Two-sample t-test
• What is it for? Tests whether two groups have the same mean
• What does it assume? Both samples are random samples. The numerical variable is normally distributed within both populations. The variance of the distribution is the same in the two populations
• Test statistic: t
• Distribution under H0: t-distribution with n1+n2-2 degrees of freedom.
• Formulae:
\[ t = \frac{\bar{Y}_1 - \bar{Y}_2}{SE_{\bar{Y}_1 - \bar{Y}_2}}, \qquad SE_{\bar{Y}_1 - \bar{Y}_2} = \sqrt{s_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}, \qquad s_p^2 = \frac{df_1 s_1^2 + df_2 s_2^2}{df_1 + df_2} \]

Comparing means when variances are not equal

Welch's t test
Welch's approximate t-test compares the means of two normally distributed populations that have unequal variances.

Burrowing owls and dung traps
Dung beetles

Experimental design
• 20 randomly chosen burrowing owl nests
• Randomly divided into two groups of 10 nests
• One group was given extra dung; the other was not
• Measured the number of dung beetles in the owls' diets

Number of beetles caught
• Dung added: $\bar{Y} = 4.8$, $s = 3.26$
• No dung added: $\bar{Y} = 0.51$, $s = 0.89$

Hypotheses
H0: Owls catch the same number of dung beetles with or without extra dung ($\mu_1 = \mu_2$)
HA: Owls do not catch the same number of dung beetles with or without extra dung ($\mu_1 \neq \mu_2$)

Welch's t
\[ t = \frac{\bar{Y}_1 - \bar{Y}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} \]
\[ df = \frac{\left( \dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2} \right)^2}{\dfrac{(s_1^2 / n_1)^2}{n_1 - 1} + \dfrac{(s_2^2 / n_2)^2}{n_2 - 1}} \]
Round down df to nearest integer.

Owls and dung beetles
\[ t = \frac{\bar{Y}_1 - \bar{Y}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \frac{4.8 - 0.51}{\sqrt{\dfrac{3.26^2}{10} + \dfrac{0.89^2}{10}}} = 4.01 \]

Degrees of freedom
\[ df = \frac{\left( \dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2} \right)^2}{\dfrac{(s_1^2 / n_1)^2}{n_1 - 1} + \dfrac{(s_2^2 / n_2)^2}{n_2 - 1}} = \frac{\left( \dfrac{3.26^2}{10} + \dfrac{0.89^2}{10} \right)^2}{\dfrac{(3.26^2 / 10)^2}{10 - 1} + \dfrac{(0.89^2 / 10)^2}{10 - 1}} = 10.33 \]
Which we round down to df = 10.

Reaching a conclusion
$t_{0.05(2),\,10} = 2.23$
t = 4.01 > 2.23
So we can reject the null hypothesis with P < 0.05.
Extra dung near burrowing owl nests increases the number of dung beetles eaten.

Quick reference summary: Welch's approximate t-test
• What is it for? Testing the difference between means of two groups when the standard deviations are unequal
• What does it assume? Both samples are random samples. The numerical variable is normally distributed within both populations
• Test statistic: t
• Distribution under H0: t-distribution with adjusted degrees of freedom
• Formulae:
\[ t = \frac{\bar{Y}_1 - \bar{Y}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}, \qquad df = \frac{\left( \dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2} \right)^2}{\dfrac{(s_1^2 / n_1)^2}{n_1 - 1} + \dfrac{(s_2^2 / n_2)^2}{n_2 - 1}} \]

The wrong way to make a comparison of two groups
"Group 1 is significantly different from a constant, but Group 2 is not. Therefore Group 1 and Group 2 are different from each other."

A more extreme case...
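As a numerical check on the burrowing owl example earlier in this section, Welch's t and its adjusted degrees of freedom can be computed directly from the summary statistics. This is a minimal sketch under stated assumptions: the function name `welch_t` is hypothetical, and the critical value 2.23 is taken from the slides rather than computed.

```python
import math


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Welch's approximate t from summary statistics.

    Hypothetical helper for illustration; returns (t, unrounded df).
    """
    # Per-sample variance of the mean: s^2 / n.
    v1 = sd1**2 / n1
    v2 = sd2**2 / n2
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    # Adjusted (Welch-Satterthwaite) degrees of freedom.
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t, df


# Summary data from the slides: dung added vs. no dung added, 10 nests each.
t, df = welch_t(4.8, 3.26, 10, 0.51, 0.89, 10)
df_rounded = math.floor(df)  # the slides round df down to the nearest integer

T_CRIT = 2.23  # t_0.05(2),10 as quoted on the slides (assumed, not computed here)

print(f"t = {t:.2f}, df = {df:.2f}, rounded down to {df_rounded}")
print("reject H0" if abs(t) > T_CRIT else "fail to reject H0")
```

Unlike the pooled test, no common variance is estimated here: each sample contributes its own s²/n term, which is why the degrees of freedom have to be adjusted.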