The t-test - University of South Florida
The t-test
Inferences about Population Means when population SD is unknown

Confidence intervals in z (Review)
Want to estimate height of students at USF. Sampled N = 100 students. Found mean = 68 in and SD = 6 in. Best guess for population mean is 68 inches plus or minus some.

$$95\%\,CI = \bar{X} \pm z_{.05}\,\sigma_{\bar{X}}, \qquad \sigma_{\bar{X}} = \frac{\sigma_X}{\sqrt{N}}$$

95%CI = 68 ± (1.96)[6/sqrt(100)] = 68 ± 1.96(.6) = 68 ± 1.18
Interval is 66.82 to 69.18. Intervals built this way will contain the population mean 95% of the time.

Problem with z
Formulas so far use the population SD, and they have been correct, but the SD is usually unknown, so we have to estimate it.
The estimate will be off a bit; it would be nice to account for this.
The statistic called 't' adjusts for error in the estimate of the SD.
The estimate of the SD is better as sample size increases, so t changes with N.
The values of t are basically the same as z, but t spreads out more and more as the sample size gets small.

The t Distribution
We use t when the population variance is unknown (the usual case) and sample size is small (N<100, the usual case).
If you use a stat package for testing hypotheses about means, you will use t.
The t distribution is a short, fat relative of the normal. The shape of t depends on its df. As N becomes infinitely large, t becomes normal.

Example values from t and z

Area beyond value    z       t (df=100)   t (df=25)
.50                  0       0            0
.25                  .67     .68          .68
.025                 1.96    1.98         2.06
.005                 2.58    2.63         2.79

[t changes with df (N)]

Degrees of Freedom
For the t distribution, degrees of freedom are always a simple function of the sample size, e.g., (N-1).
One way of explaining df is that if we know the total or mean, and all but one score, the last score is not free to vary. It is fixed by the other scores, so only (N-1) scores are free to vary. For example, if four scores total 10 and three of them are 4, 3, and 2, then 4+3+2+X = 10, so X = 1.

t table

Confidence Intervals in t
Want to estimate height of students at USF. Sampled N = 100 students. Found mean = 68 in and SD = 6 in. Best guess for population mean is 68 inches plus or minus some.

$$95\%\,CI = \bar{X} \pm t_{.05}\,s_{\bar{X}}, \qquad s_{\bar{X}} = \frac{s_X}{\sqrt{N}}, \qquad s_X = \sqrt{\frac{\sum (X - \bar{X})^2}{N - 1}}$$

$$t_{.05} = t_{(.05,\ 2\ \text{tails},\ df = 99)} = 1.98$$

95%CI = 68 ± (1.98)[6/sqrt(100)] = 68 ± 1.98(.6) = 68 ± 1.19
Interval is 66.81 to 69.19. Intervals built this way will contain the population mean 95% of the time.
Note this is virtually the same as in z, where the interval was 66.82 to 69.18. The difference matters more when N is small.

CI in t, Example 2
Suppose we want to estimate the mean curiosity score for psychology students. Sample N = 25 people, Mean = 52, SD = 10.

$$\hat{\mu} = \bar{X} = 52; \qquad \hat{\sigma} = s_X = 10; \qquad \hat{\sigma}_{\bar{X}} = s_{\bar{X}} = \frac{s_X}{\sqrt{N}} = \frac{10}{\sqrt{25}} = 2$$

$$t_{.05} = t_{(.05,\ 2\ \text{tails},\ df = 24)} = 2.064$$

$$95\%\,CI = \bar{X} \pm t_{.05}\,s_{\bar{X}} = 52 \pm 2.064(2)$$

95%CI = 47.872 to 56.128
Note: this is the same as the CI in z, except we use t instead of z. The value of t comes from a table. The tabled value depends on df.

One-sample t-test
We can use a confidence interval to "test" or decide whether a population mean has a given value.
For example, suppose we want to test whether the mean height of women at USF is equal to 68 inches.
Suppose we randomly sample 50 women students at USF. We find that their mean height is 63.05 inches. The SD of height in the sample is 5.75 inches.
Then we find the standard error of the mean by dividing SD by sqrt(N): 5.75/sqrt(50) = .81.
The critical value of t with (50-1) = 49 df is 2.01 (find this in a t-table).
Our confidence interval is, therefore, 63.05 plus/minus 2.01(.81) = 63.05 plus/minus 1.63, or 61.42 to 64.68 inches. The hypothesized value of 68 falls outside this interval. See the graph.

One-sample t Example 1
[Figure: One sample t test, confidence interval view. Histogram of Sample Height (x-axis: Height in Inches; y-axis: Frequency). Annotations: N = 50, M = 63.05, SD = 5.75, $s_{\bar{X}} = .81$, t = 2.01, CI = $\bar{X} \pm 1.63$, Pop Mean = 68.]
Take a sample, set a confidence interval around the sample mean. Does the interval contain the hypothesized value?

Conventional Steps (Cookbook)
1. Choose alpha (.05)
2. State null and alternative hypotheses (H0: pop mean is 68) (Ha: pop mean is not 68)
3. Calculate observed stat (t = ?)
4. Find critical value (tcrit = value in table)
5. State decision rule (if |tobs| > tcrit, reject null)
6. State conclusion (pop mean is not 68)

[Figure: One sample t test, t distribution view (x-axis: Height in Inches; y-axis: Frequency). Annotations: $|\bar{X} - \mu| = |63.05 - 68| = 4.95$, $s_{\bar{X}} = .81$, $|t| = 4.95 / .81 = 6.11$.]
The sample mean is roughly six standard errors from the hypothesized population mean. If the population mean is really 68 inches, it is very, very unlikely that we would find a sample with a mean as small as 63.05 inches.

One-sample t, Example 2
Over the years, smokers at M's treatment center report smoking an average of 30 cigs per day. New treatment Smoke-BGon pills given to N = 25 new clients. Did it help?

$$\bar{X} = 25, \qquad s_X = 2.52$$

$$s_{\bar{X}} = \frac{s_X}{\sqrt{N}} = \frac{2.52}{\sqrt{25}} = .50$$

$$t_{obs} = \frac{\bar{X} - \mu}{s_{\bar{X}}} = \frac{25 - 30}{.5} = -10$$

$$t_{crit} = t_{(.05,\ 2\ \text{tails},\ df = 24)} = 2.064$$

|tobs| > tcrit. Reject null. Result is significant.

Application
We prefer to use the t test instead of the z test when the _____ is small.
1 mind
2 sample size
3 standard error
4 type II error

Definition
The t test adjusts for error in estimating the population ____ during hypothesis testing.
1 mean
2 median
3 range
4 standard deviation

Application
We compute a one-sample t test and find an obtained value of t of 2.5. The critical (tabled) value of t given the null hypothesis turns out to be 2.01. What do we decide?
1 the result is significant
2 the result is not significant
3 we made a type I error
4 we made a type II error
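
The worked examples in this transcript (the curiosity-score confidence interval, the height test, and the smoking test) can be checked with a few lines of Python. This is a minimal sketch, not part of the original slides: the function names `standard_error`, `t_confidence_interval`, and `one_sample_t` are hypothetical, and the critical values are typed in from the t-table values quoted above rather than computed.

```python
import math


def standard_error(sd, n):
    """Estimated standard error of the mean: s / sqrt(N)."""
    return sd / math.sqrt(n)


def t_confidence_interval(mean, sd, n, t_crit):
    """Return (lower, upper) for mean +/- t_crit * standard error.

    t_crit is the two-tailed tabled value for df = n - 1 (looked up by hand).
    """
    margin = t_crit * standard_error(sd, n)
    return mean - margin, mean + margin


def one_sample_t(mean, sd, n, mu0, t_crit):
    """Return (t_obs, reject) for a two-tailed one-sample t test of H0: mu = mu0."""
    t_obs = (mean - mu0) / standard_error(sd, n)
    return t_obs, abs(t_obs) > t_crit


# Curiosity example: N = 25, mean = 52, SD = 10, tabled t (df = 24) = 2.064
ci_curiosity = t_confidence_interval(52, 10, 25, 2.064)

# Height example: N = 50, mean = 63.05, SD = 5.75, tabled t (df = 49) = 2.01
ci_height = t_confidence_interval(63.05, 5.75, 50, 2.01)
t_height, reject_height = one_sample_t(63.05, 5.75, 50, 68, 2.01)

# Smoking example: N = 25, mean = 25, SD = 2.52, tabled t (df = 24) = 2.064
t_smoke, reject_smoke = one_sample_t(25, 2.52, 25, 30, 2.064)

print("Curiosity 95% CI:", ci_curiosity)
print("Height 95% CI:", ci_height)
print("Height t_obs:", t_height, "reject H0:", reject_height)
print("Smoking t_obs:", t_smoke, "reject H0:", reject_smoke)
```

Because the code does not round the standard error, its observed t values differ slightly from the hand calculations: roughly -6.09 rather than 6.11 in magnitude for the height example, and roughly -9.92 rather than -10 for the smoking example. The decisions are the same in both cases.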