10-2 Estimating a Population Mean (σ Unknown)

Confidence Intervals Involving Z Using the Calculator

The t distributions
When we substitute the standard error of xbar for its standard deviation, the distribution of the resulting statistic, t, is not Normal. We call it the t distribution.
There is a different t distribution for each sample size n. We specify a t distribution by giving its degrees of freedom, which is equal to n - 1.
We will write the t distribution with k degrees of freedom as t(k) for short. We will also refer to the standard Normal distribution as the z distribution.

Comparing t and z distributions
Y1 = normalpdf(x)
Y2 = tpdf(x,2) (DISTR menu)
Window: X[-3,3], Y[-0.1,0.4]
Y2 = tpdf(x,9)
Y2 = tpdf(x,30)
Compare the shape, center, and spread of the t distribution with the z distribution.
As the degrees of freedom k increase, the t(k) density curve approaches the N(0,1) curve ever more closely.
As the sample size increases, s estimates σ ever more closely.

Finding t with Table C
Suppose you want to construct a 95% confidence interval for the mean µ of a population based on an SRS of size n = 12. What critical value t* should you use?
If you have a TI-84+, you can use invT((1+C)/2, df) to find t*.

One-sample t interval for µ
Recall the inference tool-box.
Environmentalists, government officials, and vehicle manufacturers are all interested in studying the auto exhaust emissions produced by motor vehicles. The table gives the nitrogen oxide (NOX) levels for a random sample of light-duty engines of the same type.
Construct a 95% confidence interval for the mean amount of NOX emitted by light-duty engines of this type.
Step 1: Parameter
Step 2: Conditions
Step 3: Calculations: $\bar{x} \pm t^{*} \frac{s}{\sqrt{n}}$
Step 4: Interpretation
Remember the three C's!
Conclusion, Connection, Context

One-sample t interval for µ
Note: When the actual df does not appear in Table C, use the greatest df available that is less than your desired df.

Recall matched pairs design...
Matched pairs is a form of block design in which just two treatments are compared.
Subjects are matched in pairs and each treatment is given to one subject in each pair,
or...
each subject receives both treatments in some randomized order.

Is it a matched pairs design?
When you have two sets of data, ask yourself whether something links the values in pairs and therefore prevents them from being independent. If so, a one-sample procedure applied to the paired differences is appropriate.
Inference procedures for two samples assume that the samples are selected independently of each other. This assumption does not hold when the same subjects are measured twice.

Too many numbers: what do I do?
Notice that paired t procedures are also useful for before-and-after observations on the same subjects.

Warning: I probably shouldn’t even show you the next slide. Don’t even think about writing it down… it’s on page 651.

The parameter µ in a paired t procedure is...
The mean difference in the responses to the two treatments within matched pairs of subjects in the entire population (when subjects are matched in pairs), or...
The mean difference in response to the two treatments for individuals in the population (when the same subject receives both treatments), or...
The mean difference between before-and-after measurements for all individuals in the population (for before-and-after observations on the same individuals).
Okay, so it’s the mean of the differences for the entire population.

Paired t procedures
Example 10.10, pg 651
Construct and interpret a 90% confidence interval for the mean change in depression score.

Randomization
Random Selection of individuals for a statistical study allows us to generalize the results of that study to a larger population.
Random Assignment of treatments to subjects in an experiment lets us investigate whether there is evidence of a treatment effect (cause and effect). That is, it lets us compare the results of different treatments.

The t procedures are not robust against outliers, because xbar and s are not resistant to outliers.

One-sample t interval for µ
Without the outlier, the interval is much narrower and centered differently: (1.165, 1.421).
Can we really be 95% confident in either interval? No, since the outlier suggests that the population may not be Normal.

Robust Procedures
The t procedures are not robust against outliers, but they are quite robust against non-Normality of the population when there are no outliers, especially when the distribution is roughly symmetric.
Larger samples improve the accuracy of critical values from the t distribution when the population is not Normal.
For most purposes, you can safely use the one-sample t procedures when n ≥ 15 unless an outlier or strong skewness is present.

Why can’t I use a z procedure if n is large?
Because σ is unknown!

Can we use t?
Given the percent of each state's residents who are at least 65 years of age, can or should we use t procedures to estimate the mean of these percents?
Hint: This is a population, not a sample.

Can we use t?
Given the time of the first lightning strike each day in a mountain region of Colorado, can or should we use t procedures to draw conclusions about the mean time of a day's first lightning strike with complete confidence?
Hint: n = 70, and the distribution is what shape?

Can we use t?
Given the distribution of word lengths in Shakespeare's plays, can or should we use t procedures?
Hint: n is unknown.
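
Checking the calculations by computer
The sketch below is one possible way to reproduce this section's calculations in Python using only the standard library. Here t_star plays the role of invT((1+C)/2, df), t_interval computes xbar ± t* · s/√n, and paired_t_interval applies the same interval to the within-pair differences. The function names and the numerical method (Simpson's rule plus bisection) are assumptions made for illustration and do not come from the textbook or the calculator. The data in the usage lines are made up.

```python
import math
from statistics import mean, stdev


def t_pdf(x, df):
    """Density of the t(df) distribution at x."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def t_cdf(x, df, steps=2000):
    """P(T <= x) for T ~ t(df): Simpson's rule on [0, |x|] plus symmetry."""
    a = abs(x)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area


def t_star(confidence, df):
    """Critical value t* with central area `confidence` under t(df)."""
    target = (1 + confidence) / 2  # same argument as invT((1+C)/2, df)
    lo, hi = 0.0, 1.0
    while t_cdf(hi, df) < target:  # widen the bracket until it contains t*
        hi *= 2
    for _ in range(60):  # bisection
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def t_interval(data, confidence=0.95):
    """One-sample t interval: xbar +/- t* * s / sqrt(n), with df = n - 1."""
    n = len(data)
    xbar, s = mean(data), stdev(data)  # stdev uses the n - 1 divisor
    margin = t_star(confidence, n - 1) * s / math.sqrt(n)
    return xbar - margin, xbar + margin


def paired_t_interval(first, second, confidence=0.95):
    """Paired t interval: a one-sample t interval on the differences."""
    diffs = [b - a for a, b in zip(first, second)]
    return t_interval(diffs, confidence)


# n = 12 gives df = 11; Table C lists t* = 2.201 for 95% confidence.
print(round(t_star(0.95, 11), 3))

# Made-up before/after scores, for illustration only.
before = [10, 12, 9, 11, 8]
after = [11, 14, 12, 15, 13]
print(paired_t_interval(before, after, 0.95))
```

This only does the arithmetic of Step 3. The parameter, the conditions (SRS, Normality or a large enough n, no outliers), and the interpretation in context still have to be written out by hand.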