"It is a capital mistake to theorize before one has data." Sir Arthur Conan Doyle

Chapter 23: INFERENCES FOR MEANS (Pages 520 - 546)

OVERVIEW: In chapter 18 we were given the standard deviation ($\sigma$) of the population. In reality, one frequently does not know the standard deviation of the population from which a random sample was obtained. When sample sizes are small and the population standard deviation is not known, statisticians make use of the t-distribution, a family of distributions that are all bell-shaped but differ from the normal distribution in having more area in the tails. Hence, there is a t-distribution table that differs from the normal distribution table. As sample sizes get larger, the t-distribution approaches the normal distribution in shape. In a one-sample test, the degrees of freedom (needed to use the table) is one less than the sample size, and that determines which distribution you will use.

A few important things to note in this situation:
- When the standard deviation parameter is estimated from a sample, the resulting statistic is called the ____. The standard error of the sample mean is $SE(\bar{x}) = s_{\bar{x}} = \dfrac{s_x}{\sqrt{n}}$.
- If a sample has mean $\bar{x}$, the one-sample t-statistic is $t = \dfrac{\bar{x} - \mu_0}{SE(\bar{x})}$, where $\mu_0$ is the hypothesized mean.
- A confidence interval based on a t-statistic has the form $\bar{x} \pm t^*_{n-1} \cdot SE(\bar{x})$, where t* is the appropriate value from the t-distribution table (Table T in back of book).

Assumptions/Conditions:
1. Independence Assumption:
   A) Randomization condition:
   B) 10% condition:
2. Normal Population Assumption:
   A) Nearly Normal Condition:

Examples:
1. A coffee machine dispenses coffee into paper cups. You’re supposed to get 10 ounces of coffee, but the amount varies slightly from cup to cup. Here are the amounts measured in a random sample of 20 cups. Is there evidence that the machine is shortchanging customers?

9.9 9.7 10.0 10.1 9.9 9.6 9.8 9.8 10.0 9.5
9.7 10.1 9.9 9.6 10.2 9.8 10.0 9.9 9.5 9.9

Define the parameter.
Hypotheses.
Model.
Random sample; 20 < 10% of all cups (no reason to doubt independence). The histogram of sample data to the right looks roughly unimodal and symmetric, so it’s reasonable to believe that the amount of coffee in all possible cups could be described by a Normal model.

Mechanics. n = ____   dof = ____   $\bar{x}$ = ____   $s_x$ = ____   t = ____   P-value = ____

Conclusion. Such a small P-value makes it unlikely that the low sample mean resulted from sampling error, so we ____

Confidence interval. The conditions have been met, so we can create a 95% one-sample t-interval. (Note that all confidence intervals look alike: estimate ± margin of error.)
$\bar{x} \pm t^*_{19} \cdot SE(\bar{x})$
We are 95% confident that ____

2. A company has set a goal of developing a battery that lasts over 5 hours (300 minutes) in continuous use. In a first test of 12 of these batteries, the following life spans were measured (in minutes): 321, 295, 332, 351, 281, 336, 311, 253, 270, 326, 311, and 288.

• Is there evidence that the company has met its goal?

Solution: We want to know if the mean battery lifespan exceeds the 300-minute goal set by the manufacturer. We have 12 battery lifespans in our sample to test the claim.

Define the parameter.
Hypotheses.
Model.
Randomization Condition: This is not a random sample of batteries, but merely 12 batteries produced for preliminary testing. However, it is reasonable to assume that these batteries are representative of all batteries.
Nearly Normal Condition: The distribution of battery lifespans is roughly unimodal and symmetric, so it’s reasonable to assume that the lifespans of all batteries could be described by a Normal model.
Since the conditions have been met, we can do a one-sample t-test for the mean, with 11 degrees of freedom.

Mechanics. n = ____   dof = ____   $\bar{x}$ = ____   $s_x$ = ____   $t = \dfrac{\bar{x} - \mu_0}{s_x / \sqrt{n}}$ = ____   P-value = ____

Conclusion. ____

• Find a 90% confidence interval for the mean lifespan of this type of battery.

Solution. The conditions have been met, so we can create a one-sample t-interval, with 90% confidence.
$\bar{x} \pm t^*_{11} \cdot SE(\bar{x})$
I am 90% confident that the mean battery lifespan is between ____ and ____ minutes.

• If we wish to conduct another trial, how many batteries must we test to be 95% sure of estimating the mean lifespan to within 15 minutes? To within 5 minutes?

Solution: We want to know how many batteries to test to be 95% sure of estimating the mean lifespan to within 15 minutes. First, do a preliminary estimate using z* = 1.96 as the critical value. Now, do a better estimate, using ____ as the critical value. We would need to sample about ____ batteries in order to estimate the mean battery lifespan to within 15 minutes, with 95% confidence.

Finally, to estimate the mean battery lifespan to within 5 minutes, you could do the entire process again, perhaps using a critical value with much higher degrees of freedom. We know that it’s going to take lots more batteries to cut the margin of error to a third of what it was. Alternatively, we know it will take a sample about 9 times as large, 18(9) = 162 batteries, since the margin of error was decreased to a third of its size. (The standard error has the square root of the sample size in its denominator.)

Facts about the t-distribution
1. Density curves of the t-distribution are similar to the normal curve.
2. The spread of the t-distribution is a tad greater than that of the normal distribution.
3. As the dof increase, the curve approaches the normal curve.

Rules for using the t-test
Ideally the population will have a normal distribution, but for times when this is not given:
- For sample sizes less than 15, the t-procedures can be used if the data are ____
- For sample sizes 15 to 40, t-procedures can be safely used as long as the data are ____
- For sample sizes greater than 40, t-procedures can be used ____

Use the TI-83 to calculate a quick 95% confidence interval for the following problem: An SRS of 75 male adults living in a particular suburb was taken to study the amount of time they spent per week doing rigorous exercise.
It indicated a mean of 73 minutes with a standard deviation of 21 minutes.
$\mu$ = true mean rigorous exercise time for all males in the suburb
$\bar{x}$ = ____   $s_x$ = ____   C = ____   dof = ____   t* = ____
t-confidence interval = $\bar{x} \pm t^* \cdot \dfrac{s_x}{\sqrt{n}}$
The desired 95% confidence interval is ____
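
For checking a calculator answer to the exercise problem above, here is a minimal Python sketch of the same one-sample t-interval computed from the summary statistics (n = 75, mean 73, standard deviation 21). It is not part of the handout's method, which uses Table T or the TI-83. The helper names (t_pdf, t_cdf, t_critical, t_interval) are hypothetical, and the critical value is approximated numerically (Simpson's rule plus bisection) rather than looked up in a table, so treat the result as an approximation.

```python
import math


def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))


def t_cdf(x, df, steps=1000):
    """P(T <= x), using Simpson's rule on [0, |x|] plus symmetry about 0."""
    a = abs(x)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area


def t_critical(confidence, df):
    """Two-sided critical value t*, found by bisection (assumes t* < 100)."""
    target = 1 - (1 - confidence) / 2
    lo, hi = 0.0, 100.0
    for _ in range(50):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def t_interval(mean, sd, n, confidence=0.95):
    """One-sample t-interval: mean +/- t* * sd / sqrt(n), with n - 1 dof."""
    margin = t_critical(confidence, n - 1) * sd / math.sqrt(n)
    return mean - margin, mean + margin


# Summary statistics from the exercise problem: n = 75, mean 73, sd 21.
low, high = t_interval(73, 21, 75)
print(f"95% CI: ({low:.2f}, {high:.2f}) minutes")
```

The printed interval can be compared with the TI-83's TInterval output. The same helper should work for the battery example by passing confidence=0.90 and n=12 once the sample mean and standard deviation have been computed.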