Section 18: Inferences about Means (σ unknown, sample "small") (Major Concept Review)

Suppose we have an SRS of size n from a population with mean μ and standard deviation σ, both parameters unknown, and we're trying to estimate μ. This is the same scenario as Sections 12 and 13, but now we don't know σ.

We know in theory that the probability distribution of the sample mean is

    x̄ ≈ N(μ, σ/√n)

But this relies upon knowing these parameters. We can "reverse our logic" to find the unknown μ, but the formula still relies upon knowing σ.

Remember that probability is by definition about uncertainty. It depends on what you know and when. There is no such thing as objective probability. So ultimately, the shape of x̄'s probability distribution comes down to what you know. If you're working with less certain knowledge, then x̄'s probability will be a little more spread out than it would otherwise be; you don't have the precision you did before. Its probability shape will be a little "splayed out" compared to the normal shape. It will be not quite normal. These are the t-distributions.

Under the above conditions, we say that

    t = (x̄ − μ) / (s/√n)

has the t-distribution with n − 1 degrees of freedom. The denominator is called the standard error of the sample mean.

All our logic says that statistics estimate the corresponding parameters. So the sample's standard deviation estimates the population standard deviation:

    s estimates σ    (for the set of numbers in the sample, this is s as we computed it in Section 2)
    SE(x̄) = s/√n estimates σ_x̄ = σ/√n

The t-distribution is the price you pay for the added randomness of approximating σ. (The added randomness means that confidence intervals are a little wider; you're less sure of your facts. It also means that H₀ retains the benefit of the doubt a little longer; it is harder to reject.)

There are an infinite number of t-distributions. They look "not quite normal" to the naked eye:
- total area 1 (density curve)
- symmetric about 0, extends infinitely
- as the number of degrees of freedom gets larger, the curve approaches N(0,1) ("almost normal")

Check out the t-chart reference at the link from the homepage. You will see that each t-distribution row resembles the confidence interval row for z. In fact, because the t-distribution with many degrees of freedom is approximately the same as the normal distribution, notice that the bottom ("infinity") row is the z-confidence row again. This all means that:
- Confidence intervals work almost exactly as before; we are simply using a different reference row.
- We cannot compute the exact p-value for a hypothesis test from the chart. We have to get a "rough idea" of whether it's suspiciously small, using information analogous to that of the z-row only.

The confidence interval for μ is

    x̄ ± t* · s/√n,   with df = n − 1

The test statistic for a hypothesis test for μ is

    t = (x̄ − μ) / (s/√n),   with df = n − 1

The technical conditions for the t-distribution to be considered a valid approximation are:
- n ≤ 30: the data distribution should be "mound-shaped" and approximately symmetric (*)
- n > 30: the Central Limit Theorem allows us to relax these restrictions
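Where software is available, the critical value t* and the interval x̄ ± t*·s/√n can be computed directly instead of read from the chart. The sketch below is a minimal illustration using scipy.stats; the summary numbers (n = 15, x̄ = 50.0, s = 8.0) are made up for illustration and do not come from these notes. It also prints the corresponding z* so you can see that t* is larger, which is why t intervals are wider.

```python
# Minimal sketch: a t confidence interval from summary statistics.
# Requires scipy. The sample values below are hypothetical, not from the notes.
from math import sqrt
from scipy import stats

n, xbar, s = 15, 50.0, 8.0       # hypothetical sample size, sample mean, sample stdev
conf = 0.95                      # confidence level C
df = n - 1                       # degrees of freedom

# t* leaves (1 - C)/2 in each tail of the t-distribution with n - 1 df
t_star = stats.t.ppf(1 - (1 - conf) / 2, df)
z_star = stats.norm.ppf(1 - (1 - conf) / 2)   # the z critical value, for comparison

se = s / sqrt(n)                 # standard error of the sample mean
margin = t_star * se             # margin of error

print(f"t* = {t_star:.3f}  (compare z* = {z_star:.3f}; t* is larger)")
print(f"{conf:.0%} CI for the mean: {xbar:.2f} ± {margin:.2f}"
      f"  =  ({xbar - margin:.2f}, {xbar + margin:.2f})")
```

With 14 degrees of freedom, t* ≈ 2.14 versus z* ≈ 1.96, so the t interval is noticeably wider for small samples, exactly as described above.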
Example 1a: The drug 6-mP is used to treat leukemia. The remission times (in weeks) for n = 21 patients produce an average of 17.1 weeks and a standard deviation of 10.0 weeks:

    x̄ = 17.1    s = 10.0

Notice that s = 10.0 is a statistic; it refers to the 21 patients. (If 10.0 were the standard deviation of all possible patients, then we would use σ = 10.0 in our z-procedures from Sections 12 and 13.) (It's fair to assume * above, so our procedures are valid. This is a technicality; notice that we have no easy alternative in practice.)

    μ = mean remission time for all possible patients using 6-mP = ?
    σ = stdev of remission times for all possible patients using 6-mP = ?

(The fact that the latter is unknown, recall, is what makes it a t-test.)

A 95% confidence interval for μ is

    17.1 ± 2.086 · 10/√21 = 17.1 ± 4.55 weeks

We refer to the 21 − 1 = 20 degrees of freedom row to get the value of t corresponding to 95% area in the middle:

    C        0.950   0.980
    df 20    2.086   2.528

Notice that because the t-distributions are more "splayed out", 2.086 is more than the corresponding z value of 1.96.

A 98% confidence interval for μ is

    17.1 ± 2.528 · 10/√21 = 17.1 ± 5.52 weeks

Notice that these margins of error are wider than the corresponding z margins would be. We are less certain of our facts than in Section 12, so the territory is bigger: less information → less precise payoff.

Example 1b: A previous leukemia drug had average remission time 21.7 weeks.

1) Is 6-mP better? "Better" would mean a lower average remission time than the old drug has:

    H₀: μ = 21.7
    Hₐ: μ < 21.7

Assuming H₀, the probability of a sample as extreme (low) as ours (p-value):

    t = (x̄ − μ) / (s/√n) = (17.1 − 21.7) / (10.0/√21) = −2.108,   df = 20

Ignore the sign on t. Refer to row 20 of the t-chart and the one-tail header:

    one tail   0.010   0.025
    df 20      2.528   2.086
               (2.108 falls between these values: "we're here")

Although we can't compute the area of the tail corresponding to 2.108 from the chart, notice that we don't have to do so. All we want to know is whether it is "too small to believe", as always. It's enough to have a "ballpark" idea of its size: the p-value is between 0.010 and 0.025. Therefore it is smaller than the common levels of 0.05 and 0.10.

    At the 1% level: we cannot say p-value < 0.01 → no evidence of Hₐ → do not reject H₀
    At the 5% level: p-value < 0.025 < 0.05 → evidence of Hₐ → reject H₀
    At the 10% level: p-value < 0.025 < 0.10 → evidence of Hₐ → reject H₀

At the 5% and 10% levels, we have evidence that 6-mP is an improvement; at the 1% level, we don't.

2) Does 6-mP have a different (better or worse) average remission time than the old drug?

    H₀: μ = 21.7
    Hₐ: μ ≠ 21.7

Assuming H₀, the probability of a sample as extreme (as far from the center) as ours (p-value):

    t = −2.108,   df = 20

    two tail   0.020   0.050
    df 20      2.528   2.086
               (2.108 falls between these values: "we're here")

The p-value (area of the tails, plural) is between 0.02 and 0.05.

    At the 1% level: we cannot say p-value < 0.01 → no evidence of Hₐ → do not reject H₀
    At the 5% level: p-value < 0.05 → evidence of Hₐ → reject H₀
    At the 10% level: p-value < 0.05 < 0.10 → evidence of Hₐ → reject H₀

At the 5% and 10% levels, we have evidence that 6-mP has a different average remission time from the previous drug; at the 1% level, we don't.
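If software is available, the exact p-values in Example 1b can be computed rather than bracketed from the chart. The sketch below, assuming scipy.stats is installed, recomputes the test statistic from the summary numbers in the example (n = 21, x̄ = 17.1, s = 10.0, μ₀ = 21.7) and prints the one-sided and two-sided p-values; they should land inside the bracketing intervals (0.010, 0.025) and (0.02, 0.05) found above.

```python
# Sketch of Example 1b with exact p-values, from the summary statistics in the notes.
from math import sqrt
from scipy import stats

n, xbar, s = 21, 17.1, 10.0      # sample size, sample mean, sample stdev (6-mP data)
mu0 = 21.7                       # H0 value: the old drug's average remission time
df = n - 1

se = s / sqrt(n)                 # standard error of the sample mean
t = (xbar - mu0) / se            # test statistic; should be about -2.108

p_one_sided = stats.t.cdf(t, df)          # P(T <= t), for Ha: mu < 21.7
p_two_sided = 2 * stats.t.sf(abs(t), df)  # both tails,  for Ha: mu != 21.7

print(f"t = {t:.3f} with df = {df}")
print(f"one-sided p-value = {p_one_sided:.4f}   (chart bracketing: between 0.010 and 0.025)")
print(f"two-sided p-value = {p_two_sided:.4f}   (chart bracketing: between 0.02 and 0.05)")
```

Both p-values should come out below 0.05 but not below 0.01, matching the reject / do-not-reject decisions reached from the chart.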