Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Statistics 511: Statistical Methods Dr. Levine Purdue University Spring 2011 Lecture 15: Tests about Population Means and Population Proportions Devore: Section 8.2-8.3 April, 2011 Page 1 Statistics 511: Statistical Methods Dr. Levine Purdue University Spring 2011 A Normal Population with known σ • This case is not common in practice. We will use it to illustrate basic principles of test procedure design • Let X1 , . . . , Xn be a sample size n from the normal population. The null value of the mean is usually denoted µ0 and we consider testing either of the three possible alternatives µ > µ0 , µ < µ0 and µ 6= µ0 • The test statistic that we will use is X̄ − µ0 √ Z= σ/ n It measures the distance of X̄ from µ0 in standard deviation units. April, 2011 Page 2 Statistics 511: Statistical Methods Dr. Levine Purdue University Spring 2011 • Consider Ha : µ > µ0 as an alternative. The outcome that would allow us to reject the null hypothesis H0 : µ = µ0 is z ≥ c for some c > 0 • How do you select c? We need to control the probability of Type I Error. For a test of level α, we have α = P ( Type I Error ) = P (Z ≥ c|Z ∼ N (0, 1)) • Therefore, we need to choose c = zα . Such a test procedure is called upper-tailed. • It is easy to understand that for Ha : µ < µ0 we will have the rejection region of the form z ≤ c. For the test to have the level α, we need to choose c = −zα . Such a test is called a lower-tailed test. April, 2011 Page 3 Statistics 511: Statistical Methods Dr. Levine Purdue University Spring 2011 • Now consider the case of Ha : µ 6= µ0 . The rejection region here consists of z ≥ c and z ≤ −c. • For simplicity, consider the case α = 0.05. Then, 0.05 = P (Z ≥ c or Z ≤ −c|Z ∼ N (0, 1)) = Φ(−c) + 1 − Φ(c) = 2[1 − Φ(c)] • Therefore, we select c such that 1 − Φ(c) = P (Z ≥ c) = 0.025; it is z0.025 = 1.96. This test is called a two-tailed test. April, 2011 Page 4 Statistics 511: Statistical Methods Dr. Levine Purdue University Spring 2011 Summary • Let H0 : µ = µ0 ; define the test statistic Z = 1. X̄−µ √0. σ/ n Ha : µ > µ0 has the rejection region z ≥ zα and is called an upper-tailed test 2. Ha : µ < µ0 has the rejection region z ≤ −zα and is called an lower-tailed test 3. Ha : µ 6= µ0 has the rejection region z ≥ zα/2 or z ≤ −zα/2 and is called a two-tailed test April, 2011 Page 5 Statistics 511: Statistical Methods Dr. Levine Purdue University Spring 2011 Recommended Steps for Testing Hypotheses about a Parameter 1. Identify the parameter of interest and describe it in the context of the problem situation. 2. Determine the null value and state the null hypothesis. 3. State the alternative hypothesis. 4. Give the formula for the computed value of the test statistic. 5. State the rejection region for the selected significance level 6. Compute any necessary sample quantities, substitute into the formula for the test statistic value, and compute that value. • The formulation of hypotheses (steps 2 and 3) should be done before examining the data. April, 2011 Page 6 Statistics 511: Statistical Methods Dr. Levine Purdue University Spring 2011 Example • A manufacturer of sprinkler systems used for fire protection in office buildings claims that the true average system-activation temperature is 130. Can we believe his claim? • Parameter of interest is µ = true average activation temperature. • H0 : µ = 130; Ha : µ 6= 130. • Test statistic is x̄ − µ0 x̄ − 130 √ = √ Z= σ/ n 1.5/ n April, 2011 Page 7 Statistics 511: Statistical Methods Dr. Levine Purdue University Spring 2011 • Rejection region is z ≤ −zα/2 or z ≥ zα/2 . If α = 0.01, we have z ≤ −z0.005 = −2.58 and z ≥ z0.005 = 2.58 • With n = 9 and x̄ = 131.08, 131.08 − 130 √ z= = 2.16 1.5/ 9 • Since 2.16 is outside the rejection region, we fail to reject H0 at significance level 0.01. April, 2011 Page 8 Statistics 511: Statistical Methods Dr. Levine Purdue University Spring 2011 Type II Error • As an example, consider an upper-tailed test with the rejection region z ≥ zα . √ • H0 is not rejected when x̄ < µ0 + zα · σ/ n 0 • For a particular µ > µ, the probability of Type II error is then √ 0 0 β(µ ) = P (X̄ < µ0 + zα · σ/ n|µ = µ ) 0 0 X̄ − µ µ0 − µ 0 √ √ =P < zα + |µ = µ σ/ n σ/ n 0 µ0 − µ √ = Φ zα + σ/ n April, 2011 Page 9 Statistics 511: Statistical Methods Dr. Levine Purdue University Spring 2011 • Similar derivations can help us to derive Type II error probabilities for a lower-tailed test and a two-tailed test. Results can be summarized as follows: 1. Ha : µ > µ0 has the probability of Type II Error 0 Φ zα + µ0 −µ √ σ/ n 2. Ha : µ< µ0 has the probability of Type II Error 0 µ0 −µ 1 − Φ −zα + σ/√n 3. Ha : µ 6= µ0 has the probability of Type II Error 0 Φ zα/2 + µ0 −µ √ σ/ n 0 − Φ −zα/2 + µ0 −µ √ σ/ n April, 2011 Page 10 Statistics 511: Statistical Methods Dr. Levine Purdue University Spring 2011 Sample Size Determination • Sometimes, we want to bound the value of Type II error for a 0 specific value µ . • Consider the same sprinkler example. Fix α and specify β for 0 such an alternative value. For µ = 132 we may want to require β(132) = 0.1 in addition to α = .01. • The sample size required for that purpose is such that 0 µ0 − µ √ =β Φ zα + σ/ n April, 2011 Page 11 Statistics 511: Statistical Methods Dr. Levine Purdue University Spring 2011 • Solving for n, we obtain σ(zα + zβ ) n= µ0 − µ0 2 and the same answer is true for a lower-tailed test • For a two-tailed test, it is only possible to give an approximate solution. It is σ(zα/2 + zβ ) n≈ µ0 − µ0 2 April, 2011 Page 12 Statistics 511: Statistical Methods Dr. Levine Purdue University Spring 2011 Large Sample Tests • When the sample size is large, the z tests described earlier are modified to yield valid test procedures without requiring either a normal population distribution or a known σ . • Let us assume n > 40. Then, the test statistic X̄ − µ0 √ Z= S/ n is approximately standard normal • The use of the same rejection regions as before results in a test procedure for which the significance level is approximately α. April, 2011 Page 13 Statistics 511: Statistical Methods Dr. Levine Purdue University Spring 2011 Example • To test H0 : µ = 20 vs. H1 : µ > 20 at significance level α = 0.05, we collect 400 observations with x̄n = 21.82935 and sn = 24.70037. Should H0 be rejected? • The value of test statistic is 21.82935 − 20 = 1.481234 < 1.645 = z0.05 t= 24.70037/20 - should not be rejected April, 2011 Page 14 Statistics 511: Statistical Methods Dr. Levine Purdue University Spring 2011 A Normal Population Distribution • When n is small, we can no longer invoke CLT as a justification for the large sample test • Remember that for a normally distributed random sample X1 , . . . , Xn , the statistic X̄ − µ √ T = S/ n has a t distribution with n − 1 df • Therefore, we have the test with H0 : µ = µ0 and a test x̄−µ statistic value t = s/√n0 . April, 2011 Page 15 Statistics 511: Statistical Methods Dr. Levine Purdue University Spring 2011 Summary of the Three Possible t-Tests • If Ha : µ > µ0 , the rejection region of the level α test is t ≥ tα,n−1 • If Ha : µ < µ0 , the rejection region of the level α test is t ≤ −tα,n−1 • If Ha : µ 6= µ0 , the rejection region of the level α test is t ≥ tα/2,n−1 or t ≤ −tα/2,n−1 April, 2011 Page 16 Statistics 511: Statistical Methods Dr. Levine Purdue University Spring 2011 Example • The Edison Electric Institute publishes figures on the annual number of kilowatt hours expended by various home appliances. • It is claimed that a vacuum cleaner expends an average of 46 kilowatt hours per year. • Suppose a planned study includes a random sample of 12 homes and it indicates that VC’s expend an average of 42 kilowatt hours per year with s = 11.9 kilowatt hours. • Assuming the population normality, design a 0.05 level test to see whether VC’s spend less than 46 kilowatt hours annually April, 2011 Page 17 Statistics 511: Statistical Methods Dr. Levine Purdue University Spring 2011 • H0 : µ = 46 kilowatt hours and Ha : µ < 46 kilowatt hours • Assuming α = 0.05, we have a critical region t < −1.796 where x̄ − µ0 √ t= s/ n with 11 df • The value of the statistic is 42 − 46 √ = −1.16 t= 11.9/ 12 • Since t is not in the rejection region, we fail to reject H0 . April, 2011 Page 18 Statistics 511: Statistical Methods Dr. Levine Purdue University Spring 2011 A β curve for the t-test • It is much more difficult to compute the probability of the Type II Error in this case than in the normal case • The reason is that it requires the knowledge of distribution of √ 0 under the alternative Ha . To do it precisely, we T = X̄−µ S/ n must compute 0 0 β(µ ) = P (T < tα,n−1 when µ = µ ) • There exist extensive tables of these probabilities for both oneand two-tailed tests. April, 2011 Page 19 Statistics 511: Statistical Methods Dr. Levine Purdue University Spring 2011 Figure 1: April, 2011 Page 20 Statistics 511: Statistical Methods Dr. Levine Purdue University Spring 2011 Calculating β 0 • First, we select µ and the estimated value for unknown σ . 0 Then, we find an estimated value of d = |µ0 − µ |/σ . Finally, the value of β is the height of the n − 1 df curve above the value of d • If n − 1 is not the value for which the corresponding curve appears visual interpolation is necessary April, 2011 Page 21 Statistics 511: Statistical Methods Dr. Levine Purdue University Spring 2011 • One can also calculate the sample size n needed to keep the Type II Error probability below β for specified α. 1. First, we compute d 2. Then, the point (d, β) is located on the relevant set of graphs 3. The curve below and closest to the point gives n − 1 and thus n 4. Interpolation, of course, is often necessary April, 2011 Page 22 Statistics 511: Statistical Methods Dr. Levine Purdue University Spring 2011 Large-Sample Tests • Let p denote the proportion of individuals or objects in a population who possess a specified property; thus, each object either possesses a desired property (S) or it doesn’t (F). • Consider a simple random sample X1 , . . . , Xn . If the sample size n is small relative to the population size, the number of successes in the sample X has an approximately binomial distribution. If n itself is also large, both X and the sample proportion p̂ = X/n are approximately normally distributed • Large-sample tests concerning p are a special case of the more general large-sample procedures for an arbitrary parameter θ . We considered such a large-sample test before for the mean µ of an arbitrary distribution. April, 2011 Page 23 Statistics 511: Statistical Methods Dr. Levine Purdue University Spring 2011 • Some basic properties of p̂ are; 1. Estimator p̂ is unbiased: E p̂ = p. 2. Second, it is approximately normal and its standard deviation p (SD) is σp̂ = p(1 − p)/n 3. Note that σp̂ does not include any unknown parameters. This is not always the case. It is enough to remember the large-sample test of the mean where σµ̂ = σX̄ = σ 2 /n which is in general unknown unless σ 2 is specified. April, 2011 Page 24 Statistics 511: Statistical Methods Dr. Levine Purdue University Spring 2011 • Let us consider first an upper-tailed test. It means having a null hypothesis H0 : p = p0 vs. an alternative Ha : p > p0 . • Under the null hypothesis, we have E (p̂) = p0 and p σp̂ = p0 (1 − p0 )/n; therefore, for large n the test statistic Z=p p̂ − p0 p0 (1 − p0 )/n has approximately standard normal distribution • The rejection region is, clearly, z ≥ zα for a test of approximately level α. April, 2011 Page 25 Statistics 511: Statistical Methods Dr. Levine Purdue University Spring 2011 • The lower-tailed test has a rejection region z ≤ zα • The two-tailed test has a rejection region |z| ≥ zα/2 . The last expression is a concise way of saying that z ≥ zα/2 or z ≤ −zα/2 . • These tests are applicable whenever the normal approximation of the binomial distribution is reasonable: np0 ≥ 10, n(1 − p0 ) > 10. April, 2011 Page 26 Statistics 511: Statistical Methods Dr. Levine Purdue University Spring 2011 Example • 1276 individuals in a sample of 4115 adults were found to be obese as per their BMI. A 1998 survey based on self-assessment revealed that 20% of the adult population considered themselves obese. Does the most recent data suggest that the true proportion of obese adults is more than 1.5 times the percentage from the self-assessment survey? • The null hypothesis here is that p = 0.3 and the alternative is p > .30 • Note that the rule of thumb is satisfied: np0 = 4115(.3) > 10 abd n(1 − p0 ) = 4115(.7) > 10 April, 2011 Page 27 Statistics 511: Statistical Methods Dr. Levine Purdue University Spring 2011 • Thus, we use the large-sample test with p z = (p̂ − .3)/ (.3)(.7)/n • For a significance level α = .1 we use zα = 1.28. The sample proportion is p̂ = 1276/4115 = .310. Plugging this value in z , we obtain 1.40 > 1.28. Thus, the null hypothesis is rejected. April, 2011 Page 28 Statistics 511: Statistical Methods Dr. Levine Purdue University Spring 2011 Type II Error and sample size determination • Type II Error probability can be computed exactly as before. If 0 H0 is not true, the true proportion p = p 6= p0 . Under 0 Ha : p = p we have Z is still approximately normal; however, 0 p − p0 E(Z) = p p0 (1 − p0 )/n and 0 0 p (1 − p )/n V (Z) = p0 (1 − p0 )/n April, 2011 Page 29 Statistics 511: Statistical Methods Dr. Levine Purdue University Spring 2011 • The formulas for the type II error are very similar to what we saw before for the mean test. We only give the upper-tailed test formula (Ha : p > p0 ) " # p 0 p0 − p + zα p0 (1 − p0 )/n 0 p 0 β(p ) = Φ p (1 − p0 )/n and the lower-tailed test formula (Ha : p < p0 ) " # p 0 p0 − p − zα p0 (1 − p0 )/n 0 p 0 β(p ) = 1 − Φ p (1 − p0 )/n • Sample size formulas can also be easily derived. In the two-tailed case, the formula is approximate as before April, 2011 Page 30 Statistics 511: Statistical Methods Dr. Levine Purdue University Spring 2011 Example • A package delivery service advertises that at least 90% of all packages brought to its office by 9 A.M. are delivered by noon the same day. How likely is it that a level 0.1 test based on n = 225 packages will detect a departure of 10% from this goal? • The null hypothesis is H0 : p = 0.9 vs. Ha : p < 0.9. 0 • For p = 0.8, we have β(.8) = 1 − Φ ! p .9 − .8 − 2.33 (.9)(.1)/225 p = 0.228 (.8)(.2)/225 • Thus, the probability of type II error under the alternative 0 p = 0.8 is about 23%. April, 2011 Page 31 Statistics 511: Statistical Methods Dr. Levine Purdue University Spring 2011 Small Sample tests • These are test procedures for proportions when the sample size n is small. They are based directly on the binomial distribution rather than the normal approximation. • Consider the alternative hypothesis Ha : p > p0 and let X be yet again the number of successes in the sample size n. For a test level α, we find the rejection region from P (X ≥ c when X ∼ Bin(n; p0 )) = 1 − P (X ≤ c − 1 when X ∼ Bin(n; p0 )) = 1 − B(c − 1; n, p0 ) April, 2011 Page 32 Statistics 511: Statistical Methods Dr. Levine Purdue University Spring 2011 • It is usually not possible to find an exact value of c in this case; the usual way out is to use the largest rejection region of the form {c, c + 1, . . . , n} satisfying the bound on the Type I error 0 • To compute the Type II error for an alternative p > p0 , we first 0 note that X ∼ Bin(n, p ) if the alternative is true. Then, 0 0 0 β(p ) = P (X < c when X ∼ Bin(n, p )) = B(c−1; n, p ) Note that this is a result of a straightforward binomial probability calculation April, 2011 Page 33 Statistics 511: Statistical Methods Dr. Levine Purdue University Spring 2011 Example • A builder claims that heat pumps are installed in 70% of all homes being constructed today in the city of Richmond, VA. Would you agree with this claim if a random survey of new homes in this city shows that 8 out of 15 had heat pumps installed? Use a 0.1 level of significance. • H0 : p = 0.7 vs. Ha : p < 0.7 with α = 0.10 • The test statistic is X ∼ Bin(0.7, 15) April, 2011 Page 34 Statistics 511: Statistical Methods Dr. Levine Purdue University Spring 2011 • We have x = 8 and np0 = (15)(0.7) = 10.5. Thus, we must find c such that P (X ≥ c) = 1 − B(c − 1; 15, 0.7) = 0.1 ∼ Bin(0.7, 15). It is easy to check that the rejection region will be {13, 14, 15}. for X April, 2011 Page 35