Hypothesis Testing

Next to interval estimation, the test of hypothesis is the other leg of statistical inference. With hypothesis testing we start with a hypothesis, or claim, about a population parameter and use the sample data to test the validity of that hypothesis. The purpose of the test is to see whether the evidence provided by the sample statistic is significantly different from (significantly contradicts) the hypothesized or claimed value of the population parameter.

Any hypothesis test thus results in one of two mutually exclusive outcomes: you either reject the hypothesis and conclude that the difference between the sample statistic and the hypothesized value of the population parameter is significant, or you do not reject the hypothesis and conclude that the difference is not significant.

Since there are two mutually exclusive outcomes for a hypothesis test, there are two parts to any hypothesis test statement:

1. The hypothesis that the population mean is a certain value. This is called the NULL HYPOTHESIS and is denoted by H₀.
2. The mutually exclusive statement that the population mean is not that claimed value. This is called the ALTERNATIVE HYPOTHESIS and is denoted by H₁.

Test of Hypothesis for the Population Mean

Example 1

We want to test the claim, or hypothesis, that the average textbook expense per semester of IUPUI undergraduate students is (equal to) $500.

The first step in a hypothesis test is to write the null and alternative hypotheses. Since the claim is that the population mean is $500, the null hypothesis is that the mean equals $500. The mutually exclusive statement, the alternative hypothesis, is that the mean is not equal to $500.

H₀: µ = $500
H₁: µ ≠ $500

A sample of n = 110 students yielded the following textbook expenditure data (in dollars). Using Excel, compute the sample mean x̅ and standard deviation s.
649 243 852 243 805 425 223 658 354 738 207
840 304 434 532 335 529 626 309 431 880 366
674 206 839 319 753 418 372 276 442 742 578
755 340 525 821 282 283 299 501 856 771 326
588 674 666 803 684 610 685 222 235 667 220
603 718 780 240 832 824 541 478 666 395 569
264 607 747 449 839 217 410 878 207 678 687
329 700 260 355 229 725 477 217 675 459 780
841 207 397 651 855 340 348 574 575 593 847
573 873 504 744 689 791 386 878 831 789 670

x̅ = $547.33
s = $217.02

The sample mean x̅ = $547.33 differs (deviates) from the hypothesized mean µ₀ = $500 by x̅ − µ₀ = $47.33. Is this deviation significant? If we show that this deviation is significant, then we reject the null hypothesis H₀: µ = $500, opt for the alternative hypothesis H₁: µ ≠ $500, and conclude that the population mean is different from $500.

How do we determine if the deviation x̅ − µ₀ = $47.33 is significant? We need a benchmark value to compare this deviation to. That benchmark value is the margin of sampling error (MOE).

We know from the theory of sampling distributions that, if the population mean were µ = $500, then 95% of the sample means from samples of size n = 110 would fall within

x_L, x_U = µ ± MOE = µ ± z0.025 × s/√n
x_L, x_U = 500 ± 1.96 × 217.02/√110 = 500 ± 40.56
x_L, x_U = ($459.44, $540.56)

That is, 95% of x̅ values would fall within MOE = ±$40.56 from µ = $500:

P(µ − 40.56 < x̅ < µ + 40.56) = 0.95

Where is x̅ = 547.33 located relative to this interval? This x̅ falls outside the 95% interval. That is, the deviation x̅ − µ₀ = $47.33 exceeds the benchmark MOE = $40.56. Thus, we conclude that this x̅ value does not belong to this sampling distribution. It belongs to a different sampling distribution with a different center of gravity: µ ≠ 500.

Therefore, we conclude that the deviation x̅ − µ₀ = $47.33 is significant, reject the null hypothesis H₀: µ = 500, and opt for the alternative hypothesis H₁: µ ≠ 500.

Type I Error versus Type II Error

In this example we rejected H₀ because the sample mean x̅ = 547.33 was not one of the 95% of sample means that fall within the MOE. If the population mean were in fact $500, then 1 − 0.95 = 0.05 (5%) of the sample means would fall outside the margin of error. Thus, this decision rule allows for a 5% probability of rejecting a true null hypothesis. In hypothesis testing, if we reject a null hypothesis that turns out to be true, then we have committed a TYPE I ERROR. In this example, we have allowed a 5% probability for Type I Error.

Now consider another sample of n = 110 from the same population.

483 517 213 392 277 532 721 364 715 618 275
444 849 642 661 659 654 354 422 293 694 420
699 751 740 554 603 499 709 520 663 506 235
264 871 770 302 677 781 399 539 529 478 596
726 781 239 328 739 809 320 239 625 862 351
632 839 452 878 386 606 813 777 297 767 866
755 393 226 506 223 438 521 508 562 442 525
376 524 401 733 508 279 796 727 306 553 509
328 736 847 435 708 262 264 412 352 557 788
665 486 290 292 875 540 297 463 345 332 423

x̅ = $533.85
s = $192.65

As with the previous sample, compute the 95% interval around µ₀ = $500.

x_L, x_U = 500 ± 1.96 × 192.65/√110 = 500 ± 36 = (464, 536)

This time x̅ = 533.85 falls within this interval. (See the diagram below.) We can therefore conclude that x̅ − µ₀ = $33.85 is not significant (x̅ − µ₀ < MOE), and we do not reject H₀.

Now suppose the population mean turns out to be a value other than $500, say $550. Then we have failed to reject a false null hypothesis. Here we have committed a TYPE II ERROR: not rejecting a false null hypothesis.
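The interval check applied to the two samples can be sketched in Python as follows. This is only an illustrative sketch: the function name `interval_around_null` and the dictionary of summary statistics are assumptions made for this example, and the critical value comes from the standard library's `statistics.NormalDist` (about 1.96 for α = 0.05) rather than from a z table.

```python
import math
from statistics import NormalDist


def interval_around_null(mu0, s, n, alpha=0.05):
    """Return (lower, upper, moe): the band that holds (1 - alpha) of the
    sample means if the population mean really were mu0."""
    z = NormalDist().inv_cdf(1 - alpha / 2)  # z_(alpha/2), about 1.96
    moe = z * s / math.sqrt(n)               # margin of sampling error
    return mu0 - moe, mu0 + moe, moe


# Summary statistics (x-bar, s) of the two samples of n = 110 quoted in the text
samples = {
    "first sample": (547.33, 217.02),
    "second sample": (533.85, 192.65),
}

for name, (xbar, s) in samples.items():
    lower, upper, moe = interval_around_null(500, s, 110)
    inside = lower < xbar < upper
    decision = "do not reject H0" if inside else "reject H0"
    print(f"{name}: interval = ({lower:.2f}, {upper:.2f}), "
          f"MOE = {moe:.2f}, x-bar = {xbar}, {decision}")
```

The decision depends only on whether x̅ lands inside the band around µ₀, so the same function serves both samples.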
In the diagram below, x̅ = 533.85 belongs to the sampling distribution H₁ with a center of gravity µ = 550, even though we have concluded (incorrectly) that it belongs to the distribution H₀ with µ = 500.

In short, the Type I and Type II errors can be stated as:

Type I Error: Reject a true null hypothesis.
Type II Error: Not reject a false null hypothesis.

The following is a graphic representation of the two possible outcomes of a hypothesis test (“reject” versus “not reject”) and the Type I and Type II errors.

How to Set Up the Decision Rule Regarding H₀

Back to Example 1.

Step 1: Write the null and alternative hypotheses.

H₀: µ = 500
H₁: µ ≠ 500

Step 2: Choose a probability for Type I Error (the probability of rejecting a true H₀). The probability of Type I Error is denoted by α and is called the “level of significance” of the test. Typically, 5% is selected for α.

Now all the ingredients of the hypothesis test are available:

n = 110
x̅ = $547.33
s = $217.02
α = 0.05

DECISION RULE A
Reject H₀ if |x̅ − µ₀| > MOE

|x̅ − µ₀| = |547.33 − 500| = 47.33
MOE = zα/2 × s/√n = 1.96 × 217.02/√110 = 40.56
|x̅ − µ₀| = 47.33 > MOE = 40.56
Reject H₀. Conclude that the mean is different from $500.

DECISION RULE B
Reject H₀ if test statistic (TS) > critical value (CV)

Start with Decision Rule A:
|x̅ − µ₀| > zα/2 × se(x̅)
Divide both sides by se(x̅):
|x̅ − µ₀| / se(x̅) > zα/2
The left-hand side of the inequality is the z-score and is called the “test statistic”.
TS = z = (547.33 − 500) / 20.692 = 2.29
The z-score corresponding to the given tail area is called the “critical value”.
CV = zα/2 = z0.025 = 1.96
TS = z = 2.29 > CV = z0.025 = 1.96
Reject H₀.

DECISION RULE C
Reject H₀ if prob value < α

The “probability value” is 2 × the tail area corresponding to the test statistic.
prob value = 2 × P(z > 2.29) = 2 × 0.0111 = 0.0222
prob value = 0.0222 < α = 0.05
Reject H₀.
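The three decision rules are algebraically the same test, which a short Python sketch can make concrete. The function name `two_tail_z_test` and the keys of the returned dictionary are assumptions made for this illustration; the normal quantile and tail area come from the standard library's `statistics.NormalDist`, so the results are unrounded and may differ from the hand calculation in the last digit.

```python
import math
from statistics import NormalDist


def two_tail_z_test(xbar, mu0, s, n, alpha=0.05):
    """Apply Decision Rules A, B and C to a two-tail test of H0: mu = mu0."""
    std_normal = NormalDist()
    se = s / math.sqrt(n)                      # standard error of x-bar
    cv = std_normal.inv_cdf(1 - alpha / 2)     # critical value z_(alpha/2)
    moe = cv * se                              # margin of error
    ts = abs(xbar - mu0) / se                  # test statistic |z|
    prob_value = 2 * (1 - std_normal.cdf(ts))  # area in both tails beyond |z|
    return {
        "se": se,
        "moe": moe,
        "ts": ts,
        "cv": cv,
        "prob_value": prob_value,
        "reject_A": abs(xbar - mu0) > moe,     # Rule A: deviation vs MOE
        "reject_B": ts > cv,                   # Rule B: TS vs CV
        "reject_C": prob_value < alpha,        # Rule C: prob value vs alpha
    }


# Example 1: textbook expenses
result = two_tail_z_test(xbar=547.33, mu0=500, s=217.02, n=110)
for key, value in result.items():
    print(key, round(value, 4) if isinstance(value, float) else value)
```

Because Rule B is Rule A divided by se(x̅), and Rule C compares the tail area beyond the same test statistic, the three `reject_*` flags always agree.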
Two-Tail versus One-Tail Tests

A hypothesis test is said to be a two-tail test if H₀ states that the parameter is equal (=) to a value and H₁ states that it is not equal (≠) to that value. The textbook expense example above was a two-tail test. Here is another example.

Example 2: Two-tail test

To test the hypothesis, at a 5% level of significance, that the mean vehicle speed on a freeway is 80 mph, a random sample of 120 vehicles was clocked.

H₀: µ = 80 mph
H₁: µ ≠ 80 mph

72 68 86 78 88 80 82 66 84 87 66 86
83 69 75 77 86 80 89 83 79 90 86 92
86 77 87 77 80 65 85 68 69 67 77 88
86 74 86 69 76 74 92 83 66 82 78 66
66 74 76 70 73 71 80 91 70 89 68 90
86 87 77 83 66 79 68 77 81 65 74 76
76 84 85 82 68 67 85 68 73 90 67 75
92 75 83 79 77 66 89 79 89 87 65 83
90 85 88 73 71 91 75 85 67 81 73 89
88 83 75 83 83 75 84 89 69 70 82 80

x̅ = 78.67
s = 7.98
se(x̅) = 7.98/√120 = 0.728

Decision Rule A
Reject H₀ if |x̅ − µ₀| > MOE

|x̅ − µ₀| = |78.67 − 80| = 1.33
MOE = zα/2 × se(x̅) = 1.96 × 7.98/√120 = 1.43
|x̅ − µ₀| = 1.33 < MOE = 1.43
Do not reject H₀. Conclude that the mean is not significantly different from 80 mph.

Decision Rule B
Reject H₀ if TS > CV

TS = z = |x̅ − µ₀| / se(x̅) = |78.67 − 80| / 0.728 = 1.83
CV = zα/2 = z0.025 = 1.96
TS = 1.83 < CV = 1.96
Do not reject H₀. Conclude that the mean is not significantly different from 80 mph.

Decision Rule C
Reject H₀ if prob value < α

prob value = 2 × P(z > 1.83) = 2 × 0.0336 = 0.0672
prob value = 0.0672 > α = 0.05
Do not reject H₀. Conclude that the mean is not significantly different from 80 mph.

Lower-Tail Test

The lower-tail test applies when we want to test whether the sample evidence is significantly lower (less) than the hypothesized value.

Example 3

Suppose we are planning to set up a new check-out counter design in a nationwide supermarket chain, a design which is claimed to reduce the customer waiting time to below 10 minutes. Before implementing the new plan in all the stores, the design was tested in a random sample of 40 stores for one month.
The new design will be adopted if the test provides significant proof that the mean is less than 10 minutes. The following sample data, representing the average waiting time in each test store, were obtained.

8.0 7.9 9.3 9.5 9.5 7.4 11.1 11.8 8.8 11.2
7.5 11.4 8.4 10.4 8.3 11.6 11.1 11.8 11.0 7.1
11.3 11.3 10.1 7.8 11.2 11.9 10.8 11.6 12.0 9.7
7.2 11.4 7.9 8.6 9.2 9.4 9.0 8.4 9.2 8.9

Does the sample data provide proof that the mean waiting time is significantly less than 10 minutes? Perform the test at a 5% level of significance.

The null and alternative hypotheses are written as follows. Note that the statement that the mean waiting time is “significantly less than” 10 minutes indicates that “µ < 10” should be the alternative hypothesis. The mutually exclusive statement is “µ ≥ 10”. This should be the null hypothesis.

H₀: µ ≥ 10
H₁: µ < 10

The ingredients of the test are:

n = 40
x̅ = 9.75
s = 1.56
se(x̅) = 1.56/√40 = 0.247
α = 0.05

Note that since this is a lower-tail test, the deviation of the sample statistic x̅ from the null mean µ₀ will be negative:

x̅ − µ₀ = 9.75 − 10 = −0.25

The “−” sign should be taken into account when writing the decision rules.

Decision Rule A
Reject H₀ if x̅ − µ₀ < −MOE (in a lower-tail test use −MOE)

x̅ − µ₀ = 9.75 − 10 = −0.25
MOE = tα, df × se(x̅)
Note two changes in MOE:
1. We use t instead of z (n < 100).
2. We use α instead of α/2. We are interested only in the lower tail of the sampling distribution of x̅.
t0.05, 39 = 1.685
MOE = 1.685 × 0.247 = 0.42 min.
x̅ − µ₀ = −0.25 > −MOE = −0.42
Do not reject H₀. The mean is not significantly less than 10 minutes.

Decision Rule B
Reject H₀ if TS < −CV (in a lower-tail test use −CV)

TS: t = (x̅ − µ₀) / se(x̅) = −0.25 / 0.247 = −1.012
CV: t0.05, 39 = 1.685
Since this is a lower-tail test, we are interested only in one tail of the t distribution.
TS = −1.012 > −CV = −1.685
Do not reject H₀.

Decision Rule C
Reject H₀ if prob value < α

prob value = P(t < −1.012) = 0.1589
To find the prob value you must use Excel.
To find the prob value P(t < −1.012) with 39 degrees of freedom in Excel:

Excel 2010: =T.DIST(x, deg_freedom, cumulative)
=T.DIST(-1.012,39,1)
or, =T.DIST.RT(x, deg_freedom)
=T.DIST.RT(1.012,39)
Older versions: =TDIST(x, deg_freedom, tails)
=TDIST(1.012,39,1)

prob value = 0.1589 > α = 0.05
Do not reject H₀. Do not adopt the new design.

Upper-Tail Test

The upper-tail test applies when we want to test whether the sample evidence is significantly higher (greater) than the hypothesized value.

Example 4

A random sample of n = 32 reimbursements for office visits to physicians paid by Medicare provided the following data:

109 102 102 102 103 105 90 108
120 113 105 92 101 97 102 118
92 103 98 96 93 110 93 102
117 118 118 100 98 108 116 97

The sample mean is x̅ = $104. Does the sample provide significant evidence that the mean reimbursement is greater than $100? Perform the test of hypothesis at a 5% level of significance.

The null and alternative hypotheses are written as follows. Note that the statement that the mean reimbursement is “greater than” $100 indicates that “µ > $100” should be the alternative hypothesis. The mutually exclusive statement is “µ ≤ $100”. This should be the null hypothesis.

H₀: µ ≤ 100
H₁: µ > 100

The ingredients of the test are:

n = 32
x̅ = 104
s = 8.69
se(x̅) = 8.69/√32 = 1.536
α = 0.05

Decision Rule A
Reject H₀ if x̅ − µ₀ > MOE

x̅ − µ₀ = 104 − 100 = $4.00
MOE = tα, df × se(x̅)
t0.05, 31 = 1.696
MOE = 1.696 × 1.536 = $2.61
x̅ − µ₀ = $4.00 > MOE = $2.61
Reject H₀.

Decision Rule B
Reject H₀ if test statistic > critical value

TS = t = (x̅ − µ₀) / se(x̅) = 4 / 1.536 = 2.604
CV: t0.05, 31 = 1.696
Since this is an upper-tail test, we are interested only in one tail of the t distribution.
TS > CV
Reject H₀.

Decision Rule C
Reject H₀ if prob value < α

prob value = P(t > 2.604) = 0.0065 (to find the prob value you must use Excel)
Reject H₀.

Conclude that the mean reimbursement is greater than $100.
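Decision Rules A and B for the one-tail tests of the mean (Examples 3 and 4) can be sketched in Python as below. This is a hedged illustration, not the handout's method: the function name `one_tail_t_test` and its `tail` argument are assumptions, and the critical values 1.685 and 1.696 are simply the table values quoted in the text, passed in by hand because the Python standard library has no t distribution. Rule C is left out for the same reason.

```python
import math


def one_tail_t_test(xbar, mu0, s, n, t_crit, tail):
    """Decision Rules A and B for a one-tail test of the mean.

    t_crit is the positive table value t(alpha, n - 1);
    tail is "lower" (H1: mu < mu0) or "upper" (H1: mu > mu0).
    """
    if tail not in ("lower", "upper"):
        raise ValueError("tail must be 'lower' or 'upper'")
    se = s / math.sqrt(n)      # standard error of x-bar
    moe = t_crit * se          # one-tail margin of error (alpha, not alpha/2)
    deviation = xbar - mu0     # keeps its sign
    ts = deviation / se        # test statistic t
    if tail == "lower":
        reject_a = deviation < -moe   # Rule A with -MOE
        reject_b = ts < -t_crit       # Rule B with -CV
    else:
        reject_a = deviation > moe
        reject_b = ts > t_crit
    return {"se": se, "moe": moe, "ts": ts,
            "reject_A": reject_a, "reject_B": reject_b}


# Example 3: waiting time, lower tail, t(0.05, 39) = 1.685
example_3 = one_tail_t_test(9.75, 10, 1.56, 40, t_crit=1.685, tail="lower")
# Example 4: Medicare reimbursements, upper tail, t(0.05, 31) = 1.696
example_4 = one_tail_t_test(104, 100, 8.69, 32, t_crit=1.696, tail="upper")
print(example_3)
print(example_4)
```

The only difference between the two tails is the direction of the inequality and the sign placed in front of MOE and CV.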
To find the prob value P(t > 2.604) for Example 4 in Excel, use df = n − 1 = 31:

Excel 2010: =T.DIST(x, deg_freedom, cumulative)
=T.DIST(-2.604,31,1)
or, =T.DIST.RT(x, deg_freedom)
=T.DIST.RT(2.604,31)
Older versions: =TDIST(x, deg_freedom, tails)
=TDIST(2.604,31,1)

prob value = 0.0065 < α = 0.05
Reject H₀.

TWO IMPORTANT GUIDELINES YOU MUST OBSERVE IN STATING H₀ AND H₁

I. The Eleventh Commandment: Thou Shalt Not Put the “Equal” Sign in H₁

NEVER!!!
H₀: µ ≠ 100    H₁: µ = 100
H₀: µ < 100    H₁: µ ≥ 100
H₀: µ > 100    H₁: µ ≤ 100

ALWAYS!
H₀: µ = 100    H₁: µ ≠ 100
H₀: µ ≥ 100    H₁: µ < 100
H₀: µ ≤ 100    H₁: µ > 100

II. Set H₀ and H₁ such that the sample evidence conflicts with H₀ (conforms with H₁).

INCORRECT!
x̅ = 95    H₀: µ ≤ 100    H₁: µ > 100

CORRECT
x̅ = 95    H₀: µ ≥ 100    H₁: µ < 100

In a one-tail test, if x̅ − µ₀ < 0, then the test is a lower-tail test.

Test of Hypothesis for the Population Proportion π

Two-Tail Test

Twenty-seven percent (27%) of the U.S. adult population has a bachelor’s degree. To test, at a 5% level of significance, whether the same percentage of the Indiana adult population has a bachelor’s degree, a random sample of n = 1,000 Hoosiers was taken. The sample proportion of adults with a bachelor’s degree was p̅ = 0.258.

H₀: π = 0.27
H₁: π ≠ 0.27

n = 1000
p̅ = 0.258
α = 0.05

Decision Rule A
Reject H₀ if |p̅ − π₀| > MOE

|p̅ − π₀| = |0.258 − 0.27| = 0.012
MOE = zα/2 × se(p̅)
se(p̅) = √(π₀(1 − π₀)/n) = √(0.27(1 − 0.27)/1000) = 0.014
Note that you must use the null proportion π₀ in the standard error formula, not p̅.
MOE = 1.96 × 0.014 = 0.027
|p̅ − π₀| = 0.012 < MOE = 0.027
Do not reject H₀. Conclude that the Indiana percentage is not significantly different from the national percentage.

Decision Rule B
Reject H₀ if TS > CV

TS = |p̅ − π₀| / se(p̅) = 0.012 / 0.014 = 0.86
CV = zα/2 = z0.025 = 1.96
TS = 0.86 < CV = 1.96
Do not reject H₀.

Decision Rule C
Reject H₀ if prob value < α

prob value = 2 × P(z > 0.86) = 2 × 0.1949 = 0.3898
prob value = 0.3898 > α = 0.05
Do not reject H₀.
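The two-tail test for a proportion can be sketched in Python as follows, with the standard error built from the null proportion π₀ as the text requires. The function name `two_tail_proportion_test` is an assumption for this illustration. Because the code keeps se(p̅) unrounded (about 0.01404 rather than 0.014), its test statistic and prob value differ slightly from the hand calculation, although the decision is the same.

```python
import math
from statistics import NormalDist


def two_tail_proportion_test(p_bar, pi0, n, alpha=0.05):
    """Two-tail test of H0: pi = pi0 using Decision Rules A, B and C."""
    std_normal = NormalDist()
    se = math.sqrt(pi0 * (1 - pi0) / n)        # uses pi0, not p_bar
    cv = std_normal.inv_cdf(1 - alpha / 2)     # z_(alpha/2), about 1.96
    moe = cv * se
    ts = abs(p_bar - pi0) / se                 # test statistic |z|
    prob_value = 2 * (1 - std_normal.cdf(ts))  # area in both tails
    return {
        "se": se,
        "moe": moe,
        "ts": ts,
        "prob_value": prob_value,
        "reject_A": abs(p_bar - pi0) > moe,
        "reject_B": ts > cv,
        "reject_C": prob_value < alpha,
    }


# Indiana bachelor's degree example: p-bar = 0.258, pi0 = 0.27, n = 1000
indiana = two_tail_proportion_test(p_bar=0.258, pi0=0.27, n=1000)
for key, value in indiana.items():
    print(key, round(value, 4) if isinstance(value, float) else value)
```

Using π₀ in the standard error reflects the logic of the test: the sampling distribution is drawn under the assumption that H₀ is true.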
Lower-Tail Test

"Overall, 41 percent of teachers at U.S. public schools hold a master's degree." Test the hypothesis, at a 5% level of significance, that less than 41 percent of Indiana public school teachers hold a master's degree. In a random sample of 500 Indiana public school teachers, the sample proportion of teachers with a master’s degree was p̅ = 0.38.

H₀: π ≥ 0.41
H₁: π < 0.41

n = 500
p̅ = 0.38
α = 0.05

Decision Rule A
Reject H₀ if p̅ − π₀ < −MOE (in a lower-tail test use −MOE)

p̅ − π₀ = 0.38 − 0.41 = −0.03
MOE = zα × se(p̅)
se(p̅) = √(π₀(1 − π₀)/n) = √(0.41(1 − 0.41)/500) = 0.0220
MOE = 1.64 × 0.022 = 0.036
p̅ − π₀ = −0.03 > −MOE = −0.036
Do not reject H₀. Conclude that the Indiana percentage is not significantly less than the national percentage.

Decision Rule B
Reject H₀ if TS < −CV (in a lower-tail test use −CV)

TS: z = (p̅ − π₀) / se(p̅) = −0.03 / 0.022 = −1.36
CV = zα = z0.05 = 1.64
TS = −1.36 > −CV = −1.64
Do not reject H₀.

Decision Rule C
Reject H₀ if prob value < α

prob value = P(z < −1.36) = 0.0869
prob value = 0.0869 > α = 0.05
Do not reject H₀.

Upper-Tail Test

A 2005 report stated that vehicle speed was a factor in 30 percent of fatal crashes. Test the hypothesis, at a 5% level of significance, that currently more than 30 percent of fatal crashes involve speed. In a random sample of 800 fatal crashes in 2010, 272 involved vehicle speed.

H₀: π ≤ 0.30
H₁: π > 0.30

n = 800
p̅ = 272 / 800 = 0.34
α = 0.05

Decision Rule A
Reject H₀ if p̅ − π₀ > MOE

p̅ − π₀ = 0.34 − 0.30 = 0.04
MOE = zα × se(p̅)
se(p̅) = √(π₀(1 − π₀)/n) = √(0.30(1 − 0.30)/800) = 0.0162
MOE = 1.64 × 0.0162 = 0.027
p̅ − π₀ = 0.04 > MOE = 0.027
Reject H₀. Conclude that in 2010 more than 30% of fatal crashes involved speed.

Decision Rule B
Reject H₀ if TS > CV

TS: z = (p̅ − π₀) / se(p̅) = 0.04 / 0.0162 = 2.47
CV: zα = z0.05 = 1.64
TS = 2.47 > CV = 1.64
Reject H₀.

Decision Rule C
Reject H₀ if prob value < α

prob value = P(z > 2.47) = 0.0068
prob value = 0.0068 < α = 0.05
Reject H₀.
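Both one-tail proportion tests can be reproduced with one small Python function, sketched below. The name `one_tail_proportion_test` and its `tail` argument are assumptions for this illustration. The critical value comes from `statistics.NormalDist`, which gives zα ≈ 1.645 where the text uses the table value 1.64; the decisions are unaffected.

```python
import math
from statistics import NormalDist


def one_tail_proportion_test(p_bar, pi0, n, tail, alpha=0.05):
    """One-tail test for a proportion; tail is "lower" or "upper"."""
    if tail not in ("lower", "upper"):
        raise ValueError("tail must be 'lower' or 'upper'")
    std_normal = NormalDist()
    se = math.sqrt(pi0 * (1 - pi0) / n)   # standard error under H0
    cv = std_normal.inv_cdf(1 - alpha)    # z_alpha (alpha, not alpha/2)
    ts = (p_bar - pi0) / se               # signed test statistic z
    if tail == "lower":
        reject_b = ts < -cv                      # Rule B with -CV
        prob_value = std_normal.cdf(ts)          # P(z < ts)
    else:
        reject_b = ts > cv
        prob_value = 1 - std_normal.cdf(ts)      # P(z > ts)
    return {"se": se, "ts": ts, "cv": cv, "prob_value": prob_value,
            "reject_B": reject_b, "reject_C": prob_value < alpha}


# Lower tail: Indiana teachers with a master's degree
teachers = one_tail_proportion_test(0.38, 0.41, 500, tail="lower")
# Upper tail: fatal crashes involving speed, 272 out of 800
crashes = one_tail_proportion_test(272 / 800, 0.30, 800, tail="upper")
print(teachers)
print(crashes)
```

In a one-tail test the prob value is the area in a single tail, so it is not doubled as it is in the two-tail case.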