Statistics 511: Statistical Methods
Dr. Levine
Purdue University
Spring 2011
Lecture 14: Introduction to Hypothesis Testing
Devore: Section 8.1
March, 2011

What is a statistical hypothesis?

• A statistical hypothesis is a claim about the value of one or more parameters, or about the form of a distribution as a whole.
• As an example, consider a normal distribution with mean µ. Then the statement µ = .75 is a hypothesis.

Null and Alternative Hypotheses

• Usually, two contradictory hypotheses are under consideration. For example, we may have µ = .75 and µ ≠ .75. Alternatively, for the probability of success of some binomial distribution, we may have p ≥ .10 and p < .10.
1. The null hypothesis H0 is the one that is initially assumed to be true.
2. The alternative hypothesis Ha is the assertion contrary to H0.

• We reject the null hypothesis in favor of the alternative hypothesis if the sample evidence suggests so. If the sample does not contradict H0, we continue to believe it is true.
• Thus, the two possible conclusions from a hypothesis-testing analysis are "reject H0" or "fail to reject H0".

An Example of the Test

• A test of hypothesis is a method for using sample data to decide whether the null hypothesis should be rejected.
• How exactly do we formulate a test? It depends on what our goals are.
• Consider a company that wants to introduce an expensive new product to its line-up of existing ones. Clearly, there has to be extensive evidence in favor of this new product.
If it is, for example, a new type of lightbulb, we need to ensure that its average lifetime is much longer than that of existing types before adopting it.

• A reasonable test would be H0: µ = a vs. Ha: µ > a, where a is some predetermined threshold.
• Clearly, the alternatives Ha: µ < a and Ha: µ ≠ a are of no interest in this case.
• Ha: µ < a and Ha: µ > a are called one-sided alternatives; Ha: µ ≠ a is called a two-sided alternative.
• The value a that separates the null hypothesis from the alternative is called the null value.

Testing Procedure

• A test procedure is specified by
1. A test statistic: a function of the sample data on which the decision will be based.
2. A rejection region: the set of all test statistic values for which H0 will be rejected. (The null hypothesis is rejected if and only if the test statistic value falls in this region.)

Example

• Consider the claim that the average nicotine content of a cigarette brand is at most 1.5 mg. In this case, the best setup is to test H0: µ = 1.5 vs. Ha: µ > 1.5. Why? We only care if this content is exceeded!
• Let X̄ be the sample average nicotine content. Then evidence against H0 would be provided by x̄ > 1.5.
• Note that the choice of the rejection region is somewhat arbitrary. We could have selected x̄ > 1.55 as a rejection region.

Error Types

• Example: it is possible that x̄ = 1.8 for some sample even when H0 is true.
• A Type I error consists of rejecting the null hypothesis H0 when it is true.
• Example: it is also possible that x̄ = 1.5 for a particular sample even if H0 is false.
• A Type II error consists of not rejecting H0 when it is false.

• The only way to get rid of both errors is to use the entire population! In reality, a different procedure has to be followed.
• Assume that automobiles sustain no visible damage in 25% of 10 mph crash tests. Let p denote the proportion of all 10 mph crashes that result in no visible damage to the new bumper. Then we test H0: p = .25 vs. Ha: p > .25. The experiment is based on n = 20 independent crashes with prototypes of the new design.

Type I Error Analysis

• Consider the following procedure:
1. Test statistic: X, the number of crashes with no visible damage.
2. Rejection region: R8 = {8, 9, ..., 20}; in other words, reject H0 if x ≥ 8.
• Thus, the probability of a Type I error is
α = P(Type I error) = P(X ≥ 8 when X ∼ Bin(20, .25)) = 1 − B(7; 20, .25) = .102
where B(x; n, p) denotes the cumulative distribution function of Bin(n, p).

Type II Error Analysis

• In contrast to the Type I error, there is no single β; instead, there is a different β for each value of p consistent with Ha.
• Suppose the true value of p is 0.3. Then
β(.3) = P(Type II error when p = 0.3) = P(X ≤ 7 when X ∼ Bin(20, .3)) = B(7; 20, .3) = .772
• β decreases as p moves farther above the null value .25.

• Now consider a different rejection region R9 = {9, 10, ..., 20}.
• Since X ∼ Bin(20, p), we have
α = P(H0 rejected when p = .25) = P(X ≥ 9 when X ∼ Bin(20, .25)) = 1 − B(8; 20, .25) = .041
• Note that the Type I error probability has gone down.

• At the same time,
β(.3) = P(H0 is not rejected when X ∼ Bin(20, .3)) = P(X ≤ 8 when X ∼ Bin(20, .3)) = B(8; 20, .3) = .887
which is larger than before. Think of it as a trade-off between the two error probabilities.

Proposition

• If the experiment and the sample size are fixed, decreasing the size of the rejection region to obtain a smaller value of α always results in a larger value of β for any parameter value consistent with the alternative hypothesis Ha.
• The usual approach is to specify the largest value of α that can be tolerated and find a rejection region for it. This makes β as small as possible subject to the bound on α. Such a value of α is called the significance level of the test.
• Traditional choices are .10, .05 and .01. The resulting test is called a level α test.

Example

• Consider again the nicotine example. We test H0: µ = 1.5 vs. Ha: µ > 1.5 based on a random sample X1, ..., X32. We know that σ = .20 and that X is normally distributed.
• Hence, X̄ is normally distributed with mean µ_X̄ = µ and standard deviation σ_X̄ = .20/√32 = 0.0354.
• We use
Z = (X̄ − 1.5) / 0.0354
as a test statistic.

• We reject H0 when z "considerably" exceeds zero. In other words, we choose c such that
α = P(Type I error) = P(Z ≥ c when Z ∼ N(0, 1))
• For example, if α = 0.05, we have c = z.05 = 1.645. This corresponds to x̄ ≥ 1.5 + 1.645(0.0354) ≈ 1.56.
• Then, for any particular µ > 1.5, β(µ) = P(X̄ < 1.56 when the true mean is µ).
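
The error probabilities worked out in this lecture can be checked numerically. The following is a minimal Python sketch, not part of the original slides, that uses only the standard library. The function names (binom_cdf, upper_tail_errors, z_test_cutoff, z_test_beta) are hypothetical helpers introduced for illustration. The value µ = 1.6 in the last step is an assumed alternative chosen only to show how β(µ) is evaluated, and n = 32 is taken from the standard deviation .20/√32 used for X̄.

```python
from math import comb, sqrt
from statistics import NormalDist


def binom_cdf(x, n, p):
    """B(x; n, p) = P(X <= x) for X ~ Bin(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(x + 1))


def upper_tail_errors(cutoff, n, p0, p_alt):
    """Return (alpha, beta) for the rejection region {cutoff, ..., n}.

    alpha is computed under the null value p0,
    beta under one particular alternative p_alt.
    """
    alpha = 1 - binom_cdf(cutoff - 1, n, p0)  # P(X >= cutoff when p = p0)
    beta = binom_cdf(cutoff - 1, n, p_alt)    # P(X <= cutoff - 1 when p = p_alt)
    return alpha, beta


def z_test_cutoff(mu0, sigma, n, alpha):
    """Smallest sample mean that rejects H0: mu = mu0 in favor of Ha: mu > mu0."""
    se = sigma / sqrt(n)                 # standard deviation of the sample mean
    c = NormalDist().inv_cdf(1 - alpha)  # upper-alpha critical value z_alpha
    return mu0 + c * se


def z_test_beta(mu, mu0, sigma, n, alpha):
    """beta(mu) = P(sample mean < cutoff) when the true mean is mu."""
    se = sigma / sqrt(n)
    return NormalDist(mu, se).cdf(z_test_cutoff(mu0, sigma, n, alpha))


# Bumper example: compare the rejection regions R8 and R9.
for cutoff in (8, 9):
    alpha, beta = upper_tail_errors(cutoff, n=20, p0=0.25, p_alt=0.30)
    print(f"R{cutoff}: alpha = {alpha:.3f}, beta(.3) = {beta:.3f}")

# Nicotine example: level .05 test with sigma = .20 and n = 32.
x_crit = z_test_cutoff(mu0=1.5, sigma=0.20, n=32, alpha=0.05)
print(f"reject H0 when the sample mean is at least {x_crit:.2f}")

# Assumed alternative mu = 1.6 (illustrative only, not from the lecture).
print(f"beta(1.6) = {z_test_beta(1.6, 1.5, 0.20, 32, 0.05):.3f}")
```

Evaluating upper_tail_errors for both cutoffs side by side makes the proposition concrete: moving from R8 to R9 lowers α and raises β(.3). Computing the binomial sum directly avoids any dependence on a third-party statistics package, at the cost of being suitable only for small n.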