AP Statistics
Mr. Coppock

Errors in Hypothesis Testing

Like any decision-making process, hypothesis testing is subject to error. There are two errors that one can make while doing hypothesis testing. One can reject the null hypothesis when in fact it is true, or one can fail to reject the null hypothesis when in fact it is false. The two errors are best depicted using the 2 x 2 table below:

                                   Reality
Test Decision                      Null hypothesis is true    Alternative hypothesis is true
Reject null hypothesis             Type I error               Correct decision
Fail to reject null hypothesis     Correct decision           Type II error

Example: Let's say Consumer Reports magazine doesn't believe that McDonald's Quarter Pounders really contain a quarter pound of beef (4 oz.). They perform the following hypothesis test using an SRS of 50 burgers with an $\alpha$ level of 0.05 (the standard deviation of Quarter Pounder weights is 1.5 oz.):

$$H_0: \mu = 4 \qquad H_a: \mu < 4$$

A) Let's assume that the burgers do in fact weigh 4 oz. (i.e., McDonald's is telling the truth).

1) If we got an $\bar{x}$ of 3.5 oz. from our SRS, we would reject $H_0$ because

$$z = \frac{3.5 - 4.0}{1.5/\sqrt{50}} = -2.357, \qquad P(Z < -2.357) = 0.009$$

The p-value for this $\bar{x}$ is below .05 (it is 0.009). This, however, would be a wrong decision. The reason we made this error is that our sample was a fluke and happened to give us an extremely low $\bar{x}$. We call this a Type I error: rejecting the null hypothesis when, in fact, it is true. The probability of this error is the probability of rejecting the true null hypothesis based on such a "fluke". Since our $\alpha$ level is .05, the probability of rejecting $H_0$ based on a fluke is .05.

[Figure: sampling distribution of $\bar{x}$ under $H_0$, centered at 4.0 oz., axis from 3.5 to 4.5 oz.]

2) If we got an $\bar{x}$ of 3.97 oz., we would not reject $H_0$ because

$$z = \frac{3.97 - 4.0}{1.5/\sqrt{50}} = -0.141, \qquad P(Z < -0.141) = 0.444$$

The p-value for this $\bar{x}$ is above .05 (it is .444). This would be a correct decision since $H_0$ is true.
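The two test statistics above can be checked numerically. The following is a minimal sketch, assuming a lower-tail z test (the alternative is that the mean weight is below 4 oz.) and using only Python's standard library; the function name `lower_tail_z_test` and the constant names are illustrative and not part of the handout.

```python
from math import sqrt
from statistics import NormalDist

MU_0 = 4.0     # mean weight claimed under Ho (oz)
SIGMA = 1.5    # standard deviation of burger weights (oz)
N = 50         # number of burgers in the SRS
ALPHA = 0.05   # significance level


def lower_tail_z_test(x_bar):
    """Return (z, p_value) for a test of Ho: mu = MU_0 against Ha: mu < MU_0."""
    standard_error = SIGMA / sqrt(N)
    z = (x_bar - MU_0) / standard_error
    p_value = NormalDist().cdf(z)  # area to the left of z under the standard normal
    return z, p_value


for x_bar in (3.5, 3.97):
    z, p = lower_tail_z_test(x_bar)
    decision = "reject Ho" if p < ALPHA else "fail to reject Ho"
    print(f"x-bar = {x_bar}: z = {z:.3f}, p-value = {p:.3f}, {decision}")
```

Run as written, this should reproduce the handout's values: z of about -2.357 with a p-value of about 0.009 for a sample mean of 3.5 oz., and z of about -0.141 with a p-value of about 0.444 for a sample mean of 3.97 oz.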
The correct decision of not rejecting a true $H_0$ occurs with a probability of 0.95 ($1 - \alpha$).

[Figure: sampling distribution of $\bar{x}$ under $H_0$, centered at 4.0 oz., axis from 3.2 to 4.8 oz.]

B) Let's now assume that Quarter Pounders in fact weigh less than 4 oz. (say they weigh only 3.7 oz., which means McDonald's is lying).

1) If we got an $\bar{x}$ of 3.5 oz., we would reject $H_0$ because

$$z = \frac{3.5 - 4.0}{1.5/\sqrt{50}} = -2.357, \qquad P(Z < -2.357) = 0.009$$

The p-value for this $\bar{x}$ is below .05 (it is 0.009). This would be a correct decision, since the alternative hypothesis is true ($\mu < 4$). The probability of this decision is called the power of the test. We'll learn how to calculate it in the next example.

2) If we got an $\bar{x}$ of 3.97 oz., we would not reject $H_0$ because

$$z = \frac{3.97 - 4.0}{1.5/\sqrt{50}} = -0.141, \qquad P(Z < -0.141) = 0.444$$

The p-value for this $\bar{x}$ is above .05 (it is .444). This would be a wrong decision because $H_0$ is not true (the true weight of Quarter Pounders we assumed was 3.7 oz.) and we failed to reject $H_0$. This is called a Type II error, and its probability is usually denoted by $\beta$. A good way to picture $\beta$ and to calculate it is to construct the normal curve that represents the true weight ($\mu = 3.7$ oz.) next to the normal curve of our null hypothesis assumption ($\mu = 4.0$ oz.) and look at the overlap areas:

[Figure: two overlapping normal curves on an axis from 3.1 to 4.9 oz. One is the normal curve based on the true value of $\mu$ (3.7 oz.), labeled $H_a: \mu = 3.7$. The other is the normal curve used to calculate the test statistic assuming the null hypothesis is true (4.0 oz.), labeled $H_0: \mu = 4$.]

Type I error ($\alpha$): dark shaded region (discussed earlier).
Type II error ($\beta$): light shaded region. This is the probability that we would not reject the null hypothesis that $\mu = 4.0$ oz. when in fact the true $\mu$ is only 3.7 oz. (i.e., the alternative hypothesis is true).

Power: A high probability of a Type II error (failing to reject a null hypothesis that really is false) means that the test is not sensitive enough to reliably detect the alternative. The sensitivity of the test to detect the alternative is called the power of the test.
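The overlap picture described above can be turned into a calculation. The sketch below assumes the usual two-step approach, which the handout itself does not spell out: first find the cutoff sample mean below which $H_0$ is rejected at the 0.05 level, then find how often a sample mean lands above that cutoff when the true mean is 3.7 oz. The function name `type_ii_error` is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def type_ii_error(mu_true, mu_0=4.0, sigma=1.5, n=50, alpha=0.05):
    """Return (cutoff, beta) for a lower-tail z test of Ho: mu = mu_0.

    cutoff is the sample mean below which Ho is rejected; beta is the
    probability of failing to reject Ho when the true mean is mu_true.
    """
    standard_error = sigma / sqrt(n)
    # Curve under Ho: the lowest alpha fraction of sample means leads to rejection.
    cutoff = NormalDist(mu_0, standard_error).inv_cdf(alpha)
    # Curve under the true mean: area above the cutoff is the Type II error probability.
    beta = 1 - NormalDist(mu_true, standard_error).cdf(cutoff)
    return cutoff, beta


cutoff, beta = type_ii_error(mu_true=3.7)
print(f"reject Ho when x-bar < {cutoff:.3f} oz")
print(f"P(Type II error) = {beta:.3f}")
print(f"P(correctly rejecting Ho) = {1 - beta:.3f}")
```

Under these assumptions the cutoff is about 3.651 oz. and the Type II error probability is about 0.59, so a test on 50 burgers would miss a true mean of 3.7 oz. more often than it would catch it.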
The power is simply the probability of NOT making a Type II error. So:

$$\text{Power} = 1 - \beta$$

[Figure: normal curves on an axis from 3.1 to 4.9 oz. illustrating the power of the test]

Things to remember:
(1) A Type I error can only occur when the null hypothesis is true. (You incorrectly reject a true null hypothesis.)
(2) A Type II error can only occur when the null hypothesis is false. (You incorrectly fail to reject a false null hypothesis.)
(3) The power of a test is 1 - P(Type II error). (This is the probability that you correctly reject a false null hypothesis.)
(4) One needs a specific alternative to the null hypothesis in order to calculate the probability of a Type II error.

Important Properties of Type I and Type II Errors:

A) The probabilities of a Type I and a Type II error are inversely related. For a fixed sample size, if we decrease $\alpha$ we will increase $\beta$.

[Figure: two panels, each showing the $\alpha$ level and curves centered at $\mu_a$ and $\mu_0$]

The tradeoff between Type I and Type II errors has serious implications for choosing an $\alpha$ level when doing hypothesis testing. Choosing the smallest possible $\alpha$ level might not be such a good idea if the consequences of a Type II error are grave.

Example 1: Suppose you are deciding whether or not to reject a parachute before placing it in service.

$H_0$: The parachute works properly
$H_a$: The parachute does not work properly

A Type I error would be to reject the null hypothesis when in fact it is true. This would mean rejecting a working parachute. The cost of such an error is the money it took to make the parachute.

A Type II error would be not to reject the null hypothesis when in fact it should be rejected (the alternative hypothesis is true). This would mean that the inspector would accept a defective parachute and put it into service. The cost of such an error would obviously be the life of the individual using the parachute (the parachute not working means the person jumping from the plane is going to die).
In the parachute example, the cost of a Type II error is so much greater than the cost of a Type I error that we would choose a high $\alpha$ level so as to minimize $\beta$.

Example 2: Suppose you are deciding whether to give the death penalty to a murder suspect.

$H_0$: The suspect is innocent
$H_a$: The suspect is guilty

A Type I error in this case would entail rejecting the presumption of innocence when in fact the suspect is innocent. The cost of this error would be executing an innocent person.

A Type II error would mean you do not reject the null hypothesis and decide the person is not guilty when in fact he is guilty. The cost in this case would be that the person either goes free or perhaps serves a little jail time (if convicted of a lesser crime).

In this case a Type I error seems more costly, and we would want to choose an $\alpha$ level that is extremely small so as to minimize the probability of a Type I error.

B) Both Type I and Type II errors can be reduced by increasing the sample size, which reduces the standard deviation of the sampling distributions:

[Figure: two panels, each showing the $\alpha$ level and curves centered at $\mu_a$ and $\mu_0$]

C) The probability of a Type II error decreases the farther the alternative mean $\mu_a$ is from $\mu_0$:

[Figure: three panels, each showing the $\alpha$ level and curves centered at $\mu_a$ and $\mu_0$]

A good analogy to errors in hypothesis testing is, again, a comparison to our judicial system:

                                                            Reality
Jury Decision                                               Defendant is innocent    Defendant is guilty
Reject presumption of innocence (guilty verdict)            Type I error             Correct decision
Fail to reject presumption of innocence (not guilty verdict)  Correct decision       Type II error
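Properties A, B, and C of Type I and Type II errors can be checked numerically on the Quarter Pounder example. The sketch below is illustrative only: it assumes a lower-tail z test, and the comparison values ($\alpha$ of 0.01, a sample of 200 burgers, and an alternative mean of 3.4 oz.) are hypothetical choices that do not appear in the handout.

```python
from math import sqrt
from statistics import NormalDist


def type_ii_probability(mu_true, mu_0=4.0, sigma=1.5, n=50, alpha=0.05):
    """beta for a lower-tail z test of Ho: mu = mu_0 when the true mean is mu_true."""
    standard_error = sigma / sqrt(n)
    cutoff = NormalDist(mu_0, standard_error).inv_cdf(alpha)    # reject Ho when x-bar < cutoff
    return 1 - NormalDist(mu_true, standard_error).cdf(cutoff)  # P(fail to reject | mu_true)


baseline = type_ii_probability(3.7)                         # handout setup: n = 50, alpha = 0.05
smaller_alpha = type_ii_probability(3.7, alpha=0.01)        # property A (hypothetical alpha)
larger_sample = type_ii_probability(3.7, n=200)             # property B (hypothetical n)
both_reduced = type_ii_probability(3.7, n=200, alpha=0.01)  # property B with alpha lowered too
farther_alternative = type_ii_probability(3.4)              # property C (hypothetical mean)

print(f"baseline beta:               {baseline:.3f}")
print(f"A) alpha lowered to 0.01:    {smaller_alpha:.3f}")
print(f"B) n raised to 200:          {larger_sample:.3f}")
print(f"B) n = 200 and alpha = 0.01: {both_reduced:.3f}")
print(f"C) true mean of 3.4 oz:      {farther_alternative:.3f}")
```

With the sample size fixed, lowering $\alpha$ raises $\beta$. With the larger sample, $\beta$ falls even when $\alpha$ is lowered at the same time, which is the sense in which both error probabilities can be reduced together. Moving the alternative mean farther from 4.0 oz. also lowers $\beta$.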