Chapter 20 Testing Hypotheses About Proportions Copyright © 2009 Pearson Education, Inc. NOTE on slides / What we can and cannot do The following notice accompanies these slides, which have been downloaded from the publisher’s Web site: “This work is protected by United States copyright laws and is provided solely for the use of instructors in teaching their courses and assessing student learning. Dissemination or sale of any part of this work (including on the World Wide Web) will destroy the integrity of the work and is not permitted. The work and materials from this site should never be made available to students except by instructors using the accompanying text in their classes. All recipients of this work are expected to abide by these restrictions and to honor the intended pedagogical purposes and the needs of other instructors who rely on these materials.” We can use these slides because we are using the text for this course. I cannot give you the actual slides, but I can e-mail you the handout. Please help us stay legal. Do not distribute even the handouts on these slides any further. The original slides are done in orange / brown and black. My additions are in red and blue. Copyright © 2009 Pearson Education, Inc. Slide 1- 3 Main topics of this chapter The reasoning behind hypothesis tests The null hypothesis P-values The one-proportion z-test One and two sided alternatives Things to watch out for when testing hypotheses Copyright © 2009 Pearson Education, Inc. Slide 1- 4 Division of Mathematics, HCC Course Objectives for Chapter 20 After studying this chapter, the student will be able to: Perform a one-proportion z-test, to include: writing appropriate hypotheses, checking the necessary assumptions, drawing an appropriate diagram, computing the P-value, making a decision, and interpreting the results in the context of the problem. Copyright © 2009 Pearson Education, Inc. What we’re doing I flip a coin 3 times and get 3 heads.
This could have happened by chance (p = 0.125). The coin could be fair. I flip a 4th time and get heads. This could have happened by chance (p = 0.0625). The coin could be fair? I flip a 5th time and get heads. This could have happened by chance (p = 0.03125). The coin could be fair (??). I flip a 6th time and get heads. This could have happened by chance (p = 0.015625). Do you still think that this coin is fair? Copyright © 2009 Pearson Education, Inc. Slide 1- 6 What we’re doing Starting out: The coin is fair. Burden of proof is on us to show that it is not. After 3rd flip: Some evidence, but it still could be fair. After 4th flip: More evidence – still could be fair, but we’re beginning to wonder. After 5th flip: More evidence - Hmmmmmm! After 6th flip: Lots of evidence – probably not fair. What do we mean by “probably”? Copyright © 2009 Pearson Education, Inc. Slide 1- 7 Electoral College poll from Chapter 19 62% of Americans say that they would amend the Constitution to replace the Electoral College system of electing the President and Vice-President with a Popular Vote system. Gallup Poll taken October 6 – 9, 2011; announced last week. Gallup sampled 1005 adults 18 or older living in the continental United States. Other surveys may have gotten different results. There is variability among results. We will measure it. Copyright © 2009 Pearson Education, Inc. What have we learned? We can be 95% confident that the true percentage of Americans who want the Presidential election decided by popular vote is between 59.0% and 65.0%. We cannot say that “More than 60% of Americans want the Presidential election decided by popular vote” and be 95% confident that we are correct. The 70% CI is (60.4%, 63.58%). We can say that “More than 60% of Americans want the Presidential election decided by popular vote” and be 70% confident that we are correct. But is 70% really that confident? Copyright © 2009 Pearson Education, Inc.
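(An addition of mine, not from the publisher’s slides.) The run-of-heads probabilities quoted above are just (1/2)^k: the flips are independent, so k heads in a row under the fair-coin model has probability 0.5 raised to the k. A few lines of Python confirm the numbers on the slides:

```python
# Probability that a fair coin gives k heads in a row: (1/2) ** k.
for k in range(3, 7):
    print(k, "heads in a row: p =", 0.5 ** k)
```

Running this prints 0.125, 0.0625, 0.03125, and 0.015625, exactly the values used in the coin-flipping story.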
Slide 1- 9 Hypotheses Hypotheses are working models that we adopt temporarily. Our starting hypothesis is called the null hypothesis. The null hypothesis, which we denote by H0, specifies a population model parameter of interest and proposes a value for that parameter. We usually write down the null hypothesis in the form H0: parameter = hypothesized value. The alternative hypothesis, which we denote by HA, contains the values of the parameter that we consider plausible when we reject the null hypothesis. Copyright © 2009 Pearson Education, Inc. Slide 1- 10 Testing Hypotheses The null hypothesis specifies a population model parameter of interest and proposes a value for that parameter. H0: p = 0.60 would be appropriate for our Gallup example. We want to compare our data to what we would expect given that H0 is true. We can do this by finding out how many standard deviations away from the proposed value we are. We then ask how likely it is to get results like we did if the null hypothesis were true. Copyright © 2009 Pearson Education, Inc. Slide 1- 11 A Trial as a Hypothesis Test Think about the logic of jury trials: To prove someone is guilty,¹ we start by assuming they are innocent. We retain that hypothesis until the facts make it unlikely beyond a reasonable doubt. Then, and only then, we reject the hypothesis of innocence and declare the person guilty. ¹Guilty in a criminal trial; “liable” or “in favor of the plaintiff” in a civil trial – we’ll use “guilty” for both. Copyright © 2009 Pearson Education, Inc. Slide 1- 12 What happens in a jury trial? The trial starts by assuming that the defendant is innocent – recall “innocent until proved guilty.” During the trial, evidence is gathered (by physical exhibits and/or testimony). The jury examines the evidence. If it is unlikely that, given innocence, we would have all of this evidence just by chance, the jury rejects the hypothesis of innocence and returns a Guilty verdict.
If the evidence is not strong enough to reject the hypothesis of no guilt, then the jury returns “Not guilty.” The jury does not prove guilt or innocence. Source for next slide: Food and Drug Administration, Center for Food Safety and Applied Nutrition, Basic Statistics Course (Marc Boyer, Martine Ferguson) Copyright © 2009 Pearson Education, Inc. Slide 1- 13
Step 1. Trial by jury: Start with the presumption that the defendant is innocent. Statistical significance: Start with the presumption that the null hypothesis is true.
Step 2. Trial by jury: Listen to only the factual evidence presented in the trial; ignore newspaper and television. Statistical significance: Base the conclusion only on data from this one experiment; don’t consider any other data.
Step 3. Trial by jury: Evaluate whether you believe the witness; ignore testimony from unreliable witnesses. Statistical significance: Evaluate whether the experiment was performed properly.
Step 4. Trial by jury: Think about whether the evidence is consistent with the assumption of innocence. Statistical significance: Calculate the P-value.
Step 5. Trial by jury: If the evidence is inconsistent with the assumption, then reject the assumption of innocence and declare the defendant guilty; otherwise reach a verdict of not guilty. (A juror can’t conclude “maybe” or ask for more evidence.) Statistical significance: If the P-value is less than a preset threshold (.05 is common), conclude that the data are inconsistent with the null hypothesis, and declare the difference to be statistically significant; otherwise, conclude that sufficient evidence does not exist.
Copyright © 2009 Pearson Education, Inc. A Trial as a Hypothesis Test (cont.) The same logic used in jury trials is used in statistical tests of hypotheses: We begin by assuming that a hypothesis is true. Next we consider whether the data are consistent with the hypothesis. If they are, all we can do is retain the hypothesis we started with. If they are not, then like a jury, we ask whether they are unlikely beyond a reasonable doubt (or “preponderance of evidence” in a civil trial). Copyright © 2009 Pearson Education, Inc.
Slide 1- 15 P-Values The statistical twist is that we can quantify our level of doubt. We can use the model proposed by our hypothesis to calculate the probability that the event we’ve witnessed could happen. That’s just the probability we’re looking for—it quantifies exactly how surprised we are to see our results. This probability is called a P-value. Copyright © 2009 Pearson Education, Inc. Slide 1- 16 P-Values (cont.) When the data are consistent with the model from the null hypothesis, the P-value is high and we are unable to reject the null hypothesis. In that case, we have to “retain” the null hypothesis we started with. We can’t claim to have proved it; instead we “fail to reject the null hypothesis” when the data are consistent with the null hypothesis model and in line with what we would expect from natural sampling variability. If the P-value is low enough, we’ll “reject the null hypothesis,” since what we observed would be very unlikely were the null model true. In a jury trial, the null hypothesis is that of innocence. Copyright © 2009 Pearson Education, Inc. Slide 1- 17 “An ode to p-values” P-value low? The null’s gotta go. P-value high? The null will fly. Source: Mario F. Triola, “Elementary Statistics using EXCEL”, 4th edition, © 2010, Pearson Publishing Co. Copyright © 2009 Pearson Education, Inc. Slide 1- 18 Testing Hypotheses (reminder) The null hypothesis, which we denote H0, specifies a population model parameter of interest and proposes a value for that parameter. We might have, for example, H0: p = 0.60, as in the Gallup electoral college poll example. We want to compare our data to what we would expect given that H0 is true. We can do this by finding out how many standard deviations away from the proposed value we are. In other words, we find the z-score that our observations give rise to, and then compare it with the distribution that reflects the null hypothesis. 
We then ask how likely it is to get results like we did if the null hypothesis were true. Copyright © 2009 Pearson Education, Inc. Slide 1- 19 What to Do with an “Innocent” Defendant If the evidence is not strong enough to reject the presumption of innocence, the jury returns with a verdict of “not guilty.” The jury does not say that the defendant is innocent. All it says is that there is not enough evidence to convict, to reject innocence. The defendant may, in fact, be innocent, but the jury has no way to be sure. Copyright © 2009 Pearson Education, Inc. Slide 1- 20 What to Do with an “Innocent” Defendant (cont.) Said statistically, we will fail to reject the null hypothesis. We never declare the null hypothesis to be true, because we simply do not know whether it’s true or not. Sometimes in this case we say that the null hypothesis has been retained. Copyright © 2009 Pearson Education, Inc. Slide 1- 21 What to Do with an “Innocent” Defendant (cont.) In a criminal trial, the burden of proof is on the prosecution. In a civil trial, it is on the plaintiff, or the party filing the lawsuit. In a hypothesis test, the burden of proof is on the unusual claim. The null hypothesis is the ordinary state of affairs, so it’s the alternative to the null hypothesis that we consider unusual (and for which we must marshal evidence). Copyright © 2009 Pearson Education, Inc. Slide 1- 22 Recall the Casey Anthony Trial Early July 2011 – Casey Anthony was found Not Guilty of killing her 2-year-old daughter Caylee. Very emotional reaction nationwide. Casey Anthony juror Jennifer Ford said that she and the other jurors cried and were "sick to our stomachs" after voting to acquit Ms. Anthony. "I did not say she was innocent," said Ford. "I just said there was not enough evidence. If you cannot prove what the crime was, you cannot determine what the punishment should be." Source: ABC News Copyright © 2009 Pearson Education, Inc. Slide 1- 23 A Comment on the jury trial example.
There is a law forum on the Internet sponsored by Martindale-Hubbell. Lawyers there are in agreement that a judge or jury would never decide a case based solely on statistics, no matter how significant the p-values are. If this did happen, the case would certainly be appealed. Statistics is a tool, to be used in conjunction with other tools, to explain phenomena. Copyright © 2009 Pearson Education, Inc. Slide 1- 24 The Reasoning of Hypothesis Testing There are four basic parts to a hypothesis test: 1. Hypotheses 2. Model 3. Mechanics 4. Conclusion Let’s look at these parts in detail… Recall our example – 62% of Americans feel that the Electoral College should be scrapped. Can we say that more than 60% do? Copyright © 2009 Pearson Education, Inc. Slide 1- 25 The Reasoning of Hypothesis Testing (cont.) 1. Hypotheses The null hypothesis: To perform a hypothesis test, we must first translate our question of interest into a statement about model parameters. In general, we have H0: parameter = hypothesized value. The alternative hypothesis: The alternative hypothesis, HA, contains the values of the parameter we consider plausible if we reject the null. In some texts, the alternative hypothesis is denoted H1. It’s the same thing – only the author’s preference dictates which is used. I typically use Ho and Ha because they are easier to type. Copyright © 2009 Pearson Education, Inc. Slide 1- 26 Hypothesis – back to our Electoral College example We are interested in whether the proportion of Americans that want the Electoral College scrapped is more than 60%. We take the “devil’s advocate” position – form a statement saying “no change”. In this case, it is Ho: p = 0.60. Then we formulate the alternative, Ha: p > 0.60. If the true proportion turns out to be exactly 60% or lower, we’d abandon the claim. So (in this example) we are not interested in whether fewer than 60% want the change. Copyright © 2009 Pearson Education, Inc.
Slide 1- 27 Hypothesis – back to our Electoral College example Remember: We form our hypothesis before collecting the data. Therefore, it is incorrect to say Ho: p = 0.62. We got the 0.62 from the survey. We do not know about the 0.62 when we form our hypothesis. Therefore, Ho: p = 0.62 cannot possibly be correct. The correct null hypothesis is what we suppose before collecting the data: Ho: p = 0.60. Copyright © 2009 Pearson Education, Inc. Slide 1- 28 The Reasoning of Hypothesis Testing (cont.) 2. Model To plan a statistical hypothesis test, specify the model you will use to test the null hypothesis and the parameter of interest. All models require assumptions, so state the assumptions and check any corresponding conditions. Your plan should end with a statement like “Because the conditions are satisfied, I can model the sampling distribution of the proportion with a Normal model.” Watch out, though. It might be the case that your model step ends with “Because the conditions are not satisfied, I can’t proceed with the test.” If that’s the case, stop and reconsider. Copyright © 2009 Pearson Education, Inc. Slide 1- 29 The Reasoning of Hypothesis Testing (cont.) 2. Model Each test we discuss in the book has a name that you should include in your report. The test about proportions is called a one-proportion z-test. Copyright © 2009 Pearson Education, Inc. Slide 1- 30 Checking assumptions – our example Independence: Gallup has expertise in assuring that their respondents are independent. Representative sample? Yes. Sample size: npo = 1005 * 0.60 = 603 > 10; nqo = 1005 * 0.40 = 402 > 10. 10% condition: 1005 is a small portion of the American population of over 300 million. Note: If Ho were precisely true, there would be exactly 603 (60% of 1005) in our sample who want the Electoral College scrapped. Copyright © 2009 Pearson Education, Inc.
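(Another addition of mine, not from the text.) The condition checks above can be scripted. Here is a small helper — the function name is my own invention — that applies the sample-size condition using the null value p0 and the 10% condition:

```python
def conditions_ok(n, p0, population_size):
    """Sample-size condition (n*p0 and n*q0 both at least 10) plus the 10% condition."""
    np0 = n * p0
    nq0 = n * (1 - p0)
    return np0 >= 10 and nq0 >= 10 and n <= 0.10 * population_size

# Gallup example: n = 1005, p0 = 0.60, population over 300 million.
print(conditions_ok(1005, 0.60, 300_000_000))  # True: 603 and 402 both exceed 10
```

It does not replace the independence and representativeness judgments, which require thinking about how the sample was collected, not arithmetic.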
Slide 1- 31 One-Proportion z-Test The conditions for the one-proportion z-test are the same as for the one-proportion z-interval. We test the hypothesis H0: p = p0 using the statistic z = (p̂ − p0) / SD(p̂), where SD(p̂) = √(p0 q0 / n). When the conditions are met and the null hypothesis is true, this statistic follows the standard Normal model, so we can use that model to obtain a P-value. Copyright © 2009 Pearson Education, Inc. Slide 1- 32 The Reasoning of Hypothesis Testing (cont.) 3. Mechanics Under “mechanics” we place the actual calculation of our test statistic from the data. Different tests will have different formulas and different test statistics. Usually, the mechanics are handled by a statistics program or calculator, but it’s good to know the formulas. Copyright © 2009 Pearson Education, Inc. Slide 1- 33 The Reasoning of Hypothesis Testing (cont.) In this case, we get a z-score. Recall that a z-score measures the number of standard deviations we are above or below the mean. A very high or very low z-score indicates a rare event. We measure this by looking at the area under the standard normal curve. Copyright © 2009 Pearson Education, Inc. Slide 1- 34 The Reasoning of Hypothesis Testing (cont.) 3. Mechanics The ultimate goal of the calculation is to obtain a P-value. The P-value is the probability that the observed statistic value (or an even more extreme value) could occur if the null model were correct. If the P-value is small enough, we’ll reject the null hypothesis. Note: The P-value is a conditional probability—it’s the probability that the observed results could have happened if the null hypothesis is true. Copyright © 2009 Pearson Education, Inc. Slide 1- 35 Mechanics in our example The formula is z = (p̂ − p0) / SD(p̂), so we need to compute SD(p̂), which is √(p0 q0 / n). Here p0 = 0.60, so q0 = 0.40, and n = 1005. Note: We are rounding p̂ to 0.62 for easy computation. In reality, p̂ = 623/1005 = 0.6199004975. This is what the technologies will use. Copyright © 2009 Pearson Education, Inc.
Slide 1- 36 Mechanics in our example Therefore SD = SQRT(0.6 * 0.4 / 1005) = 0.015453. Then z = (0.62 – 0.60) / 0.015453, or +1.294. What do we do with this? Copyright © 2009 Pearson Education, Inc. Slide 1- 37 Mechanics in our example Then z = (0.62 – 0.60) / 0.015453 = +1.294. This is a z-score, so we compare it with N(0,1). Normalcdf(1.294, +99999) = 0.098. It turns out that we get a percentile of about 9.8%. About 9.8% of the time, we’d see a result this extreme or more if there is no effect. Now what do we do with this? Copyright © 2009 Pearson Education, Inc. Slide 1- 38 The Reasoning of Hypothesis Testing (cont.) 4. Conclusion The conclusion in a hypothesis test is always a statement about the null hypothesis. The conclusion must state either that we reject or that we fail to reject the null hypothesis. And, as always, the conclusion should be stated in context. Copyright © 2009 Pearson Education, Inc. Slide 1- 39 The Reasoning of Hypothesis Testing (cont.) 4. Conclusion Your conclusion about the null hypothesis should never be the end of a testing procedure. Often there are actions to take or policies to change. Copyright © 2009 Pearson Education, Inc. Slide 1- 40 Let’s review our example – what did we do? 1. Hypothesis: State the null and the alternative hypotheses. 2. Model: State and check the four assumptions, and name the test (1-proportion z-test). 3. Mechanics: Find the p-value. 4. Conclusion: Say what the p-value tells us. This is called the 1-sample z-test in a few texts, but “1-proportion z-test” is more accurate. Copyright © 2009 Pearson Education, Inc. Slide 1- 41 Conclusion in Our Electoral College Example A sample of 1005 Americans showed that 623 want the Electoral College scrapped. This is above 60%, or 603, for this sample. The one-proportion z-test of whether 60% of Americans want the Electoral College scrapped gave a p-value of 0.098.
We would see a result this extreme or more 9.8% of the time, or a hair above one time in 10, by chance alone. There is not sufficient evidence to show that this was a real phenomenon and not just a chance occurrence. Why? Normally, we consider 1 in 20 (or more) to be sufficient evidence. More about this later. Copyright © 2009 Pearson Education, Inc. Slide 1- 42 Using the technology 62% want the Electoral College scrapped. The technology will not allow us to use 623.1 observations. We will need to input 623 successes. Both technologies will therefore work with 623/1005 = 0.6199004975 instead of 0.62. We will therefore get a slightly different answer. In most cases, this will not matter. Copyright © 2009 Pearson Education, Inc. Slide 1- 43 Hypothesis Testing with the TI Press the STAT key. Then move to 1-PropZTest. Enter data as on the left and “Calculate.” See result at right. Copyright © 2009 Pearson Education, Inc. Slide 1- 44 Hypothesis Testing with the TI (alternative) You can “make a picture” of the test. Use “Draw” instead of “Calculate.” Before doing this, make sure of two things: 1. That all of the equations entered in “Y=” are turned off. 2. That all of the plots in [STAT PLOT] are turned off. Copyright © 2009 Pearson Education, Inc. Slide 1- 45 Hypothesis Testing with the TI (alternative) Press the STAT key. Then move to 1-PropZTest. Copyright © 2009 Pearson Education, Inc. Slide 1- 46 Hypothesis Testing with the TI (alternative) Make sure that PLOTS and Y1 through Y6 are turned off or these plots may overlay your drawing! Choose Draw instead of Calculate. Copyright © 2009 Pearson Education, Inc. Slide 1- 47 Hypothesis Testing with StatCrunch See the screen captures on the next few slides. It is very similar to the confidence interval. When the time comes, select “Hypothesis Test”. Note: “Proportions” is where you want to go, not “Z-statistics.” Copyright © 2009 Pearson Education, Inc. Slide 1- 48 Copyright © 2009 Pearson Education, Inc.
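(My addition, not from the slides.) If you don’t have a TI or StatCrunch handy, the same one-proportion z-test can be sketched in a few lines of Python using only the standard library; the erfc identity below gives the Normal upper-tail area:

```python
from math import sqrt, erfc

def one_proportion_ztest(x, n, p0):
    """One-proportion z-test with the upper-tail alternative HA: p > p0."""
    p_hat = x / n
    sd = sqrt(p0 * (1 - p0) / n)        # SD computed under the null hypothesis
    z = (p_hat - p0) / sd
    p_value = 0.5 * erfc(z / sqrt(2))   # P(Z > z) for a standard Normal
    return z, p_value

z, p = one_proportion_ztest(623, 1005, 0.60)
print(round(z, 4), round(p, 4))  # 1.2878 0.0989, matching the TI and StatCrunch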
Slide 1- 49 Why select “summary” and not “data”? “Data” refers to the 1005 people surveyed. If you select “data”, then StatCrunch will look for 1005 respondents in the first column and how they responded in the second! We do not have the raw data; we have a summary. Remember Chapter 3? Same thing here. Copyright © 2009 Pearson Education, Inc. Slide 1- 50 Copyright © 2009 Pearson Education, Inc. Slide 1- 51 Copyright © 2009 Pearson Education, Inc. Slide 1- 52 Hypothesis test results: p : proportion of successes for population H0 : p = 0.6 HA : p > 0.6
Proportion: p; Count: 623; Total: 1005; Sample Prop.: 0.6199005; Std. Err.: 0.015453348; Z-Stat: 1.287779; P-value: 0.0989
We get p = 0.0989, same as with the TI. Slide 1- 53 Copyright © 2009 Pearson Education, Inc. Confidence Interval for our example Recall from Chapter 19: the formula for the 95% confidence interval is p̂ ± 1.96 · SE(p̂), where SE(p̂) = √(p̂ q̂ / n). We compute SE(p̂) just as we computed the SD in our hypothesis test, but using p̂ instead of p0. It is 0.015311. We also found that the 95% confidence interval is (0.5899, 0.6499). Note that it contains 0.60, our hypothesized proportion. If we do the interval in the TI and then do the test (or vice-versa), our data will be there. We do not have to re-enter it for the other procedure! In StatCrunch, we will have to re-enter. Copyright © 2009 Pearson Education, Inc. Slide 1- 54 It appears that Hypothesis Testing and Confidence Intervals are Related Concepts. They are two sides of the same coin. In Confidence Intervals, we are estimating a parameter. In Hypothesis Testing, we are testing a hypothesis. We get the same information in both – we just tell the story differently. Copyright © 2009 Pearson Education, Inc. Slide 1- 55 One important distribution difference Confidence Interval: The normal distribution is around the 62% because this is what the data gives. But in hypothesis testing, the hypothesis is set up before we even see the data. Hence the distribution is centered around the null hypothesis of 60%.
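(My addition.) To see the interval side of the coin numerically, here is a matching sketch of the Chapter 19 interval; note that, unlike the test, it bases the standard error on p̂ rather than on the null value:

```python
from math import sqrt

def one_prop_interval(x, n, z_star=1.96):
    """95% confidence interval for a proportion (z* = 1.96), SE based on p-hat."""
    p_hat = x / n
    se = sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - z_star * se, p_hat + z_star * se

lo, hi = one_prop_interval(623, 1005)
print(round(lo, 4), round(hi, 4))  # 0.5899 0.6499; note that 0.60 is inside
```

The interval containing 0.60 is the interval-flavored version of the test’s failure to reject p = 0.60 (two-sided, at the 0.05 level).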
Copyright © 2009 Pearson Education, Inc. Slide 1- 56 Video example – cracking of ingots Cracking is a serious problem for aluminum ingots weighing 30,000 pounds each. Cracking of ingots is hovering at about 20% of ingots made. The engineers have designed a process that they hope will reduce the percent of cracking. Based on data that the engineers have collected on the performance of 400 randomly examined ingots in which 68 were cracked, has their process worked? Statistically, we can answer their question. NOTE: This example comes from the video on this chapter as presented by Dr. DeVeaux. Copyright © 2009 Pearson Education, Inc. Slide 1- 57 Checking assumptions – our example Independence: This is not a simple random sample. But the company has told us that defects in one ingot will not affect any other. Representative sample? Yes. Sample size: npo = 400 * 0.2 = 80 > 10; nqo = 400 * 0.8 = 320 > 10. 10% condition: The ingots being tested are a small fraction of the total ingots produced. We can use the normal model. Note: If Ho were precisely true, there would be 80 cracked ingots. Copyright © 2009 Pearson Education, Inc. Slide 1- 58 Cracked ingot example with the TI [STAT][TESTS], then to 5 (1-PropZTest) Enter in the appropriate places: What we are testing – 0.2 X: 68, n: 400 One-sided test Calculate Copyright © 2009 Pearson Education, Inc. Slide 1- 59 Cracked ingot example with the TI Here is our result: We could also choose Draw instead of Calculate. Copyright © 2009 Pearson Education, Inc. Slide 1- 60 Conclusion in Our Example A sample of 400 ingots showed 68 cracked. This is below 20%, or 80, for this sample. The one-proportion z-test of whether the new method has reduced cracking gave a p-value of 0.067. Thus, if the method did not reduce cracking, we would see a result this extreme or more 6.7% of the time, or one time in 15, by chance alone.
While this is promising, it is not sufficient evidence to show that there was a real change and not just a chance occurrence. Why? Normally, we consider 1 in 20 (or more) to be sufficient evidence. More about this later. Copyright © 2009 Pearson Education, Inc. Slide 1- 61 Alternative Alternatives There are three possible alternative hypotheses: HA: parameter < hypothesized value HA: parameter ≠ hypothesized value HA: parameter > hypothesized value Copyright © 2009 Pearson Education, Inc. Slide 1- 62 Alternative Alternatives (cont.) HA: parameter ≠ value is known as a two-sided alternative because we are equally interested in deviations on either side of the null hypothesis value. For two-sided alternatives, the P-value is the probability of deviating in either direction from the null hypothesis value. Note: The “0.069” should be “0.067”. Copyright © 2009 Pearson Education, Inc. Slide 1- 63 Back to our ingot example Suppose the metallurgist were interested in what happens either way. He would test HO: p = 0.20 against HA: p ≠ 0.20. He would get the same z-score. The difference is that he would be interested in z-scores both below −1.5 and above +1.5. We pick up 6.7% of the area on both sides. Thus, the metallurgist would report p = 0.134. Copyright © 2009 Pearson Education, Inc. Slide 1- 64 Back to our ingot example Two-sided alternative with the TI Just pick the two-sided alternative and do as before. Copyright © 2009 Pearson Education, Inc. Slide 1- 65 Alternative Alternatives (cont.) The other two alternative hypotheses are called one-sided alternatives. A one-sided alternative focuses on deviations from the null hypothesis value in only one direction. Thus, the P-value for one-sided alternatives is the probability of deviating only in the direction of the alternative away from the null hypothesis value. Again, this should be 0.067. Copyright © 2009 Pearson Education, Inc.
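(My addition.) The manager’s one-sided p-value and the metallurgist’s two-sided p-value for the ingot example can both be checked with a short standard-library sketch:

```python
from math import sqrt, erfc

def normal_cdf(z):
    """P(Z <= z) for a standard Normal, via the complementary error function."""
    return 0.5 * erfc(-z / sqrt(2))

p_hat, p0, n = 68 / 400, 0.20, 400
sd = sqrt(p0 * (1 - p0) / n)        # 0.02
z = (p_hat - p0) / sd               # -1.5
one_sided = normal_cdf(z)           # lower tail, for HA: p < 0.20
two_sided = 2 * one_sided           # both tails, for HA: p != 0.20
print(round(one_sided, 3), round(two_sided, 3))  # 0.067 0.134
```

Doubling works here because the Normal model is symmetric: the area below −1.5 equals the area above +1.5.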
Slide 1- 66 One and two sided alternatives Note that we got two conclusions: Report to the manager: If there is really no improvement in our process, we have a result this extreme or more 6.7% of the time. Report to the metallurgist: If there is really no difference as a result of our method, we have a result this extreme or more 13.4% of the time. Both statements are true, and they are consistent with each other. Copyright © 2009 Pearson Education, Inc. Slide 1- 67 Alternative Alternatives (cont.) The decision to use a one-sided or two-sided alternative is rarely a statistical one. You must decide whether you should test one-sided or two-sided before you see the data. Not even that far – before you even do the experiment; i.e., before the data even exist! Copyright © 2009 Pearson Education, Inc. Slide 1- 68 P-Values and Decisions: What to Tell About a Hypothesis Test How small should the P-value be in order for you to reject the null hypothesis? It turns out that our decision criterion is context-dependent. When we’re screening for a disease and want to be sure we treat all those who are sick, we may be willing to reject the null hypothesis of no disease with a fairly large P-value. A longstanding hypothesis, believed by many to be true, needs stronger evidence (and a correspondingly small P-value) to reject it. Another factor in choosing a P-value is the importance of the issue being tested. Copyright © 2009 Pearson Education, Inc. Slide 1- 69 P-Values and Decisions (cont.) Your conclusion about any null hypothesis should be accompanied by the P-value of the test. If possible, it should also include a confidence interval for the parameter of interest. Don’t just declare the null hypothesis rejected or not rejected. Report the P-value to show the strength of the evidence against the hypothesis. This will let each reader decide whether or not to reject the null hypothesis. Copyright © 2009 Pearson Education, Inc. Slide 1- 70 What Can Go Wrong?
Hypothesis tests are so widely used—and so widely misused—that the issues involved are addressed in their own chapter (Chapter 21). There are a few issues that we can talk about already, though: Copyright © 2009 Pearson Education, Inc. Slide 1- 71 What Can Go Wrong? (cont.) Don’t base your null hypothesis on what you see in the data. Think about the situation you are investigating and develop your null hypothesis appropriately. If you develop the null hypothesis (and the alternative hypothesis) before running the experiment, you won’t make this mistake (or the next one.) Don’t base your alternative hypothesis on the data, either. Again, you need to Think about the situation. Copyright © 2009 Pearson Education, Inc. Slide 1- 72 What Can Go Wrong? (cont.) Don’t make your null hypothesis what you want to show to be true. You can reject the null hypothesis, but you can never “accept” or “prove” the null. Don’t forget to check the conditions. We need randomization, independence, and a sample that is large enough to justify the use of the Normal model. If you fail to reject the null hypothesis, don’t think a bigger sample would be more likely to lead to rejection. Each sample is different, and a larger sample won’t necessarily duplicate your current observations. Copyright © 2009 Pearson Education, Inc. Slide 1- 73 What have we learned? We can use what we see in a random sample to test a particular hypothesis about the world. Hypothesis testing complements our use of confidence intervals. Testing a hypothesis involves proposing a model, and seeing whether the data we observe are consistent with that model or so unusual that we must reject it. We do this by finding a P-value—the probability that data like ours could have occurred if the model is correct. Copyright © 2009 Pearson Education, Inc. Slide 1- 74 What have we learned? (cont.) 
We’ve learned the process of hypothesis testing, from developing the hypotheses to stating our conclusion in the context of the original question. We know that confidence intervals and hypothesis tests go hand in hand in helping us think about models. A hypothesis test makes a yes/no decision about the plausibility of a parameter value. A confidence interval shows us the range of plausible values for the parameter. Copyright © 2009 Pearson Education, Inc. Slide 1- 75 Main topics of this chapter The reasoning behind hypothesis tests The null hypothesis P-values The one-proportion z-test One and two sided alternatives Things to watch out for when testing hypotheses Copyright © 2009 Pearson Education, Inc. Slide 1- 76 Division of Mathematics, HCC Course Objectives for Chapter 20 After studying this chapter, the student will be able to: Perform a one-proportion z-test, to include: writing appropriate hypotheses, checking the necessary assumptions, drawing an appropriate diagram, computing the P-value, making a decision, and interpreting the results in the context of the problem. Copyright © 2009 Pearson Education, Inc. A seasonal example for Halloween Do you have children who go trick or treating Halloween night? Kid brothers and sisters count! If so, do you steal candy from their trick or treat bags? Go ahead and admit it!! A majority of parents in the Baltimore-Washington area do it! But can we really say that? Copyright © 2009 Pearson Education, Inc. Slide 1- 78 A survey about candy-copping parents! In early October, Mandala Research took a poll. Surveys were taken in 20 major US metro areas. 4000 parents were questioned; 200 in each area. In our area, 51% (say 102) admitted to stealing candy from their children’s trick or treat bags. Can we claim this to be a majority? We’ll use α-level 0.10. Source: http://betterinbulk.net/2011/10/halloween2011-in-washington-d-c.html Copyright © 2009 Pearson Education, Inc. Slide 1- 79 A survey about candy-copping parents!
Hypotheses: Ho: p = 0.50, Ha: p > 0.50. Assumptions and Conditions: Independence: We’ll assume that Mandala knows how to do this. 10 successes and failures? 102 and 98 are both greater than 10. Sampling less than 10% of parents of trick-or-treating age in our area. The test can proceed. Copyright © 2009 Pearson Education, Inc. Slide 1- 80 Our hypothesis test p = 0.3886. If Ho is true, we have a result this extreme or more 0.3886 (or 38.86%) of the time. This is perfectly reasonable. There is not enough to reject Ho at α = 0.10. We fail to reject. Copyright © 2009 Pearson Education, Inc. Slide 1- 81 A confidence interval The confidence interval contains proportions on both sides of 0.50. We cannot claim that a majority of parents steal candy from their children’s trick or treat bags because there are proportions less than 0.5 in the interval. Copyright © 2009 Pearson Education, Inc. Slide 1- 82 Hypothesis Tests and Confidence Intervals If the stated proportion is inside a (1 − α)·100% confidence interval, then a two-sided hypothesis will fail to be rejected at the α level. If the stated proportion is inside a 95% confidence interval, then a two-sided hypothesis will not be rejected at the 0.05 level. The rule breaks down if the test is one-sided, or if the percents don’t match; for example, a 95% confidence level and a rejection decision at the 0.01 level. Copyright © 2009 Pearson Education, Inc. Slide 1- 83
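(A final addition of mine.) The Halloween numbers can be verified the same way as the earlier examples; the survey counts are as given on the slides:

```python
from math import sqrt, erfc

# Ho: p = 0.50 vs Ha: p > 0.50, with 102 of 200 parents admitting it.
x, n, p0 = 102, 200, 0.50
p_hat = x / n                        # 0.51
sd = sqrt(p0 * (1 - p0) / n)         # about 0.0354
z = (p_hat - p0) / sd                # about 0.283
p_value = 0.5 * erfc(z / sqrt(2))    # upper-tail area
print(round(p_value, 4))  # 0.3886; nowhere near alpha = 0.10, so we fail to reject Ho
```

A result this unremarkable is exactly what the confidence interval told us: with proportions on both sides of 0.50 in the interval, we cannot claim a majority.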