10.2 ap stats new.notebook August 31, 2009

10.2 Tests of Significance

A test of significance asks whether some difference or effect is real or is just due to chance sampling variability.

A significance test assesses evidence from sample data against some claim about a parameter.

Two ways to test claims or hypotheses about a parameter:
1. Confidence Intervals
2. Significance Tests

Tests of significance are based on sampling distributions (ch. 9) and standard scores (like z) and tell us how much evidence we have against some claim.

US Court System
1. Assume innocent.
2. Determine legal proceedings.
3. Assess evidence against innocence.
4. Make a decision about guilt.

Significance Testing
1. Assume some claim is true.
2. Determine procedure to follow.
3. Find the probability of getting such a statistic.
4. Make a decision about the claim.

A few more specifics about Significance Tests
1. Assume that some claim is true.
2. Find your statistic and look at the sampling distribution.
3. Find the probability of getting a statistic like that or more extreme.
4. Is the evidence convincing?

An outcome that is very unlikely, assuming that the claim is true, is good evidence that the claim should be questioned or rejected.

It's not monkey business, it's logic!

Hypothesis:
• claim about a population &/or parameter.
• what we assume is true or want to prove

Never write a hypothesis about a statistic!
Writing a hypothesis about a statistic is immediately

Null Hypothesis Ho

A statistical test begins by supposing the effect we want is not present. Then we try to find evidence against this assumption.

The claim of no change, no effect, or no difference is the null hypothesis (the null is dull) and is symbolized as Ho. We then evaluate how much evidence there is against the null hypothesis.
The null hypothesis is the statement being tested. See page 565 for more about Ho.

Alternative Hypothesis Ha

The statement that there is a change, an effect, or a difference is the alternative hypothesis and is symbolized as Ha.

If we find strong evidence against the null hypothesis (Ho), then we decide the alternative hypothesis (Ha) is probably the true state of things.

More about hypotheses and claims.

The null hypothesis is just that: "null". Nothing interesting, nothing has changed, no difference, not effective, etc. The burden of proof resides in the alternative hypothesis. Someone might claim that something has changed or that it has not. Either way the null is no change.

Ex. Suppose we have a medication that's proven effective, but has now been slightly reformulated.

Scenario 1: We've added another ingredient and now we claim the new version is even more effective.
Ho: The cure rate has not changed
Ha: The cure rate is higher

Scenario 2: We've deleted an ingredient that was causing upset stomach in some people and now we claim the new version is still just as effective.
Ho: The cure rate has not changed
Ha: The cure rate is lower

In both cases the null is the same, but in the first case the claim is in the alternative hypothesis and in the second it's in the null. What's different is the strength of the conclusion we can reach.

In the first case, rejecting the null offers strong evidence that the claim is true (not quite a definitive proof of the claim, but certainly leaning in that direction).

In the second, failing to reject the null merely says we lack evidence things have changed. This is nowhere near proof that the claim is true. We'll continue to hold the conjecture of equivalent effectiveness, but we certainly haven't proven it (which is why we never "accept" a null hypothesis).
Hypotheses in words

Null Hypothesis Ho: The true population proportion or mean of what you are trying to study equals the guess of the parameter.

Alternative Hypothesis Ha: The true population proportion or mean of what you are trying to study is greater than, less than, or not equal to the guess of the parameter.

Hypothesis Testing

State both hypotheses at the beginning of EVERY statistical test.
Only one hypothesis will be declared likely TRUE at the end of the test. We can't really prove that one is true, though.

reject Ho
• implies there's sufficient evidence that the Ho is false
• guilty
• implies there's sufficient evidence to convict

fail to reject Ho
• implies there's insufficient evidence that the Ho is false
• not guilty
• implies there's insufficient evidence to convict

accept Ho
• implies proof that Ho is true
• innocent
• implies proof that one is innocent

Hypotheses for tests about the mean
Ho: µ = µo    Ha: µ < µo
Ho: µ = µo    Ha: µ > µo
Ho: µ = µo    Ha: µ ≠ µo

How we quantify the strength of evidence: the P-value

The P-value is
• Probability that the test statistic takes a value at least as extreme as the one observed.
• A conditional probability
• P(seeing this statistic (or one more extreme) | Ho is true)
• The probability of seeing a statistic at least this extreme if the Ho is true.

A small P-value is strong evidence against Ho. How small? That depends on what we're testing.

page 567
The probability, computed assuming that Ho is true, that the observed outcome would take a value as extreme or more extreme than that actually observed is called the P-value of the test. The smaller the P-value is, the stronger the evidence against Ho provided by the data.
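One way to make the conditional-probability definition concrete is an exact binomial calculation. The sketch below uses a hypothetical setup: 20 coin tosses, with Ho saying the coin is fair (p = 0.5) and Ha saying it favors heads. The function name binomial_upper_tail and the heads counts are illustrative assumptions, not part of the notes.

```python
from math import comb


def binomial_upper_tail(k: int, n: int, p: float = 0.5) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p).

    This is the one-sided P-value for seeing k successes (or more)
    when Ho says the success probability is p.
    """
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Hypothetical data: number of heads in 20 tosses, Ho: the coin is fair.
tail_probs = {heads: binomial_upper_tail(heads, 20) for heads in range(11, 16)}

for heads, p_value in tail_probs.items():
    print(f"{heads} heads in 20 tosses: P-value = {p_value:.4f}")
```

Under these assumptions, the P-value shrinks as the count moves farther from the 10 heads expected under Ho: 14 heads gives roughly 0.058 and 15 heads roughly 0.021.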
page 569
If we decide in advance that some particular probability is decisive, we use the Greek letter alpha, α, to indicate this level of evidence against Ho that we will insist upon. If the P-value is as small or smaller than alpha, α, we say that the data are statistically significant at level α.

Statistical Significance is
• a way to say how much evidence we need to make some decision.
• based on α, the significance level, the complement of a confidence level
• declared if the P-value is as small or smaller than α.

Evidence Against Null Hypothesis    P-value
"Some"                              0.05 < P < 0.10
"Moderate"                          0.01 < P < 0.05
"Strong"                            P < 0.01

but, we have to be careful about being rigid concerning α (alpha).

Ex. Most of us have a pretty good built-in feel for α. We have a coin we suspect is biased. We agree that we expect 10 heads in 20 tosses. Almost no one would suspect it biased if 11 heads came up. That's just ordinary random variability. How about 12 heads? That's probably still a reasonable number for a fair coin. At 13, some of us might suspect the coin is not fair. Most of us would really start to question the coin's fairness at 14 and 15 heads. If we check the binomial probability of getting at least 14 or at least 15 heads in 20 tosses of a fair coin (about 0.06 and 0.02), we find that most of us reject the assumption of a fair coin at a probability close to the conventional 0.01 to 0.05 range.

A Significance level is
• the P-value cutoff used as the criterion for deciding whether or not to reject Ho.
• also called the α (alpha) level.
• generally set at .05 or .01.
• the area of the rejection region (also called the critical region) of a distribution.

What if alpha is not given?

NOBODY will argue with α = 0.05, but if you think something else is more appropriate, just justify it by appealing to the context.
(Since this requires a longer answer that will eat up time on the AP Stat test, it is NOT recommended unless the question specifically asks you to address that issue.)

From now on, always follow one of these methods:

The inference toolbox (page 571)
Step 1: ID the population and the parameter we want to draw conclusions about.
Step 2: Choose the appropriate inference procedure and verify the conditions for the procedure.
Step 3: If the conditions are met, find the test statistic and the P-value.
Step 4: Interpret the results in the context of the problem.

PHANTOMS (my usual method)
Population and parameter
Hypotheses in context
Assumptions/conditions verified
Name of the procedure
Test statistic found
Obtain a P-value
Make a decision about Ho
State a conclusion in context

A comparison of PHANTOMS and the Inference Toolbox

Inference toolbox step    "PHANTOMS"
1                         P parameter
                          H hypotheses
2                         A assumptions
3                         N name of test
                          T test statistic
                          O obtain P-value
                          M make decision
4                         S state conclusion in context

Another method you can use is PHAT DC:
1. state Parameter of interest
2. state null and alternate Hypotheses
3. state and verify Assumptions
4. choose a Test, then construct and evaluate a test statistic
5. Decide based on the test statistic whether to reject Ho
6. interpret test results in the Context of the situation

Remember, that's PHAT DC:
Parameter
Hypotheses
Assumptions
Test choice and Test statistic
Decisions
Conclusion in context

Other people like to use this: "CATCH" spelled backward
H hypotheses and parameter
C check conditions
T test statistic
A alpha and P-value
C conclusion in context

Details! Details! No matter what method you use...
• Show how you verified the assumptions/conditions. Don't just list them and check them off.
• Specify the procedure, by name or formula.
• Show the values substituted into the formula. (It's an insurance policy: if you do this but push a wrong calculator button, you might still get full credit.)
• Be sure your conclusion is in context. Generic statements don't earn you points.

page 572
To test the hypothesis Ho: µ = µo based on an SRS of size n from a population with unknown mean µ and known standard deviation σ, compute the one-sample z statistic

z = (x̄ − µo) / (σ/√n)

The P-value for a test of Ho against
Ha: µ > µo is P(Z > z)
Ha: µ < µo is P(Z < z)
Ha: µ ≠ µo is 2P(Z > |z|)

The P-values are exact if the population distribution is Normal. Otherwise, they are approximate for large n.

Ex. Company A uses bug repellent for its workers. The current repellent protects for 8 hours. A cheaper repellent is being considered. Assume α = .05, that effective times are Normally distributed, and that σ = 2 hours. Is there evidence that the cheaper repellent protects less than 8 hours? The cheaper repellent protected an SRS of 15 workers for an average of 7 hours.

Ex. Company B also uses bug repellent for its workers. The current repellent protects for 8 hours. A plant-based repellent is being considered. Assume α = .05, that effective times are Normally distributed, and that σ = 2 hours. Is there evidence that the plant-based repellent is any different from the current one? The plant-based repellent protected an SRS of 15 workers for an average of 8.5 hours.
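The one-sample z procedure can be checked numerically for both repellent examples. This is a minimal sketch, assuming Python's standard-library NormalDist for the standard Normal curve; the function name one_sample_z_test and the alternative labels "less", "greater", and "two-sided" are hypothetical names chosen for illustration. Software values can differ from table lookups in the third decimal place.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_z_test(xbar, mu0, sigma, n, alternative):
    """Return (z, P-value) for Ho: mu = mu0 when sigma is known.

    alternative is "less", "greater", or "two-sided"
    (illustrative labels for the three forms of Ha).
    """
    z = (xbar - mu0) / (sigma / sqrt(n))
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if alternative == "less":
        p_value = std_normal.cdf(z)
    elif alternative == "greater":
        p_value = 1 - std_normal.cdf(z)
    elif alternative == "two-sided":
        p_value = 2 * (1 - std_normal.cdf(abs(z)))
    else:
        raise ValueError("alternative must be 'less', 'greater', or 'two-sided'")
    return z, p_value


alpha = 0.05

# Company A: Ha: mu < 8 (cheaper repellent protects less than 8 hours).
z_a, p_a = one_sample_z_test(xbar=7.0, mu0=8.0, sigma=2.0, n=15, alternative="less")

# Company B: Ha: mu != 8 (plant-based repellent is different).
z_b, p_b = one_sample_z_test(xbar=8.5, mu0=8.0, sigma=2.0, n=15, alternative="two-sided")

for name, z, p in [("Company A", z_a, p_a), ("Company B", z_b, p_b)]:
    decision = "reject Ho" if p <= alpha else "fail to reject Ho"
    print(f"{name}: z = {z:.3f}, P-value = {p:.4f}, {decision}")
```

With these inputs, Company A's P-value comes out near 0.026, which is below α = .05, while Company B's comes out near 0.333, which is not. A full answer would still need the hypotheses, the condition checks, and a conclusion in context.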
P-value = 2(.166) = .332

At the .05 level of significance, we fail to reject Ho: there is not enough evidence that the mean effective time of the plant-based repellent is different from 8 hours.

A couple ways to remember whether you should reject Ho:

P-value method
P = person, α = already-set-up hurdle
P > α: cleared the hurdle, so fail to reject Ho
P ≤ α: hit the hurdle, so reject Ho

test statistic method
The shaded critical region has area α and starts at the critical value z*, which is based on α.
If the test statistic falls in the shaded critical region (rejection region), then we reject Ho.
If the test statistic falls in the unshaded acceptance region, then we fail to reject Ho.

If the T(wisted) S(ister) goes shopping and the lights go out, she rejects her purchases; who would shop in the dark?
If the T(wisted) S(ister) goes shopping and the store is brightly lit, and not shaded, she keeps her purchases (fails to reject her purchases).

page 578
the critical value method

To test the hypothesis Ho: µ = µo based on an SRS of size n from a population with unknown mean µ and known standard deviation σ, compute the one-sample z statistic.

Reject Ho at significance level α against a one-sided alternative
Ha: µ > µo if z > z*
Ha: µ < µo if z < −z*
where z* is the upper α critical value from Table C.

Reject Ho at significance level α against a two-sided alternative
Ha: µ ≠ µo if |z| > z*
where z* is the upper α/2 critical value from Table C.

We can also use a CI for a significance test.

page 581
A level α two-sided significance test rejects a hypothesis Ho: µ = µo exactly when the value µo falls outside a level 1 − α confidence interval for µ.

Basically, if the hypothesized value for µ is outside the CI, then we reject the Ho. This works when Ha says that µ ≠ µo. The significance level will be α if the confidence level is 1 − α.

Ex.
Survey of Study Habits and Attitudes (SSHA)
• a psychological test
• measures motivation, attitude toward school, & study habits
• scores range from 0 to 200
• for U.S. college students µ ≈ 115 and σ ≈ 30

A teacher suspects that older students have better attitudes toward school & gives SSHA to 20 students over age 30. The sample mean is x̄ = 135.2. Assume σ = 30 for the older students.

Construct a 90% confidence interval for the mean score µ for older students.

Is there evidence that the mean score for older students differs from that of the general college population?
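The confidence-interval approach to the SSHA question can be sketched as follows. This assumes the 20 older students can be treated as an SRS and that the sampling distribution of the mean is approximately Normal; neither is verified here. The function name z_confidence_interval is a hypothetical helper, and the standard-library NormalDist stands in for Table C.

```python
from math import sqrt
from statistics import NormalDist


def z_confidence_interval(xbar, sigma, n, confidence):
    """Return (low, high, z_star) for a z interval with known sigma."""
    alpha = 1 - confidence
    # Upper alpha/2 critical value of the standard Normal distribution.
    z_star = NormalDist().inv_cdf(1 - alpha / 2)
    margin = z_star * sigma / sqrt(n)
    return xbar - margin, xbar + margin, z_star


xbar, sigma, n = 135.2, 30.0, 20
mu0 = 115.0  # hypothesized mean: the general college population

low, high, z_star = z_confidence_interval(xbar, sigma, n, confidence=0.90)

# Two-sided test at alpha = 0.10 via the interval: reject Ho when mu0 is outside.
reject_by_interval = not (low <= mu0 <= high)

# The same decision via the critical value method: reject Ho when |z| > z*.
z = (xbar - mu0) / (sigma / sqrt(n))
reject_by_critical_value = abs(z) > z_star

print(f"90% CI for mu: ({low:.2f}, {high:.2f})")
print(f"z = {z:.3f}, z* = {z_star:.3f}")
print("reject Ho" if reject_by_interval else "fail to reject Ho")
```

Under these assumptions the interval runs from about 124.17 to 146.23, so 115 falls outside it, and the critical value comparison gives the same decision at α = 0.10. The conclusion in context is left to the written answer.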