STAT 1450 COURSE NOTES – CHAPTER 15
TESTS OF SIGNIFICANCE: THE BASICS

Connecting Chapter 15 to our Current Knowledge of Statistics

Chapters 10 & 12 used information about a population to answer questions about a sample (e.g., if 20% of people are smokers, what is the probability that a random sample of 2 people both smoke?). Now with inference, we have statistics problems where we use the information about a sample to answer questions concerning the population.

If we want to ___________________________________, then we should use statistics to create a _____________________. If, on the other hand, we want to _________________ provided by data ____________________ concerning a population parameter, we need to conduct a ______________________.

15.1 The Reasoning of Tests of Significance

We are now asking how an event would behave if a phenomenon were repeated numerous times. We will begin by working with simple random samples of data from Normal populations with known standard deviations.

Situation: People drink coffee for a variety of professional and, now, social reasons. Coffee used to be merely a beverage option on the menu. Now it is the main attraction for a growing number of restaurants and shoppes. The standard “cup of coffee” is 8 oz. However, even a Tall at Starbucks is 12 oz.

Please answer the following questions:
a) How many ounces of coffee do you think people typically drink each day? _________
b) How many ounces of coffee do you drink daily? _________

Note: We will only consider the population of regular coffee drinkers.

15.2 Stating Hypotheses

We have two possible hypotheses about this situation:
1. The mean amount of coffee consumed daily is not different from the value listed in a).
2. The mean amount of coffee consumed daily is different from the value listed in a).

These hypotheses have names:
1. The _______________________ is the claim tested about the population parameter.
The test is designed to assess the strength of the _______________________ the null hypothesis. Usually the null hypothesis is a statement of “no effect” or “no difference.” It commonly assumes the ___________________________________.
2. The _________________________________ is the claim about the population parameter that we are trying to find ______________________.

An alternative hypothesis is ________________ if it states that a parameter is _____________________ or _______________________ the null hypothesis value. It is _________________ if it states that the parameter is __________________ the null value (it could be either smaller or larger).

Question: Is the alternative hypothesis in our situation one-sided or two-sided?

Example: Let’s use one of the values from a) to compose the null and alternative hypotheses.

Data: Suppose it is known that the standard deviation for daily coffee consumption is 9.2 oz. The average amount of coffee consumed daily for a random sample of 48 people is 26.31 oz.

True or False: We can conclude that the average amount of coffee consumed daily is different from our hypothesized value.

Important note: Base your alternative hypothesis on your question of interest; do not base it on the data.

15.3 P-value & Statistical Significance

A ___________________________ calculated from the sample data measures how far the data depart from what we would expect if the null hypothesis were true. The further this statistic is from 0, the more the data contradict the null hypothesis.

Note: A test statistic tells us how many standard deviations our value is away from the hypothesized mean. A positive test statistic means our value is above the hypothesized mean. A negative one means it is below. We then use this information to figure out how likely it is to see results like ours if the null hypothesis were true.
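As a rough illustration of the note above, the sketch below counts how many standard deviations (of the sample mean) separate the observed coffee average from a hypothesized mean. It uses the data given in this chapter (standard deviation 9.2 oz, n = 48, sample mean 26.31 oz) and the one-sample z statistic defined in Section 15.4. The hypothesized mean of 24 oz is only an assumed stand-in for whatever value was written in a), and the function name is hypothetical.

```python
import math


def z_statistic(x_bar, mu_0, sigma, n):
    """Distance from x_bar to mu_0, measured in standard deviations of the sample mean."""
    standard_error = sigma / math.sqrt(n)  # standard deviation of the sample mean
    return (x_bar - mu_0) / standard_error


# Coffee data from the notes; mu_0 = 24.0 is an ASSUMED placeholder for the answer to a).
z = z_statistic(x_bar=26.31, mu_0=24.0, sigma=9.2, n=48)
print(round(z, 2))  # positive: the sample mean lies above the hypothesized mean
```

With a different answer to a), only `mu_0` changes; a hypothesized value above 26.31 oz would give a negative test statistic.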
The probability, computed assuming that the null hypothesis is true, that the test statistic would take a value as extreme or more extreme than that actually observed is called the _________________________ of the test.

If the _____________ is ___________ enough, the data we observed would be ________ (very unlikely to have happened) if the null hypothesis were true.

If the _________ is _______________ enough, the data we observed are ____________ at all (could plausibly have happened due to sampling variability) if the null hypothesis were true.

Tests of Significance & the Justice System

Tests of Significance | Justice System
Null Hypothesis | The defendant gets the “benefit of the doubt” (i.e., they are not guilty).
Alternative Hypothesis | They are “guilty.”
Test Statistic | Totality of evidence collected.
P-value | The probability of observing evidence as extreme as what was collected, under the assumption that the defendant is, indeed, “not guilty.”
Decision, when the evidence collected seems “likely” (based upon the null hypothesis) | The jury rules that the defendant is “not guilty.”
Decision, when the evidence collected seems “extremely unlikely” (based upon the null hypothesis) | Either we have “bad” data (mistrial, tampering, etc.) or the jury rules that the defendant is “guilty.”

Note: Our jury system assumes innocent until proven guilty. The actual truth of whether the person indeed committed the crime may never be known.

Question: What is the cut-off between “likely,” “unlikely,” and “extremely unlikely”?

If the P-value is as small or smaller than α, we say that the data are statistically significant at level α. The quantity α is called the significance level or the level of significance.

P-value vs. α | Decision about H0
P-value > α | ______________ H0
P-value ≤ α | ______________ H0

Question: Why should a significance level be set before the test has been done?

The test statistic for hypothesis testing is based upon our work from sampling distributions and confidence intervals.
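The comparison of a P-value with the significance level can be sketched in a few lines of Python. The function name `decide`, the example P-values, and the choice of α = 0.05 are illustrative assumptions rather than part of the course materials. The two possible decisions are worded as they appear in the 4-step process later in this chapter.

```python
def decide(p_value, alpha):
    """Return the decision about H0 for a given P-value and significance level."""
    if not 0 <= p_value <= 1 or not 0 < alpha < 1:
        raise ValueError("p_value must be in [0, 1] and alpha in (0, 1)")
    # Data are statistically significant at level alpha when P-value <= alpha.
    return "Reject H0" if p_value <= alpha else "Fail to reject H0"


alpha = 0.05  # ASSUMED level, fixed before looking at any P-value
for p in (0.003, 0.05, 0.20):  # hypothetical P-values for illustration
    print(p, decide(p, alpha))
```

Fixing `alpha` before the loop mirrors the advice in the question above: the cut-off is chosen first, and each P-value is then judged against it.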
15.4 Tests for a Population Mean

Tests of significance allow researchers to assess the validity of certain hypotheses based upon P-values. There are various parameters that we can test (proportions, standard deviations, etc.). We will begin with the most common parameter to be tested, the mean, much like how we began our confidence interval discussion by estimating the true mean, μ.

Draw an SRS of size n from a large population that has the Normal distribution with mean μ and standard deviation σ. The one-sample z statistic

z = (x̄ − μ) / (σ/√n)

has the z distribution (the standard Normal distribution).

To test the hypothesis H0: μ = μ0, compute the ______________________

z = (x̄ − μ0) / (σ/√n)

Key Words | Alternative Hypothesis | P-Value | Rejection Region
“more than,” “increased” | Ha: μ > μ0 | P(Z ≥ z) | ______________
“less than,” “reduced” | Ha: μ < μ0 | P(Z ≤ z) | ______________
“different,” “is not” | Ha: μ ≠ μ0 | 2*P(Z ≥ |z|) | ______________

Back to our example…

Example: The standard deviation of daily coffee consumption is 9.2 oz. A random sample of 48 people consumed an average of 26.31 oz. of coffee daily. Is this evidence that the average amount of coffee consumed daily is different from our original estimate?

Poll: Using your intuition, do you believe we have enough evidence against our original claim?
(a) Yes
(b) No

Let’s conduct the test of significance.

Technology Tips – Conducting Tests of Significance (σ known)

TI-83/84: STAT → TESTS → Z-Test → ENTER. Select Stats. Enter μ0, σ, x̄, and n. Select Calculate. (Note: Select Data when x̄ and n are not provided. Then enter the list where the data are stored.)

JMP: Enter the data. Analyze → Distribution. “Click-and-Drag” (the appropriate variable) into the ‘Y, Columns’ box. Click on OK. Click on the red upside-down triangle next to the title of the variable from the ‘Y, Columns’ box. Proceed to ‘Test Mean.’ Enter μ0, and click on OK.

The 4-Step Process As Applied to Tests of Significance

1. ________: What is the practical question that requires a statistical test?
2. ________:
a) Identify the parameter.
b) List all given information from the data collected. n: ________________
c) State the null (H0) and alternative (HA) hypotheses. H0: _________________ HA: _________________
d) Specify the level of significance, α = __________
e) Determine the type of test: Left-tailed / Right-tailed / Two-tailed
f) Sketch the region(s) of “extremely unlikely” test statistics.
3. _______:
a) Check the conditions for the test you plan to use. Random sample? Population : Sample ratio? Large enough for Normality?
b) Calculate the test statistic.
c) Determine (or estimate) the P-value.
4. _________:
a) Make a decision about the null hypothesis (Reject H0 or Fail to reject H0).
b) Interpret the decision in the context of the original claim (i.e., “There is enough (or not enough) evidence at the α level of significance that …”).

Example: Recall that IQ scores from Chapter 14 followed a Normal distribution with σ = 15. You suspect that persons from affluent communities have IQ scores above 100. A random sample of 35 residents of an affluent community had an average IQ score of 112. Is there significant evidence to support your claim at the α = .05 level?

The 4-Step Process As Applied to Tests of Significance

1. State: What is the practical question that requires a statistical test?
2. Plan:
a) Identify the parameter.
b) List all given information from the data collected. n: ____________________
c) State the null (H0) and alternative (HA) hypotheses. H0: ____________________ HA: ____________________
d) Specify the level of significance, α = _____________________
e) Determine the type of test: Left-tailed / Right-tailed / Two-tailed
f) Sketch the region(s) of “extremely unlikely” test statistics.
3. Solve:
a) Check the conditions for the test you plan to use. Random sample? Population : Sample ratio? Large enough for Normality?
b) Calculate the test statistic.
c) Determine (or estimate) the P-value.
4.
Conclude:
a) Make a decision about the null hypothesis (Reject H0 or Fail to reject H0).
b) Interpret the decision in the context of the original claim (i.e., “There is enough (or not enough) evidence at the α level of significance that …”).

Note: Some homework exercises will provide you with raw data. You are to use the data to compute the sample mean and/or standard deviation. Then proceed with computing the confidence interval or performing a test of significance.

15.5 Significance from a Table

The graphing calculator and JMP provide the most accurate P-value calculations. Tables can also be used to estimate P-values. There are two methods of determining the P-value for a z-statistic.

Table C:
1. Compare z with the critical values z* at the bottom of Table C.
2. If z falls between two values of z*, then the P-value falls between the two corresponding values of P in the “One-sided P” or the “Two-sided P” row of Table C.

Table A:
1. Compute the P-value, which is:
a) P(Z > z) for a right-tailed test.
b) P(Z < z) for a left-tailed test.
c) 2*P(Z > |z|) for a two-tailed test.
2. Compare the P-value with α.

Using technology to compute P-values is preferred. For our purposes, using Table C is a suitable alternative to technology. This may not produce the same accuracy as the other options, but it will strengthen estimation skills.

Example: The z-statistic for a left-tailed test is z = -1.45. How significant is this result?

Five-Minute Summary: List at least 3 concepts that had the most impact on your knowledge of tests of significance.
_______________ ______________ ____________
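For readers who want to check table-based answers with software, here is a minimal Python sketch of the Table A method, applied to the left-tailed example (z = -1.45) and to the IQ example from Section 15.4 (σ = 15, n = 35, sample mean 112, hypothesized mean 100, right-tailed). The function names and the `tail` labels are hypothetical choices for this sketch. Python's standard-library `NormalDist` stands in for Table A and is not one of the tools named in the notes.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard Normal: mean 0, standard deviation 1


def z_statistic(x_bar, mu_0, sigma, n):
    """One-sample z statistic: (x_bar - mu_0) / (sigma / sqrt(n))."""
    return (x_bar - mu_0) / (sigma / math.sqrt(n))


def p_value(z, tail):
    """P-value for a 'left', 'right', or 'two' tailed z test."""
    if tail == "left":
        return STD_NORMAL.cdf(z)                 # P(Z < z)
    if tail == "right":
        return 1 - STD_NORMAL.cdf(z)             # P(Z > z)
    if tail == "two":
        return 2 * (1 - STD_NORMAL.cdf(abs(z)))  # 2*P(Z > |z|)
    raise ValueError("tail must be 'left', 'right', or 'two'")


# Left-tailed example from Section 15.5.
p_left = p_value(-1.45, "left")
print(round(p_left, 4))  # 0.0735

# IQ example: H0: mu = 100 vs. HA: mu > 100, alpha = .05.
z_iq = z_statistic(x_bar=112, mu_0=100, sigma=15, n=35)
p_iq = p_value(z_iq, "right")
print(round(z_iq, 2))    # 4.73
print(p_iq <= 0.05)      # True
```

Under these calculations, z = -1.45 would be significant at the α = 0.10 level but not at α = 0.05, while the IQ result falls well below the .05 cut-off.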