Statistics Week 8
Fundamentals of Hypothesis Testing: One-Sample Tests

Goals of this note
After completing this note, you should be able to:
Formulate null and alternative hypotheses for applications involving a single population mean or proportion
Formulate a decision rule for testing a hypothesis
Use the p-value approach to test the null hypothesis for both mean and proportion problems
Explain what Type I and Type II errors are

What is a Hypothesis?
A hypothesis is a claim (assumption) about a population parameter:
Population mean: The average number of TV sets in U.S. homes is equal to three (μ = 3).
Population proportion: A marketing company claims that it receives 8% responses from its mailing (p = .08).

The Null Hypothesis, H0
States the assumption to be tested. Example: The average number of TV sets in U.S. homes is equal to three (H0: μ = 3).
Is always about a population parameter, not about a sample statistic: H0: μ = 3 is a valid null hypothesis; H0: X̄ = 3 is not.

The Null Hypothesis, H0 (continued)
Begins with the assumption that the null hypothesis is true
Similar to the notion of innocent until proven guilty
Refers to the status quo
Always contains the "=", "≤" or "≥" sign
May or may not be rejected

The Alternative Hypothesis, H1
Is the opposite of the null hypothesis, e.g.: The average number of TV sets in U.S. homes is not equal to 3 (H1: μ ≠ 3)
Challenges the status quo
Never contains the "=", "≤" or "≥" sign
Is generally the hypothesis that is believed (or needs to be supported) by the researcher

Hypothesis Testing
We assume the null hypothesis is true.
If the null hypothesis is rejected, we conclude in favor of the alternative hypothesis.
If the null hypothesis is not rejected, nothing has been proven; the sample size may simply have been too small.

Hypothesis Testing Process
Claim: the population mean age is 50 (null hypothesis H0: μ = 50).
Is a sample mean of X̄ = 20 likely if μ = 50?
If not likely, reject the null hypothesis.
The process: select a random sample from the population; suppose the sample mean age is X̄ = 20.

Sampling Distribution of X̄
There are two cutoff values (critical values), defining the regions of rejection.
H0: μ = 50
H1: μ ≠ 50
[Figure: sampling distribution of X̄ centered at μ = 50. Rejection regions of area α/2 lie below the lower critical value and above the upper critical value; the do-not-reject region of likely sample results lies between them.]

Level of Significance, α
Defines the unlikely values of the sample statistic if the null hypothesis is true
Defines the rejection region of the sampling distribution
Is designated by α (level of significance)
Typical values are .01, .05, or .10
Is the complement of the confidence coefficient
Is selected by the researcher before sampling
Provides the critical value of the test

Level of Significance and the Rejection Region
Level of significance = α
Two-tailed test: H0: μ = 3, H1: μ ≠ 3. Rejection region of area α/2 in each tail.
Upper tail test: H0: μ ≤ 3, H1: μ > 3. Rejection region of area α in the upper tail.
Lower tail test: H0: μ ≥ 3, H1: μ < 3. Rejection region of area α in the lower tail.
[Figures: in each case the rejection region is the shaded tail area beyond the critical value.]

Errors in Making Decisions
Type I Error
  Occurs when a true null hypothesis is rejected
  The probability of a Type I Error is α
  α is called the level of significance of the test
  Set by the researcher in advance
Type II Error
  Failure to reject a false null hypothesis
  The probability of a Type II Error is β

Example: Possible Jury Trial Outcomes

                     The Truth
Verdict      Innocent        Guilty
Innocent     No error        Type II error
Guilty       Type I error    No error

Outcomes and Probabilities
Possible Hypothesis Test Outcomes
Key: Outcome (Probability)

                          Actual Situation
Decision             H0 True               H0 False
Do not reject H0     No error (1 - α)      Type II error (β)
Reject H0            Type I error (α)      No error (1 - β)

Type I & II Error Relationship
Type I and Type II errors cannot happen at the same time
Type I error can only occur if H0 is true
Type II error can only occur if H0 is false
If the Type I error probability (α) increases, then the Type II error probability (β) decreases

p-Value Approach to Testing
p-value: the probability of obtaining a test statistic as extreme or more extreme (≤ or ≥) than the observed
sample value, given that H0 is true.
Also called the observed level of significance

p-Value Approach to Testing (continued)
Convert the sample statistic (e.g. X̄) to a test statistic (e.g. t statistic)
Obtain the p-value from a table or computer
Compare the p-value with α
  If p-value < α, reject H0
  If p-value ≥ α, do not reject H0

8 Steps in Hypothesis Testing
1. State the null hypothesis, H0, and the alternative hypothesis, H1
2. Choose the level of significance, α
3. Choose the sample size, n
4. Determine the appropriate test statistic to use
5. Collect the data
6. Compute the p-value for the test statistic from the sample result
7. Make the statistical decision: reject H0 if the p-value is less than α
8. Express the conclusion in the context of the problem

Hypothesis Tests for the Mean
Hypothesis tests for μ fall into two cases: σ known and σ unknown.

Hypothesis Testing Example
Test the claim that the true mean number of TV sets in U.S. homes is equal to 3.
1. State the appropriate null and alternative hypotheses: H0: μ = 3, H1: μ ≠ 3 (this is a two-tailed test).
2. Specify the desired level of significance: suppose that α = .05 is chosen for this test.
3. Choose a sample size: suppose a sample of size n = 100 is selected.

Hypothesis Testing Example (continued)
4. Determine the appropriate test: σ is unknown, so this is a t test.
5. Collect the data: suppose the sample results are n = 100, X̄ = 2.84, s = 0.8.
6. So the test statistic is:
   $t = \frac{\bar{X} - \mu}{s/\sqrt{n}} = \frac{2.84 - 3}{0.8/\sqrt{100}} = \frac{-.16}{.08} = -2.0$
   The two-tailed p-value for t = -2.0 with n = 100 (99 degrees of freedom) is .048.

Hypothesis Testing Example (continued)
7. Is the test statistic in the rejection region? Reject H0 if the p-value is < α; otherwise do not reject H0. The p-value .048 is less than α = .05, so we reject the null hypothesis.

Hypothesis Testing Example (continued)
8. Express the conclusion in the context of the problem. Since the p-value .048 is less than α = .05, we have rejected the null hypothesis, thereby supporting the alternative hypothesis.
Conclusion: There is sufficient evidence that the mean number of TVs in U.S.
homes is not equal to 3.
If we had failed to reject the null hypothesis, the conclusion would have been: There is not sufficient evidence to reject the claim that the mean number of TVs in U.S. homes is 3.

One Tail Tests
In many cases, the alternative hypothesis focuses on a particular direction.
H0: μ ≥ 3, H1: μ < 3. This is a lower tail test, since the alternative hypothesis is focused on the lower tail below the mean of 3.
H0: μ ≤ 3, H1: μ > 3. This is an upper tail test, since the alternative hypothesis is focused on the upper tail above the mean of 3.

Lower Tail Tests
H0: μ ≥ 3
H1: μ < 3
There is only one critical value, since the rejection area is in only one tail.
[Figure: sampling distribution of X̄ centered at 3. Reject H0 below the critical value -tα (area α); do not reject H0 above it.]

Upper Tail Tests
H0: μ ≤ 3
H1: μ > 3
There is only one critical value, since the rejection area is in only one tail.
[Figure: sampling distribution of X̄ centered at 3. Do not reject H0 below the critical value tα; reject H0 above it (area α).]

Assumptions of the One-Sample t Test
The data are randomly selected
The population is normally distributed, or the sample size is over 30 and the population is not highly skewed

Hypothesis Tests for Proportions
Involves categorical values
Two possible outcomes:
  "Success" (possesses a certain characteristic)
  "Failure" (does not possess that characteristic)
The fraction or proportion of the population in the "success" category is denoted by p

Proportions (continued)
The sample proportion in the success category is denoted by ps:
   $p_s = \frac{X}{n} = \frac{\text{number of successes in sample}}{\text{sample size}}$
When both np and n(1-p) are at least 5, the distribution of ps can be approximated by a normal distribution with mean and standard deviation
   $\mu_{p_s} = p, \qquad \sigma_{p_s} = \sqrt{\frac{p(1-p)}{n}}$

Hypothesis Tests for Proportions
The sampling distribution of ps is approximately normal, so the test statistic is a Z value:
   $Z = \frac{p_s - p}{\sqrt{\frac{p(1-p)}{n}}}$
Hypothesis tests for p fall into two cases:
  np ≥ 5 and n(1-p) ≥ 5: use the Z test above
  np < 5 or n(1-p) < 5: not discussed in this chapter

Example: Z Test for Proportion
A marketing company claims that it receives 8% responses from its mailing.
To test this claim, a random sample of 500 was surveyed, yielding 25 responses. Test at the α = .05 significance level.
Check: np = (500)(.08) = 40 and n(1-p) = (500)(.92) = 460, so both are at least 5.

Z Test for Proportion: Solution
H0: p = .08
H1: p ≠ .08
α = .05
n = 500, ps = 25/500 = .05
Critical values: ±1.96 (area .025 in each tail)
Test statistic:
   $Z = \frac{p_s - p}{\sqrt{\frac{p(1-p)}{n}}} = \frac{.05 - .08}{\sqrt{\frac{.08(1 - .08)}{500}}} = -2.47$
The test statistic -2.47 falls below the lower critical value -1.96, in the rejection region.
The p-value for -2.47 is .0134.
Decision: Reject H0 at α = .05.
Conclusion: There is sufficient evidence to reject the company's claim of an 8% response rate.

Potential Pitfalls and Ethical Considerations
Use randomly collected data to reduce selection biases
Do not use human subjects without informed consent
Choose the level of significance, α, before data collection
Do not employ "data snooping" to choose between a one-tail and a two-tail test, or to determine the level of significance
Do not practice "data cleansing" to hide observations that do not support a stated hypothesis
Report all pertinent findings

Summary
Addressed hypothesis testing methodology
Discussed critical value and p-value approaches to hypothesis testing
Discussed Type I and Type II errors
Performed a two-tailed t test for the mean (σ unknown)
Performed a Z test for the proportion
Discussed one-tail and two-tail tests
Addressed pitfalls and ethical issues
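
Checking the worked examples numerically
The two worked examples in this note (the t test for the mean number of TV sets and the Z test for the mailing response rate) can be re-computed from their summary statistics. The sketch below is only an illustration, not part of the original slides: the function names (t_pdf, two_sided_t_p_value, two_sided_z_p_value) are made up for this note, and the t-distribution tail area is approximated with Simpson's rule so that nothing beyond the Python standard library is assumed. A statistics package or a printed table would normally be used instead.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_t_p_value(t_stat, df, steps=2000):
    """Two-tailed p-value for a t statistic.

    Assumption of this sketch: the area between 0 and |t| is approximated
    with Simpson's rule (steps must be even), then doubled and subtracted
    from 1 to leave the area in both tails.
    """
    upper = abs(t_stat)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = 2 * (h / 3) * total
    return max(0.0, 1.0 - central_area)


def two_sided_z_p_value(z_stat):
    """Two-tailed p-value for a Z statistic: 2 * P(Z >= |z|)."""
    return math.erfc(abs(z_stat) / math.sqrt(2))


alpha = 0.05

# Example 1: H0: mu = 3 vs H1: mu != 3, with n = 100, sample mean 2.84, s = 0.8
n, x_bar, s, mu0 = 100, 2.84, 0.8, 3
t_stat = (x_bar - mu0) / (s / math.sqrt(n))
p_t = two_sided_t_p_value(t_stat, df=n - 1)
print(f"t = {t_stat:.2f}, p-value = {p_t:.3f}, reject H0: {p_t < alpha}")

# Example 2: H0: p = .08 vs H1: p != .08, with 25 responses out of 500
n2, successes, p0 = 500, 25, 0.08
p_s = successes / n2
z_stat = (p_s - p0) / math.sqrt(p0 * (1 - p0) / n2)
p_z = two_sided_z_p_value(z_stat)
print(f"Z = {z_stat:.2f}, p-value = {p_z:.4f}, reject H0: {p_z < alpha}")
```

Both p-values should land close to the .048 and .0134 quoted in the examples, and both are below α = .05, which matches the decision to reject H0 in each case.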