Lesson 10 - R
Summary of Hypothesis Testing

Objectives
• Review Hypothesis Testing

Hypothesis Testing
• The process of hypothesis testing is very similar across the testing of different parameters
• The major steps in hypothesis testing are
  – Formulate the appropriate null and alternative hypotheses
  – Calculate the test statistic
  – Determine the appropriate critical value(s)
  – Reach the reject / do not reject conclusions

Similarities in hypothesis test processes

Parameter         Mean (σ known)   Mean (σ unknown)   Proportion   Variance     Std Dev
H0                μ = μ0           μ = μ0             p = p0       σ² = σ0²     σ = σ0
H1 (2-tailed)     μ ≠ μ0           μ ≠ μ0             p ≠ p0       σ² ≠ σ0²     σ ≠ σ0
H1 (L-tailed)     μ < μ0           μ < μ0             p < p0       σ² < σ0²     σ < σ0
H1 (R-tailed)     μ > μ0           μ > μ0             p > p0       σ² > σ0²     σ > σ0
Test statistic    Difference       Difference         Difference   Ratio        Ratio
Critical value    Normal           Student t          Normal       Chi-square   Chi-square

Chapter 10 – Section 1
If a researcher wishes to test a claim that the average weight of a white rhinoceros is 5,000 lbs, then she should state a null hypothesis of
1) H1: Average weight = 5,000 pounds
2) H0: Average weight = 5,000 pounds
3) H0: Average weight ≠ 5,000 pounds
4) H0 + H1: Average weight = 5,000 pounds

Chapter 10 – Section 1
If the hypotheses for a test are
H0: μ = 20 seconds
H1: μ < 20 seconds
then an example of a Type I error occurs when
1) μ = 20 seconds and we did not reject H0
2) μ = 15 seconds and we rejected H0
3) μ = 25 seconds and we did not reject H0
4) μ = 20 seconds and we rejected H0

Chapter 10 – Section 2
The classical approach rejects the null hypothesis H0: μ = 20 when
1) The sample mean is far (too many standard deviations) from 20
2) The sample mean is not equal to 20
3) The sample mean is close (too few standard deviations) to 20
4) The sample mean is equal to 20

Chapter 10 – Section 2
In the P-value approach, relatively small values of the P-value correspond to situations where
1) The classical approach does not apply
2) The null hypothesis H0 must be accepted
3) The null hypothesis H0 must be rejected
4) The probability of obtaining such a sample mean is relatively small

Chapter 10 – Section 3
When the population standard deviation σ is not known, then we should perform hypothesis tests using
1) The alternative hypothesis
2) The t-distribution
3) The normal distribution
4) The Type II Error

Chapter 10 – Section 3
In testing a claim regarding a population mean when σ is unknown, we
1) May use only the classical approach with the t-distribution
2) May use only the P-value approach with the t-distribution
3) May use either the classical approach or the P-value approach with the t-distribution
4) May use either standard normal distribution with the t-distribution

Chapter 10 – Section 4
A possible null hypothesis for testing a claim regarding a population proportion is
1) H0: Mean Weight of Dogs = 20 kgs
2) H0: Standard Deviation of Weight of Dogs = 8 kgs
3) H0: Proportion of Dogs Weighs 30 kgs
4) H0: Proportion of Dogs that weigh < 30 kgs = 0.30

Chapter 10 – Section 4
Tests of a claim about a population proportion use
1) The normal model, or the binomial probability distribution if the sampling distribution is not normal
2) Always the normal model
3) Always the Type II model
4) The t-distribution, or the sampling distribution if the sample size is too small

Chapter 10 – Section 5
The test of a claim about a population standard deviation uses the
1) Normal distribution
2) The t-distribution
3) The chi-square distribution
4) All of the above

Chapter 10 – Section 5
If a sample size n is 65, then a test of a claim about a population standard deviation uses
1) A normal distribution with mean 65
2) A normal distribution with standard deviation 64
3) A chi-square distribution with 65 degrees of freedom
4) A chi-square distribution with 64 degrees of freedom

Chapter 10 – Section 6
To determine the appropriate hypothesis test to perform, we should
1) Consider which P-value we wish to obtain
2) Consider which type of parameter we are analyzing
3) Consider whether the null hypothesis is known or unknown
4) All of the above

Chapter 10 – Section 7
If the hypotheses for a test are
H0: μ = 20 seconds
H1: μ < 20 seconds
then an example of a Type II error occurs when
1) μ = 25 seconds and we did not reject H0
2) μ = 15 seconds and we rejected H0
3) μ = 15 seconds and we did not reject H0
4) μ = 20 seconds and we rejected H0

Chapter 10 – Section 7
A large power for a test occurs when
1) The Type II error probability β is small
2) The probability of failing to reject the null hypothesis, when the alternative hypothesis is true, is small
3) Distinguishing between the null hypothesis and the alternative hypothesis is relatively clear with the data
4) All of the above

Hypothesis Testing
H0: The status quo, what was done before, what we are trying to disprove
H1: The new item, the new study results

Test Statistics:
Test μ, σ known:             Z0 = (x-bar − μ0) / (σ / √n)
Test μ, σ unknown:           t0 = (x-bar − μ0) / (s / √n)
Test population proportion:  Z0 = (p-hat − p0) / √(p0(1 − p0) / n)
Test σ:                      χ²0 = (n − 1) s² / σ0²

Critical Values (left-, two-, right-tailed tests):
Zc = Z(α), Z(1−α/2), Z(1−α)
tc = t(α), t(1−α/2), t(1−α), each with n−1 degrees of freedom
χ²c = χ²(1−α), χ²(1−α/2), χ²(α), each with n−1 degrees of freedom

Conclusion: Reject H0 if p < α or, for a right-tailed test, if Z0 > Zc, t0 > tc, or χ²0 > χ²c (for a left-tailed test, reject when the test statistic falls below the critical value). Otherwise we Fail to Reject (FTR) H0.

Hypothesis Testing Methods
• Classical
  – More standard deviations away from mean
• P-Value
  – Probability of getting a more extreme value
• Confidence Interval
  – Within the interval
[Figure: number lines for the three methods, marking the test statistic Q0, the critical value Qα, the confidence interval bounds LB and UB, and the Rej H0 and FTR regions]

Requirements to Check
• Mean, σ Known
  – Simple Random Sample (SRS)
  – Normal distribution
• Mean, σ unknown
  – SRS
  – No outliers and “normality” (normality plot)
• Population Proportion
  – SRS
  – n(p)(1-p) ≥ 10 (allows normal estimation of binomial)
  – n ≤ 0.05N (keeps it from being hypergeometric)
• Variance or Standard Deviation
  – SRS
  – Normal distribution

Hypothesis Test – Mean, σ Known
USAA Auto Insurance’s database shows the average miles driven is 12,200. A local rep, Sam, believes the residents of southwestern Virginia drive more. He obtains a sample of 35 drivers whose average was 12,895.9. Using USAA’s database, σ = 3800 miles. Test his claim at the α = 0.01 level.
H0: μ = 12,200 (drivers in southwestern VA drive the same as elsewhere)
H1: μ > 12,200 (drivers in southwestern VA drive more than elsewhere)
x-bar = 12,895.9   μ0 = 12,200   σ = 3800   n = 35   α = 0.01
Z0 = (x-bar − μ0) / (σ / √n) = 1.083 and p = 0.13931 (from calculator)
Critical Value: Zc = 2.326
Confidence Interval (CI): [11241, 14550]
Conclusion: Since Z0 < Zc (μ0 in CI or p > α), we fail to reject H0 and conclude that we don’t have sufficient evidence to say SWVA drivers drive more.

Hypothesis Test – Mean, σ Unknown
A high school principal believes that the new attendance policy has reduced the average number of tardies among the habitually tardy students. He samples 40 of his habitually tardy students and determines that their average number of tardies was 16.8 with a standard deviation of 4.7. He wants you to test at the α = 0.1 level to see if the average number of tardies was less than the historic mean of 18.1.
H0: μ = 18.1 (habitual tardiness remained the same)
H1: μ < 18.1 (habitual tardiness decreased)
x-bar = 16.8   μ0 = 18.1   s = 4.7   n = 40   α = 0.1
t0 = (x-bar − μ0) / (s / √n) = -1.7493 and p = 0.04405 (from calculator)
Critical Value: tc = -1.304
Confidence Interval (CI): [15.548, 18.052]
Conclusion: Since t0 < tc (μ0 out of CI or p < α), we reject H0 and conclude that the habitual tardiness has decreased.

Hypothesis Test – Population Proportion
In the 1990’s 65% of students at Virginia Tech thought that lying was unethical. In a poll conducted last May of a simple random sample of 1005 Virginia Tech students, 704 responded that lying was unethical. Is there evidence at the α = 0.05 level to indicate that the percentage of students who believe that lying is unethical has increased?
H0: p = 0.65 (% who thought lying was unethical behavior is the same)
H1: p > 0.65 (% who thought lying was unethical behavior has increased)
p0 = 0.65   x = 704   n = 1005   α = 0.05
Z0 = (p-hat − p0) / √(p0(1 − p0) / n) = 3.356 and p = 0.0004 (from calculator)
Critical Value: Zc = 1.645
Confidence Interval (CI): [0.672, 0.728]
Conclusion: Since Z0 > Zc (p0 out of CI or p < α), we reject H0 and conclude that the percentage who believe lying is unethical has increased.

Hypothesis Test – Population Variance
A snack bag of plain M&M’s has a mean number of M&M’s of 21. The quality control people at M&M-Mars have published data on the internet that claims the standard deviation of the number of M&Ms to be under 0.75. A Stats class samples 11 snack bags of plain M&Ms and determines that the standard deviation was 0.6404. Their teacher wants to know if their sample standard deviation is smaller than the advertised value at the α = 0.05 level.
H0: σ = 0.75 (the standard deviation of M&Ms in snack bags is the same)
H1: σ < 0.75 (the standard deviation of M&Ms in snack bags has decreased)
σ0 = 0.75   s = 0.6404   n = 11   α = 0.05
χ²0 = (n − 1) s² / σ0² = 7.291 (by hand) and p-value = 0.302 (by χ²cdf)
Critical Value: χ²c = 3.940
Confidence Interval: NA
Conclusion: Since χ²0 > χ²c (or p > α), we fail to reject H0 and conclude that there is insufficient evidence that σ in plain M&M snack bags has decreased.

Summary and Homework
• Summary
  – We can test whether sample data supports a hypothesis claim about a population mean, proportion, or standard deviation
  – We can use any one of three methods
    • The classical method
    • The P-Value method
    • The Confidence Interval method
  – The commonality between the three methods is that they calculate a criterion for rejecting or not rejecting the null hypothesis
• Homework
  – pg 511-513; 1, 2, 3, 7, 8, 12, 13, 14, 15, 17, 20, 37
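
The four worked examples above can be re-checked with a short script. This is a minimal sketch and not part of the original lesson: the function and variable names (z_mean, t_mean, z_prop, chi2_sd, and so on) are illustrative assumptions, and only the Python standard library is used. The normal critical values and the one normal P-value shown come from statistics.NormalDist. The t and chi-square critical values (-1.304 and 3.940) are copied from the slides, because the standard library has no t or chi-square quantile function. The chi-square statistic is computed with n − 1 in the numerator, which is what reproduces the 7.291 reported in the M&M example.

```python
from math import sqrt
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1


def z_mean(xbar, mu0, sigma, n):
    """Test statistic for a mean when the population sigma is known."""
    return (xbar - mu0) / (sigma / sqrt(n))


def t_mean(xbar, mu0, s, n):
    """Test statistic for a mean when only the sample s is available."""
    return (xbar - mu0) / (s / sqrt(n))


def z_prop(x, n, p0):
    """Test statistic for a population proportion."""
    return (x / n - p0) / sqrt(p0 * (1 - p0) / n)


def chi2_sd(s, sigma0, n):
    """Test statistic for a population standard deviation (n - 1 df)."""
    return (n - 1) * s ** 2 / sigma0 ** 2


# Mean, sigma known: right-tailed, alpha = 0.01 (slides report Z0 = 1.083)
z0_miles = z_mean(12895.9, 12200, 3800, 35)
zc_miles = std_normal.inv_cdf(1 - 0.01)
p_miles = 1 - std_normal.cdf(z0_miles)
reject_miles = z0_miles > zc_miles

# Mean, sigma unknown: left-tailed, alpha = 0.1 (slides report t0 = -1.7493)
t0_tardy = t_mean(16.8, 18.1, 4.7, 40)
tc_tardy = -1.304  # critical value taken from the slides (39 df)
reject_tardy = t0_tardy < tc_tardy

# Proportion: right-tailed, alpha = 0.05 (slides report Z0 = 3.356)
z0_prop = z_prop(704, 1005, 0.65)
zc_prop = std_normal.inv_cdf(1 - 0.05)
reject_prop = z0_prop > zc_prop

# Standard deviation: left-tailed, alpha = 0.05 (slides report 7.291)
chi2_mm = chi2_sd(0.6404, 0.75, 11)
chi2c_mm = 3.940  # critical value taken from the slides (10 df)
reject_mm = chi2_mm < chi2c_mm

for label, stat, reject in [
    ("Miles driven (Z0)", z0_miles, reject_miles),
    ("Tardies (t0)", t0_tardy, reject_tardy),
    ("Lying unethical (Z0)", z0_prop, reject_prop),
    ("M&M std dev (chi-square)", chi2_mm, reject_mm),
]:
    decision = "Reject H0" if reject else "Fail to reject H0"
    print(f"{label}: {stat:.4f} -> {decision}")
```

Each decision compares the test statistic with the critical value on the side named by H1: above the critical value for the right-tailed tests, below it for the left-tailed ones. If SciPy is available, its t and chi-square distributions could supply the critical values and P-values that are hard-coded here.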