Chapter 9: Basics of Hypothesis Testing
April 17

In Chapter 9:
9.1 Null and Alternative Hypotheses
9.2 Test Statistic
9.3 P-Value
9.4 Significance Level
9.5 One-Sample z Test
9.6 Power and Sample Size

Review: Basics of Inference
• Population: all possible values
• Sample: a portion of the population
• Statistical inference: generalizing from a sample to a population with calculated degree of certainty
• Two forms of statistical inference
  – Hypothesis testing
  – Estimation
• Parameter: a numerical characteristic of a population, e.g., population mean µ, population proportion p
• Statistic: a calculated value from data in the sample, e.g., sample mean (x̄), sample proportion (p̂)

Distinctions Between Parameters and Statistics (Chapter 8 review)

              Parameters         Statistics
Source        Population         Sample
Notation      Greek (e.g., μ)    Roman (e.g., x̄)
Vary          No                 Yes
Calculated    No                 Yes

Ch 8 Review: How A Sample Mean Varies
Sampling distributions of means tend to be Normal with an expected value equal to population mean µ and standard deviation (SE) = σ/√n:

$\bar{x} \sim N(\mu, SE_{\bar{x}})$ where $SE_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}}$

Introduction to Hypothesis Testing
• Hypothesis testing is also called significance testing
• The objective of hypothesis testing is to test claims about parameters
• The first step is to state the problem!
• Then, follow these four steps:

Hypothesis Testing Steps
A. Null and alternative hypotheses
B. Test statistic
C. P-value and interpretation
D. Significance level (optional)
Understand all four steps of testing, not just the calculations.

§9.1 Null and Alternative Hypotheses
• Convert the research question to null and alternative hypotheses
• The null hypothesis (H0) is a claim of “no difference in the population”
• The alternative hypothesis (Ha) is a claim of “difference”
• Seek evidence against H0 as a way of bolstering Ha

Illustrative Example: “Body Weight”
• Statement of the problem: In the 1970s, 20–29 year old men in the U.S. had a mean body weight μ of 170 pounds. Standard deviation σ = 40 pounds.
We want to test whether mean body weight in the population now differs.
• H0: μ = 170 (“no difference”)
• The alternative hypothesis can be stated in one of two ways:
  Ha: μ > 170 (one-sided alternative)
  Ha: μ ≠ 170 (two-sided alternative)

§9.2 Test Statistic
This chapter introduces the z statistic for one-sample problems about means.

$z_{\text{stat}} = \dfrac{\bar{x} - \mu_0}{SE_{\bar{x}}}$

where $\mu_0$ is the population mean assuming the null hypothesis is true and $SE_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}}$.

Illustrative Example: z statistic
• For the illustrative example, μ0 = 170
• We take an SRS of n = 64 and know that σ = 40, so the standard error of the mean is
  $SE_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}} = \dfrac{40}{\sqrt{64}} = 5$
• If we found a sample mean of 173, then
  $z_{\text{stat}} = \dfrac{\bar{x} - \mu_0}{SE_{\bar{x}}} = \dfrac{173 - 170}{5} = 0.60$
• If we found a sample mean of 185, then
  $z_{\text{stat}} = \dfrac{\bar{x} - \mu_0}{SE_{\bar{x}}} = \dfrac{185 - 170}{5} = 3.00$

Reasoning Behind the zstat
This is the sampling distribution of the mean when μ = 170, σ = 40, and n = 64:

$\bar{x} \sim N(170, 5)$

§9.3 P-value
• The P-value answers the question: What is the probability of observing a test statistic equal to or more extreme than the current statistic, assuming H0 is true?
• This corresponds to the area under the curve (AUC) in the tail of the Z sampling distribution beyond the zstat. Use Table B or a software utility to find this AUC. (See next slide.)
• Smaller and smaller P-values provide stronger and stronger evidence against H0

The one-sided P-value for zstat = 0.6 is 0.2743
The one-sided P-value for zstat = 3.0 is 0.0013

Two-Sided P-Value
• For one-sided Ha, use the area in the tail beyond the z statistic
• For two-sided Ha, consider deviations “up” and “down” from expected: double the one-sided P-value
• For example, if the one-sided P-value = 0.0013, then the two-sided P-value = 2 × 0.0013 = 0.0026.
• If the one-sided P = 0.2743, then the two-sided P = 2 × 0.2743 = 0.5486.
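The z statistic and P-value steps above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function names `z_statistic` and `p_value` and the `alternative` labels are assumptions chosen for this example, and the Standard Normal areas come from the standard library's `statistics.NormalDist` rather than Table B.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # Standard Normal: mean 0, standard deviation 1


def z_statistic(xbar, mu0, sigma, n):
    """Return z_stat = (xbar - mu0) / SE, where SE = sigma / sqrt(n)."""
    se = sigma / n ** 0.5
    return (xbar - mu0) / se


def p_value(z, alternative="two-sided"):
    """Return the P-value for a z statistic.

    alternative (hypothetical labels for this sketch):
      "greater"   -> Ha: mu > mu0 (area in the right tail)
      "less"      -> Ha: mu < mu0 (area in the left tail)
      "two-sided" -> Ha: mu != mu0 (area in both tails)
    """
    if alternative == "greater":
        return 1 - STD_NORMAL.cdf(z)
    if alternative == "less":
        return STD_NORMAL.cdf(z)
    if alternative == "two-sided":
        return 2 * (1 - STD_NORMAL.cdf(abs(z)))
    raise ValueError(f"unknown alternative: {alternative!r}")


# Body weight example: mu0 = 170, sigma = 40, n = 64
for xbar in (173, 185):
    z = z_statistic(xbar, mu0=170, sigma=40, n=64)
    print(f"xbar={xbar}: z={z:.2f}, "
          f"one-sided P={p_value(z, 'greater'):.4f}, "
          f"two-sided P={p_value(z):.4f}")
```

Software computes the tail area without intermediate rounding, so the last printed digit can differ slightly from a value obtained by doubling a table entry.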
[Figure: two-tailed P-value]

§9.4 Significance Level
• Smaller and smaller P-values provide stronger and stronger evidence against H0
• Although it is unwise to draw firm cutoffs, here are conventions that are used as a starting point:
  P > 0.10: non-significant evidence against H0
  0.05 < P ≤ 0.10: marginally significant evidence against H0
  0.01 < P ≤ 0.05: significant evidence against H0
  P ≤ 0.01: highly significant evidence against H0
• Examples
  P = .27 is non-significant evidence against H0
  P = .01 is highly significant evidence against H0

Decision Based on α Level
• Let α represent the probability of erroneously rejecting H0
• Set the α threshold of acceptable error (e.g., let α = .10, let α = .05, or whatever level is acceptable)
• Reject H0 when P ≤ α
• Example: Set α = .10. Find P = 0.27. Since P > α, retain H0
• Example: Set α = .05. Find P = .01. Since P < α, reject H0

§9.5 One-Sample z Test (Summary)
Test procedure
A. H0: µ = µ0 vs. Ha: µ ≠ µ0 (two-sided) or Ha: µ < µ0 (left-sided) or Ha: µ > µ0 (right-sided)
B. Test statistic: $z_{\text{stat}} = \dfrac{\bar{x} - \mu_0}{SE_{\bar{x}}}$ where $SE_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}}$
C. P-value: convert zstat to a P-value [Table B or software]
D. Significance level (optional)

Test conditions:
• Quantitative response
• Good data
• SRS (or facsimile)
• σ known (not calculated)
• Population approximately Normal or sample large (central limit theorem)

Example: The “Lake Wobegon Problem”
• Typically, Wechsler Adult Intelligence Scores are Normal with µ = 100 and σ = 15
• Take an SRS of n = 9
• Measure scores {116, 128, 125, 119, 89, 99, 105, 116, 118}
• Calculate x-bar = 112.8
• Does this sample mean provide statistically reliable evidence that population mean μ is greater than 100?

Illustrative Example: Lake Wobegon
A. Hypotheses: H0: µ = 100; Ha: µ > 100 (one-sided)
B. Test statistic:
   $SE_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}} = \dfrac{15}{\sqrt{9}} = 5$
   $z_{\text{stat}} = \dfrac{\bar{x} - \mu_0}{SE_{\bar{x}}} = \dfrac{112.8 - 100}{5} = 2.56$
C. P-value: Pr(Z ≥ 2.56) = 0.0052 (Table B)
   P = .0052 provides highly significant evidence against H0
D. Significance level (optional): This level of evidence is significant at α = 0.01 (reject H0).
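The four steps of the Lake Wobegon example can be reproduced with a short Python sketch. The helper `one_sample_z_test` and its `alternative` labels are hypothetical names introduced for this illustration; they are not from the slides or from any particular statistics package. The sketch computes x-bar from the raw scores, so z comes out near 2.556 rather than the 2.56 obtained from the rounded mean of 112.8.

```python
from statistics import NormalDist, mean


def one_sample_z_test(data, mu0, sigma, alternative="two-sided"):
    """One-sample z test with known sigma.

    Returns (xbar, z_stat, p_value). The `alternative` labels are
    assumptions of this sketch: "greater", "less", or "two-sided".
    """
    n = len(data)
    xbar = mean(data)
    se = sigma / n ** 0.5          # standard error of the mean
    z = (xbar - mu0) / se          # step B: test statistic
    phi = NormalDist().cdf         # Standard Normal cumulative probability
    if alternative == "greater":   # step C: tail area for the chosen Ha
        p = 1 - phi(z)
    elif alternative == "less":
        p = phi(z)
    elif alternative == "two-sided":
        p = 2 * (1 - phi(abs(z)))
    else:
        raise ValueError(f"unknown alternative: {alternative!r}")
    return xbar, z, p


# Lake Wobegon data: H0: mu = 100 vs. Ha: mu > 100, sigma = 15 assumed known
scores = [116, 128, 125, 119, 89, 99, 105, 116, 118]
xbar, z, p = one_sample_z_test(scores, mu0=100, sigma=15, alternative="greater")

# Step D (optional): compare P with a pre-set alpha
alpha = 0.01
decision = "reject H0" if p <= alpha else "retain H0"
print(f"xbar={xbar:.1f}, z={z:.2f}, P={p:.4f}, decision at alpha={alpha}: {decision}")
```

Calling the same function with `alternative="two-sided"` doubles the one-sided P-value, which is the adjustment used when Ha is µ ≠ 100.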
Two-Sided Alternative (Lake Wobegon Illustration)
The two-sided alternative Ha: µ ≠ 100 doubles the one-sided P-value: P = 2 × 0.0052 = 0.0104
P = .0104 provides significant evidence against H0

§9.6 Power and Sample Size
Two types of decision errors:
Type I error = erroneous rejection of a true H0
Type II error = erroneous retention of a false H0

                 Truth
Decision       H0 true              H0 false
Retain H0      Correct retention    Type II error
Reject H0      Type I error         Correct rejection

α ≡ probability of a Type I error
β ≡ probability of a Type II error

Power
• The traditional hypothesis testing paradigm considers only Type I errors
• However, we should also consider Type II errors
• β ≡ probability of a Type II error
• 1 – β ≡ “Power” ≡ probability of avoiding a Type II error

Power of a z test

$1 - \beta = \Phi\left(-z_{1-\alpha/2} + \dfrac{|\mu_0 - \mu_a|\sqrt{n}}{\sigma}\right)$

• where Φ(z) represents the cumulative probability of Standard Normal z (e.g., Φ(0) = 0.5)
• μ0 represents the population mean under the null hypothesis
• μa represents the population mean under an alternative hypothesis

Calculating Power: Example
A study of n = 16 retains H0: μ = 170 at α = 0.05 (two-sided). What was the power of the test conditions to identify a significant difference if the population mean was actually 190?
$1 - \beta = \Phi\left(-z_{1-\alpha/2} + \dfrac{|\mu_0 - \mu_a|\sqrt{n}}{\sigma}\right) = \Phi\left(-1.96 + \dfrac{|170 - 190|\sqrt{16}}{40}\right) = \Phi(0.04) = 0.5160$ [From Table B]

Reasoning Behind Power
• Consider two competing sampling distribution models
  – One model assumes H0 is true (top curve, next page)
  – Another model assumes Ha is true, with μa = 190 (bottom curve, next page)
• When α = 0.05 (two-sided), we will reject H0 when the sample mean exceeds 189.6 (right tail, top curve). This cutoff is 170 + 1.96 × 10, since SE = 40/√16 = 10.
• The probability of getting a value greater than 189.6 on the bottom curve is 0.5160, corresponding to the power of the test

Sample Size Requirements
The required sample size for a two-sided z test to achieve a given power is

$n = \dfrac{\sigma^2 \left(z_{1-\beta} + z_{1-\alpha/2}\right)^2}{\Delta^2}$

where
1 – β ≡ desired power of the test
α ≡ desired significance level
σ ≡ population standard deviation
Δ = μ0 – μa ≡ the difference worth detecting

Illustrative Example: Sample Size Requirement
• How large a sample is needed for a one-sample z test with 90% power and α = 0.05 (two-tailed) when σ = 40? The null hypothesis assumes μ = 170 and the alternative assumes μ = 190. We look for difference Δ = μ0 − μa = 170 – 190 = −20.

  $n = \dfrac{\sigma^2 \left(z_{1-\beta} + z_{1-\alpha/2}\right)^2}{\Delta^2} = \dfrac{40^2 (1.28 + 1.96)^2}{(-20)^2} = 41.99$

• Round this up to 42 to ensure adequate power.

[Figure: example showing the conditions for 90% power]
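The power and sample size formulas of §9.6 can be checked with a short Python sketch. The function names and the `decimals` parameter are assumptions made for this illustration: `decimals=2` rounds the Normal quantiles to two places to mimic a Table B lookup (1.28 and 1.96), while `decimals=None` uses unrounded quantiles. Note that the unrounded quantiles give a raw n of about 42.03 for the illustrative example, which rounds up to 43 rather than 42; the slide's 41.99 depends on the two-decimal z values.

```python
from math import ceil, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # Standard Normal: mean 0, standard deviation 1


def z_quantile(prob, decimals=None):
    """Standard Normal quantile; optionally rounded like a printed table."""
    z = STD_NORMAL.inv_cdf(prob)
    return round(z, decimals) if decimals is not None else z


def z_test_power(mu0, mua, sigma, n, alpha=0.05, decimals=None):
    """Power of a two-sided z test: Phi(-z_{1-alpha/2} + |mu0 - mua| sqrt(n) / sigma)."""
    z_alpha = z_quantile(1 - alpha / 2, decimals)
    return STD_NORMAL.cdf(-z_alpha + abs(mu0 - mua) * sqrt(n) / sigma)


def raw_sample_size(mu0, mua, sigma, power=0.90, alpha=0.05, decimals=None):
    """Unrounded n = sigma^2 (z_{1-beta} + z_{1-alpha/2})^2 / Delta^2."""
    z_power = z_quantile(power, decimals)          # z_{1-beta}
    z_alpha = z_quantile(1 - alpha / 2, decimals)  # z_{1-alpha/2}
    delta = mu0 - mua                              # difference worth detecting
    return sigma ** 2 * (z_power + z_alpha) ** 2 / delta ** 2


# Power example: n = 16, H0: mu = 170, true mean 190, sigma = 40
print(f"power = {z_test_power(170, 190, 40, 16):.4f}")

# Sample size example: 90% power, alpha = 0.05 (two-sided)
n_table = raw_sample_size(170, 190, 40, decimals=2)  # table-style z values
n_exact = raw_sample_size(170, 190, 40)              # unrounded z values
print(f"table z: n = {n_table:.2f}, round up to {ceil(n_table)}")
print(f"exact z: n = {n_exact:.2f}, round up to {ceil(n_exact)}")
```

Because the raw value sits so close to 42, the choice of rounding for the quantiles decides whether the answer is 42 or 43; in practice the larger value is the safer choice for meeting the stated power.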