Chapter 14
Introduction to Inference

What is Statistical Inference?
"For everyone who does habitually attempt the difficult task of making sense of figures is, in fact, assaying a logical process of the kind we call induction, in that he is attempting to draw inferences from the particular to the general; or, as we more usually say in statistics, from the sample to the population."
R.A. Fisher (1890–1962), father of modern statistics

Statistical Inference
Two forms of statistical inference:
• Estimation (Confidence Intervals)
• Hypothesis Tests (Significance)

Statistical Inference
• Objective: to infer parameters
• Parameter ≡ a numerical characteristic of a population or probability function
• Examples of parameters:
  – μ (population mean; expected value)
  – σ (population standard deviation; standard deviation parameter)
  – p (probability of "success"; population proportion)
• Chs 14 & 15 introduce concepts about inference
• Chs 15–20 introduce inferential techniques

"Simple Conditions" for Chapter 14
• Data acquired by simple random sample (SRS), i.e., all potential observations have the same probability of entering the sample
• No major departures from Normality in the population
• Value of σ is known or assumed before collecting data
Chapter 14 objective: to infer μ!

Example "Female BMI"
• Statement: What is the mean BMI μ in females between ages 20 and 29?
• Body Mass Index ≡ BMI = weight / height²
• Assume "simple conditions":
  1. SRS
  2. Population approx. Normal
  3. σ = 7.5 (assumed before data collected)
• Plan: Estimate μ with 95% confidence

Reasoning behind estimation
• If I took multiple SRSs, the sample means (x-bars) would be different in each one.
• We do not expect x-bar to be exactly equal to μ; any given x-bar is just an estimate of μ.
• The variability of the x-bars is predictable: it takes the form of a sampling distribution of means.
• Fact: Under the "simple conditions" in this chapter, the sampling distribution of means will be Normal with mean μ and standard deviation
  $\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}}$ ← standard deviation of the mean (also referred to as the standard error of the mean)

Example (Female BMI)
In our example, n = 654 and σ = 7.5. Therefore:
$\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}} = \dfrac{7.5}{\sqrt{654}} \approx 0.3$ (rounded)
• σx-bar tells us how close x-bar is likely to be to μ
• The 68-95-99.7 rule tells us that x-bar will be within two σx-bar units (that's 0.6) of μ in 95% of samples
• If we say that μ lies in the interval (x-bar − 0.6) to (x-bar + 0.6), we'll be right 95% of the time
• Therefore, we can be 95% confident that an interval "x-bar ± 0.6" will capture μ

[Figure: Basis of Confidence Intervals (CIs)]

Confidence Interval (CI)
• The CI has two parts: point estimate ± margin of error
• Suppose in our particular sample, the mean is 26.8. This is the point estimate for μ.
• Recall from the Female BMI example that the margin of error for our data is 0.6 (with 95% confidence)
• Therefore, the 95% confidence interval (for this particular sample) = 26.8 ± 0.6 = (26.2, 27.4).

Confidence Level C
• CIs can be calculated at different levels of confidence.
• Let C represent the probability the interval will capture the parameter.
• In our example, C = 95%.
• Other common levels of confidence are 90% and 99%.
• In this chapter we adjust the C level by changing the z* critical value.
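
The Female BMI arithmetic can be sketched in a few lines of Python. This is only an illustration under the chapter's "simple conditions" (SRS, Normal population, σ known); the function name `z_confidence_interval` is hypothetical, and z* is taken from the standard library's Normal distribution rather than from a printed table.

```python
from math import sqrt
from statistics import NormalDist


def z_confidence_interval(x_bar, sigma, n, confidence=0.95):
    """Return (lower, upper) for the interval x_bar ± z* · sigma / sqrt(n).

    Hypothetical helper for illustration; assumes sigma is known in advance.
    """
    standard_error = sigma / sqrt(n)          # standard deviation of the mean
    # z* leaves (1 - confidence) / 2 in each tail of the standard Normal curve
    z_star = NormalDist().inv_cdf((1 + confidence) / 2)
    margin = z_star * standard_error          # margin of error
    return x_bar - margin, x_bar + margin


# Female BMI example: x-bar = 26.8, sigma = 7.5, n = 654
lower, upper = z_confidence_interval(x_bar=26.8, sigma=7.5, n=654)
print(f"95% CI: ({lower:.1f}, {upper:.1f})")  # prints "95% CI: (26.2, 27.4)"
```

Passing a different `confidence` value changes only z*, which mirrors how the chapter adjusts the confidence level.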

Confidence Levels & z critical values
In this chapter we adjust the confidence level by altering the critical value z*.
Common levels of confidence & z critical values:
Confidence level C            90%     95%     99%
Critical value z* (Table C)   1.645   1.960   2.576

C level CI for μ, σ known ("z procedure")
To estimate μ with confidence level C, use
$\bar{x} \pm z^{*} \dfrac{\sigma}{\sqrt{n}}$
Use Table C to determine the value of z*.

Example (95% CI): Solve & Conclude
Data: n = 654, x-bar = 26.8
$\bar{x} \pm z^{*} \dfrac{\sigma}{\sqrt{n}} = 26.8 \pm (1.960)\dfrac{7.5}{\sqrt{654}} = 26.8 \pm 0.6 = (26.2,\ 27.4)$
Conclude: We are 95% confident that the population mean BMI μ is between 26.2 and 27.4.

Now have students calculate a 99% CI with the data
Data: n = 654, x-bar = 26.8
$\bar{x} \pm z^{*} \dfrac{\sigma}{\sqrt{n}}$
Hint: The only thing that changes is the z* critical value.
Conclude: We are 99% confident that the population mean BMI μ is between "lower confidence limit (LCL) here" and "upper confidence limit (UCL) here."

Interpreting a CI
• Confidence level C is the success rate of the method that produced the interval.
• We know with C level of confidence that the CI will capture μ.
• We don't know with certainty whether any given CI will capture μ or miss it.

[Figure: Four-Step Procedure for CIs]

Stopping Point for Exam 2
Slides from this point forward could be edited after Exam 2.

Hypothesis ("Significance") Tests
• Objective: test a claim about a parameter
• Uses an elaborate vocabulary

[Figure: 4-step Process, Hypothesis (Significance) Testing]

State and Plan
Example "Population Weight Gain?"
State: Is there good evidence that the population is gaining weight?
Plan
• Parameter is population mean weight gain μ
• Null hypothesis H0: statement of "no difference" (population not gaining weight), H0: μ = 0
• Alternative hypothesis Ha: population gaining weight, Ha: μ > 0
• Type of test: z test if the "simple conditions" (see the "Simple Conditions" slide) are met

Notes on Statistical Hypotheses
• H0 is key to understanding
• Ha contradicts H0
• Ha can be stated in one-sided or two-sided ways
  – One-sided Ha specifies the direction of the difference: weight GAIN in the population, Ha: μ > 0
  – Two-sided Ha does not specify the direction of the difference: weight CHANGE in the population, Ha: μ ≠ 0

Example "Weight Gain": "Solve" Sub-steps
(a) Check conditions
  • SRS
  • No major departures from Normality
  • σ known before collecting data
(b) Calculate statistics (see the "Test Statistic" slide)
(c) Find P-value

Reasoning of Significance Testing
• If H0 and the conditions are true, then the sampling distribution of x-bar would be Normal with μ = 0 and
  $\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}} = \dfrac{1}{\sqrt{10}} = 0.316$, that is, $\bar{x} \sim N(0,\ 0.316)$
• If a study produced an x-bar of 0.3, this would be poor evidence against H0
• If a different study produced an x-bar of 1.02, this would be good evidence against H0

Test Statistic
Standardize the sample mean:
$z_{\text{stat}} = \dfrac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$
Suppose x-bar = 1.02, n = 10, and σ = 1:
$z_{\text{stat}} = \dfrac{\bar{x} - \mu_0}{\sigma / \sqrt{n}} = \dfrac{1.02 - 0}{1 / \sqrt{10}} = 3.23$
X-bar is about 3 standard deviations (of the mean) greater than expected if H0 is true.

P-Value from Z Table
• For Ha: μ > μ0, P-value = Pr(Z > zstat) = right tail beyond zstat
• For Ha: μ < μ0, P-value = Pr(Z < zstat) = left tail beyond zstat
• For Ha: μ ≠ μ0, P-value = 2 × one-tailed P-value

P-value from Z Table
• Draw the curve
• One-sided P-value = Pr(Z > 3.23) = 1 − .9994 = .0006
• Two-sided P-value = 2 × one-sided P = 2 × .0006 = .0012

P-value: Interpretation
• P-value ≡ the probability the data would take a value as extreme or more extreme than observed if H0 were true
• Smaller and smaller P-values → stronger and stronger evidence against H0
• Conventions:
  .10 < P < 1.0    insignificant evidence against H0
  .05 < P ≤ .10    marginally significant evidence against H0
  .01 < P ≤ .05    significant evidence against H0
  0 < P ≤ .01      highly significant evidence against H0

"Significance Level"
• α (alpha) ≡ threshold for "significance"
• If we choose α = 0.05, we require evidence so strong that it would occur no more than 5% of the time when H0 is true
• Decision rule:
  P-value ≤ α    evidence is significant
  P-value > α    evidence is not significant
• For example, let α = 0.01. The two-sided P-value = 0.0012 is less than .01, so the data are significant at the α = .01 level.

Example "Weight Gain" Conclusion
• The P-value of .0012 provides highly significant evidence against H0: μ = 0
• We rule in favor of Ha: μ ≠ 0
• Conclude: the population's mean weight is changing
• Our sample mean weight gain of 1.02 pounds per person is statistically significant at the α = .002 level but not at the α = .001 level

[Figure: Basics of Significance Testing]
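
The "Weight Gain" z test can be retraced with a short Python sketch. It assumes the same "simple conditions" as the slides; the function name `z_test` is hypothetical, and tail areas come from the standard library's Normal distribution instead of a Z table.

```python
from math import sqrt
from statistics import NormalDist


def z_test(x_bar, mu_0, sigma, n):
    """Return (z_stat, one-sided upper-tail P-value, two-sided P-value).

    Hypothetical helper for illustration; assumes sigma is known in advance.
    """
    z_stat = (x_bar - mu_0) / (sigma / sqrt(n))      # standardize the sample mean
    standard_normal = NormalDist()                   # mean 0, standard deviation 1
    p_upper = 1 - standard_normal.cdf(z_stat)        # for Ha: mu > mu_0
    p_two_sided = 2 * (1 - standard_normal.cdf(abs(z_stat)))  # for Ha: mu != mu_0
    return z_stat, p_upper, p_two_sided


# Weight gain example: x-bar = 1.02, mu_0 = 0, sigma = 1, n = 10
z_stat, p_upper, p_two_sided = z_test(x_bar=1.02, mu_0=0, sigma=1, n=10)
print(f"z = {z_stat:.2f}")
print(f"one-sided P = {p_upper:.4f}, two-sided P = {p_two_sided:.4f}")

alpha = 0.01  # significance level chosen before looking at the data
print("significant" if p_two_sided <= alpha else "not significant")
```

Because the sketch does not round z to two decimals or the tail area to four before doubling, the two-sided P-value comes out near 0.0013, slightly above the table-based .0012. The conclusion at α = .01 is the same.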