Lecture 14: Statistical Inference

Statistical inference deals with drawing conclusions about population parameters (which are not known) from an analysis of the sample data.

Two types of inference:
• Estimation of parameters [point estimation, interval estimation];
• Testing of statistical hypotheses.

Examples 1 & 2, page 324.

Estimating a mean (Section 8.2)

Basic setting:
• Want to estimate the mean $\mu$ of a population.
• Have a random sample $X_1, \ldots, X_n$ from the population.

For example:
• Want to estimate the mean IQ of Michigan first-graders.
• Select $n = 50$ first-graders at random and measure their IQs.
• Their IQs are $X_1, \ldots, X_{50}$.

Easy part:
• Almost always (in STT 200, always) the sample mean $\bar{X}$ is the best estimator of the population mean.
• Recall that $\bar{X} = \dfrac{X_1 + \cdots + X_n}{n}$.
• So we just use the mean of the sample as the estimate of the mean of the population.
$\bar{X}$ is called a point estimator of $\mu$.

Hard part:
• What can we say about the quality of $\bar{X}$ as an estimator of $\mu$?
• e.g. what's the probability that $\bar{X}$ is within 3 of $\mu$?

Put another way:
• It's likely that $\bar{X}$ and $\mu$ will not be exactly equal.
• Can we attach a meaningful "margin of error" to $\bar{X}$?

What we know from Chapter 7

Take a random sample of size $n$ from a population with mean $\mu$ and standard deviation $\sigma$. Then
• The sample mean $\bar{X}$ has mean $\mu$.
• The sample mean $\bar{X}$ has standard deviation $\sigma/\sqrt{n}$.
• If the population distribution is normal, the distribution of $\bar{X}$ is normal.
• If the population distribution is not normal, then
  – If $n$ is small, we don't know how to proceed.
  – If $n$ is large, we can use the normal density to approximate the distribution of $\bar{X}$.

Using what we know:
If $\bar{X}$ is normal with mean $\mu$ and s.d. $\sigma/\sqrt{n}$, then from Table 3:

$$P\left(\mu - \frac{2\sigma}{\sqrt{n}} \le \bar{X} \le \mu + \frac{2\sigma}{\sqrt{n}}\right) = 0.954.$$

So we'll call $2\sigma/\sqrt{n}$ the "95.4% error margin."

Finding error margins
• Suppose we want the 98% error margin.
• Find $-c$ and $c$ such that $P(-c \le Z \le c) = 0.98$, where $Z$ is standard normal.
• From Table 3, $c = 2.33$.
• So $2.33\sigma/\sqrt{n}$ is the 98% error margin.
• Suppose we want the 93% error margin.
• Find $-c$ and $c$ such that $P(-c \le Z \le c) = 0.93$, where $Z$ is standard normal.
• From Table 3, $c = 1.81$.
• So $1.81\sigma/\sqrt{n}$ is the 93% error margin.

Matching the book's notation
• We want a $100(1 - \alpha)\%$ margin of error.
• For example, a 98% margin of error corresponds to $\alpha = 0.02$.
• Let the area between $-z_{\alpha/2}$ and $z_{\alpha/2}$ be $1 - \alpha$.
• For example, if $\alpha = 0.02$ then $z_{\alpha/2} = 2.33$.
• Then $z_{\alpha/2}\,\sigma/\sqrt{n}$ is the $100(1 - \alpha)\%$ margin of error.

Minor problem:
• The formulas require the population standard deviation $\sigma$.
• But usually we don't know $\sigma$!
• Possible solution: Replace the unknown $\sigma$ by $S$, the sample standard deviation, which we can compute from the data.
• This will work well if $n$ is large.

Example 3, pages 327–328.

Computing sample sizes
• Suppose we want a 98% margin of error to be equal to 2. How large does $n$ have to be?
• We know $2 = 2.33\,\dfrac{\sigma}{\sqrt{n}}$.
• Solve for $n$ to get $n = \left[\dfrac{2.33\,\sigma}{2}\right]^2$.
• So if $\sigma = 7$, then $n = \left[\dfrac{2.33(7)}{2}\right]^2 \approx 66.5$, so we would use $n = 67$.

Example
• Model human pregnancy lengths (in days) by a normal density with mean $\mu$ and standard deviation $\sigma = 16$.
• Based on a sample of size $n = 27$, compute a 99% margin of error.
• From Table 3, the area between $-2.57$ and $2.57$ is about 0.99.
• So the 99% margin of error is $\dfrac{2.57(16)}{\sqrt{27}} \approx 7.91$.
• If we want the 99% margin of error to be 4, how large does $n$ have to be?
• Margin of error is $\dfrac{2.57(16)}{\sqrt{n}}$.
• Solve for $n$ to get $n = \left[\dfrac{2.57(16)}{4}\right]^2 \approx 105.7$, so we would use $n = 106$.
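
The margin-of-error and sample-size calculations above can be checked numerically. The following is a minimal Python sketch, not part of the lecture: the function names `critical_value`, `margin_of_error`, and `required_sample_size` are hypothetical, and the standard library's `statistics.NormalDist` stands in for Table 3. It assumes $\sigma$ is known, as in the examples.

```python
import math
from statistics import NormalDist


def critical_value(confidence):
    """Return c such that P(-c <= Z <= c) = confidence for standard normal Z.

    This plays the role of looking up z_{alpha/2} in Table 3 (assumption:
    a computed value is an acceptable substitute for the table).
    """
    alpha = 1.0 - confidence
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


def margin_of_error(z, sigma, n):
    """Margin of error z * sigma / sqrt(n) for a sample of size n."""
    return z * sigma / math.sqrt(n)


def required_sample_size(z, sigma, target_margin):
    """Smallest integer n with z * sigma / sqrt(n) <= target_margin."""
    return math.ceil((z * sigma / target_margin) ** 2)


if __name__ == "__main__":
    # Critical values, rounded to two decimals as in a printed table.
    print(round(critical_value(0.98), 2))  # 2.33
    print(round(critical_value(0.93), 2))  # 1.81

    # Pregnancy example with the lecture's table value z = 2.57.
    print(round(margin_of_error(2.57, 16, 27), 2))  # 7.91
    print(required_sample_size(2.57, 16, 4))        # 106

    # Sample-size example: 98% margin of 2 with sigma = 7.
    print(required_sample_size(2.33, 7, 2))         # 67
```

The functions take $z$ as an argument so that the lecture's rounded table values can be passed in directly. The computed 99% critical value is about 2.576 rather than the table's 2.57, so results based on `critical_value(0.99)` can differ slightly from the numbers in the example.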