Lecture 14: Statistical Inference
Statistical inference deals with drawing conclusions about
population parameters (which are not known) from an analysis
of the sample data.
Two types of inference:
• Estimation of parameters [point estimation, interval estimation];
• Testing of statistical hypotheses.
Examples 1 & 2. page 324.
Estimating a mean (Section 8.2)
Basic setting:
• Want to estimate the mean µ of a population.
• Have a random sample X1, . . . , Xn from the population.
For example:
• Want to estimate the mean IQ of Michigan first-graders.
• Select n = 50 first graders at random and measure their
IQs.
• Their IQs are X1, . . . , X50.
Easy part:
• Almost always (in STT 200, always) the sample mean X̄ is
the best estimator of the population mean.
• Recall that
$$\bar{X} = \frac{X_1 + \cdots + X_n}{n}.$$
• So we just use the mean of the sample as the estimate of
the mean of the population.
X̄ is called a point estimator of µ.
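As a small illustration of the point estimator, the sketch below computes the sample mean for a short list of IQ scores. The scores are made-up placeholder values, not data from the lecture or the textbook.

```python
from statistics import mean

# Hypothetical IQ scores for a small sample (illustrative values only).
iq_scores = [98, 105, 110, 92, 101, 97, 108, 103]

# The point estimate of the population mean is the sample mean:
# (X1 + ... + Xn) / n.
x_bar = mean(iq_scores)
print(f"n = {len(iq_scores)}, sample mean = {x_bar}")
```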
Hard part:
• What can we say about the quality of X̄ as an estimator
of µ?
• e.g. what’s the probability that X̄ is within 3 of µ?
Put another way:
• It’s likely that X̄ and µ will not be exactly equal.
• Can we attach a meaningful “margin of error” to X̄?
What we know from Chapter 7
Take a random sample of size n from a population with mean
µ and standard deviation σ. Then
• The sample mean X̄ has mean µ.
• The sample mean X̄ has standard deviation σ/√n.
• If the population distribution is normal, the distribution of
X̄ is normal.
• If the population distribution is not normal, then
– If n is small, we don’t know how to proceed.
– If n is large, we can use the normal density to approximate the distribution of X̄.
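The first two facts can be checked with a quick simulation. This is only a sketch: the population values µ = 100 and σ = 15, the number of repetitions, and the seed are assumptions chosen for illustration, and the population is taken to be normal.

```python
import random
from math import sqrt
from statistics import fmean, stdev

random.seed(0)  # fixed seed so the run is repeatable

MU, SIGMA = 100.0, 15.0   # assumed population mean and s.d.
N = 50                    # sample size, as in the first-grader example
REPS = 5000               # number of repeated samples (assumed)

# Draw REPS samples of size N and record each sample mean.
sample_means = [
    fmean(random.gauss(MU, SIGMA) for _ in range(N))
    for _ in range(REPS)
]

print("mean of sample means:", round(fmean(sample_means), 2))
print("s.d. of sample means:", round(stdev(sample_means), 2))
print("sigma / sqrt(n):     ", round(SIGMA / sqrt(N), 2))
```

The first printed value should land close to µ, and the second close to σ/√n.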
Using what we know: If X̄ is normal with mean µ and s.d.
σ/√n, then from Table 3:
$$P\left(\mu - \frac{2\sigma}{\sqrt{n}} \le \bar{X} \le \mu + \frac{2\sigma}{\sqrt{n}}\right) = 0.954.$$
So we’ll call 2σ/√n the “95.4% error margin.”
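The 0.954 figure can be reproduced without a table. Standardizing the sample mean turns the event into −2 ≤ Z ≤ 2 for a standard normal Z, so a minimal check, assuming Python's standard-library `statistics.NormalDist` in place of Table 3, is:

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, s.d. 1

# P(-2 <= Z <= 2): the chance that the sample mean lands
# within 2 * sigma / sqrt(n) of mu.
prob = Z.cdf(2) - Z.cdf(-2)
print(round(prob, 4))
```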
Finding error margins
• Suppose we want the 98% error margin.
• Find −c and c such that P(−c ≤ Z ≤ c) = 0.98 where
Z is standard normal.
• From Table 3, c = 2.33.
• So 2.33σ/√n is the 98% error margin.
• Suppose we want the 93% error margin.
• Find −c and c such that P(−c ≤ Z ≤ c) = 0.93 where
Z is standard normal.
• From Table 3, c = 1.81.
• So 1.81σ/√n is the 93% error margin.
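The table lookup for c can be mimicked in code. The helper name `critical_value` is hypothetical, and the inverse normal CDF from the standard library stands in for Table 3; rounded to two decimals it should agree with the table values above.

```python
from statistics import NormalDist

def critical_value(level):
    """Return c such that P(-c <= Z <= c) = level for standard normal Z."""
    tail = (1 - level) / 2          # area in each tail
    return NormalDist().inv_cdf(1 - tail)

for level in (0.98, 0.93):
    print(level, round(critical_value(level), 2))
```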
Matching the book’s notation
• We want a 100(1 − α)% margin of error.
• For example, a 98% margin of error corresponds to α =
0.02.
• Let the area between −z_{α/2} and z_{α/2} be 1 − α.
• For example, if α = 0.02 then z_{α/2} = 2.33.
• Then z_{α/2} σ/√n is the 100(1 − α)% margin of error.
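In the book's notation, the margin of error is a function of σ, n, and α. A minimal sketch follows; the function name `margin_of_error` and the inputs σ = 15, n = 50 are assumptions made for illustration only.

```python
from math import sqrt
from statistics import NormalDist

def margin_of_error(sigma, n, alpha):
    """100(1 - alpha)% margin of error: z_{alpha/2} * sigma / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)   # z_{alpha/2}
    return z * sigma / sqrt(n)

# Illustrative inputs (assumed): sigma = 15, n = 50, alpha = 0.02
print(round(margin_of_error(15, 50, 0.02), 2))
```

Because n enters through √n, quadrupling the sample size halves the margin.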
Minor problem:
• The formulas require the population standard deviation σ.
• But usually we don’t know σ!
• Possible solution: Replace the unknown σ by S, the sample
standard deviation, which we can compute from the data.
• This will work well if n is large.
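The substitution of S for σ might look like the sketch below. The sample here is simulated with an assumed mean of 100 and s.d. of 15, purely as a stand-in for real data, and n = 400 is chosen so that the large-n condition is plausible.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean, stdev

random.seed(1)  # fixed seed so the run is repeatable

# Simulated stand-in for observed data (assumed population: mean 100, s.d. 15).
sample = [random.gauss(100, 15) for _ in range(400)]

n = len(sample)
x_bar = fmean(sample)                     # point estimate of mu
s = stdev(sample)                         # sample standard deviation S
z = NormalDist().inv_cdf(1 - 0.02 / 2)    # critical value for a 98% margin

margin = z * s / sqrt(n)                  # S used in place of the unknown sigma
print(f"estimate = {x_bar:.2f}, 98% margin of error = {margin:.2f}")
```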
Example 3. page 327–328.
Computing sample sizes
• Suppose we want a 98% margin of error to be equal to 2.
How large does n have to be?
• We know
$$2 = 2.33\,\frac{\sigma}{\sqrt{n}}.$$
• Solve for n to get
$$n = \left[\frac{2.33\sigma}{2}\right]^2.$$
• So if σ = 7, then
$$n = \left[\frac{2.33(7)}{2}\right]^2 \approx 66.5,$$
so we would use n = 67.
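The sample-size calculation, including the step of rounding up to a whole number, can be sketched as follows. The function name `sample_size` is hypothetical; the inputs are the table value 2.33, σ = 7, and the target margin 2 from the slide.

```python
from math import ceil

def sample_size(z, sigma, target_margin):
    """Smallest whole n with z * sigma / sqrt(n) <= target_margin."""
    return ceil((z * sigma / target_margin) ** 2)

print(sample_size(2.33, 7, 2))
```

Rounding up rather than to the nearest integer guarantees the margin is no larger than the target.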
Example
• Model human pregnancy lengths (in days) by a normal density with mean µ and standard deviation σ = 16.
• Based on a sample of size n = 27, compute a 99% margin
of error.
• From Table 3, the area between −2.57 and 2.57 is about
0.99.
• So the 99% margin of error is
$$\frac{2.57(16)}{\sqrt{27}} \approx 7.91.$$
• If we want the 99% margin of error to be 4, how large does
n have to be?
• Margin of error is
$$\frac{2.57(16)}{\sqrt{n}}.$$
• Solve for n to get
$$n = \left[\frac{2.57(16)}{4}\right]^2 \approx 105.7,$$
so we would use n = 106.
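Both parts of the pregnancy example can be reproduced in a few lines. The sketch uses the table value 2.57 so that it matches the slide; the second calculation, which swaps in a software critical value, is an addition for comparison and not part of the lecture.

```python
from math import ceil, sqrt
from statistics import NormalDist

SIGMA = 16        # s.d. of pregnancy length in days (from the example)
Z_TABLE = 2.57    # Table 3 value used in the lecture for 99%

# 99% margin of error for n = 27
margin_27 = Z_TABLE * SIGMA / sqrt(27)

# Sample size needed for a 99% margin of error of 4 days
n_table = ceil((Z_TABLE * SIGMA / 4) ** 2)

# Assumption: same calculation with an unrounded critical value (about 2.576)
z_exact = NormalDist().inv_cdf(0.995)
n_exact = ceil((z_exact * SIGMA / 4) ** 2)

print(round(margin_27, 2), n_table, n_exact)
```

With the unrounded critical value the required n comes out one larger than with the table value, so the answer here is sensitive to how the table entry is rounded.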