Chapter 16: Confidence Intervals: The Basics
· The reasoning of statistical estimation
Inferential statistics: consists of procedures used to make inferences or generalizations
about population characteristics from information contained in a sample drawn from this
population.
⇓
Learn about a large group by examining data from some of its members.
Large group (what we are studying ⇒ Population)
Parameter: A numerical measurement describing some characteristic of a population.
Mean (µ), Variance (σ²), Standard Deviation (σ)
Small part of the large group (where our information comes from ⇒ Sample)
Statistic: A numerical measurement describing some characteristic of a sample.
Mean (x̄), Variance (s²), Standard Deviation (s)
· Conditions for Inference about a Mean
1. We have an SRS from the population of interest. There is no nonresponse or other
practical difficulty. The population is large compared to the size of the sample.
2. The variable we measure has an exactly Normal distribution N (µ, σ) in the population.
3. We do not know the population mean µ, but we do know the population standard
deviation σ.
· Margin of error and confidence level
Ex. An NHANES report gives data for 654 women aged 20 to 29 years. The mean BMI
of these 654 women was x̄ = 26.8. On the basis of this sample, we want to estimate
the mean BMI µ in the population of all 20.6 million women in this age group. To
match the simple conditions, we will treat the NHANES sample as an SRS from
a Normal population with known standard deviation σ = 7.5.
1. Estimate the unknown population mean.
We can use x̄ = 26.8 as a point estimate of the unknown population mean µ.
Theory Behind = A point estimate: A single value used to approximate a population
parameter.
An unbiased estimator makes a good point estimator. The sample mean x̄ is an unbiased estimator of µ.
2. What is the standard deviation of the sampling distribution of x̄?
$\frac{\sigma}{\sqrt{n}} = \frac{7.5}{\sqrt{654}} = 0.29327 \approx 0.29$
Theory Behind = If x̄ is the mean of an SRS of size n drawn from a large population
with mean µ and standard deviation σ, then
1) The mean of the sampling distribution of x̄ is the population mean µ.
2) The standard deviation of the sampling distribution of x̄ is $\sigma/\sqrt{n}$.
3. Using the Empirical Rule (the 68-95-99.7 rule), estimate the interval that captures µ
for about 95% of all samples.
$\bar{x} - 2 \cdot \frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + 2 \cdot \frac{\sigma}{\sqrt{n}}$
$26.8 - 2 \cdot (0.29) < \mu < 26.8 + 2 \cdot (0.29)$
$26.22 < \mu < 27.38$
$26.2 < \mu < 27.4$
4. Interpreting a Confidence Interval
We can say that we are 95% confident that the true mean BMI µ of all young women is
some value between 26.2 and 27.4.
or
We are 95% confident that the interval from 26.2 to 27.4 actually does contain
the true value of the population mean µ.
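Steps 2 and 3 above can be reproduced with a few lines of Python. This is only a minimal sketch using the NHANES numbers from the example; the variable names are illustrative and not part of the original notes. It keeps full precision for σ/√n instead of rounding to 0.29 first, which gives the same limits once rounded to one decimal place.

```python
import math

# Values taken from the NHANES example in the text.
x_bar = 26.8   # sample mean BMI
sigma = 7.5    # population standard deviation (treated as known)
n = 654        # sample size

# Standard deviation of the sampling distribution of the sample mean.
std_error = sigma / math.sqrt(n)

# 68-95-99.7 rule: about 95% of sample means fall within
# 2 standard deviations of the population mean.
margin = 2 * std_error
lower, upper = x_bar - margin, x_bar + margin

print(f"sigma/sqrt(n) = {std_error:.5f}")
print(f"approximate 95% interval: {lower:.1f} < mu < {upper:.1f}")
```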
· Confidence intervals for a population mean
A Confidence Interval (CI): A range (or an interval) of values used to estimate
the true value of a population parameter.
The confidence level: The probability 1 − α that the method produces an interval that
actually contains the population parameter.
There are three common choices for the confidence level:
90% (α = 0.10)
95% (α = 0.05)
99% (α = 0.01)
Critical Values (z*): The number z* is a critical value: a z score with the property
that it separates an area of α/2 in the right tail of the standard normal
distribution.
Confidence Level    α       Critical Value (z*)
————————————————————————————
90%                 0.10    1.645
95%                 0.05    1.960
99%                 0.01    2.576
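The critical values in the table can be checked numerically. The sketch below assumes Python 3.8 or later, where the standard library provides `statistics.NormalDist`; the helper name `critical_value` is hypothetical and chosen only for illustration.

```python
from statistics import NormalDist

def critical_value(confidence_level):
    """Return z*, the z score with area alpha/2 in the right tail
    of the standard normal distribution."""
    alpha = 1 - confidence_level
    # inv_cdf gives the z score with the requested area to its LEFT,
    # so the right-tail area alpha/2 corresponds to 1 - alpha/2.
    return NormalDist().inv_cdf(1 - alpha / 2)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%}  alpha = {1 - level:.2f}  z* = {critical_value(level):.3f}")
```

Rounded to three decimal places, the computed values agree with 1.645, 1.960, and 2.576.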
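C.I. for Estimating a Population Mean (with σ Known):
$\bar{x} - E < \mu < \bar{x} + E$, where $E = z^* \cdot \frac{\sigma}{\sqrt{n}}$ (Margin of error)
$\bar{x} - z^* \cdot \frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + z^* \cdot \frac{\sigma}{\sqrt{n}}$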
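The interval formula translates directly into a small function. This is a sketch under the three simple conditions (SRS, Normal population, known σ); the function name `z_confidence_interval` and its parameter names are assumptions made for illustration, and the usage line reuses the NHANES numbers from earlier in the chapter.

```python
import math
from statistics import NormalDist

def z_confidence_interval(x_bar, sigma, n, confidence_level):
    """Return (lower, upper, margin) for a population mean with sigma known."""
    alpha = 1 - confidence_level
    z_star = NormalDist().inv_cdf(1 - alpha / 2)   # critical value z*
    margin = z_star * sigma / math.sqrt(n)         # margin of error E
    return x_bar - margin, x_bar + margin, margin

# NHANES example: x_bar = 26.8, sigma = 7.5, n = 654, 95% confidence.
lower, upper, margin = z_confidence_interval(26.8, 7.5, 654, 0.95)
print(f"E = {margin:.2f}")
print(f"{lower:.1f} < mu < {upper:.1f}")
```

Using z* = 1.960 instead of the rule-of-thumb multiplier 2 makes the margin slightly smaller (about 0.57 rather than 0.58), but the limits still round to 26.2 and 27.4.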
Round-Off Rule for Confidence Interval Estimates of a Mean.
· From the original set of data values: Round the confidence interval limits to one more
decimal place than is used for the original set of data.
· From summary statistics: Round the confidence interval limits to the same number
of decimal places used for the sample mean.
Ex. NCAA Football Coach Salaries
(Population Mean: σ Known)
A simple random sample of 40 salaries of NCAA football coaches has a mean of $415,953.
Assume that σ = $63,364. Assume that the population is normally distributed. Hence
all three conditions of “Simple Conditions for Inference about a Mean” are satisfied.
a. Find the best point estimate of the mean salary (unknown population mean)
of all NCAA football coaches.
b. Construct a 99% confidence interval estimate of the mean salary of an NCAA
football coach.
c. Interpret the confidence interval for the mean in words.
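One way to work parts a and b is sketched below. It assumes the critical value z* = 2.576 for 99% confidence and rounds the limits to whole dollars, following the round-off rule for summary statistics. The arithmetic is added here for illustration and is not part of the original notes.

```latex
\begin{align*}
\text{a. } \bar{x} &= \$415{,}953 \\
\text{b. } E &= z^* \cdot \frac{\sigma}{\sqrt{n}}
             = 2.576 \cdot \frac{63{,}364}{\sqrt{40}} \approx 25{,}808 \\
\bar{x} - E &< \mu < \bar{x} + E \\
\$390{,}145 &< \mu < \$441{,}761
\end{align*}
```

For part c, one possible wording: we are 99% confident that the interval from $390,145 to $441,761 contains the true mean salary of all NCAA football coaches.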
Ex. (Population Mean: σ Known)
Suppose that you give the NAEP test to an SRS of 900 eighth-graders from a large
population in which the standard deviation is σ = 125. The sample mean is x̄ = 288.
Construct a 90% confidence interval for µ.
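A sketch of the calculation, assuming the critical value z* = 1.645 for 90% confidence (the arithmetic is added for illustration and is not part of the original notes):

```latex
\begin{align*}
E &= z^* \cdot \frac{\sigma}{\sqrt{n}}
   = 1.645 \cdot \frac{125}{\sqrt{900}}
   = 1.645 \cdot 4.1667 \approx 6.85 \\
288 - 6.85 &< \mu < 288 + 6.85 \\
281.15 &< \mu < 294.85
\end{align*}
```

Rounded to the same number of decimal places as the sample mean, the interval is 281 < µ < 295.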