UNIT 4
Section 8
Estimating Population Parameters using Confidence Intervals
To make inferences about a population that cannot be surveyed entirely, sample statistics can be taken
from an SRS of the population and used to estimate population parameters. Recall that parameters are
unknown fixed values about populations, such as μ and p.
• The population mean μ is estimated using x̄ (x-bar), the sample mean.
• The population proportion p is estimated using p̂ (p-hat), the sample proportion.
• The population standard deviation σ is estimated using s, the sample standard deviation.
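These point estimators are the best estimators of parameters; x̄ and p̂ are unbiased estimators (because the
mean of the sampling distribution is equal to the value of the parameter). The sample standard deviation s
slightly underestimates σ on average, although s² is an unbiased estimator of σ².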
Using these statistics from sample data, we create Confidence Intervals to provide a range of values to
capture the true parameter. The interval is based on a Confidence Level that determines how precise
our estimate is and how confident we are that our interval captures the true parameter.
Confidence Interval & Confidence Level
Because the point estimate (the mean of the sample x̄ or the proportion of the sample p̂) will vary with
each new sample and will not necessarily be the exact value of the population parameter, a range of
plausible values must be provided.
This Confidence Interval is an interval of values constructed around the point estimate, which sits at the
center of the interval, using a margin of error. The margin of error is added to and subtracted from the
point estimate to provide an interval, and it accounts for the chance variation in the point estimate that
exists between samples. However, the margin of error does not account for biased sampling methods.
A confidence interval has the general form:
point estimate ± margin of error
statistic ± (critical value) · (standard deviation of statistic)
statistic ± (z or t score) · (standard error)
The critical value is the z-score (or later t-score) obtained based on the confidence level (which
provides an area).
If given the confidence interval limits, the point estimate can be found using:
\[ \hat{p} = \frac{(\text{upper confidence limit}) + (\text{lower confidence limit})}{2} \]
If given the confidence interval limits, the margin of error m can be found using:
\[ m = \frac{(\text{upper confidence limit}) - (\text{lower confidence limit})}{2} \]
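The two formulas above can be sketched in a few lines of Python. This is only an illustration: the function name `estimate_and_margin` and the interval (0.20, 0.30) are hypothetical and are not taken from the practice problems.

```python
def estimate_and_margin(lower, upper):
    """Recover the point estimate and margin of error from interval limits."""
    point_estimate = (upper + lower) / 2   # midpoint of the interval
    margin_of_error = (upper - lower) / 2  # half the width of the interval
    return point_estimate, margin_of_error


# Hypothetical interval chosen only for illustration.
estimate, margin = estimate_and_margin(0.20, 0.30)
print(round(estimate, 3), round(margin, 3))
```

For the hypothetical interval (0.20, 0.30), the point estimate is 0.25 and the margin of error is 0.05.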
PRACTICE: Use the given confidence interval to find the point estimate and the margin of error.
a. (.344, .528)
b. (-.18, .24)
The Confidence Level is a predetermined percentage that represents how confident we are that the
confidence interval constructed will actually capture the true population parameter. The margin of
error decreases as the Confidence Level decreases and as the sample size n increases, narrowing the
interval.
Our use of a 95% confidence level can be interpreted as, "We are using a method that will provide
correct results in 95% of all confidence intervals constructed using randomly obtained data."
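One way to see what this interpretation means is a small simulation. The sketch below is an assumption-laden illustration: it draws many random samples from a normal population with a hypothetical mean of 100 and known standard deviation of 15, builds a 95% interval from each sample, and counts how often the interval captures the true mean. The function name and all default values are invented for the demonstration.

```python
import random
from statistics import NormalDist, mean


def coverage_rate(mu=100.0, sigma=15.0, n=30, confidence=0.95,
                  trials=2000, seed=1):
    """Fraction of simulated intervals that capture the true mean mu."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # critical value
    margin = z * sigma / n ** 0.5  # margin of error with known sigma
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        x_bar = mean(sample)
        if x_bar - margin <= mu <= x_bar + margin:
            hits += 1
    return hits / trials


print(coverage_rate())
```

The printed fraction should land close to 0.95: the confidence level describes the long-run success rate of the method, not the chance that any single interval is correct.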
To sketch a confidence interval, draw a normal curve and label the interval's critical values based on
the confidence level. For a 95% confidence level, find the z-score that corresponds to the area to the
left of the lower bound (and to the right of the upper bound) using a z-score sheet or a calculator:
• Using the z-score sheet, look in the body of the table for area .0250 [(1 − .95)/2] to obtain a z-score
of −1.960, which gives critical values ±1.96 (as standard deviations below and above the point estimate).
• Using the calculator, press "2nd," "Distribution," "invNorm(.025)," "Enter" to obtain a z-score of
−1.960, which gives critical values ±1.96 (as standard deviations below and above the point estimate).
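The same lookup can be sketched in Python, where `NormalDist().inv_cdf` from the standard library plays the role of the calculator's invNorm. The helper name `z_critical` is an assumption made for this example.

```python
from statistics import NormalDist


def z_critical(confidence):
    """Positive critical z-score for a two-sided confidence level."""
    tail_area = (1 - confidence) / 2  # area left of the lower bound
    # inv_cdf returns the negative z-score for the lower tail; flip the sign.
    return -NormalDist().inv_cdf(tail_area)


print(round(z_critical(0.95), 3))  # 1.96
```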
Interpret the interval you have constructed in context of the problem. Be sure to correctly interpret the
meaning of the confidence interval: "We are 95% confident that the (true parameter) lies between
(lower limit) and (upper limit)." The remaining 5%, split equally between the upper 2.5% and lower 2.5%
tails, represents the intervals built by this method that fail to capture the actual parameter.
PRACTICE: Sketch and label a normal curve with the z-scores for the given confidence levels.
a. 85%
b. 90%
Margin of Error & Required Sample Size
Sample size depends on the desired confidence level and margin of error. To find a required sample size,
solve the appropriate margin of error formula for n and round up to the next larger integer.
To estimate p:
\[ m \ge Z \sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \]
To estimate μ with σ:
\[ m \ge Z \frac{\sigma}{\sqrt{n}} \]
To estimate μ without σ:
\[ m \ge t \frac{s}{\sqrt{n}} \]
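Solving the first two margin of error formulas for n gives n ≥ Z²·p̂(1 − p̂)/m² and n ≥ (Z·σ/m)². A minimal Python sketch of that rearrangement follows. The function names and the input values are hypothetical, chosen so they do not coincide with the practice problems.

```python
import math
from statistics import NormalDist


def z_critical(confidence):
    """Positive critical z-score for a two-sided confidence level."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)


def sample_size_for_proportion(p_hat, margin, confidence):
    """Smallest n with Z * sqrt(p_hat * (1 - p_hat) / n) <= margin."""
    z = z_critical(confidence)
    return math.ceil(z ** 2 * p_hat * (1 - p_hat) / margin ** 2)


def sample_size_for_mean(sigma, margin, confidence):
    """Smallest n with Z * sigma / sqrt(n) <= margin (sigma known)."""
    z = z_critical(confidence)
    return math.ceil((z * sigma / margin) ** 2)


# Hypothetical inputs for illustration only.
print(sample_size_for_proportion(0.5, 0.03, 0.95))  # 1068
print(sample_size_for_mean(10, 1, 0.95))            # 385
```

`math.ceil` does the rounding up: any fractional result means the next whole person is needed to keep the margin of error within the target.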
PRACTICE: Find the margin of error that corresponds to the given statistic and confidence level.
a. x̄ = 70, n = 100, 95% confidence, s = 12
b. sample size 500, 20% successes, 99% confidence
PRACTICE: How many doctors must be randomly selected for IQ tests if we want to estimate the
mean IQ score with 95% confidence that the sample mean is within two IQ points of the population
mean?
PRACTICE: What is the minimum sample size needed in order to be 99% confident that the margin of
error is at most 6% when the p̂ value is estimated to be .76?
Confidence Interval for Estimating Population PROPORTION p
A four-step system is used to organize the inference process when estimating proportions:
I. Identify the population and the parameter of interest in context.
II. Verify the assumptions/conditions (a-c), and identify the procedure (d) in context.
a. Data are randomly obtained (SRS). (Otherwise, this condition may be assumed, which may
limit the ability to generalize the results to the population.)
b. Population is at least 10 times sample size n (for independence and to find standard
deviation).
c. Both np̂ ≥ 10 and n(1 − p̂) ≥ 10 must be met (for normal approximation).
d. State, "We have verified the conditions for a (type of interval) for (parameter)."
III. With conditions met, carry out the selected procedure. (Show work)
Find confidence interval: estimate ± margin of error
\[ \hat{p} \pm Z \sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \]
IV. Interpret results in context of the problem.
"We are (confidence level)% confident that the true (parameter) of (the specific
population) lies between (the interval bounds)."
(Confirm results with calculator: STAT, TESTS, 1-PropZInt, (enter values), Calculate)
Example 1: The New York Times and CBS News conducted a nationwide poll of an SRS of 1048
randomly selected 13- to 17-year olds. Of these teenagers, 692 had a television in their room.
Construct a 95% confidence interval to estimate p.
I. We are interested in estimating p, the proportion of 13- to 17-year olds who have a
television in their room.
II. As stated in the problem, data are obtained from an SRS of 13- to 17-year olds.
It is safe to assume that the population is comprised of at least 10,480 (10 × sample
size) 13- to 17-year olds.
\[ n\hat{p} \ge 10: \quad (1048)\left(\frac{692}{1048}\right) = 692 \ge 10 \]
\[ n(1-\hat{p}) \ge 10: \quad (1048)\left(\frac{356}{1048}\right) = 356 \ge 10 \]
It is safe to use the normal approximation.
We have verified the conditions for a one-proportion Z confidence interval for p.
III.
\[ \hat{p} \pm Z \sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \]
\[ \frac{692}{1048} \pm 1.96 \sqrt{\frac{\left(\frac{692}{1048}\right)\left(\frac{356}{1048}\right)}{1048}} \]
\[ .6603 \pm 1.96 \sqrt{\frac{.6603(.3397)}{1048}} \]
\[ .6603 \pm .0287 \]
(.6316, .6890)
IV. We are 95% confident that the true proportion of 13- to 17-year olds who have a
television in their room lies between 63.16% and 68.90%.
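The arithmetic in Example 1 can be checked with a short Python sketch, in the same spirit as confirming with 1-PropZInt. The function name `one_proportion_z_interval` is an assumption for this illustration, and the code checks only the normal approximation condition; the random sampling and population size conditions still have to be verified by reasoning about the problem.

```python
from statistics import NormalDist


def one_proportion_z_interval(successes, n, confidence):
    """One-proportion Z confidence interval: p_hat +/- Z * sqrt(p_hat*(1-p_hat)/n)."""
    p_hat = successes / n
    # Normal approximation condition from step II(c).
    if n * p_hat < 10 or n * (1 - p_hat) < 10:
        raise ValueError("n*p_hat and n*(1 - p_hat) must both be at least 10")
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # critical value
    margin = z * (p_hat * (1 - p_hat) / n) ** 0.5       # margin of error
    return p_hat - margin, p_hat + margin


# Values from Example 1: 692 of 1048 teenagers, 95% confidence.
lower, upper = one_proportion_z_interval(692, 1048, 0.95)
print(f"({lower:.4f}, {upper:.4f})")  # (0.6316, 0.6890)
```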
Example 2: We are interested in the proportion of seniors at CBHS who plan to attend college
in the state of Florida. We randomly selected 50 seniors who plan to attend college and 37 of
them say they will be going to school in state. Construct a 99% confidence interval to estimate p.
Confidence Interval for Estimating Population MEAN μ WITH standard deviation σ
A four-step system is used to organize the inference process when estimating means:
I. Identify the population and the parameter of interest in context.
II. Verify the assumptions/conditions (a-c), and identify the procedure (d) in context.
a. Data are randomly obtained (SRS). (Otherwise, this condition may be assumed, which may
limit the ability to generalize the results to the population.)
b. Population is at least 10 times sample size n (for independence and to find standard
deviation).
c. Either n ≥ 30 or the original population is normally distributed (for normal
approximation).
d. State, "We have verified the conditions for a (type of interval) for (parameter)."
III. With conditions met, carry out the selected procedure. (Show work)
Find confidence interval: estimate ± margin of error
\[ \bar{x} \pm Z \frac{\sigma}{\sqrt{n}} \]
IV. Interpret results in context of the problem.
"We are (confidence level)% confident that the true (parameter) of (the specific
population) lies between (the interval bounds)."
EXAMPLE 1: A sample of 54 bears from Yellowstone National Park has a mean weight of 182.9 lb.
Assuming that σ is known to be 121.8 lb., find a 99% CI estimate of the mean of the population of all
such bear weights.
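As a rough check on step III for this example, the interval x̄ ± Z·σ/√n can be sketched in Python. The function name `z_interval_for_mean` is hypothetical. The code only does the arithmetic; the conditions and the interpretation in context still need to be written out by hand.

```python
from statistics import NormalDist


def z_interval_for_mean(x_bar, sigma, n, confidence):
    """Z confidence interval for a mean when the population sigma is known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # critical value
    margin = z * sigma / n ** 0.5                       # margin of error
    return x_bar - margin, x_bar + margin


# Values from the bear example: n = 54, mean 182.9 lb, known SD 121.8 lb, 99%.
lower, upper = z_interval_for_mean(182.9, 121.8, 54, 0.99)
print(f"({lower:.1f}, {upper:.1f})")
```

The result should be roughly 140.2 lb to 225.6 lb. A table value of 2.576 for the critical z-score may differ from the computed one in the last decimal place.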
Confidence Interval for Population Mean μ WITHOUT Standard Deviation σ
Follow the same procedures as above. The Confidence Interval, CI, is found using:
\[ \bar{x} \pm t \frac{s}{\sqrt{n}} \]
x̄, the sample mean, the best point estimate of the population mean
s, standard deviation of sample
m, margin of error for mean without σ
Use the t-distribution to obtain larger critical values when the population standard deviation is not
known. The t-distribution will provide a wider interval of values as we must approximate using the sample
standard deviation s. As with known population standard deviation, the greater the sample size, the
narrower the interval. When determining "degrees of freedom," choose one fewer than sample size n on the
t-score sheet (or in calculator), df = n − 1, and always round down to the next lowest degrees of freedom
when the value isn't exact on the sheet.
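The Python standard library has no inverse t-distribution, so the sketch below stands in for the t-score sheet by integrating the t density numerically (Simpson's rule) and searching for the critical value by bisection. This is a swapped-in numerical technique for illustration, not the handout's method. The names `t_pdf`, `t_cdf`, `t_critical`, and `t_interval`, and the sample values at the end, are all hypothetical. Where SciPy is installed, `scipy.stats.t.ppf` gives the same critical value directly.

```python
import math


def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    c = math.exp(log_c) / math.sqrt(df * math.pi)
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(x, df, steps=2000):
    """Area to the left of x, using Simpson's rule on [0, |x|] and symmetry."""
    a = abs(x)
    h = a / steps  # steps is even, as Simpson's rule requires
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area


def t_critical(confidence, df):
    """Positive critical t-score for a two-sided confidence level."""
    target = 1 - (1 - confidence) / 2
    lo, hi = 0.0, 100.0
    for _ in range(60):  # bisection on the CDF
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def t_interval(x_bar, s, n, confidence):
    """t confidence interval for a mean, with df = n - 1."""
    margin = t_critical(confidence, n - 1) * s / n ** 0.5
    return x_bar - margin, x_bar + margin


# Hypothetical sample: n = 25, mean 50, sample SD 10, 95% confidence.
print(round(t_critical(0.95, 24), 3))  # 2.064, larger than z = 1.960
print(t_interval(50, 10, 25, 0.95))
```

Because this code uses the exact df = n − 1, its critical value can be slightly smaller than the one found by rounding down to the nearest row on a printed t-score sheet.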
PRACTICE: Find critical t value for the following:
1. n = 15; SRS; normally distributed population; 95% confidence level; s = 2.1
2. Construct a CI (without 4-step process) for estimating mean:
n = 20 (from a normally distributed population)
x̄ = 4.4 in
s = 4.2 in
99% confidence level
EXAMPLE 2: A sample of 54 bears from Yellowstone National Park has a mean weight of 182.9 lb.
Assuming that σ is not known, but the sample standard deviation is 111.2 lb., construct a 99% CI
estimate of the mean of the population of all such bear weights.