8.3 Estimating a Population Mean
Objectives
SWBAT:
• STATE and CHECK the Random, 10%, and Normal/Large Sample conditions for constructing a
confidence interval for a population mean.
• EXPLAIN how the t distributions are different from the standard Normal distribution and why it is
necessary to use a t distribution when calculating a confidence interval for a population mean.
• DETERMINE critical values for calculating a C% confidence interval for a population mean using a
table or technology.
• CONSTRUCT and INTERPRET a confidence interval for a population mean.
• DETERMINE the sample size required to obtain a C% confidence interval for a population mean
with a specified margin of error.
Calculator BINGO! Let’s use the Rossman/Chance applet to simulate the calculator
activity (from pg 509). Did one method do a better job of capturing the true mean?
Go to www.rossmanchance.com
Click on Rossman/Chance Applet Collection.
Under “Sampling Distribution Simulations” click on “Simulating Confidence Intervals for
Population Parameter”
Under “Method” select Means, Normal, z with sigma
Enter mu of 100, sigma of 40, n of 4, and change conf level to 99%. Calculate a few
intervals, then change intervals to 1000, or even 10000. How often do the intervals
contain mu?
What’s the problem with the previous approach?
How can we truly know the population standard deviation 𝜎?
Let’s estimate sigma using our sample standard deviation s.
Change the method to z with s and calculate another 1000 intervals. What happens?
The intervals that missed were typically too short. How can we become more accurate?
William Gosset fixed this problem by using a bigger multiplier: a t* critical value
rather than a z* critical value.
Return to the applet and switch the method to t. What happens?
Gosset worked for a Guinness brewery in Dublin and published under the pen name
“Student.” He developed the “Student’s t distribution.”
He had to use a pen name because a previous Guinness employee published a paper
containing secrets of the Guinness brewery, so in order to prevent further disclosure of
confidential material, Guinness prohibited its employees from publishing any papers,
regardless of the information they contained.
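If you can't get to the applet, the same experiment can be roughed out in Python. This is only a sketch, not the applet's actual code: it assumes a Normal population with the settings from the activity (mu = 100, sigma = 40, n = 4, 99% confidence), and it hard-codes t* = 5.841 from the t table (df = 3, tail probability .005). The function name coverage is made up for this illustration.

```python
import random
from statistics import NormalDist, mean, stdev

MU, SIGMA, N = 100, 40, 4               # applet settings from the activity
Z_STAR = NormalDist().inv_cdf(0.995)    # 99% z* critical value
T_STAR = 5.841                          # 99% t* for df = 3, read from a t table


def coverage(reps=10_000, seed=1):
    """Return the fraction of intervals that capture MU for each method."""
    rng = random.Random(seed)
    hits = {"z with sigma": 0, "z with s": 0, "t": 0}
    for _ in range(reps):
        sample = [rng.gauss(MU, SIGMA) for _ in range(N)]
        xbar, s = mean(sample), stdev(sample)
        margins = {
            "z with sigma": Z_STAR * SIGMA / N ** 0.5,
            "z with s": Z_STAR * s / N ** 0.5,
            "t": T_STAR * s / N ** 0.5,
        }
        # An interval captures MU when xbar lands within one margin of MU.
        for method, me in margins.items():
            if abs(xbar - MU) <= me:
                hits[method] += 1
    return {method: count / reps for method, count in hits.items()}


results = coverage()
for method, rate in results.items():
    print(f"{method:>12}: {rate:.3f}")
```

The "z with s" capture rate should come out well below 99%, while the other two should land close to 99%.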
When should we use a t* critical value rather than a z* critical value for
calculating a confidence interval for a population mean?
We use t* rather than z* when we have to use s (sample SD) to estimate 𝜎 (population SD).
How do we calculate the value of t* to use? How do we calculate degrees of
freedom?
• Use the t table with df = n – 1 and look up the area in one tail.
• If you have a sample size of 12, look up df of 11.
• If you have a sample size of 48, your df would be 47, but the table does not
have a row for 47, so you need to be more conservative and use df = 40.
You can also use the command invT(area: , df: ). The area is the area to the left of
the desired critical value. For example, if constructing a 95% confidence interval,
you would enter area: 0.025 (the same tail probability you’d look up in the table).
Because 0.025 is a left-tail area, invT returns –t*; drop the negative sign (or enter
area: 0.975) to get t*.
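The "be more conservative" rule is just "round df down to the nearest row in the table." Here is a small sketch of that lookup. The list of rows is an assumption about what a typical printed t table contains (1 through 30, then 40, 50, 60, 80, 100, 1000), so check it against your own table; the function name conservative_df is made up.

```python
from bisect import bisect_right

# Assumed rows of a typical printed t table; compare with your own table.
TABLE_DF = list(range(1, 31)) + [40, 50, 60, 80, 100, 1000]


def conservative_df(n):
    """Return the largest table df that does not exceed n - 1."""
    df = n - 1
    if df < 1:
        raise ValueError("need a sample size of at least 2")
    # bisect_right counts the table rows that are <= df.
    return TABLE_DF[bisect_right(TABLE_DF, df) - 1]


print(conservative_df(12))  # df = 11 is a row in the table
print(conservative_df(48))  # df = 47 is not, so it rounds down
```

Rounding down gives a slightly larger t*, so the interval is a bit wider than necessary rather than too narrow.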
A Carucci sidebar: What are degrees of freedom?
The number of degrees of freedom for a collection of sample data is the number
of sample values that can vary after certain restrictions have been imposed on
all data values.
For example, if 10 students have quiz scores with a mean of 80, we can freely
assign values to the first 9 scores, but the 10th score is then determined. The
sum of the 10 scores must be 800, so the 10th score must equal 800 minus the
sum of the first 9 scores. Because those first 9 scores can be freely selected to
be any values, we say that there are 9 degrees of freedom available.
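The quiz-score example can be checked with a few lines of Python. The nine scores below are made up; any nine values would work, which is the point.

```python
from statistics import mean

# Nine freely chosen (made-up) quiz scores.
first_nine = [72, 85, 90, 66, 78, 95, 81, 74, 88]
required_total = 10 * 80          # 10 scores with a mean of 80

# Once nine scores are fixed, the tenth has no freedom left.
tenth = required_total - sum(first_nine)
scores = first_nine + [tenth]

print(tenth, mean(scores))
```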
What is a t distribution anyway? Describe the shape, center, and spread of the
t distributions. Note: t does not stand for Taylor.
• Shape: symmetric, unimodal, but not quite Normal. It has heavier tails. The t
distribution approaches the Normal distribution as df increase.
• Center: 0, since t is a standardized score
• Spread: greater than a standard Normal distribution, but gets closer to
Normal as the df increase. This means we need to go farther than 1.96 SD
to have 95% confidence.
Example: t* critical values
a) Suppose you want to construct a 90% confidence interval for the mean 𝜇 of
a population based on an SRS of size 10. What critical value t* should you
use?
Table: Using the row for df = 10 – 1 = 9 and the column that is above the 90%
confidence level (tail probability .05), the desired critical value is t* = 1.833.
Calculator: invT(area: .05, df: 9) = –1.833, so t* = 1.833
b) What if you wanted to construct a 99% confidence interval for 𝜇 using a
sample size of 75?
Table: With n = 75, df = 75 – 1 = 74. Because there is no row for df = 74, we use
the more conservative df = 60. Using the column for 99% confidence (.005), t*
= 2.660.
Calculator: invT(area: .005, df: 74) = –2.644, so t* = 2.644
Note: the two values differ because the calculator uses the exact df = 74, while the
table forces us to use the more conservative df = 60.
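For anyone curious how technology gets these values, here is a rough standard-library sketch. It is not how the calculator's invT is actually implemented: it simply integrates the t density numerically (Simpson's rule) and searches for the cutoff by bisection. The function names t_pdf, t_cdf, and t_star are made up, and the search assumes t* is below 100, which holds for the usual table confidence levels. If SciPy happens to be installed, scipy.stats.t.ppf does the same job as invT.

```python
import math


def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def t_cdf(x, df, steps=2000):
    """P(T <= x) for x >= 0, using Simpson's rule on [0, x]."""
    h = x / steps
    total = t_pdf(0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3   # by symmetry, half the area lies left of 0


def t_star(conf_level, df):
    """Critical value with conf_level of the area between -t* and t*."""
    target = 1 - (1 - conf_level) / 2    # area to the left of +t*
    lo, hi = 0.0, 100.0                  # assumes t* < 100
    for _ in range(60):                  # bisection search
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


print(round(t_star(0.90, 9), 3))    # part (a): df = 9
print(round(t_star(0.99, 60), 3))   # part (b): conservative table df
print(round(t_star(0.99, 74), 3))   # part (b): exact df
```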
What are the three conditions for constructing a confidence interval for a
population mean?
As with proportions, you should check some important conditions
before constructing a confidence interval for a population mean.
Conditions For Constructing A Confidence Interval About A Mean
• Random: The data come from a well-designed random sample or
randomized experiment.
o 10%: When sampling without replacement, check that $n \le \frac{1}{10}N$.
• Normal/Large Sample: The population has a Normal distribution
or the sample size is large (n ≥ 30). If the population distribution
has unknown shape and n < 30, use a graph of the sample data
to assess the Normality of the population. Do not use t procedures
if the graph shows strong skewness or outliers.
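Two of these checks are plain arithmetic, so they can be sketched in code. The Random condition and the judgment call about a graph of the data cannot be automated this way. The function name check_conditions and the numbers in the example call (a sample of 36 from a population of 1000) are made up for illustration.

```python
def check_conditions(n, population_size=None):
    """Report the checks that can be done from the counts alone."""
    checks = {}
    if population_size is not None:
        # 10% condition: n should be at most one-tenth of N.
        checks["10%"] = n <= population_size / 10
    # Large Sample condition: n of at least 30.
    checks["large sample"] = n >= 30
    return checks


print(check_conditions(36, population_size=1000))
print(check_conditions(18))   # small sample: go graph the data
```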
What is the formula for the standard error of the sample mean? How do you
interpret this value? Is this formula on the formula sheet?
When the conditions for inference are satisfied, the sampling
distribution of $\bar{x}$ is roughly Normal. Because we
don’t know $\sigma$, we estimate it by the sample standard deviation $s_x$.
The standard error of the sample mean $\bar{x}$ is
$SE_{\bar{x}} = \dfrac{s_x}{\sqrt{n}}$,
where $s_x$ is the sample standard deviation.
Interpretation: it describes how far $\bar{x}$ will be from $\mu$, on
average, in repeated SRSs of size n.
Formula sheet? Taylor Swift must have gotten to the formula sheet before
us and shaken this formula off.
What is the formula for a confidence interval for a population mean? Is this
formula on the formula sheet?
One-Sample t Interval for a Population Mean
When the conditions are met, a C% confidence interval for the
unknown mean µ is
$\bar{x} \pm t^* \dfrac{s_x}{\sqrt{n}}$
where t* is the critical value for the t distribution with n – 1 degrees of
freedom, with C% of its area between −t* and t*.
The formula sheet contains the general confidence interval formula of:
statistic ± (critical value) × (standard deviation of statistic)
a) Construct and interpret a 95% confidence interval for the mean weight of an Oreo cookie.
Do: Because there are 36-1 = 35 degrees of freedom and we want 95%
confidence, we will use the t table and a conservative degrees of freedom of 30
to get a critical value of t* = 2.042 (use tail probability 0.025).
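On the calculator press STAT > TESTS > 8: TInterval
Stay on Stats and enter a mean of 11.3921, SD of 0.0817, n: 36, C-Level: 0.95
You get (11.364, 11.420). The calculator uses the exact df = 35, so its interval is
very slightly narrower than the one from the table.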
Conclude: We are 95% confident that the interval from 11.3645 grams to
11.4197 grams captures the true mean weight of an Oreo cookie.
b) On the packaging, the stated serving size is 3 cookies (34 grams). Does the
interval in part (a) provide convincing evidence that the average weight of an
Oreo cookie is less than advertised? Explain.
The stated serving size is 34 grams for 3 cookies, or 34/3 = 11.333 grams/cookie.
Because all of the plausible values in the interval are greater than 11.333 grams,
there is no evidence that the average weight of an Oreo cookie is less than
advertised. In fact, there is convincing evidence that the average weight is
greater than advertised!
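Here is the by-hand version of the Do step, plus the comparison from part (b), as a short Python sketch. It uses the summary statistics from the example and the conservative table value t* = 2.042, so the endpoints differ from the calculator's in the fourth decimal place.

```python
from math import sqrt

xbar, s, n = 11.3921, 0.0817, 36    # summary statistics from the example
t_star = 2.042                      # table value, conservative df = 30

se = s / sqrt(n)                    # standard error of the sample mean
me = t_star * se                    # margin of error
lower, upper = xbar - me, xbar + me

advertised = 34 / 3                 # grams per cookie from the packaging

print(f"SE = {se:.4f}, ME = {me:.4f}")
print(f"95% CI: ({lower:.4f}, {upper:.4f})")
print("Advertised weight below the whole interval?", advertised < lower)
```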
How can you lose credit for the Normal/Large Sample condition on the AP Exam?
• Not including a graph of the sample data
• It is not enough just to make a graph of the data on your calculator when assessing Normality.
You must sketch the graph on your paper to receive credit. You don’t have to draw multiple
graphs – any appropriate graph will do.
• Not understanding that the condition is about the population
What should you do if you think the Normal/Large Sample condition isn’t met?
You need to check whether it’s reasonable to believe that the population distribution is
Normal. You can make a boxplot or dotplot of the data to check for Normality of the
population (see example on pg 520-521).
Can you use your calculator for the Do step? Are there any drawbacks to this?
• You may use your calculator to compute a confidence interval on the AP exam, but
there is a risk involved. If you just give the calculator answer with no work you’ll get
either full credit for the “Do” step (if the interval is correct) or no credit (if it’s wrong).
If you opt for the calculator-only method, be sure to name the procedure (e.g., one
sample Tinterval) and to give the interval (e.g., 11.364 to 11.42).
• If you want to be able to get partial credit, show work.
Example: As part of their final project in AP Statistics, Amanda and Eva randomly selected 18
rolls of a generic brand of toilet paper to measure how well this brand could absorb water. To
do this, they poured ¼ cup of water onto a hard surface and counted how many squares it took
to completely absorb the water. Here are the results from their 18 rolls:
29
20
25
29
21
24
27
25
24
29
24
27
28
21
25
26
22
23
Construct and interpret a 99% confidence interval for 𝜇 = the mean number of squares of
generic toilet paper needed to absorb ¼ cup of water.
On the calculator: enter the data into a list.
Use the command TInterval, switch the input to Data, enter your list name, Freq: 1, C-Level: .99
You get (22.991, 26.897).
Conclude: We are 99% confident that the interval from 22.99 squares to 26.90
squares captures the true mean number of squares of generic toilet paper
needed to absorb 1/4 cup of water.
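To see the work behind the calculator's answer, the same interval can be built from the raw data in Python. This is a sketch using the 18 values from the example and the table value t* = 2.898 (df = 17, tail probability .005), so the endpoints can differ from the calculator's in the third decimal place.

```python
from math import sqrt
from statistics import mean, stdev

# Squares needed to absorb 1/4 cup of water, from the 18 rolls.
squares = [29, 20, 25, 29, 21, 24, 27, 25, 24,
           29, 24, 27, 28, 21, 25, 26, 22, 23]
n = len(squares)
t_star = 2.898                      # table value: df = 17, 99% confidence

xbar, s = mean(squares), stdev(squares)
me = t_star * s / sqrt(n)           # margin of error

print(f"n = {n}, mean = {xbar:.3f}, s = {s:.3f}")
print(f"99% CI: ({xbar - me:.2f}, {xbar + me:.2f})")
```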
How can we choose an appropriate sample size when we plan to calculate a
confidence interval for a mean?
$z^* \dfrac{\sigma}{\sqrt{n}} \le ME$
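Solving the margin of error inequality for n gives n ≥ (z*𝜎/ME)², and we always round up to the next whole number. A minimal sketch of that calculation follows. The function name required_n is made up, and the numbers in the example call are only for illustration: a guess of 40 for 𝜎 (borrowed from the applet settings), a margin of error of 10, and 95% confidence.

```python
from math import ceil
from statistics import NormalDist


def required_n(sigma_guess, margin, conf_level):
    """Smallest n with z* * sigma / sqrt(n) <= margin."""
    z_star = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    return ceil((z_star * sigma_guess / margin) ** 2)


print(required_n(sigma_guess=40, margin=10, conf_level=0.95))
```

Notice that cutting the margin of error in half takes about four times as many observations, because n depends on the square of 1/ME.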