STATISTICS 200
Lecture #16
Thursday, October 13, 2016
Textbook: Sections 9.3, 9.4, 10.1, 10.2
Objectives:
• Define standard error, relate it to both standard deviation and
sampling distribution ideas.
• Describe the sampling distribution of a sample proportion.
• Reformulate confidence interval formula using general idea of
estimate plus/minus (multiplier × standard error)
• Interpret confidence level as a relative frequency
• Calculate new values of the multiplier for new confidence
levels other than 95%
We now begin a strong focus on Inference
Proportions:
• One population proportion (this week)
• Two population proportions
Means:
• One population mean
• Difference between means
• Mean difference
Motivation
Eventual Goal: Use statistical inference to answer
the question “What is the percentage of Creamery
customers who prefer chocolate ice cream over
vanilla?”
Strategy:
Get a random sample of
90 individuals and ask
them this question. Use
the answers to perform a
hypothesis test to answer
the question.
Comparison of Binomial-based statistics
Variable     Count of successes (Chapter 8)     Proportion of successes (Chapter 9 and beyond)
Notation     $X$                                $\hat{p}$
Mean         $np$                               $p$
St. Dev.     $\sqrt{np(1-p)}$                   $\sqrt{p(1-p)/n}$
Binomial Distribution vs. approximate p-hat sampling
distribution: n = 100 & p = 0.70
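As a rough numerical check of this comparison, the sketch below computes the mean and standard deviation of both the count of successes and the sample proportion for n = 100 and p = 0.70. The function name `binomial_summaries` is illustrative and does not come from the course materials.

```python
import math

def binomial_summaries(n, p):
    """Return (mean, sd) for the count X and (mean, sd) for the proportion p-hat."""
    count_mean = n * p
    count_sd = math.sqrt(n * p * (1 - p))
    phat_mean = p
    phat_sd = math.sqrt(p * (1 - p) / n)
    return (count_mean, count_sd), (phat_mean, phat_sd)

(count_mean, count_sd), (phat_mean, phat_sd) = binomial_summaries(100, 0.70)
print(f"count: mean = {count_mean:.1f}, sd = {count_sd:.3f}")
print(f"p-hat: mean = {phat_mean:.2f}, sd = {phat_sd:.4f}")
```

The p-hat values are the count values divided by n, which is why the two distributions have the same shape on different scales.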
A better confidence interval
OLD: Conservative margin of error: $ME = \frac{1}{\sqrt{n}}$
NEW: ME = (multiplier) × (standard error)
New formula for margin of error
ME = (multiplier) × (standard error)
Multiplier (z*):
• Related to the Empirical rule.
• Expresses level of confidence that the interval includes the parameter.
Standard error:
• Estimate of the standard deviation of the sampling distribution of p-hat.
Z*-multiplier
• Use when the normal approximation is appropriate, i.e. when n*p > 10 and n*(1-p) > 10.
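The rule of thumb for the normal approximation can be sketched as a small check. The helper name `normal_approx_ok` and the example values of n and p are assumptions chosen for illustration; only the threshold of 10 comes from the slide.

```python
def normal_approx_ok(n, p, threshold=10):
    """Rule of thumb from the slide: n*p > 10 and n*(1-p) > 10."""
    return n * p > threshold and n * (1 - p) > threshold

# n*p = 70 and n*(1-p) = 30, so both conditions hold.
print(normal_approx_ok(100, 0.70))
# n*p = 5 and n*(1-p) = 5, so the sample is too small.
print(normal_approx_ok(10, 0.50))
# n*p = 4, so the condition fails even though n is fairly large.
print(normal_approx_ok(200, 0.02))
```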
Confidence level     Multiplier (z*)
90%                  1.65
95%                  1.96 ≈ 2
98%                  2.33
99%                  2.58
The z-multiplier for a 68% confidence level would be 1, because we must go 1 standard deviation from the mean to capture 68% of the area.
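To calculate a multiplier for a confidence level other than 95%, one option is to ask for the standard normal quantile that leaves half of the leftover area in each tail. The sketch below uses Python's standard-library `statistics.NormalDist`; the function name `z_multiplier` is an assumption for illustration.

```python
from statistics import NormalDist

def z_multiplier(confidence):
    """Return z* so that the central `confidence` area of N(0, 1) lies within +/- z*."""
    tail = (1 - confidence) / 2          # area left over in each tail
    return NormalDist().inv_cdf(1 - tail)

for level in (0.68, 0.90, 0.95, 0.98, 0.99):
    print(f"{level:.0%}: z* = {z_multiplier(level):.2f}")
```

The computed values agree with the multipliers used in this lecture up to rounding: the 90% value is roughly 1.645, and the 68% value is slightly below 1.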
Three factors affect the width of a confidence interval
Page 382 textbook
1. Level of confidence: as the level of confidence increases, z* increases, so the ME increases.
2. Sample size: as the sample size increases, the ME decreases.
[Scatterplot: vertical axis labeled with p-hat(1 − p-hat), ticks 0.0 to 0.5; horizontal axis p-hat, 0.0 to 1.0]
The scatterplot shows the variation is…
A. largest when p-hat = 1.0
B. largest when p-hat = 0.5
C. largest when p-hat = 0.25
D. smallest when p-hat = 0.9
E. smallest when p-hat = 0.2
Factor 3: Value of p-hat impacts width of C.I.
At a given level of confidence and sample size, the confidence interval is the widest when p-hat equals 0.5 and it becomes narrower as p-hat moves away from 0.5 in either direction.
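The effect of p-hat on the margin of error can be seen numerically. The sketch below assumes a hypothetical sample size of n = 100 and the rounded multiplier z* = 2; the function name `margin_of_error` is illustrative only.

```python
import math

def margin_of_error(p_hat, n, z_star=2.0):
    """ME = z* times the standard error sqrt(p-hat * (1 - p-hat) / n)."""
    return z_star * math.sqrt(p_hat * (1 - p_hat) / n)

n = 100  # hypothetical sample size, chosen only for illustration
for p_hat in (0.1, 0.3, 0.5, 0.7, 0.9):
    print(f"p-hat = {p_hat:.1f}: ME = {margin_of_error(p_hat, n):.4f}")

# With z* = 2, the widest case (p-hat = 0.5) matches the conservative 1/sqrt(n).
conservative = 1 / math.sqrt(n)
print(f"conservative ME = {conservative:.4f}")
```

With z* = 2 and p-hat = 0.5, the formula reduces to 1/√n, which is why the conservative margin of error is never smaller than the one based on the standard error.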
Confidence Intervals: Population Proportion
• Conservative Method (Chapters 1 & 5): $ME = \frac{1}{\sqrt{n}}$
• Normal Approximation (Chapter 10): $ME = z^* \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$
• Exact (Binomial): Need a computer to calculate the interval. Does not include an M.E. When normal conditions aren't met, use this option.
Minitab: provides both options
Pages 389 & 390 in the textbook
Binomial distributions
[Figure panels: n fixed at 10, p increasing; p fixed at 0.02, n increasing]
Values of n and p determine whether the binomial is normal in shape.
What does it mean to be 95% confident?
• Before the sample is drawn: We can say that
P(conf. int. contains the true parameter) = 0.95.
• After the sample is drawn: There is no more
randomness! (Both the CI and the parameter are now
fixed.) So we cannot talk of “probability” any longer.
Interpreting 95% confidence: An example
Suppose we have a sample of 200 students in STAT 100 and find that 28 of them are left-handed.
Our sample proportion is: 28/200 = 0.14
We now find the ME and construct a 95% CI.
Find the standard error: That is, estimate the standard deviation of the sample proportion based on a sample of size 200:
$SE = \sqrt{\frac{0.14(1-0.14)}{200}} \approx 0.025$
Hence, z* times the standard error = 2 × 0.025 = 0.05
On the following two slides, we'll pretend that the true
population proportion is 0.12.
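The arithmetic for this example can be reproduced with a short sketch. It uses the rounded multiplier z* = 2, as the slide does; the function name `proportion_ci` is an assumption for illustration.

```python
import math

def proportion_ci(successes, n, z_star=2.0):
    """Return p-hat, standard error, margin of error, and the interval (low, high)."""
    p_hat = successes / n
    se = math.sqrt(p_hat * (1 - p_hat) / n)   # estimated sd of p-hat
    me = z_star * se
    return p_hat, se, me, (p_hat - me, p_hat + me)

p_hat, se, me, (low, high) = proportion_ci(28, 200)
print(f"p-hat = {p_hat:.2f}, SE = {se:.4f}, ME = {me:.4f}")
print(f"approximate 95% CI: ({low:.3f}, {high:.3f})")
```

Without rounding the standard error to 0.025, the margin of error comes out slightly under 0.05, giving an interval of roughly 0.09 to 0.19. That interval contains the pretend true proportion of 0.12.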
Normal curve of sample proportions, based on sample size 200
[Figure: normal curve over sample percents, horizontal axis from 0.08 to 0.18]
The green curve is the true distribution of p-hat.
Of course, ordinarily we don't know where it lies, but at least we know its approximate standard deviation.
Thus, we can build a confidence interval around our 14% estimate (in red).
If we take another sample, the red line will move but the green curve will not!
30 confidence intervals, based on sample size 200
[Figure: 30 confidence intervals plotted over sample percents, horizontal axis from 0.06 to 0.18]
If we repeat the sampling over and over, 95% of our confidence intervals will contain the true proportion of 0.12.
This is why we use the term "95% confidence interval".
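The relative-frequency idea can be illustrated with a small simulation. This sketch repeatedly draws samples of size 200 from a population whose true proportion is pretended to be 0.12, builds an interval with z* = 2 each time, and records how often the interval covers 0.12. The function name `coverage`, the number of repetitions, and the seed are assumptions chosen for illustration.

```python
import math
import random

def coverage(p_true, n, reps, z_star=2.0, seed=0):
    """Fraction of simulated intervals p-hat +/- z* * SE that contain p_true."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        # Draw one sample of size n and count the successes.
        successes = sum(rng.random() < p_true for _ in range(n))
        p_hat = successes / n
        me = z_star * math.sqrt(p_hat * (1 - p_hat) / n)
        if p_hat - me <= p_true <= p_hat + me:
            hits += 1
    return hits / reps

rate = coverage(p_true=0.12, n=200, reps=2000)
print(f"Fraction of intervals containing 0.12: {rate:.3f}")
```

The fraction should land near 0.95, though not exactly on it, because the normal approximation is only approximate and the simulation itself has sampling variability.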
Definition of "95% confidence interval for
the true population proportion":
An interval of values computed from a
sample that will cover the true but
unknown population proportion for 95% of
the possible samples.
To find a 95% CI:
• The center is at p-hat.
• The margin of error is 2 times the S.E., where…
• …the S.E. is the square root of [p-hat(1-p-hat)/n].
What does it mean to be 95% confident?
A. There is a 95% probability that the one interval that
I calculated contains the true value for the
parameter.
B. If I get 100 such intervals, about 95 of them will
contain the true value for the parameter.
C. The sample estimate has a 95% chance of being
inside the calculated interval.
D. The p-value has a 95% chance of being inside the
interval.
If you understand today’s lecture…
9.25, 9.33, 9.35, 9.37, 10.1, 10.3, 10.7, 10.9,
10.11, 10.13, 10.15, 10.19, 10.21, 10.23,
10.25, 10.27, 10.33, 10.45