Confidence intervals
r
• What is r?
• What does it tell us about the relationship between two
variables?
Coincidence?
• Calculate the day of the year on which your
birthday falls.
• Draw a number
• Record numbers in spreadsheet on board.
• We’ll run regression to determine relationship
between your day and the random number
you drew.
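The exercise can be imitated with simulated data. This is only a sketch: the class size of 25, the 1 to 366 range for the drawn numbers, and the random seed are all assumptions made for illustration, not values from the class.

```python
import random
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient r of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical class: 25 students, each with a birthday (day of year)
# and an independently drawn random number.
rng = random.Random(1)
birthday_day = [rng.randint(1, 366) for _ in range(25)]
drawn_number = [rng.randint(1, 366) for _ in range(25)]

r = pearson_r(birthday_day, drawn_number)
print(f"r = {r:.3f}")
```

Because the draw does not depend on the birthday, r should land near 0, though with only 25 points it will rarely be exactly 0.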
Vietnam Draft
• A similar exercise was done with the Vietnam
war draft lottery. The correlation was found
to be r = -.226. What does this mean?
• Statisticians found that the probability of a
correlation this strong happening by chance in a
fair lottery was less than .001. So there was
strong evidence that the lottery was unfair.
Introduction to Inference
Estimating with Confidence
Statistical Inference
• Statistical inference provides methods for
drawing conclusions about a population from
sample data.
• If we believe something to be true and we
design an experiment to test it, how likely is
our experiment to give us true results?
Statistical Inference
• Confidence intervals: estimate values
• Tests of significance: assess the evidence for a
claim about a population
• In this chapter we use oversimplified examples
to understand the reasoning.
Estimating with Confidence
• In 2000, 1,260,278 college-bound seniors took
the SAT. The mean math score was 514 with a
s.d. of 113. (Verbal: mean 505, s.d. 111)
• In California about 49% of students take the
SAT – many take the ACT and others are not
college-bound.
• How could we estimate the SAT score of all
California seniors?
California Confidence
• Suppose you arrange to give the test to an SRS
of 500 California seniors. The mean of the
sample is 461.
• How would this sample vary if we took many
samples of seniors from the same population?
Central Limit Theorem
• CLT tells us that xbar has a distribution that is
close to normal.
• The mean of the sampling distribution is the
same as the mean of the entire population.
• The standard deviation of xbar would be σ/√500.
Going on
• Somehow we know that the true standard
deviation is 100. (Next chapter we’ll deal with
not knowing the true standard deviation.)
Our standard deviation of xbar is then 100/√500 ≈ 4.5.
(VERIFY)
• If we collect another sample we would expect
a different mean.
Statistical Confidence
As we collect different samples and calculate different means, 95% of the time we would expect the mean to fall within two standard deviations of the true but unknown mean.
What would the range of values be if we go with our sample data with xbar = 461?
Confidence Intervals
• We say of the interval (452, 470) that the method
used to build it captures the true mean of the
population in 95% of all samples.
• Our margin of error is ±9.
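The arithmetic behind the interval can be checked with a short sketch. It uses only the numbers from the slides (σ = 100, n = 500, xbar = 461) and the "two standard deviations" rule for 95%; the variable names are just for illustration.

```python
from math import sqrt

sigma = 100   # population standard deviation, assumed known
n = 500       # sample size
xbar = 461    # observed sample mean

se = sigma / sqrt(n)    # standard deviation of xbar
margin = 2 * se         # "two standard deviations" rule for 95%
low, high = xbar - margin, xbar + margin

print(f"sd of xbar = {se:.2f}")
print(f"margin     = {margin:.1f}")
print(f"interval   = ({low:.0f}, {high:.0f})")
```

The standard deviation of xbar comes out a little under 4.5, so the margin is a little under 9; rounded to whole points the interval matches (452, 470).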
Confidence Intervals
A level C confidence interval for a parameter
has two parts.
• An interval calculated from the data, usually
of the form estimate ± margin of error
• A confidence level C, which gives the
probability that the interval will capture the
true parameter value in repeated samples.
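What "in repeated samples" means can be seen in a small simulation. This is a sketch under stated assumptions: the true mean of 450 is hypothetical (in practice it is unknown), and each sample mean is drawn directly from its normal sampling distribution rather than by simulating 500 individual scores.

```python
import random
from math import sqrt

def coverage(mu, sigma, n, z_star, trials, seed=0):
    """Fraction of simulated intervals xbar ± z*·sigma/√n that contain mu."""
    rng = random.Random(seed)
    se = sigma / sqrt(n)
    hits = 0
    for _ in range(trials):
        xbar = rng.gauss(mu, se)  # one sample mean from its sampling distribution
        if xbar - z_star * se <= mu <= xbar + z_star * se:
            hits += 1
    return hits / trials

# mu=450 is a made-up "truth" used only to score the intervals.
rate = coverage(mu=450, sigma=100, n=500, z_star=1.96, trials=10_000)
print(f"captured the true mean in {rate:.1%} of samples")
```

Any single interval either contains the true mean or it does not; the confidence level describes how often the method succeeds over many samples.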
Conditions for constructing a
confidence interval for a mean
• Data come from an SRS from the population
of interest.
• The sampling distribution of xbar is
approximately normal.
– If the population distribution is normal, then the
sampling distribution of xbar will be normal.
– From the CLT, the sampling distribution is
approximately normal if the sample size is large
enough. Unless the distribution is strongly skewed,
n≥15 is usually adequate.
• One-sample confidence interval on µ (σ known):
estimate ± margin of error, or
xbar ± z* · σ/√n
• CONDITIONS
The sample must be reasonably random.
The sampling distribution of xbar is approximately normal.
• z* or z critical value is the number of standard deviations
on either side of the mean necessary to have the given
confidence level. Can be found on the last two lines of
Table C.
Inference Toolbox
To construct a confidence interval:
• Step 1: Identify the population of interest and the
parameter you want to draw conclusions about.
• Step 2: Choose the appropriate inference procedure.
Verify the conditions for using the selected procedure.
• Step 3: If the conditions are met, carry out the
inference procedure.
• Step 4: Interpret your results in the context of the
problem.
z critical value
• Aka z*
• The number of standard deviations we must
go out on either side of the mean to catch the
central probability.
• Comes from the standard normal table.
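Instead of a table, z* can be computed. A minimal sketch using Python's standard library; the function name `z_star` is made up for this example.

```python
from statistics import NormalDist

def z_star(confidence):
    """Critical value z* that leaves `confidence` in the middle of N(0, 1)."""
    tail = (1 - confidence) / 2          # area in each tail
    return NormalDist().inv_cdf(1 - tail)

for c in (0.90, 0.95, 0.99):
    print(f"C = {c:.0%}: z* = {z_star(c):.3f}")
# C = 90%: z* = 1.645
# C = 95%: z* = 1.960
# C = 99%: z* = 2.576
```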
The t table
Most of the mysteries of the t-table will be revealed in
Chapter 11. For now, go to the t-table in either your
formula chart or inside back cover of the book.
Look at the last row.
Pick your confidence level at the bottom, and go up
one row to find z*.
Video Screen Tension p. 546
• A manufacturer of high-resolution video
terminals must control the tension on the
mesh of fine wires that lies behind the surface
of the viewing screen.
• Here are the tension readings from an SRS of
20 screens from a single day’s production.
269.5  297.0  269.6  283.3  304.8
280.4  233.5  257.4  317.5  327.4
264.7  307.7  310.0  343.3  328.1
342.6  338.8  340.1  374.6  336.1
Find xbar.
Construct a 90% confidence interval for the mean tension µ
of all the screens produced on this day.
Step 1: Identify the population of interest and the parameter
you want to draw conclusions about. The population of
interest is all of the video terminals produced on the day in
question. We want to estimate µ, the mean tension for all of
these screens.
Step 2: Choose the appropriate inference procedure.
Verify the conditions for using the selected procedure.
(Only one inference procedure so far – confidence interval
with known standard deviation.) We were given that
the sample was an SRS. Let’s look at a stem plot and
a normal probability plot to see if there is any reason
to doubt the normality of the sampling distribution.
(20 might be large…)
Stemplot
Normal Probability plot
Step 3: If the conditions are met, carry out the
inference procedures.
xbar ± z* · σ/√n
Step 4: Interpret your results in the context of the problem.
We are 90% confident that the true mean tension in the
entire batch of video terminals produced that day is
between 290.4 and 322.1 mV.
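Step 3 can be sketched in code. One value here is an assumption: the known standard deviation does not appear in this transcript, so σ = 43 mV is used because it is consistent with the interval reported in Step 4. Treat it as a placeholder for the value given in the problem.

```python
from math import sqrt
from statistics import NormalDist, mean

# Tension readings (mV) from the SRS of 20 screens.
tension = [
    269.5, 280.4, 264.7, 342.6, 297.0, 233.5, 307.7, 338.8, 269.6, 257.4,
    310.0, 340.1, 283.3, 317.5, 343.3, 374.6, 304.8, 327.4, 328.1, 336.1,
]

sigma = 43.0  # ASSUMPTION: not stated in the transcript
z = NormalDist().inv_cdf(0.95)  # z* for 90% confidence (5% in each tail)

xbar = mean(tension)
margin = z * sigma / sqrt(len(tension))
low, high = xbar - margin, xbar + margin

print(f"xbar   = {xbar:.2f}")
print(f"90% CI = ({low:.1f}, {high:.1f})")
```

The result should land within about a tenth of a millivolt of the interval quoted above; small differences come from rounding and from the assumed σ.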
Movie Theaters
• A survey of 81 movie theaters showed that
the average length of a feature film was 98
minutes. Past studies indicate that σ = 12
minutes. Determine a 95% confidence
interval for estimating the mean length of all
feature films. Interpret the interval in the
context of the problem.
What your calculator can do for you!
• A survey of 81 movie theaters showed that
the average length of a feature film was 98
minutes. Past studies indicate that σ = 12
minutes. Determine a 95% confidence
interval for estimating the mean length of all
feature films. Interpret the interval in the
context of the problem.
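The calculation a calculator's z-interval routine performs can also be sketched in Python. The helper name `z_interval` is made up for this example; the inputs are the ones given in the problem.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(xbar, sigma, n, confidence):
    """Confidence interval xbar ± z*·sigma/√n for a mean with sigma known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / sqrt(n)
    return xbar - margin, xbar + margin

low, high = z_interval(xbar=98, sigma=12, n=81, confidence=0.95)
print(f"95% CI: ({low:.1f}, {high:.1f}) minutes")
# 95% CI: (95.4, 100.6) minutes
```

The code only does Step 3. Steps 1, 2 and 4 (population and parameter, conditions, interpretation in context) still have to be written out.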
Tests of significance
• Confidence intervals are used to estimate a
population parameter.
• Tests of significance assess the evidence
provided by data about some claim
concerning a population.
Basketball
• Leah claims that she makes 80% of her basketball
free throws.
• In her next 20 throws she makes only 8. (40%)
• Caleb says “Aha! Someone who really makes 80%
of their free throw shots would very rarely make
only 8 out of 20. I don’t believe your claim!”
• In fact, if Leah’s claim is true, the probability of
her making 8 or fewer of 20 shots is about .0001. The small
probability convinces us that while her claim is
possible, it is very unlikely.
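The .0001 can be checked with the binomial distribution. This sketch assumes the 20 shots are independent, each with an 80% chance of success if Leah's claim is true.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

# Probability of making 8 or fewer of 20 shots if p really is 0.8.
p_value = binom_cdf(8, 20, 0.8)
print(f"P(8 or fewer of 20 | p = 0.8) = {p_value:.4f}")
# P(8 or fewer of 20 | p = 0.8) = 0.0001
```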
Hypotheses
• In order to do an inference test you must have
some question or claim.
– Is Leah’s average 80%?
– Does this drug reduce blood pressure?
– Does this treatment reduce blood pressure?
– Does this product cause whiter teeth?
– Does this cola lose sweetness over time?
– Is the diameter of these ball bearings within the
customer’s tolerance?
Warmup
• Weekly sales of regular ground coffee at a
supermarket have in the recent past varied
according to a normal distribution with mean
µ = 354 units per week and standard deviation
σ = 33 units. The store reduces the price by
5%. Sales in the next three weeks are
405, 378, and 411. Is this good evidence that
average sales are now higher? Write the
hypotheses.
Soda Sweetness
• Diet colas use artificial sweeteners which may
lose their sweetness over time.
• Trained testers sip cola and score the cola on a
sweetness scale.
• Cola is then stored for a month at high
temperature to imitate 4 months storage.
• Testers again rate the colas.
Sweetness losses
• Here are sweetness losses as judged by 10 tasters
• 2.0 0.4 0.7 2.0 -0.4 2.2 -1.3 1.2 1.1 2.3
• (A negative indicates taster thought it gained
sweetness.)
• Mean is 1.02
• That’s not a large loss. If we had another group
try they would have different numbers. Is there
good evidence that the cola lost sweetness?
Hypotheses
• Our test asks
– Does the sample result xbar = 1.02 reflect a real
loss of sweetness?
– OR
– Could we easily get the outcome xbar=1.02 just by
chance?
Null hypothesis
• The null hypothesis says that there is no effect
or no change in the population. Status quo.
• ALWAYS stated in terms of a parameter.
• Generally written as Ho and referred to as
H-nought.
• For our cola problem:
– Ho: µ = 0
Alternate Hypothesis
• However, we suspect that cola does lose its
sweetness. So our alternate hypothesis is that
the difference will be positive:
– HA: µ > 0
Alternate hypothesis may be one-sided or two-sided.
Steps in Inference Testing
• First step is always to define your population
and parameter of interest. (This should sound
familiar. Yes, you have to do it every time.)
• Second step is to state your hypotheses
Practice:
• A car dealer advertises that its new
subcompact model gets 47 mpg. You assume
the dealer will not underrate the car, but you
are suspicious about the claim.
• Complete the first two steps of inference
procedure.
Questions on hw?
Back to cola example
• 10 tasters; xbar=1.02; standard deviation for
individual tasters is known to be 1.
• Ho: µ = 0
• HA: µ > 0
• So, the sampling distribution of xbar from 10 tasters is
then normal with mean µ = 0 (if there is no change in
sweetness) and standard deviation 1/√10.
Sample Distribution
• So the value xbar=1.02 seems unlikely. The
probability of this happening is about
6/10000. An outcome this unlikely convinces
us that the true mean is really more than 0.
• This probability is called the p-value. A large
p-value fails to give evidence against Ho. A small p-value
means a result like ours would be unlikely if Ho were true.
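The "about 6/10000" can be reproduced from the ten sweetness losses. A minimal sketch, using the known standard deviation of 1 and the one-sided alternative µ > 0 from the slides:

```python
from math import sqrt
from statistics import NormalDist

losses = [2.0, 0.4, 0.7, 2.0, -0.4, 2.2, -1.3, 1.2, 1.1, 2.3]
sigma = 1.0   # standard deviation for individual tasters, assumed known
mu0 = 0.0     # value of the mean under Ho

xbar = sum(losses) / len(losses)
z = (xbar - mu0) / (sigma / sqrt(len(losses)))   # test statistic
p_value = 1 - NormalDist().cdf(z)                # area to the right of z

print(f"xbar = {xbar:.2f}, z = {z:.2f}, p-value = {p_value:.4f}")
```

The p-value comes out close to 0.0006, which is the 6 in 10,000 quoted above.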
Outline of a test
• State population, parameter and hypotheses.
• Choose appropriate procedure. Verify
conditions.
• If conditions are met, calculate the test statistic and
find the p-value: the probability, if Ho is true, of a
statistic at least as extreme as yours occurring by chance.
• Interpret your results in context.
Back to cola
• State the hypotheses:
• Find the value of the test statistic (what is it?)
• Sketch the normal curve of the test statistic
when Ho is true. Why is the sampling
distribution normal?
• Find the p-value:
• Is the result significant at the α = 0.05 level?
Interpreting results
• The final step of an inference test is to
interpret the results in the context of the
problem. It should consist of 2 sentences.
• “Because p is _____________ we reject/fail to
reject the null hypothesis. There is/is not
evidence that …….”
2nd Period
• How can you tell from a problem when we use
a confidence interval and when a test?
Steps for inference testing
• See Inference Toolbox on page 571
Single or two-sided test
• If our alternate hypothesis is < or >, we have a
one-sided test. The p-value is the area under the normal
curve to the left or right of the test statistic.
• If our alternate hypothesis is ≠, we have a two-sided
test. The p-value is twice the area in the tail beyond
the test statistic.
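The two rules can be put side by side in a short sketch. The function name, the labels "greater", "less" and "two-sided", and the test statistic z = 2.0 are all made up for illustration.

```python
from statistics import NormalDist

def p_value(z, alternative):
    """p-value of a z statistic; alternative is 'greater', 'less' or 'two-sided'."""
    std_normal = NormalDist()
    if alternative == "greater":      # Ha: parameter > null value
        return 1 - std_normal.cdf(z)
    if alternative == "less":         # Ha: parameter < null value
        return std_normal.cdf(z)
    if alternative == "two-sided":    # Ha: parameter ≠ null value
        return 2 * (1 - std_normal.cdf(abs(z)))
    raise ValueError(f"unknown alternative: {alternative!r}")

z = 2.0  # illustrative test statistic
for alt in ("greater", "less", "two-sided"):
    print(f"{alt:>9}: p = {p_value(z, alt):.4f}")
```

For the same z, the two-sided p-value is exactly double the one-sided p-value in the direction of the statistic.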
10.38: Pressing Pills
Pills hardness values:
Enter into calculator
The target value for the hardness is µ = 11.5.
The standard deviation is known to be σ = 0.2. Is
there significant evidence at the 5% level that
the mean hardness of the tablets is different
from the target value? Use the Inference
toolbox.
Fixed significance level
• Sometimes our problem specifies an α with
which to compare p. If p is less than α then
the statistic is significant at level α.
• 10.27, 10.33, 10.28, 10.34, 10.38
• How do we know when to use a confidence
interval and when to use a significance test?
• What does 95% confident mean?
• If we wanted to establish a 95% confidence
interval with a margin of error of 8 for the
problem in 10.27, 10.33, what size sample
would we need?
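Solving margin = z* · σ/√n for n gives n ≥ (z* · σ / margin)², rounded up. The sketch below shows the arithmetic only: the data for problems 10.27 and 10.33 are not in this transcript, so σ = 100 is a hypothetical stand-in.

```python
from math import ceil
from statistics import NormalDist

def sample_size(sigma, margin, confidence):
    """Smallest n for which z*·sigma/√n is at most `margin`."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil((z * sigma / margin) ** 2)

# sigma=100 is HYPOTHETICAL; substitute the value from the actual problem.
n = sample_size(sigma=100, margin=8, confidence=0.95)
print(f"need n >= {n}")
```

Always round up: rounding down would leave the margin of error slightly larger than requested.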
Practice now
• 10.55,54 (p. 585)
• HW: 10.51, 52
Which is worse – to condemn an
innocent person or to let a guilty
person go free?
Errors
• Since our testing procedure calls for us to
reject a null hypothesis based on a probability,
there will be times we will make a mistake!
• We classify our errors into two types –
imaginatively named Type I and Type II
Juries
                        The reality:
                        Innocent    Guilty
Jury finds: Guilty
Jury finds: Innocent
Errors
• If we reject Ho when it is in fact true we
commit a Type I error.
• If we fail to reject Ho when it is in fact false we
commit a Type II error.
Potato Chips
• When a batch of potato chips is produced, a
sample is tested to see if it meets
standards. If it does not, the distributor
refuses to take the chips. (Acceptance
testing.)
• Ho: the batch of potato chips meets standards
• Ha: the potato chips do not meet standards.
• What are Type I and Type II errors and their
consequences?
Probabilities
• What is the probability of committing a Type I
error?
• What is the probability of committing a Type II
error?
Type II error
• Type II error can only be calculated for a
specific value of the parameter.
• For instance, in our potato chips, salt content
is supposed to be 2 mg. σ = .1 mg.
• Assume α = .05. Let us also assume that the
true mean of the sodium values is 2.05.
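With a specific alternative fixed, the Type II error probability can be calculated. Two things in this sketch are assumptions because the slides do not state them: the sample size (n = 25 is hypothetical) and the direction of the test (upper-tailed, Ha: µ > 2).

```python
from math import sqrt
from statistics import NormalDist

def type_ii_error(mu0, mu_true, sigma, n, alpha):
    """P(fail to reject Ho: mu = mu0) in an upper-tailed z test when the mean is mu_true."""
    se = sigma / sqrt(n)
    # Reject Ho when xbar is above this cutoff.
    cutoff = mu0 + NormalDist().inv_cdf(1 - alpha) * se
    # Type II error: xbar falls below the cutoff even though the mean is mu_true.
    return NormalDist(mu_true, se).cdf(cutoff)

# n=25 is HYPOTHETICAL; the other values come from the slide.
beta = type_ii_error(mu0=2.0, mu_true=2.05, sigma=0.1, n=25, alpha=0.05)
power = 1 - beta
print(f"P(Type II error) = {beta:.3f}, power = {power:.3f}")
```

Under these assumptions the Type II error probability is roughly 0.2, so the test would detect a true mean of 2.05 about 80% of the time.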
Pictures:
Power
What is it?
[Figure: regions labeled "fail to reject Ho" and "reject Ho"; areas labeled "type II error" and "power"]
What if we increased alpha?
[Figure: same regions and areas] ... and the power increases!
What if we increased sample size?
Well, the standard deviation would get smaller.
[Figure: same regions and areas]
What happened to power? To a type II error?
What if the alternative (truth?) were further from Ho?
[Figure: same regions and areas] Type II error is near 0; power is near 1.
So it’s a balancing act.
What is most important in the
context of the problem?
Which error would be
most costly or dangerous?
To raise the power of a test
(that is, to increase the probability of correctly
rejecting Ho if it is NOT true)
1. increase sample size
2. decrease variation
3. move alternative further from Ho
4. increase alpha (prob. of type I error)
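The four levers can be compared numerically. This is a sketch, not a general power calculator: it assumes an upper-tailed z test, and the baseline (n = 25) and every changed value are hypothetical numbers chosen to show the direction of each effect.

```python
from math import sqrt
from statistics import NormalDist

def power(mu0, mu_true, sigma, n, alpha):
    """P(reject Ho: mu = mu0) in an upper-tailed z test when the mean is mu_true."""
    se = sigma / sqrt(n)
    cutoff = mu0 + NormalDist().inv_cdf(1 - alpha) * se
    return 1 - NormalDist(mu_true, se).cdf(cutoff)

# HYPOTHETICAL baseline, loosely modeled on the potato chip example.
baseline = dict(mu0=2.0, mu_true=2.05, sigma=0.1, n=25, alpha=0.05)
levers = {
    "baseline": {},
    "1. increase sample size (n = 50)": {"n": 50},
    "2. decrease variation (sigma = 0.08)": {"sigma": 0.08},
    "3. alternative further from Ho (2.08)": {"mu_true": 2.08},
    "4. increase alpha (0.10)": {"alpha": 0.10},
}

results = {name: power(**{**baseline, **change}) for name, change in levers.items()}
for name, value in results.items():
    print(f"{name:<40} power = {value:.3f}")
```

Each change raises the power above the baseline, but lever 4 does it by accepting more Type I errors, which is the balancing act described above.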