Download significance tests - Westlake City Schools

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Dempster–Shafer theory wikipedia , lookup

Transcript
AP Statistics Section 11.1 A
Basics of Significance Tests
Confidence intervals are one of the
two most common types of
statistical inference.
Use a confidence interval when
your goal is to
____________________________.
estimate a population parameter
The second type of inference,
called significance tests, has a
different goal:
to assess the evidence provided by data
about some claim concerning a popualtion.
Example 1: I claim that I make 80% of my free throws.
To test my claim, you ask me to shoot 20 free throws. I
make only 8 out of 20. Assume p = .8, and find the
probability of making exactly 8 of the 20 free throws.
Also, find the probability of making 8 or less free
throws.
binomialpdf (20,.8,8)
.000087
binomialcdf (20,.8,8)
.0001
“Aha!” you say. “Someone who makes 80% of his free
throws would almost never make only 8 out of 20. So I
don’t believe your claim.” Your reasoning is based on
asking what would happen if my claim were true and
we repeated the sample of 20 free throws many times.
I would almost never make as few as 8. This outcome is
so unlikely that it gives strong evidence that my claim is
not true.
Significance tests use elaborate
vocabulary but the basic idea is
simple: getting an outcome that
would rarely happen if a claim were
true is strong evidence that the claim
is not true.
A significance test is a formal procedure for
comparing observed data with a hypothesis
whose truth we want to assess. The hypothesis
is a statement about a, population parameter
such as the population mean ___
 or population
p The results of a test are
proportion ___.
expressed in terms of a probability that
measures
_______________________________________.
how well the data and the hypothesis agree
The reasoning behind statistical tests,
like that of confidence intervals, is
based on asking what would happen if
we repeated the sampling or
experiment many times. We will begin
with the unrealistic assumption that
we know  , the population standard
deviation.
Example 2: Vehicle accidents can result in serious injuries to
drivers and passengers. When they do, someone usually calls
911. In the case of life-threatening injuries, victims generally
need medical attention within 8 minutes of the crash. Several
cities have begun to monitor paramedic response times. In one
such city, the mean response time (RT) to all such accidents
involving life-threatening injuries last year was   6.7 minutes
with   2 minutes. The city manager shares this information
with emergency personnel and encourages them to “do better”
next year. At the end of the following year, the city manager
selects a simple random sample of 400 calls involving lifethreatening injuries and examines the response times. For this
sample, the mean response time was x  6.48 minutes. Do these
data provide good evidence that response times have decreased
since last year?
Remember, sample results may
vary! Maybe the mean RT for the
SRS is simply a result of
____________________.
sampling variability
We want to use the same reasoning here as we did in
the previous example. We make a claim and
ask if the data give evidence *____________*
against it.
We would like to conclude that the mean
RT ____________,
decreased so the claim we test is that
RTs _____________________.
have not decreased
If we assume the RTs for calls involving lifethreatening injury have not decreased, the mean RT
for the population of all such calls would still be
__________
  6.7 (assume ________
  2 too).
Consider the sampling distribution of from 400 calls:
Shape: approx. Normal - CLT
Mean:  x    6.7
Standard deviation:
x  
 2
n
400
as long as N  10n
Find the probability of
x  6.48 minutes.
normalcdf (1000,6.48,6.7,.1)  .014
An observed value this small would
rarely occur by chance if the true
 were 6.7 minutes. This
observed value is good evidence
that the true  is, in fact, less than 6.7
minutes. Thus we can
conclude the average response
time decreased this year.
In example 2, we asked whether the accident RT
data are likely if, in fact, there is no decrease in
paramedics’ RTs. Because the reasoning of
significance tests looks for evidence against a
claim, we start with the claim we seek evidence
against, such as “no decrease in response time.”
This claim is our _________________(
null hypothesis ____
H 0 ).
This is the statement being tested in a
significance test.
The significance test is designed to
assess the strength of the evidence
against the null hypothesis. Usually the
null hypothesis is a statement of “no
change”, or “no difference” from
historical values. The null hypothesis
can be thought of as the “status quo”
hypothesis.
The claim about the population that
we are trying to find evidence for is
the alternative hypothesis ( ____
H a ).
In example 2, the null hypothesis says “no
decrease” in the mean RT of 6.7 min.”:
H0:________
  6.7 while the alternative
hypothesis says “there is a decrease in the
mean RT of 6.7 min.”: Ha: ________
  6.7
where  is the mean response time to all
calls involving life-threatening injuries in
the city this year.
In this instance the alternative
hypothesis is one-sided because
we are interested only in
deviations from the null hypothesis
in one direction.
Hypotheses always refer to some
population, not to a particular
outcome. Be sure to state
H 0 and H a in terms of a population
parameter.
Example 3: Does the job satisfaction of assembly
workers differ when their work is machine-paced rather
than self-paced? One study chose 18 subjects at
random from a group of people who assembled
electronic devices. Half of the subjects were assigned at
random to each of two groups. Both groups did similar
assembly work, but one work setup allowed workers to
pace themselves, and the other featured an assembly
line that moved at fixed time intervals so that the
workers were paced by the machine. After two weeks,
all subjects took the Job Diagnosis Survey (JDS), a test
of job satisfaction. Then they switched work setups and
took the JDS again after two more weeks.
This is a
_________________design
matched - pairs
experiment. The response variable
is the
__________________________,
difference in JDS scores
self-paced minus machine-paced.
The parameter of interest is the mean  of
the differences in JDS scores in the
population of all
assembly workers. The null hypothesis
says that there (is a / is no) difference in
the scores:
H0 :   0
The authors of the study simply wanted to know
if the two work conditions have different levels
of job satisfaction. They did not specify the
direction of difference. The alternative
hypothesis is therefore two-sided; that is either
_______
  0 or _______.
  0 For simplicity, we write
this as H a : _______.
0
The alternative hypothesis should express
the hopes or suspicions we have before
we see the data. It is cheating to first look
at the data and then frame the alternative
hypothesis to fit what the data show. If
you do not have a specific direction firmly
in mind in advance, use a two-sided
alternative.