Chapter 10
Hypothesis Testing
Using a Single Sample
Section 10.1
Hypotheses & Test
Procedures
Hypothesis
In statistics, a hypothesis is a
statement about a population
characteristic.
Sharing prescription drugs with others can be dangerous. Is this a common occurrence among teens?
OR
The National Association of Colleges and Employers stated that the average starting salary for students graduating with a bachelor’s degree in 2010 is $48,351. Is this true for your college?
How do we answer questions like these using sample data?
In Chapter 9, we used sample data to estimate the value of an unknown population characteristic.
In this chapter, we will use sample data to test some claim or hypothesis about the population characteristic to see if it is plausible.
To do this, we use a test of hypotheses or
test procedure.
What is a test of hypotheses?
A test of hypotheses is a method that uses sample data to decide between two competing claims (hypotheses) about the population characteristic.
Is the value of the sample statistic . . .
– a random occurrence due to natural variation? (Is it one of the values of the sample statistic that are likely to occur?)
OR
– a value that would be considered surprising? (Is it one that isn’t likely to occur?)
FORMAL STRUCTURE
Hypothesis tests are based on a
reductio ad absurdum form of
argument.
Specifically, we make an assumption
and then attempt to show that
assumption leads to an absurdity or
contradiction, hence the assumption is
wrong.
Hypothesis statements:
The null hypothesis, denoted by H0, is a claim about a population characteristic that is initially assumed to be true.
You are usually trying to determine if this claim is believable.
The alternative hypothesis, denoted by Ha, is the competing claim.
To determine what the alternative hypothesis should be, you need to keep the research objectives in mind.
The hypothesis statements are ALWAYS about the population – NEVER about a sample!
Formal Structure
Rejection of the null hypothesis will
imply the acceptance of the alternative
hypothesis.
Assume H0 is true and attempt to show
this leads to an absurdity, hence H0 is
false and Ha is true.
Formal Structure
Typically one assumes the null hypothesis to
be true and then one of the following
conclusions is drawn.
1. Reject H0
Equivalent to saying that Ha is correct or true
2. Fail to reject H0
Equivalent to saying that we have failed to show
a statistically significant deviation from the claim
of the null hypothesis
This is not the same as saying that the null
hypothesis is true.
Let’s consider a murder trial . . .
What is the null hypothesis?
H0: the defendant is innocent
This is what you assume is true before you begin.
What is the alternative hypothesis?
Ha: the defendant is guilty
You are trying to determine if the evidence supports this claim.
To determine which hypothesis is correct, the jury will listen to the evidence. Only if there is “evidence beyond a reasonable doubt” would the null hypothesis be rejected in favor of the alternative hypothesis.
If there is not convincing evidence, then we would “fail to reject” the null hypothesis.
So we will make one of two decisions:
• Reject the null hypothesis
• Fail to reject the null hypothesis
Remember that the actual verdict that is returned is “GUILTY” or “NOT GUILTY”. We never end up determining the null hypothesis is true – only that there is not enough evidence to say it’s not true.
The Form of Hypotheses:
Null hypothesis
H0: population characteristic = hypothesized value
The null hypothesis always includes the equal case.
This hypothesized value is a specific number determined by the context of the problem.
Alternative hypothesis
Ha: population characteristic > hypothesized value
Ha: population characteristic < hypothesized value
These are considered one-tailed tests because you are only interested in one direction.
Ha: population characteristic ≠ hypothesized value
This one is considered a two-tailed test because you are interested in both directions.
This sign is determined by the context of the problem.
Notice that the alternative hypothesis uses the same population characteristic and the same hypothesized value as the null hypothesis.
Let’s practice writing hypothesis statements.
Sharing prescription drugs with others can be
dangerous. A survey of a representative sample of
592 U.S. teens ages 12 to 17 reported that 118 of
those surveyed admitted to having shared a
prescription drug with a friend. Is this sufficient
evidence that more than 10% of teens have shared
prescription medication with friends?
What is the population characteristic of interest?
The true proportion p of teens who have shared prescription medication with friends
What is the hypothesized value?
What words indicate the direction of the alternative hypothesis?
State the hypotheses:
H0: p = .1
Ha: p > .1
Compact fluorescent (cfl) lightbulbs are much
more energy efficient than regular
incandescent lightbulbs. Ecobulb brand
60-watt cfl lightbulbs state on the package
“Average life 8000 hours”. People who purchase
this brand would be unhappy if the bulbs lasted
less than 8000 hours. A sample of these bulbs will
be selected and tested.
What is the population characteristic of interest?
The true mean (μ) life of the cfl lightbulbs
What is the hypothesized value?
What words indicate the direction of the alternative hypothesis?
State the hypotheses:
H0: μ = 8000
Ha: μ < 8000
Because of variation in the manufacturing process, tennis balls produced by a particular machine do not have the same diameters. Suppose the machine was initially calibrated to achieve the specification of μ = 3 inches. However, the manager is now concerned that the diameters no longer conform to this specification. If the mean diameter is not 3 inches, production will have to be halted.
What is the population characteristic of interest?
The true mean μ diameter of tennis balls
What words indicate the direction of the alternative hypothesis?
State the hypotheses:
H0: μ = 3
Ha: μ ≠ 3
Peaches Example
Do the “16 ounce” cans of peaches
canned and sold by DelMonte meet
the claim on the label (on the
average)?
Notice, the real concern
would be selling the
consumer less than 16
ounces of peaches.
H0: µ = 16
Ha: µ < 16
Light Bulb Example
Do two brands of light bulbs have the
same mean lifetime?
H0: µBrand A = µBrand B
Ha: µBrand A ≠ µBrand B
Milling Example
Do parts produced by two different milling
machines have the same variability in
diameters?
H0: σ1 = σ2
Ha: σ1 ≠ σ2
or equivalently
H0: σ1² = σ2²
Ha: σ1² ≠ σ2²
Caution
When you set up a hypothesis test, the
result is either
1. Strong support for the alternative hypothesis (if
the null hypothesis is rejected), or
2. Not sufficient evidence to refute the
claim of the null hypothesis (you are stuck with
it, because there is a lack of strong evidence
against the null hypothesis).
For each pair of hypotheses, indicate which are not legitimate and explain why
a) H0: μ = 15 Ha: μ ≥ 15
Must be only greater than!
b) Must use a population characteristic - x̄ is a statistic (sample)
c) H0: p = 0.1 Ha: p ≠ 0.1
d) H0: μ = 2.3 Ha: μ > 3.2
Must use same number as in H0!
e) H0: p ≠ 0.5 Ha: p = 0.5
H0 MUST be “=”!
Section 10.2
Errors in Hypothesis
Testing
When you perform a hypothesis test you make a decision:
reject H0 or fail to reject H0
When you make one of these decisions, there is a possibility that you could be wrong! That you made an error!
Each could possibly be a wrong decision; therefore, there are two types of errors.
Type I error
• The error of rejecting H0 when H0 is true
• The probability of a Type I error is denoted by α. (This is the lower-case Greek letter “alpha”.)
α is called the significance level of the test.
Thus, a test with α = 0.01 is said to have a level of significance of 0.01 or to be a level 0.01 test.
Type II error
• The error of failing to reject H0 when H0 is false
• The probability of a Type II error is denoted by β. (This is the lower-case Greek letter “beta”.)
Here is another way to look at the types of errors:
Suppose H0 is true and we fail to reject it, what type of decision was made?
Suppose H0 is true and we reject it, what type of decision was made?
Suppose H0 is false and we reject it, what type of decision was made?
Suppose H0 is false and we fail to reject it, what type of decision was made?

                     H0 is true      H0 is false
Reject H0            Type I error    Correct
Fail to reject H0    Correct         Type II error
Error Analogy
Consider a medical test where the hypotheses
are equivalent to
H0: the patient has a specific disease
Ha: the patient doesn’t have the disease
Then,
Type I error is equivalent to a false negative
(i.e., Saying the patient does not have the disease
when in fact, he does.)
Type II error is equivalent to a false positive
(i.e., Saying the patient has the disease when, in
fact, he does not.)
The U.S. Bureau of Transportation Statistics reports
that for 2009 72% of all domestic passenger flights
arrived on time (meaning within 15 minutes of its
scheduled arrival time). Suppose that an airline with
a poor on-time record decides to offer its employees
a bonus if, in an upcoming month, the airline’s
proportion of on-time flights exceeds the overall 2009
industry rate of .72.
State the hypotheses.
H0: p = .72
Ha: p > .72
State a Type I error in context.
Type I error – the airline decides to reward the employees when the proportion of on-time flights doesn’t exceed .72
State a Type II error in context.
Type II error – the airline employees do not receive the bonus when they deserve it.
In 2004, Vertex Pharmaceuticals, a biotechnology company, issued a press release announcing that it had filed an application with the FDA to begin clinical trials on an experimental drug VX-680 that had been found to reduce the growth rate of pancreatic and colon cancer tumors in animal studies.
Data resulting from the planned clinical trials can be used to test:
Let μ = the true mean growth rate of tumors for patients taking the experimental drug
H0: μ = mean growth rate of tumors for patients not taking the experimental drug
Ha: μ < mean growth rate of tumors for patients not taking the experimental drug
State a Type I error in the context of this problem.
A Type I error would be to incorrectly conclude that the experimental drug is effective in slowing the growth rate of tumors.
What is a potential consequence of this error?
A potential consequence of making a Type I error would be that the company would continue to devote resources to the development of the drug when it really is not effective.
In 2004, Vertex Pharmaceuticals, a biotechnology company, issued a press release announcing that it had filed an application with the FDA to begin clinical trials on an experimental drug VX-680 that had been found to reduce the growth rate of pancreatic and colon cancer tumors in animal studies.
Data resulting from the planned clinical trials can be used to test:
H0: μ = mean growth rate of tumors for patients not taking the experimental drug
Ha: μ < mean growth rate of tumors for patients not taking the experimental drug
State a Type II error in the context of this problem.
A Type II error would be to conclude that the drug is ineffective when in fact the mean growth rate of tumors is reduced.
What is a potential consequence of this error?
A potential consequence of making a Type II error would be that the company might abandon development of a drug that was effective.
The relationship between α and β
The ideal test procedure would result in both α = 0 (probability of a Type I error) and β = 0 (probability of a Type II error).
This is impossible to achieve since we must base our decision on sample data.
Selecting a significance level α = 0.05 results in a test procedure that, used over and over with different samples, rejects a true H0 about 5 times in 100.
So why not always choose a small α (like α = .05 or α = .01)?
Standard test procedures allow us to select α, the significance level of the test, but we have no direct control over β.
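The "about 5 times in 100" reading of a significance level of 0.05 can be checked with a small simulation. The sketch below is only an illustration: the function name, the sample size n = 200, the hypothesized value p0 = 0.5, the number of trials, and the seed are all assumed values, not part of the original slides. It repeatedly draws samples from a population where H0 is true and counts how often an upper-tailed z test rejects.

```python
import random
from math import sqrt
from statistics import NormalDist


def type_i_error_rate(p0=0.5, n=200, alpha=0.05, trials=5000, seed=1):
    """Estimate how often an upper-tailed z test rejects H0: p = p0
    when H0 is actually true (all default values are illustrative)."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha)   # cutoff for the upper tail
    se = sqrt(p0 * (1 - p0) / n)               # standard error under H0
    rejections = 0
    for _ in range(trials):
        # Draw one sample of size n from a population where p really is p0.
        successes = sum(rng.random() < p0 for _ in range(n))
        z = (successes / n - p0) / se
        if z > z_crit:
            rejections += 1                    # a Type I error
    return rejections / trials


rate = type_i_error_rate()
print(f"Estimated Type I error rate: {rate:.3f}")
```

The estimate should land near 0.05, though not exactly on it, because the number of successes is discrete and the simulation itself is random.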
Relationships Between α and β
Generally, with everything else held constant,
decreasing one type of error causes the other
to increase.
The only way to decrease both types of error
simultaneously is to increase the sample size.
No matter what decision is reached, there is
always the risk of one of these errors.
The relationship between α and β
Let’s consider the following hypotheses:
H0: p = .5
Ha: p > .5
Let α = .05
Suppose this normal curve represents the sampling distribution for p̂ when the null hypothesis is true.
This is the part of the curve that represents α or the Type I error.
If the null hypothesis is false and the alternative hypothesis is true, then the true proportion is believed to be greater than .5 – so the curve should really be shifted to the right.
This tail would represent β, the probability of failing to reject a false H0.
[Figure: normal curve centered at .5, with a second curve shifted to the right]
The relationship between α and β
Let’s consider the following hypotheses:
H0: p = .5
Ha: p > .5
Let α = .01
If the null hypothesis is false and the alternative hypothesis is true, then the true proportion is believed to be greater than .5 – so the curve should really be shifted to the right.
This tail would represent β, the probability of failing to reject a false H0.
Notice that as α gets smaller, β gets larger!
How does one decide what α level to use?
After assessing the consequences of Type I and Type II errors, identify the largest α that is tolerable for the problem. Then employ a test procedure that uses this maximum acceptable value – rather than anything smaller – as the level of significance.
Remember, using a smaller α increases β.
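To see the trade-off numerically, the sketch below computes the probability of a Type II error for the test of H0: p = .5 versus Ha: p > .5 under the normal approximation. The sample size n = 100 and the "true" proportion 0.6 are assumed values chosen only for illustration, and the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def beta_upper_tailed(p0, p_true, n, alpha):
    """Approximate P(fail to reject H0: p = p0) when the true proportion
    is p_true, for an upper-tailed large-sample z test."""
    std_normal = NormalDist()
    # Smallest sample proportion that would lead to rejecting H0.
    cutoff = p0 + std_normal.inv_cdf(1 - alpha) * sqrt(p0 * (1 - p0) / n)
    # Spread of the sample proportion when the true proportion is p_true.
    se_true = sqrt(p_true * (1 - p_true) / n)
    # Beta is the chance that the sample proportion falls at or below the cutoff.
    return std_normal.cdf((cutoff - p_true) / se_true)


beta_05 = beta_upper_tailed(p0=0.5, p_true=0.6, n=100, alpha=0.05)
beta_01 = beta_upper_tailed(p0=0.5, p_true=0.6, n=100, alpha=0.01)
print(f"alpha = .05 -> beta = {beta_05:.3f}")
print(f"alpha = .01 -> beta = {beta_01:.3f}")
```

With everything else held fixed, the smaller significance level gives the larger probability of a Type II error, which is the pattern the slides describe.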
The EPA has adopted what is known as the Lead and Copper Rule, which defines drinking water as unsafe if the concentration of lead is 15 parts per billion (ppb) or greater or if the concentration of copper is 1.3 parts per million (ppm) or greater.
The manager of a community water system might use lead level measurements from a sample of water specimens to test the following hypotheses:
H0: μ = 15 versus Ha: μ < 15
State a Type I error in context. What is a consequence of a Type I?
A Type I error leads to the conclusion that a water source meets EPA standards when the water is really unsafe.
There are possible health risks to the community.
State a Type II error in context. What is a consequence of a Type II?
A Type II error leads to the conclusion that a water source does NOT meet EPA standards when the water is really safe.
The community might lose a good water source.
Which type of error has a more serious consequence?
Since most people would consider the consequence of the Type I error more serious, we would want to keep α small – so select a smaller significance level of α = .01.
Section 10.3
Large-Sample Hypothesis
Test for a Population
Proportion
Large-Sample Hypothesis Test
for a Population Proportion
The fundamental idea behind hypothesis
testing is:
We reject H0 if the observed sample is very
unlikely to occur if H0 is true.
Test Statistic
A test statistic is the function of
sample data on which a
conclusion to reject or fail to
reject H0 is based.
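Recall the General Properties for Sampling Distributions of p̂
1. $\mu_{\hat{p}} = p$
2. $\sigma_{\hat{p}} = \sqrt{\dfrac{p(1-p)}{n}}$, as long as the sample size is less than 10% of the population
3. The sampling distribution of p̂ is approximately normal when n is large.
These three properties imply that the standardized variable
$z = \dfrac{\hat{p} - p}{\sqrt{\dfrac{p(1-p)}{n}}}$
has an approximately standard normal distribution when n is large.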
P-value
The P-value (also called the observed
significance level) is a measure of
inconsistency between the hypothesized
value for a population characteristic and the
observed sample.
The P-value is the probability, assuming that
H0 is true, of obtaining a test statistic value
at least as inconsistent with H0 as what
actually resulted.
Computing P-values
The calculation of the P-value depends on the form of
the inequality in the alternative hypothesis.
• Ha: p > hypothesized value
z curve
P-value = area in upper tail
Calculated z
Computing P-values
The calculation of the P-value depends on the form of
the inequality in the alternative hypothesis.
• Ha: p < hypothesized value
z curve
P-value = area in lower tail
Calculated z
Computing P-values
The calculation of the P-value depends on the form of
the inequality in the alternative hypothesis.
• Ha: p ≠ hypothesized value
P-value = sum of area in two tails
z curve
Calculated z and –z
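The three cases above can be collected into one helper. This is a minimal sketch using only Python's standard library; the function name and the labels "greater", "less", and "two-sided" are naming choices made here, not terms from the slides.

```python
from statistics import NormalDist


def p_value_from_z(z, alternative):
    """Return the P-value for a calculated z, given the form of Ha.

    alternative: "greater"   for Ha: p > hypothesized value (upper tail)
                 "less"      for Ha: p < hypothesized value (lower tail)
                 "two-sided" for Ha: p != hypothesized value (both tails)
    """
    std_normal = NormalDist()  # the z curve: mean 0, standard deviation 1
    if alternative == "greater":
        return 1 - std_normal.cdf(z)
    if alternative == "less":
        return std_normal.cdf(z)
    if alternative == "two-sided":
        return 2 * (1 - std_normal.cdf(abs(z)))
    raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")


print(round(p_value_from_z(1.39, "greater"), 4))
print(round(p_value_from_z(-2.68, "less"), 4))
print(round(p_value_from_z(-2.38, "two-sided"), 4))
```

The three calls use z values that appear in the worked examples later in this section, so the printed areas can be compared with the table look-ups shown there.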
Using P-values to make a decision:
To decide whether or not to reject H0, we compare the P-value to the significance level α.
If the P-value > α, we “fail to reject” the null hypothesis.
If the P-value ≤ α, we “reject” the null hypothesis.
Summary of the Large-Sample z Test for p
Null hypothesis: H0: p = hypothesized value
Test Statistic: $z = \dfrac{\hat{p} - p}{\sqrt{\dfrac{p(1-p)}{n}}}$, where p is the hypothesized value

Alternative Hypothesis:          P-value:
Ha: p > hypothesized value       Area to the right of calculated z
Ha: p < hypothesized value       Area to the left of calculated z
Ha: p ≠ hypothesized value       2(Area to the right of z) if z is positive
                                 or
                                 2(Area to the left of z) if z is negative
Summary of the Large-Sample z
Test for p Continued . . .
In June 2006, an Associated Press survey was conducted to investigate how people use the nutritional information provided on food packages. Interviews were conducted with 1003 randomly selected adult Americans, and each participant was asked a series of questions, including the following two:
Question 1: When purchasing packaged food, how often do you check the nutritional labeling on the package?
Question 2: How often do you purchase food that is bad for you, even after you’ve checked the nutrition labels?
It was reported that 582 responded “frequently” to the question about checking labels and 441 responded “very often” or “somewhat often” to the question about purchasing bad foods even after checking the labels.
Based on this data, is it reasonable to conclude that a majority of adult Americans frequently check nutritional labels when purchasing packaged foods?
Nutritional Labels Continued . . .
H0: p = .5
Ha: p > .5
p = true proportion of adult Americans who frequently check nutritional labels
We use p > .5 to test for a majority of adult Americans who frequently check nutritional labels.
For this sample: $\hat{p} = \dfrac{582}{1003} = .58$
This observed sample proportion is greater than .5. Is it plausible that a sample proportion of p̂ = .58 occurred as a result of chance variation, or is it unusual to observe a sample proportion this large when p = .5?
We will create a test statistic using:
$z = \dfrac{\hat{p} - p}{\sqrt{\dfrac{p(1-p)}{n}}}$
A test statistic indicates how many standard deviations the sample statistic (p̂) is from the population characteristic (p).
$z = \dfrac{.58 - .5}{\sqrt{\dfrac{.5(.5)}{1003}}} = 5.08$
Nutritional Labels Continued . . .
H0: p = .5
Ha: p > .5
p = true proportion of adult Americans who frequently check nutritional labels
For this sample: $\hat{p} = \dfrac{582}{1003} = .58$
$z = \dfrac{.58 - .5}{\sqrt{\dfrac{.5(.5)}{1003}}} = 5.08$
Next we find the P-value for this test statistic.
The P-value is the probability of obtaining a test statistic at least as inconsistent with H0 as was observed, assuming H0 is true.
In the standard normal curve, seeing a value of 5.08 or larger is unlikely. Its probability is approximately 0.
P-value ≈ 0
Since the P-value is so small, we reject H0. There is convincing evidence to suggest that the majority of adult Americans frequently check the nutritional labels on packaged foods.
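The nutritional-labels calculation can be reproduced in a few lines. The sketch below is one possible way to organize the large-sample z test for a proportion; the function name and its return layout are assumptions made for illustration. The counts 582 and 1003 and the hypothesized value .5 come from the example.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z_test(successes, n, p0):
    """Upper-tailed large-sample z test of H0: p = p0 versus Ha: p > p0."""
    # Sample-size check used in the slides: n*p0 and n*(1 - p0) at least 10.
    if n * p0 < 10 or n * (1 - p0) < 10:
        raise ValueError("sample size too small for the large-sample z test")
    p_hat = successes / n
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)  # hypothesized value in the SE
    p_value = 1 - NormalDist().cdf(z)           # area to the right of z
    return p_hat, z, p_value


p_hat, z, p_value = one_proportion_z_test(successes=582, n=1003, p0=0.5)
print(f"p-hat = {p_hat:.2f}, z = {z:.2f}, P-value = {p_value:.1e}")
```

The standard error uses the hypothesized value .5 rather than the sample proportion, because the test statistic is computed assuming H0 is true.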
A report states that nationwide, 61% of high school
graduates go on to attend a two-year or four-year
college the year after graduation. Suppose a random
sample of 1500 high school graduates in 2009 from a
particular state estimated the proportion of high school
graduates that attend college the year after graduation
to be 58%. Can we reasonably conclude that the
proportion of this state’s high school graduates in
2009 who attended college the year after graduation
is different from the national figure? Use α = .01.
State the hypotheses.
H0: p = .61
Ha: p ≠ .61
Where p is the proportion of all 2009 high school graduates in this state who attended college the year after graduation
College Attendance Continued . . .
H0: p = .61
Ha: p ≠ .61
Where p is the proportion of all 2009 high
school graduates in this state who attended
college the year after graduation
Assumptions:
• Given a random sample of 1500 high school
graduates
• Since 1500(.61) > 10 and 1500(.39) > 10, sample
size is large enough.
• Population size is much larger than the sample
size.
College Attendance Continued . . .
H0: p = .61
Ha: p ≠ .61
Where p is the proportion of all 2009 high school graduates in this state who attended college the year after graduation
Test statistic:
$z = \dfrac{.58 - .61}{\sqrt{\dfrac{.61(.39)}{1500}}} = -2.38$
The area to the left of -2.38 is approximately .0087
P-value = 2(.0087) = .0174
Use α = .01
Since P-value > α, we fail to reject H0. The evidence does not suggest that the proportion of 2009 high school graduates in this state who attended college the year after graduation differs from the national value.
What potential error could you have made? Type II
County Judge Example
A county judge has agreed that he will give
up his county judgeship and run for a state
judgeship unless there is evidence at the
0.10 level that more than 25% of his party is
in opposition. A random sample of 800 party
members included 217 who opposed him.
Please advise this judge.
County Judge Example Continued
p = proportion of his party that is in opposition
H0: p = 0.25
Ha: p > 0.25
α = 0.10
Note: hypothesized value = 0.25
$n = 800,\quad \hat{p} = \dfrac{217}{800} = 0.27125$
$z = \dfrac{0.27125 - 0.25}{\sqrt{\dfrac{0.25(0.75)}{800}}} = 1.39$
County Judge Example Continued
P-value = P(z > 1.39) = 1 - 0.9177 = 0.0823
At a level of significance of 0.10,
there is sufficient evidence to
support the claim that the true
percentage of the party members
that oppose him is more than 25%.
Under these circumstances, I would
advise him not to run.
In December 2009, a county-wide water
conservation campaign was conducted in a
particular county. In 2010, a random sample
of 500 homes was selected and water usage was
recorded for each home in the sample. Suppose the
sample results were that 220 households had
reduced water consumption. The county supervisors
wanted to know if their data supported the claim that
fewer than half the households in the county
reduced water consumption.
State the hypotheses.
H0: p = .5
Ha: p < .5
where p is the proportion of all households in the county with reduced water usage
Calculate p̂.
$\hat{p} = \dfrac{220}{500} = .44$
Water Usage Continued . . .
H0: p = .5
Ha: p < .5
where p is the proportion of all households in
the county with reduced water usage
Verify assumptions
1. p̂ is from a random sample of households
2. Sample size n is large because np = 250 > 10 and
n(1-p) = 250 > 10
3. It is reasonable that there are more than 5000 (10n)
households in the county.
Water Usage Continued . . .
H0: p = .5
Ha: p < .5
where p is the proportion of all households in the county with reduced water usage
Calculate the test statistic and P-value.
$z = \dfrac{.44 - .5}{\sqrt{\dfrac{.5(.5)}{500}}} = -2.68$
Look up this value in the table of z curve areas.
P-value = .0037
Use α = .01
Since P-value < α, we reject H0. There is convincing evidence that the proportion of households with reduced water usage is less than half.
What potential error could you have made? Type I
Water Usage Continued . . .
H0: p = .5
Ha: p < .5
where p is the proportion of all households in the county with reduced water usage
Since P-value < α, we reject H0.
Use α = .01
Let’s create a confidence interval with this data.
What is the appropriate confidence level to use?
Since we are testing Ha: p < .5, α would be in the lower tail. Confidence intervals are two-tailed, so we need to put .01 in the upper tail also (since the curve is symmetrical). With .01 in each tail, that puts .98 in the middle – this is the appropriate confidence level.
Compute a 98% confidence interval:
$.44 \pm 2.326\sqrt{\dfrac{.44(.56)}{500}}$
(.388, .492)
Notice that the hypothesized value of .5 is NOT in the 98% confidence interval and that we “rejected” H0!
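The 98% interval for the water-usage data can be checked with the short sketch below. The function name is hypothetical; the inputs (220 households out of 500, a 98% confidence level) are the ones used in the example.

```python
from math import sqrt
from statistics import NormalDist


def proportion_confidence_interval(successes, n, confidence):
    """Large-sample confidence interval for a population proportion."""
    p_hat = successes / n
    # z critical value that leaves (1 - confidence)/2 in each tail.
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Intervals use the sample proportion in the standard error.
    margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


low, high = proportion_confidence_interval(successes=220, n=500, confidence=0.98)
print(f"98% confidence interval: ({low:.3f}, {high:.3f})")
print("Hypothesized value .5 inside interval:", low <= 0.5 <= high)
```

Unlike the test statistic, the interval uses the sample proportion .44 in the standard error, since no hypothesized value is assumed when estimating.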
College Attendance Revisited . . .
H0: p = .61
Ha: p ≠ .61
Where p is the proportion of all 2009 high school graduates in this state who attended college the year after graduation
Since P-value > α, we fail to reject H0.
Use α = .01
Let’s compute a confidence interval for this problem.
This is a two-tailed test so α gets split evenly into both tails, leaving 99% in the middle.
$.58 \pm 2.576\sqrt{\dfrac{.58(.42)}{1500}}$
(.547, .613)
Notice that the hypothesized value of .61 IS in the 99% confidence interval and that we “failed to reject” H0!
Section 10.4
Hypothesis Tests for a
Population Mean
Let’s review the assumptions for a confidence interval for a population mean
1) x̄ is the sample mean from a random sample,
2) the sample size n is large (n > 30), and
3) σ, the population standard deviation, is known or unknown
The assumptions are the same for a large-sample hypothesis test for a population mean.
This is the test statistic when σ is known:
$z = \dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}}$
P-value is area under the z curve
This is the test statistic when σ is unknown:
$t = \dfrac{\bar{x} - \mu}{s/\sqrt{n}}$
P-value is area under the t curve with df = n-1
The One-Sample t-test for a Population Mean
Null hypothesis: H0: μ = hypothesized value
Test Statistic: $t = \dfrac{\bar{x} - \mu}{s/\sqrt{n}}$, where μ is the hypothesized value

Alternative Hypothesis:          P-value:
Ha: μ > hypothesized value       Area to the right of calculated t with df = n-1
Ha: μ < hypothesized value       Area to the left of calculated t with df = n-1
Ha: μ ≠ hypothesized value       2(Area to the right of t) if t is positive
                                 or
                                 2(Area to the left of t) if t is negative
The One-Sample t-test for a
Population Mean Continued . . .
Assumptions:
1. x̄ and s are the sample mean and sample
standard deviation from a random sample
2. The sample size n is large (n > 30) or the
population distribution is at least approximately
normal.
A study conducted by researchers at Pennsylvania State
University investigated whether time perception, an
indication of a person’s ability to concentrate, is impaired
during nicotine withdrawal. After a 24-hour smoking
abstinence, 20 smokers were asked to estimate how
much time had elapsed during a 45-second period.
Researchers wanted to see whether smoking abstinence
had a negative impact on time perception, causing
elapsed time to be overestimated. Suppose the resulting
data on perceived elapsed time (in seconds) were as
follows:
69 65 72 73 59 55 39 52 67 57
56 50 70 47 56 45 70 64 67 53
What is the mean and standard deviation of the sample?
x̄ = 59.30 s = 9.84 n = 20
Smoking Abstinence Continued . . .
69 65 72 73 59 55 39 52 67 57
56 50 70 47 56 45 70 64 67 53
x̄ = 59.30 s = 9.84 n = 20
State the hypotheses.
H0: μ = 45
Ha: μ > 45
Where μ is the true mean perceived elapsed time for smokers who have abstained from smoking for 24-hours
Verify assumptions.
Assumptions:
1) It is reasonable to believe that the sample of smokers is representative of all smokers.
2) Since the sample size is not at least 30, we must determine if it is plausible that the population distribution is approximately normal. To do this, we need to graph the data using a boxplot or normal probability plot.
Since the boxplot is approximately symmetrical, it is plausible that the population distribution is approximately normal.
[Figure: boxplot of perceived elapsed time, axis from 40 to 70]
Smoking Abstinence Continued . . .
69 65 72 73 59 55 39 52 67 57
56 50 70 47 56 45 70 64 67 53
x̄ = 59.30 s = 9.84 n = 20
H0: μ = 45
Ha: μ > 45
Where μ is the true mean perceived elapsed time for smokers who have abstained from smoking for 24-hours
Compute the test statistic and P-value.
Test statistic:
$t = \dfrac{59.30 - 45}{9.84/\sqrt{20}} = 6.50$
P-value ≈ 0
Use α = .05
Since P-value < α, we reject H0. There is convincing evidence that the mean perceived elapsed time is greater than the actual elapsed time of 45 seconds.
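The t statistic and its upper-tail area can be computed without a t table. The sketch below is an illustration only: it uses the summary statistics from the example, and it approximates the tail area by numerically integrating the t density with Simpson's rule, a choice made here to stay within Python's standard library. The function names are hypothetical.

```python
from math import gamma, pi, sqrt


def t_pdf(t, df):
    """Density of the t distribution with df degrees of freedom."""
    coef = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)


def t_upper_tail(t, df, steps=2000):
    """Approximate P(T > t) using Simpson's rule on the interval [0, |t|]."""
    a = abs(t)
    h = a / steps                      # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_0_to_a = total * h / 3
    upper = 0.5 - area_0_to_a          # the t curve is symmetric about 0
    return upper if t >= 0 else 1 - upper


# Summary statistics from the smoking abstinence example.
x_bar, s, n, mu_0 = 59.30, 9.84, 20, 45
t_stat = (x_bar - mu_0) / (s / sqrt(n))
p_value = t_upper_tail(t_stat, df=n - 1)   # Ha: mu > 45, so use the upper tail
print(f"t = {t_stat:.2f}, df = {n - 1}, P-value = {p_value:.1e}")
```

In practice a statistics package or a t table would normally supply the tail area; the integration here just makes the "area under the t curve" idea concrete.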
Smoking Abstinence Continued . . .
69 65 72 73 59 55 39 52 67 57
56 50 70 47 56 45 70 64 67 53
x̄ = 59.30 s = 9.84 n = 20
H0: μ = 45
Ha: μ > 45
Where μ is the true mean perceived elapsed time for smokers who have abstained from smoking for 24-hours
Since P-value < α, we reject H0.
α = .05
Compute the appropriate confidence interval.
Since this is a one-tailed test, α goes in the upper tail. .05 also goes in the lower tail, leaving .90 in the middle.
$59.30 \pm 1.729\left(\dfrac{9.84}{\sqrt{20}}\right)$
(55.497, 63.103)
Notice that the hypothesized value of 45 is NOT in the 90% confidence interval and that we “rejected” H0!
A growing concern of employers is time spent in activities
like surfing the Internet and emailing friends during work
hours. The San Luis Obispo Tribune summarized the
findings of a large survey of workers in an article that ran
under the headline “Who Goofs Off More than 2 Hours a
Day? Most Workers, Survey Says” (August 3, 2006).
Suppose that the CEO of a large company wants to
determine whether the average amount of wasted time
during an 8-hour day for employees of her company is less
than the reported 120 minutes. Each person in a random
sample of 10 employees was contacted and asked about
daily wasted time at work. The resulting data are the
following:
108 112 117 130 111 131 113 113 105 128
What is the mean and standard deviation of the sample?
x̄ = 116.80 s = 9.45 n = 10
Surfing Internet Continued . . .
108 112 117 130 111 131 113 113 105 128
x̄ = 116.80 s = 9.45 n = 10
State the hypotheses.
H0: μ = 120
Ha: μ < 120
Where μ is the true mean daily wasted time for employees of this company
Verify the assumptions.
Assumptions:
1) The given sample was a random sample of employees
2) Since the sample size is not at least 30, we must determine if it is plausible that the population distribution is approximately normal.
The boxplot reveals some skewness, but there are no outliers. It is plausible that the population distribution is approximately normal.
[Figure: boxplot of daily wasted time, axis from 110 to 130]
Surfing Internet Continued . . .
108 112 117 130 111 131 113 113 105 128
x̄ = 116.80 s = 9.45 n = 10
H0: μ = 120
Ha: μ < 120
Where μ is the true mean daily wasted time for employees of this company
Compute the test statistic and P-value.
Test Statistic:
$t = \dfrac{116.80 - 120}{9.45/\sqrt{10}} = -1.07$
P-value = .150
Use α = .05
Since P-value > α, we fail to reject H0. There is not sufficient evidence to conclude that the mean daily wasted time for employees of this company is less than 120 minutes.
What potential error could we have made? Type II
Bolt Example
A manufacturer of a special bolt requires
that this type of bolt have a mean shearing
strength in excess of 110 lb. To determine if
the manufacturer’s bolts meet the required
standards a sample of 25 bolts was obtained
and tested. The sample mean was 112.7 lb
and the sample standard deviation was 9.62
lb. Use this information to perform an
appropriate hypothesis test with a
significance level of 0.05.
Bolt Example Continued
μ = the mean shearing strength of this specific type of bolt
The hypotheses to be tested are
H0: μ = 110 lb
Ha: μ > 110 lb
The significance level to be used for the test is α = 0.05.
The test statistic is $t = \dfrac{\bar{x} - 110}{s/\sqrt{n}}$
Bolt Example Continued
$\bar{x} = 112.7,\quad s = 9.62,\quad n = 25,\quad df = 24$
$P\text{-value} = P\left(t > \dfrac{112.7 - 110}{9.62/\sqrt{25}}\right) = P(t > 1.4) = 0.087$
Bolt Example Continued
Because P-value = 0.087 > 0.05 = α, we
fail to reject H0.
At a level of significance of 0.05, there is
insufficient evidence to conclude that the
mean shearing strength of this brand of bolt
exceeds 110 lbs.
Charm Example
A jeweler is planning on manufacturing gold
charms. His design calls for a particular piece to
contain 0.08 ounces of gold. The jeweler would
like to know if the pieces that he makes contain
(on the average) 0.08 ounces of gold. To test to
see if the pieces contain 0.08 ounces of gold, he
made a sample of 16 of these particular pieces
and obtained the following data.
0.0773 0.0779 0.0756 0.0792 0.0777
0.0713 0.0818 0.0802 0.0802 0.0785
0.0764 0.0806 0.0786 0.0776 0.0793
0.0755
Use a level of significance of 0.01 to perform an
appropriate hypothesis test.
Charm Example Continued
The population characteristic being studied is μ = true mean gold content for this particular type of charm.
H0: µ = 0.08 oz
Ha: µ ≠ 0.08 oz
α = 0.01
$t = \dfrac{\bar{x} - \text{hypothesized mean}}{s/\sqrt{n}} = \dfrac{\bar{x} - 0.08}{s/\sqrt{n}}$
Charm Example Continued
Minitab was used to create a normal plot along
with a graphical display of the descriptive
statistics for the sample data. Based on this
display, it is reasonable to assume that the
population of gold contents of this type of charm
is normally distributed.
Charm Example Continued
We can see that, with the exception of one outlier, the
data are reasonably symmetric and mound shaped,
indicating that the population of amounts of gold for
this particular charm can reasonably be assumed to be
normally distributed.
Descriptive Statistics
Variable: Gold
[Figure: Minitab graphical summary for the variable Gold]

Anderson-Darling Normality Test
  A-Squared:      0.363
  P-Value:        0.396

  Mean            7.80E-02
  StDev           2.51E-03
  Variance        6.32E-06
  Skewness        -1.10922
  Kurtosis        2.23191
  N               16

  Minimum         7.13E-02
  1st Quartile    7.66E-02
  Median          7.82E-02
  3rd Quartile    8.00E-02
  Maximum         8.18E-02

95% Confidence Interval for Mu:      7.66E-02 to 7.93E-02
95% Confidence Interval for Sigma:   1.86E-03 to 3.89E-03
95% Confidence Interval for Median:  7.71E-02 to 7.95E-02
Charm Example Continued
7. Computations:
n = 16, x̄ = 0.077981, s = 0.0025143
$t = \dfrac{0.077981 - 0.08}{0.0025143/\sqrt{16}} = -3.2$
This is a two-tailed test. In the table of tail areas
for t curves, the entry for 3.2 with df = 15 is 0.003,
so
P-value = 2(0.003) = 0.006
Charm Example Continued
Since P-value = 0.006 ≤ 0.01 = α, we reject H0
at the 0.01 level of significance.
At the 0.01 level of significance there is
convincing evidence that the true mean gold
content of this type of charm is not 0.08
ounces.
Actually, when rejecting a null hypothesis for the ≠
alternative, a one-tailed claim is supported. In this
case, at the 0.01 level of significance, there is
convincing evidence that the true mean gold content
of this type of charm is less than 0.08 ounces.
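The summary statistics and the t statistic for the charm example can be recomputed from the 16 data values. This is a small sketch using Python's standard `statistics` module in place of Minitab; the variable names are chosen for this illustration only.

```python
import math
import statistics

# Gold content (ounces) of the 16 sampled charms, as listed in the example
gold = [
    0.0773, 0.0779, 0.0756, 0.0792, 0.0777,
    0.0713, 0.0818, 0.0802, 0.0802, 0.0785,
    0.0764, 0.0806, 0.0786, 0.0776, 0.0793,
    0.0755,
]

mu0 = 0.08                      # hypothesized mean gold content
n = len(gold)
x_bar = statistics.mean(gold)   # sample mean
s = statistics.stdev(gold)      # sample standard deviation (divisor n - 1)

# One-sample t statistic; negative because the sample mean is below 0.08
t_stat = (x_bar - mu0) / (s / math.sqrt(n))

print(f"n = {n}, mean = {x_bar:.6f}, s = {s:.7f}, t = {t_stat:.2f}, df = {n - 1}")
```

The sign of the statistic shows which one-tailed claim the data support: a negative t points to a mean below 0.08 ounces.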
Section 10.5
Power and Probability of
Type II Error
Recall Type I and Type II Errors:
Suppose H0 is true and we reject it; what type of decision was made?
Suppose H0 is false and we fail to reject it; what type of decision was made?
Suppose H0 is true and we fail to reject it; what type of decision was made?
Suppose H0 is false and we reject it; what type of decision was made?

                     H0 is true      H0 is false
Reject H0            Type I error    Correct (Power)
Fail to reject H0    Correct         Type II error

The probability that we correctly reject H0 is
called the power of the test.
Power and Probability of
Type II Error
The power of a test is the probability of
rejecting the null hypothesis.
When H0 is false, the power is the probability
that the null hypothesis is rejected.
Specifically, power = 1 – β, where β is the probability of a Type II error.
Comments on Power
Calculating β (hence power) depends on
knowing the true value of the population
characteristic being tested. Since the true
value is not known, generally, one calculates
β for a number of possible “true” values of
the characteristic under study and then
sketches a power curve.
Suppose that the student body president at a
university is interested in studying the amount of
money that students spend on textbooks each
semester. The director of financial aid services
believes that the average amount spent on
textbooks is $500 each semester, and uses this to
determine the amount of financial aid for which a
student is eligible. The student body president plans
to ask each student in a random sample how much
he or she spent on books this semester and use the
data to test (using α = .05) the following hypotheses:
H0: μ = 500 versus Ha: μ > 500
The power of a test depends on the true value of the
mean! Because the actual value of μ is unknown, we
cannot know the power for the actual value of μ.
BUT, we can gain insight into the power of the test by
investigating some “what if” scenarios . . .
If the true mean is greater than $500, we should
reject H0. BUT, if the true mean is ONLY a little
greater, say $505, then the sample mean might look
like what we would expect if the true mean were $500.
Thus we wouldn’t have convincing evidence to reject H0.
However, if the true mean was $525, then it is less
likely that the sample would be mistaken for a sample
from the population if the mean were $500. So, it is
more likely that we will correctly reject H0.
Let’s consider a one-sided, upper tail test.
[Figure: curves centered at μ0 and μa with the Fail to Reject H0 and Reject H0 regions and the areas α, β, and Power = 1 − β marked]
If the null hypothesis is false, then
μ > hypothesized value
Textbooks Continued . . .
H0: μ = 500
Ha: μ > 500
Suppose that σ = $85 and
n = 100. (Since n is large, the
sampling distribution of x̄ is
approximately normal.)
What is the probability of committing a Type I error?
α = .05
If μ = 500 is true, for what values of the sample mean
would you reject the null hypothesis?
[Figure: curve centered at 500 with area .95 to the left of the cutoff and the Rejection Region to its right]
Use the z value with .95 area to its left: z = 1.645.
$1.645 = \dfrac{\bar{x} - 500}{85/\sqrt{100}}$
What is the value of this x̄?
We would reject H0 for
x̄ > 513.98.
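The cutoff for the rejection region can be recomputed without a z table. This sketch assumes Python 3.8 or later, where `statistics.NormalDist` is available in the standard library; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu0 = 500      # hypothesized mean under H0
sigma = 85     # population standard deviation given in the example
n = 100        # sample size
alpha = 0.05   # significance level

# z value with area 1 - alpha = .95 to its left
z_crit = NormalDist().inv_cdf(1 - alpha)

# Solve z_crit = (x_bar - mu0) / (sigma / sqrt(n)) for x_bar
cutoff = mu0 + z_crit * sigma / sqrt(n)

print(f"z = {z_crit:.3f}; reject H0 for x-bar > {cutoff:.2f}")
```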
Textbooks Continued . . .
Suppose that σ = $85 and
n = 100.
H0: μ = 500
Ha: μ > 500
We would reject H0 for
x̄ > 513.98.
If the null hypothesis is
false, then μ > 500.
What if μ = 520?
What is the probability
of a Type II error (β)?
[Figure: curves centered at 500 and 520 with the Rejection Region marked]
This area (to the left of x̄ = 513.98) is β.
$z = \dfrac{513.98 - 520}{85/\sqrt{100}} = -.708$
Look this up in the table of areas for z
curves: β = .24
Textbooks Continued . . .
H0: μ = 500
Ha: μ > 500
Suppose that σ = $85 and
n = 100.
We would reject H0 for
x̄ > 513.98.
What is the power of the test if μ = 520?
β = .24
Power = 1 − .24 = .76
Power is the probability of correctly rejecting H0.
Notice that power is in the SAME curve as β.
[Figure: curves centered at 500 and 520 with the Rejection Region and Power = 1 − β marked]
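The β and power values for the “what if” scenario μ = 520 can be checked as follows. This is a sketch using `statistics.NormalDist` in place of the z table; it takes the cutoff 513.98 from the slides as given.

```python
from math import sqrt
from statistics import NormalDist

sigma, n = 85, 100
cutoff = 513.98    # reject H0 for x-bar > cutoff (value quoted on the slides)
mu_true = 520      # "what if" value of the true mean

se = sigma / sqrt(n)            # standard deviation of x-bar
z = (cutoff - mu_true) / se     # standardize the cutoff under mu = 520

beta = NormalDist().cdf(z)      # P(fail to reject H0 | mu = 520)
power = 1 - beta                # P(reject H0 | mu = 520)

print(f"z = {z:.3f}, beta = {beta:.2f}, power = {power:.2f}")
```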
Textbooks Continued . . .
Suppose that σ = $85 and
n = 100.
H0: μ = 500
Ha: μ > 500
We would reject H0 for
x̄ > 513.98.
If we reject H0, then
μ > 500.
What if μ = 530?
Find β and power.
β = .03
power = .97
Notice that, as the distance between the null
hypothesized value for μ and our
alternative value for μ increases, β
decreases AND power increases.
[Figure: curves centered at 500 and 530 with the Rejection Region marked]
Textbooks Continued . . .
Suppose that σ = $85 and
n = 100.
H0: μ = 500
Ha: μ > 500
We would reject H0 for
x̄ > 513.98.
If the null hypothesis is
false, then μ > 500.
What if μ = 510?
Find β and power.
β = .68
power = .32
Notice that, as the distance between the null
hypothesized value for μ and our
alternative value for μ decreases, β
increases AND power decreases.
[Figure: curves centered at 500 and 510 with the Rejection Region marked]
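The “what if” scenarios can be collected into a small table, which is the raw material for a power curve. This is a sketch under the example's assumptions (σ = 85, n = 100, α = .05); the helper `beta_at` and the extra alternative value 505 are choices made for this illustration.

```python
from math import sqrt
from statistics import NormalDist

mu0, sigma, n, alpha = 500, 85, 100, 0.05
se = sigma / sqrt(n)                                   # standard deviation of x-bar
cutoff = mu0 + NormalDist().inv_cdf(1 - alpha) * se    # reject H0 above this value

def beta_at(mu_true):
    """P(fail to reject H0) = P(x-bar <= cutoff) when the true mean is mu_true."""
    return NormalDist(mu_true, se).cdf(cutoff)

alternatives = [505, 510, 520, 530]
table = [(mu, beta_at(mu), 1 - beta_at(mu)) for mu in alternatives]

for mu, beta, power in table:
    print(f"mu = {mu}: beta = {beta:.2f}, power = {power:.2f}")
```

Plotting power against the alternative value of μ gives the power curve described under Comments on Power.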
Textbooks Continued . . .
Suppose that σ = $85 and
n = 100.
H0: μ = 500
Ha: μ > 500
What happens if we use α = .01?
β will increase and power will
decrease.
[Figure: two panels, each marking the Rejection Region, β, and Power]
What happens to α, β, & power when
the sample size is increased?
[Figure: curves centered at μ0 and μa with the Fail to Reject H0 and Reject H0 regions and the areas α, β, and Power marked]
The standard deviation of the sampling distribution of x̄
will decrease, making the curve taller and skinnier.
The significance level (α) remains the same, so the
value where the rejection region begins must move.
β decreases and power increases!
Effects of Various Factors on the
Power of a Test
• The larger the size of the discrepancy
between the hypothesized value and the
actual value of the population
characteristic, the higher the power.
• The larger the significance level α, the
higher the power of a test.
• The larger the sample size, the higher the
power of a test.
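The effects of the significance level and the sample size can be illustrated with the textbook example. This is a sketch only: the alternative value μ = 520 and the particular values of α and n tried below are assumptions made for the illustration, and `power_upper_tail` is a made-up helper name.

```python
from math import sqrt
from statistics import NormalDist

def power_upper_tail(mu_true, mu0=500, sigma=85, n=100, alpha=0.05):
    """Power of the upper-tail z test of H0: mu = mu0 when the true mean is mu_true."""
    se = sigma / sqrt(n)                                   # shrinks as n grows
    cutoff = mu0 + NormalDist().inv_cdf(1 - alpha) * se    # start of rejection region
    beta = NormalDist(mu_true, se).cdf(cutoff)             # P(x-bar <= cutoff | mu_true)
    return 1 - beta

# Vary alpha with n fixed at 100
by_alpha = {a: power_upper_tail(520, alpha=a) for a in (0.01, 0.05, 0.10)}

# Vary n with alpha fixed at .05
by_n = {n: power_upper_tail(520, n=n) for n in (50, 100, 200)}

for a, p in by_alpha.items():
    print(f"alpha = {a}: power = {p:.2f}")
for n, p in by_n.items():
    print(f"n = {n}: power = {p:.2f}")
```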
b and Power for a t Test
When using a t-test, the population standard
deviation σ is unknown. β not only depends on α,
n, and the actual value of μ, but β also depends on
σ so we must have an estimate of σ.
The β curves (on the next slide) can be used to
estimate β and the power of a test based upon the
value of d.
$d = \dfrac{\text{alternative value} - \text{hypothesized value}}{\sigma}$
β curves
Consider testing
H0: μ = 100 versus Ha: μ > 100
and focus on the alternative value μ = 110.
Suppose that σ = 10, n = 7, and α = .01.
Calculate d.
$d = \dfrac{110 - 100}{10} = 1$
Use the df = 6 curve to estimate β.
β ≈ .6
Power ≈ .4
When the population distribution is normal, the t
test for testing hypotheses about μ has smaller
β than does any other test procedure that has the
same significance level α.
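The reading taken from the β curves can be compared with a simulation. This is a rough Monte Carlo sketch, not the method behind the curves: it assumes a normal population, uses the tabled upper-tail critical value 3.143 for df = 6 and α = .01, and the seed and number of repetitions are arbitrary choices.

```python
import math
import random

random.seed(1)   # arbitrary seed, fixed so the run is repeatable

mu0, mu_true, sigma, n = 100, 110, 10, 7
t_crit = 3.143   # upper-tail t critical value for df = 6, alpha = .01 (from a t table)
reps = 20000     # number of simulated samples (arbitrary)

d = (mu_true - mu0) / sigma   # standardized distance used with the beta curves

rejections = 0
for _ in range(reps):
    # Draw one sample of size n from a normal population with the alternative mean
    sample = [random.gauss(mu_true, sigma) for _ in range(n)]
    x_bar = sum(sample) / n
    s = math.sqrt(sum((x - x_bar) ** 2 for x in sample) / (n - 1))
    t_stat = (x_bar - mu0) / (s / math.sqrt(n))
    if t_stat > t_crit:       # upper-tail test rejects H0
        rejections += 1

power_est = rejections / reps   # estimated power at mu = 110
beta_est = 1 - power_est        # estimated Type II error probability

print(f"d = {d}, estimated power = {power_est:.2f}, estimated beta = {beta_est:.2f}")
```

The estimate varies a little with the seed; it is meant to be set beside the approximate values read from the df = 6 curve, not to replace them.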