Sta220 - Statistics
Mr. Smith
Room 310
Class #18
Section 6-1 and 6-2 Notes
6-1: The Elements of a Test of Hypothesis
Suppose building specifications in a certain city
require that the average breaking strength of
residential sewer pipe be more than 2,400
pounds per foot of length.
Each manufacturer who wants to sell pipe in
that city must demonstrate that its product
meets the specification.
We are less interested in estimating the value of
μ than we are in testing a hypothesis about its
value.
We want to decide whether the mean breaking
strength of the pipe exceeds 2,400 pounds per
linear foot.
A statistical hypothesis is a statement about the
numerical value of a population parameter.
We define two hypotheses:
(1) The null hypothesis, H₀
(2) The alternative (research) hypothesis, Hₐ
The null hypothesis, denoted H₀, represents the hypothesis
that will be accepted unless the data provide convincing
evidence that it is false. This usually represents the 'status
quo' or some claim about the population parameter that the
researcher wants to test.
The alternative (research) hypothesis, denoted Hₐ,
represents the hypothesis that will be accepted only if the
data provide convincing evidence of its truth. This usually
represents the values of a population parameter that the
researcher wants to gather evidence to support.
Null Hypothesis (H₀): μ ≤ 2,400
(the manufacturer's pipe does not meet
specifications)
Alternative Hypothesis (Hₐ): μ > 2,400
(the manufacturer's pipe meets specifications)
How can the city decide when enough evidence
exists to conclude that the manufacturer's pipe
meets specifications?
"Convincing" evidence in favor of the alternative
hypothesis will exist when the value of x̄ exceeds
2,400 by an amount that cannot be readily
attributed to sampling variability.
To decide, we compute a test statistic.
The test statistic is a sample statistic, computed
from information provided in the sample, that
the researcher uses to decide between the null
and alternative hypotheses.
Test Statistic
$$z = \frac{\bar{x} - 2400}{\sigma_{\bar{x}}} \approx \frac{\bar{x} - 2400}{s/\sqrt{n}}$$
If you examine the figure below, the chance of
observing x̄ more than 1.645 standard deviations
above 2,400 is only .05 if in fact the true mean μ is
2,400.
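The .05 tail area quoted above can be checked numerically. This is a minimal sketch using Python's standard library; it assumes the test statistic z is (approximately) standard normal when μ = 2,400, which is what the large-sample test relies on. The variable names are illustrative only.

```python
from statistics import NormalDist

# Sampling distribution of z when the true mean equals 2,400 (assumed standard normal)
standard_normal = NormalDist(mu=0.0, sigma=1.0)

# Area to the right of z = 1.645, i.e. the chance of seeing x-bar
# more than 1.645 standard deviations above 2,400
upper_tail = 1.0 - standard_normal.cdf(1.645)
print(round(upper_tail, 3))  # 0.05
```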
If the sample mean is more than 1.645 standard
deviations above 2,400, either H₀ is true and a
relatively rare event has occurred (.05
probability) or Hₐ is true and the population
mean exceeds 2,400.
Since we would most likely reject the notion
that a rare event has occurred, we would reject
the null hypothesis (μ ≤ 2,400) and conclude that
the alternative hypothesis (μ > 2,400) is true.
What is the probability that this procedure will
lead us to an incorrect decision?
Such an incorrect decision, deciding that the
null hypothesis is false when in fact it is true,
is called a Type I error.
Type I Error
A Type I error occurs if the researcher rejects the
null hypothesis in favor of the alternative
hypothesis when, in fact, H₀ is true. The
probability of committing a Type I error is
denoted by α.
α = P(z > 1.645 when in fact μ = 2,400) = .05
H₀: μ ≤ 2,400 (Pipe does not meet specs)
Hₐ: μ > 2,400 (Pipe meets specs)
Test Statistic:
$$z = \frac{\bar{x} - 2400}{\sigma_{\bar{x}}}$$
Rejection Region: z > 1.645, which corresponds to α = .05
The rejection region of a statistical test is the set
of possible values of the test statistic for which
the researcher will reject H₀ in favor of Hₐ.
Let's test this.
Suppose we test 50 sections of sewer pipe and
find the mean and standard deviation for these 50
measurements to be
x̄ = 2,460
s = 200
Test Statistic is
$$z \approx \frac{2460 - 2400}{200/\sqrt{50}} = 2.12$$
Therefore, the sample mean lies 2.12σ_x̄ above the
hypothesized value of μ, 2,400, as shown in the
figure below.
Since the z-score exceeds 1.645, it falls into the
rejection region.
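The calculation above can be reproduced in a few lines. This is a sketch only: the function names are made up for illustration, and it follows the notes in using s in place of the unknown σ, which is reasonable for a large sample such as n = 50.

```python
from math import sqrt

def z_statistic(sample_mean, hypothesized_mean, sample_sd, n):
    """Large-sample z statistic, with s standing in for the unknown sigma."""
    standard_error = sample_sd / sqrt(n)
    return (sample_mean - hypothesized_mean) / standard_error

def in_upper_rejection_region(z, critical_value=1.645):
    """Upper-tailed test: reject H0 when z exceeds the critical value."""
    return z > critical_value

z = z_statistic(sample_mean=2460, hypothesized_mean=2400, sample_sd=200, n=50)
print(round(z, 2), in_upper_rejection_region(z))  # 2.12 True
```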
We would reject the null hypothesis that
μ ≤ 2,400 and conclude that μ > 2,400. It
appears that the company's pipe has a mean
strength that exceeds 2,400 pounds per linear foot.
We selected the level of risk, α = .05, of making a
Type I error when we constructed the test.
Now, suppose we test 50 sections of sewer pipe
and find the mean and standard deviation for these
50 measurements to be
x̄ = 2,430
s = 200
The z-score for this sample mean is z = 1.06.
Looking at the figure below, this z-score does
not fall into the rejection region (z > 1.645).
We cannot reject H₀ using α = .05.
Even though the sample mean exceeds the
specification by 30 pounds per linear foot, it
does not exceed the specification by enough to
provide convincing evidence that the population
mean exceeds 2,400.
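How much is "enough"? One way to see it is to turn the rejection region z > 1.645 back into a cutoff for the sample mean. The sketch below assumes the same setup as the notes (n = 50, s = 200 used in place of σ); the variable names are illustrative.

```python
from math import sqrt

hypothesized_mean = 2400
sample_sd = 200      # s, used in place of the unknown sigma
n = 50
critical_z = 1.645   # upper-tailed test at alpha = .05

standard_error = sample_sd / sqrt(n)

# Smallest sample mean that would land in the rejection region
smallest_rejecting_mean = hypothesized_mean + critical_z * standard_error
print(round(smallest_rejecting_mean, 1))  # 2446.5
```

A sample mean of 2,430 falls short of this cutoff, while 2,460 clears it, which matches the two conclusions reached in the notes.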
Should we accept the null hypothesis
H₀: μ ≤ 2,400 and conclude that the
manufacturer's pipe does not meet
specifications?
Doing so risks concluding that H₀ is true when
it is in fact false. This mistake is called a
Type II error.
A Type II error occurs if the researcher accepts
the null hypothesis when, in fact, H₀ is false.
The probability of committing a Type II error is
denoted by β.
The probability of a Type II error, β, is often
difficult to determine precisely. Rather than make
a decision (accept H₀) for which the probability of
error is unknown, we avoid the potential Type II
error by avoiding the conclusion that the null
hypothesis is true.
We simply state that the sample evidence is
insufficient to reject H₀ at α = .05.
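To see why β is hard to pin down, here is a rough sketch that computes β for a few possible true means. The candidate means (2,410 through 2,475) are hypothetical values chosen for illustration, not numbers from the notes. The sketch also assumes that s = 200 can stand in for σ and that x̄ is approximately normal with n = 50.

```python
from math import sqrt
from statistics import NormalDist

def type_ii_error_probability(true_mean, hypothesized_mean=2400,
                              sd=200, n=50, critical_z=1.645):
    """P(fail to reject H0) when the population mean is really true_mean."""
    standard_error = sd / sqrt(n)
    # H0 is not rejected whenever x-bar is at or below this cutoff
    cutoff = hypothesized_mean + critical_z * standard_error
    return NormalDist(mu=true_mean, sigma=standard_error).cdf(cutoff)

# Hypothetical true means: beta changes with each one
for true_mean in (2410, 2430, 2450, 2475):
    print(true_mean, round(type_ii_error_probability(true_mean), 3))
```

Because β depends on the true value of μ, which is unknown, there is no single number to report. That is the reason the notes recommend not "accepting" H₀.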
Conclusions
The "true state of nature" columns refer to the fact that
either the null hypothesis is true or the alternative
hypothesis is true. The "decision" rows refer to the action
of the researcher, assuming that he or she will either
conclude that H₀ is true or that Hₐ is true, based on the
results of the sampling experiment.
A Type I error can be made ONLY when the null
hypothesis is rejected in favor of the alternative
hypothesis, and a Type II error can be made
ONLY when the null hypothesis is accepted.
Conclusions and Consequences for a Test of Hypothesis
(columns: True State of Nature)
Conclusion | H₀ True: Pipe does not meet specs | Hₐ True: Pipe meets specs
Accept H₀: Pipe does not meet specs | Correct decision | Type II error (β)
Reject H₀ (Accept Hₐ): Pipe does meet specs | Type I error (α) | Correct decision
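The Type I error cell can be checked by simulation. The sketch below repeatedly samples pipe strengths from a population where H₀ is true (μ = 2,400) and counts how often the test wrongly rejects. The normal population with σ = 200, the number of trials, and the seed are all assumptions made for illustration.

```python
import random
from math import sqrt
from statistics import fmean, stdev

def simulate_type_i_rate(trials=4000, n=50, true_mean=2400, true_sd=200,
                         critical_z=1.645, seed=1):
    """Fraction of samples that reject H0 even though mu = 2,400."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        # Hypothetical population: normal with the given mean and sd
        sample = [rng.gauss(true_mean, true_sd) for _ in range(n)]
        z = (fmean(sample) - 2400) / (stdev(sample) / sqrt(n))
        if z > critical_z:
            rejections += 1
    return rejections / trials

rate = simulate_type_i_rate()
print(rate)  # should land near alpha = .05
```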
Warning!!!
Be careful not to "accept H₀" when conducting a
test of hypothesis because the measure of
reliability, β = P(Type II error), is almost
always unknown. If the test statistic does not fall
into the rejection region, it is better to state the
conclusion as "insufficient evidence to reject
H₀."
Procedure
Copyright © 2013 Pearson
Education, Inc. All rights
reserved.
6.2: Formulating Hypotheses and Setting Up
the Rejection Region
Procedure
Rejection regions corresponding to one- and
two-tailed tests
Table 8.2
Example 6.2.1
A metal lathe is checked periodically by
quality control inspectors to determine whether
it is producing machine bearings with a mean
diameter of .5 inch. If the mean diameter of the
bearings is larger or smaller than .5 inch, then the
process is out of control and must be adjusted.
Formulate the null and alternative hypotheses
for a test to determine whether the bearing
production process is out of control.
Solution
We define μ as the true mean diameter (in inches) of all
bearings produced by the metal lathe.
If either μ > .5 or μ < .5, then the lathe's production
process is out of control.
Because μ = .5 represents an in-control process (the status
quo), this represents the null hypothesis. Therefore, we want
to conduct the two-tailed test:
H₀: μ = .5 (the process is in control)
Hₐ: μ ≠ .5 (the process is out of control)
Example 6.2.2
The effect of drugs and alcohol on the nervous system
has been the subject of considerable research. Suppose a
research neurologist is testing the effect of a drug on
response time by injecting 100 rats with a unit dose of
the drug, subjecting each rat to a neurological stimulus,
and recording its response time. The neurologist knows
that the mean response time for rats not injected with
the drug (the "control" mean) is 1.2 seconds. She wishes
to test whether the mean response time for drug-injected
rats differs from 1.2 seconds. Set up the test of
hypotheses for this experiment, using α = .01.
Solution
Since the neurologist wishes to detect whether
μ differs from the control mean of 1.2 seconds
in either direction, that is μ < 1.2 or μ > 1.2,
we conduct a two-tailed statistical test.
Solution
H₀: μ = 1.2
(Mean response time is 1.2 seconds)
Hₐ: μ ≠ 1.2
(Mean response time is less than 1.2 seconds or
greater than 1.2 seconds)
Test Statistic:
$$z = \frac{\bar{x} - 1.2}{s/\sqrt{n}}$$
Rejection Region
We will reject H₀ for values of z that are either
too small or too large.
We were given α = .01, and since this is a two-tailed
test, we place α/2 = .005 in each tail.
Therefore our rejection region is:
z < −2.575 or z > 2.575
Assumptions:
Since the sample size of the experiment is large
enough (n = 100), the CLT will apply and no
assumptions need to be made about the
population of response time measurements.
Two-tailed rejection region:
α = .01
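The cutoffs for the two-tailed rejection region can be looked up in a normal table, as in the notes, or computed directly. The sketch below uses Python's standard library; the function names are illustrative. It returns a value of about 2.576, which agrees with the table value of 2.575 used above to within rounding.

```python
from statistics import NormalDist

def two_tailed_critical_value(alpha):
    """z cutoff that leaves alpha/2 in each tail of the standard normal."""
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)

def reject_two_tailed(z, alpha):
    """True when z falls in either tail of the rejection region."""
    return abs(z) > two_tailed_critical_value(alpha)

critical_z = two_tailed_critical_value(0.01)
print(round(critical_z, 3))  # 2.576
```

Once the neurologist's sample mean and standard deviation are known, the resulting z can be passed to `reject_two_tailed(z, 0.01)` to make the decision.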
Reminder:
Homework 5.2 due today
Homework 5.3 due today
Homework 5.5 due today
Quiz 5 Review due Wednesday May 7, 2014
Homework 5 Review due Wednesday May 7, 2014
Homework 6.2 due Friday May 9, 2014