Statistics 511: Statistical Methods
Dr. Levine
Purdue University
Spring 2011
Lecture 14: Introduction to Hypothesis Testing
Devore: Section 8.1
March, 2011
Page 1
What is a statistical hypothesis?
• A statistical hypothesis is a claim about the value of one or more
parameters or about the form of a distribution as a whole.
• As an example, consider a normal distribution with mean µ.
Then the statement µ = .75 is a hypothesis.
Null and Alternative Hypotheses
• Usually, two contradictory hypotheses are under consideration.
For example, we may have µ = .75 and µ ≠ .75. Alternatively,
for the probability of success of some binomial distribution, we
may have p ≥ .10 and p < .10.
1. The null hypothesis H0 is the one that is initially assumed to
be true.
2. The alternative hypothesis Ha is the assertion contrary to
H0.
• We reject the null hypothesis in favor of the alternative
hypothesis if the sample evidence suggests so. If the sample
does not contradict H0, we continue to believe it is true.
• Thus, the two possible conclusions from a hypothesis-testing
analysis are "reject H0" or "fail to reject H0".
An Example of the Test
• A test of hypothesis is a method for using sample data to decide
whether the null hypothesis should be rejected.
• How exactly do we formulate a test? It depends on what our
goals are...
• Consider a company that wants to introduce an expensive new
product to its line-up of existing ones. Clearly, there has to be
extensive evidence in favor of this new product. If it is, for
example, a new type of lightbulb, we need to ensure that its
average lifetime is much longer than that of existing types
before adopting it.
• A reasonable test would be to test H0 : µ = a vs. Ha : µ > a
where a is some predetermined threshold.
• Clearly, the alternatives Ha : µ < a or Ha : µ ≠ a are of no
interest in this case.
• Ha : µ < a and Ha : µ > a are called one-sided alternatives;
Ha : µ ≠ a is called a two-sided alternative.
• The value a that separates the null hypothesis from the alternative is
called the null value.
Testing Procedure
• A test procedure is specified by
1. A test statistic, a function of the sample data on which the
decision will be based
2. A rejection region, the set of all test statistic values for which
H0 will be rejected (the null hypothesis is rejected if and only
if the test statistic value falls in this region).
Example
• Consider the claim that the average nicotine content
of a cigarette brand is at most 1.5 mg. In this case, the best
setup would be to test H0 : µ = 1.5 vs. Ha : µ > 1.5. Why?
We only care if this content is exceeded!
• Let X̄ be the sample average nicotine content. Then evidence
against H0 would be provided by x̄ > 1.5.
• Note that the choice of the rejection region is somewhat
arbitrary... We could have selected x̄ > 1.55 as a rejection
region.
Error Types
• Example: It is possible that x̄ = 1.8 for some sample even
when H0 is true.
• A Type I error consists of rejecting the null hypothesis H0 when
it is true.
• Example: It is also possible that x̄ = 1.5 for a particular sample
even if H0 is false.
• A Type II error consists of not rejecting H0 when it is false.
• The only way to eliminate both errors is to examine the entire
population! In practice, a different procedure has to be followed.
• Assume that automobiles sustain no visible damage in 25% of
10 mph crash tests. Let p denote the proportion of all 10
mph crashes that result in no visible damage to the new
bumper. Then we test H0 : p = .25 vs. Ha : p > .25. The
experiment is based on n = 20 independent crashes with a
prototype of the new design.
Type I Error analysis
• Consider the following procedure:
1. The test statistic is X, the number of crashes with no visible
damage
2. The rejection region is R8 = {8, 9, . . . , 20}; in other words,
reject H0 if x ≥ 8.
• Thus, the probability of Type I error is
α = P(Type I error) = P(X ≥ 8 when X ∼ Bin(20, .25))
= 1 − B(7; 20, .25) = .102
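The Type I error probability above can be checked numerically. The following is a minimal sketch using only the Python standard library; the helper name binom_cdf is an assumption introduced here for illustration and plays the role of the tabulated B(x; n, p).

```python
from math import comb

def binom_cdf(x: int, n: int, p: float) -> float:
    """B(x; n, p) = P(X <= x) for X ~ Bin(n, p) (hypothetical helper)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x + 1))

n, p0 = 20, 0.25   # sample size and null value from the crash-test example
cutoff = 8         # rejection region R8 = {8, 9, ..., 20}

# alpha = P(X >= 8 when p = .25) = 1 - B(7; 20, .25)
alpha = 1 - binom_cdf(cutoff - 1, n, p0)
print(f"alpha = {alpha:.3f}")  # alpha = 0.102
```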
Type II Error Analysis
• In contrast to the Type I error, there is no single β; instead, there
is a different β for each value of p consistent with Ha
• Suppose the true value of p is p = 0.3. Then,
β(.3) = P(Type II error when p = 0.3)
= P(X ≤ 7 when X ∼ Bin(20, .3))
= B(7; 20, .3) = .772
• β decreases as p moves farther above the null value .25: the
larger the true p, the less likely it is that X ≤ 7
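To see how β depends on the true p, one can evaluate it over several alternatives. This is a sketch only: the function names binom_cdf and beta and the particular grid of p values are assumptions chosen for illustration, and the rejection region is taken to be {8, ..., 20} as in the Type I error analysis.

```python
from math import comb

def binom_cdf(x: int, n: int, p: float) -> float:
    """B(x; n, p) = P(X <= x) for X ~ Bin(n, p) (hypothetical helper)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x + 1))

def beta(p: float, n: int = 20, cutoff: int = 8) -> float:
    """Type II error probability: P(X <= cutoff - 1) when the true proportion is p."""
    return binom_cdf(cutoff - 1, n, p)

# Illustrative grid of alternatives p > .25 (not taken from the lecture)
for p in (0.3, 0.4, 0.5, 0.6):
    print(f"p = {p:.1f}: beta = {beta(p):.3f}")
```

The first printed line reproduces β(.3) = .772; the remaining values shrink as p moves away from .25.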
• Now consider a different rejection region
R9 = {9, 10, . . . , 20}.
• Since X ∼ Bin(20, p), we have
α = P(H0 rejected when p = .25)
= P(X ≥ 9 when X ∼ Bin(20, .25)) = 1 − B(8; 20, .25) = .041
• Note that the Type I error probability has gone down.
• At the same time,
β(.3) = P (H0 is not rejected when X ∼ Bin(20, .3))
= P (X ≤ 8 when X ∼ Bin(20, .3))
= B(8; 20, .3) = .887
which is larger than before (.772 with the cutoff at 8). Think of it as a
trade-off between the two error probabilities.
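The trade-off between α and β(.3) can be tabulated for neighboring rejection regions. A minimal sketch, assuming a hypothetical helper error_probs and an illustrative range of cutoffs; only the cutoffs 8 and 9 appear in the lecture.

```python
from math import comb

def binom_cdf(x: int, n: int, p: float) -> float:
    """B(x; n, p) = P(X <= x) for X ~ Bin(n, p) (hypothetical helper)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x + 1))

def error_probs(cutoff: int, n: int = 20, p0: float = 0.25, p_alt: float = 0.3):
    """Return (alpha, beta) for the rejection region {cutoff, ..., n}."""
    alpha = 1 - binom_cdf(cutoff - 1, n, p0)   # reject although p = p0
    beta = binom_cdf(cutoff - 1, n, p_alt)     # fail to reject although p = p_alt
    return alpha, beta

for cutoff in (7, 8, 9, 10):
    a, b = error_probs(cutoff)
    print(f"reject if x >= {cutoff:2d}: alpha = {a:.3f}, beta(.3) = {b:.3f}")
```

Raising the cutoff shrinks the rejection region, so α falls while β(.3) rises.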
Proposition
• If the experiment and the sample size are fixed, decreasing the
size of the rejection region to obtain a smaller value of α always
results in a larger value of β for any parameter value consistent
with the alternative hypothesis Ha.
• The usual approach is to specify the largest value of α that can
be tolerated and find a rejection region for it. This makes β as
small as possible subject to the bound on α. Such a value of α
is called the significance level of the test.
• Traditional choices are .10, .05 and .01. The resulting test is
called a level α test.
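For a discrete test statistic, "find a rejection region for a given α" can be sketched as a search for the smallest cutoff whose Type I error probability does not exceed the chosen level. The example reuses the crash-test setting (n = 20, null value .25); the function names binom_sf and smallest_cutoff are assumptions made for illustration.

```python
from math import comb

def binom_sf(c: int, n: int, p: float) -> float:
    """P(X >= c) for X ~ Bin(n, p) (hypothetical helper)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(c, n + 1))

def smallest_cutoff(level: float, n: int, p0: float) -> int:
    """Smallest c such that the region {c, ..., n} has alpha <= level.

    The largest admissible region gives the smallest beta under the bound on alpha.
    Returns n + 1 (empty region, never reject) if no smaller cutoff qualifies.
    """
    for c in range(n + 1):
        if binom_sf(c, n, p0) <= level:
            return c
    return n + 1

for level in (0.10, 0.05, 0.01):
    c = smallest_cutoff(level, n=20, p0=0.25)
    print(f"level {level:.2f}: reject if x >= {c}, "
          f"actual alpha = {binom_sf(c, 20, 0.25):.3f}")
```

Because X is discrete, the achieved α is usually strictly below the nominal level: the cutoff 8 gives α = .102, just above .10, so the levels .10 and .05 both lead to the cutoff 9.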
Example
• Consider again the nicotine example. We test H0 : µ = 1.5 vs.
Ha : µ > 1.5 based on a random sample X1, . . . , X32. We
know that σ = .20 and that X is normally distributed.
• Hence, X̄ is normally distributed with mean µX̄ = µ and
standard deviation σX̄ = σ/√n = .20/√32 = 0.0354.
• We use
Z = (X̄ − 1.5) / 0.0354
as a test statistic.
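The standardization can be sketched in a few lines. The sample size n = 32 is the one implied by the standard deviation 0.0354 = .20/√32; the observed sample mean 1.58 below is a made-up value for illustration only, not data from the lecture.

```python
from math import sqrt

mu0 = 1.5      # null value
sigma = 0.20   # known population standard deviation
n = 32         # sample size implied by sigma / sqrt(n) = 0.0354

se = sigma / sqrt(n)   # standard deviation of the sample mean

def z_stat(xbar: float) -> float:
    """Test statistic Z = (xbar - mu0) / (sigma / sqrt(n))."""
    return (xbar - mu0) / se

print(f"standard deviation of the sample mean = {se:.4f}")  # 0.0354

xbar_observed = 1.58   # hypothetical observed sample mean
print(f"z = {z_stat(xbar_observed):.2f}")
```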
• We reject H0 when z "considerably" exceeds zero... In other
words, we choose c such that
α = P(Type I error) = P(Z ≥ c when Z ∼ N(0, 1))
• For example, if α = 0.05, we have c = z.05 = 1.645. This
corresponds to x̄ ≥ 1.5 + 1.645(0.0354) ≈ 1.56.
• Then, for any particular µ > 1.5, β(µ) = P(X̄ < 1.56 | µ).
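The critical value, the equivalent cutoff for x̄, and β(µ) can be computed with the standard library's NormalDist. This is a sketch under the assumptions σ = .20 and n = 32; the alternatives µ listed in the loop are illustrative choices, and the unrounded cutoff is used instead of 1.56.

```python
from math import sqrt
from statistics import NormalDist

mu0, sigma, n = 1.5, 0.20, 32   # n = 32 is assumed (implied by 0.0354)
alpha = 0.05
se = sigma / sqrt(n)

# Critical value c with P(Z >= c) = alpha for Z ~ N(0, 1)
c = NormalDist().inv_cdf(1 - alpha)
# Equivalent cutoff on the scale of the sample mean
xbar_cutoff = mu0 + c * se
print(f"c = {c:.3f}, reject H0 if xbar >= {xbar_cutoff:.2f}")

def beta(mu: float) -> float:
    """Type II error probability P(Xbar < cutoff) when the true mean is mu."""
    return NormalDist(mu, se).cdf(xbar_cutoff)

# Illustrative alternatives mu > 1.5 (not taken from the lecture)
for mu in (1.52, 1.55, 1.60):
    print(f"mu = {mu:.2f}: beta = {beta(mu):.3f}")
```

As in the binomial example, β shrinks as the true µ moves farther above the null value 1.5.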