Minimizing Chance of
Type I and Type II Errors
Example: Grade inflation?
H0: μ = 2.7
HA: μ > 2.7
Data from a random sample of students:
n = 36, s = 0.6, and X-bar
Decision Rule
Set significance level α = 0.05.
If p-value ≤ 0.05, reject null hypothesis.
P-value when X-bar is 3 ...
Reject null since p-value is small.
P-value when X-bar is 2.95 ...
Reject null since p-value is small.
P-value when X-bar is 2.865...
(Just barely!) reject null since p-value is small.
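The three p-values above can be reproduced with a short calculation. This is a sketch, assuming the slides use the normal (Z) approximation with the sample value s = 0.6 standing in for the population standard deviation; the function name p_value_upper is ours, not from the slides.

```python
from math import sqrt
from statistics import NormalDist

MU0 = 2.7   # mean specified in H0
S = 0.6     # sample standard deviation
N = 36      # sample size

def p_value_upper(x_bar, mu0=MU0, s=S, n=N):
    """Upper-tail p-value for HA: mu > mu0 (normal approximation, an assumption)."""
    z = (x_bar - mu0) / (s / sqrt(n))   # standardized sample mean
    return 1.0 - NormalDist().cdf(z)    # area to the right of z

for x_bar in (3.0, 2.95, 2.865):
    print(f"X-bar = {x_bar}: p-value = {p_value_upper(x_bar):.4f}")
```

All three values fall below 0.05, and the last one, for X-bar = 2.865, is only just under it (roughly 0.0495), which matches the "just barely" remark.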
Alternative Decision Rule
• Instead of rejecting if p-value ≤ 0.05 …
• Equivalent to rejecting when Z > 1.65
• Equivalent to rejecting when X-bar > 2.865
• X-bar > 2.865 is called “rejection region.”
• (Intuitive, since rejecting when X-bar is
larger than expected.)
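The cutoff for the rejection region comes from X-bar = μ0 + z · s/√n. A minimal sketch, again assuming the normal approximation; rejection_cutoff is a name introduced here for illustration.

```python
from math import sqrt
from statistics import NormalDist

def rejection_cutoff(mu0, s, n, alpha):
    """X-bar value above which H0 is rejected for HA: mu > mu0 (normal approximation)."""
    z_crit = NormalDist().inv_cdf(1.0 - alpha)   # upper-alpha quantile of Z
    return mu0 + z_crit * s / sqrt(n)

z_crit = NormalDist().inv_cdf(0.95)
cutoff = rejection_cutoff(mu0=2.7, s=0.6, n=36, alpha=0.05)
print(f"Z cutoff = {z_crit:.3f}, X-bar cutoff = {cutoff:.4f}")
```

The exact normal quantile is about 1.645. The slides round it to 1.65, which is why their cutoff is 2.865 rather than about 2.8645.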
Type I Error illustrated...
Minimize chance of
Type I error...
• … by making significance level α small.
• Common values are α = 0.01, 0.05, or 0.10.
• “How small” depends on seriousness of
Type I error.
• Decision is not a statistical one but a
practical one.
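A small simulation can make the link between α and P(Type I error) concrete. This is a sketch under stated assumptions: the null is true (μ = 2.7, from the grade inflation example), the sample mean is drawn directly from a normal distribution with standard error 0.6/√36, and the function name, trial count, and seed are ours.

```python
import random
from math import sqrt

def type_i_error_rate(cutoff, mu0=2.7, sigma=0.6, n=36, trials=20000, seed=0):
    """Fraction of simulated sample means, drawn with H0 true, that land in the rejection region."""
    rng = random.Random(seed)   # fixed seed so the run is repeatable
    se = sigma / sqrt(n)        # standard error of X-bar
    rejections = sum(rng.gauss(mu0, se) > cutoff for _ in range(trials))
    return rejections / trials

rate_05 = type_i_error_rate(cutoff=2.7 + 1.645 * 0.1)   # cutoff for alpha = 0.05
rate_01 = type_i_error_rate(cutoff=2.7 + 2.326 * 0.1)   # cutoff for alpha = 0.01
print(rate_05, rate_01)
```

Each simulated rate should land close to the α used to build its cutoff, so a smaller α means fewer false rejections.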
Example: Serious Type I Error
• New Drug A is supposed to reduce diastolic
blood pressure by more than 15 mm Hg.
• H0: μ = 15 versus HA: μ > 15
• Drug A can have serious side effects, so
don’t want patients on it unless μ > 15.
• Implication of Type I error: Expose patients
to serious side effects without other benefit.
• Set  = P(Type I error) to be small  0.01
Type II Error and Power
• Type II Error is made when we fail to reject
the null when the alternative is true.
• Want to minimize P(Type II Error).
• Equivalently, want to maximize the
“power” of the test, i.e. the probability of
rejecting null when alternative is true!
• Power = 1 - P(Type II error)
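As a worked instance of Power = 1 - P(Type II error), take the grade inflation test with rejection region X-bar > 2.865. The true mean of 3.0 used below is a hypothetical value chosen only for illustration, and the normal approximation is assumed.

```python
from math import sqrt
from statistics import NormalDist

se = 0.6 / sqrt(36)   # standard error of X-bar from the example
cutoff = 2.865        # rejection region is X-bar > cutoff
mu_true = 3.0         # hypothetical actual mean (an assumption, not from the slides)

# Type II error: X-bar falls at or below the cutoff even though mu = mu_true.
beta = NormalDist(mu_true, se).cdf(cutoff)
power = 1.0 - beta

print(f"P(Type II error) = {beta:.3f}, power = {power:.3f}")
```

Under these assumptions the power comes out at about 0.91.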
Power
• Power is probability, so number between 0
and 1.
• 0 is bad!
• 1 is good!
• Need to make power as high as possible.
Maximizing power …
• The higher the significance level α, the
higher the P(Type I error), the higher the
power.
• The farther apart the actual mean is from
the mean specified in the null, the higher the
power.
• The larger the sample, the higher the power.
• The smaller the standard deviation, the
higher the power.
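The four statements above can be checked numerically. A minimal sketch for the one-sided test of a mean, assuming the normal approximation with a known standard deviation; the function name power and the baseline difference of 0.2 are illustrative choices, not values from the slides.

```python
from math import sqrt
from statistics import NormalDist

def power(alpha, diff, n, sigma):
    """Power of the one-sided Z test of H0: mu = mu0 versus HA: mu > mu0.

    diff is (actual mean - null mean); sigma is treated as known.
    """
    z_crit = NormalDist().inv_cdf(1.0 - alpha)
    return 1.0 - NormalDist().cdf(z_crit - diff * sqrt(n) / sigma)

base = dict(alpha=0.05, diff=0.2, n=36, sigma=0.6)   # diff = 0.2 is hypothetical
print("baseline          ", round(power(**base), 3))
print("higher alpha      ", round(power(**{**base, "alpha": 0.10}), 3))
print("larger difference ", round(power(**{**base, "diff": 0.3}), 3))
print("larger sample     ", round(power(**{**base, "n": 100}), 3))
print("smaller std. dev. ", round(power(**{**base, "sigma": 0.4}), 3))
```

Each changed setting should print a higher power than the baseline line.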
Strategy for designing a
good hypothesis test
• Use pilot study to estimate std. deviation.
• Specify . Typically 0.01 to 0.10.
• Decide what a meaningful difference would
be between the mean in the null and the
actual mean.
• Decide power. Typically 0.80 to 0.99.
• Use software to determine sample size.
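The steps above can be sketched with the textbook normal-approximation formula n = ((z_α + z_β) · σ / difference)², rounded up. This is a rough stand-in for what statistical software does, not its exact method: software based on the t distribution may report a slightly larger n. The inputs below (σ = 0.6, difference = 0.2, power = 0.90) are hypothetical.

```python
from math import ceil
from statistics import NormalDist

def sample_size(sigma, alpha, diff, power):
    """Smallest n giving at least the requested power for a one-sided Z test."""
    z_alpha = NormalDist().inv_cdf(1.0 - alpha)   # from the significance level
    z_beta = NormalDist().inv_cdf(power)          # from the desired power
    return ceil(((z_alpha + z_beta) * sigma / diff) ** 2)

# Hypothetical design inputs: pilot-study sigma, chosen alpha,
# meaningful difference, and target power.
n = sample_size(sigma=0.6, alpha=0.05, diff=0.2, power=0.90)
print(n)
```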
Using Minitab to determine
sample size
• Select Stat. Select Power and Sample Size.
• Select appropriate test. For single mean,
“1-sample t…”
• Click on radio button to “Calculate sample
size for each power value.” Specify Power
and Meaningful Difference. Specify Sigma.
• Under Options, specify Alternative
hypothesis and significance level.
• Select OK.
If sample is too small ...
• … the power can be too low to identify
even large meaningful differences between
the null and alternative values.
– Determine sample size in advance of
conducting study.
– Don’t believe the “fail-to-reject-results” of a
study based on a small sample.
If sample is really large ...
• … the power can be extremely high for
identifying even meaningless differences
between the null and alternative values.
– In addition to performing hypothesis tests, use a
confidence interval to estimate the actual
population value.
– If a study reports a “reject result,” ask how
large the difference actually is.
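To see why a confidence interval helps here, consider a made-up large study. All numbers below (n = 10,000, X-bar = 2.72) are hypothetical and reuse μ0 = 2.7 and s = 0.6 from the grade inflation example; the normal approximation is assumed.

```python
from math import sqrt
from statistics import NormalDist

mu0 = 2.7
n, x_bar, s = 10_000, 2.72, 0.6   # hypothetical large-sample results

se = s / sqrt(n)
z = (x_bar - mu0) / se
p_value = 1.0 - NormalDist().cdf(z)   # upper-tail test of HA: mu > mu0

z_star = NormalDist().inv_cdf(0.975)  # multiplier for a two-sided 95% interval
ci = (x_bar - z_star * se, x_bar + z_star * se)

print(f"p-value = {p_value:.4f}")
print(f"95% CI for the mean: ({ci[0]:.3f}, {ci[1]:.3f})")
```

The test rejects the null decisively, yet the interval shows the mean sits only a few hundredths of a grade point above 2.7, a difference that may have no practical meaning.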
The morals of the story
• Always determine how many measurements
you need to take in order to have high
enough power to achieve your study goals.
• If you don’t know how to determine sample
size, ask a statistical consultant to help you.
• When interpreting the results of a study,
always take into account the sample size.