Topic (11) – CONCEPTS OF HYPOTHESIS TESTING
Recall the definition of the Scientific Method:
1. Knowledge is obtained in a systematic and
objective manner in order to extend our
understanding.
2. Based on this knowledge we form a
HYPOTHESIS – a tentative or postulated
explanation of the phenomenon. Hence it is a
statement about a population characteristic (eg, a
mean µ or proportion π or the difference between
population means µ1 − µ 2 )
3. To evaluate the hypothesis we DESIGN and
execute an objectively planned experiment.
4. The resulting data are TESTED to determine if
they support or do not support the hypothesis.
A) Construct Hypotheses
Almost all statistical testing procedures are based on
testing two competing claims: the null and alternative
hypotheses.
Defn:
Ho is the NULL HYPOTHESIS. This is the
status quo, i.e. it is assumed to be true unless the
data provide sufficient evidence against it.
HA is the ALTERNATIVE HYPOTHESIS. This
is the competing claim made by the researcher*.
* There are exceptions, usually for testing equivalency
or goodness of fit to a probability distribution.
The alternative hypothesis, HA, lists the outcomes
claimed to be true by the scientist. The null
hypothesis, H0, then lists the remaining or unclaimed
cases. The testing procedure results in either
1) rejecting the null hypothesis in favor of the
alternative hypothesis because the data
support the alternative, or
2) failing to reject the null hypothesis because
there is insufficient evidence to show it is
wrong
Step 1) State Your Claim In Words
EXAMPLES:
A. In a study of the effect of the new drug for
reducing serum cholesterol levels in men at risk of
heart disease, the company’s claim is that the drug
reduces serum cholesterol levels in the target
population. So we might write:
Ho: the drug does not reduce serum cholesterol levels
HA: the drug reduces serum cholesterol levels
B. An entomologist believes that there is sexual
dimorphism in body size of the periodical cicada.
One measure of size is the length of the hind tibia.
So her hypothesis might be stated:
Ho: there is no sexual dimorphism in the periodical
cicada
HA: there is sexual dimorphism in the periodical
cicada
Note that these are informal statements that need to
be clarified and made more rigorous
Step 2) State The Hypotheses In Terms Of The
Relevant Population Characteristics (parameters)
EXAMPLES
A. In the study of the effect of the new drug for
reducing serum cholesterol levels we had
Ho: the drug does not reduce serum cholesterol levels
HA: the drug reduces serum cholesterol levels
What variable(s) are being measured?
X = blood serum cholesterol level
What population characteristic is being modified by
the drug?
If the drug leads to a reduction in blood levels,
the population mean should go down. (We
assume variability of values doesn’t change.)
What is the value of the characteristic without the
drug?
Without the drug, the target population has a
mean blood serum level of 250.
So we can write the hypotheses more specifically as
Ho: µ = 250
HA: µ < 250
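As a rough sketch of how Ho: µ = 250 versus HA: µ < 250 could be checked once data are in hand, the Python below computes a large-sample z statistic and its lower-tail probability. The function name and the sample size, sample mean, and sample standard deviation are hypothetical values invented for illustration; none of them come from the drug study.

```python
from math import sqrt
from statistics import NormalDist

def lower_tail_z_test(sample_mean, sample_sd, n, mu0):
    """Return (z, p_value) for Ho: mu = mu0 versus HA: mu < mu0.

    Relies on the large-sample Normal approximation for the sample mean.
    """
    standard_error = sample_sd / sqrt(n)
    z = (sample_mean - mu0) / standard_error
    p_value = NormalDist().cdf(z)  # area in the lower tail
    return z, p_value

# Hypothetical summary statistics (assumed, not taken from the source).
z, p = lower_tail_z_test(sample_mean=242.0, sample_sd=30.0, n=100, mu0=250.0)
print(f"z = {z:.2f}, lower-tail p-value = {p:.4f}")
```

A sample mean well below 250 gives a large negative z and a small lower-tail probability, which is the kind of evidence that would favor HA.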
B. sexual dimorphism in the morphology of the
periodical cicada
Since they are studying the hind tibia length we
should first clarify the hypotheses in words:
HA: there is a difference in hind tibia lengths
between males and females
Ho: there is no difference in hind tibia lengths
between the sexes
What population characteristic(s) is(are) different
between the 2 sexes?
The means for hind tibia length in the two sexes.
Hence we can write the hypotheses as:
Ho: µfemales = µmales
HA: µfemales ≠ µmales
C. The scientist studying the proportion of fiddler
crabs with dominant left pincers hypothesized that
isolation on the island led to a proportion of
left-pincered crabs larger than the typical 10%.
Her hypotheses in words are:
Ho: 10% of the population of fiddler crabs on the
island are left-pincered
HA: more than 10% of the population of fiddler crabs
on the island are left-pincered
and in symbols are
Ho: π = 0.10
HA: π > 0.10
Important Point: Hypotheses must be carefully
structured since, as was implied earlier, statistical
tests can only provide evidence against the null
hypothesis (leading us to reject it); they
CANNOT prove that Ho is true.
Forms Of Hypotheses For Single Populations
The null hypothesis is always stated as
H0: population parameter = hypothesized value
where the hypothesized value is given by the problem
(usually it’s the value being challenged).
The alternative has one of the following forms:
1) 2-sided test alternative
HA: parameter ≠ hypothesized value
2) 1-sided upper tail alternative
HA: parameter > hypothesized value
3) 1-sided lower tail alternative
HA: parameter < hypothesized value
Note that the alternative never includes the
hypothesized value.
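The three alternative forms differ only in which tail of the sampling distribution counts as evidence against H0. A minimal sketch of that idea, assuming a test statistic z that is approximately standard Normal when H0 is true (the function name and the labels "two-sided", "greater", and "less" are illustrative choices, not notation from these notes):

```python
from statistics import NormalDist

def p_value(z, alternative):
    """Tail probability for a z statistic under one of the three alternatives."""
    std_normal = NormalDist()
    if alternative == "two-sided":   # HA: parameter != hypothesized value
        return 2 * (1 - std_normal.cdf(abs(z)))
    if alternative == "greater":     # HA: parameter > hypothesized value
        return 1 - std_normal.cdf(z)
    if alternative == "less":        # HA: parameter < hypothesized value
        return std_normal.cdf(z)
    raise ValueError(f"unknown alternative: {alternative!r}")

# The same statistic gives different tail areas under each alternative.
for alt in ("two-sided", "greater", "less"):
    print(alt, round(p_value(1.5, alt), 4))
```

The upper-tail and lower-tail areas for the same z always sum to 1, and the two-sided area is twice the smaller of the two.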
Having defined your hypotheses, the next step is:
B) Design The Experiment:
a) identify the appropriate statistical test to be used
b) identify type of data to be collected (variables)
c) construct the sampling or experimental design so
that it actually provides the data needed for the
test
Comment: These three cannot be separated as
distinct activities
Point: most of the tests we’ll learn require random
sampling, independent sampling among different
populations, and sample sizes sufficiently large so we
can argue that our statistics are approximately
normally distributed.
Point: if the data are categorical then the test is
different from that for continuous data
Example of Experimental Design: in a study of the
effect of temperature on seedling growth, the scientist
put 100 plants at one temperature in a greenhouse
with its windows painted over and another 100 plants
at a different temperature in a greenhouse with no
shading. He saw a statistical difference in growth and
thus claimed it was due to temperature. Is this good
experimental design?
C) Run The Experiment And Collect The
Data
D) Perform The Statistical Test And Draw
Your Conclusions
Important Fact: When you take a sample from the
population of interest (or run an experiment on a
subset of the population) you are basing your
conclusions about that population on incomplete
information.
As a result your conclusion could be WRONG (just
as a confidence interval can fail to contain the true
parameter value)!
Population of N = 20 elements:
A A A A C C A A C A A A C C A C A A A A
The population has N = 20 elements with a
proportion of successes (success = C) of 30%.
H0: π = 0.25
HA: π < 0.25
Take a sample of 8 elements (with replacement) and
get {A,A,A,A,A,A,A,C}.
Would your sample lead you to reject H0?
If so, you made the wrong conclusion: the true
proportion, 0.30, is not less than 0.25.
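Because the 8 elements are drawn with replacement, the number of C's in the sample follows a Binomial distribution with 8 trials. The sketch below (the helper name `binom_cdf` is an illustrative choice) computes how likely it is to see one or fewer C's, both under the hypothesized proportion 0.25 and under the true proportion 0.30.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

n, observed_successes = 8, 1  # the sample {A,A,A,A,A,A,A,C} contains one C
print("P(X <= 1 | pi = 0.25) =", round(binom_cdf(observed_successes, n, 0.25), 4))
print("P(X <= 1 | pi = 0.30) =", round(binom_cdf(observed_successes, n, 0.30), 4))
```

Even with the true proportion at 0.30, roughly a quarter of samples of size 8 contain one or fewer C's, so a sample like this one is not unusual and can easily mislead.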
There are two possible types of errors that can occur
whenever you use a sample to test a hypothesis about
a population:
• TYPE I error – reject the null hypothesis when it
is, in fact, true for the population
• TYPE II error – fail to reject the null hypothesis
when, in fact, the alternative hypothesis is true
for the population
EXAMPLE A jury trial is very similar to a
hypothesis test. A person is assumed to be not guilty
until it is proven otherwise. So,
                        True Situation
Jury Decision           Not Guilty       Guilty
                        (H0 is true)     (HA is true)
Not Guilty
(do not reject H0)      correct          Type II error
Guilty
(reject H0)             Type I error     correct
True Situation ≡ True Value of Population Parameter
Jury Decision ≡ The Outcome of a Hypothesis Test
In a statistical test, one of the types of error (but
never both) can be controlled in a somewhat limited
fashion.
E.g. Marg claims she has ESP. To test the claim we
get a set of 3 cards with symbols on them (circle,
square and a triangle). Pick a card at random and look
at it (don’t show Marg!). Ask Marg to say which card
you are looking at. Repeat 25 times. Record the
number of times she was correct.
This is a Binomial experiment with 25 trials. Now, if
Marg does not have ESP then the probability of a
success on any one trial is 1/3 but if she does have
ESP, her success rate would be higher.
Ho: Marg does not have ESP (π = 0.33)
HA: Marg does have ESP (π > 0.33)
If the null hypothesis is true, then the distribution of
p, the sample proportion, is approximately Normal
with mean 0.33 and s.d.
√[π(1 − π)/n] = √[(0.33)(0.67)/25] ≈ 0.094.
We could decide that we will reject the null
hypothesis only if the experimental data
overwhelmingly support the alternative. For example,
we may state that our rejection rule is that Marg must
demonstrate a success proportion greater than
0.33 + 2(0.094) ≈ 0.52.
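The chance that this rule rejects Ho when Marg is only guessing can be worked out directly. The sketch below compares the Normal approximation (a cutoff two standard deviations above the mean) with the exact Binomial probability; the helper name `binom_upper_tail` and the variable names are illustrative, and π is taken as exactly 1/3.

```python
from math import comb
from statistics import NormalDist

def binom_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

n, pi0 = 25, 1 / 3
# Smallest number correct whose proportion strictly exceeds 0.52
# (integer arithmetic avoids floating-point ties at 13/25).
min_correct = next(k for k in range(n + 1) if 100 * k > 52 * n)

normal_approx = 1 - NormalDist().cdf(2)           # P(Z > 2)
exact = binom_upper_tail(min_correct, n, pi0)     # P(X >= min_correct | pi = 1/3)

print("minimum number correct to reject:", min_correct)
print("Normal approximation:", round(normal_approx, 4))
print("exact Binomial      :", round(exact, 4))
```

Since 13/25 equals 0.52 exactly, a proportion greater than 0.52 means at least 14 correct answers out of 25. The two probabilities differ somewhat because the number of correct answers is discrete.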
Statistical testing procedures have been designed
mostly to control this type of risk, the risk of a type I
error. That is, the scientist specifies how much of a
risk of a type I error he/she is willing to take.
Defn: The PROBABILITY OF COMMITTING A
TYPE I ERROR is denoted with the Greek letter α
(alpha) and is called the SIGNIFICANCE LEVEL
of the test.
The PROBABILITY OF COMMITTING A TYPE
II ERROR is denoted with the Greek letter β (beta).
The value (1-β ) is called the POWER OF THE
TEST.
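To make β and power concrete, the sketch below returns to the ESP example and assumes the rule "reject Ho if Marg gets at least 14 of 25 correct" (a sample proportion above 0.52). The true success probabilities tried here are hypothetical alternatives chosen only for illustration, as are the function names.

```python
from math import comb

def binom_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def power_of_rule(true_pi, n=25, min_correct=14):
    """Probability of rejecting Ho when the true success probability is true_pi."""
    return binom_upper_tail(min_correct, n, true_pi)

for true_pi in (0.4, 0.5, 0.6, 0.7):  # hypothetical true values under HA
    power = power_of_rule(true_pi)
    beta = 1 - power                  # probability of a Type II error
    print(f"true pi = {true_pi:.1f}: power = {power:.3f}, beta = {beta:.3f}")
```

Power rises, and β falls, as the true success probability moves further above the null value of 1/3.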
INTERPRETATION: The significance level can be
thought of as follows: if the null hypothesis were true
and we could repeat the experiment ad nauseam,
performing the test each time, then 100α% of the tests
would lead us to falsely reject the null hypothesis.
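This repeated-experiment interpretation can be imitated with a small simulation. The sketch below assumes the ESP setup with Marg guessing at random (π = 1/3) and the rule "reject Ho if she gets at least 14 of 25 correct"; the function name, the number of repetitions, and the seed are arbitrary illustrative choices.

```python
import random

def simulate_type1_rate(n_experiments, n_trials=25, pi0=1 / 3,
                        min_correct=14, seed=0):
    """Fraction of simulated experiments that reject Ho even though Ho is true."""
    rng = random.Random(seed)
    false_rejections = 0
    for _ in range(n_experiments):
        # Each trial is a success with probability pi0 (pure guessing).
        correct = sum(rng.random() < pi0 for _ in range(n_trials))
        if correct >= min_correct:
            false_rejections += 1
    return false_rejections / n_experiments

rate = simulate_type1_rate(20_000)
print(f"simulated Type I error rate: {rate:.4f}")
```

As the number of repetitions grows, the simulated fraction should settle near the Type I error probability of the rule.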
The only way to make sure it is virtually impossible
to commit a type I error is to either
1) census the entire population. Then you’ll know for
sure what the true value of the parameter is.
or
2) do your test controlling for as small a Type I error
as you can (e.g. use α =0.01 or even α =0.001).
By doing (2), you will only reject the null hypothesis
when the sample data overwhelmingly support the
alternative hypothesis.
Does this increase your chances of a Type II error
though?
Yes, since now you might miss data that are
supportive of HA. Your choice of a really small
significance level states that you will only accept
overwhelming evidence (not reasonable evidence) to
reject the null hypothesis.
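The trade-off can be seen numerically. The sketch below reuses the ESP setup (25 trials, π = 1/3 under Ho) and, for each α, finds the smallest number of correct answers whose upper-tail probability under Ho is at most α, then computes β for a hypothetical true success probability of 0.5. The function names and the choice of 0.5 are assumptions made only for illustration.

```python
from math import comb

def binom_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def critical_count(alpha, n, pi0):
    """Smallest c such that P(X >= c | pi0) <= alpha."""
    return next(c for c in range(n + 2) if binom_upper_tail(c, n, pi0) <= alpha)

def type2_error(alpha, n=25, pi0=1 / 3, true_pi=0.5):
    """beta = P(fail to reject Ho) when the true success probability is true_pi."""
    c = critical_count(alpha, n, pi0)
    return 1 - binom_upper_tail(c, n, true_pi)  # P(X < c | true_pi)

for alpha in (0.05, 0.01, 0.001):
    c = critical_count(alpha, 25, 1 / 3)
    print(f"alpha = {alpha}: reject if correct >= {c}, beta = {type2_error(alpha):.3f}")
```

Shrinking α pushes the cutoff higher, and a higher cutoff makes it more likely that a real effect of this size goes undetected.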
Important Point: Choose the largest α tolerable for
the problem and use that in testing.