Lecture 9. Hypothesis testing
Mathematical Statistics and Discrete Mathematics
November 30th, 2015
Motivating example
• A politician running for president in the USA made a controversial statement
on national TV. He is rightly worried that the statement had a negative
influence on the number of people who will vote for him. He wants us to conduct a
poll to check whether fewer people support him now than in the previous study. The
previous study showed that 20% would vote for the politician.
• Let p be the true proportion of supporters of the politician in the population. We
define two mutually exclusive hypotheses:
H0 : p ≥ 0.2,
and
H1 : p < 0.2.
We assume that each person either supports the candidate or not. We plan to ask
225 randomly chosen people for their opinion. If H0 is true, then on average we
should get around 45 or more people who support the candidate. We want to
reject H0 if the number of people supporting the politician is too small compared
to this prediction.
• The question is: what constitutes “too small”? The answer is: this is up to us. We
decide to reject H0 if the number of supporters is smaller than 35.
• We make the poll and observe 37 supporters. The result is hence not convincing
enough (for us), and we (sadly) do not reject H0 .
Hypothesis testing (1)
(1) In a hypothesis testing situation, we are interested in a population parameter θ,
and we have a preconceived notion concerning its value. Based on this, we
define two mutually exclusive hypotheses. The one that we hope the evidence
will support is called the research hypothesis and is denoted by H1 . Its negation
is called the null hypothesis and is denoted by H0.
The hypotheses are usually of one of the following forms:
H0 : θ ≤ θ0,   H1 : θ > θ0,
H0 : θ ≥ θ0,   H1 : θ < θ0,
H0 : θ = θ0,   H1 : θ ≠ θ0.
The value θ0 is called the null value and is included in H0.
Recall that in contrast to hypothesis testing, in a typical estimation problem, there is
no preconceived notion concerning the value of the parameter θ.
(1) A producer of cookies claims that at most 10% of packs contain broken
cookies. We question the claim and want to perform hypothesis testing. Our null
value is p0 = 0.1, and we formulate the relevant hypotheses:
H0 : p ≤ 0.1,
H1 : p > 0.1.
Hypothesis testing (2)
(2) We have to consider the statistical assumptions concerning the distribution of the
data. We need to decide on the test statistic T whose distribution we can
establish under the assumption that θ = θ0 .
(2) To a pack of cookies we assign a random variable X which follows a Bernoulli
distribution. The event {X = 1} means that the pack contains broken cookies,
and the event {X = 0} means that there are no broken cookies. We assume that
p = p0, the boundary case of H0, which means that X ∼ Bernoulli(p0) = Bernoulli(0.1).
We are planning to take a random sample of 100 packs of cookies. Our test statistic is
T = X1 + X2 + ... + X100 ∼ Binom(100, 0.1).
Hypothesis testing (3)
(3) Before collecting the data we select the significance level α, a probability
threshold below which the null hypothesis will be rejected. Common values are
5% and 1%. We then compute a critical region of values of T such that the
probability of observing a value t of T in this region under the assumption that
θ = θ0 is α.
(3) We set α = 0.05. We use the central limit theorem to approximate
T ∼ Binom(100, 0.1) by a normal variable with mean np0 = 10 and variance
np0(1 − p0) = 9. We want to find a number a0.05 such that P(T > a0.05) = 0.05.
Using that P(Z > 1.64) ≈ 0.05, we obtain a0.05 = 10 + 1.64 · 3 = 14.92, that is,
P(T > 14.92) ≈ 0.05. Since T takes integer values, our critical region is
hence the set
{t : t ≥ 15}.
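The computation of the critical region in step (3) can be sketched in Python as follows. This is only an illustration using the standard library; the variable names are our own, and the code uses the exact normal quantile (about 1.645) rather than the rounded table value 1.64, so the cutoff comes out at roughly 14.93 instead of 14.92. The resulting critical region is the same.

```python
from statistics import NormalDist
import math

n, p0, alpha = 100, 0.1, 0.05

# Mean and standard deviation of Binom(n, p0), used for the normal approximation.
mean = n * p0
sd = math.sqrt(n * p0 * (1 - p0))

# Upper-tail cutoff a with P(N(mean, sd^2) > a) = alpha.
cutoff = NormalDist(mean, sd).inv_cdf(1 - alpha)

# T is integer-valued, so the critical region starts at the next integer.
critical_value = math.ceil(cutoff)

print(f"cutoff = {cutoff:.2f}, critical region = {{t : t >= {critical_value}}}")
```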
Hypothesis testing (4)
(4) We compute the observed value t of the statistic T. If it is in the critical region,
then we reject H0 . Otherwise, we do not reject it.
(4) We buy 100 packages of cookies and 17 of them contain broken cookies. Since 17
belongs to the critical region, the result is statistically significant at
significance level 5%, and we reject the null hypothesis.
Possible outcomes
The possible outcomes of hypothesis testing are:
• We correctly fail to reject H0 when H0 is true.
• We reject H0 when H0 is true, and we make a type I error.
• We correctly reject H0 when H1 is true.
• We fail to reject H0 when H1 is true, and we make a type II error.
The probability of making a type I error is called the significance level of the test and
is denoted by α. The probability of making a type II error is denoted by β. The
number 1 − β is called the power of the test and is the probability of rejecting H0 when
H1 is true.
• Compute the significance level of our test for the American politician. Use the
normal approximation of the binomial distribution Binom(225, 0.2). Our critical
region was set to be {1, 2, ..., 35}.
• Compute the power of our test for the American politician if the true proportion
of supporters is p = 0.1.
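One possible way to carry out these two computations is sketched below. The helper reject_probability is a name of our own, no continuity correction is applied, and we assume that H0 is rejected when T ≤ 35, as in the critical region stated above. If "smaller than 35" from the motivating example is taken literally, the cutoff would be 34 instead.

```python
from statistics import NormalDist
import math

def reject_probability(n, p, largest_rejecting_count):
    """Normal approximation to P(T <= largest_rejecting_count) for T ~ Binom(n, p)."""
    mean = n * p
    sd = math.sqrt(n * p * (1 - p))
    return NormalDist(mean, sd).cdf(largest_rejecting_count)

n = 225
cutoff = 35  # assumption: reject H0 when T <= 35

# Significance level: probability of rejecting H0 at the null value p0 = 0.2.
alpha = reject_probability(n, 0.2, cutoff)

# Power: probability of rejecting H0 when the true proportion is p = 0.1.
power = reject_probability(n, 0.1, cutoff)
beta = 1 - power  # probability of a type II error at p = 0.1

print(f"alpha = {alpha:.4f}, power = {power:.4f}, beta = {beta:.4f}")
```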
Significance testing
Significance testing is an alternative (often more convenient) strategy to hypothesis
testing:
• It follows the same first two steps (1) and (2) as hypothesis testing.
• The difference is that step (3), which involves choosing α, is skipped.
• After step (2), we evaluate the observed value t of the test statistic T.
• We then evaluate the probability of observing values at least as extreme as t under the
assumption that θ = θ0. This probability is referred to as the p-value of the test.
Note that the interpretation of the phrase “as extreme” depends on the form of H0.
The p-value of the test is the smallest level at which we could have set α and still
have rejected H0.
• Compute the p-value for the American politician example Binom(225, 0.2),
t = 37.
• Compute the p-value for the cookie example Binom(100, 0.1), t = 17.
• A street crook is accused of using a biased coin while claiming that it is a fair
coin. Since we do not know which side is favored, we set H0 to be the hypothesis
that the coin is fair, that is p = 1/2. We toss the coin 40 times and observe 28
heads. Compute the p-value of the two-sided test.
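The three p-values can be approximated along the following lines. This is a sketch under our own assumptions: it uses the normal approximation without continuity correction, and the helper normal_approx is a hypothetical name. An exact binomial computation would give slightly different numbers.

```python
from statistics import NormalDist
import math

def normal_approx(n, p):
    """Normal distribution with the same mean and variance as Binom(n, p)."""
    return NormalDist(n * p, math.sqrt(n * p * (1 - p)))

# Politician: left-tailed test, so "as extreme" means T <= 37.
p_politician = normal_approx(225, 0.2).cdf(37)

# Cookies: right-tailed test, so "as extreme" means T >= 17.
p_cookies = 1 - normal_approx(100, 0.1).cdf(17)

# Coin: two-sided test, so we count both tails at distance |28 - 20| from the mean.
coin = normal_approx(40, 0.5)
p_coin = 2 * (1 - coin.cdf(28))

print(f"politician: {p_politician:.4f}")
print(f"cookies:    {p_cookies:.4f}")
print(f"coin:       {p_coin:.4f}")
```

Under these approximations, the politician's result would not be significant at the 5% level, while the cookie and coin results would be.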
Hypothesis and significance tests on the mean
The form of the critical region is aligned with the form of H1 :
H0 : µ ≤ µ0,   H1 : µ > µ0,   CR = {t > tc}              (right-tailed test)
H0 : µ ≥ µ0,   H1 : µ < µ0,   CR = {t < tc}              (left-tailed test)
H0 : µ = µ0,   H1 : µ ≠ µ0,   CR = {t > tc} ∪ {t < tc′}  (two-tailed test)
As with confidence intervals, we will consider the following cases:
• normal data and variance known,
• arbitrary data, large sample size and variance known (central limit theorem),
• normal data and variance unknown (small samples “≤ 30” - t-distribution, large
samples “> 30” - normal distribution).
Note that in previous examples, we always considered binomial distributions, where
variance is always determined by the mean value, and hence we did not have to
consider the scenarios above.
Doctors think that the average weight µ of a newborn baby in Sweden has increased
compared to the previous estimate of 3.5 kg. The standard deviation is thought not
to have changed and is assumed to be 0.8 kg. We set the hypotheses:
H0 : µ ≤ 3.5,
H1 : µ > 3.5.
We assume that the data is normally distributed and we consider the test statistic
T = (X̄ − µ0) / (σ/√n) ∼ N(0, 1).
We weigh a random sample of 10 babies and obtain results:
2.43, 3.84, 3.92, 3.88, 4.78, 1.95, 4.65, 4.20, 3.35, 4.40.
We have x̄ = 3.74, and the value of the test statistic is
t = (x̄ − µ0) / (σ/√n) = (3.74 − 3.5) / (0.8/√10) = 0.95.
The p-value is
P(T > t) = P(Z > 0.95) = 1 − 0.83 = 0.17.
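The calculation above can be reproduced with a few lines of Python. This is a minimal sketch using only the standard library; the variable names are our own.

```python
from statistics import NormalDist, mean
import math

weights = [2.43, 3.84, 3.92, 3.88, 4.78, 1.95, 4.65, 4.20, 3.35, 4.40]
mu0 = 3.5    # null value
sigma = 0.8  # standard deviation, assumed known

# Sample mean and observed value of the test statistic.
x_bar = mean(weights)
z = (x_bar - mu0) / (sigma / math.sqrt(len(weights)))

# Right-tailed test: the p-value is the upper tail of the standard normal.
p_value = 1 - NormalDist().cdf(z)

print(f"x_bar = {x_bar:.2f}, z = {z:.2f}, p-value = {p_value:.2f}")
```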
Consider the same problem but with the assumption that the variance is unknown. We
consider the statistic
T = (X̄ − µ0) / (S/√n) ∼ T9 (the t-distribution with 10 − 1 = 9 degrees of freedom).
Using that s = 0.92, we compute the value of this statistic for the same data:
t = (x̄ − µ0) / (s/√n) = (3.74 − 3.5) / (0.92/√10) = 0.824.
The p-value is
P(T > t) = P(T9 > 0.824).
0.824 is between t0.25 = 0.703 and t0.1 = 1.383 which means that the p-value is
between 0.1 and 0.25.
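Instead of bracketing the p-value with table values, one can approximate it numerically. The sketch below is our own illustration, not part of the lecture: since the Python standard library has no t-distribution, it integrates the t density with Simpson's rule. The functions t_pdf and t_upper_tail are hypothetical helper names. The code uses the unrounded sample standard deviation, so the statistic comes out slightly below the value obtained with s = 0.92.

```python
from statistics import mean, stdev
import math

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=2000):
    """P(T_df > t) for t >= 0: 0.5 minus the Simpson's-rule integral over [0, t]."""
    h = t / steps  # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

weights = [2.43, 3.84, 3.92, 3.88, 4.78, 1.95, 4.65, 4.20, 3.35, 4.40]
mu0 = 3.5
n = len(weights)

s = stdev(weights)  # sample standard deviation (divides by n - 1)
t_obs = (mean(weights) - mu0) / (s / math.sqrt(n))
p_value = t_upper_tail(t_obs, n - 1)

print(f"s = {s:.2f}, t = {t_obs:.3f}, p-value = {p_value:.3f}")
```

The result should fall inside the interval (0.1, 0.25) obtained from the table.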