• The concept, illustrated by an example testing the difference
of means (an introduction to the t-test)
• The structure and terminology of a test
• Sensible issues
- meaning of statistical significance
- establishing statistical significance
- assumptions (modified t-tests)
Hypothesis Testing
The Concept:
Statistical hypothesis testing is a process that uses the information in a
sample to decide whether or not to reject H0, the null hypothesis. To
execute a test, one needs:
A. To formulate the null hypothesis H0 and the alternative hypothesis Ha, which
describes the range of possibilities that may be true when H0 is false
B. To formulate a statistical model describing the probability of observing a particular
realization of a random variable S (test statistic) when H0 is true
C. To make the decision (= to formulate a rule) that determines whether to reject the
null hypothesis or not
Hypothesis Testing
The Ithaca-Canandaigua-temperature example
A. To formulate the null hypothesis H0 and the alternative hypothesis Ha
H0: E(X1) − E(X2) = 0, Ha: E(X1) − E(X2) ≠ 0
B. To formulate a statistical model describing the probability of observing a particular
realization of a random variable S (test statistic) when H0 is true
$$S = \frac{\hat\mu_1 - \hat\mu_2}{S_p \sqrt{1/n_1 + 1/n_2}} \sim t(n_1 + n_2 - 2)$$
(remember the t distribution?)
Hypothesis Testing
A random variable T has a t distribution with k degrees of freedom (df), T ~ t(k), when
$$T = \frac{A}{\sqrt{B/k}} \sim t(k), \quad \text{where } A \sim N(0,1), \; B \sim \chi^2(k)$$
and A and B are independent.
Under the null hypothesis and assumptions a)-c)
$$A = \frac{\hat\mu_1 - \hat\mu_2}{\left[\mathrm{Var}(\hat\mu_1 - \hat\mu_2)\right]^{1/2}} \sim N(0,1)$$
where $\hat\mu_1$ and $\hat\mu_2$ are the sample means, $n_1$ and $n_2$ the sample sizes, and
$$\left[\mathrm{Var}(\hat\mu_1 - \hat\mu_2)\right]^{1/2} = \sigma \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$
a) X1 and X2 are independent random variables
b) X1 and X2 are identically distributed
c) The distributions are normal and have variance σ²
Since σ is unknown, one considers A divided by $(B/(n_1 + n_2 - 2))^{1/2}$ with
$$B = \frac{n_1 + n_2 - 2}{\sigma^2}\, S_p^2 \sim \chi^2(n_1 + n_2 - 2), \qquad S_p^2 = \frac{\sum_{i=1}^{n_1} (x_{1,i} - \hat\mu_1)^2 + \sum_{i=1}^{n_2} (x_{2,i} - \hat\mu_2)^2}{n_1 + n_2 - 2}$$
The result is
$$\frac{A}{\sqrt{B/(n_1 + n_2 - 2)}} = \frac{\hat\mu_1 - \hat\mu_2}{S_p \sqrt{1/n_1 + 1/n_2}} \sim t(n_1 + n_2 - 2)$$
Sp: pooled estimate of the common standard deviation
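
As an illustration of the pooled statistic, the following is a minimal Python sketch that computes S, Sp and the degrees of freedom from two samples. The function name `pooled_t_statistic` and the two small data lists are hypothetical and are not the Ithaca-Canandaigua data.

```python
import math
from statistics import mean

def pooled_t_statistic(x1, x2):
    """Return (S, S_p, df) for the equal-variance two-sample t-test."""
    n1, n2 = len(x1), len(x2)
    m1, m2 = mean(x1), mean(x2)
    # Sums of squared deviations of each sample about its own mean
    ss1 = sum((x - m1) ** 2 for x in x1)
    ss2 = sum((x - m2) ** 2 for x in x2)
    dof = n1 + n2 - 2
    # Pooled estimate of the common standard deviation
    s_p = math.sqrt((ss1 + ss2) / dof)
    # Test statistic, distributed as t(n1 + n2 - 2) under H0 and a)-c)
    s = (m1 - m2) / (s_p * math.sqrt(1.0 / n1 + 1.0 / n2))
    return s, s_p, dof

# Hypothetical samples, chosen only to keep the arithmetic easy to follow
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 3.0, 4.0, 5.0, 6.0]
s, s_p, dof = pooled_t_statistic(x1, x2)
print(f"S = {s:.3f}, S_p = {s_p:.3f}, df = {dof}")
```

For these toy samples the mean difference is −1 and S_p·sqrt(1/n1 + 1/n2) equals 1, so S = −1 with 8 degrees of freedom.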
Hypothesis Testing
The Ithaca-Canandaigua-temperature example
A. To formulate the null hypothesis H0 and the alternative hypothesis Ha
H0: E(X1) − E(X2) = 0, Ha: E(X1) − E(X2) ≠ 0
B. To formulate a statistical model describing the probability of observing a particular
realization of a random variable S (test statistic) when H0 is true
$$S = \frac{\hat\mu_1 - \hat\mu_2}{S_p \sqrt{1/n_1 + 1/n_2}} \sim t(n_1 + n_2 - 2)$$
(remember the t distribution?)
C. To make the decision (= to formulate a rule) that determines whether to reject the
null hypothesis or not
• The likelihood of observing a certain value of S is given by f_S(s), the density of t(n1 + n2 − 2)
• significance level 1 − p̃ = 5%, i.e. p̃ = 95%
• non-rejection region Θ(p̃) = [−1.96, 1.96]
How to get this interval?
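
One way to answer this question numerically: the bounds of a two-sided non-rejection region are the (1 − p̃)/2 and (1 + p̃)/2 quantiles of the distribution of S under H0. The values ±1.96 are the corresponding quantiles of the standard normal distribution, which the t quantiles approach as the degrees of freedom grow. The sketch below uses only the Python standard library. The helper names and the value `df = 60` are assumptions made for illustration and are not taken from the slides.

```python
import math
from statistics import NormalDist

def t_pdf(x, k):
    """Density of the t distribution with k degrees of freedom."""
    log_c = (math.lgamma((k + 1) / 2) - math.lgamma(k / 2)
             - 0.5 * math.log(k * math.pi))
    return math.exp(log_c - (k + 1) / 2 * math.log1p(x * x / k))

def t_cdf(x, k, steps=2000):
    """CDF by Simpson's rule on [0, |x|], using the symmetry of the density."""
    a = abs(x)
    if a == 0.0:
        return 0.5
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, k) + t_pdf(a, k)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, k)
    area = total * h / 3
    return 0.5 + area if x > 0 else 0.5 - area

def t_quantile(p, k, lo=-50.0, hi=50.0):
    """Quantile by bisection on the CDF."""
    for _ in range(50):
        mid = (lo + hi) / 2
        if t_cdf(mid, k) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

p_tilde = 0.95
upper_tail = 0.5 + p_tilde / 2   # 0.975 for a two-sided 5% test
df = 60                          # hypothetical degrees of freedom
t_crit = t_quantile(upper_tail, df)
z_crit = NormalDist().inv_cdf(upper_tail)
print(f"t({df}) region:  [-{t_crit:.3f}, {t_crit:.3f}]")
print(f"normal region: [-{z_crit:.3f}, {z_crit:.3f}]")
```

With few degrees of freedom the t region is noticeably wider than [−1.96, 1.96]; the two agree closely once the samples are large.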
Hypothesis Testing
The outcome of the test constructed to assess the difference between the mean
temperatures in Ithaca and Canandaigua
• From the statistics given below, one finds that the observed value of the test statistic
is s = 1.9/2.01 = 0.95
• s = 0.95 is not an unlikely value to observe, since s = 0.95 ∈ Θ(95%) = [−1.96, 1.96]
- the observed value is consistent with the statistical model under H0
- H0 is not rejected
Statistics at the two stations
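
The decision for this example can be retraced in a few lines of Python. The sketch reads 1.9 as the observed difference of the sample means and 2.01 as the denominator Sp·sqrt(1/n1 + 1/n2), which is what the ratio s = 1.9/2.01 suggests. The two-sided p-value uses a normal approximation to the t distribution, an assumption made here only to stay within the standard library.

```python
from statistics import NormalDist

mean_difference = 1.9     # numerator of s, as quoted in the outcome
standard_error = 2.01     # denominator of s, as quoted in the outcome
s = mean_difference / standard_error

# Two-sided non-rejection region at the 5% significance level
lower, upper = -1.96, 1.96
reject_h0 = not (lower <= s <= upper)

# Approximate two-sided p-value (normal approximation, see lead-in)
p_two_sided = 2.0 * (1.0 - NormalDist().cdf(abs(s)))

print(f"s = {s:.2f}, reject H0: {reject_h0}, approx. p-value = {p_two_sided:.2f}")
```

The p-value is well above 5%, which matches the conclusion that H0 is not rejected.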
Hypothesis Testing
Change in the test assessing the difference between the mean
temperatures in Ithaca and Canandaigua using
Ha: Canandaigua is warmer than Ithaca
Change in the model: none
$$S = \frac{\hat\mu_1 - \hat\mu_2}{S_p \sqrt{1/n_1 + 1/n_2}} \sim t(n_1 + n_2 - 2)$$
Change in the rule:
• The likelihood of observing a certain value of S is given by f_S(s), the density of t(n1 + n2 − 2)
• significance level 1 − p̃ = 5%, i.e. p̃ = 95%
• non-rejection region Θ(p̃) = (−∞, 1.645]
The outcome:
• s = 0.95 is not an unlikely value to observe, since 0.95 ∈ Θ(p̃) = (−∞, 1.645]
• H0 is not rejected
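
The change in the rule can be sketched as a small Python function that returns the non-rejection region for either alternative. The function name `non_rejection_region` and its `alternative` labels are hypothetical. Standard normal quantiles are used, which is consistent with the values 1.96 and 1.645 quoted on the slides. The sign convention assumes that positive values of S favour Ha.

```python
import math
from statistics import NormalDist

def non_rejection_region(p_tilde, alternative):
    """Return (lower, upper) bounds of the non-rejection region."""
    z = NormalDist()
    if alternative == "two-sided":
        # Split the 1 - p_tilde rejection probability over both tails
        c = z.inv_cdf(0.5 + p_tilde / 2)
        return (-c, c)
    if alternative == "greater":
        # Put the whole rejection probability in the upper tail
        return (-math.inf, z.inv_cdf(p_tilde))
    raise ValueError("alternative must be 'two-sided' or 'greater'")

s = 0.95  # observed value of the test statistic from the example
for alternative in ("two-sided", "greater"):
    lower, upper = non_rejection_region(0.95, alternative)
    reject = not (lower <= s <= upper)
    print(f"{alternative}: [{lower:.3f}, {upper:.3f}], reject H0: {reject}")
```

The one-sided bound is smaller than the two-sided one because all 5% of the rejection probability sits in a single tail.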
Hypothesis Testing
Sensible Issues: Meaning of significance
Suppose one wishes to test H0 that the means of two random variables are equal. H0
is rejected at the 5% significance level when the hypothesized value of µ1 − µ2, namely 0, is not
covered by the 95% confidence interval.
Physical significance implies that the two populations are well separated, i.e. the
difference in the means is large:
0 lies outside almost every realization of the confidence interval.
Statistical significance depends on the size of the confidence interval, which is a
function of the sample size:
the larger the sample size n, the smaller the confidence interval and the
easier it is to detect a statistically significant difference, even when the difference is
physically insignificant.
Statistical significance ≠ physical significance
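
The dependence on sample size can be made concrete with a short Python sketch. It assumes two samples of equal size n, a known common standard deviation σ (so that normal quantiles apply), and an observed difference equal to a small fixed value. The numbers `delta = 0.1` and `sigma = 1.0` are hypothetical.

```python
import math
from statistics import NormalDist

Z_975 = NormalDist().inv_cdf(0.975)  # about 1.96

def ci_half_width(sigma, n):
    """Half-width of the 95% confidence interval for mu1 - mu2 (equal n)."""
    return Z_975 * sigma * math.sqrt(2.0 / n)

delta = 0.1   # hypothetical, physically small difference of the means
sigma = 1.0   # hypothetical common standard deviation

for n in (10, 100, 1000, 10000):
    hw = ci_half_width(sigma, n)
    # 0 is outside the interval [delta - hw, delta + hw] once hw < delta
    excludes_zero = delta > hw
    print(f"n = {n:5d}: half-width = {hw:.3f}, 0 excluded: {excludes_zero}")
```

The half-width shrinks like 1/sqrt(n), so a difference of one tenth of a standard deviation becomes statistically significant once n is large enough, although its physical relevance has not changed.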
Hypothesis Testing
Another way to illustrate the emergence of statistical significance
• The density functions of a control
and an experimental random variable
overlap considerably
• As the sample size increases, the
spread of the density functions of
the sample means decreases and
eventually there is virtually no
overlap
• One can then distinguish the control
and experimental random variables
with almost perfect reliability
The likelihood of rejecting the null hypothesis depends
not only on the strength of the signal but also on the
amount of available data!
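
This last statement can be quantified as the probability of rejecting H0 for a given true difference. The following is a minimal Python sketch under simplifying assumptions: equal sample sizes n, known common standard deviation σ, and a two-sided test at the 5% level based on the normal distribution. The function name `rejection_probability` and the numerical values are hypothetical.

```python
import math
from statistics import NormalDist

def rejection_probability(delta, sigma, n, alpha=0.05):
    """P(reject H0) when the true difference of the means is delta."""
    z = NormalDist()
    crit = z.inv_cdf(1.0 - alpha / 2)
    # Mean of the standardized test statistic when the difference is delta
    shift = delta / (sigma * math.sqrt(2.0 / n))
    # Probability that the statistic falls above +crit or below -crit
    return (1.0 - z.cdf(crit - shift)) + z.cdf(-crit - shift)

delta, sigma = 0.5, 1.0  # hypothetical signal strength and spread
for n in (10, 100, 1000):
    print(f"n = {n:4d}: P(reject H0) = {rejection_probability(delta, sigma, n):.3f}")
```

With no signal (delta = 0) the rejection probability equals the significance level. For a fixed nonzero signal it rises towards 1 as n grows.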