PAS08 – Statistical hypothesis testing
Jan Březina
Technical University of Liberec
27 November 2014
Motivation - problem with interval estimate for relative frequency

Having data {Xi} from Alt(π) with sample relative frequency p = X̄.
According to the Moivre-Laplace theorem:

    U = (p − π) / √(π(1 − π)) · √n ≈ N(0, 1)

Thus

    P( u(α/2) ≤ U ≤ u(1 − α/2) ) = 1 − α

and after straightforward calculations:

    P( p − ∆ < π < p + ∆ ) = 1 − α

but we have to use an approximation:

    ∆ = √(π(1 − π)/n) · u(1 − α/2) ≈ √(p(1 − p)/n) · u(1 − α/2)
Can we avoid this somehow?
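A small R sketch (ours, not from the slides) of the approximate two-sided interval: the quantile u(1 − α/2) is qnorm(1 - alpha/2), and p is plugged in for π in ∆. The values of p, n and α below are only illustrative.

# approximate (1 - alpha) interval for pi, plugging p for pi in Delta
p <- 0.25; n <- 100; alpha <- 0.05                      # illustrative values only
delta <- sqrt(p * (1 - p) / n) * qnorm(1 - alpha / 2)   # approximate half-width
c(lower = p - delta, upper = p + delta)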
Hypothesis testing - example

Example: A political party gained 30% in the last election. In a survey half a year later with 59 respondents it gained 25%. Can we be 80% sure that there has been a drop in preferences since the election?

null hypothesis (no drop): H0 : π = π0 = 0.3
alternative hypothesis (drop): HA : π < π0
Using interval estimate

using a one-sided interval estimate:
sample rel. frequency p = 0.25, n = 59

    ∆ = √(p(1 − p)/n) · u(0.8) = 0.0474

π < 0.25 + ∆ = 0.297 < 0.3 with probability 80%
we reject (zamítneme) H0 in favour of (ve prospěch) HA
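The one-sided bound can be checked in R (a sketch using the numbers of this example):

p <- 0.25; n <- 59
delta <- sqrt(p * (1 - p) / n) * qnorm(0.8)   # u(0.8) = 0.8416, delta = 0.0474
p + delta                                     # approx. 0.297, upper bound for pi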
Direct testing

the interval estimate uses an approximation and is therefore INEXACT
direct test: under the assumption of the null hypothesis H0 we take π = π0

    U = (p − π0) / √(π0(1 − π0)) · √n = (0.25 − 0.3) / √(0.3(1 − 0.3)) · √59 = −0.8381

u(α) = −u(0.8) ≈ −0.8416 < U, we cannot reject H0
... but it is on the edge
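The direct test can be reproduced in R (a sketch with the numbers above):

p <- 0.25; pi0 <- 0.3; n <- 59
U <- (p - pi0) / sqrt(pi0 * (1 - pi0)) * sqrt(n)   # approx. -0.8381
U < qnorm(0.2)                                     # FALSE: U > u(0.2) = -0.8416, H0 not rejected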
General scheme of hypothesis testing (classical test)

1. State the null hypothesis (equilibrium) and the alternative hypothesis (two-sided or one-sided) about a parameter.
2. Select an appropriate test - the statistic - a function of the data. Consider the test assumptions: independence, normality, small variance, ...
3. Determine the distribution function of the statistic.
4. Select the significance level (hladina významnosti) α, giving the probability of a Type I error, or the confidence level 1 − α.
5. Construct the critical region C(α) (for rejection), using the quantile function (inverse of the distribution).
6. Compute the value tobs of the test statistic from the observed data.
7. Reject H0 in favour of HA if tobs ∈ C(α), otherwise do not reject H0.

We construct a statistic (a function of the sample vector) such that its value grows with the parameter in the hypothesis. Then the inequality in the hypothesis matches the inequality defining the critical region.
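A minimal R sketch of the scheme (our own illustrative function, not part of the slides), here for a two-sided Z-test with known σ; the data in the last line are simulated:

# steps 5-7 of the classical test, two-sided Z-test with known sigma
classical_z_test <- function(x, mu0, sigma, alpha = 0.05) {
  z_obs <- (mean(x) - mu0) / sigma * sqrt(length(x))    # step 6: observed statistic
  crit  <- qnorm(c(alpha / 2, 1 - alpha / 2))           # step 5: critical region bounds
  list(z_obs = z_obs, critical = crit,
       reject_H0 = z_obs < crit[1] || z_obs > crit[2])  # step 7: decision
}
classical_z_test(rnorm(30, mean = 10.4, sd = 2), mu0 = 10, sigma = 2)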
Errors of I. and II. kind

error of the first kind - with probability α, WRONG rejection of H0
error of the second kind - with probability β, not rejecting H0 although it does not hold
minimized by using a larger sample or a better test (ASSUMPTIONS !!)
power of the test: 1 − β
Test about mean value (normal, known σ)

Sample {Xi} of size n from the normal distribution N(µ, σ²), σ known.
Statistic (Z-test):

    Z = (X̄ − µ0) / σ · √n with distribution N(0, 1)

Example: Reading test in the Czech Republic: mean 124 points, deviation 12 points. In one school, for a sample of 55 students, we observe a sample mean of 120 points.
Is the drop significant at the 95% level?

H0 : µ = 124, HA : µ < 124
Critical region: Z < Zcrit = u(0.05) = −1.64

    Z = (120 − 124) / 12 · √55 = −2.47

... reject the hypothesis
p-value test
(čistý test významnosti, "pure significance test")

p-value: the smallest level α at which we can reject H0
Our example: F_N(0,1)(−2.47) = 0.0068
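The critical value and the p-value of the reading-test example can be checked in R (a small sketch with the numbers above):

z_obs <- (120 - 124) / 12 * sqrt(55)   # approx. -2.47
qnorm(0.05)                            # critical value u(0.05), approx. -1.64
pnorm(z_obs)                           # p-value, approx. 0.0068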
Comparison of classical and p-value Z-tests

HA : µ < µ0
reject H0 for: Zobs < u(α) = F_N(0,1)⁻¹(α), resp. p = F(Zobs) < α

HA : µ > µ0
reject H0 for: Zobs > u(1 − α) = F_Z⁻¹(1 − α), resp. p = 1 − F(Zobs) < α

HA : µ ≠ µ0
reject H0 for: Zobs < F_Z⁻¹(α/2) ∨ F_Z⁻¹(1 − α/2) < Zobs, resp. p = 2 min{1 − F(Zobs), F(Zobs)} < α
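In R, pnorm is the distribution function F of N(0, 1), so the three p-values read (a sketch, z_obs being the observed statistic):

z_obs <- -2.47
pnorm(z_obs)                              # HA: mu < mu0
1 - pnorm(z_obs)                          # HA: mu > mu0
2 * min(pnorm(z_obs), 1 - pnorm(z_obs))   # HA: mu != mu0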
Error of II. kind, reading test example

Probability β of not rejecting H0 for various values of µ.

    1 − β = P( (X̄ − µ0)/σ · √n < u(α) | EX = µ )
          = P( (X̄ − µ)/σ · √n < (µ0 − µ)/σ · √n + u(α) | EX = µ )
          = F_N(0,1)( (µ0 − µ)/σ · √n + u(α) )

Example (reading test): µ0 = 124, σ = 12, n = 55, α = 0.05, u(α) = −1.64

    β(µ) = 1 − F( (124 − µ)/12 · √55 − 1.64 )
Error of II. kind, continued

[Figure: plot of β(µ) for µ from 116 to 124; β ranges from 0 to 1.]
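A short R sketch (our own code, not from the slides) of the β(µ) function for the reading-test example; the last line draws the curve shown above:

beta_mu <- function(mu, mu0 = 124, sigma = 12, n = 55, alpha = 0.05) {
  1 - pnorm((mu0 - mu) / sigma * sqrt(n) + qnorm(alpha))
}
beta_mu(120)                                    # approx. 0.20
curve(beta_mu(x), from = 116, to = 124, xlab = "mu", ylab = "beta(mu)")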
Test about mean value (normal, unknown σ)

t-test, statistic:

    T = (X̄ − µ0) / Sn · √n has Student's distribution tn−1

Example: Measurements of the heat conductivity coefficient: 0.62, 0.64, 0.57, 0.61, 0.59, 0.57, 0.62, 0.59. The nominal coefficient should be 0.60; decide whether there is a significant deviation (assuming normal data).

H0 : µ = 0.60, HA : µ ≠ 0.60, two-sided test
X̄ = 0.60125, Sn = 0.02531939, T = 0.1396, p = 0.89
R> 1-pt(0.1396, 7)   # 0.446454
R> qt(0.975, 7)      # 2.364624
... fail to reject H0.
t-test in R
> hc=c(0.62, 0.64, 0.57, 0.61, 0.59, 0.57, 0.62, 0.59)
> t.test(hc, mu=0.6, conf.level=0.99)
One Sample t-test
data: hc
t = 0.1396, df = 7, p-value = 0.8929
alternative hypothesis: true mean is not equal to 0.6
99 percent confidence interval:
0.5699235 0.6325765
sample estimates:
mean of x
0.60125
Test about variance (or deviation)

Consider a normally distributed sample X1, ..., Xn.

    X = Sn²/σ² · (n − 1) has the χ² distribution with n − 1 degrees of freedom

Example: The standard deviation in filling beer bottles should not be greater than 5 ml. Measurements of bottles in liters: 0.4981, 0.5016, 0.5004, 0.4978, 0.4996, 0.5002, 0.4874, 0.4890, 0.4772, 0.5013, 0.4961. Is the filling device precise enough?

H0 : σ = 0.005, HA : σ > 0.005, one-sided test, critical value F_χ²(10)⁻¹(1 − 0.05) = 18.3
Sn = 7.67 ml, σ0 = 5 ml, X = 23.55
... we reject H0, p = 1 − F(X) = 0.009

Note: Very sensitive to the normality assumption, problematic to use in practice. No dedicated function in R.
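Since there is no dedicated base R function, the test can be carried out by hand; a sketch for the bottle data above:

x <- c(0.4981, 0.5016, 0.5004, 0.4978, 0.4996, 0.5002,
       0.4874, 0.4890, 0.4772, 0.5013, 0.4961)
sigma0 <- 0.005
n <- length(x)
X_stat <- (n - 1) * var(x) / sigma0^2   # approx. 23.55
qchisq(0.95, df = n - 1)                # critical value, approx. 18.3
1 - pchisq(X_stat, df = n - 1)          # p-value, approx. 0.009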
Test about relative frequency (proportion test)

sample X1, ..., Xn from Alt(π), sample relative frequency p
for np > 5 and n(1 − p) > 5:

    T = (p − π) / √(π(1 − π)) · √n ≈ N(0, 1)

or using the F-distribution:

    F_Bi(n,π)(s) = F_df1,df2( df2(1 − π) / (df1 π) ),   df1 = 2(n − s), df2 = 2(s + 1)

Example: π0 = 0.3, p = 0.25, n = 50, HA : π < π0
s = 0.25 · 50 = 12.5, F_75,27(0.84) = 0.27
... cannot reject for α < 27%.
R> prop.test(0.25*50, n=50, p=0.3, alternative="less", conf.level=0.7)
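The F-distribution form can also be evaluated directly in R; a check of the value above (a sketch, not from the slides):

s <- 0.25 * 50; n <- 50; pi0 <- 0.3
df1 <- 2 * (n - s); df2 <- 2 * (s + 1)          # df1 = 75, df2 = 27
pf(df2 * (1 - pi0) / (df1 * pi0), df1, df2)     # approx. 0.27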