1 F-Tests
The F-test is useful for testing joint hypotheses about variances. It is the ratio of two chi-squares (each a sum of squares of independent, normally distributed random variables). Let there be k independent variables, so the degrees of freedom of the regression residuals are n − k − 1.

\frac{ESS/k}{SSR/(n - k - 1)} \sim F(k, n - k - 1)

F = \frac{\sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})^2 / k}{\sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2 / (n - k - 1)}

where ESS = \sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})^2 is the explained sum of squares (see Stock and Watson, p. 123). Imagine a random variable Y with a mean of Ȳ. Now take a sample and say that it is greater than Ȳ. What explains this? One explanation is random variation, due to the inherent variability of the random variable as measured by the variance of the population, σ². Another explanation is that that particular Y is associated with an X, some other variable, that is greater than its own mean, X̄.
The SSR is the sum of squared residuals and is defined as

SSR = \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2

We have

F = \frac{ESS/k}{SSR/(n - k - 1)}
Put the larger variance in the numerator when forming the F. If F is significantly different from 1, then the two variances are not likely to be the same.
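To make these pieces concrete, here is a minimal Python sketch (assuming numpy and scipy are available; the simulated data and variable names are illustrative, not from the text) that fits a regression by least squares, forms ESS and SSR, and computes F with its p-value.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    n, k = 50, 1                                  # observations, slope coefficients
    x = rng.normal(size=n)
    y = 2.0 + 0.5 * x + rng.normal(size=n)        # made-up data-generating process

    X = np.column_stack([np.ones(n), x])          # design matrix with intercept
    beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]
    y_hat = X @ beta_hat

    ESS = np.sum((y_hat - y.mean()) ** 2)         # explained sum of squares
    SSR = np.sum((y - y_hat) ** 2)                # sum of squared residuals
    F = (ESS / k) / (SSR / (n - k - 1))
    p_value = stats.f.sf(F, k, n - k - 1)         # upper-tail area of F(k, n - k - 1)
    print(F, p_value)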
Is it possible to have a significant F-test in multiple regression and still have insignificant individual variables (that is, all with t-stats less than t_crit)? Yes! It is possible to have a large enough sample size that the F-statistic is higher than its critical value no matter what the regression is. Is it possible to have a significant F-test in simple regression and still have an insignificant individual variable (that is, a t-stat less than t_crit)? No.
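A familiar way this can happen in multiple regression (a different mechanism from the sample-size point above) is near-perfect collinearity between regressors. The sketch below is a hypothetical Python illustration, assuming statsmodels is installed and using made-up data; with two nearly identical regressors it typically yields a highly significant overall F while neither individual t-statistic is significant.

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(1)
    n = 100
    z = rng.normal(size=n)
    x1 = z + 0.01 * rng.normal(size=n)            # two nearly identical regressors
    x2 = z + 0.01 * rng.normal(size=n)
    y = 1.0 + 1.0 * z + rng.normal(size=n)

    X = sm.add_constant(np.column_stack([x1, x2]))
    res = sm.OLS(y, X).fit()
    print("overall F:", res.fvalue, "p-value:", res.f_pvalue)   # joint test of both slopes
    print("slope t-stats:", res.tvalues[1:])                    # usually individually small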
1.1 Deriving The F-Test for a Simple Regression (k = 1)
The F-statistic is often used to test hypotheses about the regression line as a whole. It asks what the ratio is of the variance explained by the regression to what is left over. If the null hypothesis is true, then the variation of Y from observation to observation will not be affected by changes in X, but must be explained by the random disturbance term alone. In this case the numerator and denominator of the F-statistic will be the same (in expectation). To see this, write (in single-deviation form)
\hat{Y}_i = \bar{Y} + \hat{\beta} x_i

\sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})^2 = \hat{\beta}^2 \sum_{i=1}^{n} x_i^2 = ESS
The F is really a ratio of two variances. A variance is the expected value of a sum of squared deviations around the mean of whatever one is taking the variance of. What is the expected value of the ESS?
E(ESS) = E\left(\hat{\beta}^2 \sum_{i=1}^{n} x_i^2\right) = \sum_{i=1}^{n} x_i^2 \, E(\hat{\beta}^2)
Adding and subtracting the population parameter β:

E(\hat{\beta}^2) = E\left[\left((\hat{\beta} - \beta) + \beta\right)^2\right]

and expanding:

E(\hat{\beta}^2) = E\left[(\hat{\beta} - \beta)^2 + 2\beta(\hat{\beta} - \beta) + \beta^2\right]
By the theorem on the linear combination of expected values:

E(\hat{\beta}^2) = E\left[(\hat{\beta} - \beta)^2\right] + E\left[2\beta(\hat{\beta} - \beta)\right] + E(\beta^2)

but since β̂ is an unbiased estimator of β, E[2β(β̂ − β)] = 2β E(β̂ − β) = 0, and since β is a number and not a random variable, E(β²) = β², we have:

E(\hat{\beta}^2) = E\left[(\hat{\beta} - \beta)^2\right] + \beta^2
so that the expected value of the ESS can be written:

E(ESS) = E\left[(\hat{\beta} - \beta)^2\right] \sum_{i=1}^{n} x_i^2 + \beta^2 \sum_{i=1}^{n} x_i^2 \qquad (1)
Now look at the first term on the right, E[(β̂ − β)²]. This is just a fancy way to write the variance of β̂. Let's look at the variance of β̂ for a moment. The variance of the estimator β̂ is given by
var(\hat{\beta}_1) = var\left(\frac{\sum_{i=1}^{n} x_i Y_i}{\sum_{i=1}^{n} x_i^2}\right)
When you look at this expression, it is really just a linear combination of the Ys with weights

w_i = \frac{x_i}{\sum_{i=1}^{n} x_i^2}

so that

var(\hat{\beta}_1) = \sum_{i=1}^{n} w_i^2 \, var(Y_i)
and since var(Yᵢ) = σ², the variance of the population, we have

var(\hat{\beta}_1) = \frac{\sigma^2}{\sum_{i=1}^{n} x_i^2}
Now in equation (1) we can substitute this last equation, rearranged as

var(\hat{\beta}_1) \sum_{i=1}^{n} x_i^2 = \sigma^2

to get:
E(ESS) = \sigma^2 + \beta^2 \sum_{i=1}^{n} x_i^2 \qquad (2)
Under the null hypothesis that β = 0, we get
E(ESS) = σ²
This means that if there really isn't any explanatory power in the Xs, the variance of Ŷ around the mean will be the same as the population variance.
This is what we would have guessed; that is, the expected value of the sum of
squared deviations around the mean is the population variance when β = 0.
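A quick simulation can be used to check equation (2). The Python sketch below (numpy only; the sample size, σ, β values, and fixed x draw are all invented for illustration) averages the ESS over many replications: with β = 0 the average should be close to σ², and with β ≠ 0 close to σ² + β² Σ xᵢ².

    import numpy as np

    rng = np.random.default_rng(2)
    n, sigma, reps = 30, 2.0, 20000
    x = rng.normal(size=n)
    x = x - x.mean()                              # put x in deviation form

    def avg_ess(beta):
        ess = np.empty(reps)
        for r in range(reps):
            y = beta * x + rng.normal(scale=sigma, size=n)
            b_hat = np.sum(x * y) / np.sum(x ** 2)       # OLS slope in deviation form
            y_hat = y.mean() + b_hat * x
            ess[r] = np.sum((y_hat - y.mean()) ** 2)     # explained sum of squares
        return ess.mean()

    print(avg_ess(0.0), sigma ** 2)                              # close to sigma^2 when beta = 0
    print(avg_ess(0.7), sigma ** 2 + 0.7 ** 2 * np.sum(x ** 2))  # close to sigma^2 + beta^2 * sum(x^2)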
Where does the χ² distribution come in? Each of the Ys can be thought of separately as an identically and independently distributed (iid) random variable, conditional on X. So where does this χ² get its degrees of freedom? Think of each conditional Y as deviating from the mean Ȳ; since they are all identically distributed, they all have the same mean.
What is the distribution of the SSR under the alternative hypothesis that β is not equal to zero? SSR/σ² is the sum of squares of independently distributed normal random variables with mean 0 and variance 1, that is, chi-square with n − 2 degrees of freedom, whether or not β = 0. Note also that under the null the sampling distribution of

\frac{\hat{\beta}}{SE(\hat{\beta})}

is N(0, 1), so that β̂²/SE(β̂)² is distributed as chi-square with one degree of freedom. So it finally makes sense that the F-statistic can be written
F = \frac{ESS}{SSR/(n - 2)}

for a simple regression.

1.2 Alternative Views of the F-Statistic
The F-test can be seen as comparing a restricted or constrained case of OLS (k = 0) with the unrestricted or unconstrained case, which would just be the regression with k = 1. Calculate the sample variance in the unrestricted case:
\sum_{i=1}^{n} u_i^2 / (n - k - 1) = SSR/(n - 2)

and now in the restricted case:

\sum_{i=1}^{n} (Y_i - \bar{Y})^2 / (n - 1) = TSS/(n - 1)
Now let ∆ESS be the increase in the error sum of squares caused by using the restricted model, taken as a ratio to the unrestricted SSR:

\Delta ESS = \frac{TSS - SSR}{SSR}

but since TSS = ESS + SSR, dividing the numerator and denominator by their degrees of freedom gives:

\frac{ESS/1}{SSR/(n - 2)} \sim F(1, n - 2)
If this is close to zero, the contribution made by the unrestricted model
(using the regression) is insignificant.
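A short Python/numpy sketch (with invented data; the names are not from the text) of this restricted-versus-unrestricted view: the restricted model uses the mean alone, so its residual sum of squares is the TSS, and the F built from (TSS − SSR)/1 over SSR/(n − 2) is identical to the ESS-based F.

    import numpy as np

    rng = np.random.default_rng(3)
    n = 40
    x = rng.normal(size=n)
    y = 0.3 + 0.8 * x + rng.normal(size=n)

    X = np.column_stack([np.ones(n), x])
    y_hat = X @ np.linalg.lstsq(X, y, rcond=None)[0]

    TSS = np.sum((y - y.mean()) ** 2)             # restricted model: intercept only
    SSR = np.sum((y - y_hat) ** 2)                # unrestricted residual sum of squares
    ESS = TSS - SSR

    F_restriction = ((TSS - SSR) / 1) / (SSR / (n - 2))
    F_direct = (ESS / 1) / (SSR / (n - 2))
    print(F_restriction, F_direct)                # identical by construction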
1.3 The F can be expressed in terms of the R²
R^2 = 1 - \frac{\sum_{i=1}^{n} e_i^2}{\sum_{i=1}^{n} (Y_i - \bar{Y})^2} = \frac{TSS - SSR}{TSS}

1 - R^2 = 1 - \frac{TSS - SSR}{TSS} = \frac{TSS - TSS + SSR}{TSS} = \frac{SSR}{TSS}

\frac{R^2}{1 - R^2} = \frac{TSS - SSR}{SSR}
F = \frac{ESS/k}{SSR/(n - k - 1)}
but since TSS = ESS + SSR:

F\left[\frac{k}{n - k - 1}\right] = \frac{TSS - SSR}{SSR}

Hence:

F\left[\frac{k}{n - k - 1}\right] = \frac{R^2}{1 - R^2}
F = \frac{R^2/k}{(1 - R^2)/(n - k - 1)}
If the null hypothesis is true, then we would expect the R², and therefore the F, to be approximately 0. Thus a high value of the F-statistic is a rationale for rejecting the null. An F-statistic not significantly different from zero leads us to conclude that the explanatory variable(s) do little to explain the variation around the mean of Y.
Main point: the F-test is really a way of obtaining a distribution for the R² so that its significance can be tested.
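The algebra above is easy to confirm numerically. The Python/numpy sketch below (illustrative data, k = 1) computes R² from the residuals and checks that R²/k over (1 − R²)/(n − k − 1) reproduces the sum-of-squares form of F.

    import numpy as np

    rng = np.random.default_rng(4)
    n, k = 60, 1
    x = rng.normal(size=n)
    y = 1.5 - 0.4 * x + rng.normal(size=n)

    X = np.column_stack([np.ones(n), x])
    y_hat = X @ np.linalg.lstsq(X, y, rcond=None)[0]

    TSS = np.sum((y - y.mean()) ** 2)
    SSR = np.sum((y - y_hat) ** 2)
    R2 = 1 - SSR / TSS

    F_from_R2 = (R2 / k) / ((1 - R2) / (n - k - 1))
    F_from_ss = ((TSS - SSR) / k) / (SSR / (n - k - 1))
    print(F_from_R2, F_from_ss)                   # the two expressions agree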
1.4 The F-statistic is the same as the squared t-stat in simple regression
In single deviation form, we can write:
\hat{Y}_i = \bar{Y} + \hat{\beta} x_i

\sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})^2 = \hat{\beta}^2 \sum_{i=1}^{n} x_i^2

so that

F = \frac{\sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})^2}{\sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2/(n - 2)} = \frac{\hat{\beta}^2 \sum_{i=1}^{n} x_i^2}{\sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2/(n - 2)}

Since

[SE(\hat{\beta})]^2 = \frac{\sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2}{(n - 2)\sum_{i=1}^{n} x_i^2}

the denominator equals [SE(\hat{\beta})]^2 \sum_{i=1}^{n} x_i^2, and therefore

F = \frac{\hat{\beta}^2 \sum_{i=1}^{n} x_i^2}{[SE(\hat{\beta})]^2 \sum_{i=1}^{n} x_i^2} = \left(\frac{\hat{\beta}}{SE(\hat{\beta})}\right)^2 = t^2
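Finally, the F = t² identity for a simple regression can be checked directly. In the Python/numpy sketch below (made-up data), the slope's standard error is formed as SE(β̂)² = s²/Σxᵢ² with s² = SSR/(n − 2), and the squared t-statistic matches the F-statistic.

    import numpy as np

    rng = np.random.default_rng(5)
    n = 35
    x = rng.normal(size=n)
    y = 0.2 + 0.6 * x + rng.normal(size=n)

    xd = x - x.mean()                             # x in deviation form
    b_hat = np.sum(xd * y) / np.sum(xd ** 2)      # OLS slope
    y_hat = y.mean() + b_hat * xd
    SSR = np.sum((y - y_hat) ** 2)
    ESS = np.sum((y_hat - y.mean()) ** 2)

    s2 = SSR / (n - 2)                            # estimate of sigma^2
    se_b = np.sqrt(s2 / np.sum(xd ** 2))          # SE(beta_hat)
    t = b_hat / se_b
    F = ESS / (SSR / (n - 2))
    print(F, t ** 2)                              # F equals t squared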