Multiple Regression Analysis: Inference
Assumptions of the Classical Linear Model (CLM)
Given the Gauss-Markov assumptions, OLS is BLUE.
Beyond the Gauss-Markov assumptions, we need another
assumption to conduct tests of hypotheses (inference).
Assume that u is independent of x1, x2, …, xk and u is
normally distributed with zero mean and variance σ²:
u ~ N(0, σ²).
CLM Assumptions (continued . . .)
Under CLM, OLS is BLUE; OLS is the minimum variance
unbiased estimator.
y|x ~ N(β0 + β1x1 + … + βkxk, σ²)
Normal Sampling Distributions
Under the CLM assumptions, conditional on the
sample values of the explanatory variables
β̂j ~ N[βj, Var(β̂j)]
so that β̂j is distributed normally because it is a
linear combination of the errors, which are normal under CLM.
The t Test
Under the CLM assumptions, the expression
(β̂j – βj) / se(β̂j)
follows a t distribution (versus a standard normal
distribution), because we have to estimate σ² by σ̂².
Note the degrees of freedom: n – k – 1.
t Distribution
The t Test
-Knowing the sampling distribution allows us to carry out
hypothesis tests.
-Start with a null hypothesis.
-Example: H0: βj = 0
If we fail to reject the null hypothesis, then we conclude
that xj has no effect on y, controlling for the other x’s.
Steps of the t Test
1. Form the relevant hypothesis.
- one-sided hypothesis
- two-sided hypothesis
2. Calculate the t statistic.
t statistic for β̂j: t = β̂j / se(β̂j)
3. Find the critical value, c.
- Given a significance level, α, we look up the corresponding
percentile in a t distribution with n – k – 1 degrees of
freedom and call it c, the critical value.
4. Apply the rejection rule to determine whether or not to reject
the null hypothesis.
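As an illustration, the four steps above can be sketched with SciPy. All numbers here are hypothetical, not from the text:

```python
from scipy import stats

# Hypothetical inputs: an estimated coefficient, its standard error,
# the sample size, and the number of right-hand-side variables.
beta_hat, se = 0.52, 0.19
n, k = 100, 3
alpha = 0.05

# Step 2: the t statistic for H0: beta_j = 0.
t_stat = beta_hat / se

# Step 3: critical value c for a two-sided test with n - k - 1 DF.
df = n - k - 1
c = stats.t.ppf(1 - alpha / 2, df)

# Step 4: rejection rule for the two-sided alternative.
reject = abs(t_stat) > c
```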
Types of Hypotheses and
Significance Levels
Hypothesis: null vs. alternative
- one-sided: H0: βj = 0 and H1: βj < 0 or H1: βj > 0
- two-sided: H0: βj = 0 and H1: βj ≠ 0
Significance level (α)
- If we want only a 5% probability of rejecting
H0 when it really is true, then we say our
significance level is 5%.
- α values are generally 0.01, 0.05, or 0.10
- α values are dictated by sample size
Critical Value c
What do you need to find c?
1. t-distribution table (Appendix Table B.3, p.
723, Hirschey)
2. Significance level
3. Degrees of freedom
- n – k – 1, where n is the # of
observations, k is the # of RHS
variables, and 1 is for the constant.
One-Sided Alternatives
yi = β0 + β1x1i + … + βkxki + ui
H0: βj = 0
H1: βj > 0
[Figure: t distribution with rejection region of area α in the right tail, above c; area 1 – α below c (fail to reject).]
Critical value c: the (1 – α)th percentile in a t-dist with n – k – 1 DF.
t-statistic: t = β̂j / se(β̂j)
Results: Reject H0 if t-statistic > c; fail to reject H0 if t-statistic < c
One-Sided Alternatives
yi = β0 + β1x1i + … + βkxki + ui
H0: βj = 0
H1: βj < 0
[Figure: t distribution with rejection region of area α in the left tail, below –c; area 1 – α above –c (fail to reject).]
Critical value c: the (1 – α)th percentile in a t-dist with n – k – 1 DF.
t-statistic: t = β̂j / se(β̂j)
Results: Reject H0 if t-statistic < –c; fail to reject H0 if t-statistic > –c
Two-Sided Alternative
yi = β0 + β1x1i + … + βkxki + ui
H0: βj = 0
H1: βj ≠ 0
[Figure: t distribution with rejection regions of area α/2 in each tail, below –c and above c; area 1 – α in between (fail to reject).]
Critical value c: the (1 – α/2)th percentile in a t-dist with n – k – 1 DF.
t-statistic: t = β̂j / se(β̂j)
Results: Reject H0 if |t-statistic| > c; fail to reject H0 if |t-statistic| < c
Summary for H0: βj = 0
-Unless otherwise stated, the alternative is assumed to be
two-sided.
-If we reject the null hypothesis, we typically say “xj is
statistically significant at the α% level.”
-If we fail to reject the null hypothesis, we typically say “xj is
statistically insignificant at the α% level.”
Testing Other Hypotheses
-A more general form of the t-statistic recognizes that
we may want to test H0: βj = aj
-In this case, the appropriate t-statistic is
t = (β̂j – aj) / se(β̂j)
where aj = 0 for the conventional t-test
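A minimal sketch of this more general test, using hypothetical numbers (here testing H0: βj = 1, e.g. a unit elasticity):

```python
from scipy import stats

beta_hat, se = 0.87, 0.06   # hypothetical estimate and standard error
a_j = 1.0                   # hypothesized value under H0: beta_j = a_j
df = 60                     # n - k - 1

t_stat = (beta_hat - a_j) / se
p_value = 2 * stats.t.sf(abs(t_stat), df)   # two-sided p-value
```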
t-Test: Example
Tile Example
Q = 17.513 – 0.296P + 0.066I + 0.036A
      (-0.35)   (-2.91)    (2.56)    (4.61)
- t-statistics are in parentheses
Questions:
(a) How do we calculate the standard errors?
(b) Which coefficients are statistically different from zero?
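One way to approach question (a): since t = β̂ / se(β̂), each standard error can be recovered from the printout as the coefficient divided by its t statistic (taking the absolute value, since standard errors are positive). A quick sketch:

```python
# Coefficients and t statistics as reported in the tile example.
coefs   = [17.513, -0.296, 0.066, 0.036]
t_stats = [-0.35, -2.91, 2.56, 4.61]

# se = |coef / t|
ses = [abs(b / t) for b, t in zip(coefs, t_stats)]
```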
Confidence Intervals
Another way to use classical statistical testing is to construct a
confidence interval using the same critical value as was used
for a two-sided test.
A 100(1 – α)% confidence interval is defined as
β̂j ± c · se(β̂j)
where c is the (1 – α/2) percentile in a
t distribution with n – k – 1 degrees of freedom.
Confidence Interval (continued . . .)
Computing p-values for t Tests
An alternative to the classical approach
is to ask, “what is the smallest
significance level at which the null
hypothesis would be rejected?”
Compute the t-statistic, and then obtain
the probability of getting a value larger
(in absolute value, for a two-sided test)
than the calculated value.
The p-value is this probability.
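A sketch of the computation with a hypothetical t statistic:

```python
from scipy import stats

t_stat, df = 2.41, 25   # hypothetical t statistic and degrees of freedom

# One-sided p-value: probability of a value larger than the one calculated.
p_one = stats.t.sf(t_stat, df)
# Two-sided p-value: probability of a larger value in absolute terms.
p_two = 2 * stats.t.sf(abs(t_stat), df)
```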
Example: Regression Relation Between Units Sold and
Personal Selling expenditures for Electronic Data
Processing (EDP), Inc.
Units sold = -1292.3 + 0.09289 PSE
                  (396.5)    (0.01097)
- standard errors are in parentheses
(a) What are the associated t-statistics for the intercept and
slope parameter estimates?
(b) t-stat for β̂0 = -3.26 (p-value 0.009)
t-stat for β̂1 = 8.47 (p-value 0.000)
If p-value < α, then reject H0: βj = 0
If p-value > α, then fail to reject H0: βj = 0
(c) What conclusion about the statistical significance of the
estimated parameters do you reach, given these p-values?
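The reported t statistics can be checked directly from the printout: each is the coefficient divided by its standard error.

```python
# EDP regression: coefficients with standard errors in parentheses.
b0, se0 = -1292.3, 396.5
b1, se1 = 0.09289, 0.01097

t0 = b0 / se0   # intercept: about -3.26
t1 = b1 / se1   # PSE slope: about 8.47
```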
Testing a Linear Combination of
Parameter Estimates
Let’s suppose that, instead of testing whether ß1 is equal
to a constant, you want to test to see if it is equal to
another parameter; that is, H0: β1 = β2.
Use the same basic procedure for forming a t-statistic.
t = (β̂1 – β̂2) / se(β̂1 – β̂2)
Note:
Var(β̂1 – β̂2) = Var(β̂1) + Var(β̂2) – 2Cov(β̂1, β̂2).
Recall Var(β̂) = s²(XᵀX)⁻¹:
the off-diagonal elements are estimates of the
covariances and the diagonal elements correspond
to estimates of variances of the estimated coefficients.
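A sketch of the computation, assuming a hypothetical 2×2 block of the estimated covariance matrix s²(XᵀX)⁻¹ for β̂1 and β̂2:

```python
import numpy as np

# Hypothetical estimates and their estimated covariance block.
b1_hat, b2_hat = 0.41, 0.29
cov = np.array([[0.0025, 0.0010],    # Var(b1),     Cov(b1, b2)
                [0.0010, 0.0036]])   # Cov(b2, b1), Var(b2)

# Var(b1 - b2) = Var(b1) + Var(b2) - 2 Cov(b1, b2)
var_diff = cov[0, 0] + cov[1, 1] - 2 * cov[0, 1]
se_diff = np.sqrt(var_diff)

t_stat = (b1_hat - b2_hat) / se_diff
```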
Overall Significance
H0: β1 = β2 = … = βk = 0
Use of F-statistic:
F(k, n – k – 1, α) = (SSR/k) / (SSE/(n – k – 1)) = (R²/k) / ((1 – R²)/(n – k – 1))
F Distribution with 4 and 30 degrees of freedom
(for a regression model with four X variables
based on 35 observations).
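The R² form of the overall F-statistic makes it easy to compute by hand. A sketch with hypothetical values:

```python
from scipy import stats

r2, k, n = 0.40, 3, 60   # hypothetical R^2, RHS variables, observations
df = n - k - 1

F = (r2 / k) / ((1 - r2) / df)
c = stats.f.ppf(0.95, k, df)   # 5% critical value with (k, n - k - 1) DF
reject = F > c
```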
The F Statistic
Reject H0 at the α significance level if F > c.
[Figure: F distribution with area 1 – α below the critical value c (fail to reject) and area α above c (reject).]
Critical values: Appendix Tables B.2, pp. 720–722, Hirschey.
Example:
UNITSt = -117.513 – 0.296Pt + 0.036ADt + 0.006PSEt
                (-0.35)      (-2.91)       (2.56)       (4.61)
- t-statistics are in parentheses
Pt = Price
ADt = Advertising
PSEt = Selling Expenses
UNITSt = # of units sold
standard error of the regression is 123.9
R² = 0.97
adjusted R² = 0.958
n = 32
(a) Calculate the F-statistic.
(b) What are the degrees of freedom associated with the F-statistic?
(c) What is the cutoff value of this F-statistic when α =
0.05? When α = 0.01?
General Linear Restrictions
The basic form of the F-statistic will work for any set of linear
restrictions.
First estimate the unrestricted (UR) model and then estimate
the restricted (R) model.
In each case, make note of the SSE.
Test of General Linear Restrictions
F(q, n – k – 1, α) = [(SSER – SSEUR) / q] / s²
Note that SSER ≥ SSEUR
- This F-statistic is measuring the relative increase in SSE,
when moving from the unrestricted (UR) model to the
restricted (R) model.
- q = number of restrictions
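A sketch of the restricted-vs-unrestricted comparison with hypothetical SSE values:

```python
from scipy import stats

# Hypothetical sums of squared errors from the two fitted models.
sse_r, sse_ur = 1420.0, 1280.0   # restricted, unrestricted
n, k = 80, 5                     # observations and RHS variables (UR model)
q = 2                            # number of restrictions

df = n - k - 1
s2 = sse_ur / df                 # s^2 from the unrestricted model
F = ((sse_r - sse_ur) / q) / s2
p_value = stats.f.sf(F, q, df)
```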
Example:
Unrestricted Model
Qt = β0 + β1Pt + β2Yt + β3ADVt + ut
Compute SSE (SSEUR); s² = SSEUR / (n – k – 1)
H0: β2 = β3
Restricted Model (under H0); note q = 1
Qt = β0 + β1Pt + β2(Yt + ADVt) + ut
Compute SSE (SSER)
F(1, n – 4, α) = (SSER – SSEUR) / s²
F-Statistic Summary
- Just as with t-statistics, p-values can be calculated by
looking up the percentile in the appropriate F distribution.
- If q = 1, then F = t², and the p-values will be the same.
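The q = 1 equivalence can be checked numerically: the squared two-sided t critical value equals the F critical value with (1, n – k – 1) degrees of freedom (the α and DF below are hypothetical):

```python
from scipy import stats

alpha, df = 0.05, 40   # hypothetical significance level and DF

c_t = stats.t.ppf(1 - alpha / 2, df)   # two-sided t critical value
c_f = stats.f.ppf(1 - alpha, 1, df)    # F critical value with (1, df)
# c_t ** 2 and c_f agree up to floating-point error
```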
Summary: Inferences
- t-Test
(a) one-sided vs. two-sided hypotheses
(b) tests associated with a constant value
(c) tests associated with linear combinations of parameters
(d) p-values of t-tests
- Confidence intervals for estimated coefficients
- F-test
- p-values of F-tests
Structure of Applied Research