Economics 405
Problem Set #1
Solutions
1.
Wooldridge states “After data on the relevant variables have been collected,
econometric methods are used to estimate the parameters in the econometric
model…” What does he mean by parameters in this context?
Given the population model $y = \beta_0 + \beta_1 x + u$, the parameters we want to
estimate are the population slope ($\beta_1$) and the population intercept ($\beta_0$).
2.
a.
Determine the first derivative of the function $y = 10x^2 + .3\left(\frac{1}{x}\right)$.
Recognize that $y = 10x^2 + .3\left(\frac{1}{x}\right) = 10x^2 + .3x^{-1}$, then apply the power
function rule and the addition rule to get
$\frac{dy}{dx} = 20x - .3x^{-2} = 20x - .3\left(\frac{1}{x^2}\right)$.
b.
Determine the partial derivative of y with respect to z of the function
$y = (2x + 15 - 3z)^2$.
Use the chain rule:
$\frac{\partial y}{\partial z} = 2(2x + 15 - 3z)(-3) = -12x - 90 + 18z$.
3.
Suppose that the random variables X and Y have the marginal and joint
probability density functions given in the tables below:

X        2      4      6
fx(X)    .5     .3     .2

Y        5      10     15
fy(Y)    .2     .5     .3

Joint pdf $f_{x,y}(x, y)$:

         Y = 5    Y = 10   Y = 15
X = 2    .15      .2       .15
X = 4    .03      .2       .07
X = 6    .02      .1       .08

a.
Are X and Y independent? Why or why not?
No, since $f_{x,y}(x, y) \neq f_x(x) \cdot f_y(y)$ in the joint pdf table. For example, $f_{x,y}(2, 5) = .15$, whereas $f_x(2) \cdot f_y(5) = .5 \times .2 = .1$.
b.
Calculate the unconditional expected value of X.
$E(X) = \sum_x x f_x(x) = (2 \times .5) + (4 \times .3) + (6 \times .2) = 3.4$
c.
Calculate the relevant conditional expected values for X (i.e.,
$E(X \mid Y = 10)$, $E(X \mid Y = 15)$, $E(X \mid Y = 20)$). Compare and comment on
the conditional expected values obtained here versus the unconditional
expected value obtained in b.
There’s a typo in the first line of this question (credit to Naved for spotting
it!). The last part of the first sentence should read: “…(i.e.,
$E(X \mid Y = 5)$, $E(X \mid Y = 10)$, $E(X \mid Y = 15)$).”
Accordingly, the correct answer is obtained by taking:
$E(X \mid Y) = \sum_x x f_{x|y}(x \mid y) = \sum_x x \frac{f_{x,y}(x, y)}{f_y(y)}$.
Therefore,
$E(X \mid Y = 5) = 2\left(\frac{.15}{.2}\right) + 4\left(\frac{.03}{.2}\right) + 6\left(\frac{.02}{.2}\right) = 2.7$
$E(X \mid Y = 10) = 2\left(\frac{.2}{.5}\right) + 4\left(\frac{.2}{.5}\right) + 6\left(\frac{.1}{.5}\right) = 3.6$
$E(X \mid Y = 15) = 2\left(\frac{.15}{.3}\right) + 4\left(\frac{.07}{.3}\right) + 6\left(\frac{.08}{.3}\right) = 3.53$
Note the moral of the story here. The unconditional expected value of X is
3.4. Nevertheless, the conditional expected value of X changes with the
value of Y. This is a consequence of the dependence between X and Y
noted in part a.
d.
Does $E(XY) = E(X)E(Y)$? (Hint: Determine E(XY) using the joint pdf
and determine E(X) and E(Y) using the marginal pdfs.)
We know E(X) = 3.4.
Furthermore, $E(Y) = \sum_y y f_y(y) = (5 \times .2) + (10 \times .5) + (15 \times .3) = 10.5$.
Therefore: $E(X)E(Y) = 35.7$.
To calculate E(XY), take:
$E(XY) = \sum_x \sum_y x y f_{x,y}(x, y) = [(2 \times 5 \times .15) + (2 \times 10 \times .2) + \cdots + (6 \times 15 \times .08)] = 36.6$
Hence $E(XY) \neq E(X)E(Y)$. This comes as no surprise since X and Y are
not independent. I claimed in class that they would be equal if X and Y
are independent.
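The calculations in parts a through d can be checked numerically. The following is a minimal Python sketch, not part of the original solutions; the variable names (`joint`, `f_x`, `e_x_given_y`, and so on) are illustrative choices. It recovers the marginals from the joint table, tests the independence condition cell by cell, and computes the unconditional and conditional expectations.

```python
# Joint pdf from the Problem 3 table: keys are (x, y) pairs.
joint = {
    (2, 5): 0.15, (2, 10): 0.20, (2, 15): 0.15,
    (4, 5): 0.03, (4, 10): 0.20, (4, 15): 0.07,
    (6, 5): 0.02, (6, 10): 0.10, (6, 15): 0.08,
}
xs = sorted({x for x, _ in joint})
ys = sorted({y for _, y in joint})

# Marginal pdfs: sum the joint pdf over the other variable.
f_x = {x: sum(joint[(x, y)] for y in ys) for x in xs}
f_y = {y: sum(joint[(x, y)] for x in xs) for y in ys}

# Independence requires f(x, y) = f_x(x) * f_y(y) in every cell.
independent = all(
    abs(joint[(x, y)] - f_x[x] * f_y[y]) < 1e-12 for x in xs for y in ys
)

# Unconditional expectations and E(XY).
e_x = sum(x * f_x[x] for x in xs)
e_y = sum(y * f_y[y] for y in ys)
e_xy = sum(x * y * p for (x, y), p in joint.items())

# Conditional expectation E(X | Y = y) = sum_x x * f(x, y) / f_y(y).
e_x_given_y = {y: sum(x * joint[(x, y)] / f_y[y] for x in xs) for y in ys}

print("independent:", independent)
print("E(X), E(Y), E(XY), E(X)E(Y):",
      round(e_x, 2), round(e_y, 2), round(e_xy, 2), round(e_x * e_y, 2))
print("E(X | Y):", {y: round(v, 2) for y, v in e_x_given_y.items()})
```

Storing the joint pdf as a dictionary keyed by (x, y) keeps each sum a direct transcription of the corresponding formula.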
4.
Suppose that the random variable X and the random variable Y have the
following probability density functions:
X        -5     5      10
fx(X)    .4     .3     .3

Y        10     15     20
fy(Y)    .2     .5     .3
a. Assume that X and Y are independent. Construct the joint probability density
function table for X and Y. Briefly explain what you’ve done here.
If X and Y are independent then the joint probability density function equals
the product of the unconditional pdfs, i.e., $f_{x,y}(x, y) = f_x(x) \cdot f_y(y)$.
Accordingly, the joint pdf $f_{x,y}(x, y)$ is:

          Y = 10   Y = 15   Y = 20
X = -5    .08      .2       .12
X = 5     .06      .15      .09
X = 10    .06      .15      .09

b.
Determine the unconditional expected value of Y.
E(Y) = 15.5.
c.
Determine the expected value of Y conditional on X = 10. Compare your
answer in c with that in b and comment.
$E(Y \mid X) = \sum_y y f_{y|x}(y \mid x) = \sum_y y \frac{f_{x,y}(x, y)}{f_x(x)}$
Therefore:
$E(Y \mid X = 10) = 10\left(\frac{.06}{.3}\right) + 15\left(\frac{.15}{.3}\right) + 20\left(\frac{.09}{.3}\right) = 15.5$
Note that $E(Y \mid X) = E(Y)$, which follows from the fact that X and Y are
independent. As a result, the particular level of X has no influence on the
expected value of Y.
d.
Show mathematically whether $E(XY) = E(X)E(Y)$.
Since X and Y are independent, it should be true that
$E(XY) = E(X)E(Y)$. We know that E(Y) = 15.5. Using the table of
marginal densities for X above, I calculate E(X) = 2.5. Therefore:
E(X)E(Y) = 38.75. To calculate E(XY), take
$E(XY) = \sum_x \sum_y x y f_{x,y}(x, y) = [(-5 \times 10 \times .08) + (-5 \times 15 \times .2) + \cdots + (10 \times 20 \times .09)] = 38.75$
Hence E(XY) = E(X)E(Y). We should have expected this since X and Y
are independent.
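As a cross-check on Problem 4, the joint table can be built directly from the two marginal pdfs. This is a minimal Python sketch under the independence assumption stated in part a; the variable names are illustrative and not part of the original solutions.

```python
# Marginal pdfs from the Problem 4 tables.
f_x = {-5: 0.4, 5: 0.3, 10: 0.3}
f_y = {10: 0.2, 15: 0.5, 20: 0.3}

# Under independence, each joint cell is the product of the marginals.
joint = {(x, y): px * py for x, px in f_x.items() for y, py in f_y.items()}

# Unconditional expectations and E(XY).
e_x = sum(x * p for x, p in f_x.items())
e_y = sum(y * p for y, p in f_y.items())
e_xy = sum(x * y * p for (x, y), p in joint.items())

# E(Y | X = 10) = sum_y y * f(10, y) / f_x(10).
e_y_given_x10 = sum(y * joint[(10, y)] / f_x[10] for y in f_y)

print("E(X), E(Y):", round(e_x, 2), round(e_y, 2))
print("E(XY), E(X)E(Y):", round(e_xy, 2), round(e_x * e_y, 2))
print("E(Y | X = 10):", round(e_y_given_x10, 2))
```

Because every cell is a product of marginals, the conditional expectation of Y reduces to the unconditional one for any value of X, not just X = 10.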
5.
A professor decides to run an experiment to measure the effect of time pressure
on final exam scores. He gives each of the 400 students in his course the same
final exam, but some students get 90 minutes to complete the exam while others
have 120 minutes. Each student is randomly assigned one of the examination
times based on the flip of a coin. Let $Y_i$ denote the number of points scored on the
final exam and let $X_i$ denote the amount of time that the student has to complete
the exam. Consider the regression model: $Y_i = \beta_0 + \beta_1 X_i + u_i$.
a.
Explain what the term $u_i$ represents. Why will different students have
different values of $u_i$?
$u_i$ is the error term. It contains other unobserved factors that also affect
performance on the exam, such as hours studied, quality of study, student
ability, and so forth. These factors differ from student to student, so $u_i$ does too.
b. Will the zero conditional mean assumption, $E(u \mid X) = 0$, be valid for this
model? Why or why not?
The professor has set up what is known as a controlled experiment. As a
consequence, the zero conditional mean assumption should be valid. In
other words, given the random nature by which students are assigned test
taking time (X), there is no way that the factors mentioned in part a could
be correlated with the amount of time the student has to take the test. With
this said, then, it should seem reasonable that $\beta_1 > 0$, i.e., more time
improves test score.
Usually in econometric problems we do not have the luxury of using such
nice experimental data, however. Rather, we work with observational
data. With observational data, the researcher/econometrician has no
control over X, unlike the professor in the hypothetical. This leads to a
useful thought experiment. Suppose we now assume that the professor
merely records the time it takes each student to take the test (X) and then,
once the exam is graded, records the student test scores. In this scenario,
the professor has observed, rather than controlled via random assignment,
the value of X. Now one could well argue that ZCMA is violated. For
example, it might be the case that students who have studied harder and
are better prepared actually take less time to finish the test! If this is the
case, when the regression is run the estimated effect of X could well be
negative or at least could be much closer to 0 (though positive) because
test taking time and preparation are negatively correlated. The problem
here is that the effect of being well prepared will be confounded with the
effect of additional time and, as a result, it will be impossible to determine
the pure effect of additional time on test score since other things (like
study time) are not being held equal.
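The contrast between the experimental and observational settings can be illustrated with a small simulation. This is only a sketch under invented assumptions: the true effect of 0.33 points per minute, the "preparation" variable, and every other number in the data-generating process are hypothetical choices made for illustration, not values from the problem.

```python
import random


def ols_slope(x, y):
    """Least squares slope: sum((x - xbar)(y - ybar)) / sum((x - xbar)^2)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    num = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    den = sum((xi - x_bar) ** 2 for xi in x)
    return num / den


rng = random.Random(405)  # fixed seed so the run is repeatable
n = 20000
TRUE_EFFECT = 0.33  # hypothetical points gained per extra minute

# Unobserved preparation; it ends up in the error term u.
prep = [rng.gauss(0, 1) for _ in range(n)]

# Experiment: a coin flip assigns 90 or 120 minutes, unrelated to preparation.
time_exp = [rng.choice([90, 120]) for _ in range(n)]
score_exp = [40 + TRUE_EFFECT * t + 5 * p + rng.gauss(0, 5)
             for t, p in zip(time_exp, prep)]

# Observation: better prepared students take less time to finish.
time_obs = [105 - 10 * p + rng.gauss(0, 5) for p in prep]
score_obs = [40 + TRUE_EFFECT * t + 5 * p + rng.gauss(0, 5)
             for t, p in zip(time_obs, prep)]

slope_exp = ols_slope(time_exp, score_exp)
slope_obs = ols_slope(time_obs, score_obs)
print("slope with random assignment:", round(slope_exp, 3))
print("slope with observed time:    ", round(slope_obs, 3))
```

With random assignment the estimated slope should land close to the assumed effect, while in the observational version the negative correlation between time and preparation pulls the estimate well below it.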
c.
The estimated regression is $\hat{Y}_i = 40 + .33 X_i$. Compute the regression’s
prediction for the average score of students given 90 minutes to complete
the exam. Compute the regression’s prediction for the average score of
students given 120 minutes to complete the exam. What is the estimated
gain in score for a student who is given an additional 10 minutes on the
exam?
$\hat{Y}_i = 40 + .33(90) = 69.7$
$\hat{Y}_i = 40 + .33(120) = 79.6$
$\Delta\hat{Y} = \hat{\beta}_1 \Delta X = .33 \times 10 = 3.3$. Hence an extra 10 minutes of time is
predicted to improve exam score by 3.3 points.
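The arithmetic in part c can be reproduced in a few lines. This is a minimal Python sketch; the function name `predicted_score` is an illustrative choice, and the coefficients are the ones from the estimated regression above.

```python
def predicted_score(minutes, intercept=40.0, slope=0.33):
    """Fitted value from the estimated regression Y-hat = 40 + .33 X."""
    return intercept + slope * minutes


pred_90 = predicted_score(90)
pred_120 = predicted_score(120)

# The predicted change depends only on the slope and the change in X.
gain_10 = 0.33 * 10

print(round(pred_90, 2), round(pred_120, 2), round(gain_10, 2))
```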
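6.
a.
Show algebraically that $\sum_{i=1}^n x_i(x_i - \bar{x}) = \sum_{i=1}^n (x_i - \bar{x})^2$.
(1) $\sum_{i=1}^n x_i(x_i - \bar{x}) = \sum_{i=1}^n [x_i^2 - x_i\bar{x}] = \sum_{i=1}^n x_i^2 - \bar{x}\sum_{i=1}^n x_i = \sum_{i=1}^n x_i^2 - n\bar{x}\frac{\sum_{i=1}^n x_i}{n} = \sum_{i=1}^n x_i^2 - n\bar{x}^2$
(2) $\sum_{i=1}^n (x_i - \bar{x})^2 = \sum_{i=1}^n [x_i^2 - 2x_i\bar{x} + \bar{x}^2] = \sum_{i=1}^n x_i^2 - 2\bar{x}\sum_{i=1}^n x_i + \sum_{i=1}^n \bar{x}^2 = \sum_{i=1}^n x_i^2 - 2n\bar{x}\frac{\sum_{i=1}^n x_i}{n} + n\bar{x}^2 = \sum_{i=1}^n x_i^2 - 2n\bar{x}^2 + n\bar{x}^2 = \sum_{i=1}^n x_i^2 - n\bar{x}^2$
Hence both sides of the statement solve to $\sum_{i=1}^n x_i^2 - n\bar{x}^2$.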
b.
Convince me that the result is true for the ACT scores given in problem
2.3 on p. 61 of the text.
         (1)        (2)            (3)              (4)
         ACT = X    (Xi - Xbar)    X*(Xi - Xbar)    (Xi - Xbar)^2
         21         -4.875         -102.375         23.765625
         24         -1.875         -45              3.515625
         26         0.125          3.25             0.015625
         27         1.125          30.375           1.265625
         29         3.125          90.625           9.765625
         25         -0.875         -21.875          0.765625
         25         -0.875         -21.875          0.765625
         30         4.125          123.75           17.015625
Total    207                       56.875           56.875
Xbar     25.875
Notice that while the individual entries in columns (3) and (4) are very
different, they both add up to 56.875!
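The table can be reproduced with a short script. This is a minimal Python sketch using the eight ACT scores listed above; the names `col3`, `col4`, and `shortcut` are illustrative. The last quantity is the common value $\sum x_i^2 - n\bar{x}^2$ that both sides reduce to in part a.

```python
# ACT scores from column (1) of the table.
act = [21, 24, 26, 27, 29, 25, 25, 30]
n = len(act)
x_bar = sum(act) / n

# Column (3): x_i * (x_i - x_bar); column (4): (x_i - x_bar)^2.
col3 = [x * (x - x_bar) for x in act]
col4 = [(x - x_bar) ** 2 for x in act]

# Both sums should equal sum(x_i^2) - n * x_bar^2, as shown in part a.
shortcut = sum(x * x for x in act) - n * x_bar ** 2

print("x_bar:", x_bar)
print("sum of column (3):", sum(col3))
print("sum of column (4):", sum(col4))
print("sum(x^2) - n*x_bar^2:", shortcut)
```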
7.
A researcher wished to investigate the effect of the federally funded school lunch
program on 10th grade student test scores. The researcher hypothesized that
students participating in the lunch program would have better scores than those
not participating. (That is, full stomachs make for better performance and empty
stomachs lead to worse performance.) Seems like common sense but when the
researcher ran the regression for a randomly selected sample of 408 10th grade
classes from the State of Michigan, the computer spit out the following result:
math10 = 32.14 - 0.319 lnchprg
n = 408,
where math10 is the school’s average score on a standardized test; and
lnchprg is the percentage of students at the school who are eligible
for the free lunch program (eligibility is determined on the
basis of family income).
a. What does the fitted model predict will be the effect on test scores of a 10
percentage point increase in eligibility for the free lunch program?
The model predicts that a 10 percentage point increase in free lunch program
eligibility lowers test scores by 3.19 points (= -.319 x 10).
b. What is the predicted average test score at a school for which 50% of the
students are eligible for the free lunch program? What is the predicted
average test score at a school for which 10% of the students are eligible for
the free lunch program?
32.14 - .319(50) = 16.19 in the first case and 32.14 - .319(10) = 28.95 in the
second instance.
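The predictions in parts a and b follow directly from the fitted equation. This is a minimal Python sketch; the function name `predicted_math10` is an illustrative choice, and the coefficients are those reported in the regression output above.

```python
def predicted_math10(lnchprg, intercept=32.14, slope=-0.319):
    """Fitted value from math10 = 32.14 - 0.319 * lnchprg."""
    return intercept + slope * lnchprg


# Part a: effect of a 10 percentage point increase in eligibility.
effect_10 = -0.319 * 10

# Part b: predicted scores at 50% and 10% eligibility.
pred_50 = predicted_math10(50)
pred_10 = predicted_math10(10)

print(round(effect_10, 2), round(pred_50, 2), round(pred_10, 2))
```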
c. Contrary to common sense, the negative sign on the slope coefficient in the
fitted model suggests that better nutrition has an adverse effect on exam
performance! Discuss why the estimated coefficient is negative. [Hint: Base
your discussion on the relation $\frac{\partial E(y \mid x)}{\partial x} = \beta_1 + \frac{\partial E(u \mid x)}{\partial x}$, which we derived
from the population regression model.]
What factors are in the error term? Family income, poverty level, quality of
instruction, and so forth. As free lunch eligibility (x) increases, I would
expect u to decrease, i.e., free lunch eligibility will be associated with higher
levels of poverty, lower levels of income, perhaps poorer quality of
instruction, all of which lead to poorer performance on the exam. The fitted
model suggests that these adverse correlations between x and u overwhelm
whatever positive causal effect ($\beta_1$) the free lunch program has on test scores.
In other words, $\frac{\partial E(u \mid x)}{\partial x} < 0$ and larger in absolute value than $\beta_1$, which
presumably is positive.
8.
The formula for the least squares slope coefficient is
$\hat{\beta}_1 = \frac{\sum_{i=1}^n (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^n (x_i - \bar{x})^2}$.
Show step-by-step how this formula is obtained using the method of moments
approach.
If the zero conditional mean assumption is valid, then $E(u \mid x) = E(u) = 0$. This
then implies two restrictions on the moments of the u distribution:
(1) $E(u) = E(y - \beta_0 - \beta_1 x) = 0$
(2) $E(xu) = E[x(y - \beta_0 - \beta_1 x)] = 0$ (follows because the ZCMA implies that u is
mean independent of x, and hence uncorrelated with it).
To derive the method of moments coefficient estimators, we impose conditions
(1) and (2) on the sample data and solve for the sample values of the coefficients,
which will be our estimators of the population coefficients. In other words, if the
zero conditional mean assumption is valid or reasonable, then the sample data
should be subject to the logical implications of conditions (1) and (2). Now note
that condition (1) says that the average difference between y and the systematic
part of the population regression function is 0. Imposing this requirement on our
randomly chosen sample of n, implies:
(3) $\frac{1}{n}\sum_{i=1}^n [y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i] = 0$
Note that (3) is the average difference between observed y and predicted y.
Furthermore note that I’ve put ^s over the betas indicating that they will be
estimators of the parameters.
Now, since we’re not asked to prove the following, I’ll simply note that (3) solves
out to $\hat{\beta}_0 = \bar{y} - \hat{\beta}_1\bar{x}$. We’ll use this result in what follows.
To get the slope estimator, impose condition (2) on the sample. This yields:
(4) $\frac{1}{n}\sum_{i=1}^n [x_i(y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i)] = 0$. Substituting for $\hat{\beta}_0$ in (4) gives us:
(5) $\frac{1}{n}\sum_{i=1}^n [x_i(y_i - (\bar{y} - \hat{\beta}_1\bar{x}) - \hat{\beta}_1 x_i)] = 0$. Re-arranging terms inside the
rounded brackets results in:
(6) $\frac{1}{n}\sum_{i=1}^n [x_i((y_i - \bar{y}) - \hat{\beta}_1(x_i - \bar{x}))] = 0$. Doing the multiplication implied
inside the square brackets gives:
(7) $\frac{1}{n}\sum_{i=1}^n [x_i(y_i - \bar{y}) - \hat{\beta}_1 x_i(x_i - \bar{x})] = 0$. Now apply Property Sum.3 (p.
696) to the sum to get:
(8) $\frac{1}{n}\left[\sum_{i=1}^n x_i(y_i - \bar{y}) - \sum_{i=1}^n \hat{\beta}_1 x_i(x_i - \bar{x})\right] = 0$. Multiply both sides by n then
move the second sum to the other side of the equation to get:
(9) $\sum_{i=1}^n \hat{\beta}_1 x_i(x_i - \bar{x}) = \sum_{i=1}^n x_i(y_i - \bar{y})$. Using Property Sum.2 (p. 695), results
in:
(10) $\hat{\beta}_1 \sum_{i=1}^n x_i(x_i - \bar{x}) = \sum_{i=1}^n x_i(y_i - \bar{y})$. Solving for $\hat{\beta}_1$ yields:
(11) $\hat{\beta}_1 = \frac{\sum_{i=1}^n x_i(y_i - \bar{y})}{\sum_{i=1}^n x_i(x_i - \bar{x})}$. We know from class discussion and from problem 6
above that
$\sum_{i=1}^n x_i(y_i - \bar{y}) = \sum_{i=1}^n (x_i - \bar{x})(y_i - \bar{y})$ and $\sum_{i=1}^n x_i(x_i - \bar{x}) = \sum_{i=1}^n (x_i - \bar{x})^2$
respectively. Substituting into (11) gives:
(12) $\hat{\beta}_1 = \frac{\sum_{i=1}^n (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^n (x_i - \bar{x})^2}$.
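The derivation can be sanity-checked on a small data set. This is a minimal Python sketch; the five (x, y) pairs are hypothetical numbers invented for illustration, not data from the text. It confirms that forms (11) and (12) of the slope agree and that the resulting estimates satisfy the sample versions of moment conditions (1) and (2).

```python
# Hypothetical data, invented for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Slope from equation (11): sum x_i (y_i - y_bar) / sum x_i (x_i - x_bar).
b1_eq11 = (sum(xi * (yi - y_bar) for xi, yi in zip(x, y))
           / sum(xi * (xi - x_bar) for xi in x))

# Slope from equation (12): the usual deviations-from-means form.
b1_eq12 = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
           / sum((xi - x_bar) ** 2 for xi in x))

# Intercept from the solution of condition (3).
b0 = y_bar - b1_eq12 * x_bar

# Sample moment conditions (3) and (4), evaluated at the estimates.
resid = [yi - b0 - b1_eq12 * xi for xi, yi in zip(x, y)]
moment1 = sum(resid) / n
moment2 = sum(xi * ri for xi, ri in zip(x, resid)) / n

print("slope (11), slope (12):", b1_eq11, b1_eq12)
print("intercept:", b0)
print("sample moments:", moment1, moment2)
```

Both sample moments should come out at zero up to floating-point rounding, which is exactly what imposing conditions (1) and (2) on the sample requires.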