Gender Economics Courses in Liberal Arts Colleges: Comment
William H. Greene
Abstract
Burnett (1997) proposes a model of the joint determination of two binary choice variables,
presence or absence of a gender economics course and presence or absence of a women’s studies
program. The econometric techniques used in estimation of her model are not consistent with the
model, and will not produce consistent estimates of the parameters of the model. This note reestimates
her bivariate probit model using maximum likelihood procedures. We also present some related results
on specification and estimation of a model in which two binary variables are jointly (simultaneously)
determined and on computation of marginal effects in a bivariate probit model.
Keywords: bivariate probit, marginal effects, gender economics
JEL classification: C34, C51
In ‘Gender Economics Courses in Liberal Arts Colleges,’ Burnett (1997) proposes a model for
explaining the presence of a gender economics course in a liberal arts college’s course offerings. The
model is based on covariates such as the academic reputation of the college, whether the college has a
religious affiliation, region of the country, and other variables relating to the number of women on the
faculty of the college and in the economics department. Two interesting aspects of her model are, first,
the fact that the ‘response variable’ of interest, presence of a gender economics course, is a binary
variable and, second, that one of the important covariates, presence of a women’s studies program, is
likely to be jointly determined with the gender economics indicator, and is a binary variable as well.
The first of these mandates use of an appropriate technique for binary choice modeling, such as a probit model.
_______________________________
William H. Greene is a professor of economics at the Stern School of Business at New York University (44 West
Fourth St. New York, NY, 10012) . The author would like to thank Professor Nancy Burnett for providing her data for
this analysis, and Professors William Becker, William Walstad, and Peter Kennedy, the editors of the Journal of
Economic Education, for suggesting this work and for editorial assistance in preparing the manuscript.
The second greatly complicates the analysis in that it calls for something resembling a simultaneous
equations treatment. The major difficulty that this represents is that this is a nonlinear model, for which
familiar simultaneous equations techniques are inappropriate.
The author formulates probit models of sorts for both binary choice variables, and fits them
sequentially using an iterative least squares technique. In brief, an equation for the presence of the
women’s studies program is fit, then results from this equation are used ‘downstream’ in an equation for
the gender economics variable. The estimator is, therefore, a variant of two step (not ‘two stage’) least
squares. Binary choice models are not generally estimated by least squares, linear or nonlinear. The
method of choice in most studies is maximum likelihood. The purpose of this note is to suggest an
alternative model specification and estimator that is at once more appropriate for the theoretical
specification presented in her article and, surprisingly enough, much simpler to implement. In addition,
the discussion in Burnett’s article raises two interesting issues in binary choice modelling which, it seems,
are not very widely known: first, an appropriate treatment for two simultaneously determined binary
variables, and, second, the computation of marginal, or partial, effects in a bivariate probit model. The
discussion below will also include some useful results on these subjects. Lastly, having reestimated the
model in full, we revisit the empirical results. We find that our results are broadly consistent with
Burnett’s, but, we would argue, a bit sharper.
THE MODEL
Burnett gives extensive discussion to motivate the equations of the model, so we will merely lay
out the estimating equations here. The reader is referred to her paper for the background details. Using
our own notation (slightly different from the author’s), the model is as follows:
Equations:
GndrEcon = f1(AcRep, WomStud, EconFac, PctWEcn, Relig),
WomStud  = f2(AcRep, PctWfac, Relig, Sou, Mid, Nor, West).

Variable definitions:
GndrEcon = 1 if gender economics class is offered, 0 otherwise,
WomStud  = 1 if the college includes a women’s studies program, 0 otherwise,
AcRep    = academic reputation, lower means better reputation, 1 is best,
EconFac  = number of full time economics faculty,
PctWEcn  = percentage of female economics faculty,
Relig    = 1 if the college has a religious affiliation, 0 otherwise,
Sou      = regional dummy variable, South,
Mid      = regional dummy variable, Midwest,
Nor      = regional dummy variable, North,
West     = regional dummy variable, West.
(Burnett included an overall constant in her second equation, rather than the West dummy variable.)
ESTIMATION OF A RECURSIVE, SIMULTANEOUS BINARY CHOICE MODEL
Given the obvious relationship between women’s studies programs and gender economics
courses, the preceding certainly appears to be a recursive simultaneous equations model. If the two
dependent variables were continuous measures amenable to linear types of regression modeling, and if
the omitted factors (determinants) in the two equations, (i.e., the counterparts to regression
disturbances), were uncorrelated, then the parameters of the two equation system could be estimated
consistently, indeed, efficiently, by ordinary least squares. However, the first condition is violated by
construction – these are binary choices. As we shall see, the evidence is at least suggestive that the
second is not, but the first is sufficient to render the simplest techniques inappropriate. Moreover, in
spite of the lack of correlation (to be discussed below), this remains a simultaneous equations model.
To develop the estimators, it is convenient to represent the model in simpler terms.
Let
y1 = GndrEcon,
x1 = all variables on the right hand side of the first equation except WomStud,
y2 = WomStud,
x2 = all variables on the right hand side of the second equation.
Burnett obtained parameter estimates for the two equations by formulating the following two
equation model:
$$f_2(y_2) = \Phi(\alpha'x_2)^{y_2}\,[1 - \Phi(\alpha'x_2)]^{1 - y_2}$$
$$u_2 = y_2 - f_2(y_2)$$
$$f_1(y_1) = \Phi(\beta'x_1 + \gamma u_2)^{y_1}\,[1 - \Phi(\beta'x_1 + \gamma u_2)]^{1 - y_1},$$
where $\Phi(\cdot)$ is the distribution function of the standard normal distribution. Estimates of $[\alpha, \beta, \gamma]$ were
obtained by using “an iterated, seemingly unrelated Marquardt-Levenberg technique” (Burnett, 1997,
372). Note the use of the ‘residual’ in the second equation to mitigate the simultaneity of the WomStud
variable in the GndrEcon equation. There is an ambiguity in the text at this point; the author states that
a “seemingly unrelated” estimator was used, which suggests joint estimation of the two equations,
whereas later in the paragraph, she states that she “substitut[ed] the residuals from the second equation to
form the 2SCE,” which suggests single equation, sequential estimation. Because the Rivers and Vuong
(1988) study on which Burnett’s estimator is based proposed a two step approach, we assume the latter is
the case.
The two step procedure represents an attempt to deal with the obvious simultaneity problem.
However, the approach has several shortcomings. First, neither right hand side specification given
above is a conditional mean function. Neglecting the possible correlation across the two equations, the
conditional means in the two cases would be the respective $\Phi(\cdot)$ functions as they appear above. As
such, the use of least squares with the functions shown above as the regression function is questionable.
On the other hand, with the appropriate conditional mean function, one can consistently estimate the
parameters of binary choice models using nonlinear least squares. But nonlinear least squares is
inefficient. The preferred technique is maximum likelihood. Indeed, the terms in the likelihood
function (for a single equation approach, anyway) are precisely those given by Burnett above. As such,
had she maximized the sum of the logs of the right hand sides as given above, rather than minimized the
sum of squared deviations, she would (abstracting from $u_2$, which remains to be considered) have
obtained maximum likelihood estimates of the model parameters. There remains the question of using
the residual, $u_2$, in the second equation. The two step procedure suggested by Rivers and Vuong (1988)
is proposed for a probit model in which one or more of the right hand side variables are endogenous, but
continuous. Thus, Burnett’s use of the residual in the second equation would be appropriate if
WomStud were a continuous variable, but not a binary outcome. Finally, the Rivers and Vuong
technique dictates conditional maximum likelihood estimation at the second step, not nonlinear least
squares.
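The contrast between the two estimation criteria for a single probit equation can be sketched in a few lines. This is only an illustration under stated assumptions: the data `X` and `y` are made up, the intercept is held at zero, and a crude grid search stands in for a proper optimizer. It is not Burnett's procedure or the LIMDEP routine.

```python
from math import log
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal; cdf() plays the role of Phi


def index(coefs, row):
    """Linear index alpha'x for one observation."""
    return sum(c * x for c, x in zip(coefs, row))


def probit_loglik(coefs, X, y):
    """Sum of the logs of Phi(alpha'x)^y * [1 - Phi(alpha'x)]^(1 - y)."""
    total = 0.0
    for row, yi in zip(X, y):
        p = STD_NORMAL.cdf(index(coefs, row))
        total += log(p) if yi == 1 else log(1.0 - p)
    return total


def nls_objective(coefs, X, y):
    """Sum of squared deviations of y from the conditional mean Phi(alpha'x)."""
    return sum((yi - STD_NORMAL.cdf(index(coefs, row))) ** 2
               for row, yi in zip(X, y))


# Hypothetical toy data: a constant and one regressor (not Burnett's data).
X = [(1.0, -1.5), (1.0, -0.5), (1.0, 0.0), (1.0, 0.4), (1.0, 1.2), (1.0, 2.0)]
y = [0, 0, 1, 0, 1, 1]

# Grid search over the slope only, with the intercept fixed at zero.
grid = [i / 100 for i in range(1, 401)]
slope_ml = max(grid, key=lambda s: probit_loglik((0.0, s), X, y))
slope_nls = min(grid, key=lambda s: nls_objective((0.0, s), X, y))
```

Both criteria are built from the same function $\Phi(\alpha'x)$; they differ only in whether the logs of the terms are summed and maximized or the squared deviations are summed and minimized, so the two slopes need not coincide.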
All of this suggests a fairly straightforward way to proceed, based essentially on Burnett’s two
step approach. First, the initial estimation of the WomStud ($y_2$) equation can (should) be done with the
familiar maximum likelihood probit analysis. At the second step, rather than insert a ‘residual’ in the
GndrEcon equation in place of WomStud, simply insert the predicted values from the WomStud equation.
These would be the sample estimates of $\Phi(\alpha'x_2)$ (i.e., the sample estimates of $E[y_2 \mid x_2]$). The second
step consists of maximum likelihood probit estimation of the $y_1$ equation, with right hand side variables
$x_1$ and the sample estimates of $\Phi(\alpha'x_2)$ obtained earlier. The estimated asymptotic covariance matrix at
this step must be corrected. This is a direct application of Murphy and Topel’s (1985) results for two
step estimation. (An application that is close to this one, and which could be modified to match it
exactly, appears in Greene (1997, 142-144).)
The two step procedure suggested in the previous paragraph gives consistent estimators and
appropriate asymptotic standard errors. However, it is potentially inefficient insofar as it does not
account for the possible correlation between the (implicit) disturbances in the two equations. Surprisingly
enough, a consistent, fully efficient estimator for this model is actually much simpler to apply than even
the revised two step estimator suggested earlier. Based on the relationships among joint, conditional,
and marginal probabilities,
$$\text{Prob}[y_1 = 1, y_2 = 1] = \text{Prob}[y_1 = 1 \mid y_2 = 1] \times \text{Prob}[y_2 = 1]$$
$$= \{\text{BVN}(y_1 = 1, y_2 = 1) / \text{Prob}[y_2 = 1]\} \times \text{Prob}[y_2 = 1],$$
where BVN is used to denote the cumulative distribution function of the bivariate normal distribution.
The result seems trivial, but it has an important implication for this model. Inserting the variables of the
model, and introducing the two parameter vectors as in Burnett’s equations, we have
$$\text{Prob}[y_1 = 1, y_2 = 1] = [\text{BVN}(\beta'x_1 + \gamma y_2, \alpha'x_2, \rho) / \Phi(\alpha'x_2)] \times \Phi(\alpha'x_2)$$
(remember, $y_2 = 1$), in which $\Phi(\alpha'x_2) = \text{Prob}[y_2 = 1]$. This is a univariate probit model for $y_2$, though in
this context it is, as well, the marginal probability that $y_2$ equals 1. After cancelling like terms in the
numerator and denominator, we are left with just the bivariate probability,
$$\text{Prob}[y_1 = 1, y_2 = 1] = \text{BVN}(\beta'x_1 + \gamma, \alpha'x_2, \rho).$$
There are four cases, or cells, for this bivariate probit model; the remaining three can be constructed
analogously to the first. For the remaining cells, we have
$$\text{Prob}[y_1 = 0, y_2 = 1] = \text{BVN}(-\beta'x_1 - \gamma, \alpha'x_2, -\rho),$$
$$\text{Prob}[y_1 = 1, y_2 = 0] = \text{BVN}(\beta'x_1, -\alpha'x_2, -\rho),$$
$$\text{Prob}[y_1 = 0, y_2 = 0] = \text{BVN}(-\beta'x_1, -\alpha'x_2, \rho).$$
But these are just the terms that enter the usual likelihood function for the bivariate probit model. The
counterintuitive result is that in the bivariate probit model, unlike in the linear simultaneous equations
model, if the two dependent variables are jointly determined, we just put each on the right hand side of
the other equation (or, in our case, one of them) and proceed as if there were no simultaneity problem.
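As a check on the four cell probabilities, the following minimal sketch evaluates them and confirms that they sum to one. Several things here are assumptions made for illustration: the index values `bx1` and `ax2`, the coefficient `gamma`, and the correlation `rho` are made-up numbers; the bivariate normal CDF is computed with a simple Simpson's rule integral rather than a library routine; and the $(y_1 = 1, y_2 = 0)$ cell is written with $-\alpha'x_2$ as its second argument, which is what makes the two $y_2 = 0$ cells sum to $\Phi(-\alpha'x_2)$.

```python
from math import sqrt
from statistics import NormalDist

N = NormalDist()  # standard normal: N.cdf is Phi, N.pdf is phi


def bvn_cdf(h, k, rho, steps=2000):
    """Prob[Z1 <= h, Z2 <= k] for standard normals with correlation rho.

    Integrates phi(z) * Phi((k - rho*z) / sqrt(1 - rho^2)) over z <= h
    with Simpson's rule on [-8, min(h, 8)] (steps must be even).
    """
    if rho == 0.0:
        return N.cdf(h) * N.cdf(k)
    lower, upper = -8.0, min(h, 8.0)
    if upper <= lower:
        return 0.0
    scale = sqrt(1.0 - rho * rho)

    def f(z):
        return N.pdf(z) * N.cdf((k - rho * z) / scale)

    width = (upper - lower) / steps
    total = f(lower) + f(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(lower + i * width)
    return total * width / 3.0


def cell_probabilities(bx1, ax2, gamma, rho):
    """Probabilities of the four (y1, y2) cells of the recursive model."""
    return {
        (1, 1): bvn_cdf(bx1 + gamma, ax2, rho),
        (0, 1): bvn_cdf(-bx1 - gamma, ax2, -rho),
        (1, 0): bvn_cdf(bx1, -ax2, -rho),
        (0, 0): bvn_cdf(-bx1, -ax2, rho),
    }


# Hypothetical index values and parameters, for illustration only.
cells = cell_probabilities(bx1=-0.5, ax2=0.2, gamma=1.1, rho=0.3)
total_probability = sum(cells.values())
```

The log likelihood for a sample is then the sum, over observations, of the log of the cell probability that matches the observed pair $(y_1, y_2)$.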
This result is not new. One form of it appears (unfortunately with a couple of sign errors) on
page 123 of Maddala’s now classic 1983 monograph on limited dependent variable models. The
complications that appear throughout Maddala’s discussion are bred of his inclusion of structural, latent
variables. Our model is much simpler; the observed responses are what appear on the right hand sides.
On the other hand, even abstracting from that simplification, Burnett’s model is precisely Maddala’s
‘Model 6’ on his page 122. We suggest, then, that the appropriate estimator for Burnett’s model is the
bivariate probit model, which can be fit easily by maximum likelihood using any of several econometrics
computer programs. (We used LIMDEP.)
THE DATA
Data used in this study were generously provided by Professor Burnett. She gives summary
statistics for the data used in her estimation in her Table 1. We encountered a few small discrepancies in
assembling the data for our regressions. It appears that in the group of 35 national colleges, one of the
midwestern colleges was classified as a northern college in the data set provided to us. A few other
minor differences appear in the summary statistics below, but there is no obvious explanation. For
completeness, our counterpart to Burnett’s Table 1 of summary statistics is given below. Where they
differ from ours, Burnett’s reported statistics are shown in parentheses below ours.
Table 1. Descriptive Statistics

            All 132 Colleges          35 National Colleges      97 Regional Colleges
Variable    Mean         Std.Dev.     Mean         Std.Dev.     Mean         Std.Dev.
-------------------------------------------------------------------------------------
GndrEcon    0.234848     0.425519     0.714286     0.458349     0.0618557    0.242145
                         (0.4329)                                            (0.2409)
WomStud     0.439394     0.498204     0.942857     0.235504     0.257732     0.439658
                         (0.4963)                                            (0.4374)
AcRep       119.242      62.1201      18.6857      12.5435      155.526      13.9045
            (119.0984)   (62.1290)                              (155.3298)   (19.7094)
PctWecn     0.247866     0.217745     0.220143     0.133794     0.257869     0.240786
            (0.2471)     (0.2179)                               (0.2568)     (0.2410)
EconFac     6.74242      3.747        10.3429      3.82671      5.4433       2.74609
            (6.7500)     (3.7627)                               (5.4536)     (2.7800)
PctWFac     0.35772      0.155059     0.324569     0.0967468    0.369682     0.170144
            (0.3579)     (0.1554)                               (0.3699)     (0.1706)
Relig       0.575758     0.49611      0.114286     0.322803     0.742268     0.439658
                         (0.4942)                                            (0.4374)
Nor         0.333333     0.4732       0.714286     0.458349     0.195876     0.398935
            (0.3257)     (0.4686)     (0.6857)     (0.4710)
Sou         0.242424     0.430182     0.0857143    0.284029     0.298969     0.460184
                         (0.4286)                                            (0.4578)
Mid         0.272727     0.447058     0.0857143    0.284029     0.340206     0.47624
            (0.2803)     (0.44915)    (0.1113)     (0.3228)                  (0.4738)
West        0.151515     0.359916     0.114286     0.322803     0.164948     0.373062

Cross Tabulation of GndrEcon vs. WomStud

             GNDRECON=0   GNDRECON=1   Total
WOMSTUD=0        72            2         74
WOMSTUD=1        29           29         58
Total           101           31        132

ESTIMATES OF A BIVARIATE PROBIT MODEL
Full information maximum likelihood estimates of the parameters of the bivariate probit model
are given below. In a first pass, the correlation between the two structural disturbances, $\rho$, was allowed
to vary freely. The estimated value was 0.13594. The t ratio on this coefficient of only 0.108 suggests
that the effects may well be uncorrelated. The second set of results, our preferred set, was computed
with $\rho$ fixed at zero. (This is equivalent to fitting two separate probit equations, with WomStud simply
included as a covariate in the GndrEcon equation - the probit counterpart of fitting a fully recursive
simultaneous equations model equation by equation, using ordinary least squares in spite of the
simultaneity.) Using the two sets of results, we also carried out a likelihood ratio test of the null
hypothesis that $\rho$ equals zero against the alternative that $\rho$ does not equal zero. The test statistic, LRT =
-2(-85.64578 - (-85.63172)) = .04612 is distributed as chi-squared with one degree of freedom under the
null hypothesis. The value of .04612 is virtually nil, so the null hypothesis is not rejected. Based on
these results, the model was reestimated with $\rho$ constrained to equal zero. These are the results given in
Table 2. We emphasize, the resulting model is the first one discussed at the beginning of this article,
with the two equations estimated by simple, univariate, binomial probit methods, using maximum
likelihood as the estimation criterion.
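The likelihood ratio computation can be reproduced from the two log likelihoods reported in Table 2. The sketch below uses the identity that the upper tail of a chi-squared variable with one degree of freedom is erfc(sqrt(x/2)); the variable names are illustrative. Evaluated from the two printed log likelihoods, the expression comes to about 0.028 rather than the .04612 quoted in the text. Which figure carries the misprint cannot be determined from the text alone, and either value is far below the 5 percent critical value of 3.84, so the conclusion is unaffected.

```python
from math import erfc, sqrt

# Log likelihoods as printed in Table 2.
loglik_restricted = -85.64578    # rho fixed at zero
loglik_unrestricted = -85.63172  # rho estimated freely

# Likelihood ratio statistic for H0: rho = 0.
lr_stat = -2.0 * (loglik_restricted - loglik_unrestricted)

# Upper-tail probability of a chi-squared(1) variable:
# P[X > x] = P[|Z| > sqrt(x)] = erfc(sqrt(x / 2)).
p_value = erfc(sqrt(lr_stat / 2.0))

# Conventional 5 percent critical value for one degree of freedom.
critical_value_5pct = 3.841
reject_null = lr_stat > critical_value_5pct
```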
The results are broadly consistent with Burnett’s. As she did, we find that academic reputation
is the most significant variable in either equation. The sign is consistent, negative as expected.
Another statistically significant variable in the gender economics equation is the percentage of women on
the economics faculty. The negative effect of religious affiliation that Burnett observed in her results is
preserved in ours, although unlike her result, we find no statistical significance. The size of the
economics faculty is not significant in the bivariate probit model either. The regional differences that she
observes are replicated. We observe as she did that women’s studies programs are more likely in the
north and midwest than in the south and west. (We replaced the constant term in the second equation
with the WEST dummy variable so as to display this result more conveniently.) In the main, in terms of
their magnitude and their statistical significance, the unambiguous result of the regressions in Table 2 is
that the primary determinant of a women’s studies program or a gender economics course is the strong
(in terms of numbers) presence of women on the faculty, a result that seems unsurprising. On the other
hand, the apparent influence of a large economics faculty and of religious affiliation both diminish
dramatically in our model, compared to hers. Finally, as might be expected, we find that a significant
determinant of the presence of a gender economics course is the presence of a women’s study program.

Table 2. Estimated Bivariate Probit Model

Number of observations                 132
Iterations completed                   19
Log likelihood function                -85.64578
Log likelihood with $\rho$ not zero    -85.63172

Variable    Coefficient         Standard Error   b/St.Er.   P[|Z|>z]   Mean of X
---------------------------------------------------------------------------------
Index equation for GNDRECON
Constant    -1.417628571        .80692136        -1.757     .0789
ACREP       -.1143415362E-01    .40805763E-02    -2.802     .0051      119.24242
WOMSTUD     1.109512483         .56741514        1.955      .0505      .43939394
ECONFAC     .6730283502E-01     .68736793E-01    .979       .3275      6.7424242
PCTWECN     2.539164644         .98690669        2.573      .0101      .24786591
RELIG       -.3482478834        .49841969        -.699      .4847      .57575758
Index equation for WOMSTUD
ACREP       -.1956572288E-01    .55235099E-02    -3.542     .0004      119.24242
PCTWFAC     1.942934611         .84349502        2.303      .0213      .35772046
RELIG       -.4493737886        .33313246        -1.349     .1774      .57575758
SOU         1.359691119         .65941139        2.062      .0392      .24242424
MID         2.338647544         .81043972        2.886      .0039      .27272727
NOR         1.886695657         .82039949        2.300      .0215      .33333333
WEST        1.824807716         .87231488        2.092      .0364      .15151515
Disturbance correlation
RHO(1,2)    .0000000000 (Fixed Parameter)    Unrestricted value 0.135938

MARGINAL EFFECTS
As Burnett suggests (1997, 374), the coefficients in a binary choice model can be
misleading. Because the model is actually of a probability, the absolute scale of the coefficients gives a
distorted picture of the response of the dependent variable to a change in one of the stimuli. Thus, for
example, it seems improbable that the isolated effect of religious affiliation on the expected value of
Gender Economics would be -.35. Gender economics courses appear in only 31 of the 132 colleges
(about 23 percent) in total. As has been widely documented (Greene (1997)), in order to compute marginal effects in a
binary choice model, one must scale the coefficients. In a simple binary probit model, we would have
$$E[y_1 \mid x_1] = \Phi(\beta'x_1) = \text{Prob}[y_1 = 1]$$
so that, for a continuous variable, $z_i$,
$$\partial E[y_1 \mid x_1]/\partial z_i = \partial \Phi(\beta'x_1)/\partial z_i = \phi(\beta'x_1)\,\beta_z$$
where $\phi(\cdot)$ is the density function of the standard normal distribution and $\beta_z$ is the coefficient on that
variable. If $z_i$ is a binary variable, then the appropriate way to measure the marginal effect is to use
$$\text{Effect on } E[y_1 \mid x_1] = E[y_1 \mid x_1, z = 1] - E[y_1 \mid x_1, z = 0].$$
The approach in the general bivariate probit model is essentially as shown above, but the
computations are quite a bit more involved (Greene (1996)). The computations involve the preceding
derivatives, as well as, for example, for a variable $z$ that might appear in either equation,
$$\partial \text{BVN}(\beta'x_1 + \gamma, \alpha'x_2, \rho)/\partial z_k = \left\{\phi(\beta'x_1 + \gamma)\,\Phi\!\left[\frac{\alpha'x_2 - \rho(\beta'x_1 + \gamma)}{\sqrt{1 - \rho^2}}\right]\right\}\beta_z$$
$$+ \left\{\phi(\alpha'x_2)\,\Phi\!\left[\frac{(\beta'x_1 + \gamma) - \rho\,\alpha'x_2}{\sqrt{1 - \rho^2}}\right]\right\}\alpha_z.$$
The results are much simpler in our model because $\rho$ equals zero. The result of this is that the joint
probability is the product of the marginals. The conditional mean function in the model analyzed here is
$$E[y_1 \mid x_1, x_2] = E_{y_2} E[y_1 \mid x_1, x_2, y_2]$$
$$= \text{Prob}[y_2 = 1]\,E[y_1 \mid x_1, x_2, y_2 = 1] + \text{Prob}[y_2 = 0]\,E[y_1 \mid x_1, x_2, y_2 = 0]$$
$$= \Phi(\alpha'x_2)\,\Phi(\beta'x_1 + \gamma) + \Phi(-\alpha'x_2)\,\Phi(\beta'x_1),$$
where we have used the symmetry of the normal distribution in $\Phi(-\alpha'x_2) = 1 - \Phi(\alpha'x_2)$. For $y_2$,
$$E[y_2 \mid x_2] = \Phi(\alpha'x_2).$$
Therefore, the marginal effects in the women’s studies equation are exactly those shown above for the
univariate probit model. But the marginal effect of a change in a variable in the gender economics
equation will be a sum of terms. One will account for the direct effect of a change in that variable on the
probability that y1 equals one, and the other will measure the indirect effect of the change in this variable
on the probability that y2 equals 1 in the women’s studies equation which, in turn, affects the probability
that y1 equals one. Thus:
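(1) For a continuous variable, $z$, which might appear in $x_1$ and/or $x_2$, we have
$$\partial E[y_1 \mid x_1, x_2]/\partial z = [\Phi(\alpha'x_2)\,\phi(\beta'x_1 + \gamma) + \Phi(-\alpha'x_2)\,\phi(\beta'x_1)]\,\beta_z \quad \text{(direct effect)}$$
$$+ [\phi(\alpha'x_2)\,\Phi(\beta'x_1 + \gamma) - \phi(\alpha'x_2)\,\Phi(\beta'x_1)]\,\alpha_z \quad \text{(indirect effect)}$$
where $\beta_z$ and $\alpha_z$ are the coefficients on $z$ in the two equations. Note that either of these may be zero.
The minus sign in the indirect effect arises because $\partial \Phi(-\alpha'x_2)/\partial z = -\phi(\alpha'x_2)\,\alpha_z$.
(2) For a binary variable, $q$, which might appear in $x_1$ and/or $x_2$, we have
$$E[y_1 \mid x_1, x_2, q = 1] - E[y_1 \mid x_1, x_2, q = 0] = [\Phi(\alpha'x_2)\,\Phi(\beta'x_1 + \gamma) + \Phi(-\alpha'x_2)\,\Phi(\beta'x_1)]\,|_{q=1}$$
$$- [\Phi(\alpha'x_2)\,\Phi(\beta'x_1 + \gamma) + \Phi(-\alpha'x_2)\,\Phi(\beta'x_1)]\,|_{q=0}.$$
(3) For the second, endogenous binary variable, $y_2$, we have
$$E[y_1 \mid x_1, x_2, y_2 = 1] - E[y_1 \mid x_1, x_2, y_2 = 0] = \Phi(\beta'x_1 + \gamma) - \Phi(\beta'x_1).$$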
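A minimal sketch of cases (1) and (3), using the coefficient estimates and sample means printed in Table 2 with $\rho = 0$. Two points are assumptions rather than statements from the paper: all variables, including the dummies, are set at their sample means when the two indexes are formed, and the indirect term is taken as $\phi(\alpha'x_2)[\Phi(\beta'x_1 + \gamma) - \Phi(\beta'x_1)]\alpha_z$, which is what differentiating the conditional mean $\Phi(\alpha'x_2)\Phi(\beta'x_1 + \gamma) + \Phi(-\alpha'x_2)\Phi(\beta'x_1)$ with respect to $\alpha'x_2$ gives. The dictionary and function names are illustrative. Under these assumptions the AcRep, PctWEcn, and WomStud figures come out close to the corresponding entries in the gender economics block of Table 3.

```python
from statistics import NormalDist

N = NormalDist()  # standard normal: N.cdf is Phi, N.pdf is phi

# Table 2 estimates with rho fixed at zero.
beta = {"Constant": -1.417628571, "AcRep": -0.01143415362,
        "EconFac": 0.06730283502, "PctWEcn": 2.539164644,
        "Relig": -0.3482478834}
gamma = 1.109512483  # coefficient on WomStud in the GndrEcon equation
alpha = {"AcRep": -0.01956572288, "PctWFac": 1.942934611,
         "Relig": -0.4493737886, "Sou": 1.359691119, "Mid": 2.338647544,
         "Nor": 1.886695657, "West": 1.824807716}

# Sample means from Table 2 (the constant has "mean" one).
means = {"Constant": 1.0, "AcRep": 119.24242, "EconFac": 6.7424242,
         "PctWEcn": 0.24786591, "Relig": 0.57575758, "PctWFac": 0.35772046,
         "Sou": 0.24242424, "Mid": 0.27272727, "Nor": 0.33333333,
         "West": 0.15151515}

# Index functions at the means; bx1 excludes the WomStud term.
bx1 = sum(beta[v] * means[v] for v in beta)
ax2 = sum(alpha[v] * means[v] for v in alpha)


def continuous_effect(var):
    """Direct, indirect, and total effect of a continuous variable on E[y1]."""
    direct_scale = N.cdf(ax2) * N.pdf(bx1 + gamma) + N.cdf(-ax2) * N.pdf(bx1)
    indirect_scale = N.pdf(ax2) * (N.cdf(bx1 + gamma) - N.cdf(bx1))
    direct = direct_scale * beta.get(var, 0.0)      # zero if var not in x1
    indirect = indirect_scale * alpha.get(var, 0.0)  # zero if var not in x2
    return direct, indirect, direct + indirect


# Case (3): effect of the endogenous binary variable WomStud.
womstud_effect = N.cdf(bx1 + gamma) - N.cdf(bx1)

acrep_direct, acrep_indirect, acrep_total = continuous_effect("AcRep")
pctwecn_direct, _, _ = continuous_effect("PctWEcn")
```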
In all cases, standard errors for these are computed using the delta method. Let $k(\beta, \gamma, \alpha, \text{data})$
equal the marginal effect, which is computed using the parameter estimates and some configuration of
the data; in the computations below, the sample means are used, with dummy variables set to zero and
one. Let
$$g_b = \partial k(\beta, \gamma, \alpha, \text{data})/\partial[\beta, \gamma]$$
$$g_a = \partial k(\beta, \gamma, \alpha, \text{data})/\partial \alpha$$
Since $\rho$ is zero, our model is estimated using separate probit equations. Let $V_b$ denote the estimated
asymptotic covariance matrix of the estimates in the first probit equation (gender economics) and $V_a$
denote the same for the women’s studies equation. Then, an estimate of the asymptotic variance for the
estimated marginal effect is
$$\text{Asy.Var.}[k] = g_b' V_b\, g_b + g_a' V_a\, g_a$$
The square root gives the estimated standard error for the estimator. (If $\rho$ were not zero, these two terms
would be combined into a single vector and a single asymptotic covariance matrix which would apply to
all parameters in the model. The partitioning into two terms is possible when $\rho$ equals zero in the
bivariate model, because the asymptotic covariance matrix for the full set of estimators becomes block
diagonal.)
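The two-block delta method calculation can be sketched as follows. The function names, the use of central-difference numerical gradients in place of analytic derivatives, and the example covariance matrices are all assumptions made for illustration; the covariance matrices behind Table 3 are not reported in the text, so no attempt is made to reproduce its standard errors.

```python
from math import sqrt


def numerical_gradient(func, theta, eps=1e-6):
    """Central-difference gradient of func at the parameter list theta."""
    grad = []
    for i in range(len(theta)):
        up = list(theta)
        down = list(theta)
        up[i] += eps
        down[i] -= eps
        grad.append((func(up) - func(down)) / (2.0 * eps))
    return grad


def quadratic_form(g, V):
    """g' V g for a gradient list g and a covariance matrix V (list of rows)."""
    n = len(g)
    return sum(g[i] * V[i][j] * g[j] for i in range(n) for j in range(n))


def delta_method_variance(k, b, a, Vb, Va):
    """Variance of k(b, a) when the covariance of (b, a) is block diagonal.

    b holds the first-equation parameters, a the second-equation parameters.
    """
    g_b = numerical_gradient(lambda t: k(t, a), b)
    g_a = numerical_gradient(lambda t: k(b, t), a)
    return quadratic_form(g_b, Vb) + quadratic_form(g_a, Va)


# Hypothetical example: a linear function, for which the result is exact.
def k_linear(b, a):
    return b[0] + 2.0 * b[1] + 3.0 * a[0]


Vb_example = [[1.0, 0.5], [0.5, 2.0]]  # made-up covariance for (b0, b1)
Va_example = [[4.0]]                    # made-up variance for a0

variance = delta_method_variance(k_linear, [0.1, 0.2], [0.3],
                                 Vb_example, Va_example)
standard_error = sqrt(variance)
```

For the linear example the gradients are (1, 2) and (3), so the variance is 11 + 36 = 47, which gives a quick check on the implementation before it is applied to a nonlinear marginal effect.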
The estimated marginal effects of the variables in the two equations are given in Table 3. At
this point, our results depart sharply from Burnett’s. The primary reason is that the implicit scaling in
the probit model differs from that in her regression model. However, beyond that, we find some
substantive differences in the estimated effects. In terms of its marginal effect, academic reputation is
far less influential in the bivariate probit model, and qualitatively far less important, than Burnett’s
results suggest. In terms of its effect on gender economics, nearly all of the effect of academic
reputation is the direct effect on the probability itself, as opposed to the indirect effect of academic
reputation on the probability that the college has a women’s studies program. By far the most
substantial effect, at the margin, on the probability that the college will have a gender economics course
is exerted by a large female presence on the economics faculty. The existence of a women’s studies
program has a lesser, but also substantial effect, similar to the percentage of women on the faculty of the
college. (Note that the entire effect of this variable is indirect, but that this effect is still fairly large.)
Finally, the effect of religious affiliation is quantitatively smaller in our results, but, in contrast to
Burnett, it is more important in women’s studies than in gender economics.

Table 3. Estimated Marginal Effects

Variable    Direct       Indirect     Total        (standard error, t-ratio)   (type of variable)
--------------------------------------------------------------------------------------------------
Gender Economics Equation
AcRep       -0.002022    -0.001453    -0.003476    (.00126, -3.087)            (Continuous)
PctWecon    +0.4491                   +0.4491      (.1568, 2.864)              (Continuous)
EconFac     +0.01190                  +0.01190     (.01292, 0.922)             (Continuous)
Relig       -0.07049     -0.03227     -0.1028      (.1055, -.974)              (Binary)
WomStud     +0.1863                   +0.1863      (.0868, 2.146)              (Endogenous)
PctWfac                  +0.13951     +0.13951     (.08916, 1.565)             (Continuous)
Women’s Studies Equation
AcRep       -0.00754                  -0.00754     (.002187, -3.448)           (Continuous)
PctWfac     +0.13789                  +0.13789     (.01002, 13.76)             (Continuous)
Relig       -0.13265                  -0.13266     (.18803, -.706)             (Binary)

CONCLUSIONS
The preceding has suggested a number of aspects of modeling in a recursive, simultaneous
equations model involving two binary choice variables. We suggested an alternative to the estimation
technique used by Burnett in her study, and, in the process, were able to apply her data and model to
some useful, though apparently not widely used computations in this setting. In spite of her
misspecifications of the conditional means at the outset, our base results do not differ radically from hers;
most of the qualitative results are replicated. However, the results do differ enough to conclude that
the use of the more appropriate techniques is clearly warranted.
REFERENCES
Burnett, N. 1997. Gender economics courses in liberal arts colleges. Journal of Economic
Education 28 (Fall): 369-377.
Greene, W. 1997. Econometric Analysis (3rd ed.), Englewood Cliffs, NJ: Prentice Hall.
Greene, W. 1996. Marginal effects in the bivariate probit model. Working paper number 96-11.
Department of Economics, Stern School of Business, New York University.
LIMDEP. 1998. Econometric Software, Inc.: Plainview, New York.
Maddala, G. 1983. Limited dependent and qualitative variables in econometrics. Cambridge:
Cambridge University Press.
Murphy, K. and R. Topel. 1985. Estimation and inference in two step econometric models.
Journal of Business and Economic Statistics, 3 (Fall): 370-379.
Rivers, D. and Q. Vuong. 1988. Limited information estimators and exogeneity tests for
simultaneous probit models. Journal of Econometrics 39 (November): 347-366.