Diffusion or Confusion? Modeling Policy Diffusion with Discrete Event History Data

Jack Buckley
Department of Political Science
State University of New York at Stony Brook
Stony Brook, NY 11794-4392

Paper prepared for the 19th Annual Summer Political Methodology Meetings, Seattle, Washington, 2002. The research reported in this paper was supported by the National Science Foundation’s Graduate Research Fellowship program.

Abstract

In a review of the literature on policy diffusion in the states, Berry and Berry (1999) conclude that the “gold standard” methodological approach to the empirical modeling of these processes is discrete event history analysis. I concur, but find that the literature largely ignores several important issues in the proper specification of these models, including choice of functional form, modeling different mechanisms of diffusion, modeling duration dependence, and spatial autocorrelation in the cross-sections. I use data from Berry and Berry’s (1990) classic study of the diffusion of state lotteries to suggest possible improvements to this research.

Introduction

Since Walker’s (1969) seminal work, students of the policy process have recognized that diffusion, or the spreading of policy innovations from state to state, has important consequences both for theories of public policy and real policy outcomes. As Berry and Berry (1999) point out in their review of the diffusion literature, the “gold standard” approach to the empirical testing of diffusion models is widely regarded to be event history analysis. I concur, but find that the literature largely ignores several important issues in the proper specification of these models, including modeling duration dependence, modeling different mechanisms of diffusion, choice of functional form, and spatial autocorrelation or dependency in the cross-sections.
In this paper, I use data from Berry and Berry’s (1990) classic study of the diffusion of state lotteries to suggest possible improvements to this research.

The best recent empirical research on policy diffusion and innovation in the states, such as Berry and Berry's (1990; 1992) work on state lotteries and tax policy, Mooney and Mei-Hsien's (1995) study of pre-Roe abortion policy, or Mintrom's (1997; Mintrom and Vergari 1998) analysis of education reform, shares a common methodological approach: discrete event history analysis. This method has the flexibility to allow modeling of both what the literature calls the internal determinants of policy change (e.g., per capita income in a state or the percentage of religious fundamentalists) and specialized measures that test different theoretical approaches to diffusion (discussed in greater detail below).

In the standard approach a starting date is chosen, such as the first year a policy is introduced, and data are collected for a series of discrete time periods on all the states that are in the risk set, that is, all states that have a positive probability of adopting the policy. As states adopt the policy and are thus removed from the risk set, data are no longer collected on them. The data set is then typically analyzed using a dichotomous dependent variable indicating whether the policy was adopted in a given year by a given state. Coefficients and their standard errors are estimated using familiar maximum likelihood estimators such as logit or probit (Allison 1984).

This approach to testing theory with discrete event history data is straightforward and computationally economical, but has several shortcomings. First, as Berry and Berry note, this simple analysis “assumes that the probability that a state will adopt in one year is unrelated to its probability of adoption in prior years,” (1999:192).
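The state-year layout described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the state labels, the adoption years, and the helper name `build_risk_set` are hypothetical and are not taken from the lottery data.

```python
def build_risk_set(adoption_year, first_year, last_year):
    """Return one (state, year, adopt) row for every state-year still at risk.

    adoption_year maps a state label to the year it adopted the policy,
    or to None if it had not adopted by the end of the observation window.
    """
    rows = []
    for state, adopted in sorted(adoption_year.items()):
        for year in range(first_year, last_year + 1):
            if adopted is not None and year > adopted:
                break  # the state left the risk set after adopting
            rows.append((state, year, int(adopted == year)))
    return rows


# Hypothetical adoption years (assumption, for illustration only).
adoption_year = {"A": 1966, "B": 1968, "C": None}
rows = build_risk_set(adoption_year, 1964, 1968)

for row in rows:
    print(row)
```

State A contributes three rows (1964 through its adoption year, 1966), while B and C contribute all five years. The dichotomous third column is the dependent variable that a logit or probit estimator would then be fit to.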
In duration modeling or event history analysis parlance, the hazard rate is flat: the model is constrained to allow no duration dependence (see Box-Steffensmeier and Jones 2002, chapter 6). Second, researchers have not paid enough attention to including theoretically sound measures of diffusion that prevent confounding this important concept with duration dependence. Third, several alternatives to the traditional logit or probit estimators exist that can lead to different results or to the modeling of additional processes beyond the scope of the original model. Finally, the literature to date has ignored issues of spatial dependency in the data.

As I address each of these issues in turn below, the central point I wish to convey is that a host of modeling choices (and sometimes tradeoffs) must be made to properly model policy diffusion. I also wish to stress that this paper is not a primer on duration modeling generally; in fact, I refer frequently to just such a work, the excellent and accessible book by Box-Steffensmeier and Jones (2002). Rather, the focus here is more narrowly on issues specifically important to the modeling of policy diffusion.

Duration Dependence

The unique attribute of duration or survival analysis is the ability to consider functions of time conditional on the data in hand or time itself. Practitioners of event history analysis frequently refer to one of these functions as the hazard rate, h(t), which can be defined as the instantaneous rate at which units “fail” (i.e., states adopt a policy) given that they have survived to time t. As Berry and Berry point out in the quotation cited above, their event history analyses of state lottery adoptions (1990) and tax policy diffusion (1992) implicitly assume that this rate is unchanging over time: a flat hazard rate.
In other words, they are assuming that, conditional on the covariates in their model, the probability of policy adoption is not changing as a function of time (Box-Steffensmeier and Jones 2002: 134-5). This assumption of a flat hazard, or no duration dependence, is equivalent to the assumption that their models are completely and correctly specified, a rather strong assumption for a social science model. Moreover, since the purpose of their research is the study of policy diffusion over time (and space), it would seem to be more practical to relax this assumption and account for the possibility of unmodeled duration dependence in the data (Beck, Katz, and Tucker 1998).

Fortunately, there are several relatively simple means of doing this. As Box-Steffensmeier and Jones point out (2002: 135), the most general approach in the context of discrete duration modeling is the inclusion of dichotomous (or “dummy”) variables for t-1 of the t time periods under observation. The advantage of this approach is that it is very flexible: no a priori functional form need be chosen to model the effect of time on the hazard. However, adding t-1 variables can make it difficult to estimate other parameters if the initial number of degrees of freedom is not large.[1]

One alternative approach is the inclusion of time itself as a regressor in the model. This can be done simply by including a “counter” variable representing the time of observation,[2] or by including one or more transformations of time (such as the natural log of time, or the square and cube of time). This method is certainly more parsimonious than the temporal dummies approach, but requires the researcher to make an assumption about the effects of time conditional on the other covariates in the model.
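To make the alternatives above concrete, the following sketch builds the temporal dummies, a linear counter, and two transformations of time for a single observation. The function name `time_terms`, the choice of the first period as the omitted baseline, and the example length of 23 periods are assumptions made for illustration.

```python
import math


def time_terms(counter, n_periods):
    """Return alternative codings of time for one state-year observation.

    counter runs from 1 (first period observed) to n_periods (last period).
    """
    # t-1 dummies: one indicator per period, with period 1 as the baseline.
    dummies = [int(counter == k) for k in range(2, n_periods + 1)]
    return {
        "dummies": dummies,                       # most flexible, costs t-1 parameters
        "linear": counter,                        # one parameter, constant slope in time
        "log": math.log(counter),                 # one parameter, flattening effect of time
        "poly": (counter, counter**2, counter**3),  # square and cube of time
    }


first = time_terms(1, 23)
third = time_terms(3, 23)
print(len(third["dummies"]), third["linear"], round(third["log"], 3), third["poly"])
```

The dummy coding spends 22 parameters on time in this example, whereas each of the other codings spends between one and three, which is the tradeoff between flexibility and parsimony discussed above.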
If, for example, the researcher includes a linear time counter, then the implicit assumption is that, conditional on the various quantities in the model and their effects, the hazard rate is increasing at a constant rate with time.

To illustrate the use of this method and its effect on the estimates obtained, I turn to a familiar example: the Berry and Berry (1990) study of state lottery adoptions.[3] Berry and Berry estimate a discrete event history model of the form:

$$
\begin{aligned}
\text{Adopt}_{i,t} = \Phi(&\beta_0 + \beta_1\,\text{Fiscal Health}_{i,t-1} + \beta_2\,\text{Per Capita Income}_{i,t-1} + \beta_3\,\text{Single Party Control}_{i,t} \\
&+ \beta_4\,\text{Gubernatorial Election Year}_{i,t} + \beta_5\,\text{Neither Election Year nor Year After}_{i,t} \\
&+ \beta_6\,\text{Percentage Fundamentalists}_{i,t} + \beta_7\,\text{Number of Neighboring States with Lottery}_{i,t})
\end{aligned}
$$

where policy adoption by a state in a given year is a probit function of the linear combination of lagged fiscal health and per capita income, dichotomous variables for single party control, gubernatorial election year, and whether t is neither an election year nor the year after an election, the percentage of religious fundamentalists out of the whole state population, and the number of neighboring states with a lottery. The first column of Table 1 presents my results of the estimation of this model using the original dataset.

[1] Mintrom and Vergari include dichotomous variables for most of their time points (the first two are collapsed into one). However, they do this not in recognition of the issue of modeling duration dependence but as an alternative specification of diffusion. I discuss this further below.

[2] Generally coded so that $c \in \{1, 2, \ldots, t\}$ or $c \in \{0, 1, \ldots, t\}$.

[3] Note that I do not include any of the interactive terms discussed by Frant (1991) and Berry and Berry (1991) in the controversy over whether Berry and Berry’s theory is correctly tested.
Unsurprisingly, I exactly replicate the results of Berry and Berry.[4] In the second column of the table, however, I add a simple linear term using a time counter variable (coded 1 for the first year of observation to 23 for the last), yielding the model (with the other terms abbreviated in a slight abuse of notation):

$$\text{Adopt}_{i,t} = \Phi(\beta_0 + x'_{i,t}\beta + \beta_8\,\text{Time Counter}_t)$$

This addition to the model has some interesting consequences. First, and most importantly, the ratio of the estimate of the marginal effect of the number of neighboring states to its standard error is smaller than in the Berry and Berry model, small enough in fact to lead to a failure to reject the null hypothesis that the coefficient is equal to 0 at conventional two-tailed significance levels. Also note that the estimated effect of lagged fiscal health is now statistically significant, as is the time counter. This latter result suggests (at least when the linear functional form is chosen) that there is an increasing risk of states adopting a lottery over time, even though the neighboring states hypothesis is not supported. I caution the reader not to immediately interpret this result, however, and in the next section I examine alternative specifications of diffusion estimated simultaneously with a duration dependence term.

[4] More precisely, the coefficient estimates are identical but my standard errors are slightly different because I also cluster on the state variable, since the observations over time for a single state are almost certainly not independent.

Table 1 About Here

Before moving on, however, I first need to explore alternative specifications of duration in the model. Following the suggestions of Box-Steffensmeier and Jones (2002), I estimate first a model including a term for the natural logarithm of the time counter, and second a model using a natural cubic spline of time that attempts to recover some of the flexibility of the temporal dummy approach.
The first model is simply:

$$\text{Adopt}_{i,t} = \Phi(\beta_0 + x'_{i,t}\beta + \beta_8 \ln[\text{Time Counter}_t])$$

The results, presented in the third column of Table 1, are similar to those of the linear time counter model, although they are somewhat more supportive of the state competition (number of neighbors) theory of diffusion.

The cubic spline approach, suggested by Box-Steffensmeier and Jones (2002: 137), follows the recent work of Beck, Katz, and Tucker (1998) and Beck and Jackman (1998) on the use of spline functions or locally weighted regression to estimate smooth functions of variables. In this case, I estimate a five-knot cubic spline of the probability of adoption of a state lottery on the time counter variable and include the linear prediction as a covariate to model duration dependency.[5] This new variable allows me to capture considerable non-linearity without the loss of degrees of freedom associated with dummy variables or the need to specify a restrictive functional form. The model is thus:

$$\text{Adopt}_{i,t} = \Phi(\beta_0 + x'_{i,t}\beta + \beta_8 \hat{\tau}_{i,t})$$

where $\hat{\tau}_{i,t}$ are the residuals from the cubic spline regression. The results of this approach are in the last column of Table 1, and Figure 1 provides a visual comparison of the linear, natural log, and cubic spline specifications of duration dependence. Once again, estimates are similar to those of the original model. Duration dependence is positive and statistically significant in this model as well.

Figure 1 About Here

Table 2 provides a comparison of the four models on two criteria: the estimated coefficient on the number of neighboring states (the diffusion measure in the original paper) and the chi-square statistics and p-values for likelihood ratio tests of the original model with no duration dependence against the three alternative models. All three of the models have statistically significant likelihood ratio statistics, indicating superior fit to the original model.
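The likelihood ratio comparison described above can be worked through by hand. The sketch below assumes that the alternative model adds exactly one parameter to the original model (so the statistic is referred to a chi-square distribution with one degree of freedom) and that the first two log-likelihoods reported in Table 1, -89.56 and -84.86, belong to the no-duration-dependence and linear-time models respectively. The helper name `lr_test_1df` is hypothetical.

```python
import math


def lr_test_1df(loglik_restricted, loglik_unrestricted):
    """Likelihood ratio test for nested models that differ by one parameter."""
    stat = 2.0 * (loglik_unrestricted - loglik_restricted)
    if stat < 0:
        raise ValueError("the unrestricted model cannot fit worse than the restricted one")
    # For a chi-square variable with 1 df, P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Assumed mapping of Table 1 log-likelihoods to models (see the lead-in).
stat, p_value = lr_test_1df(-89.56, -84.86)
print(round(stat, 2), p_value)
```

Under these assumptions the statistic is 2 × 4.70 = 9.40, which is well beyond the conventional 0.05 critical value of 3.84 for one degree of freedom.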
The cubic spline model, as should be expected given its greater flexibility, appears to be the best fit. Accordingly, I will continue to include the time spline variable in subsequent models as I go on to discuss other aspects of the event history analysis of policy diffusion.

[5] This is easily done using the downloadable command “spline” for the software package Stata 7. Here I rescale the result by a factor of 100 for ease of comparison with the other duration approaches.

Modeling Diffusion

I now turn to the most theoretically interesting choice that the researcher must make: how to properly model the diffusion mechanism itself. Berry and Berry (1999) identify two relevant patterns of diffusion in state policy adoption: regional diffusion and national interaction models.[6] I take these categories as starting points for the discussion.

[6] They also identify a third and fourth: leader-laggard models and vertical influence. The former, as they point out, is either virtually impossible to specify or else is equivalent to one of the other mechanisms. The latter is only relevant to policies with incentives or direction from the Federal government to the states.

Within the category of regional diffusion models, Berry and Berry further specify two subsets: neighbor models and fixed-region models. In the former, policies are thought to spread to contiguous states due to their shared borders. This is the hypothesized diffusion mechanism in the state lottery example, and Berry and Berry support this approach by arguing that states directly compete for revenue, which can be lost to neighboring states as consumers drive across state lines to purchase lottery tickets. The neighbor model might also be appropriate, Berry and Berry argue, for other types of economic competition among states (1992). Estimates from a model using the neighboring states measure of diffusion are thus identical to those in the models presented above; for the sake of comparison with later models, I present again the results of the cubic spline duration dependence model with the number of neighbors as the diffusion measure, using the Berry and Berry (1990) data on state lottery adoptions, in the first column of Table 3.

Table 3 About Here

In contrast to the neighbor model, the fixed-region model does not constrain diffusion to occur only among contiguous states. Rather, states are assumed to have a positive probability of adopting policies that other states in the same geographical region have adopted. Mooney and Mei-Hsien (1995), for example, adopt this approach in their analysis of pre-Roe abortion policy. Regions can either be the familiar divisions in the United States (e.g., Northeast, Southeast) or they may be constructed empirically by factor analysis (Berry 1994; Walker 1969). In the second column of Table 3, I replicate the lottery analysis, replacing the covariate for the number of neighbors adopting with a new measure: the percentage of other states in state i’s region at time t that have adopted the lottery. I specify four approximately equal regions: Northeast, Southeast, Midwest, and West (which includes the Southwest). As the table shows, results are similar to the original model.

The national interaction model is a sort of epidemiological approach to diffusion at the national level. Pioneered by Gray (1973), this approach assumes that leaders of the states (governors, legislators, policy professionals) constitute a single social system through which ideas propagate in accordance with simple dynamics borrowed from communications theory (see Berry and Berry 1999: 172-174 for a more thorough explanation). An alternative (but perhaps equivalent, given certain assumptions at the micro-level) formulation is the “maturation effects” approach of Mintrom and Vergari (1998). In this model unspecified forces cause state leaderships to conclude that the “time has come” for certain policy ideas.
Figure 2 Here

If a national interaction process is occurring, policy diffusion will follow the familiar S-shaped or learning curve as states are slow to adopt at first, more rapidly adopt in the middle, then slow down again as the system becomes saturated (as almost all states have adopted). Figure 2 shows the empirical curve for the cumulative proportion of states adopting a lottery over time. The figure suggests that the saturation point has not been reached: there is no “flattening out” at the end of the observation period.[7]

[7] This is supported by the fact that since 1986, the last year of the lottery study, 10 more states have adopted a lottery.

How can one include a national interaction or maturation effect in an event history analysis of policy diffusion? Mintrom and Vergari choose to include dummy variables for years observed, but as I mention above, this is equivalent to allowing for unspecified duration dependence (which they do not model separately). Another possibility is including a count of the number of states in the entire nation (or the proportion of states) that have adopted the policy at time t. This is equivalent to the neighborhood model with the entire nation as the neighborhood. As in the case of the time counter, this total counter can be transformed (e.g., using logged or polynomial functions of the variable), or locally weighted or spline regression can be used to account for nonlinearities.

To illustrate this approach, I estimate a model identical to those discussed in the previous section with the addition of the residuals from a cubic spline regression of whether a state has adopted a lottery on the total number of states with a lottery at time t (denoted $\hat{c}_{i,t}$). Duration dependence ($\hat{\tau}_{i,t}$) is also included separately in the model using the cubic spline approach. The model is thus:

$$\text{Adopt}_{i,t} = \Phi(\beta_0 + x'_{i,t}\beta + \beta_8 \hat{\tau}_{i,t} + \beta_9 \hat{c}_{i,t})$$

Results of this model are presented in the third column of Table 3.
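The regional percentage and national count measures discussed above differ only in which set of states is treated as the reference group. The sketch below is a minimal illustration of that idea: the state labels, region assignments, and adoption years are hypothetical, the helper name `share_of_others_adopted` is an assumption, and counting adoptions made in or before year t (rather than strictly before it) is a coding choice the text does not settle.

```python
def share_of_others_adopted(state, year, members, adoption_year):
    """Fraction of the other states in `members` that adopted in or before `year`."""
    others = [s for s in members if s != state]
    if not others:
        return 0.0
    n_adopted = sum(
        1 for s in others
        if adoption_year[s] is not None and adoption_year[s] <= year
    )
    return n_adopted / len(others)


# Hypothetical data (assumption, for illustration only).
adoption_year = {"A": 1964, "B": 1966, "C": 1966, "D": None}
regions = {"East": ["A", "B"], "West": ["C", "D"]}
nation = sorted(adoption_year)

# Fixed-region measure: the reference group is the state's own region.
regional = share_of_others_adopted("D", 1966, regions["West"], adoption_year)
# National interaction measure: the reference group is the whole nation.
national = share_of_others_adopted("D", 1965, nation, adoption_year)

# Cumulative proportion of adopters by year, the quantity behind an S-curve plot.
cumulative = {
    year: sum(1 for y in adoption_year.values() if y is not None and y <= year)
    / len(adoption_year)
    for year in range(1964, 1968)
}
print(regional, national, cumulative)
```

Passing the full list of states as `members` reproduces the "entire nation as neighborhood" reading, while passing a region list gives the fixed-region measure.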
The coefficient on the total spline is not statistically significant, suggesting that this model of diffusion is not appropriate for the state lottery case.[8] I now turn to a brief discussion of the estimation of diffusion models with functional forms other than the probit used up until now.

[8] Note also that the correlation between the total spline and the time spline is 0.92.

Functional Forms

Yet another modeling choice that the researcher confronts when conducting discrete event history analysis is the selection of the appropriate functional form for the empirical model. In an ideal world, this choice would not matter. Unfortunately, as I demonstrate below, the choice of functional form can have a substantial effect on results when real data are analyzed.

Most political scientists are familiar with both the probit function and the logit function; indeed, most of the time they are used interchangeably in the analysis of dichotomous random variables. If $\pi_{i,t}$ is the probability that state i adopts a given policy at time t, then the probit function is:

$$\pi_{i,t} = \Phi(x'_{i,t}\beta)$$

where $\Phi$ is the cumulative distribution function of the standard normal, and the logit function is:

$$\log\frac{\pi_{i,t}}{1 - \pi_{i,t}} = x'_{i,t}\beta$$

A simple graph of the two functions, however, reveals that they are not identical. As Figure 3 illustrates (the third curve in the figure is discussed below), the logit function has “fatter tails” than the probit: it approaches 0 and 1 more slowly. In many applications this is unimportant; however, the difference can become a problem with a large number of observations or when many of the predicted probabilities are very small or very large (or, equivalently, if the unobserved continuous random variable has very large or small values). This latter case is precisely the problem in many discrete event history analyses: as units fail (or states adopt a policy) they are removed from the data set.
Adoption or failure is thus essentially a “rare event”: the number of observed 1’s in the dependent variable vector is small relative to the number of 0’s.

Figure 3 Here

Table 4 Here

As an example of the consequences of this choice, I present two identical models of state lottery diffusion (with duration dependence modeled by the time spline and diffusion by regional percentage) differing only in functional form. I do not calculate first derivatives or predicted probabilities to compare results; I only consider here the differences in the ratios of estimated coefficients to their standard errors. The results are in the first two columns of Table 4. As is evident, changing from a probit functional form to a logit has a substantial effect on the results of conventional hypothesis tests on the estimated values of the coefficients versus the null that they are equal to zero. Most prominently, both diffusion and single party control go from being significant at the 0.05 and 0.10 levels (respectively) to not statistically significant. From this narrow perspective, at least, choice of functional form matters.

Despite this issue, logit and probit are still considered appropriate for discrete event history analysis (Allison 1984; Box-Steffensmeier and Jones 2002). Nevertheless, there are several alternative functional forms worth considering. The first of these is the complementary log-log (or “cloglog”) function:

$$\pi_{i,t} = 1 - \exp[-\exp(x'_{i,t}\beta)]$$

As Figure 3 illustrates, this function is asymmetrical: it has a “fat tail” as it approaches 0 but it approaches 1 more quickly than either the logit or probit functions. This suggests that it may be more appropriate for the “rare event” discrete event history case. As the third column in Table 4 shows, estimates of the lottery diffusion model using the cloglog function are similar (in terms of what is conventionally significant) to those obtained using the logit estimator.
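The tail behavior described above can be checked numerically. The following sketch evaluates the three response functions at a few values of the linear predictor; the function names are hypothetical and the evaluation points are arbitrary choices made for illustration.

```python
import math


def inv_probit(eta):
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(eta / math.sqrt(2.0)))


def inv_logit(eta):
    """Logistic CDF."""
    return 1.0 / (1.0 + math.exp(-eta))


def inv_cloglog(eta):
    """Inverse of the complementary log-log link."""
    return 1.0 - math.exp(-math.exp(eta))


for eta in (-3.0, 0.0, 3.0):
    print(eta, inv_probit(eta), inv_logit(eta), inv_cloglog(eta))
```

At a linear predictor of -3 the cloglog and logit probabilities are both well above the probit probability, while at +3 the cloglog is closest to 1, followed by the probit and then the logit. The probit and logit are symmetric around 0.5 at a predictor of zero, whereas the cloglog is not.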
Another justification for using the complementary log-log function that Box-Steffensmeier and Jones (2002: 133) point out is that it is mathematically the discrete-time analog of the continuous-time Cox proportional hazards model, which is their preferred model for continuous duration analysis. Box-Steffensmeier and Jones also discuss the use of the conditional logit form of the Cox model (in which “ties” are resolved via the exact discrete method) as yet another alternative for discrete event history analysis.

For completeness, I also include one additional estimator for discrete event history data. As Western and Jackman (1994) and more recently Gill (2001) have discussed, the logic of frequentist statistical inference is on less than firm ground when it comes to analyses in which no random sampling is used. In all of the diffusion literature (and in all of the models estimated in this paper to this point), every attempt is made to gather data on all 48 contiguous states[9] for as long a time period as possible (usually from the first adoption of a policy until the date of the research). Given this fact, Gill proposes two solutions: present additional measures of variance and treat the estimates obtained as population parameters, or use a full Bayesian analysis and present results not as point estimates with standard errors but as marginal posterior distributions. I illustrate a simple version of the latter, again using the state lottery data with the time spline and regional percentage variables, and a logit link function. The model[10] is:

$$\text{adopt}_{i,t} \sim \text{Bernoulli}(\pi_{i,t}), \qquad \log\frac{\pi_{i,t}}{1 - \pi_{i,t}} = x'_{i,t}\beta$$

with uninformative priors on the covariates and missing values (which are estimated jointly with the other parameters). Results of a Markov chain Monte Carlo estimate of the model are presented in the last column of Table 4 (based on 2500 iterations with 500 discarded as burn-in). The table gives the posterior empirical means and the 2.5 and 97.5 percentiles.
Results are similar to the other estimators (although interpretation is, of course, quite different). Of particular interest is the posterior for the diffusion parameter: although the [2.5%, 97.5%] interval includes 0, most of the probability mass is positive (see Figure 4).

[9] Actually 46 in the lottery data due to listwise deletion of missing values for the party control variable.

[10] I include the code for estimating this model using the free package WinBUGS (Gilks, Thomas, and Spiegelhalter 1994; Spiegelhalter, Thomas, and Best 1999) in the Appendix.

Figure 4 Here

This simple Bayesian model barely begins to demonstrate the power of this approach for estimating complex models of diffusion, and for incorporating prior information to condition our posterior estimates. Unfortunately I do not have the space here to discuss this area in greater detail. I now turn to the final issue: spatial dependence and autocorrelation.

Considering Space

Tobler's (1979) First Law of Geography holds that “everything is related to everything else, but near things are more related than distant things.” Unfortunately, geography can also hamper the proper estimation of empirical models of diffusion. Why? In most of the examples presented above, I estimate robust standard errors clustered at the state level to account for the non-independence of observations. This, however, does not account for two additional problems.[11]

[11] Yet a third potential problem, spatial heteroscedasticity, is not considered.

First, just as in the case of duration dependence discussed above, unmeasured factors that vary with respect to geography may confound estimates of diffusion. As I demonstrate below, the solutions for this problem are quite similar to those introduced for modeling duration dependence.

Second, at the regional level, or in the immediate vicinity of each state, there is a different problem. Several of the “internal determinants” variables in the lottery model(s), such as lagged fiscal health, lagged per capita income, and even the percent of the population that adheres to a fundamentalist religion, are likely to be highly correlated within neighborhoods during certain years. The business cycle may affect the states differently, but nearby states often have similar economies and cultural histories. There is a large literature on accounting for or modeling this spatial autocorrelation in econometrics (see, for example, Anselin 1988). Unfortunately, there is little, if any, work on discrete event history analysis with spatial dependence. I conclude this paper with some thoughts on this issue.

Figure 5 Here

First, how do we know that either problem is a concern? A simple graphical analysis is revealing: Figure 5 shows the same cumulative proportion of states adopting a lottery versus time that is presented in Figure 2, but this time broken down by the four regions discussed above. The Northeast, in the lower left quadrant, exhibits the classic “S” or learning curve discussed above, and does so much more clearly than the national aggregate. The other three regions, however, each exhibit different diffusion patterns. Most striking is that of the Southeast: no state lotteries are adopted at all until the very end of the series.

I begin with the problem of unmodeled spatial dependence. As in the case of temporal (or duration) dependence, a possible strategy for accounting for this is the introduction of dichotomous variables, here for geographic region. The results of this are demonstrated in the first column of Table 5, as part of a model using the familiar state lottery covariates, the time spline, and the complementary log-log functional form. As the table illustrates, the inclusion of the geographic dummies leads to a failure to reject the null hypothesis (at conventional significance levels) that policy diffusion through the neighboring states mechanism is occurring.
The problem with geographic dummy variables is not the loss of degrees of freedom (as in the case of duration dependence) but the arbitrariness of the regional categories. I now turn to two methods of overcoming this weakness.

Table 5 Here

One alternative method of accounting for unmodeled differences in space, which avoids the arbitrary nature of dichotomous region variables, is a generalized geographical approach. Rather than constrain a model to a priori definitions of regions, I instead model the diffusion of policy using covariates for the physical location of the state capitals. Figure 6 shows the spatial data: x and y grid coordinates[12] for each city (referred to sometimes as “eastings” and “northings”). Once again, I introduce flexibility into the model by estimating cubic spline regressions on x and y and including the smoothed values in the model instead of the untransformed covariates. Figures 7a and 7b present the results of the spline regressions graphically. They suggest that the extreme east and west cities are more likely to adopt lotteries, as are cities that are more northern.

[12] I use arbitrary grid references instead of the more familiar latitude and longitude measures to avoid introducing nonlinearities due to spherical coordinates.

Figures 6a,b and 7 here

The second column of Table 5 includes these splines in the same cloglog model discussed above. Conditional on the other covariates, neither the smoothed spline values nor the neighboring states variable is found to be significant.

A final way of introducing geography to the model is the use of a single spline for both the x and y dimensions, such as a thin-plate spline (Hastie, Tibshirani, and Friedman 2001), which is the multidimensional analog of the cubic smoothing spline. Two thin-plate spline fits of x and y position to the lottery adoption variable are shown as contour maps in Figures 8a and 8b.
The first spline fit was constructed by placing five knots at five centroids of the data; the second uses 10 knots. As a comparison of the figures shows, the use of more knots allows for a more complex contour surface.

Figures 8a, b Here

The results of including the smoothed values of the two thin-plate spline regressions in the same cloglog model are presented in the last two columns of Table 5. In both cases, once again, the coefficient on the number of neighboring states adopting a lottery is not significant.

The final issue to discuss is the potential problem of spatial autocorrelation. In the context of the lottery data, the variables that appear to be suspect from a theoretical standpoint are lagged fiscal health, lagged income, and the percentage of the state population adhering to a fundamentalist religion. In Table 6, I present the results of a standard diagnostic, Moran’s I statistic (Anselin 1988), evaluated for each state capital using the 1964 cross-section (the only year to have a complete set of observations). The analysis is repeated for neighbors of each state in three concentric neighborhoods: 0-1000, 1000-2000, and 2000-3000 units (recall that the entire country is about 5000 × 7000 units).

Table 6 Here

As the results show, spatial autocorrelation is a problem for all three variables in the smallest neighborhood, in which states are positively correlated with their neighbors (unsurprisingly). Interestingly, the religion variable is also negatively correlated within the largest neighborhood.[13]

Accounting for this spatial autocorrelation in the context of a discrete duration model, however, is not as simple as diagnosing it. One possible approach is the use of a geographically weighted regression (Brunsdon, Fotheringham, and Charlton 1996), perhaps in conjunction with a generalized linear model with an appropriate (i.e., logit, probit, or cloglog) link function or a relatively new and more general Bayesian approach (LeSage Forthcoming).
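The Moran's I diagnostic with distance-band neighborhoods can be sketched as follows. This is a simplified illustration and not the procedure used for Table 6: it computes the global statistic with binary weights (1 if two capitals are within the band, 0 otherwise), and the coordinates, values, and function name `morans_i` are hypothetical.

```python
import math


def morans_i(coords, values, lo, hi):
    """Global Moran's I with binary weights for pairs at distance d, lo <= d < hi."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    cross = 0.0      # sum of w_ij * dev_i * dev_j over ordered pairs
    w_total = 0.0    # sum of all weights
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            d = math.hypot(coords[i][0] - coords[j][0], coords[i][1] - coords[j][1])
            if lo <= d < hi:
                w_total += 1.0
                cross += dev[i] * dev[j]
    if w_total == 0.0:
        raise ValueError("no pairs fall inside the distance band")
    return (n / w_total) * (cross / sum(d * d for d in dev))


# Four hypothetical capitals on a line, 800 grid units apart.
coords = [(0, 0), (800, 0), (1600, 0), (2400, 0)]
trend = morans_i(coords, [1.0, 2.0, 3.0, 4.0], 0, 1000)         # smooth gradient
alternating = morans_i(coords, [1.0, -1.0, 1.0, -1.0], 0, 1000)  # checkerboard
print(trend, alternating)
```

A smooth spatial gradient yields a positive statistic and an alternating pattern a negative one. Repeating the call with bands of 1000-2000 and 2000-3000 units mirrors the concentric neighborhoods described above.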
The problem with geographically weighted regression in this setting is that states that adopt the policy late, or not at all, during the period of observation contribute a disproportionate number of observations in space and may cause the results to be biased. To illustrate this approach (mindful of its shortcomings), I conclude this paper by comparing a linear probability duration model of the adoption of a state lottery (i.e., one that uses the least squares estimator instead of properly accounting for the dichotomous nature of the dependent variable) with a geographically weighted regression model in which local linear regressions are estimated (in a 1000-iteration Monte Carlo process) for each data point in space and the bandwidth parameter is estimated jointly with the covariates. The results are below in Table 7.

[13] I believe that this is an artifact of the size of the nation relative to the neighborhood size and the historical concentration of fundamentalists in the South.

Table 1: Comparing Models of Duration Dependence Using State Lottery Adoption Data (Berry and Berry 1990). Probit Function, Standard Errors in Parentheses

Variable | Berry and Berry Model* (No Duration Dependence) | Linear Duration Dependence | Log Duration Dependence | Cubic Spline Duration Dependence
Lagged Fiscal Health | -1.69 (1.22) | -2.52 (1.28) | -2.56 (1.25) | -2.40 (1.45)
Lagged Per Capita Income | 0.023 (0.027) | 0.008 (0.008) | 0.013 (0.007) | 0.013 (0.007)
Single Party Control | -0.40 (0.22) | -0.41 (0.21) | -0.41 (0.21) | -0.43 (0.21)
Gubernatorial Election Year | 0.82 (0.34) | 0.81 (0.36) | 0.84 (0.35) | 0.72 (0.37)
Neither Election Year nor Year After | 0.59 (0.34) | 0.60 (0.36) | 0.57 (0.35) | 0.57 (0.38)
Percentage Fundamentalists | -0.034 (0.019) | -0.07 (0.03) | -0.05 (0.02) | -0.07 (0.03)
Number of Neighboring States with Lottery | 0.27 (0.086) | 0.15 (0.10) | 0.18 (0.10) | 0.15 (0.09)
Time Counter | | 0.078 (0.022) | |
Natural Log of Time Counter | | | 0.49 (0.25) |
Time Spline | | | | 0.15 (0.03)
Constant | -4.51 (0.94) | -8.68 (1.50) | -4.57 (0.98) | -3.80 (0.89)
Log-Likelihood | -89.56 | -84.86 | -87.20 | -77.78
PCP | 97% | 97% | 97% | 97%
PRE | 0% | 3.7% | 3.7% | 0%

N = 857
* All models use robust standard errors clustered at the state level.

Figure 1: Comparing Methods of Accounting for Duration Dependency
[Plot of the Time Counter, ln(Time Counter), and Cubic Spline terms against Year]

Table 2: Comparing the Fit of Duration Dependency Models

Model | Coefficient for Number of Neighboring States (Standard Error) | Likelihood Ratio Statistic (d.f.)
No Duration Dependence | 0.27 (0.09)*** | N/A
Linear Function | 0.15 (0.10) | 9.4 (1)***
Natural Logarithm | 0.18 (0.10)** | 4.7 (1)**
Cubic Spline | 0.15 (0.09)* | 23.6 (1)***

* p < .10 (two-tailed for first column) ** p < .05 *** p < .01

Figure 2: Cumulative Proportion of All 48 Contiguous States Adopting a Lottery versus Time
[Plot of Cumulative Proportion Adopting Lottery against Year]

Table 3: Comparing Models of Diffusion Using State Lottery Adoption Data (Berry and Berry 1990). Probit Function, Standard Errors in Parentheses

Variable | Number of Neighbors | Regional Percentage | Nationwide Total (Cubic Spline)
Lagged Fiscal Health | -2.40 (1.45) | -2.78 (1.46) | -2.34 (1.40)
Lagged Per Capita Income | 0.013 (0.007) | 0.011 (0.007) | 0.015 (0.007)
Single Party Control | -0.43 (0.21) | -0.42 (0.21) | -0.40 (0.21)
Gubernatorial Election Year | 0.72 (0.37) | 0.70 (0.37) | 0.78 (0.38)
Neither Election Year nor Year After | 0.57 (0.38) | 0.57 (0.38) | 0.59 (0.38)
Percentage Fundamentalists | -0.07 (0.03) | -0.07 (0.02) | -0.08 (0.03)
Time Spline | 0.15 (0.03) | 0.14 (0.03) | 0.19 (0.08)
Number of Neighboring States with Lottery | 0.15 (0.09) | |
Regional Percentage | | 0.98 (.052) |
Nationwide Total Spline | | | -2.85 (6.89)
East Spline | | |
North Spline | | |
Constant | -3.80 (0.89) | -3.75 (0.89) | -3.85 (0.84)
Wald χ² (d.f.) | 61.18 (8) | 59.99 (8) | 47.90 (8)
PCP | 97% | 97% | 97%
PRE | 0% | 0% | -3.7%

N = 857
All models estimated with robust standard errors clustered on state.

Figure 3: Logit, Probit and Complementary Log-Log Compared
[Plot of the Probit, Cloglog, and Logit curves]

Table 4: Choice of Functional Form Matters for the Lottery Data

Variable | Probit | Logit | Complementary Log-Log | Bayesian Linear Model with Logit Link
Lagged Fiscal Health | -2.78* (1.46) | -5.80* (3.14) | -5.01* (2.67) | -6.25 [-11.15, -4.69]
Lagged Per Capita Income | 0.011 (0.007) | 0.02 (0.16) | 0.016 (0.14) | 0.02 [-0.003, 0.05]
Single Party Control | -0.42** (0.21) | -0.68 (0.48) | -0.52 (0.45) | -0.76 [-1.77, 0.07]
Gubernatorial Election Year | 0.70* (0.37) | 1.56* (0.83) | 1.44* (0.79) | 1.947 [0.19, 4.45]
Neither Election Year nor Year After | 0.57 (0.38) | 1.24 (0.87) | 1.19 (0.81) | 1.64 [-0.05, 4.19]
Percentage Fundamentalists | -0.07*** (0.02) | -0.14** (0.07) | -0.13** (0.07) | -0.16 [-0.29, -0.06]
Time Spline | 14.46*** (3.26) | 28.86*** (6.33) | 25.50*** (5.29) | 29.40 [17.32, 41.16]
Regional Percentage | 0.98* (.052) | 1.82 (1.19) | 1.65 (1.10) | 1.69 [-0.85, 4.19]
Constant | -3.75*** (0.89) | -7.14*** (1.80) | -6.52*** (1.50) | -7.93 [-11.15, -4.70]
Wald χ² (d.f.) | 59.99 (8) | 68.39 (8) | 88.30 (8) |
PCP | 97% | 97% | 97% |
PRE | 0% | 3.7% | 0% |

N = 857; N = 901 for the Bayesian model (missing data estimated jointly with parameters).
All classical models estimated with robust standard errors clustered on state.
Bayesian model results are the mean and [2.5%, 97.5%] of the posterior distributions.
* p < .10, two-tailed ** p < .05, two-tailed *** p < .01, two-tailed

Figure 4: Empirical Posterior Distribution of Regional Percentage Coefficient
[Density plot of b8; sample: 2001]

Figure 5: Lottery "Learning Curves" by Region
[Four panels (Midwest, West, Southeast, Northeast) plotting Cumulative Proportion Adopting Lottery against Year]

Figure 6: Grid Coordinate Map of State Capitals, by Region
[Map of capitals in the Northeast, Southeast, Midwest, and West; axes: Eastings and Northings]

Figure 7a and b: Splines for x and y Positions of Capital Cities
[Two panels: xspline against x; yspline against y]

Figures 8a and b: Thin-plate Spline Fit to Probability of Adopting State Lottery
[Two contour maps over x and y: knots at five centroids; knots at ten centroids]

Table 5: Accounting for Geography, Complementary Log-Log Function, Standard Errors in Parentheses

Variable | Region Dummies | Geographic Splines (x and y) | Thin-Plate Spline, Knots at 5 Centroids | Thin-Plate Spline, Knots at 10 Centroids
Lagged Fiscal Health | -3.62 (2.37) | -3.93 (2.41) | -3.94 (2.50) | -3.83 (2.50)
Lagged Per Capita Income | 0.02 (0.01) | 0.02 (0.015) | 0.02 (0.01) | 0.02 (0.01)
Single Party Control | -0.60 (0.43) | -0.56 (0.45) | -0.51 (0.45) | -0.50 (0.45)
Gubernatorial Election Year | 1.41 (0.79) | 1.42 (0.80) | 1.42 (0.79) | 1.41 (0.79)
Neither Election Year nor Year After | 1.24 (0.81) | 1.25 (0.82) | 1.25 (0.81) | 1.25 (0.81)
Percentage Fundamentalists | -0.12 (0.08) | -0.13 (0.07) | -0.13 (0.07) | -0.13 (0.07)
Number of Neighboring States Adopting Policy | 0.13 (0.16) | 0.16 (0.17) | 0.16 (0.16) | 0.13 (0.17)
Time Spline | 0.29 (0.06) | 0.28 (0.06) | 0.27 (0.05) | 0.28 (0.05)
Southeast | -0.86 (1.32) | | |
Midwest | -1.05 (0.71) | | |
West | -0.87 (0.76) | | |
East Spline | | 6.78 (13.52) | |
North Spline | | 5.37 (10.62) | |
Thin-Plate Spline | | | 7.58 (8.32) | 9.24 (7.11)
Constant | -6.27 (1.42) | -7.28 (1.70) | -7.37 (1.84) | -7.55 (1.79)
Log-likelihood | -78.4 | -78.9 | -79.1 | -78.9
PCP | 97% | 97% | 97% | 97%
PRE | 9% | 6% | 6% | 6%

N = 857
All models estimated with robust standard errors clustered on state.

Table 6: Spatial Autocorrelation Diagnostics

Variable | Distance | Moran's I | Standard Deviation
Lagged Fiscal Health | (0-1000] | 0.182* | 0.118
Lagged Fiscal Health | (1000-2000] | -0.029 | 0.062
Lagged Fiscal Health | (2000-3000] | -0.044 | 0.065
Lagged Income | (0-1000] | 0.473** | 0.116
Lagged Income | (1000-2000] | -0.098 | 0.060
Lagged Income | (2000-3000] | -0.038 | 0.064
Percentage Fundamentalist | (0-1000] | 0.516** | 0.112
Percentage Fundamentalist | (1000-2000] | 0.060 | 0.058
Percentage Fundamentalist | (2000-3000] | -0.280** | 0.062

Expected value of I = -0.021 for all tests.
* p < .05 ** p < .01

Table 7: Comparing Results of Geographically Weighted Regression to Linear Probability Model Estimates

Variable | Linear Probability Model Estimate (Standard Error) | Geographically Weighted Regression Simulation Mean (Standard Deviation)
Lagged Fiscal Health | -0.16 (0.07) | -0.14 (0.06)
Lagged Income | 0.0004 (0.0004) | 0.0004 (0.06)
Single Party Control | -0.02 (0.01) | -0.02 (0.009)
Gubernatorial Election Year | 0.03 (0.015) | 0.03 (0.02)
Neither Election Year nor Year After | 0.02 (0.01) | 0.01 (0.01)
Percentage Fundamentalists† | -0.001 (0.0004) | -0.001 (0.0008)
Number of Neighboring States with Lottery | 0.02 (0.007) | 0.01 (0.008)
Time Spline | 0.93 (0.19) | 1.0 (0.37)
Eastings | | 4852.4 (2324.6)
Northings | | 2323.2 (1160.5)
Constant | -0.04 (0.04) | -0.04 (0.02)

N = 857
† Null hypothesis of spatial nonstationarity rejected at less than the .01 level.

References

Allison, Paul D. 1984. Event History Analysis: Regression for Longitudinal Data. Newbury Park, Calif.: Sage.
Anselin, Luc. 1988. Spatial Econometrics: Methods and Models. Dordrecht: Kluwer.
Beck, Nathaniel, and Simon Jackman. 1998. Beyond Linearity by Default: Generalized Additive Models. American Journal of Political Science 42:596-627.
Beck, Nathaniel, Jonathan N. Katz, and Richard Tucker. 1998. Taking Time Seriously: Time-Series-Cross-Section Analysis with a Binary Dependent Variable. American Journal of Political Science 42 (October):1260-88.
Berry, Frances Stokes. 1994. Sizing Up State Policy Innovation Research. Policy Studies Journal 22 (3):442-456.
Berry, Frances Stokes, and William D. Berry. 1990. State Lottery Adoptions as Policy Innovations: An Event History Analysis. American Political Science Review 84 (2):395-415.
Berry, Frances Stokes, and William D. Berry. 1991. Specifying a Model of State Policy Innovation (Response). American Political Science Review 85 (2):573-9.
Berry, Frances Stokes, and William D. Berry. 1992. Tax Innovation in the States: Capitalizing on Political Opportunity. American Journal of Political Science 36 (3):715-742.
Berry, Frances Stokes, and William D. Berry. 1999. Innovation and Diffusion Models in Policy Research. In Theories of the Policy Process, edited by P. A. Sabatier. Boulder, Colorado: Westview.
Box-Steffensmeier, Janet M., and Bradford S. Jones. 2002. Timing and Political Change: Event History Modeling in Political Science. Ann Arbor: University of Michigan Press.
Brunsdon, C., A. S. Fotheringham, and A. Charlton. 1996. Geographically Weighted Regression: A Method for Exploring Spatial Non-Stationarity. Geographical Analysis 28:281-298.
Frant, Howard. 1991. Specifying a Model of State Policy Innovation. American Political Science Review 85 (2):571-3.
Gilks, W. R., A. Thomas, and David J. Spiegelhalter. 1994. A Language and Program for Complex Bayesian Modeling. The Statistician 43:169-178.
Gill, Jeff. 2001. Whose Variance is it Anyway? Interpreting Empirical Models with State-Level Data. State Politics and Policy Quarterly 1 (Fall):318-338.
Gray, Virginia. 1973. Innovation in the States: A Diffusion Study. American Political Science Review 67 (4):1174-1185.
Hastie, Trevor, Robert Tibshirani, and Jerome Friedman. 2001. The Elements of Statistical Learning, Springer Series in Statistics. New York: Springer.
LeSage, James P. Forthcoming. A Family of Geographically Weighted Regression Models. In Advances in Spatial Econometrics, edited by L. Anselin and J. G. M. Florax. New York: Springer-Verlag.
Mintrom, Michael. 1997. Policy Entrepreneurs and the Diffusion of Innovation. American Journal of Political Science 42:738-770.
Mintrom, Michael, and Sandra Vergari. 1998. Policy Networks and Innovation Diffusion: The Case of Education Reform. Journal of Politics 60 (1):126-148.
Mooney, Christopher Z., and Lee Mei-Hsien. 1995. Legislative Morality in the American States: The Case of Pre-Roe Abortion Regulation Reform. American Journal of Political Science 39 (3):599-627.
Spiegelhalter, David J., Andrew Thomas, and N. G. Best. 1999. WinBUGS Version 1.2 User Manual. Cambridge, U.K.: MRC Biostatistics Unit.
Tobler, W. 1979. Cellular Geography. In Philosophy in Geography, edited by S. Gale and G. Olsson. Dordrecht: Reidel.
Walker, Jack L. 1969. The Diffusion of Innovations among the States. American Political Science Review 63 (3):880-899.
Western, Bruce, and Simon Jackman. 1994. Bayesian Inference for Comparative Research. American Political Science Review 88:412-23.

Appendix: WinBUGS Code for Simple Bayesian Discrete Event History Model (Linear with Logit Link)

    model {
       for (i in 1:n) {
          # adoption indicator for each state-year observation
          adopt[i] ~ dbern(pi[i])
          logit(pi[i]) <- b0 + b1*lagfiscal[i] + b2*lagincome[i] + b3*party[i]
                             + b4*elect1[i] + b5*elect2[i] + b6*religion[i]
                             + b7*timespline[i] + b8*regionpercent[i]
          # distribution for party, so that its missing values are estimated
          party[i] ~ dbern(.5)
       }
       # diffuse normal priors (second argument of dnorm is the precision)
       b0 ~ dnorm(0, 0.001)
       b1 ~ dnorm(0, 0.001)
       b2 ~ dnorm(0, 0.001)
       b3 ~ dnorm(0, 0.001)
       b4 ~ dnorm(0, 0.001)
       b5 ~ dnorm(0, 0.001)
       b6 ~ dnorm(0, 0.001)
       b7 ~ dnorm(0, 0.001)
       b8 ~ dnorm(0, 0.001)
    }

[Data and starting values omitted]
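To make explicit what the WinBUGS model above encodes, the following sketch writes out its unnormalised log posterior in Python: a Bernoulli likelihood with a logit link plus independent normal priors on each coefficient. BUGS parameterises dnorm by precision, so dnorm(0, 0.001) corresponds to a prior variance of 1000. The function names, argument layout, and toy inputs are assumptions for illustration only. The sketch leaves out the `party[i] ~ dbern(.5)` line, which handles missing values of that covariate, and it performs no sampling.

```python
import math


def softplus(eta):
    """Numerically stable log(1 + exp(eta))."""
    return max(eta, 0.0) + math.log1p(math.exp(-abs(eta)))


def log_posterior(beta, X, y, prior_precision=0.001):
    """Unnormalised log posterior of a Bernoulli-logit model.

    beta: [b0, b1, ..., bk], intercept first.
    X:    list of covariate rows, each of length k (no intercept column).
    y:    list of 0/1 outcomes (1 = adoption in that state-year).
    """
    logp = 0.0
    for row, outcome in zip(X, y):
        eta = beta[0] + sum(b * x for b, x in zip(beta[1:], row))
        # log Bernoulli probability under the logit link:
        # y * eta - log(1 + exp(eta))
        logp += outcome * eta - softplus(eta)
    # Independent N(0, 1 / prior_precision) priors, up to an additive constant
    logp += sum(-0.5 * prior_precision * b * b for b in beta)
    return logp


# Hypothetical toy inputs: two observations, one covariate.
toy_X = [[1.0], [2.0]]
toy_y = [0, 1]
at_zero = log_posterior([0.0, 0.0], toy_X, toy_y)
```

With all coefficients at zero every observation has probability one half, so the value reduces to minus the number of observations times log 2. A sampler such as the one WinBUGS provides explores this surface rather than maximising it, which is why Table 4 reports posterior means and intervals for the Bayesian column.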