Optimal Policy with Endogenous Signal Extraction

Esther Hauk∗ Andrea Lanteri† Albert Marcet‡§

February 2014

Preliminary and incomplete

Abstract

This paper studies optimal policy in models with multidimensional uncertainty and endogenous observables. We first consider a very general setup where the policy-maker does not observe the realisations of the shocks that hit the economy, but only some aggregate variables that are endogenous with respect to policy, so that standard first order conditions do not hold. We derive first order conditions of optimality from first principles and we illustrate why the estimation of the state of the economy cannot be separated from the determination of the optimal policy. In an optimal fiscal policy application with incomplete markets and endogenous Partial Information, we find that the optimal policy response to aggregate data can be quite non-linear: it calls for tax smoothing across states in normal times, but in some cases for a strong adjustment of fiscal positions during a slump. We show that policies that disregard the endogeneity of the filtering problem, and hence these non-linearities, can be quite wrong. Finally, our model can rationalise the fiscal response of some European countries to the Great Recession: a slow reaction, followed by large deficits and a delayed sharp fiscal adjustment that protracts the downturn.

∗ IAE-CSIC and BGSE, Campus UAB, 08193 Bellaterra (Barcelona); email: [email protected]
† Department of Economics and Centre for Macroeconomics, LSE, Houghton Street, London WC2A 2AE; email: [email protected]
‡ ICREA, IAE-CSIC and BGSE, Campus UAB, 08193 Bellaterra (Barcelona); email: [email protected]
§ The authors appreciate very helpful comments from Wouter den Haan, Robert Shimer, Jaume Ventura and other participants at the LSE Macro work-in-progress seminar, the IAE-CSIC Macro work-in-progress seminar and the Barcelona GSE Winter Workshop 2013.
"In the policy world, there is a very strong notion that if we only knew the state of the economy today, it would be a simple matter to decide what the policy should be. The notion is that we do not know the state of the system today, and it is all very uncertain and very hazy whether the economy is improving or getting worse or what is happening. Because of that, the notion goes, we are not sure what the policy setting should be today. [...] In the research world, it is just the opposite. The typical presumption is that one knows the state of the system at a point in time. There is nothing hazy or difficult about inferring the state of the system in most models." (James Bullard, interview in Review of Economic Dynamics, November 2013)

1 Introduction

In this paper we aim to build a bridge between the policy world, which is characterized by a large amount of uncertainty about the current state of the economy, and academic research on optimal policy, which has typically separated the issue of making policy decisions from that of inferring the state of the economy from the data. Although the problem of signal extraction played a key role in many research papers in the initial stages of the Rational Expectations revolution during the '70s and early '80s, that interest dwindled considerably, perhaps because finding an optimal policy under limited information when observables are endogenous is still a problem that has not been solved in general.¹ The aim of this paper is to address this problem. We design optimal policy when some relevant shocks that hit the economy are not observable and aggregate observable variables are themselves endogenous with respect to policy decisions, so that policy-makers should take into account the feedback between observables and control variables when making their decisions. We also draw some conclusions about the welfare importance of taking Partial Information seriously.
To give a concrete example of key policy decisions made under uncertainty about the state of the economy and based on endogenous aggregate variables, consider the recent financial crisis. As it unfolded, the policy discussion often hinged on whether the recession was due to a permanent shock (perhaps due to a lower expected productive capacity of the economy) or to a temporary shock (perhaps due to a fall in demand). All we knew for sure in November 2008 was that employment and output were low, and yet the G20 decided in its Washington meeting to take aggressive expansionary fiscal policy measures in order to reactivate the economy, spending 2% of world GDP, presumably adhering to the idea that the shock was temporary.² But from this relatively optimistic initial estimate, several governments came to the conclusion that a fiscal adjustment was necessary after observing larger than expected deficits. Many countries have gone through tax hikes and austerity in spending since then. All along, the decisions on fiscal policy were taken observing the behaviour of output and employment. In turn, these variables clearly depend on whether an expansionary fiscal policy or austerity is adopted. How to deal with this endogeneity is the main focus of this paper.

¹ See section 2 for a detailed discussion.

In order to address this question, we solve for optimal policy in models with multidimensional uncertainty and endogenous observables. We first examine a two-period version of Lucas and Stokey (1983), as we consider it the most standard dynamic model of fiscal policy. To model limited information, we introduce two shocks (to demand and to supply), and to make the issue relevant we assume incomplete insurance markets. Then we solve for optimal Ramsey taxation under the assumption that the government does not observe the realizations of the shocks, but only some endogenous Partial Information. We first derive a first order condition (FOC) for the optimal policy relying on first principles.
The difficulty here is that the policy choice affects the distribution of observables, so that standard first order conditions do not hold. The optimality condition we find modifies the standard FOC found in dynamic stochastic models: the probabilities of each state of nature need to be weighted by a kernel that depends on the effect of policy on the observed variable. This illustrates why the so-called "separation principle" of Kalman filter models fails: we cannot separate estimation and optimization, since the objective conditional probabilities of underlying shocks are not the proper weights of the derivative of welfare in each state. The FOC is derived for a general model so that our technique can be widely applied. We show some cases where the optimality condition coincides with the standard one. For example, we show why in a linear framework with additive shocks the endogeneity induced by Partial Information does not really matter, as has already been discussed in the literature.³

² Coincidentally, the Spanish Finance Minister at the time, Pedro Solbes, published an article in El País on the 10th of November 2013 under the title "Cuando decidí salir del Gobierno" explaining how the outcome of this meeting eventually led to the current Spanish debt crisis and near-default in summer 2012.

The Ramsey optimal fiscal policy derived in the model is of interest in its own right. It dictates optimal tax smoothing across states even in a case where under Full Information taxes would be volatile. This is because one implication of the optimal policy is to average all the possible contingencies using weights that depend on how reactive observables are to policy and how the different shocks interact. We show a case where the implied policy reaction to fiscal adjustments to an economic crisis is very non-linear.
In particular, if future taxes may be close to the maximum of the Laffer curve, as in some European economies in the current crisis, a government may go very quickly from not reacting to low expected output to reacting very strongly, the reason being that there is a range of observables that may imply a very large increase in taxes in the future in order to avoid default. The stark and sudden adjustment only occurs once policy-makers start fearing that a fiscal crisis may materialize under the worst possible realizations of the shocks, but importantly this inference is endogenous to the output observed and the policy chosen. This may rationalize why some countries reacted slowly to the crisis and went for austerity with a delay.

Finally, we build an infinite-horizon model that confirms the main results developed in the paper. In some cases, the Ramsey government under Partial Information reacts slowly to a downturn. Interestingly, the delayed fiscal adjustment induces a longer recession relative to the Full Information benchmark.

Our main contribution is to provide a general solution to the endogenous signal extraction and optimization problem when there is no separation of any kind between these two problems. Furthermore, we explore the welfare consequences of policies that disregard the endogeneity of the filtering problem. We show that alternative policies based on averages of the more standard Full Information policy, which would be correct in linear models with exogenous filtering problems, turn out to be very incorrect. We present an example where these policies would induce tax cuts in the region of observables where the optimal policy calls for a strong tax increase. We also show that linear approximations, which are quite standard in the literature, can be quite misleading: in our fiscal policy model the correct solution is highly non-linear in nature.

³ See section 2 below.

The remainder of the paper is organized as follows.
The related literature is discussed in Section 2. Section 3 introduces our two-period optimal fiscal policy model with incomplete markets and Partial Information. Section 4 provides the general first order condition, compares the Full Information solution with the Partial Information solution and discusses the case of "invertibility", when the two solutions coincide. In Section 5 we apply these first order conditions to our Ramsey fiscal policy problem under different assumptions on preferences and government expenditure levels. Section 6 is dedicated to robustness. Section 7 presents the infinite-horizon model. Section 8 concludes.

2 Related literature

Policy making under Partial Information is hardly a new topic in macroeconomics. In the '70s and '80s the issue of limited information was central to the development of Rational Expectations (RE) models. In his seminal paper on RE, Lucas (1972) formulated the signal extraction problem when agents do not observe nominal and real shocks but only a combination of the two. Many other papers used limited or asymmetric information.⁴ The issue of endogeneity of observables was often present in the literature, and many papers discussed the validity of conditioning choices on fundamental shocks, whether or not prices revealed fundamental information, and whether asymmetric information RE equilibria could be reached.

Although adopted reluctantly at first, the standard approach in macroeconomics nowadays is to build models where all variables and prices are solved conditional on some observed fundamentals. In their seminal paper on recursive competitive equilibrium, Mehra and Prescott (1980) assumed that state variables were either directly observable or an invertible function of observables. From then on, the literature has tended to make this assumption, disregarding the problem of Partial Information.
Limited information is no longer a central issue, in part because it is well known that for linear models with additively separable errors a "separation" principle holds, guaranteeing that the agent can first solve the signal extraction problem (usually using the Kalman filter) and then solve for the optimal policy conditional on the estimate. Optimal policy under separation just requires a re-definition of the fundamental variables, so that the Kalman filter estimate becomes the observed variable.

⁴ For example all the papers with unobserved transitory and permanent shocks.

Other papers address the issue of how price-taking agents solve signal extraction problems where prices are observed. In Townsend (1983) firms face a signal extraction problem about shocks influencing prices. Prices are endogenous to the model, but since these prices are taken as given by firms there is separation in the firms' minds. Relatedly, Guerrieri and Shimer (2013) analyze a competitive market where traders have multidimensional private information. In a model with dispersed information, Angeletos and Pavan (2010) look at a problem of optimal policy and show how its contingency on aggregate variables can affect the way information is distributed in the economy and how agents respond to it. In our model, the only uninformed agent is the government, so that there are no information externalities. However, the government does not benefit from any type of separation, as it fully understands the endogeneity of the distribution of observable variables.

Mirman et al. (1993) consider the problem of a monopolist who sets quantities under imperfect information about the demand schedule, which is hit by two different unobserved shocks. They assume that the monopolist observes the induced price only after an equilibrium has been realized, and hence they abstract from the issue of conditioning on endogenous variables. However, they show how current decisions affect the distribution of future observables.
Relatedly, Wieland (2000a, 2000b) considers optimal policy when there is limited information, in a class of armed-bandit problems. In this case there is separation conditional on past endogenous variables so that, although the government's choice affects the information revealed by the equilibrium variables, there is no influence between today's choice variable and the distribution of today's observable.

In our setup the signal extraction problem is contemporaneously endogenous to the government's decision. The literature on this topic is scarce and is restricted to the linear model with additively separable Gaussian shocks. Pearlman et al. (1986), Pearlman (1992), and Svensson and Woodford (2003) point out that with Partial Information on truly forward-looking variables the separation principle might fail, because the forward-looking variables depend on what policy is going to be like in the future. However, they show that under linearity, normality and symmetry of information, a separation principle with a modified Kalman filter continues to hold. Baxter et al. (2007, 2011) derive an "endogenous Kalman filter" for all these cases, which is equivalent to the solution of a standard Kalman filter of a parallel problem where all the states and the measurement are fully exogenous. The only exception to the separation principle is Svensson and Woodford (2004) where, as in our paper, the government's information set is a subset of the private sector's information set. This paper shows that, in spite of the failure of the separation principle, there is a suitable modification of the standard Kalman filter that works, thanks to linearity and additively separable shocks. Nimark (2008) applies this technique to a problem of monetary policy where the central bank uses data from the yield curve while at the same time understanding that it is affecting them.
Our contribution to this literature is to provide a general solution to the endogenous signal extraction and optimization problem. In our model there is no separation of any kind, as the shocks are allowed to enter non-linearly in the equilibrium conditions. In this setup, linear approximations can be quite misleading: we show a model where the correct solution is highly non-linear in nature.⁵

The literature on optimal contracts under private information is perhaps less directly related to our work. This literature usually assumes revelation of the private information conditional on the equilibrium actions ("invertibility") and introduces the assumption that agents react strategically to the optimal contingent policy set up by the principal (in our case the policy function R chosen by the government). Instead, we consider a setup where this reaction (endogenously) does not take place because agents are atomistic. However, our results should be useful to study models of optimal contracts under private and limited information (without invertibility), as the endogeneity of the observed variable that we address would also be present in that kind of model.

⁵ Optimal non-linear policies have been found in the literature but for totally different reasons. Swanson (2006) obtains a non-linear policy when he relaxes the assumption of normality in the linear model with separable shocks. He considers a model where the separation principle applies. The non-linearity results entirely from Bayesian updating on the a priori non-Gaussian shocks. Orphanides and Wieland (2000) obtain an optimal non-linear policy response even in a world with perfect certainty by relaxing the standard assumptions on optimal monetary policies that policy makers' preferences are quadratic and the economy is linear.
3 A simple model of optimal fiscal policy

Before coming to the general problem of optimal policy with endogenous signal extraction, we present a simple model that we will use to illustrate the problem and its solution. We consider a very simple two-period version of the fiscal policy model in Lucas and Stokey (1983) with incomplete insurance markets. A Ramsey government needs to finance an exogenous and deterministic stream of expenditure $(g_1, g_2)$, where subscripts indicate time periods, using distortionary income taxes $(\tau_1, \tau_2)$ and bonds $b$ issued in the first period that promise a repayment in second period consumption units.

The economy is populated by a continuum of agents. Each agent $i \in [0, 1]$ has utility function

\[ U(c_1^i, l_1^i, c_2^i, l_2^i) = \gamma u(c_1^i) - v(l_1^i) + \beta \left[ u(c_2^i) - v(l_2^i) \right] \tag{1} \]

where $c_t^i$ and $l_t^i$ for $t = 1, 2$ are consumption and hours worked respectively, with $u' > 0$, $u'' < 0$, $v' > 0$, $v'' > 0$. $\gamma$ is a temporary preference shock, which we will refer to as a demand shock, with probability density function $f_\gamma(\gamma)$. When the realization of the demand shock is high, agents like first period consumption relatively more. As will be clear in the following, this will make them willing to work more in their intratemporal labour-consumption decision and also more impatient in their intertemporal allocation of consumption. Vice versa, low $\gamma$ will lead to lower labour and more patient agents, ceteris paribus. Given that agents are identical, in the following we can describe the problem of a representative agent and drop the index $i$ for notational convenience.

The production function is linear in labour, and output is given by $y_t = \theta_t l_t$. We assume that $\theta_1$ is equal to the realization of a random variable $\theta$ with probability density function $f_\theta(\theta)$ and we will refer to it as the productivity shock.
As far as $\theta_2$ is concerned, we will distinguish two cases: one where $\theta_2 = \theta_1$, in which case the productivity shock is permanent, and one where $\theta_2 = E\theta$, that is, second period productivity is a known constant equal to the mean of the first period shock, in which case the productivity shock is temporary.

Firms' profit maximization implies that agents receive a wage equal to $\theta_t$, so that the period budget constraints of the representative agent are

\[ c_1 + q b = \theta_1 l_1 (1 - \tau_1) \tag{2} \]

and

\[ c_2 = \theta_2 l_2 (1 - \tau_2) + b \tag{3} \]

where $q$ is the price of the government discount bond $b$. Maximization of (1) subject to (2) and (3) implies a standard labour-consumption margin condition for each period and a consumption Euler equation for bonds:

\[ \frac{v'(l_1)}{u'(c_1)} = \theta \gamma (1 - \tau_1) \tag{4} \]

\[ \frac{v'(l_2)}{u'(c_2)} = \theta_2 (1 - \tau_2) \tag{5} \]

\[ q = \beta \frac{u'(c_2)}{\gamma u'(c_1)} \tag{6} \]

As anticipated, the demand shock enters the first period labour supply decision described by (4) as well as the bond pricing equation (6). A description of a competitive equilibrium is completed by imposing the resource constraint in both periods:

\[ c_t + g_t = \theta_t l_t. \tag{7} \]

A Ramsey government chooses a sequence of taxes and a bond issuance in order to maximize (1) subject to the competitive equilibrium conditions (2), (3), (4), (5) and (7). The chosen policy will be a function of the government's information set at $t = 1$ (note that all the uncertainty is resolved after the first period and second period taxes will have to balance the budget). Before specifying our assumptions on this information set, let us show how in general this Ramsey problem can be reduced to the choice of a first period tax rate, subject to a single constraint that summarizes the reaction of the private sector to government policy.

First, by combining the budget constraints (2) and (3) with the first order conditions (4), (5) and (6), we can derive the implementability constraint of the Ramsey problem:

\[ \gamma u'(c_1) c_1 - v'(l_1) l_1 + \beta \left[ u'(c_2) c_2 - v'(l_2) l_2 \right] = 0. \tag{8} \]

Now, note that we can use the resource constraints (7) to substitute out the consumption terms in (8) to obtain an equation of the form

\[ G(l_1, l_2, \gamma, \theta) = 0, \tag{9} \]

where we have left implicit that $\theta_2$ will either be equal to $\theta$ or to a known constant, depending on the assumption on the persistence of the productivity shock. Equation (9) implicitly defines a function that maps a choice of first period labour $l_1$ into a second period labour $l_2$. Call this map $L_2^{imp}$. The government and the private agents must understand that, given a realization of the shocks, for any choice of $l_1$ there is only one value of $l_2$, namely $L_2^{imp}(l_1)$, that is consistent with the competitive equilibrium conditions. Under some specific assumptions on $u$ and $v$, it will be possible to solve for $l_2$ as a function of $l_1$ in closed form. More generally, it will always be possible to characterize the marginal effect of $l_1$ on $l_2$ by applying the implicit function theorem to (9). With this in mind, we can define a new objective function of the Ramsey problem as follows:

\[ W(l, \theta, \gamma) \equiv U\left(\theta l - g_1,\; l,\; \theta_2 L_2^{imp}(l) - g_2,\; L_2^{imp}(l)\right). \tag{10} \]

Note that by using all the equilibrium conditions, we have expressed utility as a function of first period labour only, for which we have suppressed the time subscript.

Finally, let us define the reaction function of the private sector to a first period tax rate $\tau_1$. Using (7) for $t = 1$ to substitute out consumption in (4), we get

\[ \frac{v'(l_1)}{u'(\theta_1 l_1 - g_1)} - \theta \gamma (1 - \tau_1) = 0, \tag{11} \]

which implicitly defines first period labour as a function of the first period tax rate and the realizations of the shocks:

\[ l = h(\tau, \theta, \gamma) \tag{12} \]

where we have again suppressed the time subscript from first period labour and tax rate. Hence the Ramsey optimal policy can be characterized as the choice of $\tau$ that maximizes (10) subject to (12).
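As an illustration of how the reaction function in (12) can be evaluated numerically, the following minimal sketch solves (11) for first period labour by bisection. The functional forms $u(c) = \log c$ and $v(l) = l^2/2$, the function name `reaction_labour` and the bracketing bounds are assumptions made only for this example; they are not the specification used in the paper.

```python
import math


def reaction_labour(tau, theta, gamma, g1, hi=50.0, iterations=200):
    """Solve equation (11) for l, ASSUMING u(c) = log(c) and v(l) = l**2 / 2.

    With these assumed forms v'(l) = l and u'(c) = 1 / c, so (11) becomes
    l * (theta * l - g1) - theta * gamma * (1 - tau) = 0.
    """
    def residual(l):
        c = theta * l - g1  # resource constraint (7) at t = 1
        return l * c - theta * gamma * (1.0 - tau)

    lo = g1 / theta + 1e-9  # consumption must stay positive
    # The residual is negative at lo (for tau < 1) and increasing above it.
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if residual(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for tau in (0.2, 0.3, 0.4):
        print(tau, reaction_labour(tau, theta=1.0, gamma=1.0, g1=0.2))
```

Under these assumed preferences (11) is a quadratic in $l$, so the bisection result can be checked against the positive root $l = \left(g_1 + \sqrt{g_1^2 + 4\theta^2\gamma(1-\tau)}\right)/(2\theta)$, which is decreasing in $\tau$.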
This formulation implicitly takes into account that the future allocation and tax rate will then be determined by the equilibrium conditions described above.

The government will choose a tax policy contingent on its information set $I^G$ in order to maximize (10) subject to (12). We will consider different assumptions on the elements of $I^G$. The standard assumption in the Ramsey taxation literature is that of Full Information (FI), which in our model implies that both the government and the agent observe the realization of $(\theta, \gamma)$ at $t = 1$ and are allowed to make their decisions based on this observation, so that $I^G$ includes $(\theta, \gamma)$ as well as all the parameters of the model (preferences and government expenditure). However, the aim of this paper is to characterize the solution to the Partial Information (PI) problem, when only one aggregate endogenous variable, say labour or output, is observed, instead of the two exogenous shocks. Hence the elements of $I^G$ will be only this endogenous variable and all the parameters of the model.

The problem of finding the optimal policy contingent on Partial Information is complicated by a key issue: the distribution of the observable variable is endogenous to the choice of policy, as is clear from (12). Hence, the problem cannot be treated as a standard problem of optimization under uncertainty, where the distribution of the shocks is exogenous and conditioning on these shocks allows one to ignore the uncertainty at the time of taking first order conditions. Here, the government must take into account that a certain policy will imply a certain distribution for $l_1$, which in turn is going to be the argument of the policy function. This simultaneity issue cannot be dealt with using standard signal extraction methods, which are well suited for exogenous or predetermined observables, and calls for a new technique.
In the next section, we will solve a slightly more general model under both informational assumptions (FI and PI) and develop the endogenous filtering technique.

4 General first order conditions

We first derive the general solution to our policy problem for any welfare function $W(\tau, l, A)$ of the government and any reaction function $l = h(\tau, A)$ of the atomistic consumers, where $A$ refers to the exogenous shocks. Note that in the model presented in the previous section, $A = (\theta, \gamma)$ and $\tau$ does not directly affect welfare, but only indirectly through the allocation. In this more general formulation welfare can directly depend on taxes, e.g. the government might dislike taxes. We will refer to all first period variables without a subindex. Hence the shocks are $\theta$ and $\gamma$, the first period tax rate is $\tau$ and the first period labour supply is $l$. Under this formulation the government maximizes welfare by choosing a function $R$ of the observable variables to set the tax rate.

4.1 Full Information

Under FI, both the agent and the government observe the shocks in period 1. This implies that the policy function for $\tau$ will have the form $\tau = R^{FI}(A)$, i.e. the government chooses $\tau$ contingent on the exogenous shock $A$. In the fiscal policy example, we have $A = (\theta, \gamma)$. Denote by $\Phi$ the space of possible values of $A$. Formally, the government chooses $R^{FI} : \Phi \to \mathbb{R}_+$ to solve

\[ \max_{\{R^{FI} : \Phi \to \mathbb{R}_+\}} E\left[ W\left(R^{FI}(A),\; h(R^{FI}(A), A),\; A\right) \right] \tag{13} \]

It is easy to find the FOC for this problem, since the argument of the function $R^{FI}$ (namely $A$) has a known exogenous distribution, independent of the choice of $R^{FI}$. This is the standard case in macro models, and it can be determined by standard methods that the optimal policy function $R^{FI*}$ is formed by choosing $R^{FI*}(A) = \tau$, where $\tau$ satisfies the FOC

\[ W_\tau\left(\tau, h(\tau, A), A\right) + W_l\left(\tau, h(\tau, A), A\right) h_\tau(\tau, A) = 0 \tag{14} \]

for all $A \in \Phi$, where $W_\tau$ and $W_l$ refer to the first derivatives of $W$ with respect to $\tau$ and $l$ respectively, while $h_\tau$ refers to the derivative of $h$ with respect to $\tau$.
The optimal policy function $R^{FI*}(A)$ is found by solving for $\tau$ in equation (14) for each $A$. Note that uncertainty does not really play a role in this setup: it just indexes the solutions by the realized values of $A$.

In the model presented in the previous section, the FI policy solves

\[ W_l(l, \theta, \gamma) = 0 \tag{15} \]

for all realizations of $(\theta, \gamma)$. The implied tax function $\tau^{FI}(\theta, \gamma)$ is then obtained by inverting the reaction function $h$. This is the so-called primal approach to the Ramsey optimal taxation problem under FI. The FI policy is one of tax smoothing over time, as the government wants to spread the distortions equally over the two periods. In the case of CRRA preferences (both $u$ and $v$ are power functions), tax smoothing will be perfect and the government will choose the constant tax rate $\tau$ that solves the intertemporal budget constraint

\[ \tau \theta_1 l_1 - g_1 + \beta \frac{u'(c_2)}{\gamma u'(c_1)} \left( \tau \theta_2 l_2 - g_2 \right) = 0. \tag{16} \]

It is clear from (16) that the government needs to know the realization of both the productivity and the demand shock in order to implement this policy. In particular, the realization of $\theta$ is a crucial piece of information, as it determines the revenue that a given tax rate is going to raise, while the demand shock affects the interest rate the government will have to pay on its debt (or receive on its assets). Furthermore, both shocks clearly contribute to the determination of an allocation $(c_1, c_2, l_1, l_2)$.

4.2 Partial Information

We now consider the more interesting case that arises whenever the government cannot observe the exogenous shocks, but only some endogenous variables, while the agents observe the shocks that hit their preferences and wages. In particular, we will make the assumption that the only observable variable for the government is labour (and we will consider output as observable in the robustness section). The observability of labour makes the optimal fiscal policy particularly interesting for a number of reasons.
First of all, note that tax revenue in the first period is $\tau_1 \theta_1 l_1$. This means that at the time of setting taxes, the government is uncertain about the revenue that a given policy will generate, as $\theta_1$ is not learned with certainty at $t = 1$. Arguably, this is a crucial feature of actual fiscal policy decisions, as very frequent revisions of the official forecasts for fiscal deficits seem to confirm. Second, employment data have a higher frequency than other macroeconomic data, like output and its components. Hence it seems sensible to ask how the government should set its policy based on this type of information.

Optimal behaviour under PI implies that the government chooses a different $\tau$ depending on the observed $l$. Therefore, the government now chooses a function $R : \mathbb{R}_+ \to \mathbb{R}_+$ and, given an observed employment level $l$, will set the tax rate

\[ \tau = R(l). \tag{17} \]

Optimal behaviour under uncertainty means that the government makes contingent plans, one plan for each realization; it therefore chooses the best among all "reaction" functions of the form (17).

Let us call $L(R, A)$ the observable $l$ (a random variable) induced by the shock $A$ and a policy $R$. This will be the equilibrium value of $l$, defined implicitly as the zero of a function $H$ defined as follows:

\[ H(l, A, R) \equiv l - h(R(l), A) = 0. \tag{18} \]

In words, the government knows that a given $R$ implies that the employment level chosen by agents is the random variable $L(R, \cdot) : \Phi \to \mathbb{R}_+$ given by the private sector's reaction function evaluated at the tax implied by $R$. The tax rate is then given by

\[ T(R, A) = R\left(L(R, A)\right). \tag{19} \]

Notice the distinction between $L$, $T$ and $R$. The latter is a function of $l$, while $L$ and $T$ are functions of $R$ and the realizations of the shocks. Let

\[ \mathcal{F}(R) \equiv E\left[ W\left(T(R, A), L(R, A), A\right) \right] \tag{20} \]

be the objective function for a given choice of $R$. We can now re-define the PI problem as⁶

\[ \max_{\{R : \mathbb{R}_+ \to \mathbb{R}_+\}} \mathcal{F}(R) \tag{21} \]

and denote its solution by $R^*$.
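The equilibrium objects $L(R, A)$ and $T(R, A)$ in (18) and (19) can be computed by root-finding once $h$ and $R$ are specified. The sketch below is only illustrative: it assumes $u(c) = \log c$ and $v(l) = l^2/2$ (which give $h$ in closed form), a fixed $g_1 = 0.2$, and a hypothetical, non-optimal policy rule `example_rule`; none of these choices comes from the paper.

```python
import math

G1 = 0.2  # assumed first period government expenditure (illustrative value)


def h(tau, theta, gamma):
    """Reaction function (12) in closed form, ASSUMING u = log and v(l) = l**2 / 2."""
    root = math.sqrt(G1 ** 2 + 4.0 * theta ** 2 * gamma * (1.0 - tau))
    return (G1 + root) / (2.0 * theta)


def example_rule(l):
    """Hypothetical policy rule R(l): increasing in l and clipped to [0, 0.9]."""
    return min(max(0.1 + 0.2 * l, 0.0), 0.9)


def equilibrium_labour(R, theta, gamma, lo=0.0, hi=10.0, iterations=200):
    """L(R, A): the zero of H(l, A, R) = l - h(R(l), A) from (18), by bisection."""
    def H(l):
        return l - h(R(l), theta, gamma)

    if not (H(lo) < 0.0 < H(hi)):
        raise ValueError("bracket does not contain an equilibrium")
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if H(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def equilibrium_tax(R, theta, gamma):
    """T(R, A) = R(L(R, A)) as in (19)."""
    return R(equilibrium_labour(R, theta, gamma))


if __name__ == "__main__":
    for theta, gamma in [(1.0, 1.0), (0.8, 1.2), (1.2, 0.8)]:
        l = equilibrium_labour(example_rule, theta, gamma)
        print(theta, gamma, l, equilibrium_tax(example_rule, theta, gamma))
```

Because $h$ is decreasing in $\tau$ and the assumed rule is non-decreasing in $l$, $H$ is increasing in $l$ and the equilibrium is unique within the bracket. This matters because the variational argument requires a unique equilibrium $(\tau, l)$ for each realization.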
4.2.1 Case 1: Invertibility

It turns out that there are cases where, even under PI, the government can still implement the FI policy. This is the case whenever the information set of the government is invertible, allowing it to learn the true state of the economy at $t = 1$.

To formally define Invertibility, consider the manifold in $\mathbb{R}_+^2$ that describes all possible $(\tau, l)$ at the solution of the FI problem. This is the set

\[ M^* \equiv \left\{ (\tau, l) \in \mathbb{R}_+^2 : \tau = R^{FI*}(A) \text{ and } l = h(\tau, A) \text{ for some } A \in \Phi \right\}. \tag{22} \]

Definition 1 Invertibility holds if for any $l$ such that $(\tau, l) \in M^*$ for some $\tau$, there exists a unique $\tau$ such that $(\tau, l) \in M^*$.

⁶ Notice that $\mathcal{F}$ maps the space of functions into $\mathbb{R}$. The expectation operator integrates over realizations of $A$ using the true exogenous distribution of $A$, so that the above objective function is mathematically well defined given the above definitions for $T$ and $L$ and under standard boundedness conditions.

Remark 1 Invertibility is automatically satisfied
• when $A$ is one-dimensional and $h\left(R^{FI*}(A), A\right)$ is a monotonic function of $A$;
• when $\Phi$ is a finite set. Then the dimensionality of $\Phi$ does not matter: we can expect to be able to map an equilibrium into the shock, since only by coincidence would the same equilibrium point $(\tau, l)$ occur for two different realizations of $A$ if $A$ can only take finitely many values.⁷

We prove in Appendix A that

Proposition 1 Under invertibility $R^* = R^{FI*}$.

To illustrate a case of Invertibility in the model presented in the previous section, assume that $\gamma = 1$ with certainty, while $\theta$ is random and observed only by the agent at $t = 1$ (and by the government at $t = 2$). Assume also that the government observes only $l_1$ at $t = 1$ and hence can make its policy contingent only on labour. This is only apparently a PI problem.
As long as $h(\tau^{FI}(\theta, 1), \theta, 1)$ is invertible with respect to θ, observing the labour choice is equivalent to observing the true exogenous state θ, and hence the government can implement the FI policy.

4.2.2 Case 2: The general case

The most interesting case of PI arises when the information set of the government is not invertible, meaning that the endogenous observables are not sufficient to back out the actual realizations of the shocks. Observe that

Remark 2 Invertibility is generally violated if A is multi-dimensional and has a continuous distribution, or if $h(R^{FI*}(\cdot), \cdot)$ is non-monotonic.

To solve (21) in the general case we need to go back to first principles and use a variational argument: we take a deviation from the optimal solution in any possible direction and then derive the optimal policy by exploiting the fact that the optimal deviation is no deviation.

Footnote 7: An exception is Wallace (1992), who defines the discrete supports of shocks in order to get non-invertibility even with a finite set Φ. However, this is clearly a degenerate case.

Assume that for the optimal choice $R^*$ there is a unique equilibrium (τ, l) for each realization. Take any function δ : R+ → R and a constant α ∈ ℜ. We now consider reaction functions of the form $R^* + \alpha\delta$. We consider only δ's for which $R^* + \alpha\delta$ has a unique equilibrium (τ, l) for any realization A if α is close enough to zero. Fix δ and consider solving the problem

$$\max_{\alpha \in \Re} \mathcal{F}(R^* + \alpha\delta). \tag{23}$$

In other words, we now maximize over small deviations from the optimal reaction function in the direction determined by δ. It is clear that

$$0 \in \arg\max_{\alpha \in \Re} \mathcal{F}(R^* + \alpha\delta) \tag{24}$$

$$\mathcal{F}(R^*) = \max_{\alpha \in \Re} \mathcal{F}(R^* + \alpha\delta) \tag{25}$$

because, since $R^* + \alpha\delta$ is feasible in the PI problem, if 0 were not a maximizer this would contradict the fact that $R^*$ is optimal for the PI problem.
Since 0 solves the one-dimensional maximization problem (23), the FOC of that problem gives

$$\left. \frac{d\mathcal{F}(R^* + \alpha\delta)}{d\alpha} \right|_{\alpha=0} = 0. \tag{26}$$

We now compute this one-dimensional derivative and evaluate it at α = 0:

$$\frac{d\mathcal{F}(R^* + \alpha\delta)}{d\alpha} = \frac{dE\big(W(T(R^* + \alpha\delta, A), L(R^* + \alpha\delta, A), A)\big)}{d\alpha} \tag{27}$$

$$= \frac{d \int_\Phi W(T(R^* + \alpha\delta, A), L(R^* + \alpha\delta, A), A) \, dF_A(A)}{d\alpha}. \tag{28}$$

In Appendix B we show that this derivative evaluated at α = 0 is

$$\int_\Phi \left\{ \left[ W_\tau^* R^{*\prime} + W_l^* \right] L'_{0,\delta} + W_\tau^* \delta \right\} dF_A(A) = 0 \tag{29}$$

where it is understood that $W_\tau^*$, $W_l^*$, δ and $R^{*\prime}$ are evaluated at equilibrium optimal choices and

$$L'_{0,\delta} = \left. \frac{dL(R^* + \alpha\delta, A)}{d\alpha} \right|_{\alpha=0},$$

which is the only part that needs to be determined in (29). To compute $L'_{0,\delta}$ we apply the implicit function theorem to the function H defined in (18) to obtain

$$\left. \frac{dL(R + \alpha\delta, A)}{d\alpha} \right|_{\alpha=0} = \frac{\delta(L(R, A)) \, h_\tau}{1 - h_\tau R'}.$$

Plugging this into (29) and rearranging, we can conclude that for any variation δ

$$\int_\Phi \frac{W_\tau^* + W_l^* h_\tau^*}{1 - h_\tau^* R^{*\prime}} \, \delta \, dF_A(A) = 0. \tag{30}$$

Now we derive implications of these FOC only in terms of primitives of the problem. Consider any l which has positive density. Formally, given l and some ε > 0, define the set of realizations for which equilibrium labour is within ε of l:

$$B \equiv \left\{ A \in \Phi : L(R^*, A) \in (l - \varepsilon, l + \varepsilon) \right\}.$$

Note that B is indexed by $R^*$, l and ε. Now, assume h∗, R∗′ are differentiable at l for all A. Assume also that these derivatives are bounded for all realizations in B. Assume that l has positive density in equilibrium, that is, Prob(B) > 0 for any ε > 0. Take δ to be the indicator function (see footnote 8) of the set (l − ε, l + ε); then (30) becomes

$$\int_B \frac{W_\tau^* + W_l^* h_\tau^*}{1 - h_\tau^* R^{*\prime}} \, dF_A = 0 \tag{31}$$

and letting ε → 0 we have that

$$E\left( \left. \frac{W_\tau^* + W_l^* h_\tau^*}{1 - h_\tau^* R^{*\prime}} \, \right| \, L(R^*, A) = l \right) = 0. \tag{32}$$

This will hold for any l that can be an equilibrium and where the differentiability assumption holds.

This says the following: ideally the government would like to reach the FI optimum and set $W_\tau^* + W_l^* h_\tau^* = 0$ for all realizations.
But this is an impossible task: due to PI, a given choice $R^*$ is compatible with various A and therefore with various values of $W_\tau^* + W_l^* h_\tau^*$ at l. All these derivatives cannot be made equal to zero by one single choice of the number $R^*(l)$.

In the first order condition (32), $1/(1 - h_\tau^* R^{*\prime})$ acts as a kernel in expectations. Without this kernel we would get the familiar condition of optimization under uncertainty, where the government would simply set $E\left( W_\tau^* + W_l^* h_\tau^* \mid L(R^*, A) = l \right) = 0$. Under Partial Information, however, the density $f_l$ is endogenous to the function R. Therefore, when choosing R the government has to take into account how marginal changes in R affect $f_l$. This is captured by $1/(1 - h_\tau^* R^{*\prime})$.

Footnote 8: Strictly speaking we need a smoothed version of the indicator function, since we assumed differentiability in step (27). This is a standard problem with a standard solution in variational arguments.

The FOC given by (32) looks difficult to handle since it involves $R^{*\prime}$, but with a final step we can obtain a simpler expression without $R^{*\prime}$. All we need is to compute the distribution of the shocks conditional on $\bar{l}$, that is, we need to determine the endogenous filter. To do this, note that we started by defining our objective function as an integral over realizations of A, that is, a double integral over the support of θ's and γ's. However, once we condition on $\bar{l}$, there is really only one source of uncertainty left, say θ, and the other shock γ is uniquely determined by equilibrium. To see this, let us spell out θ and γ in the definition of the function H:

$$H(l, \theta, \gamma, R) \equiv l - h(R(l), \theta, \gamma). \tag{33}$$

For each $\bar{l}$, (33) defines an implicit function $\gamma = \tilde{\gamma}(\bar{l}, \theta, R)$ that maps a realization of θ into the corresponding γ, for a given policy. Hence, conditional on $\bar{l}$, we are now integrating over a line, the locus of (θ, γ) consistent with $\bar{l}$ and the chosen policy.
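The locus just described can be traced numerically. The following is a minimal sketch under stated assumptions, not the paper's implementation: it borrows the linear-quadratic reaction function from (39), for which (33) can be solved in closed form as γ̃ = B·l̄ / (θ(1 − τ)), and it uses hypothetical uniform supports and an illustrative value of B.

```python
B = 6.75                           # illustrative disutility parameter (assumption)
THETA_MIN, THETA_MAX = 2.7, 3.3    # hypothetical support of theta
GAMMA_MIN, GAMMA_MAX = 0.95, 1.05  # hypothetical support of gamma


def h(tau, theta, gamma):
    """Linear-quadratic reaction function, as in (39)."""
    return gamma * theta * (1.0 - tau) / B


def gamma_tilde(l_bar, theta, tau):
    """Solve H = l_bar - h(tau, theta, gamma) = 0 for gamma (closed form here)."""
    return B * l_bar / (theta * (1.0 - tau))


def admissible_locus(l_bar, tau, n=7):
    """Grid points (theta, gamma) consistent with observing l_bar at tax tau.

    A theta is kept only if the gamma it implies lies in the prior support.
    """
    locus = []
    for i in range(n):
        theta = THETA_MIN + i * (THETA_MAX - THETA_MIN) / (n - 1)
        gamma = gamma_tilde(l_bar, theta, tau)
        if GAMMA_MIN <= gamma <= GAMMA_MAX:
            locus.append((theta, gamma))
    return locus


locus = admissible_locus(l_bar=0.33, tau=0.25)
```

Every pair on the locus reproduces the same observed labour, so conditional on l̄ the remaining uncertainty is one-dimensional. In this linear-quadratic case the product θγ is constant along the locus, so γ̃ is decreasing in θ.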
Therefore, in (32) we can integrate over, say, θ's only and we get (see footnote 9)

$$\int_{\Theta(\bar{l}, R)} \frac{W_\tau^* + W_l^* h_\tau^*}{1 - h_\tau^* R^{*\prime}} \, f_{\theta | \bar{l}} \, d\theta = 0 \tag{34}$$

where $\Theta(\bar{l}, R)$ is the set of θ's with positive density conditional on observing $\bar{l}$, γ is $\tilde{\gamma}(\bar{l}, \theta, R^*)$ for all θ in the integral, and $W_\tau^*$, $W_l^*$ and $h_\tau^*$ are evaluated at $(\bar{l}, \theta, \tilde{\gamma}(\bar{l}, \theta, R^*))$. To find $f_{\theta | \bar{l}}$ we apply Bayes' rule

$$f_{\theta | l} = \frac{f_{l | \theta} \, f_\theta}{f_l} \quad \forall \, l, \theta. \tag{35}$$

Now, to derive the density of l conditional on θ, consider that in equilibrium l is a function of γ, namely $L(R^*, \theta, \gamma)$, and observe that by definition $\tilde{\gamma}$ is the inverse of L with respect to γ. Hence we just need to apply the change of variable rule to get the density of the endogenous random variable l as a function of the density of the exogenous random variable γ:

$$f_{l | \theta}(\bar{l}, \theta) = f_\gamma\big(\tilde{\gamma}(\bar{l}, \theta)\big) \, \tilde{\gamma}_l(\bar{l}, \theta) \tag{36}$$

where $\tilde{\gamma}_l$ is the partial derivative of $\tilde{\gamma}$ with respect to $\bar{l}$. Also, we have that $f_{l | \theta}(\bar{l}, \theta) = 0$ if $\theta \notin \Theta(\bar{l}, R^*)$.

Footnote 9: This is without loss of generality. Clearly, we could equivalently integrate over γ's and use H to define an implicit function $\tilde{\theta}(\bar{l}, \gamma, R)$.

Finally, in order to compute the partial derivative $\tilde{\gamma}_l$, we apply once again the implicit function theorem to H and get

$$\tilde{\gamma}_l(\bar{l}, \theta, R^*) = \frac{1 - h_\tau^* R^{*\prime}}{h_\gamma^*}. \tag{37}$$

Plugging (37) into (36), using Bayes' rule and the fact that the denominator $f_l(\bar{l})$ drops out in the first order condition, we obtain the main result, contained in the following Proposition.

Proposition 2 The first order condition of the PI problem is given by

$$\int_{\Theta(\bar{l}, R)} \frac{W_\tau^* + W_l^* h_\tau^*}{h_\gamma^*} \, f_\gamma(\tilde{\gamma}^*) \, f_\theta(\theta) \, d\theta = 0 \quad \text{for all } \bar{l} \tag{38}$$

where stars denote that the partial derivatives and γ are evaluated at the optimal policy for a given $\bar{l}$.

Note that in the special cases where shocks enter the reaction function in an additively separable fashion, this expression simplifies significantly and we have the following corollary.
Corollary 1 If the second cross derivative of the reaction function with respect to the two shocks satisfies $h_{\gamma\theta} = 0$ at $\bar{l}$, then it is optimal to just average the Full Information FOCs using the prior distribution of the shocks.

This is the case in linear models with additively separable shocks (e.g. the New Keynesian optimal monetary policy model). It is also the case in models where only one of the shocks enters the reaction function linearly, which is sufficient to have $h_\gamma$ independent of θ. Otherwise, the non-linearities imply that the prior is reshaped using a kernel, which now has the simple form $1/h_\gamma^*$.

It may be worth emphasizing the failure of the separation principle in this derivation. As can be seen in equation (37), the derivative of the optimal policy function enters the expression for the derivative of $\tilde{\gamma}$ with respect to the observable, which in turn affects the kernel used to weight contingencies in the determination of the policy function. Also, the set of shock realizations that are compatible with a given $\bar{l}$ is itself a function of R, so the government affects the possible contingencies that will give rise to a certain observation and also the density of that observation. Hence it is clear that there cannot be any separation between the stage of estimating the state conditional on observables and the stage of solving the optimal policy problem.

4.2.3 Algorithm for Partial Information

Given (38), the PI solution can be calculated with the following numerical algorithm. We first discretize the support of the shocks (θ, γ) and the support of l. Then, at each level of l, we solve the first order condition (38). We then take into account that the support of (θ, γ) is endogenous with respect to the choice of τ. In other words, we iterate to find a fixed point of the mapping between (i) a policy R that solves (38) at each l for a given conditional distribution and (ii) the conditional distribution of the shocks consistent with R at each l.
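The iteration described above can be sketched as follows. This is a toy illustration under loudly stated assumptions, not the authors' implementation. The reaction function is the linear-quadratic one from (39), so that the kernel 1/h_γ is proportional to 1/θ. The priors are uniform on hypothetical supports, so they cancel inside the support. The numerator W_τ + W_l h_τ depends on parts of the model not reproduced here, so it is replaced by a purely hypothetical placeholder, `tau_fi_placeholder(theta) - tau`. All names, parameter values and the damping scheme are assumptions.

```python
B = 6.75                           # illustrative value (assumption)
THETA_MIN, THETA_MAX = 2.7, 3.3    # hypothetical prior support of theta
GAMMA_MIN, GAMMA_MAX = 0.95, 1.05  # hypothetical prior support of gamma


def support(l_bar, tau):
    """Theta(l_bar, R): thetas whose implied gamma lies in the prior support."""
    k = B * l_bar / (1.0 - tau)    # theta * gamma implied by (l_bar, tau)
    return max(THETA_MIN, k / GAMMA_MAX), min(THETA_MAX, k / GAMMA_MIN)


def tau_fi_placeholder(theta):
    """Hypothetical tax that would zero the FI first order condition at theta."""
    return 0.30 - 0.05 * (theta - 3.0)


def solve_foc(lo, hi, n=2000):
    """Step (i): solve the toy version of (38) for tau on a fixed support.

    With numerator tau_fi_placeholder(theta) - tau and weights 1/theta, the
    root is a weighted average, computed here with the midpoint rule.
    """
    step = (hi - lo) / n
    num = den = 0.0
    for i in range(n):
        theta = lo + (i + 0.5) * step
        w = 1.0 / theta            # kernel 1/h_gamma up to a constant
        num += w * tau_fi_placeholder(theta)
        den += w
    return num / den


def pi_policy(l_bar, tau0=0.25, damping=0.5, tol=1e-12, max_iter=500):
    """Iterate between the policy at l_bar and the support it induces."""
    tau = tau0
    for _ in range(max_iter):
        lo, hi = support(l_bar, tau)            # step (ii): support given tau
        if hi <= lo:
            raise ValueError("l_bar is not consistent with any shock pair")
        tau_new = solve_foc(lo, hi)             # step (i): policy given support
        if abs(tau_new - tau) < tol:
            return tau_new
        tau = damping * tau_new + (1.0 - damping) * tau
    raise RuntimeError("fixed-point iteration did not converge")


labour_grid = [0.31 + 0.005 * i for i in range(9)]
policy = {l: pi_policy(l) for l in labour_grid}
```

Because (38) no longer contains R∗′, the fixed point can be computed separately at each grid point for l. The damping factor is a convenience for stability and is not taken from the paper.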
5 Computations

Let us now illustrate the solution of the optimal fiscal policy model introduced above. We will proceed by introducing different assumptions on preferences and persistence of shocks and show how they affect the optimal policy.

5.1 Linear-quadratic utility

First of all, consider the case u(c) = c and $v(l) = \frac{B}{2} l^2$, that is, linear utility from consumption and quadratic disutility from labour effort. This allows us to derive simple analytical expressions for the reaction function and its derivatives. In particular, it is easy to see from the first order condition (4) that the reaction function (12) specializes to

$$l = h(\tau, \theta, \gamma) = \frac{\gamma\theta}{B}(1 - \tau), \tag{39}$$

implying that both the productivity shock and the demand shock affect the slope of labour supply with respect to the tax rate, $h_\tau$, hence making this model non-linear (in the sense of not having linearly additive shocks), despite the reaction function being linear in taxes. The two partial derivatives $h_\tau$ and $h_\gamma$ are also easily obtained:

$$h_\tau(\tau, \theta, \gamma) = -\frac{\gamma\theta}{B} \tag{40}$$

$$h_\gamma(\tau, \theta, \gamma) = \frac{\theta}{B}(1 - \tau). \tag{41}$$

Let θ be uniformly distributed on a support $[\theta_{min}, \theta_{max}]$ with θ2 = θ1 (permanent shock), let γ be uniformly distributed on $[\gamma_{min}, \gamma_{max}]$, and assume β = .96, B calibrated to get average hours equal to a third, and government expenditure constant and equal to 25% of average output.

By observing a certain $\bar{l}$ and imposing a tax rate $\bar{\tau}$, all the government can infer is a certain realization of the product of the two shocks, but not the individual realizations of the shocks. Hence the government is uncertain whether, say, productivity is high and demand low, or vice versa, and in general there is a continuum of realizations $(\bar{\theta}, \tilde{\gamma}(\bar{\theta}, \bar{l}; R))$ consistent with the observation of $\bar{l}$ and a policy R.

Figure 1 illustrates the optimal policy for this case, plotting the tax rate against observed labour. The red line is $R^*$, while the yellow region is the set of all equilibrium pairs $(l^{FI}, \tau^{FI})$ that could have been realized under Full Information.
For the lowest labour that is realized under FI, the government knows that choosing $\tau^{FI}(\theta_{min}, \gamma_{min})$ is optimal, as that observation, combined with this policy, allows full revelation of the state. Hence, PI and FI coincide. The same is true for the highest admissible labour, which implies the FI equilibrium for $(\theta_{max}, \gamma_{max})$. In between these two extremes there is no full revelation, and it can be seen that the optimal policy calls for a tax rate in between the minimum and the maximum FI policies for each observation (but it is sometimes far from being the average of those tax rates). In general, $R^*$ is decreasing, as higher observed labour suggests a higher conditional expectation for productivity, hence allowing the government to balance the intertemporal budget constraint with a lower distortionary tax.

Finally, the figure compares the optimal policy with a linear policy obtained by connecting the two full revelation points with a straight line. While the optimal policy is not quite linear, in this example a linear approximation would not be too wrong. We will see below that this property is not robust; in particular, it does not survive changes in preferences.

[Figure 1: Optimal policy with linear-quadratic utility. Tax rate τ against observed labour l.]

Figure 2 illustrates the combinations of shocks consistent with observing an average realization of labour (l = .33), that is, we plot the function $\tilde{\gamma}(\theta, .33; R^*)$. As anticipated, this function is decreasing, as the product of the productivity and demand shocks consistent with an observation of l and a tax rate τ must be constant. This line is contrasted with the same object under FI for the same level of labour.

5.2 Log-quadratic utility

We now introduce curvature in the utility from consumption. Clearly, this induces both risk aversion and a wealth effect on labour supply, as the marginal utility of consumption now enters the reaction function h.
For simplicity we study another special case that allows for an analytical reaction function h. Assume u(c) = log(c) and again $v(l) = \frac{B}{2} l^2$. In order to make the PI problem more interesting, let us also assume that the productivity shock is temporary and θ2 is known to be equal to the mean of θ (see footnote 10). Now the first order condition (4) becomes

$$B l_1 c_1 = \gamma \theta_1 (1 - \tau_1) \tag{42}$$

and after substituting out consumption using the resource constraint, we obtain that labour supply is the positive root of a quadratic equation, so that (12) specializes to

$$l = h(\tau, \theta, \gamma) = \frac{B g_1 + \sqrt{(B g_1)^2 + 4 B \theta^2 \gamma (1 - \tau)}}{2 B \theta}. \tag{43}$$

Footnote 10: A permanent θ combined with these preferences leads to equilibrium labour being independent of θ under the FI tax policy.

[Figure 2: Set of admissible shocks. γ against θ, PI and FI.]

It is important to note that now, differently from the linear-quadratic case, θ has two opposing effects: the substitution effect between leisure and consumption (as before) and the wealth effect, which acts in the opposite direction. With log-quadratic preferences the second effect dominates, and hence high realizations of θ will lead to low labour, ceteris paribus.

With this in mind, let us illustrate the optimal policy in figure 3 (red line), once again contrasted with the set of FI equilibria. For low labour, the government now learns that productivity must be high, so the tax rate can be rather low. The lowest labour realization leads to the FI equilibrium for $(\theta_{max}, \gamma_{min})$. Then taxes start to increase: higher l's signal lower expected productivity and hence revenue, as the set of admissible θ's gradually includes lower and lower realizations. This goes on up to the point where the set of admissible θ's conditional on l is the whole set $[\theta_{min}, \theta_{max}]$. From that point on, going to the right, the tax rate changes slope and becomes decreasing. This is because now, with any θ being possible, increasing l signals an increasing expected revenue, hence allowing lower tax rates, up to the point where the highest θ's start being ruled out. At that point the policy becomes increasing again, up to the full revelation point $(\theta_{min}, \gamma_{max})$.

[Figure 3: Optimal policy with log-quadratic utility. Tax rate τ against observed labour l.]

Consistently with the description above, the wealth effect of productivity makes the locus of admissible realizations of shocks for $\bar{l} = .33$ an increasing function in the (θ, γ) space, as shown in figure 4. Now, conditional on l, we can have combinations of high productivity (low wealth effect on labour supply) and high demand, or low productivity and low demand.

Optimal policy with PI calls for a substantial smoothing of taxes across states. This can be seen in figure 5, where the equilibrium cumulative distribution function of tax rates under PI (red line) is contrasted with the one obtained under FI (blue dotted line). This result is rather intuitive and it carries a general lesson for optimal fiscal policy decisions under uncertainty: when the government is not sure about what type of disturbance is hitting the economy, it seems sensible to choose a policy that is not too aggressive in any direction and just aims at keeping the budget under control on average.

[Figure 4: Set of admissible shocks. γ against θ, PI and FI.]

In our model, this smoothing of taxes across states will imply a larger variance of tax rates in the second period with respect to the FI policy. In the second period, all the uncertainty is resolved and the tax rate will be whatever is needed to balance the budget constraint.
This is of course taken into account at the time of choosing a policy under uncertainty, so that we could say that optimal policy is very prudent while the source of the observed aggregate variables is not known, and then responsive after uncertainty has been resolved. In this sense, this model can rationalize the slow reaction of some governments to big shocks like the current recession. The Spanish example in the latest recession is a case in point. In 2008, it was far from clear how persistent the downturn would be and also whether it was demand-driven or productivity-driven, and the government did not adjust its fiscal stance quickly, only to make large adjustments in the subsequent years.

[Figure 5: Equilibrium CDF of tax rates. $F_\tau$ against τ, PI and FI.]

5.3 Close to the top of the Laffer curve

Let us now look at the case where government expenditure is very high, equal to 60% of average output in both periods (see footnote 11). We will see that this leads to a very non-linear optimal policy and to an exception to tax-smoothing across states. The government needs to be able to balance the budget in the second period and is thus now very concerned about the amount of debt that will need to be issued for a given tax, as high debt, combined with high future expenditure, may call for very high taxes in the future, getting the economy closer to the top of the Laffer curve, where taxation is most distortionary and hence consumption is very low.

Footnote 11: All other assumptions on preferences and shocks are the same as in the previous section.

Figure 6 shows optimal policy for this case (red line), again contrasted with the set of FI outcomes (yellow region). For sufficiently low observed labour, the government learns that only sufficiently high realizations of θ are admissible. Consistently, the tax rate can be rather low. However, there is a threshold $\bar{l}$ at which suddenly the worst possible outcome in terms of productivity becomes consistent with the observation. At that point, the government fears that if tax revenue is not sufficiently high in the first period, then a high debt will need to be issued and the economy will reach the top of the Laffer curve in the second period. This calls for raising taxes substantially in the first period, and it can be seen that the optimal policy is very steeply increasing. Then, for high l's, optimal policy becomes smooth again at a higher level of taxation.

[Figure 6: Optimal policy with high government expenditure. Tax rate τ against observed labour l.]

In order to gain further intuition on the reason for this strong non-linearity in the optimal policy, imagine a case with only two possible values of productivity, high and low. The FI Ramsey policy would call for high taxes if productivity is low and low taxes if it is high. One can think that for most realisations of the demand shock, the government would be able to infer the realisation of the productivity shock, but for a very small intermediate range of demand shocks both realisations of the productivity shock are possible. Clearly, the PI government would like to be able to replicate the FI policy, and hence it will do so when this is possible. Now, take the limit in which the intermediate range of γ's that generate confusion shrinks to just a single point. At that point, the PI government needs to jump from the FI policy for high θ to the FI policy for low θ, and this generates a sharp non-linearity.

However, this is not the end of the story: now that this PI policy is chosen, the set of admissible shocks conditional on l becomes itself different from the one obtained under FI. In fact, this example allows us to see the fixed point between optimal taxes and conditional distributions in action. When the observed l is high, the government fears low θ and raises taxes accordingly.
However, at high tax rates the agent would like to work less, and if high l is actually observed, then this must mean that the wealth effect has been very strong, that is, productivity is very low, confirming the government's belief. In this way an optimal policy and a conditional distribution of shocks consistent with it confirm each other in equilibrium.

Figure 7 shows the conditional loci of the shocks for a low and a high level of labour. It can be seen that for high labour there is a wider range of productivities, including low levels that would bring the economy closer to the top of the Laffer curve. Figure 8 illustrates two possible Laffer curves, for high and low productivity scenarios conditional on an average level of observed labour, showing the potential fall in revenue that the government would face without an adequate fiscal adjustment in the first period.

[Figure 7: Set of admissible shocks with high government expenditure. γ against θ, for low l and high l.]

[Figure 8: Laffer curves. Revenue against τ, for low θ and high θ.]

Increasing the level of government expenditure shows that optimal policy with PI can be very non-linear in order to avoid the worst outcomes. In normal times, policy has to be smooth, but when there are contingencies that are particularly dangerous for agents, optimal policy calls for being very reactive to observables in order to prevent those cases from materializing. This is exemplified by the optimality of increasing taxes steeply in the first period to avoid having to distort the economy too heavily in the second period if realized productivity turns out to be low (and hence the fiscal deficit turns out to be high).
This lesson seems relevant for the understanding of the fiscal policy reaction to the financial crisis in 2008 and afterwards, especially in countries like Spain and Italy, which arguably were in danger of getting close to the top of the Laffer curve, as testified by the fact that significant increases in taxes after 2009 did not raise as much revenue as these governments desired.

5.4 Approximations and the endogenous filter

As can be seen from figures 3 and 6, the optimal policy can be very non-linear and hence very different from a policy that simply connects the two full revelation points (at the lowest and highest l) with a straight line. This suggests that linearization may be misleading in problems of optimal policy with endogenous Partial Information.

In this subsection, we explore alternative suboptimal policies and we show how well they approximate the optimal policy derived above. It turns out that policies that disregard the endogeneity of the filtering problem can look very different from the optimal one. However, an approximation of the optimal policy that takes into account the endogeneity of the supports of the shocks but does not weight them properly turns out to be a sufficiently good approximation in many examples.

First let us consider the following two policies that disregard the endogeneity of the filtering problem:

1. $R^{FI,av}(l) = E\big(R^{FI}(\theta, \gamma) \mid l^{FI}(\theta, \gamma) = l\big)$
2. $R^{FI,ce}(l) = R^{FI}\big(E(\theta, \gamma \mid l^{FI}(\theta, \gamma) = l)\big)$

We refer to the first as the average of Full Information taxes for each realized l, and to the second as the certainty equivalence policy, which applies the Full Information policy to the expectation of the state conditional on each l. Note that while the first directly averages tax rates, the second applies the tax policy to the average state. Both seem intuitive policies to implement under Partial Information.
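The difference between policies 1 and 2 is only the order of averaging and applying the FI rule, which the following minimal sketch illustrates. Everything concrete in it is an assumption made for illustration: the two `R_fi_*` functions are hypothetical stand-ins for the Full Information policy, and the list of states with equal weights stands in for the FI conditional distribution of (θ, γ) given l.

```python
def policy_average(R_fi, states, weights):
    """Policy 1: average the FI tax rates over the states consistent with l."""
    total = sum(weights)
    return sum(w * R_fi(th, ga) for (th, ga), w in zip(states, weights)) / total


def policy_certainty_equivalent(R_fi, states, weights):
    """Policy 2: apply the FI rule to the conditional mean of the state."""
    total = sum(weights)
    theta_bar = sum(w * th for (th, _), w in zip(states, weights)) / total
    gamma_bar = sum(w * ga for (_, ga), w in zip(states, weights)) / total
    return R_fi(theta_bar, gamma_bar)


def R_fi_linear(theta, gamma):
    """Hypothetical FI rule that is linear in the state."""
    return 0.40 - 0.05 * theta


def R_fi_convex(theta, gamma):
    """Hypothetical FI rule with some curvature in productivity."""
    return 0.40 - 0.05 * theta + 0.02 * (theta - 3.0) ** 2


# Hypothetical (theta, gamma) pairs consistent with one observed l under FI.
states = [(2.8, 1.04), (3.0, 1.00), (3.2, 0.96)]
weights = [1.0, 1.0, 1.0]

av_lin = policy_average(R_fi_linear, states, weights)
ce_lin = policy_certainty_equivalent(R_fi_linear, states, weights)
av_cvx = policy_average(R_fi_convex, states, weights)
ce_cvx = policy_certainty_equivalent(R_fi_convex, states, weights)
```

With a linear rule the two policies coincide. With curvature they differ by a Jensen term, so how close they are depends on how non-linear the FI policy is over the relevant states.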
In an exogenous Kalman filtering model, policy 2 would correspond to the optimal policy, where one can separately estimate the state and then solve a Full Information optimization problem conditional on that estimate. In fact, as the Full Information policy is roughly linear, policies 1 and 2 turn out to be quite similar to each other. However, both of them disregard the endogeneity of the distribution of observables under endogenous Partial Information and hence the fixed point argument between distributions and policy. Hence, they turn out to look very different from the optimal policy, as illustrated by figure 9 for the log-quadratic utility case with high government expenditure.

While the welfare cost of using these policies rather than the optimal one is not large in absolute value, it is of the same order of magnitude as the loss in welfare of going from Full Information to Partial Information under the optimal policy, that is, they would double the utility loss for the agent if implemented. This is because they are averaging the wrong set of contingencies, that is, the contingencies that are consistent with each l under Full Information, disregarding the fact that the loci of admissible shocks are equilibrium outcomes that depend on policy. For example, these policies do not raise taxes sufficiently in the first period when very low θ's are possible, potentially leading to larger required distortions in the second period. Therefore, these approximations based on naive averaging or on methods applicable to exogenous filtering problems are quite wrong.

[Figure 9: Optimal policy and suboptimal alternatives. Optimal policy, $R^{FI,av}$ and $R^{FI,ce}$ against observed labour l.]

We will now show that in many cases there exists a "naive" policy that takes into account the endogeneity of the admissible support of the shocks and can approximate the optimal policy well.
This policy is therefore conceptually very different from policies 1 and 2, as it fully takes into account that the set of shocks that are consistent with a given observed labour and a given tax rate is endogenous to this tax rate. This "naive" policy solves

$$\int_{\theta \in \Theta(l, R^n)} W_l(l, \theta, \tilde{\gamma}^n) \, f_\gamma(\tilde{\gamma}^n) \, f_\theta(\theta) \, d\theta = 0 \tag{44}$$

where $\tilde{\gamma}^n \equiv \tilde{\gamma}(l, \theta, R^n)$. Hence the "naive" policy averages the FI first order conditions (15) using the prior distribution of the shocks, that is, it disregards the fraction $h_\tau / h_\gamma$ of the optimal first order condition given by equation (38) (see footnote 12). In other words, the weights given to the admissible realizations are not correctly updated.

Footnote 12: Recall that in our two-period Lucas and Stokey model $W_\tau = 0$.

In particular, when the variance of $h_\tau / h_\gamma$ is low conditional on l, this policy is a very good approximation for the optimal one, as illustrated in figure 10 for the log-quadratic utility case with low g. Figure 10 shows that the two tax rates are very close to each other and that the naive policy always results in a higher tax rate.

[Figure 10: Optimal and "naive" policy with low g. Tax rate τ against observed labour l, optimal and no weighting.]

The reason why the optimal policy implies a lower tax rate than the suboptimal one is instructive about the role played by the endogenous filter. The numerator of the fraction $h_\tau / h_\gamma$ is an increasing function of productivity θ. Intuitively, this means that taxes are more distortionary the higher productivity is (note that this is an intrinsic non-linearity of the model). Furthermore, the denominator $h_\gamma$ is decreasing in productivity θ. To see this, note that the demand shock multiplies the utility from consumption. When productivity is high, consumption is high, marginal utility is low and hence the demand shock is less effective on labour supply.
The effect of both the numerator and the denominator of this fraction $h_\tau / h_\gamma$ is hence to put more weight on points with high productivity, where the tax rate would be lower with FI, leading to a lower tax than the "naive" one. Summarizing, the fraction $h_\tau / h_\gamma$ instructs the Ramsey government to put more weight on contingencies where (i) observables are more responsive to policy (high slope of labour supply with respect to taxes) and (ii) the demand shock is less effective.

However, there are also cases where it is important not to disregard the optimal weighting. High government expenditure is one of these, as exemplified by figure 11, where the distance between the two tax rates is of the order of one percentage point.

[Figure 11: Optimal and "naive" policy with high g. Tax rate τ against observed labour l, optimal and no weighting.]

We now show the endogenous filter in action in our model. As we have seen in Section 4, computing the optimal policy with PI implies solving the joint problem of maximizing the expectation of the objective function and computing the distribution of the shocks conditional on observables. This distribution is given by (35).

Figure 12 plots $\tilde{\gamma}_l(\bar{l}, \theta, R^*)$ against the support of θ's consistent with $\bar{l} = .33$. While the prior was assumed to be uniform, the posterior obtained by applying the endogenous filter is not uniform and specifically is increasing in θ, implying that high productivity contingencies have higher conditional density. After observing a given l, the government updates its prior by exploiting the restriction on the realization of the two shocks implied by (37).

[Figure 12: Endogenous filtering. $\tilde{\gamma}_l$ against θ.]

It is worth emphasizing once more that this distribution is affected by the policy decision.
In particular, it depends on the slope of the policy function, $R^{*\prime}$, because this slope determines the effect of the shock γ on the equilibrium distribution of the observable variable.

6 Robustness

6.1 Risk aversion and precautionary fiscal adjustments

As we have seen in the previous section, when the economy is close to the top of the Laffer curve, optimal policy is very non-linear in the observable variable, creating a region of sharp fiscal adjustments for intermediate realized values of labour. The government raises taxes dramatically in the first period, in order to prevent the worst scenarios with low productivity, high taxes and low consumption in the second period. In this section, we investigate how this policy implication changes with different degrees of the risk-aversion parameter σ.

Figure 13 illustrates the optimal policy for σ = 1 (baseline case), σ = 1.5 and σ = 2. It can be seen that as risk aversion increases, the area where policy is more reactive to observables becomes wider, and the government reacts strongly even for weaker signals of a recession. Intuitively, this is because the government wants to avoid contingencies with high debt that would lead to high taxes in the second period. The more risk averse the agent is, the more painful it is to be in those states, where consumption has to be cut substantially. However, this larger region of reaction also makes the policy function less steep, as can be seen from the picture.

[Figure 13: Changing risk aversion. Optimal policy against observed labour for σ = 1, σ = 1.5 and σ = 2.]

6.2 Observable output

We now consider the case where the observable variable is output, instead of labour. Now the government knows the value of the product θl, but not the values of the factors independently. Figure 14 illustrates the optimal policy for this case, with linear-quadratic preferences and a permanent productivity shock.
It can be seen that the result is remarkably similar to that obtained in section 4.1. However, when output is observed, the government has a lot more information than when labour is observed. This is because current revenue τθl is known, and hence there is no uncertainty about the amount of debt that needs to be issued. The only uncertainty is about the amount of revenue that will be collected in the future, as the value of (permanent) productivity is unknown. As we have seen, uncertainty about the debt is key to obtaining large fiscal adjustments as in section 4.3.

[Figure 14: Optimal policy with output observed. Vertical axis: τ; horizontal axis: y.]

We also performed a further robustness check assuming a truncated normal distribution for our shocks, rather than a uniform distribution as in the benchmark examples presented. All qualitative results are robust to this change in the assumption, highlighting the fact that the non-linearities are induced by endogenous PI rather than by an ad hoc distributional assumption.

7 An infinite horizon model with debt

In this section, we present an infinite horizon model of optimal fiscal policy with endogenous signal extraction and debt and we discuss its numerical solution. We will see that some key intuitions developed in the two-period model are still present. In particular, under PI the government sometimes reacts slowly to recessions and as a consequence needs to raise taxes for a longer time, endogenously prolonging slumps.

We assume linear utility from consumption in order to abstract from time-consistency issues: it is well known that if we introduced curvature in utility, the government would have an incentive to twist the interest rate ex-post. Also, in order to simplify the solution we assume that the shocks are i.i.d. over time. This allows us to abstract from incentives for the government to use its policy to experiment and try to learn about the state of the economy.
Hence, we are left with the simplest infinite horizon model of fiscal policy with endogenous signal extraction. Future work will address the solution to more general infinite-horizon setups.

7.1 Full Information

We consider an incomplete-markets model inspired by Example 2 of Aiyagari et al. (2002), with linear utility from consumption and standard convex disutility from labour effort. Preferences of the representative agent are given by:

$$E_0 \sum_{t=0}^{\infty} \beta^t \left[ \gamma_t c_t - v(l_t) \right] \tag{45}$$

where $\gamma_t$ is a demand shock, i.i.d. over time. The period budget constraint of the representative agent is

$$c_t + q_t b_t = \theta_t l_t (1 - \tau_t) + b_{t-1} \tag{46}$$

where $\theta_t$ is an i.i.d. productivity shock. The standard first order conditions for utility maximisation are

$$\frac{v'(l_t)}{\gamma_t} = \theta_t (1 - \tau_t) \tag{47}$$

and

$$q_t = \beta \frac{\bar{\gamma}}{\gamma_t} \tag{48}$$

where $\bar{\gamma}$ is the unconditional expectation of the demand shock γ. The Ramsey government finances a constant stream of expenditure $g_t = g$ for all t and chooses taxes and non-contingent one-period debt in order to maximise utility of the agent subject to the above competitive equilibrium conditions as well as the resource constraint $c_t + g = \theta_t l_t$. Under FI, the government can choose a sequence of taxes conditional on a sequence of shocks $A_t$, where $A_t = (\theta_t, \gamma_t)$. The period implementability constraint, with associated Lagrange multiplier $\lambda_t$, is

$$b_{t-1} = c_t - \frac{v'(l_t)}{\gamma_t} l_t + \beta \frac{\bar{\gamma}}{\gamma_t} b_t. \tag{49}$$

We now introduce an upper bound on debt, $b^{max}$. We will assume that whenever debt goes above this threshold, the government pays a quadratic utility cost $\beta \frac{\chi}{2} (b_t - b^{max})^2$, and we will set the parameter χ to be an arbitrarily high number in order to mimic a model with an occasionally binding borrowing constraint while still retaining differentiability of the problem. The first order conditions for Ramsey allocations with respect to hours and debt are:

$$\gamma_t \theta_t - v'(l_t) + \lambda_t \theta_t - \frac{\lambda_t}{\gamma_t} \left[ v'(l_t) + v''(l_t) l_t \right] = 0 \tag{50}$$

and

$$\lambda_t \frac{\bar{\gamma}}{\gamma_t} = E_t \lambda_{t+1} + \chi (b_t - b^{max}) I_{[b^{max},\infty)}(b_t) \tag{51}$$

where we denote by $I_{[b^{max},\infty)}(b)$ the indicator function for the event $b > b^{max}$. Thanks to the assumption of linear utility from consumption, the Ramsey policy is time-consistent and allocations satisfy a Bellman equation that defines a value function $W^{FI}(b_{t-1}, A_t)$. Thus taxes are given by a time-invariant policy function $\tau_t = R^{FI}(b_{t-1}, A_t)$.

7.2 Partial Information

We start the description of the PI problem by specifying its timing. At the beginning of each period t, the Ramsey government observes the realisation of the exogenous shocks of last period $A_{t-1}$, the value of its outstanding debt $b_{t-1}$ and the realisation of current labour $l_t$. Based on this information it sets the tax rate $\tau_t$. Note that because of the i.i.d. assumption on the shocks, information about outstanding debt summarises all the information about past realisations that is relevant in terms of the objective function and the constraints of the Ramsey problem. Hence the optimal policy has a recursive structure and taxes are given by a policy function $\tau_t = R(b_{t-1}, l_t)$. In other words, the government cares about past realisations of the exogenous shocks only to the extent that they affect the level of current outstanding debt. As a consequence, debt is a sufficient state variable. The value function of the problem is defined by the following Bellman equation

$$W(b) = \max_{R:\, \mathbb{R}^2 \to \mathbb{R}_+} E \left[ \gamma(\theta l - g) - v(l) + \beta W(b') - \beta \frac{\chi}{2} \left( b' - b^{max} \right)^2 I_{[b^{max},\infty)}(b') \right], \qquad b' = \left( b + g - \theta l + \frac{v'(l) l}{\gamma} \right) \frac{\gamma}{\beta \bar{\gamma}} \tag{52}$$

where l satisfies $l - h(R(b, l), \theta, \gamma) = 0$. The only difference with respect to the reaction function in the two-period model is that now debt affects labour indirectly through the tax rate. Note that in (52) we have substituted future debt b' from the budget constraint (49).
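The substitution of future debt into (52) can be made concrete with a minimal sketch. It assumes a quadratic disutility v(l) = B l^2 / 2, so that (47) gives labour in closed form; the class `Params`, the functions `labour` and `next_debt`, and all numerical values are hypothetical and chosen only for illustration, not taken from the paper's calibration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Params:
    """Illustrative values only; none are taken from the paper."""
    beta: float = 0.95       # discount factor (assumed)
    g: float = 0.05          # constant government expenditure (assumed)
    gamma_bar: float = 1.0   # unconditional mean of the demand shock (assumed)
    B: float = 3.0           # curvature of the assumed v(l) = B * l**2 / 2


def labour(tau: float, theta: float, gamma: float, p: Params) -> float:
    """Labour from (47): v'(l) = gamma * theta * (1 - tau), with v'(l) = B * l."""
    return gamma * theta * (1.0 - tau) / p.B


def next_debt(b: float, tau: float, theta: float, gamma: float, p: Params) -> float:
    """Debt issued today: (49) solved for b_t, using c = theta * l - g."""
    l = labour(tau, theta, gamma, p)
    v_prime = p.B * l
    return (b + p.g - theta * l + v_prime * l / gamma) * gamma / (p.beta * p.gamma_bar)


if __name__ == "__main__":
    p = Params()
    # Two shock pairs with the same product gamma * theta imply the same labour,
    # so a government observing only l cannot tell them apart.
    for theta, gamma in [(1.0, 1.0), (0.8, 1.25)]:
        print(theta, gamma, labour(0.26, theta, gamma, p), next_debt(0.1, 0.26, theta, gamma, p))
```

Under these assumptions the two shock pairs produce the same observed labour but different debt levels, which is the sense in which a tax choice under PI maps into a distribution over debt rather than a single debt level.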
It is important to highlight a key difference with respect to the FI problem: while in that case a choice of $\tau_t$ implied a choice of $b_t$, now a choice of $\tau_t$ implies a function that maps the realisations of $A_t$ conditional on $l_t$ into a debt level $b_t$. In other words, just like in the two-period model, the government is uncertain about how much debt will need to be issued and in particular must take into account that bad realisations of productivity may lead to a debt level above $b^{max}$ if taxes are not sufficiently high.

In order to solve the model, we exploit its recursive structure, by solving for the PI first order condition at each point on a grid for debt and iterating on the value function of the problem. To see how this works, consider the objective function defined by the right-hand side of (52). For a given guess for the value function, this is just a function of observed labour to which we can apply the main theorem of the paper (Proposition 2) and obtain the general first order condition with PI (footnote 13: the FOC is shown explicitly in Appendix C). This first order condition will involve the derivative W'(b), which satisfies, by an envelope condition (derived in Appendix C):

$$W'(b) = E \left[ \frac{\gamma}{\bar{\gamma}} W'(b') - \chi (b' - b^{max}) I_{[b^{max},\infty)}(b') \right]. \tag{53}$$

Hence by solving the first order condition using (53) and iterating on the Bellman equation (52), we can approximate the optimal policy. In the next subsection, we show some numerical results obtained after parametrising the economy. While the model is not meant to be a quantitative model of fiscal policy, it can nonetheless rationalise important features of the fiscal response to the Great Recession, with slow and large fiscal adjustments inducing protracted slumps.

7.3 Simulation results

In order to parametrise the economy, we assume quadratic disutility from labour, and the discount factor as well as shock distributions as in the benchmark two-period model above.
We will now evaluate the fiscal response to a sudden recession that induces uncertainty in the government's problem under PI and we will compare its policy response to the FI policy. To do this we first hit the economy with the mean value of the shocks for a long sequence and then (in period 400) we hit it with a one-off negative θ shock combined with a one-off (smaller) positive γ shock, so that the economy enters a recession, but the PI government is confused about its source. In figure 15 we can see that the FI government raises taxes in period 400 and after that taxes very gradually go back towards steady state. However, the PI government reacts slowly and needs to raise taxes with a delay (first panel). Interestingly, this delayed fiscal adjustment induces a longer recession, as can be seen from the third panel (hours): higher taxes discourage work for a longer period. This behaviour of the economy is qualitatively similar to what happened in some European countries after the financial crisis (e.g. Spain), where an initial slow reaction, or even an expansionary policy, has been followed by a necessary large fiscal adjustment and the recovery has so far been very slow and weak. While this policy is optimal in our setup where only current income can be taxed, the above findings suggest that allowing for retrospective taxation could improve welfare. Society would be better off if the government could adjust taxes on past income after observing the realisation of past shocks and consumers knew of this possibility. However, retrospective taxation might not be easily implementable in the real world due to time-consistency issues, as ex-post surprising taxes on past income are non-distortionary.
[Figure 15: Impulse responses. Panels: taxes, debt, hours; series: FI, PI; horizontal axis: periods 398 to 408.]

Figure 16 illustrates a long stochastic simulation of the model. It is easy to see that taxes are very responsive to debt. One interesting question is whether taxes are smoother or more volatile under PI than under FI. Intuitively, there seem to be two opposing forces. On the one hand, the PI government does not observe the shocks, and hence smooths its policy across states for a given debt level. On the other hand, this policy induces necessary fiscal adjustments following the dynamics of debt, which pushes towards higher volatility under PI. The result from long simulations is that this second effect seems to dominate and the FI government is more successful than the PI government at smoothing tax rates. It can be seen that often when debt gets close to the borrowing limit (20% of mean output) the PI government imposes larger fiscal adjustments. This can be rationalised in analogy with the example of the two-period economy close to the top of the Laffer curve. Fear of future large required adjustments in the event of low θ leads the PI government to raise taxes significantly.

[Figure 16: Simulation. Panels: taxes, debt; series: FI, PI; horizontal axis: periods 0 to 200.]

8 Conclusion

We derive a method to solve models of optimal policy with limited information without any separation assumption between the optimization and signal extraction problems. The method works in general and we show that designing algorithms to solve these problems is quite easy. We also show that Partial Information on endogenous variables matters, as some revealing non-linearities appear in very simple models.
Optimal fiscal policy under endogenous Partial Information calls for smooth tax rates across states when the government budget is under control, and for regions of large response to aggregate data when the economy is close to the top of the Laffer curve or to a borrowing limit. Uncertainty about the state of the economy helps to understand the slow reaction of some European governments to the Great Recession, followed by sharp fiscal adjustments and prolonged downturns.

While we have illustrated the technique in a model of optimal fiscal policy, the methodology can be easily extended to other dynamic models, for example in the analysis of optimal monetary policy in sticky price models (e.g. Clarida et al. 1999) under the assumption of Partial Information. Our optimal policy smoothing result is likely to extend to that setup, potentially leading to a microfoundation for smooth nominal interest rates.

References

[1] Aiyagari, S.R., A. Marcet, T.J. Sargent and J. Seppala (2002) "Optimal Taxation without State-Contingent Debt", Journal of Political Economy 110(6), 1220-1254
[2] Angeletos, George-Marios and Alessandro Pavan (2009) "Policy with dispersed information", Journal of the European Economic Association 7(1), 11-60
[3] Baxter, Brad, Liam Graham and Stephen Wright (2007) "The Endogenous Kalman Filter", Birkbeck School of Economics, Mathematics and Statistics working paper BWPEF 0719
[4] Baxter, Brad, Liam Graham and Stephen Wright (2011) "Invertible and non-invertible information sets in linear rational expectations models", Journal of Economic Dynamics & Control 35, 295-311
[5] Clarida, Richard, Jordi Galí and Mark Gertler (1999) "The Science of Monetary Policy: A New Keynesian Perspective", Journal of Economic Literature XXXVII, 1661-1707
[6] Guerrieri, Veronica and Robert Shimer (2013) "Markets with Multidimensional Private Information", Society for Economic Dynamics Meeting Papers series 2013 n. 210
Lucas, Robert E., Jr. (1972) "Expectations and the Neutrality of Money", Journal of Economic Theory 4(2), 103-124
[7] Lucas, Robert E., Jr. and Nancy L. Stokey (1983) "Optimal fiscal and monetary policy in an economy without capital", Journal of Monetary Economics 12, 55-93
[8] Mehra, R. and E.C. Prescott (1980) "Recursive competitive equilibrium: the case of homogeneous households", Econometrica 48(6), 1365-1379
[9] Mirman, L.J., L. Samuelson and A. Urbano (1993) "Monopoly experimentation", International Economic Review, 549-563
[10] Nimark, Kristoffer (2008) "Monetary policy with signal extraction from the bond market", Journal of Monetary Economics 55, 1389-1400
[11] Orphanides, A. and V. Wieland (2000) "Inflation Zone Targeting", Journal of Economic Dynamics and Control 44, 1351-1387
[12] Pearlman, J., D. Currie and P. Levine (1986) "Rational expectation models with private information", Economic Modelling 3(2), 90-105
[13] Pearlman, Joseph (1992) "Reputational and nonreputational policies under partial information", Journal of Economic Dynamics and Control 16(2), 339-357
[14] Svensson, Lars E.O. and Michael Woodford (2003) "Indicator variables for optimal policy", Journal of Monetary Economics 50, 691-720
[15] Svensson, Lars E.O. and Michael Woodford (2004) "Indicator variables for optimal policy under asymmetric information", Journal of Economic Dynamics & Control 28, 661-690
[16] Swanson, Eric T. (2006) "Optimal nonlinear policy: signal extraction with a non-normal prior", Journal of Economic Dynamics and Control 30, 185-203
[17] Townsend, Robert M.
(1983) "Forecasting the Forecasts of Others", Journal of Political Economy 91(4), 546-588
[18] Wallace, Neil (1992) "Lucas's signal extraction model", Journal of Monetary Economics 30, 433-447
[19] Wieland, Volker (2000a) "Monetary policy, parameter uncertainty and optimal learning", Journal of Monetary Economics 46, 199-228
[20] Wieland, Volker (2000b) "Learning by doing and the value of optimal experimentation", Journal of Economic Dynamics & Control 24, 501-534

Appendix A: Proof of Proposition 1

The PI problem is equivalent to modifying the FI problem by adding the following constraints to the feasible set:

$$\tilde{R}(A) = \tilde{R}(\tilde{A}) \quad \text{for all } A, \tilde{A} \in \Phi \text{ such that } h(\tilde{R}(A), A) = h(\tilde{R}(\tilde{A}), \tilde{A}). \tag{54}$$

Therefore the max of the FI problem is higher than or equal to the max of the PI problem. But under invertibility the optimal value of the FI problem satisfies the additional restrictions (54), since $\tilde{R}(A) = \tilde{R}(\tilde{A})$ only if $A = \tilde{A}$; therefore the FI solution solves the PI problem.

Appendix B: Derivation of (29)

We compute $\frac{d\mathcal{F}(R^* + \alpha\delta)}{d\alpha}$, with $\mathcal{F}$ as given by (28), and evaluate it at α = 0. Recall (28):

$$\frac{d\mathcal{F}(R^* + \alpha\delta)}{d\alpha} = \frac{d}{d\alpha} \int_\Phi W\big(T(R^* + \alpha\delta, A),\, L(R^* + \alpha\delta, A),\, A\big)\, dF_A(A).$$

Under enough boundedness conditions on the derivative we can pass the derivative operator inside the integral. Hence, using $T(R^* + \alpha\delta, A) = R^*(L(R^* + \alpha\delta, A)) + \alpha\delta(L(R^* + \alpha\delta, A))$,

$$\frac{d\mathcal{F}(R^* + \alpha\delta)}{d\alpha} = \int_\Phi \frac{dW\big(R^*(L(R^* + \alpha\delta, A)) + \alpha\delta(L(R^* + \alpha\delta, A)),\, L(R^* + \alpha\delta, A),\, A\big)}{d\alpha}\, dF_A(A)$$

$$= \int_\Phi \left\{ \left[ \Big( R^{*\prime}(L(R^* + \alpha\delta, A)) + \alpha\delta'(L(R^* + \alpha\delta, A)) \Big) \frac{dL(R^* + \alpha\delta, A)}{d\alpha} + \delta(L(R^* + \alpha\delta, A)) \right] W_\tau\big(T(R^* + \alpha\delta, A),\, L(R^* + \alpha\delta, A),\, A\big) + W_l\big(T(R^* + \alpha\delta, A),\, L(R^* + \alpha\delta, A),\, A\big) \frac{dL(R^* + \alpha\delta, A)}{d\alpha} \right\} dF_A(A)$$

or, writing a slightly more elegant expression, letting

$$\begin{aligned}
R^{*\prime}_{\alpha,\delta} &= R^{*\prime}(L(R^* + \alpha\delta, A)) \\
\delta'_{\alpha,\delta} &= \delta'(L(R^* + \alpha\delta, A)) \\
\delta^*_{\alpha,\delta} &= \delta(L(R^* + \alpha\delta, A)) \\
L'_{\alpha,\delta} &= \frac{dL(R^* + \alpha\delta, A)}{d\alpha} \\
W^{*\prime}_{\tau,\alpha,\delta} &= W_\tau\big(T(R^* + \alpha\delta, A),\, L(R^* + \alpha\delta, A),\, A\big) \\
W^{*\prime}_{l,\alpha,\delta} &= W_l\big(T(R^* + \alpha\delta, A),\, L(R^* + \alpha\delta, A),\, A\big)
\end{aligned} \tag{55}$$

the derivative is

$$\frac{d\mathcal{F}(R^* + \alpha\delta)}{d\alpha} = \int_\Phi \left\{ \left[ W^{*\prime}_{\tau,\alpha,\delta} \big( R^{*\prime}_{\alpha,\delta} + \alpha\delta'_{\alpha,\delta} \big) + W^{*\prime}_{l,\alpha,\delta} \right] L'_{\alpha,\delta} + W^{*\prime}_{\tau,\alpha,\delta}\, \delta^*_{\alpha,\delta} \right\} dF_A(A).$$

Evaluating the expressions in (55) at α = 0 we have

$$\begin{aligned}
R^{*\prime}_{0,\delta} &= R^{*\prime}(L(R^*, A)) \\
\delta'_{0,\delta} &= \delta'(L(R^*, A)) \\
\delta^*_{0,\delta} &= \delta(L(R^*, A)) \\
W^{*\prime}_{\tau,0,\delta} &= W_\tau\big(R^*(L(R^*, A)),\, L(R^*, A),\, A\big) = W^*_\tau \\
W^{*\prime}_{l,0,\delta} &= W_l\big(R^*(L(R^*, A)),\, L(R^*, A),\, A\big) = W^*_l
\end{aligned} \tag{56}$$

Therefore from (26) we have

$$\int_\Phi \left\{ \left[ W^*_\tau R^{*\prime} + W^*_l \right] L'_{0,\delta} + W^*_\tau \delta \right\} dF_A(A) = 0$$

where it is understood that $W^*_\tau$, $W^*_l$, δ and $R^{*\prime}$ are evaluated at equilibrium optimal choices. This is our (29).

Appendix C: Derivation of the Envelope Condition (53)

In this Appendix we derive the Envelope Condition (53). First of all let us introduce the necessary notation. A tax policy is a function of debt and labour, R(b, l), and labour is a function of a policy R, outstanding debt and the exogenous shock, L(R, b, A), defined by the zero of

$$H(l, A, R) \equiv l - h(R(b, l), A), \tag{57}$$

in analogy with the two-period model. By total differentiation of (57), the partial derivative of labour with respect to debt, $L_b$, is given by

$$L_b(R, b, A) = -\frac{\gamma\theta R_b(b, l)}{v''(l) + \gamma\theta R_l(b, l)}. \tag{58}$$

Now, for simplicity consider a case without borrowing penalty. In order to derive the envelope condition, we differentiate (52) with respect to b and get

$$W'(b) = E \left[ \big( \gamma\theta - v'(l^*) \big) L^*_b + W'(b^{*\prime}) \left( \frac{\gamma}{\bar{\gamma}} + \beta\, b^{*\prime}_L L^*_b \right) \right]$$

where

$$b^{*\prime}_L = \frac{-\theta\gamma + \left[ v''(l^*)\, l^* + v'(l^*) \right]}{\beta\bar{\gamma}}, \qquad l^* = L(R^*; b, A), \qquad L^*_b = L_b(R^*; b, A).$$

Using (58) we can write

$$W'(b) = E \left[ \Big( \gamma\theta - v'(l^*) + \beta W'(b^{*\prime})\, b^{*\prime}_L \Big) \frac{-\gamma\theta R^*_b(b, l)}{v''(l) + \gamma\theta R^*_l(b, l)} + W'(b^{*\prime}) \frac{\gamma}{\bar{\gamma}} \right]. \tag{59}$$

Using Proposition 2, the FOC of the PI Ramsey problem is

$$E \left[ \Big( \theta\gamma - v'(l^*) + \beta W'(b^{*\prime})\, b^{*\prime}_L \Big) \frac{h^*_\tau}{1 - h^*_\tau R^*_l} \,\Big|\, \bar{l} \right] = 0 \tag{60}$$

for all $\bar{l}$. Furthermore, the partial derivative of the reaction function h with respect to taxes is

$$h_\tau = \frac{-\gamma\theta}{v''(l)}.$$

So from (59) we get

$$W'(b) = E \left[ \Big( \gamma\theta - v'(l^*) + \beta W'(b^{*\prime})\, b^{*\prime}_L \Big) \frac{h^*_\tau R^*_b(b, l)}{1 - h^*_\tau R^*_l(b, l)} + W'(b^{*\prime}) \frac{\gamma}{\bar{\gamma}} \right].$$

Now, applying the law of iterated expectations, using the fact that $R_b(b, l)$ is known given l and b, and using (60), we obtain

$$W'(b) = E \left[ E \left[ \Big( \gamma\theta - v'(l^*) + \beta W'(b^{*\prime})\, b^{*\prime}_L \Big) \frac{h^*_\tau}{1 - h^*_\tau R^*_l(b, l)} \,\Big|\, l \right] R^*_b(b, l) + W'(b^{*\prime}) \frac{\gamma}{\bar{\gamma}} \right] \tag{61}$$

$$= E \left[ 0 + W'(b^{*\prime}) \frac{\gamma}{\bar{\gamma}} \right]. \tag{62}$$

Finally, adding the marginal cost of excessive debt this becomes

$$W'(b) = E \left[ \frac{\gamma}{\bar{\gamma}} W'(b') - \chi (b' - b^{max}) I_{[b^{max},\infty)}(b') \right].$$
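The implicit-function step behind (58) can be checked numerically in a special case. The sketch below assumes a quadratic disutility v(l) = B l^2 / 2 (so v'' = B) and a hypothetical linear tax rule R(b, l) = a0 + a_b b + a_l l, for which the zero of (57) has a closed form; the function names and all parameter values are illustrative assumptions, not the paper's policy function or calibration.

```python
def solve_labour(b, theta, gamma, B, a0, a_b, a_l):
    """Zero of H(l) = l - h(R(b, l), A) under the assumed linear rule.

    With v'(l) = B * l, h(tau, A) = gamma * theta * (1 - tau) / B, so the
    condition l = k * (1 - a0 - a_b * b - a_l * l), k = gamma * theta / B,
    is linear in l and can be solved directly.
    """
    k = gamma * theta / B
    return k * (1.0 - a0 - a_b * b) / (1.0 + k * a_l)


def L_b_formula(theta, gamma, B, a_b, a_l):
    """Equation (58): L_b = -gamma*theta*R_b / (v''(l) + gamma*theta*R_l)."""
    return -gamma * theta * a_b / (B + gamma * theta * a_l)


def L_b_finite_difference(b, theta, gamma, B, a0, a_b, a_l, step=1e-5):
    """Central finite difference of equilibrium labour with respect to debt."""
    up = solve_labour(b + step, theta, gamma, B, a0, a_b, a_l)
    down = solve_labour(b - step, theta, gamma, B, a0, a_b, a_l)
    return (up - down) / (2.0 * step)


if __name__ == "__main__":
    args = dict(theta=3.0, gamma=1.0, B=30.0, a_b=0.3, a_l=0.1)
    print(L_b_formula(**args))
    print(L_b_finite_difference(b=0.1, a0=0.2, **args))
```

In this linear special case the two numbers coincide up to rounding error, which is a quick sanity check on the sign and denominator of (58) before using it in (59).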