Dynamic Programming

We are often interested in net present values:

  V_t = π_t + β π_{t+1} + β² π_{t+2} + β³ π_{t+3} + ...

A net present value can be expressed as the sum of a current payoff and a discounted future value:

  V_t = π_t + β V_{t+1}

Often we can eliminate time subscripts by establishing that the value in each period depends on a set of "state variables" that describe different "states of the world":

  V(k) = π(k, k') + β V(k'),  where k' depends on k

This yields a "stationary" problem (one that is not time-dependent). Dynamic programming is a set of tools and techniques for analyzing such problems.

A Simple Example

Consider a simple economy with a representative agent who lives forever. The agent's problem is:

  max_{ {c_t, i_t}_{t=0}^∞ }  Σ_{t=0}^∞ β^t U(c_t)

subject to

  k_{t+1} = k_t (1 − δ) + i_t
  c_t + i_t ≤ F(k_t)

where β is the discount factor, U(·) is the utility function, c_t is consumption in period t, k_t is capital in period t, i_t is investment in period t, and F(·) is the production function.

We can use the constraints to eliminate i_t and express c_t as

  c_t = F(k_t) + k_t (1 − δ) − k_{t+1} ≡ f(k_t) − k_{t+1}

Capital as a State Variable

The agent's problem can be rewritten as

  max_{ {k_{t+1}}_{t=0}^∞ }  Σ_{t=0}^∞ β^t U( f(k_t) − k_{t+1} )

subject to

  0 ≤ k_{t+1} ≤ f(k_t),  t = 0, 1, 2, ...
k_0 > 0 is given.

Since this is an infinite-horizon problem, it looks the same in every period; the only thing that differs if you begin in another period is the initial capital stock. Thus, intuitively, the solution to this problem must depend entirely on k_0.

Another feature of this problem: if we start in period 0, the agent does not need to choose the entire sequence of capital stocks immediately. The agent just needs to choose next period's capital stock; this is sufficient to determine current consumption and next period's output. The agent can wait until period 1 to choose k_2, and so on. Given this, we just need to describe the agent's preferences over k_{t+1} given k_t.

The Value Function

Suppose that the agent's problem has been solved for every value of k_0. Then we can define a value function V(k_0) that represents the value of the maximized objective function for each k_0:

  V(k_0) = Σ_{t=0}^∞ β^t U( f(k_t) − k_{t+1} )

when the sequence of capital stocks is chosen optimally subject to the constraints. Now for any k_0 we can say

  V(k_0) = max_{ 0 ≤ k_1 ≤ f(k_0) } [ U( f(k_0) − k_1 ) + β V(k_1) ]

The time subscripts are now completely unnecessary: the problem is stationary, or independent of the date.

The Functional Equation

In general, given a current capital stock k,

  V(k) = max_{ 0 ≤ k' ≤ f(k) } [ U( f(k) − k' ) + β V(k') ]

This single equation in the unknown function V(·) is called a functional equation. The study of dynamic optimization problems through the analysis of such functional equations is called dynamic programming.

We can often describe the optimal choices using a policy function g(·). The relationship k_{t+1} = g(k_t) completely describes the dynamics of capital accumulation (changes in the state variable) from any given initial state.

Computing the Value Function

Note that in the problem

  V(k) = max_{ 0 ≤ k' ≤ f(k) } [ U( f(k) − k' ) + β V(k') ]

U(·) and f(·) are known. The unknown is the value function. In general there are three ways to solve for the value function:

1.
For a few simple cases, V(k) has a known functional form (if you pick the right U(·) and f(·), a textbook can tell you what V(k) is).

2. You can guess. This method is notoriously unreliable.

3. You can compute V(k) numerically on a computer. This requires that you choose particular functional forms and parameter values, and then use an appropriate computational technique.

Computing the Value Function

The standard approach for computing value functions is known as "iteration on the value function":

1. To proceed, first specify functional forms and parameter values for the primitives of the problem: utility functions, production functions, the discount factor, etc.

2. Then set up a grid of possible values of the state variable. In cases where the state variable is continuous (such as the capital accumulation problem), this yields an approximate solution rather than an exact one, but this is usually good enough.

3. The next step is to choose an initial value function. This sounds difficult, but in practice it usually doesn't matter much what you start with: setting V(k) = 0 for every k on the grid is a typical choice.

Iteration on the Value Function

4. At every grid point k, taking the initial function V(·) as given, choose the value of k' that solves the agent's problem:

  max_{ 0 ≤ k' ≤ f(k) } [ U( f(k) − k' ) + β V(k') ]

Record the value of U( f(k) − k' ) + β V(k') as V(k) in a new vector of values.

5. Repeat step 4 using the new V(·) function; keep repeating until V(·) no longer changes (at which point we say that the value function has converged).

Mathematical Foundations

• We would like our model to yield a unique solution for V(·) (a unique function, not a point).
• This means that no matter which initial value function we choose at the start of the iterations, the final value function is always the same.
• Similarly, we would like g(·)
to be single-valued (a function instead of a correspondence).
• Then, given an initial state, the model would yield a unique prediction for the entire path of capital accumulation.
• Whether the model yields unique solutions for the value function and policy function depends on the primitives of the problem: utility functions, production functions, etc.
• Stokey and Lucas describe general sufficient conditions for a model to yield a unique value function and a unique policy function (these conditions are sufficient, not necessary).

General Notation

We'll follow the notation in Stokey and Lucas (1989), Section 4.2, and consider functional equations of the form

  v(x) = max_{ y ∈ Γ(x) } [ F(x, y) + β v(y) ]

under the assumption that F is bounded and β < 1, where:

  X is the set of possible values of the state variable;
  Γ : X → X is the correspondence describing the feasibility constraints;
  A = { (x, y) ∈ X × X : y ∈ Γ(x) } is the graph of Γ;
  F : A → R is the return function;
  β ∈ [0, 1) is the discount factor.

Assumptions

4.3: X is a convex subset of R^l, and the correspondence Γ : X → X is non-empty, compact-valued, and continuous.

4.4: The function F : A → R is bounded and continuous, and β ∈ (0, 1).

4.7: F is strictly concave.

4.8: Γ is convex in the sense that for any θ ∈ [0, 1] and x, x' ∈ X, if y ∈ Γ(x) and y' ∈ Γ(x'), then θ y + (1 − θ) y' ∈ Γ( θ x + (1 − θ) x' ).

Key Results

Given Assumptions 4.3, 4.4, 4.7, and 4.8, there is a unique bounded and continuous function v that satisfies the functional equation. Further, v is strictly concave, and the associated optimal policy function g is single-valued and continuous. Given x_0, the sequence {x_t} defined by x_{t+1} = g(x_t), t = 0, 1, 2, ..., is the unique solution to the dynamic programming problem.

We'll go over the assumptions to establish what might go wrong if they are not satisfied, but it is important to keep in mind that the assumptions are sufficient conditions, not necessary ones. It is possible to use dynamic programming in other environments. "Existence proof by computer" is fairly common.
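The iteration scheme and the uniqueness result above can be illustrated numerically. The sketch below uses illustrative choices that are not fixed by these notes: log utility U(c) = ln c, Cobb-Douglas production f(k) = k^α with full depreciation, α = 0.3, β = 0.95, and a 200-point grid. For this specification the policy function has a known closed form, g(k) = αβ k^α, which the grid solution should approximately recover; starting the iterations from two different initial value functions should also yield (numerically) the same fixed point.

```python
import numpy as np

# Illustrative parameter choices (not from the notes): log utility,
# Cobb-Douglas production f(k) = k^alpha with full depreciation.
alpha, beta = 0.3, 0.95
grid = np.linspace(0.05, 0.5, 200)           # grid over the state variable k
f = grid ** alpha                            # f(k) at every grid point

def iterate(V, tol=1e-8, max_iter=2000):
    """Iterate on the value function until it no longer changes."""
    for _ in range(max_iter):
        # consumption c = f(k) - k' for every (k, k') pair; infeasible -> -inf
        c = f[:, None] - grid[None, :]
        obj = np.where(c > 0,
                       np.log(np.maximum(c, 1e-12)) + beta * V[None, :],
                       -np.inf)
        V_new = obj.max(axis=1)
        if np.max(np.abs(V_new - V)) < tol:  # convergence check (step 5)
            break
        V = V_new
    policy = grid[obj.argmax(axis=1)]        # k' = g(k) on the grid
    return V, policy

# Two different initial guesses converge to the same value function,
# illustrating the uniqueness of the fixed point.
V_a, g_a = iterate(np.zeros(len(grid)))
V_b, g_b = iterate(np.full(len(grid), -50.0))
print(np.max(np.abs(V_a - V_b)))               # tiny: same fixed point
print(np.max(np.abs(g_a - alpha * beta * f)))  # policy near closed form
```

Note the design choice: restricting k' to the same grid as k turns each maximization into a row-wise argmax over a matrix, which is the standard discretized form of step 4.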
The Content of the Assumptions

To understand most of the assumptions, we do not really need to think about dynamic programming; it is sufficient to think about any constrained optimization problem.

In a constrained optimization problem, the constraints must allow for at least one solution. Thus, Γ(x) must be non-empty. The constraints must also create a feasible set that is closed and bounded. Closedness is necessary because you cannot maximize on an open interval: consider max x subject to x ∈ [0, 1), which has no solution. Boundedness is necessary so that the objective function does not reach ∞. Closed and bounded = compact, so Γ(x) must be compact-valued.

The Assumptions

Assuming that F is bounded helps ensure that v is bounded. A bounded feasible set is insufficient for this: consider max 1/x s.t. x ∈ (0, 1]. The feasible set is bounded, but the objective grows without bound as x → 0.

Assuming that F is continuous is important for obtaining a maximum. If the objective function is not continuous, a maximum may not be attainable. Consider f(x) = 0 if x = 0 and f(x) = 1/x if x > 0. If the problem is max f(x) s.t. x ∈ [0, 1], then there is no solution x: f is finite at every feasible point, yet its supremum is never attained.

Note that in the value maximization problem the objective function is F(x, y) + β v(y). In order to ensure that this function is continuous, we need Γ(x) to be continuous; otherwise, slight changes in x could lead to jumps in y that could cause v(x) to have breaks.

We need β < 1 because otherwise the infinite sum

  Σ_{t=0}^∞ β^t F(x, y)

may not be bounded even if F(x, y) is.

In order to obtain a unique solution y for every x, we need to ensure that the objective function is strictly concave; this rules out flat segments and multiple peaks. To attain a single peak and a unique solution, the constraint set must be convex and X must be convex. To see why, consider the problem

  max −x²  s.t.  x ≤ −1 or x ≥ 1

which has two solutions, x = −1 and x = 1.
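The last example can be sketched numerically: a strictly concave objective maximized over a non-convex feasible set still yields multiple maximizers, so the "policy" is a correspondence rather than a function. The discretization of the feasible set below is an illustrative choice.

```python
import numpy as np

# The feasible set {x <= -1} union {x >= 1} is not convex.
# Maximizing the strictly concave objective -x^2 over it yields
# two maximizers, so the argmax is not single-valued.
feasible = np.concatenate([np.linspace(-3, -1, 201),   # left branch
                           np.linspace(1, 3, 201)])    # right branch
values = -feasible ** 2
best = values.max()                         # maximum value attained: -1
argmaxes = feasible[np.isclose(values, best)]
print(argmaxes)                             # both -1 and 1 attain the max
```

Convexity of the constraint set (Assumption 4.8) rules this out: any point between two maximizers would then be feasible, and strict concavity would force a strictly higher value there, contradicting multiplicity.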