124 IEEE TRANSACTIONS ON AUTOMATIC CONTROL, VOL. AC-18, NO. 2, APRIL 1973

Dimitri P. Bertsekas (S'70-M'71) was born in Athens, Greece, in 1942. He received the Mechanical and Electrical Engineering Diploma from the National Technical University of Athens, Athens, Greece, in 1965, the M.S.E.E. degree from George Washington University, Washington, D.C., in 1969, and the Ph.D. degree in system science from the Massachusetts Institute of Technology, Cambridge, in 1971. From 1966 to 1967 he performed research at the National Technical University of Athens, and from 1967 to 1969 he was with the U.S. Army Research Laboratories, Fort Belvoir, Va. In the summer of 1971 he worked at Systems Control, Inc., Palo Alto, Calif. Since 1971 he has been an Acting Assistant Professor in the Department of Engineering-Economic Systems, Stanford University, Stanford, Calif., and has taught courses in optimization by vector space methods and nonlinear programming. His present and past research interests include the areas of estimation and control of uncertain systems, minimax problems, dynamic programming, optimization problems with nondifferentiable cost functionals, and nonlinear programming algorithms.

Ian B. Rhodes (M'67) was born in Melbourne, Australia, on May 29, 1941. He received the B.E. and M.Eng.Sc. degrees in electrical engineering from the University of Melbourne, Melbourne, Australia, in 1963 and 1965, respectively, and the Ph.D. degree in electrical engineering from Stanford University, Stanford, Calif., in 1968. In January 1968 he was appointed Assistant Professor of Electrical Engineering at the Massachusetts Institute of Technology, Cambridge, and taught there until September 1970, when he joined the faculty of Washington University, St. Louis, Mo., as Associate Professor of Engineering and Applied Science in the graduate program of Control Systems Science and Engineering. His research interests lie in mathematical system theory and its applications. Dr. Rhodes is a member of the Society for Industrial and Applied Mathematics and Sigma Xi. He is an Associate Editor of the International Federation of Automatic Control journal Automatica, an Associate Editor of the IEEE TRANSACTIONS ON AUTOMATIC CONTROL, and Chairman of the Technical Committee on Large Systems, Differential Games of the IEEE Control Systems Society.

Optimal Stochastic Linear Systems with Exponential Performance Criteria and Their Relation to Deterministic Differential Games

DAVID H. JACOBSON

Abstract—Two stochastic optimal control problems are solved whose performance criteria are the expected values of exponential functions of quadratic forms. The optimal controller is linear in both cases but depends upon the covariance matrix of the additive process noise, so that the certainty equivalence principle does not hold. The controllers are shown to be equivalent to those obtained by solving a cooperative and a noncooperative quadratic (differential) game, and this leads to some interesting interpretations and observations. Finally, some stability properties of the asymptotic controllers are discussed.

I. INTRODUCTION

THE so-called linear-quadratic-Gaussian (LQG) problem¹ of optimal stochastic control [1] possesses a number of interesting features. First, the optimal feedback controller is a linear (time-varying) function of the state variables. Second, this linear controller is identical to that which is obtained by neglecting the additive Gaussian noise and solving the resultant deterministic linear-quadratic problem (LQP)² (certainty equivalence principle). Thus the controller for the stochastic system is independent of the statistics of the additive noise. This is appealing for small noise intensity, but for large noise (large covariance) one has the intuitive feeling that perhaps a different controller would be more appropriate.

In this paper we consider optimal control of linear systems disturbed by additive Gaussian noise, whose associated performance criteria are the expected values of exponential functions of negative-semidefinite and positive-semidefinite quadratic forms. We shall refer to the former case as the LE−G problem and to the latter as the LE+G problem, and to their deterministic counterparts as LE−P and LE+P, respectively. In the deterministic cases LE±P, the solutions are identical to that for the LQP (the natural logarithm of the exponential performance criteria yields quadratic forms). However, when noise is present, the optimal controllers for the LE±G problems are different from that of the LQG problem. In particular, though, as in the case of the LQG problem, these are linear functions of the state variables, they depend explicitly upon the covariance matrices of the additive Gaussian noise.

Manuscript received February 10, 1972; revised October 20, 1972. Paper recommended by I. B. Rhodes, Chairman of the IEEE S-CS Large Systems, Differential Games Committee. This work was supported by the Joint Services Electronics Program under Contracts N00014-67-A-0295-0006, 0003, and 0008.

The author was with the Division of Engineering and Applied Physics, Harvard University, Cambridge, Mass. He is now with the Department of Applied Mathematics, University of the Witwatersrand, Johannesburg, South Africa.

¹ This is a problem with linear dynamics disturbed by additive Gaussian noise, together with a performance criterion which is the expected value of a positive-semidefinite quadratic form.

² This is the same as the LQG problem, but with noise set to zero.
For small noise intensity (small covariance) the solutions of the LE±G and LQG problems are close, but for large noise intensity there is a marked difference. In particular, as the noise intensity tends to infinity, the optimal gains for the LE−G problem tend to zero; intuitively this implies that if the random input is "very wild," little can be gained (in the sense of reducing the value of this particular performance criterion) by controlling the system. In the LE+G problem the optimal controller ceases to exist if the noise intensity is sufficiently large (that is, the performance criterion becomes infinite, regardless of the control input). These new controllers, which retain the simplicity of the solution of the LQG problem, could prove to be attractive in certain applications.

In addition to formulating and solving the LE±G problems, we demonstrate that their solutions are equivalent to the solutions of cooperative and noncooperative linear-quadratic zero-sum (differential) games. These equivalences provide interpretations for the stochastic controllers in terms of solutions of deterministic zero-sum games, and vice versa. It is hoped that these equivalences will aid in the quest for new formulations and (proofs of existence of) solutions of stochastic nonlinear systems and nonlinear differential games.

We investigate briefly the infinite-time version of the LE±G problems and point out that the steady-state optimal controller for the LE−G problem is not necessarily stable. On the other hand, the steady-state optimal controller for the LE+G problem, if it exists, is stable. Thus the LE+G formulation may be preferable in the infinite-time case.

II. FORMULATION OF DISCRETE-TIME LE±G PROBLEMS

A. Dynamics

We shall consider a linear discrete-time dynamic system described by

    x_{k+1} = A_k x_k + B_k u_k + Γ_k ω_k, k = 0, ..., N−1; x_0 given,   (1)

where the "state" vector x_k ∈ R^n, the control vector u_k ∈ R^m, and the Gaussian noise input ω_k ∈ R^q. The matrices A_k, B_k, Γ_k have appropriate dimensions and depend upon the time k.

B. Noise

The noise input is a sequence {ω_k} of independently distributed Gaussian random variables having probability density

    p_ω(ω_0, ..., ω_{N−1}) = ∏_{k=0}^{N−1} p(ω_k; k)   (2)

where p_ω: R^q × ... × R^q → R^+ and p: R^q × I^+ → R^+ are given by

    p(ω_k; k) = (2π)^{−q/2} |P_k|^{1/2} exp{−½ ω_k^T P_k ω_k},   (3)

    P_k > 0 (positive definite); k = 0, ..., N−1.   (4)

Note that

    E[ω_k] = 0, E[ω_k ω_k^T] = P_k^{−1}; k = 0, ..., N−1   (5)

where E denotes expectation.

C. Performance Criterion

The performance of the stochastic linear systems is measured by the criterion (with σ = − for LE−G and σ = + for LE+G)

    V^σ(x_0) ≜ σ E[ ∏_{k=0}^{N−1} μ_x^σ(x_k; k) μ_u^σ(u_k; k) · μ_x^σ(x_N; N) ]   (6)

where

    μ_x^σ(x_k; k) = exp{σ ½ x_k^T Q_k x_k}; k = 0, ..., N   (7)

    μ_u^σ(u_k; k) = exp{σ ½ u_k^T R_k u_k}; k = 0, ..., N−1   (8)

and

    Q_k ≥ 0 (positive semidefinite); k = 0, ..., N   (9)

    R_k > 0 (positive definite); k = 0, ..., N−1.   (10)

Note that (6) can be written as

    V^σ(x_0) = σ E[ exp{ σ ½ ( Σ_{k=0}^{N−1} (x_k^T Q_k x_k + u_k^T R_k u_k) + x_N^T Q_N x_N ) } ].   (11)

D. Problem

We are required to find a policy

    u_k^σ = C_k^σ(X_k), k = 0, ..., N−1; X_k ≜ {x_0, x_1, ..., x_k}   (12)

which minimizes performance criterion (11). Note that V^−(x_0) and V^+(x_0) for arbitrary controls {u_k} are bounded as follows:

    −1 ≤ V^−(x_0) ≤ 0,  1 ≤ V^+(x_0) ≤ ∞.   (13)

III. FORMULATION OF LE±P PROBLEMS

If no noise is present,

    ω_k ≡ 0; k = 0, ..., N−1.   (14)

Minimization of (11) is then equivalent to minimization of

    ½ ( Σ_{k=0}^{N−1} (x_k^T Q_k x_k + u_k^T R_k u_k) + x_N^T Q_N x_N )   (15)

subject to

    x_{k+1} = A_k x_k + B_k u_k; k = 0, ..., N−1,   (16)

which is a standard LQP. Thus LE−P and LE+P are equivalent, and both will be referred to as LEP. As the solution of the LQP is well known, we state it now without proof: the optimal controller for the LEP (LQP) is the linear feedback law u_k = −D_k x_k, where the gains D_k are computed from the standard Riccati difference equation.

IV. SOLUTION OF DISCRETE-TIME LE±G PROBLEMS

Applying dynamic programming to (11), the optimal value function J^σ(x_k; k) satisfies

    J^σ(x_N; N) = σ μ_x^σ(x_N; N);
    J^σ(x_k; k) = min_{u_k} E_{ω_k}[ μ_x^σ(x_k; k) μ_u^σ(u_k; k) J^σ(x_{k+1}; k+1) ]; k = 0, ..., N−1.   (28)

The solution has the form

    J^σ(x_k; k) = σ F_k^σ exp{σ ½ x_k^T W_k^σ x_k}   (30)

where, for k = 0, ..., N−1,

    F_k^σ = F_{k+1}^σ |P_k|^{1/2} |P_k − σ Γ_k^T W_{k+1}^σ Γ_k|^{−1/2}   (31)

    W_k^σ = Q_k + A_k^T W̃_{k+1}^σ A_k − A_k^T W̃_{k+1}^σ B_k (R_k + B_k^T W̃_{k+1}^σ B_k)^{−1} B_k^T W̃_{k+1}^σ A_k   (32)

with

    W̃_{k+1}^σ ≜ W_{k+1}^σ + σ W_{k+1}^σ Γ_k (P_k − σ Γ_k^T W_{k+1}^σ Γ_k)^{−1} Γ_k^T W_{k+1}^σ   (33)

and boundary condition

    W_N^σ = Q_N.   (34)

In addition, we have that

    F_N^σ = 1   (35)

and the optimal policy is

    u_k^σ = −C_k^σ x_k   (36)

where

    C_k^σ ≜ (R_k + B_k^T W̃_{k+1}^σ B_k)^{−1} B_k^T W̃_{k+1}^σ A_k; k = 0, ..., N−1.   (37)

In order to prove that (30) and (36) solve (28), we need the following probably well-known but underexploited lemma.

Lemma: If (P_k − σ Γ_k^T W_{k+1}^σ Γ_k) > 0, then

    E_{ω_k} exp{σ ½ (A_k x_k + B_k u_k + Γ_k ω_k)^T W_{k+1}^σ (A_k x_k + B_k u_k + Γ_k ω_k)}
      = |P_k|^{1/2} |P_k − σ Γ_k^T W_{k+1}^σ Γ_k|^{−1/2} exp{σ ½ (A_k x_k + B_k u_k)^T W̃_{k+1}^σ (A_k x_k + B_k u_k)}   (38)

where W̃_{k+1}^σ is defined in (33).

Proof: See the Appendix.

Substituting (30) into (28) and using the lemma and (31), we obtain

    σ exp{σ ½ x_k^T W_k^σ x_k} = min_{u_k} σ μ_x^σ(x_k; k) μ_u^σ(u_k; k) exp{σ ½ (A_k x_k + B_k u_k)^T W̃_{k+1}^σ (A_k x_k + B_k u_k)}.   (39)

Equation (39) is satisfied by (32), (36), and (37), so that the LE±G problem is indeed solved. As in the LEP (LQP), it is easy to verify that, under assumptions (4), (9), and (10), W_k^− and W̃_k^− are positive semidefinite for k = 0, ..., N, so that (R_k + B_k^T W̃_{k+1}^− B_k) > 0, which ensures that (32), (33), (35), (36), and (37) are well defined for negative σ. Alternatively, the development could be continued and identical results would be obtained.

V. PROPERTIES OF SOLUTIONS OF DISCRETE-TIME LE±G PROBLEMS

A. The LE−G Problem

The optimal feedback controller for the LE−G problem is a linear function of the system state

    u_k^− = −C_k^− x_k; k = 0, ..., N−1.   (40)

The main difference between this and the feedback law for the LQG problem is that C_k^− depends upon P_k^{−1}, the covariance matrix of the Gaussian additive disturbance ω_k. In the LQG case the optimal feedback law is independent of the covariance of the input noise and, indeed, is the same as that for the deterministic LQP (so-called certainty equivalence principle). Here, in the case where our criterion is the expected value of minus an exponential function of a negative-semidefinite quadratic form, the certainty equivalence principle does not hold. Of major interest are the intermediate cases, in which

    0 < P_k^{−1} < ∞; k = 0, ..., N−1,   (41)

for which the new controller (36) offers an alternative to the standard LQG solution.

It is interesting to investigate two limiting cases: the first in which λ_min(P_k) → ∞ (input ω_k → 0, k = 0, ..., N−1), and the second in which λ_min(P_k^{−1}) → ∞ (input "infinitely wild").

Case 1 (λ_min(P_k) → ∞; k = 0, ..., N−1): In this case it is clear⁴ from (30), (32), and (33) that

    C_k^− → D_k; k = 0, ..., N−1,   (42)

the optimal gains for the LQP (LEP). Note, from (30) and (33), that

    J^−(x_k; k) → −exp{−½ x_k^T W_k^− x_k}, k = 0, ..., N.   (43)

Thus, for small noise intensities (P_k^{−1} small, k = 0, ..., N−1), the solution of the LE−G problem is close to that of the LEP, LQP, and LQG problems.

Case 2 (λ_min(P_k^{−1}) → ∞; k = 0, ..., N−1): Here we shall assume that

    Γ_k^T Q_{k+1} Γ_k > 0, k = 0, ..., N−1.   (44)

Since W_{k+1}^− ≥ Q_{k+1}, this implies Γ_k^T W_{k+1}^− Γ_k > 0, and as P_k → 0 (covariance unbounded) it follows from (33) that

    W̃_{k+1}^− → W_{k+1}^− − W_{k+1}^− Γ_k (Γ_k^T W_{k+1}^− Γ_k)^{−1} Γ_k^T W_{k+1}^−; k = 0, ..., N−1.   (45)

Note that if Γ_k has rank n for k = 0, ..., N−1, then W̃_{k+1}^− → 0, so that, from (31)–(33),

    C_k^− → 0; k = 0, ..., N−1   (46)

and, from (30) and (35),

    J^−(x_k; k) → 0; k = 0, ..., N−1.   (47)

An explanation for (46) is: if all components of x_k are disturbed by an infinitely wild additive noise, then there is no point [as far as performance criterion (6) is concerned] in exercising control to try to counteract these infinite unpredictable disturbances.

⁴ These limiting cases can be argued rigorously; the arguments are straightforward and are left to the reader.

B. The LE+G Problem

The optimal feedback controller for the LE+G problem is likewise a linear function of the system state, u_k^+ = −C_k^+ x_k, with C_k^+ given by (37) for σ = +. As in the LE−G problem, the certainty equivalence principle does not hold because C_k^+ depends upon the covariance of the additive process noise.
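The dependence of the gains on the process-noise covariance can be illustrated numerically. The sketch below (a minimal scalar example with hypothetical values; all names are illustrative, not from the paper) implements the backward recursion of (32), (33), and (37) for σ = −1 and shows the gain approaching the LQ gain for small covariance and collapsing toward zero for large covariance, as in the two limiting cases discussed above.

```python
import numpy as np

def leg_gains(A, B, G, Q, R, QN, Pinv, N, sigma=-1.0):
    """Backward recursion (scalar case) for W_k and the gains C_k of (32),
    (33), (37).  Pinv is the noise covariance P**-1; sigma = -1 is LE-G."""
    W = QN
    gains = []
    for _ in range(N):
        P = 1.0 / Pinv
        denom = P - sigma * G * W * G        # must be > 0 (lemma condition)
        assert denom > 0, "requires P - sigma*G'*W*G > 0"
        Wt = W + sigma * W * G * (1.0 / denom) * G * W   # W-tilde, eq. (33)
        C = (B * Wt * A) / (R + B * Wt * B)              # gain C_k, eq. (37)
        W = Q + A * Wt * A - A * Wt * B * C              # update,   eq. (32)
        gains.append(C)
    return gains[::-1]

# Hypothetical scalar data: unstable plant A = 1.2, unit weights.
A, B, G, Q, R, QN = 1.2, 1.0, 1.0, 1.0, 1.0, 1.0
lq = leg_gains(A, B, G, Q, R, QN, Pinv=1e-9, N=20)   # near-zero covariance
wild = leg_gains(A, B, G, Q, R, QN, Pinv=1e6, N=20)  # "infinitely wild" noise
print(lq[0], wild[0])
```

With near-zero covariance the first gain is the familiar LQ gain; with very large covariance the gains are essentially zero, the controller "giving up" as in Case 2.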
We again consider the two limiting cases of zero noise and "infinite" noise.

Case 1 (λ_min(P_k) → ∞; k = 0, ..., N−1): In this case, as the covariance matrix tends to zero, we see that

    C_k^+ → D_k; k = 0, ..., N−1   (51)

and

    J^+(x_k; k) → exp{½ x_k^T W_k^+ x_k}, k = 0, ..., N−1,   (52)

so that for small noise intensity the solution of the LE+G problem is close to that of the LEP, LQP, and LQG problems.

Case 2 (λ_min(P_k^{−1}) → ∞; k = 0, ..., N−1): For P_k sufficiently small (i.e., large covariance), J^+(x_k; k) can cease to exist. To see this, let us assume that

    Γ_k^T Q_{k+1} Γ_k > 0; k = 0, ..., N−1   (53)

and that

    P_j − Γ_j^T W_{j+1}^+ Γ_j > 0; j = k+1, ..., N−1.   (54)

From (53), (54), (32), and (33) we have that

    Γ_k^T W_{k+1}^+ Γ_k > 0,   (55)

so that, for P_k sufficiently small,

    P_k − Γ_k^T W_{k+1}^+ Γ_k is not positive definite,   (56)

which implies that the left-hand side of (38) is infinite. Clearly, then,

    J^+(x_k; k) is infinite.   (57)

Since k is arbitrary, k ∈ {0, ..., N−1}, we can conclude that if the noise covariance is sufficiently large, the performance criterion V^+(x_0) is infinite, regardless of the choice of controls {u_k}. We shall have more to say about this interesting case when we treat the continuous-time LE+G problem in Section VIII.

VI. THE DISCRETE-TIME LE±G PROBLEMS AND DETERMINISTIC GAMES

A. The LE−G Problem

The solution of the LE−G problem is, by inspection (or short calculation), equivalent to the solution of the following cooperative deterministic game (LQP):

    ½ x_k^T W_k^− x_k = min_{{u_i}} min_{{α_i}} ½ [ Σ_{i=k}^{N−1} (x_i^T Q_i x_i + u_i^T R_i u_i + α_i^T P_i α_i) + x_N^T Q_N x_N ]   (58)

subject to the dynamic constraint

    x_{i+1} = A_i x_i + B_i u_i + Γ_i α_i; i = k, ..., N−1; x_k given.   (59)

It turns out that the optimal feedback laws (policies) are

    u_k^− = −C_k^− x_k, α_k^− = −Λ_k^− x_k; k = 0, ..., N−1.   (61)

Note that in the above formulation we determine optimal control laws (policies) for both players.

We now have a new interpretation for the linear-quadratic game. If player u_k assumes that player α_k will cooperate in minimizing the quadratic criterion (even though u_k knows that α_k behaves as a Gaussian random variable), then the feedback controller (policy) that is obtained for u_k upon solving (58) and (59), namely,

    u_k^− = −C_k^− x_k; k = 0, ..., N−1,   (62)

is optimal also for the LE−G problem. Thus the policy for u_k obtained by treating α_k as a cooperative player makes sense when interpreted as the solution of the stochastic LE−G problem.

B. The LE+G Problem

Here, the deterministic game that has an equivalent solution is noncooperative, namely,

    ½ x_k^T W_k^+ x_k = min_{{u_i}} max_{{α_i}} ½ [ Σ_{i=k}^{N−1} (x_i^T Q_i x_i + u_i^T R_i u_i − α_i^T P_i α_i) + x_N^T Q_N x_N ]   (63)

subject to (59), where the feedback laws (policies)

    u_k^+ = −C_k^+ x_k, α_k^+ = −Λ_k^+ x_k; k = 0, ..., N−1   (64)

are determined. It is well known that the min-max in (63) is attained provided

    P_k − Γ_k^T W_{k+1}^+ Γ_k > 0; k = 0, ..., N−1.   (65)

If the determinant of the left-hand side of (65) is nonzero but the matrix fails to be positive definite, then, as is well known, (63) ceases to be bounded. However, if the left-hand side of (65) is singular for some values of k ∈ {0, ..., N−1}, then (63) may exist. Thus, provided

    |P_k − Γ_k^T W_{k+1}^+ Γ_k| ≠ 0; k = 0, ..., N−1,   (67)

we have that J^+(x_k; k) is finite if and only if (63) is finite.

Our interpretation of the above noncooperative deterministic game is as follows: If player u_k assumes that α_k will not cooperate in minimizing the quadratic criterion (even though u_k knows that α_k behaves as a Gaussian random variable), then the feedback controller (policy) that is obtained for u_k upon solving (63), namely,

    u_k^+ = −C_k^+ x_k; k = 0, ..., N−1,   (68)

is optimal for the LE+G problem. Thus this rather conservative game formulation, in which the noise α_k is treated as a noncooperative player, gives rise to a control policy which solves the LE+G stochastic control problem. When looked at from this viewpoint, the min-max game solution for u_k ("worst case design") does not appear to be too pessimistic, since the performance criterion of the LE+G problem is rather appealing.

VII. FORMULATION OF CONTINUOUS-TIME LE±G PROBLEMS

In continuous time, the LE±G problems take the form

    V^σ(x(t_0)) = σ E[ exp{ σ ½ ( ∫_{t_0}^{t_f} (x^T Q x + u^T R u) dt + x^T(t_f) Q_f x(t_f) ) } ]   (69)

subject to

    ẋ = A x + B u + Γ α; x(t_0) given   (70)

where, for notational simplicity, the time dependence of the variables has been suppressed,⁵ and where α(·) is a Gaussian white-noise process having

    E[α(t)] = 0; t ∈ [t_0, t_f]   (71)

    E[α(t) α^T(s)] = P^{−1} δ(t − s); t, s ∈ [t_0, t_f]   (72)

where δ is the Dirac delta function. In (69) we seek an optimal control policy

    u^σ(x, t) = C^σ(x, t), t ∈ [t_0, t_f]   (73)

where C^σ: R^n × R^1 → R^m is a measurable function of its arguments; that is, we require the optimal controls in feedback (policy) form.

⁵ Note that Q ≥ 0, R > 0, P > 0 for all t ∈ [t_0, t_f], and that Q_f ≥ 0.

VIII. SOLUTION OF CONTINUOUS-TIME LE±G PROBLEMS AND RELATION TO DIFFERENTIAL GAMES

A. Solution of LE±G Problems

We can solve the continuous-time LE±G problems either by formally taking the limit of the solutions for the discrete-time cases or by solving the "generalized" Hamilton-Jacobi-Bellman equation

    −∂J^σ/∂t = min_u [ σ ½ (x^T Q x + u^T R u) J^σ(x, t) + J_x^σ (A x + B u) + ½ tr(Γ P^{−1} Γ^T J_{xx}^σ) ].

Using either method, we find that

    J^σ(x, t) = σ F^σ(t) exp{σ ½ x^T S^σ(t) x}

where S^σ(t) satisfies the Riccati differential equation

    −Ṡ^σ = Q + S^σ A + A^T S^σ − S^σ (B R^{−1} B^T − σ Γ P^{−1} Γ^T) S^σ; S^σ(t_f) = Q_f,

and

    u^σ(x, t) = −R^{−1} B^T S^σ(t) x

is the optimal policy. Because of our assumptions of positive (semi)definiteness of Q, R, P, and Q_f, it is known that S^−(t) exists for all t ∈ [t_0, t_f], so that (69) is well posed for σ = −.

B. Relation to Continuous-Time Differential Games

By inspection we see that the optimal controller for the LE−G problem (σ negative) is obtained from the solution of the following cooperative differential game: minimize over u(·) and α(·)

    ½ [ ∫_{t_0}^{t_f} (x^T Q x + u^T R u + α^T P α) dt + x^T(t_f) Q_f x(t_f) ]   (82)

subject to

    ẋ = A x + B u + Γ α; x(t_0) given,   (83)

which results in

    ½ x^T(t) S^−(t) x(t) = min_{u(·)} min_{α(·)} ½ [ ∫_t^{t_f} (x^T Q x + u^T R u + α^T P α) dt + x^T(t_f) Q_f x(t_f) ].

The optimal feedback laws are

    u^− = −R^{−1} B^T S^−(t) x, α^− = −P^{−1} Γ^T S^−(t) x.

In the case of the LE+G problem, the appropriate differential game is noncooperative, namely,

    ½ x^T(t) S^+(t) x(t) = min_{u(·)} max_{α(·)} ½ [ ∫_t^{t_f} (x^T Q x + u^T R u − α^T P α) dt + x^T(t_f) Q_f x(t_f) ]   (88)

provided that

    −Ṡ^+ = Q + S^+ A + A^T S^+ − S^+ (B R^{−1} B^T − Γ P^{−1} Γ^T) S^+; S^+(t_f) = Q_f   (89)

has a solution in [t, t_f]. Note that, by standard results on Riccati differential equations, (89) has a solution for all t ∈ [t_0, t_f] if

    B R^{−1} B^T − Γ P^{−1} Γ^T ≥ 0, t ∈ [t_0, t_f];   (90)

and so (90) guarantees the existence of J^+(x, t), t ∈ [t_0, t_f]. If (90) is not satisfied [say, for λ_min(P^{−1}) sufficiently large], then (89) may exhibit a finite escape time (S^+(t) → ∞ for some t ∈ [t_0, t_f]), which would imply that (88) is unbounded and also that J^+(x_0; t_0) is unbounded.

IX. SOME STABILITY PROPERTIES OF UNDISTURBED LINEAR SYSTEM CONTROLLED BY SOLUTION OF LE±G PROBLEMS

In this section we assume that all parameters are time invariant, and we briefly investigate the stability of the system

    ẋ = A x + B u,   (91)

controlled by the steady-state feedback laws u = −C_∞^− x or u = −C_∞^+ x.

A. Stability Properties of C_∞^−

Here we assume that the pair (A, B) is controllable and that Q > 0. These assumptions guarantee the existence of S_∞^−, the unique positive-definite steady-state solution of the Riccati equation. That is, S_∞^− > 0 satisfies

    Q + S_∞^− A + A^T S_∞^− − S_∞^− (B R^{−1} B^T + Γ P^{−1} Γ^T) S_∞^− = 0,   (92)

and we have the steady-state feedback gain

    C_∞^− = R^{−1} B^T S_∞^−.   (93)

We now define

    L ≜ ½ x^T S_∞^− x,   (94)

which is positive definite.
Along trajectories of (91) we have

    L̇ = ½ x^T (S_∞^− A + A^T S_∞^−) x − x^T S_∞^− B R^{−1} B^T S_∞^− x,   (95)

which, upon using (92), is

    L̇ = −½ x^T [Q + S_∞^− (B R^{−1} B^T − Γ P^{−1} Γ^T) S_∞^−] x.   (96)

Now if

    B R^{−1} B^T − Γ P^{−1} Γ^T ≥ 0   (97)

we have

    L̇ < 0, ∀ x ≠ 0,   (98)

and system (91), with controller C_∞^−, is asymptotically stable. Note that simple examples show that (91) can be unstable if condition (97) is violated.

B. Stability Properties of C_∞^+

In this case we assume condition (90), namely,

    B R^{−1} B^T − Γ P^{−1} Γ^T ≥ 0,   (99)

and also that Q > 0. Note that because of (99) we can write

    H̃ H̃^T ≜ B R^{−1} B^T − Γ P^{−1} Γ^T.   (100)

If we assume that the pair (A, H̃) is controllable, then it follows that there exists a unique positive-definite matrix S_∞^+ which satisfies

    Q + S_∞^+ A + A^T S_∞^+ − S_∞^+ (B R^{−1} B^T − Γ P^{−1} Γ^T) S_∞^+ = 0   (101)

and

    C_∞^+ = R^{−1} B^T S_∞^+.   (102)

Define

    L^+ ≜ ½ x^T S_∞^+ x.   (103)

It is easy to verify that L^+ is a Lyapunov function and that (91) with controller C_∞^+ is asymptotically stable. Note the interesting point that (97) is sufficient to guarantee asymptotic stability of (91) with controller C_∞^− or C_∞^+. In the first case, (97) is used to guarantee negativity of L̇, while in the second it is used to guarantee the existence of S_∞^+.

X. CONCLUSION

In this paper we have presented explicit (modulo solution of Riccati difference or differential equations) solutions of stochastic control problems having linear dynamics, additive Gaussian noise, and exponential objective functions. These solutions are linear feedback control policies which depend upon the covariance matrix of the additive process noise, so that the certainty equivalence principle of LQG theory does not hold. In certain applications these new controllers may be preferable, perhaps especially in economics, where multiplicative objective functions are of intrinsic interest.

By demonstrating certain equivalences between our stochastic control formulations and deterministic differential games, we are able to give a stochastic interpretation to min-max (worst case) design of linear systems. This min-max design is not unattractive, since it corresponds, in a stochastic setting, to minimization of the expected value of an exponential function of a quadratic form, which is quite an appealing criterion. Another significant result of these equivalences is that existence of solutions of the stochastic control problems implies and is implied by existence of solutions of the differential games. Hopefully these notions can be extended to provide existence results for nonlinear stochastic control problems and nonlinear differential games.

Certain stability properties of the steady-state solutions of the stochastic control problems are also investigated. In particular, we point out that the steady-state controller for the LE−G problem can result in an unstable dynamic system, while the steady-state controller for the LE+G problem, if it exists, always stabilizes the dynamic system. In this sense, the LE+G formulation is preferable.

Note that we have not considered in this paper the more complex problem in which noisy measurements of the state are made, viz.,

    x_{k+1} = A_k x_k + B_k u_k + Γ_k ω_k; z_k = H_k x_k + β_k; k = 0, ..., N−1   (104)

where {β_k, ω_k, x_0} are independent Gaussian random variables. In this case the optimal controls are restricted to the form

    u_k = C_k^σ(Z_k); Z_k ≜ {z_0, z_1, ..., z_k}; k = 0, ..., N−1.   (105)

The appropriate performance criterion is

    V^σ(x_0) = σ E[ exp{ σ ½ ( Σ_{k=0}^{N−1} (x_k^T Q_k x_k + u_k^T R_k u_k) + x_N^T Q_N x_N ) } ]   (106)

where σ is − or +. The above problem appears to be intrinsically much harder than the perfect-state case and could be the topic of a future paper.

APPENDIX

Lemma: If (P_k − σ Γ_k^T W_{k+1}^σ Γ_k) > 0, then

    E_{ω_k} exp{σ ½ (A_k x_k + B_k u_k + Γ_k ω_k)^T W_{k+1}^σ (A_k x_k + B_k u_k + Γ_k ω_k)}
      = |P_k|^{1/2} |P_k − σ Γ_k^T W_{k+1}^σ Γ_k|^{−1/2} exp{σ ½ (A_k x_k + B_k u_k)^T W̃_{k+1}^σ (A_k x_k + B_k u_k)}   (108)

where W̃_{k+1}^σ is defined in (33).

Proof: Using (1) and completing the square in ω_k, the left-hand side of (108) is equal to

    |P_k|^{1/2} |P_k − σ Γ_k^T W_{k+1}^σ Γ_k|^{−1/2} exp{σ ½ (A_k x_k + B_k u_k)^T W̃_{k+1}^σ (A_k x_k + B_k u_k)}
      × ∫ (2π)^{−q/2} |P_k − σ Γ_k^T W_{k+1}^σ Γ_k|^{1/2} exp{−½ (ω_k − ξ_k)^T (P_k − σ Γ_k^T W_{k+1}^σ Γ_k)(ω_k − ξ_k)} dω_k   (109)

where

    ξ_k ≜ σ (P_k − σ Γ_k^T W_{k+1}^σ Γ_k)^{−1} Γ_k^T W_{k+1}^σ (A_k x_k + B_k u_k).   (110)

The lemma is proved by (109), since the integrand is a probability density function having mean ξ_k and covariance

    (P_k − σ Γ_k^T W_{k+1}^σ Γ_k)^{−1}.   (111)

ACKNOWLEDGMENT

The author wishes to thank L. Zadeh for stimulating discussions, during the Spring of 1971, on fuzzy set theory, which contributed to the development of certain results in this paper. Also, critical comments from D. Mayne, Y. C. Ho, and J. Speyer are appreciated.

REFERENCE

[1] M. Athans, "The role and use of the stochastic linear-quadratic-Gaussian problem in control system design," IEEE Trans. Automat. Contr. (Special Issue on Linear-Quadratic-Gaussian Problem), vol. AC-16, pp. 529–552, Dec. 1971.

David H. Jacobson (M'69) was born in Johannesburg, South Africa, on February 23, 1943. He received the B.Sc. degree in engineering from the University of the Witwatersrand, Johannesburg, South Africa, in 1963, and the Ph.D. degree and D.I.C. in engineering from the Imperial College of Science and Technology, University of London, London, England, in 1967. He was a Postdoctoral Fellow in the Division of Engineering and Applied Physics, Harvard University, Cambridge, Mass., and in 1968 was appointed Assistant Professor. In July 1971 he became Associate Professor of Applied Mathematics. Currently he is Professor of Applied Mathematics at the University of the Witwatersrand. His interests are in the areas of optimal control theory and applications, and computing methods for the solution of dynamic optimization problems. In the latter area he has developed and applied the technique of differential dynamic programming. He is coauthor of the book Differential Dynamic Programming (New York: Elsevier, 1970). Dr. Jacobson is a graduate of the South African Institute of Electrical Engineers and was an Associate Editor of the IEEE TRANSACTIONS ON AUTOMATIC CONTROL from September 1971 to June 1972, when he returned to South Africa.
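The lemma proved in the Appendix can be sanity-checked numerically in the scalar case. The sketch below (hypothetical test values, not from the paper) compares a Monte Carlo estimate of the left-hand side of (108) for σ = −1 with the closed-form right-hand side.

```python
import numpy as np

# Scalar check of the Appendix lemma with sigma = -1:
#   E exp{-(1/2) W (a + g*w)^2},  w ~ N(0, 1/P)
# should equal
#   sqrt(P / (P + g*W*g)) * exp{-(1/2) Wt * a**2}
# with Wt = W - W*g*(P + g*W*g)**-1 * g*W  (the scalar form of (33)).
rng = np.random.default_rng(0)
a, g, W, P = 0.7, 1.3, 2.0, 4.0          # hypothetical test values

w = rng.normal(0.0, np.sqrt(1.0 / P), size=2_000_000)
mc = np.exp(-0.5 * W * (a + g * w) ** 2).mean()     # Monte Carlo estimate

Wt = W - W * g * (1.0 / (P + g * W * g)) * g * W    # scalar "W-tilde"
closed = np.sqrt(P / (P + g * W * g)) * np.exp(-0.5 * Wt * a ** 2)

print(mc, closed)   # the two values should agree closely
```

With two million samples the Monte Carlo estimate and the closed form typically agree to about three decimal places.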