УДК 519.837.4, 517.977.8 F. F. Nikitin
Вестник СПбГУ. Сер. 10. 2014. Вып. 2
VISCOSITY SOLUTIONS AND PROGRAMMED ITERATION METHOD FOR ISAACS EQUATION
St. Petersburg State University, 199034, St. Petersburg, Russian Federation
To solve zero-sum differential games Isaacs derived a PDE of Hamilton—Jacobi type for the value function. However, in many differential games the value function is not smooth. The theory of viscosity solutions overcomes the non-smoothness of the value function by introducing generalized solutions of PDEs. The programmed iteration method instead considers a functional equation for the value function, called the generalized Isaacs—Bellman equation. In this paper the connection between the theory of viscosity solutions and the programmed iteration method is studied. It turns out that the successive approximations used in the programmed iteration method for finding solutions of the generalized Isaacs—Bellman equation, as well as any fixed point of the value operators, are viscosity super- or subsolutions of the corresponding Isaacs equations. Bibliogr. 24.
Keywords: zero-sum differential games, viscosity solutions, Isaacs equation, programmed iteration method, value operators, value of differential game.
1. Introduction. Development of the theory of zero-sum differential games started independently in the 1960s with the work of Rufus Isaacs [1] and Lev Pontryagin [2].
Rufus Isaacs introduced a partial differential equation for the value function of the game, nowadays called the Isaacs equation, and provided a framework for finding its solution. Using the method of characteristics he solved many examples of differential games and inspired further research in this direction [3]. However, one of the obstacles to applying Isaacs' method and to building a rigorous mathematical theory was the non-smoothness of the value function arising in many differential games.
A rigorous theory for the value function of a differential game emerged in [4-8]. The idea was to approximate the differential game by discrete-time games. Sequences of lower and upper value functions were constructed, and it was proved that under certain conditions the lower and upper value functions converge to the same limit as the discretization step tends to zero. This limit was understood as the value function of the differential game.
An alternative approach, called the theory of positional differential games, was developed by Krasovskii and Subbotin [9]. They characterized the value of the game as a function satisfying certain stability conditions. Later Subbotin expressed the stability conditions as inequalities for Dini directional derivatives of locally Lipschitz functions [10], and this line of research evolved into the theory of minimax solutions of first order PDEs [11].
The theory of viscosity solutions for Hamilton-Jacobi equations started with the work of Crandall and Lions [12]. Later this theory was applied to differential games by Evans and Souganidis [13], who proved that the value function of the game is a viscosity solution of the Isaacs equation with a certain boundary condition. The equivalence of viscosity and minimax solutions was established by Subbotin [14].
The programmed iteration method in the theory of differential games was introduced in the 1970s independently by Chentsov [15, 16] and Chistyakov [17, 18]. Chentsov developed it within the theory of positional differential games as a means of constructing stable bridges, which are the crucial elements required to build solutions of differential games. Chistyakov introduced the programmed iteration method as a way to develop a rigorous theory of differential games and to bypass the difficulties of Isaacs' method. This paper continues the latter line of research.
The connection between the programmed iteration method and the theory of generalized solutions of PDEs was established by Chentsov and Subbotin in [19]. They proved that the sequences of successive approximations in the programmed iteration method converge to the minimax solution of the corresponding Isaacs equation. Due to the equivalence of minimax and viscosity solutions, this result implies that the limit of the sequence of successive approximations is, in turn, a viscosity solution.
In this paper it is shown that every fixed point of the maxmin value operator is a viscosity supersolution of the lower Isaacs equation and every fixed point of the minmax value operator is a viscosity subsolution of the upper Isaacs equation. From this and the cross property of the successive approximations (see lemma 3.6) it immediately follows that every maxmin (minmax) approximation produced by the programmed iteration method is a viscosity subsolution (supersolution) of the upper (lower) Isaacs equation. Under the Isaacs condition, since the successive approximations converge to a common fixed point of the value operators, it then follows at once that the common limit is a viscosity solution of the Isaacs equation.
The paper is organized as follows. In section 2 the zero-sum differential game under investigation is described and some facts about the Isaacs equation and viscosity solutions are discussed. Section 3 introduces the programmed iteration method and lists the main lemmas and theorems on value operators and successive approximations. Section 4 contains the main results of the paper.
2. Differential game and Isaacs equation. Consider a two-person zero-sum differential game of fixed duration $T - t_0$. The dynamics of the game is described by the differential equation

$$\frac{dx}{dt} = f(t, x, u, v) \qquad (1)$$

$(t \in [t_0,T],\ x \in \mathbb{R}^n,\ u \in P \subset \mathbb{R}^p,\ v \in Q \subset \mathbb{R}^q)$. The game starts at time $t_0$ from the position

$$x(t_0) = x_0 \qquad (2)$$

and ends at time $T$ with the terminal payoff

$$H(x(\cdot)) = H(x(T)). \qquad (3)$$
The player controlling the parameter $u$, at every moment $t$ and based on the knowledge of the position $(t, x(t))$, chooses $u(t) \in P$ and aims at minimizing the payoff (3). His/her opponent, possessing the same information about the position $(t, x(t))$, sets the control $v(t) \in Q$ at every moment $t$ with the goal of maximizing the same payoff (3).
The standard assumptions of the theory of zero-sum differential games on the dynamic system (1) and the payoff (3) are as follows. The function $f$ is continuous on the set $[t_0,T]\times\mathbb{R}^n\times P\times Q$, locally Lipschitz in $x$, i.e. for any compact subset $K \subset \mathbb{R}^n$ there exists $L > 0$ such that

$$\|f(t,x',u,v) - f(t,x'',u,v)\| \le L\|x' - x''\| \quad \forall t \in [t_0,T],\ \forall x',x'' \in K,\ \forall u \in P,\ \forall v \in Q,$$

and satisfies, for some $A > 0$, the linear growth condition

$$\|f(t,x,u,v)\| \le A(1 + \|x\|) \quad \forall t \in [t_0,T],\ \forall x \in \mathbb{R}^n,\ \forall u \in P,\ \forall v \in Q.$$
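For instance (an illustrative example added here for concreteness, not taken from the original text), dynamics affine in the state and the controls,

$$f(t,x,u,v) = M(t)x + B(t)u + C(t)v,$$

with matrix-valued functions $M(\cdot)$, $B(\cdot)$, $C(\cdot)$ continuous on $[t_0,T]$ and compact sets $P$, $Q$, satisfies all three assumptions: continuity is evident, the Lipschitz constant can be taken as $L = \max_{t\in[t_0,T]}\|M(t)\|$ (even globally in $x$), and the growth constant as $A = \max_{t\in[t_0,T]}\bigl(\|M(t)\| + \max_{u\in P}\|B(t)u\| + \max_{v\in Q}\|C(t)v\|\bigr)$.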
These three assumptions guarantee that for any initial position (2) and any Lebesgue measurable controls $u(\cdot) : [t_0,T] \to P$ and $v(\cdot) : [t_0,T] \to Q$ equation (1) has a unique solution on $[t_0,T]$ [20]. Let $L([t_0,T],S)$ denote the set of Lebesgue measurable functions mapping $[t_0,T]$ to $S$; then the sets

$$U_{t_0} = \{u(\cdot) : [t_0,T] \to P \mid u(\cdot) \in L([t_0,T],P)\}$$

and

$$V_{t_0} = \{v(\cdot) : [t_0,T] \to Q \mid v(\cdot) \in L([t_0,T],Q)\}$$

constitute the sets of admissible open-loop controls of the players.
In the theory of zero-sum differential games one extra condition is imposed on the right-hand side of (1). Namely, the function $f$ is such that

$$\max_{v\in Q}\min_{u\in P}\langle l, f(t,x,u,v)\rangle = \min_{u\in P}\max_{v\in Q}\langle l, f(t,x,u,v)\rangle \qquad (4)$$

holds for any $l \in \mathbb{R}^n$ and any $(t,x) \in [t_0,T]\times\mathbb{R}^n$. This assumption is called the Isaacs condition, or the condition of existence of a saddle point in the small game. In this paper the validity of the Isaacs condition is not assumed by default; it is mentioned explicitly whenever used.
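A simple sufficient condition (an illustrative remark added here, not contained in the original text) is dynamics separated in the controls, $f(t,x,u,v) = g(t,x,u) + h(t,x,v)$. Indeed, in this case

$$\max_{v\in Q}\min_{u\in P}\langle l, f(t,x,u,v)\rangle = \min_{u\in P}\langle l, g(t,x,u)\rangle + \max_{v\in Q}\langle l, h(t,x,v)\rangle = \min_{u\in P}\max_{v\in Q}\langle l, f(t,x,u,v)\rangle,$$

so (4) holds for every $l$ and every $(t,x)$.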
The only assumption on the terminal payoff is that the function $H$ is continuous on $\mathbb{R}^n$. Under all of these assumptions, including the Isaacs condition, it is known [4-8] that the differential game (1)-(3) has a value, denoted by $w^*(t_0,x_0)$.
Let $D \subset [t_0,T]\times\mathbb{R}^n$ be the set of all positions $(t,x)$ of the game (1)-(3) (denoted by $\Gamma(t_0,x_0)$) that are achievable from the initial position $(t_0,x_0)$. Then one considers the set

$$\Gamma(D) = \{\Gamma(t,x) \mid (t,x) \in D\}$$

of differential games parameterized by the initial position $(t,x)$. The function $w^*(t,x)$, considered as a function of the position $(t,x)$, is called the value function of the differential game (1)-(3).
In his work [1] Isaacs derived a partial differential equation for the value function of the game in the following form:

$$\frac{\partial w}{\partial t}(t,x) + \max_{v\in Q}\min_{u\in P}\langle \nabla w(t,x), f(t,x,u,v)\rangle = 0. \qquad (5)$$

Since the payoff is of terminal type, the value function satisfies the natural boundary condition

$$w(T,x) = H(x). \qquad (6)$$
Isaacs showed that if the value function of the game exists and is of class $C^1$, then (5), (6) hold. Note that the above is valid under condition (4). If (4) does not hold, one considers the pair of equations

$$\frac{\partial w}{\partial t}(t,x) + \max_{v\in Q}\min_{u\in P}\langle \nabla w(t,x), f(t,x,u,v)\rangle = 0, \qquad (7)$$

$$\frac{\partial w}{\partial t}(t,x) + \min_{u\in P}\max_{v\in Q}\langle \nabla w(t,x), f(t,x,u,v)\rangle = 0. \qquad (8)$$

They are called the lower and upper Isaacs equations, respectively.
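The names reflect the elementary inequality (a short remark added here for clarity, not part of the original text): for any $l$, $t$, $x$,

$$\max_{v\in Q}\min_{u\in P}\langle l, f(t,x,u,v)\rangle \le \min_{u\in P}\max_{v\in Q}\langle l, f(t,x,u,v)\rangle,$$

since for every fixed pair $(u,v)$ the inner minimum on the left does not exceed $\langle l, f(t,x,u,v)\rangle$; hence the left-hand side does not exceed $\max_{v\in Q}\langle l, f(t,x,u,v)\rangle$ for every $u$, and therefore it does not exceed the right-hand side. The Isaacs condition (4) is exactly the case of equality.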
As mentioned above, one of the obstacles to Isaacs' method is that in many differential games the value function is not smooth, i.e. does not belong to the class $C^1$. The theory of viscosity solutions generalizes the notion of a solution of a PDE to the non-smooth case. The definitions for the particular case of the Isaacs PDE are as follows.
Definition 2.1. A lower semi-continuous function $w^+(\cdot) : [t_0,T]\times\mathbb{R}^n \to \mathbb{R}$ is a viscosity supersolution of (5) if and only if for any function $\phi(\cdot) \in C^1((t_0,T)\times\mathbb{R}^n)$ and any point $(t_*,x_*) \in (t_0,T)\times\mathbb{R}^n$ at which $w^+ - \phi$ attains a local minimum one has

$$\frac{\partial \phi}{\partial t}(t_*,x_*) + \max_{v\in Q}\min_{u\in P}\langle \nabla\phi(t_*,x_*), f(t_*,x_*,u,v)\rangle \le 0.$$
Definition 2.2. An upper semi-continuous function $w^-(\cdot) : [t_0,T]\times\mathbb{R}^n \to \mathbb{R}$ is a viscosity subsolution of (5) if and only if for any function $\phi(\cdot) \in C^1((t_0,T)\times\mathbb{R}^n)$ and any point $(t_*,x_*) \in (t_0,T)\times\mathbb{R}^n$ at which $w^- - \phi$ attains a local maximum one has

$$\frac{\partial \phi}{\partial t}(t_*,x_*) + \max_{v\in Q}\min_{u\in P}\langle \nabla\phi(t_*,x_*), f(t_*,x_*,u,v)\rangle \ge 0.$$
Definition 2.3. A viscosity solution of (5) is a function $w(\cdot) : [t_0,T]\times\mathbb{R}^n \to \mathbb{R}$ which is a viscosity supersolution and a viscosity subsolution of (5) at the same time.
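Note (an explanatory remark added here, not present in the original text) that for a function $w(\cdot)$ of class $C^1$ these definitions reduce to the classical one. If such a $w$ is a viscosity solution, then taking $\phi = w$ as the test function, $w - \phi \equiv 0$ attains a local minimum and a local maximum at every interior point, so both inequalities hold simultaneously and

$$\frac{\partial w}{\partial t}(t,x) + \max_{v\in Q}\min_{u\in P}\langle \nabla w(t,x), f(t,x,u,v)\rangle = 0$$

in the usual sense. Conversely, if a $C^1$ function $w$ satisfies (5) classically, then at any interior local extremum of $w - \phi$ the derivatives of $\phi$ coincide with those of $w$, so both test inequalities hold. The notion becomes non-trivial precisely when $w$ is not differentiable.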
3. Programmed iteration method. Consider the space $UC(D)$ of uniformly continuous functions $w(\cdot) : D \to \mathbb{R}$. The operator $\Phi_-$ defined on the space $UC(D)$ maps a function $w(\cdot) \in UC(D)$ to the function $\Phi_- \circ w(\cdot)$ whose value at a point $(t_*,x_*)$ is

$$\Phi_- \circ w(t_*,x_*) = \max_{t\in[t_*,T]}\ \max_{v\in Q}\ \inf_{u(\cdot)\in U_{t_*}} w\bigl(t, x(t,t_*,x_*,u(\cdot),v)\bigr). \qquad (9)$$

The operator $\Phi_-$ in (9) is called the maxmin value operator. In a similar way the operator $\Phi_+$ is defined by the formula

$$\Phi_+ \circ w(t_*,x_*) = \min_{t\in[t_*,T]}\ \min_{u\in P}\ \sup_{v(\cdot)\in V_{t_*}} w\bigl(t, x(t,t_*,x_*,u,v(\cdot))\bigr), \qquad (10)$$

where, as before, $\Phi_+ \circ w(t_*,x_*)$ denotes the value of the image of the function $w(\cdot)$ at the point $(t_*,x_*)$. The operator $\Phi_+$ in (10) is called the minmax value operator. Here $U_{t_*}$ and $V_{t_*}$ are the sets of admissible open-loop controls on $[t_*,T]$, defined in the same way as $U_{t_0}$, $V_{t_0}$, and $x(\cdot,t_*,x_*,u(\cdot),v)$ denotes the solution of (1) issued from the position $(t_*,x_*)$.
In the programmed iteration method two sequences of successive approximations are constructed,

$$w_-^{(n)}(\cdot) = \Phi_- \circ w_-^{(n-1)}(\cdot), \qquad (11)$$

$$w_+^{(n)}(\cdot) = \Phi_+ \circ w_+^{(n-1)}(\cdot), \qquad (12)$$

with the initial approximations

$$w_-^{(0)}(t_*,x_*) = \max_{v\in Q}\ \inf_{u(\cdot)\in U_{t_*}} H\bigl(x(T,t_*,x_*,u(\cdot),v)\bigr), \qquad (13)$$

$$w_+^{(0)}(t_*,x_*) = \min_{u\in P}\ \sup_{v(\cdot)\in V_{t_*}} H\bigl(x(T,t_*,x_*,u,v(\cdot))\bigr). \qquad (14)$$

They are called the maxmin and minmax approximations, respectively.
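To make the constructions (9)-(14) more tangible, the following sketch (an editor's illustration in Python; it is not part of the original paper) mimics the value operators and the successive approximations on a grid. The measurable open-loop controls are replaced by controls that are constant on each interval of a time grid, the exact trajectories of (1) by explicit Euler steps, and functions on $D$ by their values on a finite state grid with linear interpolation. All identifiers (euler_step_min, apply_maxmin, the grids ts, xs, the particular f, H, P, Q and the stopping tolerance) are the editor's illustrative choices, not the paper's; the sketch shows the structure of the method rather than computing the value accurately.

```python
# Grid-based caricature of the value operators (9), (10) and iterations (11)-(14).
# Simplifications vs. the paper: controls constant on each grid interval instead of
# measurable ones, explicit Euler steps instead of exact trajectories of (1), and a
# finite state grid with linear interpolation (clamped outside the grid).
import numpy as np

T, t0 = 1.0, 0.0
P = np.linspace(-1.0, 1.0, 5)           # finite sample of the minimizer's set P
Q = np.linspace(-1.0, 1.0, 5)           # finite sample of the maximizer's set Q
f = lambda t, x, u, v: u + v            # dynamics (1); scalar state for simplicity
H = lambda x: x ** 2                    # terminal payoff (3)

n_t, n_x = 21, 201
ts, xs = np.linspace(t0, T, n_t), np.linspace(-3.0, 3.0, n_x)
dt = ts[1] - ts[0]

def euler_step_min(mu, l, v):
    """min over u in P of mu taken after one Euler step of (1) from the grid xs."""
    best = np.full(n_x, np.inf)
    for u in P:
        best = np.minimum(best, np.interp(xs + dt * f(ts[l], xs, u, v), xs, mu))
    return best

def euler_step_max(mu, l, u):
    """max over v in Q of mu taken after one Euler step of (1) from the grid xs."""
    best = np.full(n_x, -np.inf)
    for v in Q:
        best = np.maximum(best, np.interp(xs + dt * f(ts[l], xs, u, v), xs, mu))
    return best

def apply_maxmin(w):
    """Discrete analogue of the maxmin value operator Phi_- in (9): max over later
    grid times t_j and over constant v of inf over step controls u of w(t_j, x(t_j))."""
    out = w.copy()                               # the choice t = t_* returns w itself
    for v in Q:
        for j in range(n_t - 1, 0, -1):          # target time t_j
            mu = w[j].copy()
            for l in range(j - 1, -1, -1):       # dynamic programming in u, v frozen
                mu = euler_step_min(mu, l, v)
                out[l] = np.maximum(out[l], mu)  # running max over t_j and over v
    return out

def apply_minmax(w):
    """Discrete analogue of the minmax value operator Phi_+ in (10)."""
    out = w.copy()
    for u in P:
        for j in range(n_t - 1, 0, -1):
            mu = w[j].copy()
            for l in range(j - 1, -1, -1):       # dynamic programming in v, u frozen
                mu = euler_step_max(mu, l, u)
                out[l] = np.minimum(out[l], mu)
    return out

def initial_maxmin():
    """Discrete analogue of the initial approximation (13)."""
    w0 = np.full((n_t, n_x), -np.inf)
    w0[-1] = H(xs)
    for v in Q:
        mu = H(xs)
        for l in range(n_t - 2, -1, -1):
            mu = euler_step_min(mu, l, v)
            w0[l] = np.maximum(w0[l], mu)        # outer max over constant v in Q
    return w0

def initial_minmax():
    """Discrete analogue of the initial approximation (14)."""
    w0 = np.full((n_t, n_x), np.inf)
    w0[-1] = H(xs)
    for u in P:
        mu = H(xs)
        for l in range(n_t - 2, -1, -1):
            mu = euler_step_max(mu, l, u)
            w0[l] = np.minimum(w0[l], mu)        # outer min over constant u in P
    return w0

w_minus, w_plus = initial_maxmin(), initial_minmax()      # (13) and (14)
for k in range(1, 31):                                    # iterations (11) and (12)
    new_minus, new_plus = apply_maxmin(w_minus), apply_minmax(w_plus)
    change = max(np.abs(new_minus - w_minus).max(), np.abs(new_plus - w_plus).max())
    w_minus, w_plus = new_minus, new_plus
    if change < 1e-9:                                     # both sequences stabilized
        break
print(f"stopped after {k} iterations; sup|w_plus - w_minus| = "
      f"{np.abs(w_plus - w_minus).max():.3e}")
```

Since the dynamics $u + v$ chosen above is separated in the controls, the Isaacs condition (4) holds; in view of the result of [22] quoted after theorem 3.2 below, the exact sequences (11) and (12) would then converge to one common fixed point, and the printed gap between the two grid approximations gives a rough numerical check of this, up to discretization and interpolation errors.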
The space $UC(D)$ is equipped with a partial order and a distance function. The partial order is defined naturally: two functions $w_1(\cdot), w_2(\cdot) \in UC(D)$ satisfy the relation $w_1(\cdot) \le w_2(\cdot)$ if and only if

$$w_1(t_*,x_*) \le w_2(t_*,x_*) \quad \forall (t_*,x_*) \in D.$$

The structure of a metric space on the set $UC(D)$ is introduced by the uniform distance function $\rho(\cdot,\cdot)$ given by

$$\rho(w_1(\cdot), w_2(\cdot)) = \sup_{(t_*,x_*)\in D} |w_1(t_*,x_*) - w_2(t_*,x_*)|.$$
Below the main properties of the value operators and of the successive approximations are formulated; for proofs see [15-18, 21].
Lemma 3.1. The operators $\Phi_-$ and $\Phi_+$ preserve the order on the set $UC(D)$, that is,

$$w_1(\cdot) \le w_2(\cdot) \ \Rightarrow\ \Phi_- \circ w_1(\cdot) \le \Phi_- \circ w_2(\cdot),$$

$$w_1(\cdot) \le w_2(\cdot) \ \Rightarrow\ \Phi_+ \circ w_1(\cdot) \le \Phi_+ \circ w_2(\cdot).$$
Lemma 3.2. For any function $w(\cdot) \in UC(D)$

$$\Phi_- \circ w(\cdot) \ge w(\cdot), \qquad \Phi_+ \circ w(\cdot) \le w(\cdot).$$
Lemma 3.3. The operators $\Phi_-$ and $\Phi_+$ leave the set $UC(D)$ invariant, i.e. for any $w(\cdot) \in UC(D)$

$$\Phi_- \circ w(\cdot) \in UC(D), \qquad \Phi_+ \circ w(\cdot) \in UC(D).$$
Lemma 3.4. The operators $\Phi_-$ and $\Phi_+$ are continuous and satisfy the Lipschitz condition (in the sense of the metric $\rho$) with constant $L = 1$. Moreover, the Lipschitz condition does not hold for any smaller constant.
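The first part of lemma 3.4 can be read as non-expansiveness of the value operators (an explanatory remark added here, not part of the original text): since $w_1 \le w_2 + \rho(w_1,w_2)$ pointwise, and since both operators are monotone (lemma 3.1) and, as is immediate from (9), (10), commute with the addition of a constant, one gets $\Phi_\pm \circ w_1 \le \Phi_\pm \circ w_2 + \rho(w_1,w_2)$; by symmetry, $\rho(\Phi_\pm \circ w_1, \Phi_\pm \circ w_2) \le \rho(w_1,w_2)$.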
Lemma 3.5. The pair of equations

$$\Phi_- \circ w(\cdot) = w(\cdot), \qquad \Phi_+ \circ w(\cdot) = w(\cdot)$$

is equivalent to the equation

$$\Phi_- \circ w(\cdot) = \Phi_+ \circ w(\cdot).$$
Lemma 3.6. Any successive approximation for one value operator is a fixed point of the other value operator, i.e. for any $k \ge 0$

$$\Phi_- \circ w_+^{(k)}(\cdot) = w_+^{(k)}(\cdot), \qquad \Phi_+ \circ w_-^{(k)}(\cdot) = w_-^{(k)}(\cdot).$$
Theorem 3.1. For any $g_-^{(0)}(\cdot), g_+^{(0)}(\cdot) \in UC(D)$ the successive approximations

$$g_-^{(n)}(\cdot) = \Phi_- \circ g_-^{(n-1)}(\cdot), \qquad g_+^{(n)}(\cdot) = \Phi_+ \circ g_+^{(n-1)}(\cdot)$$

converge in the space $UC(D)$, i.e. uniformly on $D$.
From lemmas 3.4, 3.6 and theorem 3.1 follows
Theorem 3.2. The successive approximations (11) and (12) converge to common fixed points of the value operators.
The lemmas and theorems formulated above do not rely on the Isaacs condition (4). In [22] it was proved that if the Isaacs condition holds, then the successive approximations (11) and (12) converge to the same common fixed point of the value operators, and this is the unique common fixed point of the value operators satisfying the boundary condition (6). Moreover, the common fixed point of the value operators which satisfies (6) is indeed the value function of the game considered in positional and so-called recursive strategies [21, 23]. The equation

$$\Phi_- \circ w(\cdot) = \Phi_+ \circ w(\cdot)$$

got the name of the generalized Isaacs-Bellman equation due to lemma 3.5 and the following

Theorem 3.3. Under the Isaacs condition, if $w^*(\cdot) \in C^1(D)$ then $w^*(\cdot)$ is a common fixed point of the value operators if and only if it satisfies the Isaacs-Bellman equation

$$\frac{\partial w^*}{\partial t}(t_*,x_*) + \max_{v\in Q}\min_{u\in P}\langle \nabla w^*(t_*,x_*), f(t_*,x_*,u,v)\rangle = 0 \quad \forall (t_*,x_*) \in D.$$
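A heuristic explanation of the "only if" part (an editor's sketch under smoothness assumptions, not the proof given in the cited works) runs as follows. If $w$ is smooth and $\Phi_- \circ w = w$, then by (9), for every constant $v \in Q$ and small $\delta > 0$,

$$\inf_{u(\cdot)\in U_{t_*}} \bigl[w\bigl(t_*+\delta,\, x(t_*+\delta,t_*,x_*,u(\cdot),v)\bigr) - w(t_*,x_*)\bigr] \le 0;$$

writing the increment of $w$ along the trajectory as the integral of $\frac{\partial w}{\partial t} + \langle \nabla w, f\rangle$, dividing by $\delta$ and letting $\delta \to 0$ gives

$$\frac{\partial w}{\partial t}(t_*,x_*) + \max_{v\in Q}\min_{u\in P}\langle \nabla w(t_*,x_*), f(t_*,x_*,u,v)\rangle \le 0.$$

The symmetric argument for $\Phi_+ \circ w = w$ gives the opposite inequality with $\min_{u}\max_{v}$, and under the Isaacs condition (4) the two Hamiltonians coincide, which yields the equation of theorem 3.3. The same mechanism is exploited, for merely continuous fixed points, in the proof of theorem 4.1 below.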
4. Viscosity solutions and programmed iteration method. In this section the connection between the theory of viscosity solutions and the programmed iteration method is established through the following theorems.
Theorem 4.1. A continuous fixed point $w(\cdot)$ of the value operator $\Phi_-$ is a viscosity supersolution of (7).
Proof. Let $w(\cdot)$ be a fixed point of the operator $\Phi_-$ and let the function $w - \phi$, with $\phi(\cdot) \in C^1$, have a local minimum at a point $(t_*,x_*)$. Thus

$$w(t_*,x_*) - \phi(t_*,x_*) \le w(t,x) - \phi(t,x) \quad \forall (t,x) \in S(t_*,x_*),$$

where $S(t_*,x_*)$ is some open set containing $(t_*,x_*)$. Since $w(\cdot)$ is a fixed point, one gets

$$\max_{t\in[t_*,T]}\ \max_{v\in Q}\ \inf_{u(\cdot)\in U_{t_*}} \Bigl[ w\bigl(t, x(t,t_*,x_*,u(\cdot),v)\bigr) - w(t_*,x_*) \Bigr] = 0. \qquad (15)$$

From (15) it follows that for any $t \in [t_*,T]$ and any $v \in Q$

$$\inf_{u(\cdot)\in U_{t_*}} \Bigl[ w\bigl(t, x(t,t_*,x_*,u(\cdot),v)\bigr) - w(t_*,x_*) \Bigr] \le 0$$

and, hence, since for $t$ close enough to $t_*$ all trajectories issued from $(t_*,x_*)$ remain in $S(t_*,x_*)$, where $\phi(t,x) - \phi(t_*,x_*) \le w(t,x) - w(t_*,x_*)$, there is an open set $S(t_*)$ containing $t_*$ such that

$$\inf_{u(\cdot)\in U_{t_*}} \Bigl[ \phi\bigl(t, x(t,t_*,x_*,u(\cdot),v)\bigr) - \phi(t_*,x_*) \Bigr] \le 0 \quad \forall t \in S(t_*),\ \forall v \in Q. \qquad (16)$$

To complete the proof one needs to show that

$$\frac{\partial \phi}{\partial t}(t_*,x_*) + \max_{v\in Q}\min_{u\in P}\langle \nabla\phi(t_*,x_*), f(t_*,x_*,u,v)\rangle \le 0.$$

Suppose the opposite, i.e. that

$$\frac{\partial \phi}{\partial t}(t_*,x_*) + \max_{v\in Q}\min_{u\in P}\langle \nabla\phi(t_*,x_*), f(t_*,x_*,u,v)\rangle > 0.$$

Then for some constant $c > 0$

$$\frac{\partial \phi}{\partial t}(t_*,x_*) + \max_{v\in Q}\min_{u\in P}\langle \nabla\phi(t_*,x_*), f(t_*,x_*,u,v)\rangle > c$$

and, hence, for some $v^* \in Q$ and any $u \in P$

$$\frac{\partial \phi}{\partial t}(t_*,x_*) + \langle \nabla\phi(t_*,x_*), f(t_*,x_*,u,v^*)\rangle > c.$$

One can easily prove that the function defined as

$$F(t,x,u) = \frac{\partial \phi}{\partial t}(t,x) + \langle \nabla\phi(t,x), f(t,x,u,v^*)\rangle \qquad (17)$$

is continuous in $(t,x)$ uniformly with respect to $u$. Then for some $\varepsilon > 0$ and any $(t,x)$ from the $\varepsilon$-ball $S_\varepsilon(t_*,x_*)$

$$F(t,x,u) > c \quad \forall u \in P. \qquad (18)$$

Let $x(t) = x(t,t_*,x_*,u(\cdot),v^*)$ be the solution of (1) issued from the point $(t_*,x_*)$ and corresponding to an admissible control $u(\cdot)$ and the fixed control $v^* \in Q$. From (17) and (18) it follows (decreasing $\varepsilon$ if necessary, so that $t_*+\varepsilon \in S(t_*)$ and the trajectories do not leave $S_\varepsilon(t_*,x_*)$ on $[t_*,t_*+\varepsilon)$) that for any $t \in [t_*, t_*+\varepsilon)$

$$\frac{\partial \phi}{\partial t}\bigl(t, x(t)\bigr) + \bigl\langle \nabla\phi\bigl(t, x(t)\bigr), f\bigl(t, x(t), u(t), v^*\bigr)\bigr\rangle > c.$$

Integrating both sides from $t_*$ to $t_*+\varepsilon$ and using the Lebesgue theorem [24], one gets

$$\phi\bigl(t_*+\varepsilon, x(t_*+\varepsilon)\bigr) - \phi(t_*,x_*) \ge c\varepsilon \quad \forall u(\cdot) \in U_{t_*}$$

and, hence,

$$\inf_{u(\cdot)\in U_{t_*}} \Bigl[ \phi\bigl(t_*+\varepsilon, x(t_*+\varepsilon,t_*,x_*,u(\cdot),v^*)\bigr) - \phi(t_*,x_*) \Bigr] \ge c\varepsilon > 0.$$

The last inequality obviously contradicts (16).
In a similar way one can prove the symmetric theorem.
Theorem 4.2. A continuous fixed point $w(\cdot)$ of the value operator $\Phi_+$ is a viscosity subsolution of (8).
From theorem 4.2 and lemma 3.6 follows
Corollary 4.1. The sequence (11) with the initial approximation (13) is a sequence of viscosity subsolutions of the upper Isaacs equation (8).
In turn, from theorem 4.1 and lemma 3.6 follows
Corollary 4.2. The sequence (12) with the initial approximation (14) is a sequence of viscosity supersolutions of the lower Isaacs equation (7).
From theorems 4.1 and 4.2 follows
Corollary 4.3. Under the Isaacs condition (4), a common fixed point of the value operators is a viscosity solution of the Isaacs equation (5).
Literature
1. Isaacs R. Differential games: A Mathematical Theory with Applications to Warfare and Pursuit, Control and Optimization. New York: John Wiley and Sons, Inc. 1965. 384 p.
2. Pontryagin L. S. On the theory of differential games // Russian Math. Surveys. 1966. Vol. 21. P. 193-246.
3. Lewin J. Differential games. London: Springer, 1994. 242 p.
4. Elliot R. J., Kalton N. J. Values in differential games // Bull. Amer. Math. Soc. 1972. Vol. 78. P. 427-431.
5. Fleming W. H. The convergence problem for differential game // J. Math. Anal. Appl. 1961. Vol. 3. P. 102-116.
6. Friedman A. Differential games. New York: Wiley, 1971. 350 p.
7. Roxin E. Axiomatic approach to differential games // J. Optim. Theory Appl. 1969. Vol. 3. P. 153-163.
8. Varaiya P. P. On the existence of solutions to a differential game // SIAM J. Control Optim. 1967. Vol. 5. P. 153-162.
9. Krasovskii N. N., Subbotin A. I. Game-theoretical control problems. London: Springer, 2011. 532 p.
10. Subbotin A. I. A generalization of the basic equation of the theory of differential games // Dokl. Akad. Nauk. 1980. Vol. 22. P. 358-362.
11. Subbotin A. I. Generalized solutions of first order PDEs: The Dynamic Optimization Perspectives. Boston: Birkhauser, 1995. 312 p.
12. Crandall M. G., Lions P. L. Viscosity solutions of Hamilton-Jacobi equations // Trans. Amer. Math. Soc. 1983. Vol. 277. P. 1-42.
13. Evans L. C., Souganidis P. E. Differential games and representation formulas for solutions of Hamilton-Jacobi equations // Indiana Univ. Math. J. 1984. Vol. 33. P. 773-797.
14. Subbotin A. I. On a property of the subdifferential // Mat. Sb. 1993. Vol. 74. P. 63-78.
15. Chentsov A. G. The structure of an approach problem // Dokl. Akad. Nauk. 1975. Vol. 224. P. 1272-1275.
16. Chentsov A. G. On differential game of approach // Mat. Sb. 1976. Vol. 99. P. 394-420.
17. Chistyakov S. V. On solutions for game problems of pursuit // Prikl. Mat. Mekh. 1977. Vol. 41. P. 825-832.
18. Chistyakov S. V., Petrosyan L. A. On one approach for solutions of games of pursuit // Vestnik Leningr. University. Ser. 1: Mathematica, mechanica, astronomia. 1977. Vol. 1. P. 77-82.
19. Chentsov A. G., Subbotin A. I. An iterative procedure for constructing minimax and viscosity solutions to the Hamilton-Jacobi equations and its generalization // Proc. Steklov Inst. Math. 1999. Vol. 224. P. 286-309.
20. Coddington E., Levinson N. Theory of ordinary differential equations. New York: McGraw-Hill, 1955. 474 p.
21. Chistyakov S. V. Value operators in the theory of differential games // Izv. IMI Udm. State Univ. 2006. Vol. 37. P. 169-172.
22. Nikitin F. F., Chistyakov S. V. Existence and uniqueness theorem for a generalized Isaacs-Bellman equation // Differential equations. 2007. Vol. 43. P. 757-766.
23. Chistyakov S. V. Programmed iterations and universal e-optimal strategies in positional differential game // Dokl. Akad. Nauk. 1991. Vol. 319. P. 1333-1335.
24. Kolmogorov A. N., Fomin S. V. Elements of the theory of functions and functional analysis. New York: Courier Dover Publications, 1999. 288 p.
The article was recommended for publication by Prof. S. V. Chistyakov.
The article was received by the editorial board on December 19, 2013.