CC BY

Contributions to Game Theory and Management, VIII, 336—346

Strategically Supported Cooperation in Differential Games with Coalition Structures*

Lei Wang¹, Li Song², Leon Petrosyan³, Artem Sedakov³ and Hongwei Gao²*

¹ Department of Mathematics, Teachers College of Qingdao University, 16 Qingda First Road, Qingdao, 266071, China
² College of Mathematics, Qingdao University, 308 Ningxia Road, Qingdao, 266071, China
³ Faculty of Applied Mathematics and Control Processes, Saint Petersburg State University, Universitetsky prospekt 35, Saint Petersburg, 198504, Russia
* Corresponding author, [email protected]

Abstract. The problem of strategic stability of long-range cooperative agreements in differential games with coalition structures is investigated. We build a general theoretical framework for the cooperative differential game with a coalition structure based on the imputation distribution procedure, which is the basic ingredient of our theory. This notion may be interpreted as an instantaneous payoff of an individual at some moment, which prescribes the distribution of the total gain among the members of a group and yields the existence of a Nash equilibrium. Moreover, a few assumptions are made about the deviation instant of a coalition, concerning the behavior of a group of many individuals in certain dynamic environments; under these assumptions, the time-consistent cooperative agreement can be strategically supported by an ε-Nash equilibrium or a strong ε-Nash equilibrium.

Keywords: differential game, coalition structure, strategic stability, imputation distribution procedure, deviation instant, ε-Nash equilibrium, strong ε-Nash equilibrium.

1. Introduction

Human behavior is dynamic, and cooperation runs through human behavior. Players often agree to cooperate over a certain period, and just as often some cooperative agreements are abandoned before reaching maturity. It is therefore important that cooperation remains stable over a time interval. When analyzing the stability of long-range cooperative agreements, three important aspects must be taken into account: time consistency, strategic stability and the irrational-behavior-proof condition.

Time consistency is the property that, as the cooperation develops, the partners are guided by the same optimality principle at each instant of time and hence have no incentive to deviate from the previously adopted cooperative behavior.

* This research was supported by National Natural Science Foundation of China (71003057, 71171120, 71373262), Projects of International (Regional) Cooperation and Exchanges of NSFC (71311120090, 71311120091, 71411130215), Specialized Research Fund for the Doctoral Program of Higher Education (20133706110002), Graduate Student Education Innovation Plan of Qingdao University (QDY12017, QDY13004), and Saint Petersburg State University (9.38.245.2014).

The concept of time consistency and its implementation was initially proposed in (Petrosyan, 1977), (Petrosyan and Danilov, 1979), (Petrosyan and Danilov, 1982), (Petrosyan and Danilov, 1986) and was developed in (Petrosyan, 1993), (Petrosyan and Zenkevich, 1996), and (Petrosyan, 1997). Some new results about time consistency can be found in (Petrosyan and Zaccour, 2003), (Yeung and Petrosyan, 2005), and (Gao et al., 2014).

Strategic stability means that the outcome of the cooperative agreement must be attained in some Nash equilibrium, which will guarantee the strategic support of the cooperation. The agreement will be developed in such a manner that at least individual deviations from the cooperation will not give any advantage to the deviator. Some results about strategic stability can be found in (Petrosyan and Grauer, 2002), (Gao and Petrosyan, 2009), and (Petrosyan and Zenkevich, 2009).

The irrational-behavior-proof condition means that the partners involved in the cooperation must be sure that even in the worst scenario they will not lose compared with non-cooperative behavior. Since one cannot be sure that the partners will behave rationally over a long time interval, this aspect must also be taken into account. The concept of the irrational-behavior-proof condition was initially proposed in (Yeung, 2006). A further investigation can be found in (Gao et al., 2013).

Some results about dynamic games with coalition structures are given in (Petrosjan and Mamkina, 2006) and (Kozlovskaya et al., 2010). In this paper we focus on the problem of strategic stability in cooperative differential games with coalition structures. We build a general theoretical framework for the cooperative differential game with a coalition structure based on the imputation distribution procedure (IDP), which is the basic ingredient of our theory. This notion may be interpreted as an instantaneous payoff of an individual at some moment, which prescribes the distribution of the total gain among the members of a group. This notion yields the existence of a Nash equilibrium. Moreover, to construct an ε-Nash equilibrium or a strong ε-Nash equilibrium in such a game, a few assumptions are made about the deviation instant of a coalition, concerning the behavior of a group of many individuals in certain dynamic environments. It turns out that an ε-Nash equilibrium or a strong ε-Nash equilibrium exists in such a differential game with a coalition structure, which guarantees the strategic support of cooperation.

The paper is organized as follows. In Section 2 we define the basic concepts and set up standard terminology and notation for a cooperative differential game with a coalition structure. In Section 3 we prove the existence of an ε-Nash equilibrium in a regularized differential game with a coalition structure and the existence of a strong ε-Nash equilibrium in a strictly regularized differential game with a coalition structure.

2. Formal Definitions and Terminology

In this section we define the basic concepts of a cooperative differential game with a coalition structure and set up standard terminology and notation.

Differential Game $\Gamma(x_0, T - t_0)$. Let $N = \{1, 2, \ldots, n\}$ be the set of players. We consider an $n$-person differential game $\Gamma(x_0, T - t_0)$ with independent motions on the time interval $[t_0, T]$ (see (Dockner et al., 2000)). The motion equations have the form:

$$\dot{x}_i = f_i(x_i, u_i), \quad u_i \in U_i \subset R^l, \quad x_i \in R^m, \quad i = 1, \ldots, n. \qquad (1)$$

It is assumed that the system of differential equations (1) satisfies all conditions necessary for the existence, continuability and uniqueness of the solution for any $n$-tuple of measurable controls $u_1(t), \ldots, u_n(t)$. The payoff of player $i$ is given by:

$$H_i(x_0, T - t_0; u_1(\cdot), \ldots, u_n(\cdot)) = \int_{t_0}^{T} h_i(x(\tau))\,d\tau, \qquad (2)$$

where $h_i(x)$ is a continuous function, $x(\tau) = (x_1(\tau), \ldots, x_n(\tau))$ is the solution (trajectory) of (1) when the open-loop controls $u_1(\tau), \ldots, u_n(\tau)$ are used, and $x(t_0) = (x_1(t_0), \ldots, x_n(t_0)) = x_0$.
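As a numerical illustration of (1) and (2), the following minimal sketch integrates independent motions with the explicit Euler method and accumulates the payoff integral with a left Riemann sum. The dynamics $f(x_i, u_i) = u_i$, the running payoff $h_i(x) = x_i$, the constant controls and the step count are assumptions chosen purely for illustration; none of them come from the paper.

```python
def simulate_payoffs(x0, controls, f, h, t0, T, steps=1000):
    """Euler-integrate x_i' = f(x_i, u_i(t)) and return (payoffs, final state)."""
    n = len(x0)
    dt = (T - t0) / steps
    x = list(x0)
    payoffs = [0.0] * n
    t = t0
    for _ in range(steps):
        # Left Riemann sum for the payoff integral in equation (2).
        for i in range(n):
            payoffs[i] += h(i, x) * dt
        # Independent motions: x_i is driven only by x_i and u_i, as in (1).
        x = [x[i] + f(x[i], controls[i](t)) * dt for i in range(n)]
        t += dt
    return payoffs, x


# Hypothetical two-player example on [0, 1] (all data are assumptions).
payoffs, x_final = simulate_payoffs(
    x0=[0.0, 1.0],
    controls=[lambda t: 1.0, lambda t: 0.0],  # open-loop controls u_1, u_2
    f=lambda xi, ui: ui,                      # assumed dynamics
    h=lambda i, x: x[i],                      # assumed running payoff
    t0=0.0,
    T=1.0,
)
```

With these assumed data, player 1's state grows linearly and the approximate payoff is close to the exact integral 1/2, while player 2's state stays at 1 and the payoff is 1.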

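Optimal Cooperative Trajectory $\bar{x}(t)$. Suppose that there exist an $n$-tuple of open-loop controls $\bar{u}(t) = (\bar{u}_1(t), \ldots, \bar{u}_n(t))$ and a trajectory $\bar{x}(t)$, $t \in [t_0, T]$, such that

$$\max_{u_1(t), \ldots, u_n(t)} \sum_{i=1}^{n} H_i(x_0, T - t_0; u_1(t), \ldots, u_n(t)) = \sum_{i=1}^{n} H_i(x_0, T - t_0; \bar{u}_1(t), \ldots, \bar{u}_n(t)) = \sum_{i=1}^{n} \int_{t_0}^{T} h_i(\bar{x}(\tau))\,d\tau.$$

We call the trajectory $\bar{x}(t) = (\bar{x}_1(t), \ldots, \bar{x}_n(t))$ that attains this maximum of the total payoff (2) the optimal cooperative trajectory.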

Characteristic Function. The characteristic function in $\Gamma(x_0, T - t_0)$ is defined in the classical way:

$$V(x_0, T - t_0; N) = \sum_{i=1}^{n} \int_{t_0}^{T} h_i(\bar{x}(\tau))\,d\tau,$$

$$V(x_0, T - t_0; \emptyset) = 0,$$

$$V(x_0, T - t_0; S) = \operatorname{Val}\,\Gamma_{S, N \setminus S}(x_0, T - t_0),$$

where $\operatorname{Val}\,\Gamma_{S, N \setminus S}(x_0, T - t_0)$ is the value of the zero-sum game between coalition $S$ acting as player 1 and coalition $N \setminus S$ acting as player 2, with the payoff of $S$ equal to $\sum_{i \in S} H_i(x_0, T - t_0; u_1(t), \ldots, u_n(t))$.

Imputation Set $L(x_0, T - t_0)$. Define $L(x_0, T - t_0)$ as the imputation set of the game $\Gamma(x_0, T - t_0)$ (see (von Neumann and Morgenstern, 1944)):

$$L(x_0, T - t_0) = \Bigl\{\alpha = (\alpha_1, \ldots, \alpha_n) : \alpha_i \ge V(x_0, T - t_0; \{i\}),\ \sum_{i \in N} \alpha_i = V(x_0, T - t_0; N)\Bigr\}.$$

Core $C(x_0, T - t_0)$. Define $C(x_0, T - t_0)$ as the core of $\Gamma(x_0, T - t_0)$:

$$C(x_0, T - t_0) = \Bigl\{\alpha = (\alpha_1, \ldots, \alpha_n) \in L(x_0, T - t_0) : \sum_{i \in S} \alpha_i \ge V(x_0, T - t_0; S),\ S \subset N\Bigr\}.$$
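The imputation and core conditions can be checked mechanically once the characteristic function is tabulated. The sketch below assumes a finite table `V` keyed by `frozenset` coalitions and a payoff vector `alpha` keyed by player labels; both containers, the function names and the three-player numbers are hypothetical and serve only to illustrate the two definitions.

```python
from itertools import combinations


def is_imputation(alpha, V, tol=1e-9):
    """Individual rationality and efficiency, as in the definition of L."""
    players = sorted(alpha)
    individually_rational = all(
        alpha[i] >= V[frozenset([i])] - tol for i in players
    )
    efficient = abs(sum(alpha.values()) - V[frozenset(players)]) <= tol
    return individually_rational and efficient


def in_core(alpha, V, tol=1e-9):
    """An imputation is in the core if no coalition S gets less than V(S)."""
    if not is_imputation(alpha, V, tol):
        return False
    players = sorted(alpha)
    for size in range(1, len(players) + 1):
        for S in combinations(players, size):
            if sum(alpha[i] for i in S) < V[frozenset(S)] - tol:
                return False
    return True


# Hypothetical three-player characteristic function (illustrative numbers).
V = {
    frozenset(): 0.0,
    frozenset({1}): 0.0, frozenset({2}): 0.0, frozenset({3}): 0.0,
    frozenset({1, 2}): 4.0, frozenset({1, 3}): 4.0, frozenset({2, 3}): 4.0,
    frozenset({1, 2, 3}): 6.0,
}
equal_split = {1: 2.0, 2: 2.0, 3: 2.0}   # satisfies every coalition constraint
skewed_split = {1: 4.0, 2: 1.0, 3: 1.0}  # coalition {2, 3} receives 2 < 4
```

In this assumed example the equal split belongs to the core, while the skewed split is an imputation that violates the constraint for coalition {2, 3}.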

Imputation Distribution Procedure $\beta(t)$. Let $\alpha \in L(x_0, T - t_0)$. Define the imputation distribution procedure (IDP) (see (Petrosyan, 1993)) as a function $\beta(t) = (\beta_1(t), \ldots, \beta_n(t))$, $t \in [t_0, T]$, such that

$$\alpha_i = \int_{t_0}^{T} \beta_i(\tau)\,d\tau. \qquad (3)$$

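Regularized Game $\Gamma_\alpha(x_0, T - t_0)$. For every $\alpha \in L(x_0, T - t_0)$, we define a non-cooperative game $\Gamma_\alpha(x_0, T - t_0)$ which differs from the game $\Gamma(x_0, T - t_0)$ only by the payoffs defined along the optimal cooperative trajectory $\bar{x}(t)$, $t \in [t_0, T]$.

Denote the payoff function in $\Gamma_\alpha(x_0, T - t_0)$ by $H_i^\alpha(x_0, T - t_0; u_1(t), \ldots, u_n(t))$ and the corresponding trajectory by $x(t)$. Then $H_i^\alpha(x_0, T - t_0; u_1(t), \ldots, u_n(t)) = H_i(x_0, T - t_0; u_1(t), \ldots, u_n(t))$ if there does not exist $t \in (t_0, T]$ such that $x(\tau) = \bar{x}(\tau)$ for $\tau \in (t_0, t]$. Otherwise, let $t = \sup\{t' : x(\tau) = \bar{x}(\tau),\ \tau \in (t_0, t']\}$. Then

$$H_i^\alpha(x_0, T - t_0; u_1(t), \ldots, u_n(t)) = \int_{t_0}^{t} \beta_i(\tau)\,d\tau + H_i(\bar{x}(t), T - t; u_1(\tau), \ldots, u_n(\tau)) = \int_{t_0}^{t} \beta_i(\tau)\,d\tau + \int_{t}^{T} h_i(x(\tau))\,d\tau.$$

In the special case when $x(t) = \bar{x}(t)$, $t \in (t_0, T]$, we have

$$H_i^\alpha(x_0, T - t_0; u_1(t), \ldots, u_n(t)) = \int_{t_0}^{T} \beta_i(\tau)\,d\tau = \alpha_i.$$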

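Consider the subgames $\Gamma(\bar{x}(t), T - t)$, the imputation sets $L(\bar{x}(t), T - t)$ and the cores $C(\bar{x}(t), T - t)$. Let $\alpha(t) \in L(\bar{x}(t), T - t)$. Suppose that $\alpha(t)$ can be selected as a differentiable function of $t$, $t \in [t_0, T]$. The game $\Gamma_\alpha(x_0, T - t_0)$ is called a regularized game of $\Gamma(x_0, T - t_0)$ ($\alpha$-regularization) if the IDP $\beta$ is defined in such a way that

$$\alpha_i(t) = \int_{t}^{T} \beta_i(\tau)\,d\tau,$$

or

$$\beta_i(t) = -\dot{\alpha}_i(t). \qquad (4)$$

In particular, if $\alpha(t) \in C(\bar{x}(t), T - t)$, then $\Gamma_\alpha(x_0, T - t_0)$ is called a strictly regularized game of $\Gamma(x_0, T - t_0)$.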

Time Consistency. From (4) we get

$$\alpha_i = \int_{t_0}^{t} \beta_i(\tau)\,d\tau + \alpha_i(t). \qquad (5)$$

Now suppose that $M(x_0, T - t_0) \subset L(x_0, T - t_0)$ is some optimality principle in the cooperative version of the game $\Gamma(x_0, T - t_0)$, and $M(\bar{x}(t), T - t) \subset L(\bar{x}(t), T - t)$ is the same optimality principle defined in the subgame $\Gamma(\bar{x}(t), T - t)$ with an initial condition on the optimal trajectory. $M$ can be the core, the stable set, the Shapley value, the nucleolus, etc. If $\alpha \in M(x_0, T - t_0)$ and $\alpha(t) \in M(\bar{x}(t), T - t)$, condition (5) gives us time consistency of the chosen imputation $\alpha$ or of the chosen optimality principle in the game $\Gamma_\alpha(x_0, T - t_0)$.
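Relations (3), (4) and (5) can be verified numerically for any differentiable imputation path. The sketch below assumes a hypothetical path $\alpha_i(t) = c_i (T - t)^2$, which is not taken from the paper, approximates $\beta_i(t) = -\dot{\alpha}_i(t)$ by a central difference, and measures how far identity (5) is from holding at an intermediate instant.

```python
def idp_from_imputation(alpha_i, t, dt=1e-6):
    """beta_i(t) = -d/dt alpha_i(t), approximated by a central difference (4)."""
    return -(alpha_i(t + dt) - alpha_i(t - dt)) / (2.0 * dt)


def integrate(g, a, b, steps=2000):
    """Composite trapezoidal rule for the integral of g over [a, b]."""
    h = (b - a) / steps
    total = 0.5 * (g(a) + g(b))
    for k in range(1, steps):
        total += g(a + k * h)
    return total * h


# Hypothetical data: alpha_i(t) = c_i * (T - t)^2, so beta_i(t) = 2 c_i (T - t).
t0, T, c_i = 0.0, 2.0, 1.5
alpha_i = lambda t: c_i * (T - t) ** 2
beta_i = lambda t: idp_from_imputation(alpha_i, t)


def time_consistency_gap(t):
    """Left side of (5) minus its right side; close to 0 when (5) holds."""
    return alpha_i(t0) - (integrate(beta_i, t0, t) + alpha_i(t))
```

For this assumed path the gap is zero up to discretization error at every $t \in [t_0, T]$, which is exactly the statement that the amount already paid out plus the imputation of the remaining subgame equals the initial imputation.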

Differential Game with Coalition Structure $\Gamma^P(x_0, T - t_0)$. Let $P = (S_1, \ldots, S_m)$ be a partition of the player set $N$ such that $S_i \cap S_j = \emptyset$, $i \ne j$, $\bigcup_{i=1}^{m} S_i = N$, $|S_i| = n_i$, $\sum_{i=1}^{m} n_i = n$. Suppose that each player $i$ from $N$ plays in the interests of the coalition $S_k \in P$ to which he belongs, trying to maximize the sum of payoffs of its members, i.e.

$$\max_{u_i,\, i \in S_k} \sum_{i \in S_k} H_i(x_0, T - t_0; u_1(t), \ldots, u_n(t)). \qquad (6)$$

Define $u_{S_k} = \{u_i, i \in S_k\}$ as the strategy of coalition $S_k$ and $x_{S_k} = \{x_i, i \in S_k\}$ as the trajectory of coalition $S_k$. Write

$$H_{S_k}(x_0, T - t_0; u_{S_1}(t), \ldots, u_{S_m}(t)) = \sum_{i \in S_k} H_i(x_0, T - t_0; u_1(t), \ldots, u_n(t))$$

as the payoff of coalition $S_k$. Suppose that the coalitions in $P$ play cooperatively with objective (2) and state dynamics (1). We call this game a cooperative differential game with a coalition structure and denote it by $\Gamma^P(x_0, T - t_0)$. Suppose that there exist an $n$-tuple of open-loop controls $\bar{u}(t) = (\bar{u}_1(t), \ldots, \bar{u}_n(t))$ and a trajectory $\bar{x}(t) = (\bar{x}_1(t), \ldots, \bar{x}_n(t))$, $t \in [t_0, T]$, satisfying (2). Then the trajectory $\bar{x}(t)$ is an optimal cooperative trajectory of $\Gamma(x_0, T - t_0)$. We also take $\bar{x}(t)$ as the optimal cooperative trajectory of $\Gamma^P(x_0, T - t_0)$.
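The passage from individual payoffs to coalition payoffs is a plain aggregation over the blocks of the partition. A minimal sketch follows; the player labels, the partition and the payoff values are hypothetical.

```python
def is_partition(P, N):
    """Check that the coalitions in P are pairwise disjoint and cover N."""
    seen = set()
    for S in P:
        members = set(S)
        if seen & members:
            return False
        seen |= members
    return seen == set(N)


def coalition_payoffs(P, H):
    """H_{S_k} = sum of H_i over i in S_k, for every coalition S_k in P."""
    return [sum(H[i] for i in S) for S in P]


# Hypothetical data: four players split into three coalitions.
N = [1, 2, 3, 4]
P = [(1, 2), (3,), (4,)]
H = {1: 2.0, 2: 1.5, 3: 0.5, 4: 3.0}
payoffs_by_coalition = coalition_payoffs(P, H)
```

Here the first coalition's payoff is the sum of the payoffs of players 1 and 2, while the singleton coalitions keep their members' individual payoffs.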

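The characteristic function in $\Gamma^P(x_0, T - t_0)$ is defined by:

$$V(x_0, T - t_0; P) = \sum_{i=1}^{n} \int_{t_0}^{T} h_i(\bar{x}(\tau))\,d\tau,$$

$$V(x_0, T - t_0; \emptyset) = 0,$$

$$V(x_0, T - t_0; S) = \operatorname{Val}\,\Gamma_{S, P \setminus S}(x_0, T - t_0),$$

where $\operatorname{Val}\,\Gamma_{S, P \setminus S}(x_0, T - t_0)$ is the value of the zero-sum game played between coalition $S$ acting as player 1 and coalition $P \setminus S$ acting as player 2, in which the payoff of coalition $S$ equals $\sum_{S_k \in S} H_{S_k}(x_0, T - t_0; u_{S_1}(t), \ldots, u_{S_m}(t))$. Define $L^P(x_0, T - t_0)$ as the imputation set in $\Gamma^P(x_0, T - t_0)$:

$$L^P(x_0, T - t_0) = \Bigl\{\alpha = (\alpha_{S_1}, \ldots, \alpha_{S_m}) : \alpha_{S_k} \ge V(x_0, T - t_0; \{S_k\}),\ \sum_{S_k \in P} \alpha_{S_k} = V(x_0, T - t_0; P)\Bigr\}.$$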

Define $C^P(x_0, T - t_0)$ as the core in $\Gamma^P(x_0, T - t_0)$:

$$C^P(x_0, T - t_0) = \Bigl\{\alpha = (\alpha_{S_1}, \ldots, \alpha_{S_m}) \in L^P(x_0, T - t_0) : \sum_{S_k \in S} \alpha_{S_k} \ge V(x_0, T - t_0; S),\ S \subset P\Bigr\}.$$

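Let $\alpha \in L^P(x_0, T - t_0)$. Define the imputation distribution procedure (IDP) of $\Gamma^P(x_0, T - t_0)$ as a function $\beta(t) = (\beta_{S_1}(t), \ldots, \beta_{S_m}(t))$, $t \in [t_0, T]$, such that

$$\alpha_{S_k} = \int_{t_0}^{T} \beta_{S_k}(\tau)\,d\tau, \quad S_k \in P. \qquad (7)$$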

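Regularized Game with Coalition Structure $\Gamma^P_\alpha(x_0, T - t_0)$. For every $\alpha \in L^P(x_0, T - t_0)$, we define a non-cooperative game $\Gamma^P_\alpha(x_0, T - t_0)$ which differs from the game $\Gamma^P(x_0, T - t_0)$ only by the payoffs defined along the optimal cooperative trajectory $\bar{x}(t)$, $t \in [t_0, T]$. Denote the payoff function in the game $\Gamma^P_\alpha(x_0, T - t_0)$ by $H^\alpha_{S_k}(x_0, T - t_0; u_{S_1}(t), \ldots, u_{S_m}(t))$ and the corresponding trajectory by $x(t)$. Then $H^\alpha_{S_k}(x_0, T - t_0; u_{S_1}(t), \ldots, u_{S_m}(t)) = H_{S_k}(x_0, T - t_0; u_{S_1}(t), \ldots, u_{S_m}(t))$ if there does not exist $t \in (t_0, T]$ such that $x(\tau) = \bar{x}(\tau)$ for $\tau \in (t_0, t]$. Otherwise, let $t = \sup\{t' : x(\tau) = \bar{x}(\tau),\ \tau \in (t_0, t']\}$. Then

$$H^\alpha_{S_k}(x_0, T - t_0; u_{S_1}(t), \ldots, u_{S_m}(t)) = \int_{t_0}^{t} \beta_{S_k}(\tau)\,d\tau + H_{S_k}(\bar{x}(t), T - t; u_{S_1}(\tau), \ldots, u_{S_m}(\tau)) = \int_{t_0}^{t} \beta_{S_k}(\tau)\,d\tau + \int_{t}^{T} h_{S_k}(x(\tau))\,d\tau,$$

where $h_{S_k}(x(\tau)) = \sum_{i \in S_k} h_i(x(\tau))$. In the special case when $x(t) = \bar{x}(t)$, $t \in (t_0, T]$, we have

$$H^\alpha_{S_k}(x_0, T - t_0; u_{S_1}(t), \ldots, u_{S_m}(t)) = \int_{t_0}^{T} \beta_{S_k}(\tau)\,d\tau = \alpha_{S_k}.$$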

Consider the subgames $\Gamma^P(\bar{x}(t), T - t)$, the imputation sets $L^P(\bar{x}(t), T - t)$ and the cores $C^P(\bar{x}(t), T - t)$. Let $\alpha(t) \in L^P(\bar{x}(t), T - t)$. Suppose that $\alpha(t)$ can be selected as a differentiable function of $t$, $t \in [t_0, T]$. The game $\Gamma^P_\alpha(x_0, T - t_0)$ is called a regularized game of $\Gamma^P(x_0, T - t_0)$ ($\alpha$-regularization) if the IDP $\beta$ is defined in such a way that

$$\alpha_{S_k}(t) = \int_{t}^{T} \beta_{S_k}(\tau)\,d\tau$$

or

$$\beta_{S_k}(t) = -\dot{\alpha}_{S_k}(t). \qquad (8)$$

In particular, if $\alpha(t) \in C^P(\bar{x}(t), T - t)$, then $\Gamma^P_\alpha(x_0, T - t_0)$ is called a strictly regularized game of $\Gamma^P(x_0, T - t_0)$. From (8) we get

$$\alpha_{S_k} = \int_{t_0}^{t} \beta_{S_k}(\tau)\,d\tau + \alpha_{S_k}(t), \quad S_k \in P. \qquad (9)$$

Now suppose that $M^P(x_0, T - t_0) \subset L^P(x_0, T - t_0)$ is some optimality principle in the cooperative version of the game $\Gamma^P(x_0, T - t_0)$, and $M^P(\bar{x}(t), T - t) \subset L^P(\bar{x}(t), T - t)$ is the same optimality principle defined in the subgame $\Gamma^P(\bar{x}(t), T - t)$ with an initial condition on the optimal trajectory. If $\alpha \in M^P(x_0, T - t_0)$ and $\alpha(t) \in M^P(\bar{x}(t), T - t)$, condition (9) gives us time consistency of the chosen imputation $\alpha$ or of the chosen optimality principle in the game $\Gamma^P_\alpha(x_0, T - t_0)$.

ε-Nash Equilibrium and Strong ε-Nash Equilibrium of $\Gamma^P_\alpha(x_0, T - t_0)$. In a differential game with a coalition structure, different members of a coalition may deviate from their strategies at different moments of time. Moreover, a change of strategy may leave the realized trajectory unchanged, in which case it cannot be regarded as an actual deviation. To define the ε-Nash equilibrium and the strong ε-Nash equilibrium of $\Gamma^P_\alpha(x_0, T - t_0)$, we first define the deviation instant for a coalition.

In $\Gamma(x_0, T - t_0)$, we say that for player $i \in N$ strategy $\tilde{u}_i(\cdot)$ is essentially different from strategy $u_i(\cdot)$ under the $n$-tuple $u(\cdot)$ if the trajectory $\tilde{x}_i(\cdot)$ under the $n$-tuple $u(\cdot) \| \tilde{u}_i(\cdot)$ is different from the trajectory $x_i(\cdot)$ under $u(\cdot)$, i.e. there is $t \in (t_0, T]$ such that $\tilde{x}_i(t) \ne x_i(t)$. If strategies $\tilde{u}_i(\cdot)$ and $u_i(\cdot)$ are essentially different, we define $\bar{t}_i(u(\cdot) \| \tilde{u}_i(\cdot)) = \sup\{t : \tilde{x}_i(t) = x_i(t),\ t \in (t_0, T]\}$ as the deviation instant between strategies $\tilde{u}_i(\cdot)$ and $u_i(\cdot)$.

We say that coalition $S_k \in P$ has the same deviation instant under the $n$-tuple $u(\cdot)$ if $\bar{t}_i(u(\cdot) \| \tilde{u}_i(\cdot))$ is the same for every $i \in S_k$. We shall write $\bar{t}(u(\cdot) \| \tilde{u}_{S_k}(\cdot))$ to denote $\bar{t}_i(u(\cdot) \| \tilde{u}_i(\cdot))$ if $S_k$ has the same deviation instant. We say that $S \subset P$ has the same deviation instant if $\bar{t}(u(\cdot) \| \tilde{u}_{S_k}(\cdot))$ is the same for every $S_k \in S$.

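Suppose that every $S_k \in P$ has the same deviation instant. An $m$-tuple $u^*(\cdot) = (u^*_{S_1}(\cdot), \ldots, u^*_{S_m}(\cdot))$ is an ε-Nash equilibrium of $\Gamma^P_\alpha(x_0, T - t_0)$ if and only if

$$H^\alpha_{S_k}(x_0, T - t_0; u^*(\cdot)) \ge H^\alpha_{S_k}(x_0, T - t_0; u^*(\cdot) \| u_{S_k}(\cdot)) - \varepsilon, \qquad (10)$$

for all $S_k \in P$ and all $u_{S_k}$.

Suppose that every $S \subset P$ has the same deviation instant. An $m$-tuple $u^*(\cdot) = (u^*_{S_1}(\cdot), \ldots, u^*_{S_m}(\cdot))$ is a strong ε-Nash equilibrium of $\Gamma^P_\alpha(x_0, T - t_0)$ if and only if

$$\sum_{S_k \in S} H^\alpha_{S_k}(x_0, T - t_0; u^*(\cdot)) \ge \sum_{S_k \in S} H^\alpha_{S_k}(x_0, T - t_0; u^*(\cdot) \| u_S(\cdot)) - \varepsilon, \qquad (11)$$

for all $S \subset P$ and all $u_S = \{u_{S_k}, S_k \in S\}$.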
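Inequalities (10) and (11) are easiest to see on a finite toy game in which each coalition is treated as a single player with finitely many strategies. The sketch below is only such a toy: it is not a differential game, it ignores deviation instants, it reads "all $S \subset P$" as all nonempty subsets including $P$ itself, and the payoff table (a prisoner's-dilemma pattern between two coalitions) is hypothetical.

```python
from itertools import combinations, product


def is_eps_nash(payoff, strategies, profile, eps):
    """Condition (10): no single coalition gains more than eps by deviating."""
    base = payoff(profile)
    for k in range(len(strategies)):
        for s in strategies[k]:
            deviated = profile[:k] + (s,) + profile[k + 1:]
            if base[k] < payoff(deviated)[k] - eps:
                return False
    return True


def is_strong_eps_nash(payoff, strategies, profile, eps):
    """Condition (11): no group S of coalitions raises its summed payoff by more than eps."""
    m = len(strategies)
    base = payoff(profile)
    for size in range(1, m + 1):
        for S in combinations(range(m), size):
            for joint in product(*(strategies[k] for k in S)):
                deviated = list(profile)
                for k, s in zip(S, joint):
                    deviated[k] = s
                dev = payoff(tuple(deviated))
                if sum(base[k] for k in S) < sum(dev[k] for k in S) - eps:
                    return False
    return True


# Hypothetical payoff table for two coalitions with strategies "C" and "D".
table = {
    ("C", "C"): (3.0, 3.0), ("C", "D"): (0.0, 4.0),
    ("D", "C"): (4.0, 0.0), ("D", "D"): (1.0, 1.0),
}
payoff = lambda profile: table[profile]
strategies = [("C", "D"), ("C", "D")]
```

In this assumed table the profile ("D", "D") satisfies (10) with ε = 0 but fails (11), because the two coalitions can jointly move to ("C", "C"); the cooperative profile ("C", "C") satisfies (10) only for ε of at least 1.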

3. Existence of ε-Nash Equilibrium and Strong ε-Nash Equilibrium in Differential Games with Coalition Structures

Theorem 1. Suppose that every $S_k \in P$ has the same deviation instant. For every $\varepsilon > 0$, the regularized game $\Gamma^P_\alpha(x_0, T - t_0)$ has an ε-Nash equilibrium with payoff $\alpha$.

Proof. The proof is based on the construction of an ε-Nash equilibrium in piecewise open-loop (POL) strategies with memory. Recall the definition of POL strategies with memory in a differential game. Denote any admissible trajectory of system (1) on the time interval $[t_0, t]$, $t \in [t_0, T]$, by $x(t)$. The strategy $u_{S_i}(\cdot)$ of player $S_i$ is called POL if it consists of a pair $(a, \sigma)$, where $\sigma$ is a partition of the time interval $[t_0, T]$, $t_0 < t_1 < \ldots < t_l = T$, $t_{k+1} - t_k = \delta > 0$, $k = 0, 1, 2, \ldots, l - 1$, and $a$ is a map that assigns an open-loop control $u_{S_i}(t)$, $t \in [t_k, t_{k+1})$, to each point $(x(t_k), t_k)$, $t_k \in \sigma$.

Consider the POL strategies $\bar{u}(\cdot) = (a, \sigma)$, where $a$ maps each point $(\bar{x}(t_k), t_k)$ on the optimal trajectory to the open-loop control $\bar{u}_{S_i}(t)$, $t \in [t_k, t_{k+1})$, satisfying (2), and $a$ is arbitrary at other points.

Consider a family of zero-sum games $\Gamma_{\{S_i\}, P \setminus \{S_i\}}(x, T - t)$ from the initial position $x$ and with duration $T - t$ between the coalition $S$ consisting of a single player $S_i$ and the coalition $P \setminus \{S_i\}$. The payoff of player $S_i$ is equal to $H_{S_i}(x, T - t; u_{S_1}(t), \ldots, u_{S_m}(t))$ and the payoff of player $P \setminus \{S_i\}$ is equal to $-H_{S_i}$. Let $\hat{u}(x, t; \cdot)$ be an $\varepsilon/2$-optimal POL strategy of player $P \setminus \{S_i\}$ in $\Gamma_{\{S_i\}, P \setminus \{S_i\}}(x, T - t)$. Note that $\hat{u}(x, t; \cdot) = \{\hat{u}_{S_j}, S_j \in P \setminus \{S_i\}\}$.

Let $x(t) = \{x_{S_1}(t), \ldots, x_{S_m}(t)\}$ be the segment of an admissible trajectory satisfying (1) on the time interval $[t_0, t]$, $t \in [t_0, T]$. For each $S_i \in P$ define $\bar{t}(S_i) = \sup\{t : x_{S_i}(t) = \bar{x}_{S_i}(t),\ t \in (t_0, T]\}$, and let $S_j$ be a coalition at which the minimum is attained, $\bar{t}(S_j) = \min_{S_i \in P} \bar{t}(S_i)$. The instant $\bar{t}(S_j)$ lies in one of the intervals $[t_k, t_{k+1})$, $k = 0, 1, 2, \ldots, l - 1$, and $\bar{t}(S_j) - t_0$ is the length of the time interval starting from $t_0$ on which $x(t)$ coincides with the cooperative trajectory $\bar{x}(t)$.

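Define the following strategies of player $S_i \in P$:

$$u^*_{S_i}(\cdot) = \begin{cases} \bar{u}_{S_i}(t), & \text{for } (x(t_k), t_k) \text{ on the optimal cooperative trajectory;} \\ \hat{u}_{S_i}(x(t_{k+1}), t_{k+1}; \cdot), & S_i\text{-th component of the } \varepsilon/2\text{-optimal POL strategy of player } P \setminus \{S_j\} \\ & \text{in the game } \Gamma_{\{S_j\}, P \setminus \{S_j\}}(x(t_{k+1}), T - t_{k+1}), \text{ if } t_k \le \bar{t}(S_j) < t_{k+1}; \\ \text{arbitrary}, & \text{for all other positions.} \end{cases}$$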
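The case distinction above has the shape of a trigger strategy with memory on a time grid: follow the cooperative control while the observed position is on the cooperative trajectory, and switch permanently to a punishing control once a deviation has been observed at a grid node. The sketch below only mirrors that shape. The class name, the tolerance and the `punish` callback are hypothetical, and the callback is a placeholder for the $\varepsilon/2$-optimal strategy of the zero-sum game, which is not computed here.

```python
class PolTriggerStrategy:
    """Piecewise open-loop strategy with memory for one coalition (illustrative)."""

    def __init__(self, coop_states, coop_controls, punish, tol=1e-9):
        self.coop_states = coop_states      # cooperative states at the grid nodes t_k
        self.coop_controls = coop_controls  # cooperative control on [t_k, t_{k+1})
        self.punish = punish                # placeholder for the eps/2-optimal strategy
        self.tol = tol
        self.deviation = None               # memory: (node, state) of first observed deviation

    def control(self, k, x_k):
        """Return the control applied on [t_k, t_{k+1}) after observing x(t_k)."""
        on_path = all(
            abs(a - b) <= self.tol for a, b in zip(x_k, self.coop_states[k])
        )
        if self.deviation is None and not on_path:
            self.deviation = (k, tuple(x_k))
        if self.deviation is None:
            return self.coop_controls[k]
        start_node, start_state = self.deviation
        # Punishment is tied to the position where the deviation was first seen.
        return self.punish(start_node, start_state, k, x_k)


# Hypothetical one-dimensional example with a constant punishing control.
strategy = PolTriggerStrategy(
    coop_states=[(0.0,), (1.0,), (2.0,), (3.0,)],
    coop_controls=[1.0, 1.0, 1.0],
    punish=lambda start_node, start_state, k, x_k: -1.0,
)
observed = [(0.0,), (1.0,), (2.5,), (3.0,)]
applied = [strategy.control(k, x) for k, x in enumerate(observed)]
```

In this assumed run the deviation is first seen at node 2, and the punishing control stays in force at node 3 even though the observed state is back on the cooperative path, which is the role of memory in the construction.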

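To show that $u^*(\cdot) = (u^*_{S_1}(\cdot), \ldots, u^*_{S_m}(\cdot))$ is an ε-Nash equilibrium in $\Gamma^P_\alpha(x_0, T - t_0)$, we have to show that

$$H^\alpha_{S_i}(x_0, T - t_0; u^*(\cdot)) \ge H^\alpha_{S_i}(x_0, T - t_0; u^*(\cdot) \| u_{S_i}(\cdot)) - \varepsilon, \qquad (12)$$

for all $S_i \in P$ and all $u_{S_i}$. It is easy to see that when the $m$-tuple $u^*(\cdot)$ is played, the game develops along the optimal trajectory $\bar{x}(t)$. If the trajectory $\bar{x}(t)$ is also realized under $u^*(\cdot) \| u_{S_i}(\cdot)$, then (12) holds.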

Now suppose that under $u^*(\cdot) \| u_{S_i}(\cdot)$ the realized trajectory $x(t)$ is different from $\bar{x}(t)$. Suppose $\bar{t}(S_i) \in [t_k, t_{k+1})$. Since the motions of the players are independent, we get $x_{S_j}(t_{k+1}) = \bar{x}_{S_j}(t_{k+1})$ for $S_j \in P \setminus \{S_i\}$. From the definition of $u^*(\cdot)$ it follows that the players in $P \setminus \{S_i\}$ will use their strategies $\hat{u}_{S_j}(x(t_{k+1}), t_{k+1}; \cdot)$, and player $S_i$, starting from the position $(x(t_{k+1}), t_{k+1})$, will get no more than $V(x(t_{k+1}), T - t_{k+1}; \{S_i\}) + \varepsilon/2$, where $V(x(t_{k+1}), T - t_{k+1}; \{S_i\})$ is the value of the game $\Gamma_{\{S_i\}, P \setminus \{S_i\}}(x(t_{k+1}), T - t_{k+1})$. By choosing $\delta = t_{k+1} - t_k$ sufficiently small one can ensure that the integral $\int_{t_k}^{t_{k+1}} h_{S_i}(x(\tau))\,d\tau$ is small (less than $\varepsilon/4$). Then the total payoff $H^\alpha_{S_i}(x_0, T - t_0; u^*(\cdot) \| u_{S_i}(\cdot))$ of player $S_i$ in the game $\Gamma^P_\alpha(x_0, T - t_0)$ when the $m$-tuple of strategies $u^*(\cdot) \| u_{S_i}(\cdot)$ is played cannot exceed the amount

$$\int_{t_0}^{t_k} \beta_{S_i}(\tau)\,d\tau + \int_{t_k}^{t_{k+1}} h_{S_i}(x(\tau))\,d\tau + V(x(t_{k+1}), T - t_{k+1}; \{S_i\}) + \frac{\varepsilon}{2} \le \int_{t_0}^{t_k} \beta_{S_i}(\tau)\,d\tau + V(x(t_{k+1}), T - t_{k+1}; \{S_i\}) + \frac{3\varepsilon}{4}. \qquad (13)$$

When the $m$-tuple $u^*(\cdot)$ is played, the payoff $H^\alpha_{S_i}(x_0, T - t_0; u^*(\cdot))$ of player $S_i$ is equal to

$$\alpha_{S_i} = \int_{t_0}^{T} \beta_{S_i}(\tau)\,d\tau = \int_{t_0}^{t_k} \beta_{S_i}(\tau)\,d\tau + \alpha_{S_i}(t_k).$$

Since $\alpha(t_k) \in L^P(\bar{x}(t_k), T - t_k)$, we get $\alpha_{S_i}(t_k) \ge V(\bar{x}(t_k), T - t_k; \{S_i\})$. From the continuity of the function $V$ and the continuity of the trajectory $x(t)$, by an appropriate choice of $\delta = t_{k+1} - t_k$ the following inequality can be guaranteed: $V(\bar{x}(t_k), T - t_k; \{S_i\}) \ge V(x(t_{k+1}), T - t_{k+1}; \{S_i\}) - \varepsilon/4$. So $H^\alpha_{S_i}(x_0, T - t_0; u^*(\cdot))$ will be no less than

$$\int_{t_0}^{t_k} \beta_{S_i}(\tau)\,d\tau + V(x(t_{k+1}), T - t_{k+1}; \{S_i\}) - \frac{\varepsilon}{4}. \qquad (14)$$

Combining (13) and (14) we finish the proof of Theorem 1. This means that the cooperative solution (any imputation) can be strategically supported in a regularized game $\Gamma^P_\alpha(x_0, T - t_0)$ by a specially constructed ε-Nash equilibrium. □
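For completeness, the combination of (13) and (14) can be written as a single chain of inequalities. The block only restates the two bounds; the shorthand $A$ for the common integral term is introduced here for brevity and is not part of the original notation.

```latex
\begin{aligned}
H^{\alpha}_{S_i}(x_0, T - t_0; u^*(\cdot))
  &\ge A + V(x(t_{k+1}), T - t_{k+1}; \{S_i\}) - \frac{\varepsilon}{4}
     && \text{by (14)} \\
  &= \Bigl[ A + V(x(t_{k+1}), T - t_{k+1}; \{S_i\}) + \frac{3\varepsilon}{4} \Bigr] - \varepsilon \\
  &\ge H^{\alpha}_{S_i}(x_0, T - t_0; u^*(\cdot) \,\|\, u_{S_i}(\cdot)) - \varepsilon
     && \text{by (13)},
\end{aligned}
\qquad A = \int_{t_0}^{t_k} \beta_{S_i}(\tau)\, d\tau .
```

The last line is exactly inequality (12), with the total slack split as $\varepsilon/4$ from (14) and $3\varepsilon/4$ from (13).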

Theorem 2. Suppose that every $S \subset P$ has the same deviation instant. For every $\varepsilon > 0$, the strictly regularized game $\Gamma^P_\alpha(x_0, T - t_0)$ has a strong ε-Nash equilibrium with payoff $\alpha$.

Proof. The proof is based on the construction of a strong ε-Nash equilibrium in piecewise open-loop (POL) strategies with memory. Consider the POL strategies $\bar{u}(\cdot) = (a, \sigma)$, where $a$ maps each point $(\bar{x}(t_k), t_k)$ on the optimal trajectory to the open-loop control $\bar{u}_{S_i}(t)$, $t \in [t_k, t_{k+1})$, $S_i \in P$, satisfying (2), and $a$ is arbitrary at other points.

Consider a family of zero-sum games $\Gamma_{S, P \setminus S}(x, T - t)$ from the initial position $x$ and with duration $T - t$ between coalition $S$ and coalition $P \setminus S$, in which the payoff of coalition $S$ equals $\sum_{S_i \in S} H_{S_i}(x, T - t; u_{S_1}(t), \ldots, u_{S_m}(t))$. Let $\hat{u}_{P \setminus S}(x, t; \cdot)$ be an $\varepsilon/2$-optimal POL strategy of player $P \setminus S$ in $\Gamma_{S, P \setminus S}(x, T - t)$. Note that $\hat{u}_{P \setminus S}(x, t; \cdot) = \{\hat{u}_{S_j}, S_j \in P \setminus S\}$.

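Let $x(t) = \{x_{S_1}(t), \ldots, x_{S_m}(t)\}$ be the segment of an admissible trajectory satisfying (1) on the time interval $[t_0, t]$, $t \in [t_0, T]$. Since every $S \subset P$ has the same deviation instant, for every $S \subset P$ we can define $\bar{t}(S) = \bar{t}(S_i) = \sup\{t : x_{S_i}(t) = \bar{x}_{S_i}(t),\ t \in (t_0, T]\}$, $S_i \in S$. In what follows, let $S$ denote a coalition with the smallest deviation instant, $\bar{t}(S) = \min_{S' \subset P} \bar{t}(S')$. The instant $\bar{t}(S)$ belongs to one of the intervals $[t_k, t_{k+1})$, $k = 0, 1, 2, \ldots, l - 1$, and $\bar{t}(S) - t_0$ is the length of the time interval starting from $t_0$ on which $x(t)$ coincides with the cooperative trajectory $\bar{x}(t)$.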

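Define the following strategies of player $S_i \in P$:

$$u^*_{S_i}(\cdot) = \begin{cases} \bar{u}_{S_i}(t), & \text{for } (x(t_k), t_k) \text{ on the optimal cooperative trajectory;} \\ \hat{u}_{S_i}(x(t_{k+1}), t_{k+1}; \cdot), & S_i\text{-th component of the } \varepsilon/2\text{-optimal POL strategy of player } P \setminus S \\ & \text{in the game } \Gamma_{S, P \setminus S}(x(t_{k+1}), T - t_{k+1}), \text{ if } t_k \le \bar{t}(S) < t_{k+1}; \\ \text{arbitrary}, & \text{for all other positions.} \end{cases}$$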

We shall show that $u^*(\cdot) = (u^*_{S_1}(\cdot), \ldots, u^*_{S_m}(\cdot))$ is a strong ε-Nash equilibrium in $\Gamma^P_\alpha(x_0, T - t_0)$. We have to show that

$$\sum_{S_i \in S} H^\alpha_{S_i}(x_0, T - t_0; u^*(\cdot)) \ge \sum_{S_i \in S} H^\alpha_{S_i}(x_0, T - t_0; u^*(\cdot) \| u_S(\cdot)) - \varepsilon, \qquad (15)$$

for all $S \subset P$ and all $u_S = \{u_{S_i}, S_i \in S\}$. It is easy to see that when the $m$-tuple $u^*(\cdot)$ is played, the game develops along the optimal trajectory $\bar{x}(t)$. If the trajectory $\bar{x}(t)$ is also realized under $u^*(\cdot) \| u_S(\cdot)$, then (15) holds.

Now suppose that under $u^*(\cdot) \| u_S(\cdot)$ the realized trajectory $x(t)$ is different from $\bar{x}(t)$. Suppose $\bar{t}(S) \in [t_k, t_{k+1})$. Since the motions of the players are independent, we get $x_{S_j}(t_{k+1}) = \bar{x}_{S_j}(t_{k+1})$ for $S_j \in P \setminus S$. From the definition of $u^*(\cdot)$ it follows that the players in $P \setminus S$ will use their strategies $\hat{u}_{S_j}(x(t_{k+1}), t_{k+1}; \cdot)$, and coalition $S$, starting from the position $(x(t_{k+1}), t_{k+1})$, will get no more than $V(x(t_{k+1}), T - t_{k+1}; S) + \varepsilon/2$, where $V(x(t_{k+1}), T - t_{k+1}; S)$ is the value of the game $\Gamma_{S, P \setminus S}(x(t_{k+1}), T - t_{k+1})$. By choosing $\delta = t_{k+1} - t_k$ sufficiently small one can ensure that the integral $\int_{t_k}^{t_{k+1}} \sum_{S_i \in S} h_{S_i}(x(\tau))\,d\tau$ is small (less than $\varepsilon/4$). Then the total payoff $\sum_{S_i \in S} H^\alpha_{S_i}(x_0, T - t_0; u^*(\cdot) \| u_S(\cdot))$ of coalition $S$ in the game $\Gamma^P_\alpha(x_0, T - t_0)$ when the $m$-tuple of strategies $u^*(\cdot) \| u_S(\cdot)$ is played cannot exceed the amount

$$\sum_{S_i \in S} \int_{t_0}^{t_k} \beta_{S_i}(\tau)\,d\tau + \sum_{S_i \in S} \int_{t_k}^{t_{k+1}} h_{S_i}(x(\tau))\,d\tau + V(x(t_{k+1}), T - t_{k+1}; S) + \frac{\varepsilon}{2} \le \sum_{S_i \in S} \int_{t_0}^{t_k} \beta_{S_i}(\tau)\,d\tau + V(x(t_{k+1}), T - t_{k+1}; S) + \frac{3\varepsilon}{4}. \qquad (16)$$

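When the $m$-tuple $u^*(\cdot)$ is played, the payoff $\sum_{S_i \in S} H^\alpha_{S_i}(x_0, T - t_0; u^*(\cdot))$ of coalition $S$ is equal to

$$\sum_{S_i \in S} \alpha_{S_i} = \sum_{S_i \in S} \int_{t_0}^{T} \beta_{S_i}(\tau)\,d\tau = \sum_{S_i \in S} \int_{t_0}^{t_k} \beta_{S_i}(\tau)\,d\tau + \sum_{S_i \in S} \alpha_{S_i}(t_k).$$

Since $\alpha(t_k) \in C^P(\bar{x}(t_k), T - t_k)$, we get $\sum_{S_i \in S} \alpha_{S_i}(t_k) \ge V(\bar{x}(t_k), T - t_k; S)$. From the continuity of the function $V$ and the continuity of the trajectory $x(t)$, by an appropriate choice of $\delta = t_{k+1} - t_k$ the following inequality can be guaranteed:

$$V(\bar{x}(t_k), T - t_k; S) \ge V(x(t_{k+1}), T - t_{k+1}; S) - \frac{\varepsilon}{4}.$$

So $\sum_{S_i \in S} H^\alpha_{S_i}(x_0, T - t_0; u^*(\cdot))$ will be no less than

$$\sum_{S_i \in S} \int_{t_0}^{t_k} \beta_{S_i}(\tau)\,d\tau + V(x(t_{k+1}), T - t_{k+1}; S) - \frac{\varepsilon}{4}. \qquad (17)$$

Combining (16) and (17) we finish the proof of Theorem 2. □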

It should be noted that if every $S \subset P$ has the same deviation instant, a strong ε-Nash equilibrium is also an ε-Nash equilibrium in the strictly regularized game $\Gamma^P_\alpha(x_0, T - t_0)$. So the existence of a strong ε-Nash equilibrium implies the existence of an ε-Nash equilibrium in $\Gamma^P_\alpha(x_0, T - t_0)$. Moreover, if every $S \subset N$ has the same deviation instant, we can easily construct a strong ε-Nash equilibrium in the strictly regularized game $\Gamma^P_\alpha(x_0, T - t_0)$ from a strong ε-Nash equilibrium in the strictly regularized game $\Gamma_\alpha(x_0, T - t_0)$ (see Petrosyan and Zenkevich (2009)). So the existence of a strong ε-Nash equilibrium in the strictly regularized game $\Gamma_\alpha(x_0, T - t_0)$ implies the existence of a strong ε-Nash equilibrium in the strictly regularized game $\Gamma^P_\alpha(x_0, T - t_0)$.


Acknowledgments. The authors express their gratitude to R. J. Brown for useful discussions on the subject.

References

Dockner, E., Jørgensen, S., Van Long, N. and Sorger, G. (2000). Differential Games in Economics and Management Science. Cambridge University Press, Cambridge.

Gao, H., Petrosyan, L. (2009). Dynamic Cooperative Game. Beijing: Science Press, 391-399 (in Chinese).

Gao, H., Petrosyan, L., Qiao, H., Sedakov, A., Xu, G. (2013). Transformation of Characteristic Function in Dynamic Games. Journal of Systems Science and Information, 1(1), 22-37.

Gao, H., Petrosyan, L., Sedakov, A. (2014). Strongly Time-Consistent Solutions for Two-Stage Network Games. Procedia Computer Science, 31, 255-264.

Kuhn, H.W. (1953). Extensive Games and the Problem of Imputation. In: Contributions to the Theory of Games II (eds. H. W. Kuhn and A.W. Tucker), Princeton, Princeton University Press, pp. 193-216.

Kozlovskaya, N., Petrosyan, L., Zenkevich, N. (2010). Coalitional Solution of a Game-Theoretic Emission Reduction Model. International Game Theory Review, 12(3), 275-286.

Petrosyan, L. (1977). Stable Solutions of Differential Games with Many Participants. Vestnik of Leningrad University, 19, 46-52.

Petrosyan, L. (1993). Differential Games of Pursuit. World Scientific, Singapore, pp. 270-282.

Petrosyan, L. (1997). Agreeable Solutions in Differential Games. International Journal of Mathematics, Game Theory and Algebra, 7, 165-177.

Petrosyan, L. and Danilov, N. N. (1979). Stability of Solutions in Nonzero Sum Differential Games with Transferable Payoffs. Journal of Leningrad University N1, 52-59 (in Russian).

Petrosyan, L. and Danilov, N. N. (1982). Cooperative Differential Games and Their Applications. Tomsk University Press, Tomsk.

Petrosyan, L. and Danilov, N. N. (1986). Classification of Dynamically Stable Solutions in Cooperative Differential Games. Isvestia of high school, 7, 24-35 (in Russian).

Petrosyan, L. and Grauer, L. V. (2002). Strong Nash Equilibrium in Multistage Games. International Game Theory Review, 4(3), 244-264.

Petrosjan, L. A., Mamkina, S. I. (2006). Dynamic Games with Coalitional Structures. International Game Theory Review, 8(2), 295-307.

Petrosyan, L. and Zaccour, G. (2003). Time-Consistent Shapley Value of Pollution Cost Reduction. Journal of Economic Dynamics and Control, 27, 381-398.

Petrosyan, L. and Zenkevich, N. A. (1996). Game Theory. World Scientific, Singapore.

Petrosyan, L. and Zenkevich, N. A. (2009). Conditions for Sustainable Cooperation. Contributions to Game Theory and Management, SPb, GSOM, Vol.2, 344-355.

von Neumann, J. and Morgenstern, O. (1944). Theory of Games and Economic Behavior. Princeton University Press, Princeton, NJ.

Yeung, D.W. K., Petrosyan, L. (2005). Subgame Consistent Economic Optimization. Springer, New York, NY.

Yeung, D. W. K. (2006). An Irrational-Behavior-Proofness Condition in Cooperative Differential Games. International Game Theory Review, 8(4), 739-744.
