CC BY

Conditions for Sustainable Cooperation

Leon A. Petrosyan1 and Nikolay A. Zenkevich2

1 Faculty of Applied Mathematics

Saint Petersburg University Universitetskiy pr., 35, Petrodvorets,

Saint Petersburg, Russia, 198504

2 Graduate School of Management

Saint Petersburg University Volkhovsky per. 3,

Saint Petersburg, Russia, 199004

Abstract. Three aspects must be taken into account when the stability of long-range cooperative agreements is investigated: time-consistency of the cooperative agreements, strategic stability, and irrational behavior proofness. Mathematical results based on imputation distribution procedures (IDP) are developed to deal with these aspects of cooperation. We prove that, for a special class of differential games, a time-consistent cooperative agreement can be strategically supported by a Nash equilibrium. We also consider an example in which all three conditions are satisfied.

Keywords: differential game, cooperative solution, time-consistency of the cooperative agreements, payoff distribution procedures (PDP), imputation distribution procedures (IDP), strategic stability, irrational behavior proofness

1. Introduction

Cooperation is a basic form of human behavior, and for many practical reasons it is important that cooperation remains stable over the time interval under consideration. Three aspects must be taken into account when the stability of long-range cooperative agreements is investigated.

1. Time-consistency (dynamic stability) of the cooperative agreements. Time-consistency means that, as the cooperation develops, the cooperating partners are guided by the same optimality principle at each instant of time and hence have no incentive to deviate from the previously adopted cooperative behavior.

2. Strategic stability. The agreement is to be developed in such a manner that at least individual deviations from the cooperation by each partner give no advantage to the deviator. This means that the outcome of the cooperative agreement must be attained in some Nash equilibrium, which guarantees the strategic support of the cooperation.

3. Irrational behavior proofness. This aspect must also be taken into account, since one cannot always be sure that the partners will behave rationally over the long time interval for which the cooperative agreement is valid. The partners involved in the cooperation must be sure that even in the worst-case scenario they will not lose compared with noncooperative behavior.

A mathematical tool based on the payoff distribution procedure (PDP), or imputation distribution procedure (IDP), is developed to deal with these aspects of cooperation.

2. Continuous Time Case

Consider an $n$-person differential game $\Gamma(x_0, T - t_0)$ with prescribed duration and independent motions on the time interval $[t_0, T]$. The motion equations have the form

$$\dot{x}_i = f_i(x_i, u_i), \quad u_i \in U_i \subset R^{\ell}, \quad x_i = (x_{i1}, \ldots, x_{im}) \in R^m, \quad f_i = (f_{i1}, \ldots, f_{im}) \in R^m,$$

$$x_i(t_0) = x_i^0, \quad i = 1, \ldots, n. \tag{1}$$

It is assumed that the system of differential equations (1) satisfies all conditions necessary for the existence, prolongability and uniqueness of the solution for any n-tuple of measurable controls u1(t),..., un(t).

The payoff of player $i$ is defined as

$$H_i(x_0, T - t_0; u_1(\cdot), \ldots, u_n(\cdot)) = \int_{t_0}^{T} h_i(x_0; x(\tau))\,d\tau,$$

where $h_i(x_0; x)$ is a continuous function and $x(\tau) = \{x_1(\tau), \ldots, x_n(\tau)\}$ is the solution of (1) when the open-loop controls $u_1(t), \ldots, u_n(t)$ are used and $x(t_0) = \{x_1(t_0), \ldots, x_n(t_0)\} = \{x_1^0, \ldots, x_n^0\} = x_0$.

Suppose that there exist an $n$-tuple of open-loop controls $\bar{u}(t) = (\bar{u}_1(t), \ldots, \bar{u}_n(t))$ and a trajectory $\bar{x}(t)$, $t \in [t_0, T]$, such that

$$\max_{u_1(t), \ldots, u_n(t)} \sum_{i=1}^{n} H_i(x_0, T - t_0; u_1(t), \ldots, u_n(t)) = \sum_{i=1}^{n} H_i(x_0, T - t_0; \bar{u}_1(t), \ldots, \bar{u}_n(t)) = \sum_{i=1}^{n} \int_{t_0}^{T} h_i(x_0; \bar{x}(t))\,dt. \tag{2}$$

The trajectory $\bar{x}(t) = (\bar{x}_1(t), \ldots, \bar{x}_n(t))$ satisfying (2) is called the "optimal cooperative trajectory".

Let $N = \{1, \ldots, n\}$ be the set of players. Define the characteristic function in $\Gamma(x_0, T - t_0)$ in the classical way:

$$V(x_0, T - t_0; N) = \sum_{i=1}^{n} \int_{t_0}^{T} h_i(x_0; \bar{x}(t))\,dt,$$

$$V(x_0, T - t_0; \emptyset) = 0,$$

$$V(x_0, T - t_0; S) = \mathrm{Val}\,\Gamma_{S, N \setminus S}(x_0, T - t_0), \tag{3}$$

where $\mathrm{Val}\,\Gamma_{S, N \setminus S}(x_0, T - t_0)$ is the value of the zero-sum game played between coalition $S$ acting as the first player and coalition $N \setminus S$ acting as the second player, with the payoff of player $S$ equal to

$$\sum_{i \in S} H_i(x_0, T - t_0; u_1(\cdot), \ldots, u_n(\cdot)).$$

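Define $L(x_0, T - t_0)$ as the imputation set in the game $\Gamma(x_0, T - t_0)$ (see von Neumann and Morgenstern (1947)):

$$L(x_0, T - t_0) = \Big\{ \alpha = (\alpha_1, \ldots, \alpha_n) : \alpha_i \ge V(x_0, T - t_0; \{i\}), \ \sum_{i \in N} \alpha_i = V(x_0, T - t_0; N) \Big\}. \tag{4}$$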
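The two defining conditions of the imputation set (4), individual rationality and efficiency, can be checked mechanically. The following is a minimal sketch; the function name `is_imputation`, the tolerance, and the numbers in the usage lines are illustrative assumptions, not part of the paper.

```python
from typing import Sequence


def is_imputation(alpha: Sequence[float], v_single: Sequence[float],
                  v_grand: float, tol: float = 1e-9) -> bool:
    """Check the two conditions in (4) for a candidate payoff vector alpha.

    v_single[i] plays the role of V(x0, T - t0; {i}) and v_grand the role of
    V(x0, T - t0; N); both are supplied by the caller (hypothetical inputs).
    """
    if len(alpha) != len(v_single):
        return False
    # Individual rationality: alpha_i >= V({i}) for every player.
    individually_rational = all(a >= v - tol for a, v in zip(alpha, v_single))
    # Efficiency: the components sum to the value of the grand coalition.
    efficient = abs(sum(alpha) - v_grand) <= tol
    return individually_rational and efficient


# Hypothetical two-player data: V({1}) = V({2}) = 0 and V(N) = 15.
print(is_imputation([7.5, 7.5], [0.0, 0.0], 15.0))
print(is_imputation([16.0, -1.0], [0.0, 0.0], 15.0))
```

The second call fails individual rationality for the second player even though the components still sum to the grand-coalition value.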

Regularized game $\Gamma_\alpha(x_0, T - t_0)$. For every $\alpha \in L(x_0, T - t_0)$ define the noncooperative game $\Gamma_\alpha(x_0, T - t_0)$, which differs from the game $\Gamma(x_0, T - t_0)$ only in the payoffs defined along the optimal cooperative trajectory $\bar{x}(t)$, $t \in [t_0, T]$.

Let $\alpha \in L(x_0, T - t_0)$. Define the imputation distribution procedure (IDP) (see Petrosjan (1993)) as a function $\beta(t) = (\beta_1(t), \ldots, \beta_n(t))$, $t \in [t_0, T]$, such that

$$\alpha_i = \int_{t_0}^{T} \beta_i(t)\,dt, \quad i \in N. \tag{5}$$

Denote by $H_i^{\alpha}(x_0, T - t_0; u_1(\cdot), \ldots, u_n(\cdot))$ the payoff function in the game $\Gamma_\alpha(x_0, T - t_0)$ and by $x(t)$ the corresponding trajectory. Then

$$H_i^{\alpha}(x_0, T - t_0; u_1(\cdot), \ldots, u_n(\cdot)) = H_i(x_0, T - t_0; u_1(\cdot), \ldots, u_n(\cdot))$$

if there does not exist $t \in (t_0, T]$ such that $x(\tau) = \bar{x}(\tau)$ for $\tau \in (t_0, t]$. Let $t = \sup\{t' : x(\tau) = \bar{x}(\tau), \ \tau \in [t_0, t']\}$ and $t > t_0$; then

$$H_i^{\alpha}(x_0, T - t_0; u_1(\cdot), \ldots, u_n(\cdot)) = \int_{t_0}^{t} \beta_i(\tau)\,d\tau + H_i(x(t), T - t; u_1(\cdot), \ldots, u_n(\cdot)) = \int_{t_0}^{t} \beta_i(\tau)\,d\tau + \int_{t}^{T} h_i(x(t); x(\tau))\,d\tau.$$

In the special case when $x(t) = \bar{x}(t)$, $t \in [t_0, T]$ (that is, when $x(t)$ is an optimal cooperative trajectory in the sense of (2)), we have

$$H_i^{\alpha}(x_0, T - t_0; u_1(\cdot), \ldots, u_n(\cdot)) = \int_{t_0}^{T} \beta_i(\tau)\,d\tau = \alpha_i.$$

Thus, by the definition of the payoff function in the game $\Gamma_\alpha(x_0, T - t_0)$, the payoffs along the optimal trajectory are equal to the components of the imputation $\alpha = (\alpha_1, \ldots, \alpha_n)$.

Consider the current subgames (see von Neumann and Morgenstern (1947)) $\Gamma(\bar{x}(t), T - t)$ along $\bar{x}(t)$ and the current imputation sets $L(\bar{x}(t), T - t)$. Let $\alpha(t) \in L(\bar{x}(t), T - t)$. Suppose that $\alpha(t)$ can be selected as a differentiable function of $t$, $t \in [t_0, T]$.

Definition 2.1. The game $\Gamma_\alpha(x_0, T - t_0)$ is called a regularization of the game $\Gamma(x_0, T - t_0)$ ($\alpha$-regularization) if the IDP $\beta$ is defined in such a way that

$$\alpha_i(t) = \int_{t}^{T} \beta_i(\tau)\,d\tau$$

or

$$\beta_i(t) = -\alpha_i'(t). \tag{6}$$

From (6) we get

$$\alpha_i = \int_{t_0}^{t} \beta_i(\tau)\,d\tau + \alpha_i(t), \tag{7}$$

where $\alpha = (\alpha_1, \alpha_2, \ldots, \alpha_n) \in L(x_0, T - t_0)$ and $\alpha(t) = (\alpha_1(t), \alpha_2(t), \ldots, \alpha_n(t)) \in L(\bar{x}(t), T - t)$.

Suppose now that $M(x_0, T - t_0) \subset L(x_0, T - t_0)$ is some optimality principle in the cooperative version of the game $\Gamma(x_0, T - t_0)$, and that $M(\bar{x}(t), T - t) \subset L(\bar{x}(t), T - t)$ is the same optimality principle defined in the subgames $\Gamma(\bar{x}(t), T - t)$ with initial conditions on the optimal trajectory. $M$ can be the C-core, the NM-solution, the Shapley value, the nucleolus, etc. If $\alpha \in M(x_0, T - t_0)$ and $\alpha(t) \in M(\bar{x}(t), T - t)$, then condition (7) gives the time-consistency of the chosen imputation $\alpha$, or of the chosen optimality principle, and hence the time-consistency (dynamic stability) of the chosen cooperative agreement.
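Relations (6) and (7) can be verified numerically for a concrete choice of $\alpha_i(t)$. The sketch below is only an illustration: the function `alpha_i`, the horizon, and the helper names are assumptions chosen so that $\alpha_i(T) = 0$, and the derivative and integral are approximated by a central difference and the trapezoid rule.

```python
T0, T = 0.0, 2.0  # hypothetical time interval [t0, T]


def alpha_i(t: float) -> float:
    """Hypothetical differentiable component alpha_i(t) with alpha_i(T) = 0."""
    return 3.0 * (T - t) + 0.5 * (T - t) ** 2


def beta_i(t: float, h: float = 1e-6) -> float:
    """IDP from (6): beta_i(t) = -d/dt alpha_i(t), via a central difference."""
    return -(alpha_i(t + h) - alpha_i(t - h)) / (2.0 * h)


def integrate(f, a: float, b: float, steps: int = 2000) -> float:
    """Composite trapezoid rule on [a, b]."""
    if a == b:
        return 0.0
    width = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for k in range(1, steps):
        total += f(a + k * width)
    return total * width


def time_consistency_gap(t: float) -> float:
    """Left side minus right side of (7); should vanish up to numerical error."""
    return alpha_i(T0) - (integrate(beta_i, T0, t) + alpha_i(t))


for t in (0.0, 0.5, 1.0, 2.0):
    print(t, abs(time_consistency_gap(t)) < 1e-6)
```

For this particular $\alpha_i$ the IDP is $\beta_i(t) = 3 + (T - t)$, so the amount already paid out plus the imputation component of the remaining subgame always adds up to $\alpha_i(t_0)$.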

Consider now the problem of strategic stability of cooperative agreements. Based on an imputation distribution procedure $\beta$ satisfying (5), we can prove the following basic theorem.

Theorem 2.1. In the regularization $\Gamma_\alpha(x_0, T - t_0)$ of the game, for every $\varepsilon > 0$ there exists an $\varepsilon$-Nash equilibrium (Nash (1951)) with payoffs $\alpha = (\alpha_1, \ldots, \alpha_i, \ldots, \alpha_n)$.

Proof. The proof is based on the actual construction of the $\varepsilon$-Nash equilibrium in piecewise open-loop (POL) strategies with memory.

Recall the definition of POL strategies with memory in a differential game. Denote by $\hat{x}(t)$ any admissible trajectory of the system (1) on the time interval $[t_0, t]$, $t \in [t_0, T]$.

The strategy $u_i(\cdot)$ of player $i$ is called POL if it consists of a pair $(\sigma, a)$, where $\sigma$ is a partition of the time interval $[t_0, T]$, $t_0 < t_1 < \ldots < t_l = T$ ($t_{k+1} - t_k = \delta > 0$), and $a$ is a mapping that assigns to each point $(\hat{x}(t_k), t_k)$, $t_k \in \sigma$, an open-loop control $u_i(t)$, $t \in [t_k, t_{k+1})$.

Consider a family of zero-sum games $\Gamma_{\{i\}, N \setminus \{i\}}(x, T - t)$, associated with $\Gamma(x, T - t)$ but not with $\Gamma_\alpha(x, T - t)$, from the initial position $x$ and of duration $T - t$, played between the coalition $S$ consisting of the single player $i$ and the coalition $N \setminus \{i\}$, with the payoff of player $i$ equal to

$$H_i(x, T - t; u_1(\cdot), \ldots, u_n(\cdot)).$$

The payoff of player $N \setminus \{i\}$ in $\Gamma_{\{i\}, N \setminus \{i\}}(x, T - t)$ equals $-H_i$. Let $\hat{u}(x, t; \cdot)$ be the $\varepsilon$-optimal POL strategy of player $N \setminus \{i\}$ in $\Gamma_{\{i\}, N \setminus \{i\}}(x, T - t)$. Note that $\hat{u}(x, t; \cdot) = \{\hat{u}_j(x, t; \cdot)\}$, $j \in N \setminus \{i\}$.

Let $\hat{x}(t) = \{\hat{x}_1(t), \ldots, \hat{x}_n(t)\}$ be the segment of an admissible trajectory of (1) defined on the time interval $[t_0, t]$, $t \in [t_0, T]$. For each $i \in \{1, \ldots, n\}$ define $t(i) = \sup\{t_i : \hat{x}_i(t_i) = \bar{x}_i(t_i)\}$ and $t(j) = \min_i t(i)$. The instant $t(j)$ lies in one of the intervals $[t_k, t_{k+1})$, $k = 0, 1, \ldots, l - 1$. Thus $t(i) - t_0$ is the length of the time interval starting from $t_0$ on which $\hat{x}_i(t)$ coincides with $\bar{x}_i(t)$, the $i$-th component of the cooperative trajectory $\bar{x}(t)$, and $t(j) - t_0$ is the length of the time interval starting from $t_0$ on which $\hat{x}(t)$ coincides with the cooperative trajectory $\bar{x}(t)$.

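Define the following strategies of player $i \in N$:

$$u_i^*(\cdot) = \begin{cases} \bar{u}_i(t) & \text{for } (\hat{x}(t_k), t_k) \text{ on the optimal cooperative trajectory } \bar{x}(t) \ (\hat{x}(t) = \bar{x}(t), \ t \in [t_0, t_k]); \\ \hat{u}_i(\hat{x}(t_{k+1}), t_{k+1}; \cdot) & \text{the } i\text{-th component of the } \varepsilon/2\text{-optimal POL strategy of player } N \setminus \{j\} \text{ in the game } \Gamma_{\{j\}, N \setminus \{j\}}(\hat{x}(t_{k+1}), T - t_{k+1}), \text{ if } t_k \le t(j) < t_{k+1}; \\ \text{arbitrary} & \text{for all other positions.} \end{cases}$$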
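The three branches of this strategy amount to a trigger rule evaluated at the partition nodes $t_k$. The sketch below is a simplified, discretized illustration rather than the construction of the proof: each player's state is assumed to be a single number sampled at the nodes, the names `first_deviator`, `trigger_control`, `coop_control`, and `punish_control` are hypothetical, and the "arbitrary" branch is resolved by returning the cooperative control.

```python
from typing import Callable, Optional, Sequence, Tuple

State = Tuple[float, ...]  # joint state at a node, one scalar per player (assumption)


def first_deviator(observed: Sequence[State], cooperative: Sequence[State],
                   tol: float = 1e-12) -> Optional[Tuple[int, int]]:
    """Return (node index, player index) of the earliest deviation, or None."""
    for node, (x_obs, x_coop) in enumerate(zip(observed, cooperative)):
        for j, (a, b) in enumerate(zip(x_obs, x_coop)):
            if abs(a - b) > tol:
                return node, j
    return None


def trigger_control(player: int, observed: Sequence[State],
                    cooperative: Sequence[State],
                    coop_control: Callable, punish_control: Callable):
    """Control prescribed to `player` on [t_k, t_{k+1}), given states at t_0..t_k."""
    k = len(observed) - 1
    deviation = first_deviator(observed, cooperative)
    if deviation is None:
        # Branch 1: the observed trajectory is still the cooperative one.
        return coop_control(player, k)
    node, j = deviation
    if j == player:
        # Branch 3 ("arbitrary"): here we simply keep the cooperative control.
        return coop_control(player, k)
    # Branch 2: play the punishing strategy against deviator j, started at the
    # node where the deviation was first observed.
    return punish_control(player, j, node, observed)


coop = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
cc = lambda i, k: ("coop", i, k)
pc = lambda i, j, node, obs: ("punish", i, j, node)
print(trigger_control(0, coop[:2], coop, cc, pc))
print(trigger_control(0, [(0.0, 0.0), (1.0, 1.5)], coop, cc, pc))
```

Because the motions are independent, comparing the observed and cooperative states component by component identifies which player left the cooperative trajectory.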

We show that $u^*(\cdot) = (u_1^*(\cdot), \ldots, u_n^*(\cdot))$ is an $\varepsilon$-Nash equilibrium in $\Gamma_\alpha(x_0, T - t_0)$. The following equality holds:

$$H_i^{\alpha}(x_0, T - t_0; u^*(\cdot)) = H_i^{\alpha}(x_0, T - t_0; u_1^*(\cdot), \ldots, u_n^*(\cdot)) = \int_{t_0}^{T} \beta_i(t)\,dt = \alpha_i. \tag{8}$$

Consider the $n$-tuple $(u^*(\cdot) \| u_i(\cdot))$, in which player $i$ replaces his strategy $u_i^*(\cdot)$ by $u_i(\cdot)$. We have to show that

$$H_i^{\alpha}(x_0, T - t_0; u^*(\cdot)) \ge H_i^{\alpha}(x_0, T - t_0; u^*(\cdot) \| u_i(\cdot)) - \varepsilon \tag{9}$$

for all $i \in N$ and all POL strategies $u_i(\cdot)$ of player $i$.

It is easy to see that when the $n$-tuple $u^*(\cdot)$ is played the game develops along the optimal trajectory $\bar{x}(t)$. If the trajectory $\bar{x}(t)$ is also realized in $(u^*(\cdot) \| u_i(\cdot))$, then the payoff of player $i$ is again $\alpha_i$ as in (8), so (9) holds. Suppose now that in $(u^*(\cdot) \| u_i(\cdot))$ a trajectory $x(t)$ different from $\bar{x}(t)$ is realized. Then let

$$\bar{t} = \inf\{t : x(t) \ne \bar{x}(t)\}$$

and $\bar{t} \in [t_{k-1}, t_k)$. Since the motions of the players are independent, we get $x_m(t_k) = \bar{x}_m(t_k)$ for $m \in N \setminus \{i\}$ and $x_i(t_k) \ne \bar{x}_i(t_k)$ (but $x_j(t_{k-1}) = \bar{x}_j(t_{k-1})$ for $j \in N$). Then from the definition of $u^*(\cdot)$ it follows that the players $m \in N \setminus \{i\}$ will use their strategies $\hat{u}_m(x(t_k), t_k; \cdot)$, which are $\varepsilon/2$-optimal in the zero-sum game $\Gamma_{\{i\}, N \setminus \{i\}}(x(t_k), T - t_k)$ against player $i$, who deviates from the optimal trajectory on the time interval $[t_{k-1}, t_k)$.

If the players from the set $N \setminus \{i\}$ use their strategies $\hat{u}_m(x(t_k), t_k; \cdot)$, then player $i$, starting from the position $(x(t_k), t_k)$, will get no more than

$$V(x(t_k), T - t_k; \{i\}) + \frac{\varepsilon}{2},$$

where $V(x(t_k), T - t_k; \{i\})$ is the value of the game $\Gamma_{\{i\}, N \setminus \{i\}}(x(t_k), T - t_k)$. Then the total payoff of player $i$ in $\Gamma_\alpha(x_0, T - t_0)$ when the $n$-tuple of strategies $(u^*(\cdot) \| u_i(\cdot))$ is played cannot exceed the amount

$$\int_{t_0}^{t_{k-1}} \beta_i(\tau)\,d\tau + V(x(t_k), T - t_k; \{i\}) + \frac{\varepsilon}{2} + \int_{t_{k-1}}^{t_k} h_i(x(\tau))\,d\tau. \tag{10}$$

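But the payoff of player $i$ when the $n$-tuple $u^*(\cdot)$ is played is equal to

$$\alpha_i = \int_{t_0}^{T} \beta_i(\tau)\,d\tau = \int_{t_0}^{t_{k-1}} \beta_i(\tau)\,d\tau + \int_{t_{k-1}}^{T} \beta_i(\tau)\,d\tau = \int_{t_0}^{t_{k-1}} \beta_i(\tau)\,d\tau + \alpha_i(t_{k-1}). \tag{11}$$

By the definition of the IDP (see (5), (6)), $\alpha(t_{k-1}) \in L(\bar{x}(t_{k-1}), T - t_{k-1})$, and therefore

$$\int_{t_{k-1}}^{T} \beta_i(\tau)\,d\tau = \alpha_i(t_{k-1}) \ge V(\bar{x}(t_{k-1}), T - t_{k-1}; \{i\}). \tag{12}$$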

From the continuity of the function $V$ and the continuity of the trajectory $x(t)$, by an appropriate choice of $\delta > 0$ ($t_{k+1} - t_k = \delta$) the following inequalities can be guaranteed:

$$\left| V(\bar{x}(t_{k-1}), T - t_{k-1}; \{i\}) - V(x(t_k), T - t_k; \{i\}) \right| < \frac{\varepsilon}{4},$$

$$\int_{t_{k-1}}^{T} \beta_i(\tau)\,d\tau = \alpha_i(t_{k-1}) \ge V(x(t_k), T - t_k; \{i\}) - \frac{\varepsilon}{4}.$$

Compare $\alpha_i(t_{k-1})$ and $V(x(t_k), T - t_k; \{i\}) + \varepsilon/2 + \int_{t_{k-1}}^{t_k} h_i(x(\tau))\,d\tau$. By choosing $\delta = t_{k+1} - t_k$ sufficiently small one can ensure that the integral $\int_{t_{k-1}}^{t_k} h_i(x(\tau))\,d\tau$ is also small (less than $\varepsilon/4$).

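Adding the amount $\int_{t_0}^{t_{k-1}} \beta_i(\tau)\,d\tau$ to both sides of (12) and using the previous inequalities, we get

$$\begin{aligned}
\alpha_i &= \int_{t_0}^{t_{k-1}} \beta_i(\tau)\,d\tau + \alpha_i(t_{k-1}) \ge \int_{t_0}^{t_{k-1}} \beta_i(\tau)\,d\tau + V(\bar{x}(t_{k-1}), T - t_{k-1}; \{i\}) \\
&\ge \int_{t_0}^{t_{k-1}} \beta_i(\tau)\,d\tau + V(x(t_k), T - t_k; \{i\}) - \frac{\varepsilon}{4} \\
&\ge \int_{t_0}^{t_{k-1}} \beta_i(\tau)\,d\tau + V(x(t_k), T - t_k; \{i\}) - \frac{\varepsilon}{4} + \int_{t_{k-1}}^{t_k} h_i(x(\tau))\,d\tau - \frac{\varepsilon}{4} \\
&\ge \int_{t_0}^{t_{k-1}} \beta_i(\tau)\,d\tau + V(x(t_k), T - t_k; \{i\}) + \int_{t_{k-1}}^{t_k} h_i(x(\tau))\,d\tau - \frac{\varepsilon}{2} \\
&\ge \int_{t_0}^{t_{k-1}} \beta_i(\tau)\,d\tau + V(x(t_k), T - t_k; \{i\}) + \int_{t_{k-1}}^{t_k} h_i(x(\tau))\,d\tau + \frac{\varepsilon}{2} - \frac{\varepsilon}{2} - \frac{\varepsilon}{2}.
\end{aligned} \tag{13}$$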

Here the first four addends on the right-hand side of the inequality constitute the upper bound (10) on the payoff of player $i$ when $(u^*(\cdot) \| u_i(\cdot))$ is played. But $\alpha_i$ is the payoff of player $i$ when $u^*(\cdot)$ is played, and we get

$$\begin{aligned}
H_i^{\alpha}(x_0, T - t_0; u^*(\cdot)) = \alpha_i &\ge \int_{t_0}^{t_{k-1}} \beta_i(\tau)\,d\tau + V(x(t_k), T - t_k; \{i\}) + \int_{t_{k-1}}^{t_k} h_i(x(\tau))\,d\tau + \frac{\varepsilon}{2} - \varepsilon \\
&\ge H_i^{\alpha}(x_0, T - t_0; u^*(\cdot) \| u_i(\cdot)) - \varepsilon,
\end{aligned} \tag{14}$$

which is (9). The theorem is proved. □

This means that the cooperative solution (any imputation) can be strategically supported in the regularized game $\Gamma_\alpha(x_0, T - t_0)$: it is realized in the specially constructed $\varepsilon$-Nash equilibrium $u^*(\cdot)$ defined in the proof of Theorem 2.1.

Conditions for the irrational behavior proofness of the cooperative solutions. Suppose now that at some intermediate instant of time the irrational behavior of some player (or players) forces the other players to leave the cooperative agreement. The irrational behavior proofness condition (see Yeung (2007)) then requires that the following inequality be satisfied:

$$V(x_0, T - t_0; \{i\}) \le \int_{t_0}^{t} \beta_i(\tau)\,d\tau + V(\bar{x}(t), T - t; \{i\}), \quad i \in N. \tag{15}$$

If the IDP $\beta(t)$ can be chosen in such a way that both the time-consistency and the irrational behavior proofness conditions are satisfied (strategic stability, as we have shown, follows from time-consistency), then the cooperative agreement about the choice of the imputation $\alpha = (\alpha_1, \alpha_2, \ldots, \alpha_n)$ is stable.
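Condition (15) can be tested on a time grid once $\beta_i$ and the value $V(\bar{x}(t), T - t; \{i\})$ along the cooperative trajectory are available as functions of $t$. The sketch below assumes both are given as Python callables; the function name, the grid size, and the linear value function in the usage lines are hypothetical.

```python
from typing import Callable


def irrational_behavior_proof(beta: Callable[[float], float],
                              value_along_path: Callable[[float], float],
                              t0: float, T: float,
                              grid: int = 200, tol: float = 1e-9) -> bool:
    """Check (15) at grid points t0 = s_0 < s_1 < ... < s_grid = T.

    beta(t) is beta_i(t); value_along_path(t) stands for V(x(t), T - t; {i})
    evaluated on the cooperative trajectory (both supplied by the caller).
    """
    step = (T - t0) / grid
    v_start = value_along_path(t0)
    paid_out = 0.0  # trapezoid approximation of the integral of beta on [t0, t]
    for k in range(grid + 1):
        t = t0 + k * step
        if k > 0:
            paid_out += 0.5 * (beta(t - step) + beta(t)) * step
        if v_start > paid_out + value_along_path(t) + tol:
            return False
    return True


# Hypothetical data on [0, 1]: V(x(t), T - t; {i}) = 2 (1 - t).
value = lambda t: 2.0 * (1.0 - t)
print(irrational_behavior_proof(lambda t: 3.0, value, 0.0, 1.0))
print(irrational_behavior_proof(lambda t: 1.0, value, 0.0, 1.0))
```

In this toy case the guaranteed value falls at rate 2, so a constant payout rate of 3 keeps the right-hand side of (15) above $V(x_0, T - t_0; \{i\})$, while a rate of 1 does not.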

From (15) we have the following condition for the IDP $\beta(t) = (\beta_1(t), \beta_2(t), \ldots, \beta_n(t))$:

$$\beta_i(\tau) \ge -\frac{d}{d\tau} V(\bar{x}(\tau), T - \tau; \{i\}), \quad i = 1, \ldots, n. \tag{16}$$

In (16), $V(\bar{x}(\tau), T - \tau; \{i\})$ is the value of the zero-sum game played between the coalition $N \setminus \{i\}$ as one player and player $\{i\}$, with the coalitional payoff equal to $[-H_i(\bar{x}(\tau), T - \tau; u_1, \ldots, u_n)]$. Suppose that $y(t)$, $t \in [\tau, T]$, is the trajectory of this zero-sum game when the saddle point strategies are played. We suppose that for each initial condition $\bar{x}(\tau)$, $T - \tau$, $\tau \in [t_0, T]$, such a saddle point exists (if not, we can consider an $\varepsilon$-saddle point in piecewise open-loop strategies, which always exists for every given $\varepsilon > 0$, but the following formulas are then to be understood with $\varepsilon$-accuracy).

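Then we can write

$$V(\bar{x}(\tau), T - \tau; \{i\}) = \int_{\tau}^{T} h_i(\bar{x}(\tau); y(t))\,dt,$$

where $y(\tau) = \bar{x}(\tau)$. From (16) we have

$$\begin{aligned}
\beta_i(\tau) &\ge -\frac{d}{d\tau} \int_{\tau}^{T} h_i(\bar{x}(\tau); y(t))\,dt \\
&= -\Big[ -h_i(\bar{x}(\tau); y(\tau)) + \int_{\tau}^{T} \sum_{l=1}^{n} \sum_{k=1}^{m} \frac{\partial h_i(\bar{x}(\tau), y(t))}{\partial x_{lk}} f_{lk}(\bar{x}(\tau), \bar{u}(\tau))\,dt \Big] \\
&= h_i(\bar{x}(\tau); \bar{x}(\tau)) - \int_{\tau}^{T} \sum_{l=1}^{n} \sum_{k=1}^{m} \frac{\partial h_i(\bar{x}(\tau), y(t))}{\partial x_{lk}} f_{lk}(\bar{x}(\tau), \bar{u}(\tau))\,dt,
\end{aligned}$$

or

$$\beta_i(\tau) \ge h_i(\bar{x}(\tau); \bar{x}(\tau)) - \int_{\tau}^{T} \sum_{l=1}^{n} \sum_{k=1}^{m} \frac{\partial h_i(\bar{x}(\tau), y(t))}{\partial x_{lk}} f_{lk}(\bar{x}(\tau), \bar{u}(\tau))\,dt.$$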

3. Discrete Time Case

In what follows we consider, as the basic model, a game in extensive form with perfect information.

Definition 3.1. A game tree is a finite oriented treelike graph $K$ with root $x_0$.

We shall use the following notation. Let $x$ be some vertex (position). We denote by $K(x)$ the subtree of $K$ with root $x$, and by $Z(x)$ the set of immediate successors of $x$. The vertices $y$ directly following $x$ are called alternatives in $x$ ($y \in Z(x)$). The player who makes a decision in $x$ (who selects the next alternative position in $x$) is denoted by $i(x)$. The choice of player $i(x)$ in position $x$ is denoted by $\bar{x} \in Z(x)$.

Let $N = \{1, \ldots, n\}$ be the set of all players in the game.

Definition 3.2. A game in extensive form with perfect information (see Kuhn (1953)) $G(x_0)$ is a graph tree $K(x_0)$ with the following additional properties:

- The set of vertices (positions) is split into $n + 1$ subsets $P_1, P_2, \ldots, P_{n+1}$, which form a partition of the set of all vertices of the graph tree $K$. The vertices (positions) $x \in P_i$ are called the personal positions of player $i$, $i = 1, \ldots, n$; the vertices (positions) $x \in P_{n+1}$ are called terminal positions.

- In each vertex (position) $x$ a system of real numbers $h(x) = (h_1(x), \ldots, h_n(x))$ is defined; $h_i(x)$ is interpreted as the stage payoff of player $i$ in the vertex (position) $x$.


Definition 3.3. A strategy of player $i$ is a mapping $u_i(\cdot)$ which associates with each position $x \in P_i$ a unique alternative $y \in Z(x)$.

As in the previous case, denote by $H_i(x; u_1(\cdot), \ldots, u_n(\cdot))$ the payoff function of player $i \in N$ in the subgame $G(x)$ starting from the position $x$:

$$H_i(x; u_1(\cdot), \ldots, u_n(\cdot)) = \sum_{j=1}^{l} h_i(x_j),$$

where $(x_1, x_2, \ldots, x_l)$ is the path realized in the subgame $G(x)$ when the $n$-tuple of strategies $(u_1(\cdot), \ldots, u_n(\cdot))$ is played, and $x_1 = x$.
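The payoff definition translates directly into code once the tree is stored as a successor map. The following sketch uses a small hypothetical tree (positions `a` to `e`, two players indexed 0 and 1) that is not the example of Section 4; because the personal positions partition the non-terminal vertices, the $n$-tuple of strategies is merged into one dictionary from position to chosen alternative.

```python
from typing import Dict, List, Tuple

# Hypothetical game tree: Z(x) for every position; terminal positions have no successors.
SUCCESSORS: Dict[str, List[str]] = {"a": ["b", "c"], "b": ["d", "e"],
                                    "c": [], "d": [], "e": []}
# Stage payoffs h(x) = (h_1(x), h_2(x)) in every position (hypothetical numbers).
STAGE_PAYOFF: Dict[str, Tuple[float, float]] = {
    "a": (1, 0), "b": (0, 1), "c": (2, 2), "d": (3, 0), "e": (1, 4)}


def realized_path(start: str, successors: Dict[str, List[str]],
                  strategy: Dict[str, str]) -> List[str]:
    """Follow the chosen alternatives from `start` until a terminal position."""
    path = [start]
    while successors[path[-1]]:
        choice = strategy[path[-1]]
        assert choice in successors[path[-1]], "strategy must pick an alternative"
        path.append(choice)
    return path


def payoffs(path: List[str], stage_payoff, n_players: int) -> Tuple[float, ...]:
    """H_i = sum of the stage payoffs h_i over the realized path, for each player."""
    return tuple(sum(stage_payoff[x][i] for x in path) for i in range(n_players))


joint_strategy = {"a": "b", "b": "e"}  # player 0 moves in a, player 1 moves in b
path = realized_path("a", SUCCESSORS, joint_strategy)
print(path, payoffs(path, STAGE_PAYOFF, 2))
```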

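Denote by $\bar{u}(\cdot) = (\bar{u}_1(\cdot), \ldots, \bar{u}_n(\cdot))$ the $n$-tuple of strategies and by $\bar{x} = (\bar{x}_0, \bar{x}_1, \ldots, \bar{x}_m)$, $\bar{x}_m \in P_{n+1}$, the trajectory (path) such that

$$\max_{u_1(\cdot), \ldots, u_n(\cdot)} \sum_{i=1}^{n} H_i(x_0; u_1(\cdot), \ldots, u_n(\cdot)) = \sum_{i=1}^{n} H_i(x_0; \bar{u}_1(\cdot), \ldots, \bar{u}_n(\cdot)) = \sum_{i=1}^{n} \Big( \sum_{k=0}^{m} h_i(\bar{x}_k) \Big). \tag{17}$$

The path $\bar{x} = (\bar{x}_0, \ldots, \bar{x}_m)$ satisfying (17) is called the "optimal cooperative trajectory".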
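On a finite tree a path satisfying (17) can be found by a simple recursion that maximizes the total stage payoff over all root-to-terminal paths. The sketch below reuses a small hypothetical tree (not the example of Section 4); the function name `cooperative_path` is an assumption, and ties between equally good branches are broken in favour of the first alternative listed.

```python
from typing import Dict, List, Tuple

# Hypothetical tree and stage payoffs h(x) = (h_1(x), h_2(x)).
SUCCESSORS: Dict[str, List[str]] = {"a": ["b", "c"], "b": ["d", "e"],
                                    "c": [], "d": [], "e": []}
STAGE_PAYOFF: Dict[str, Tuple[float, float]] = {
    "a": (1, 0), "b": (0, 1), "c": (2, 2), "d": (3, 0), "e": (1, 4)}


def cooperative_path(x: str, successors, stage_payoff) -> Tuple[float, List[str]]:
    """Return (maximal total payoff of all players from x, a path attaining it)."""
    here = sum(stage_payoff[x])  # joint stage payoff collected in x
    children = successors[x]
    if not children:
        return here, [x]
    best_total, best_path = max(
        (cooperative_path(y, successors, stage_payoff) for y in children),
        key=lambda result: result[0])
    return here + best_total, [x] + best_path


total, path = cooperative_path("a", SUCCESSORS, STAGE_PAYOFF)
print(total, path)
```

The returned total is the quantity that the classical definition assigns to the grand coalition, $V(x_0; N)$.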

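Define the characteristic function in $G(x_0)$ in the classical way:

$$V(x_0; N) = \sum_{i=1}^{n} \Big( \sum_{k=0}^{m} h_i(\bar{x}_k) \Big),$$

$$V(x_0; \emptyset) = 0,$$

$$V(x_0; S) = \mathrm{Val}\,\Gamma_{S, N \setminus S}(x_0),$$

where $\mathrm{Val}\,\Gamma_{S, N \setminus S}(x_0)$ is the value of the zero-sum game played between coalition $S$ acting as the first player and coalition $N \setminus S$ acting as the second player, with the payoff of player $S$ equal to

$$\sum_{i \in S} H_i(x_0; u_1(\cdot), \ldots, u_n(\cdot)).$$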
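For a single-player coalition $S = \{i\}$ on a finite tree with perfect information, the value of this zero-sum game can be computed by backward induction: player $i$ maximizes his own accumulated payoff in his personal positions, and the complementary coalition minimizes it elsewhere. The sketch below illustrates this on a small hypothetical tree (not the example of Section 4); the `OWNER` map and the function name are assumptions.

```python
from typing import Dict, List, Tuple

# Hypothetical tree, owners of non-terminal positions, and stage payoffs.
SUCCESSORS: Dict[str, List[str]] = {"a": ["b", "c"], "b": ["d", "e"],
                                    "c": [], "d": [], "e": []}
OWNER: Dict[str, int] = {"a": 0, "b": 1}  # i(x): who moves in each position
STAGE_PAYOFF: Dict[str, Tuple[float, float]] = {
    "a": (1, 0), "b": (0, 1), "c": (2, 2), "d": (3, 0), "e": (1, 4)}


def value_single(x: str, i: int, successors, owner, stage_payoff) -> float:
    """V(x; {i}): maximin payoff of player i in the subgame starting at x."""
    here = stage_payoff[x][i]
    children = successors[x]
    if not children:
        return here
    continuation = [value_single(y, i, successors, owner, stage_payoff)
                    for y in children]
    # Player i picks the best continuation; every other mover is treated as
    # the opposing coalition N \ {i} and picks the worst one for player i.
    return here + (max(continuation) if owner[x] == i else min(continuation))


print(value_single("a", 0, SUCCESSORS, OWNER, STAGE_PAYOFF))
print(value_single("a", 1, SUCCESSORS, OWNER, STAGE_PAYOFF))
```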

Define $L(x_0)$ as the imputation set in the game $G(x_0)$:

$$L(x_0) = \Big\{ \alpha = (\alpha_1, \ldots, \alpha_n) : \alpha_i \ge V(x_0; \{i\}), \ \sum_{i \in N} \alpha_i = V(x_0; N) \Big\}.$$

Regularized game $G_\alpha(x_0)$. For every $\alpha \in L(x_0)$ define the noncooperative game $G_\alpha(x_0)$, which differs from the game $G(x_0)$ only in the payoffs defined in the vertices (positions) along the optimal cooperative path $\bar{x} = (\bar{x}_0, \ldots, \bar{x}_m)$. Let $\alpha \in L(x_0)$.

Define the imputation distribution procedure (IDP) as a function $\beta(k) = (\beta_1(k), \ldots, \beta_n(k))$, $k = 0, 1, \ldots, m$, such that

$$\alpha_i = \sum_{k=0}^{m} \beta_i(k). \tag{18}$$

Suppose that in the situation $(u_1(\cdot), \ldots, u_n(\cdot))$ the path $(x_0, \ldots, x_{l'})$ is realized. Define the payoff function $H_i^{\alpha}(x_0; u_1(\cdot), \ldots, u_n(\cdot))$ in the game $G_\alpha(x_0)$ by

$$H_i^{\alpha}(x_0; u_1(\cdot), \ldots, u_n(\cdot)) = \sum_{k=0}^{r} \beta_i(k) + \sum_{k=r+1}^{l'} h_i(x_k),$$

where $r = \max\{k : x_k = \bar{x}_k\}$.
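The payoff in the regularized game pays the IDP amounts $\beta_i(k)$ while the realized path stays on the cooperative path and the original stage payoffs afterwards. The sketch below implements this rule for a small hypothetical tree; the positions, stage payoffs, cooperative path, and the IDP values in `BETA` are illustrative assumptions.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical stage payoffs and cooperative path (not the example of Section 4).
STAGE_PAYOFF: Dict[str, Tuple[float, float]] = {
    "a": (1, 0), "b": (0, 1), "c": (2, 2), "d": (3, 0), "e": (1, 4)}
COOPERATIVE: List[str] = ["a", "b", "e"]
# Hypothetical IDP: BETA[k][i] is beta_i(k); the columns sum to alpha = (3, 4).
BETA: List[Tuple[float, float]] = [(1.0, 0.0), (0.0, 1.0), (2.0, 3.0)]


def regularized_payoff(i: int, realized: Sequence[str], cooperative: Sequence[str],
                       beta: Sequence[Sequence[float]], stage_payoff) -> float:
    """H_i^alpha: IDP payments up to the last common position r, stage payoffs after."""
    common = min(len(realized), len(cooperative))
    # r = max{k : x_k equals the k-th position of the cooperative path};
    # both paths start at the root, so k = 0 always qualifies.
    r = max(k for k in range(common) if realized[k] == cooperative[k])
    on_path = sum(beta[k][i] for k in range(r + 1))
    off_path = sum(stage_payoff[x][i] for x in realized[r + 1:])
    return on_path + off_path


print(regularized_payoff(0, COOPERATIVE, COOPERATIVE, BETA, STAGE_PAYOFF))
print(regularized_payoff(0, ["a", "c"], COOPERATIVE, BETA, STAGE_PAYOFF))
```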

By the definition of the payoff function in the game $G_\alpha(x_0)$, the payoffs along the optimal cooperative trajectory are equal to the components of the imputation $\alpha = (\alpha_1, \ldots, \alpha_n)$ (that is, $H_i^{\alpha}(x_0; \bar{u}_1(\cdot), \ldots, \bar{u}_n(\cdot)) = \alpha_i$).

Consider the current subgames $G(\bar{x}_k)$ along the optimal path $\bar{x}$ and the current imputation sets $L(\bar{x}_k)$. Let $\alpha^k \in L(\bar{x}_k)$.

Definition 3.4. The game $G_\alpha(x_0)$ is called a regularization of the game $G(x_0)$ ($\alpha$-regularization) if the IDP $\beta$ is defined in such a way that

$$\alpha_i^k = \sum_{j=k}^{m} \beta_i(j),$$

or $\beta_i(k) = \alpha_i^k - \alpha_i^{k+1}$, $i \in N$, $k = 0, 1, \ldots, m - 1$, $\beta_i(m) = \alpha_i^m$, $\alpha_i^0 = \alpha_i$.
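The difference formula of Definition 3.4 is easy to apply once the imputations $\alpha^k$ chosen in the subgames along the cooperative path are listed. The sketch below is a generic illustration; the function name and the sample imputations are hypothetical.

```python
from typing import List, Sequence, Tuple


def idp_from_imputations(alphas: Sequence[Sequence[float]]) -> List[Tuple[float, ...]]:
    """Return beta with beta[k][i] = alpha_i^k - alpha_i^{k+1} and beta[m][i] = alpha_i^m.

    alphas[k] is the imputation chosen in the subgame starting at the k-th
    position of the cooperative path, k = 0, ..., m.
    """
    beta = [tuple(now - nxt for now, nxt in zip(alphas[k], alphas[k + 1]))
            for k in range(len(alphas) - 1)]
    beta.append(tuple(alphas[-1]))  # last step pays out what is left
    return beta


# Hypothetical imputations along a path with m = 2 for two players.
alphas = [(4.0, 6.0), (3.0, 5.0), (1.0, 2.0)]
beta = idp_from_imputations(alphas)
print(beta)
# Each player's stage payments from step k onward add up to alpha_i^k.
print([sum(b[0] for b in beta[k:]) for k in range(3)])
```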

Theorem 3.1. In the regularization $G_\alpha(x_0)$ of the game there exists a Nash equilibrium with payoffs $\alpha = (\alpha_1, \ldots, \alpha_n)$.

Proof. Along the cooperative path we have

$$\alpha_i^k \ge V(\bar{x}_k; \{i\}), \quad i \in N, \ k = 0, 1, \ldots, m,$$

since $\alpha^k = (\alpha_1^k, \ldots, \alpha_n^k) \in L(\bar{x}_k)$ is an imputation in $G(\bar{x}_k)$ (note that here $V(\bar{x}_k; \{i\})$ is computed in the subgame $G(\bar{x}_k)$, not in $G_\alpha(\bar{x}_k)$). At the same time

$$\alpha_i^k = \sum_{j=k}^{m} \beta_i(j),$$

and we get

$$\sum_{j=k}^{m} \beta_i(j) \ge V(\bar{x}_k; \{i\}), \quad i \in N, \ k = 0, 1, \ldots, m. \tag{19}$$

But $\sum_{j=k}^{m} \beta_i(j)$ is the payoff of player $i$ in the subgame $G_\alpha(\bar{x}_k)$ along the cooperative path, and from (19), using arguments similar to those in the proof of Theorem 2.1, one can construct a Nash equilibrium with payoffs $\alpha = (\alpha_1, \ldots, \alpha_n)$ and resulting cooperative path $\bar{x} = (\bar{x}_0, \ldots, \bar{x}_m)$.

The irrational behavior proofness condition in this case is

$$\sum_{j=0}^{l} \beta_i(j) + V(\bar{x}_{l+1}; \{i\}) \ge V(x_0; \{i\}), \quad 0 \le l < m, \ i \in N. \tag{20}$$


4. Example

In this example we take the Shapley value (Shapley (1953)) as the imputation. Using the proposed regularization of the game, we shall see that there exists a Nash equilibrium with payoffs equal to the components of the Shapley value.

Fig. 1. Game $G(x_0)$

In the game $G(x_0)$, $N = \{1, 2\}$, $P_1 = \{x_0, x_2, x_4\}$, $P_2 = \{x_1, x_3\}$, $P_3 = \{y_1, y_2, y_3, y_4, y_5, y_6\}$, $h(x_0) = (1, 0)$, $h(x_1) = (0, 1)$, $h(x_2) = (1, 0)$, $h(x_3) = (2, 1)$, $h(x_4) = (1, 1)$, $h(x_5) = (4, 4)$, $h(y_1) = (0, 0)$, $h(y_2) = (0, 0)$, $h(y_3) = (1, 2)$, $h(y_4) = (2, 2)$, $h(y_5) = (3, 2)$, $h(y_6) = h(x_5) = (4, 4)$. The cooperative path is $\bar{x} = \{x_1, x_2, x_3, x_4, x_5\}$.

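| | $x_0$ | $x_1$ | $x_2$ | $x_3$ | $x_4$ | $x_5$ |
|---|---|---|---|---|---|---|
| $V(x; \{1\})$ | 0 | 0 | 2 | 4 | 5 | 4 |
| $V(x; \{2\})$ | 0 | 3 | 5 | 4 | 5 | 4 |
| $V(x; \{1, 2\})$ | 15 | 15 | 14 | 13 | 10 | 8 |
| $Sh(x; \{1\})$ | 7.5 | 6 | 5.5 | 6.5 | 6 | 4 |
| $Sh(x; \{2\})$ | 7.5 | 9 | 8.5 | 6.5 | 4 | 4 |
| $\beta_1(x) = \beta_1(j)$ | 1.5 | 0.5 | -1 | 0.5 | 2 | 4 |
| $\beta_2(x) = \beta_2(j)$ | -1.5 | 0.5 | 2 | 2.5 | 0 | 4 |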

It can easily be seen that inequality (19),

$$\sum_{j=k}^{m} \beta_i(j) \ge V(\bar{x}_k; \{i\}),$$

holds for $i \in N$ in this case, and that the irrational behavior proofness condition (20) is also satisfied:

$$\sum_{j=0}^{l} \beta_i(j) + V(\bar{x}_{l+1}; \{i\}) \ge V(x_0; \{i\}), \quad i = 1, 2, \ 1 \le l \le 4.$$
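The IDP rows of the table and condition (20) can be reproduced from the tabulated Shapley values and single-player values. The sketch below transcribes those two pairs of rows (columns $x_0, \ldots, x_5$) and recomputes $\beta_i(j)$ as successive differences of the Shapley components; the dictionary and function names are assumptions, and the check of (20) runs over $l = 0, \ldots, 4$.

```python
from typing import Dict, List

# Rows transcribed from the table, columns x_0 .. x_5; keys are the players.
V_SINGLE: Dict[int, List[float]] = {1: [0, 0, 2, 4, 5, 4], 2: [0, 3, 5, 4, 5, 4]}
SHAPLEY: Dict[int, List[float]] = {1: [7.5, 6, 5.5, 6.5, 6, 4],
                                   2: [7.5, 9, 8.5, 6.5, 4, 4]}


def idp(shapley_row: List[float]) -> List[float]:
    """beta_i(j) = Sh_i(x_j) - Sh_i(x_{j+1}) for j < m, and beta_i(m) = Sh_i(x_m)."""
    steps = [now - nxt for now, nxt in zip(shapley_row, shapley_row[1:])]
    return steps + [shapley_row[-1]]


def condition_20(beta: List[float], v_row: List[float]) -> bool:
    """Check sum_{j<=l} beta_i(j) + V(x_{l+1}; {i}) >= V(x_0; {i}) for l = 0..m-1."""
    m = len(beta) - 1
    return all(sum(beta[: l + 1]) + v_row[l + 1] >= v_row[0] for l in range(m))


BETA = {i: idp(SHAPLEY[i]) for i in (1, 2)}
print(BETA[1])
print(BETA[2])
print(all(condition_20(BETA[i], V_SINGLE[i]) for i in (1, 2)))
```

The recomputed sequences coincide with the last two rows of the table, and each sums to the corresponding Shapley value at $x_0$.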

References

von Neumann, J. and O. Morgenstern (1947). Theory of Games and Economic Behavior. Princeton University Press: Princeton.

Petrosjan, L. (1993). Differential Games of Pursuit. World Scientific: Singapore.

Nash, J. (1951). Non-cooperative games. Annals of Mathematics, 54, 286-295.

Kuhn, H. W. (1953). Extensive games and the problem of information. In: Contributions to the Theory of Games II (eds. H. W. Kuhn and A. W. Tucker), Princeton, Princeton University Press, pp. 193-216.

Shapley, L. S. (1953). A value for n-person games. In: Contributions to the Theory of Games (eds. H. W. Kuhn and A. W. Tucker), Princeton, Princeton University Press, pp. 307-315.

Yeung, D. W. K. (2007). An irrational-behavior-proofness condition in cooperative differential games. International Game Theory Review, 9, 256-273.
