CC BY
Some Cases of Cooperation in Differential Pursuit Games

Yaroslavna Pankratova

International Banking Institute,

60, Nevsky prospect, St. Petersburg, 191023, Russia E-mail address: yasyap@gmail.com

Abstract. In this paper we study a time-optimal model of pursuit in which players move on a plane with bounded velocities. The game is supposed to be a nonzero-sum simple pursuit game between an evader and m pursuers acting independently of each other. The key point of the work is to construct some cooperative solutions of the game and compare them with non-cooperative solutions such as Nash equilibria. It is important to give a reasonable answer to the question of whether cooperation is profitable in differential pursuit games. We consider all possible coalitions of the players in the game. For example, the pursuers may promise the evader some share of the total payoff in return for cooperation.

In that way, a cooperative game in characteristic function form is constructed, and its various cooperative solutions are found. We prove that in the game $\Gamma_v(x_1^0,\ldots,x_m^0,z^0)$ the core is nonempty for any initial positions of the players. In a dynamic game, existence of the core at the initial moment of time is not sufficient for it to be accepted as a solution. We prove that the core in this game is time-consistent.

Keywords: group pursuit game, cooperative game, Nash equilibrium, core.

Introduction

The process of pursuit represents a typical conflict situation. When only two players are involved in the process of pursuit, we deal with a classical zero-sum differential pursuit game. These games grew out of the problem of setting and solving military pursuit games and were developed by [Isaacs, 1965]. When more than two players participate in the game and the players' objectives are not strictly opposite, it is reasonable to consider such a game as a nonzero-sum one. Even in this case, however, some game theorists have used zero-sum models, dividing all the players into two groups with opposite interests (see [Chikrii, 1992] and [Melikjan, 1981]).

In contrast to this approach, we consider a group pursuit game as a nonzero-sum game (see [Petrosjan, 1983], [Tarashnina, 1998]). Earlier, differential games were used to model military problems. Now we try to apply them to some economic situations, where the players' goals are not always strictly opposed. In such games, by "capture" we understand just a meeting of players and the delivery of some goods or information. In other words, the players do not aim to destroy each other. These are the so-called nonzero-sum pursuit games. In order to investigate such a nonzero-sum game we construct both a corresponding game in normal form and its TU-cooperative version. Then we consider the two games (cooperative and non-cooperative) and find their solutions. The key point of this paper is the comparison and analysis of these solutions.

In the framework of classical cooperative game theory with transferable utilities many solution concepts are known, among them the famous concept of the core [Scarf, 1967]. Basic notation and results are described in the monograph of [Pecherskii and Yanovskaya, 2004].

Dynamic aspects of solving classical cooperative games are considered in [Chistyakov, 1993] and [Petrosjan and Kuzyutin, 2008].

In dynamic games, the property that a solution remains feasible throughout the game is very important. This requirement is called time-consistency of a solution of a game. This property and the associated imputation distribution procedure (IDP) were introduced by Petrosjan, 1989.

Moving along the cooperative trajectory, the players in some sense travel through subgames, which differ from each other in initial state and duration. As time passes, either the players' opportunities or the players' interests may change. Therefore, at some instant t, in the corresponding current subgame, the originally adopted optimal solution may either not exist or no longer satisfy the players' interests. Time-consistency of a solution of a dynamic game means that at each time instant within the game the players do not have any reason to deviate from the originally adopted "optimal" behaviour.

1. Differential game of pursuit with one evader and m pursuers

The game under study is a time-optimal model of pursuit in which n players (m pursuers $P_1,\ldots,P_m$ and one evader $E$) move on a plane with bounded velocities $\alpha_1,\ldots,\alpha_m$ and $\beta$, respectively. Moreover,

$$\beta < \min\{\alpha_1,\alpha_2,\ldots,\alpha_m\}, \qquad \alpha_1,\alpha_2,\ldots,\alpha_m,\beta = \mathrm{const}.$$

The players $P_1,\ldots,P_m$ and $E$ start their motion at the positions $x_1^0,\ldots,x_m^0$ and $z^0$, respectively, and can make decisions continuously in time. At each instant they may choose the direction of their motion (velocity vector) and their speed within prescribed limits. Thus, the sets of control variables of the players have the following forms

$$U_{P_j} = \{u_{P_j} = (u_{P_j}^1, u_{P_j}^2) : (u_{P_j}^1)^2 + (u_{P_j}^2)^2 \le \alpha_j^2\}, \quad j = 1,\ldots,m,$$

$$U_E = \{u_E = (u_E^1, u_E^2) : (u_E^1)^2 + (u_E^2)^2 \le \beta^2\}.$$

In this case, the motion of the players is described by the following system of differential equations

$$\dot x_j = u_{P_j}, \quad u_{P_j} \in U_{P_j}, \quad j = 1,\ldots,m,$$

$$\dot z = u_E, \quad u_E \in U_E \qquad (1)$$

with initial conditions

$$x_1(0) = x_1^0, \; \ldots, \; x_m(0) = x_m^0, \quad z(0) = z^0. \qquad (2)$$

We describe the case of perfect information. This means that each player, choosing control variables $u_{P_j}(x_1^t,\ldots,x_m^t,z^t)$ and $u_E(x_1^t,\ldots,x_m^t,z^t)$ at each time instant $t \ge 0$, knows the time $t$ and his own position as well as the positions of all other players.

We assume that the pursuers use strategies with discrimination against the evader. This means that at each instant $t$ players $P_1, P_2, \ldots, P_m$ have additional information about the value of the vector-parameter $u_E$ chosen by evader $E$ at the same instant $t$. In such a situation, the evader is said to be discriminated. The evader uses piecewise open-loop strategies.

Denote by $U_{P_j}$ and $U_E$ the admissible strategy sets of the players $P_j$ and $E$, respectively.

The functions $x_j(t)$, $j = 1,\ldots,m$, and $z(t)$, $t \in [0, t_E]$, which satisfy equations (1) and initial conditions (2), where $t_E$ is the time at which the game ends (defined below), are called trajectories of the players $P_j$ and $E$. Here by capture we mean coincidence of the players' positions.

Denote by $P_j^t = x_j^t$ and by $E^t = z^t$ the current positions of pursuer $P_j$ and evader $E$ at the time instant $t$, and let

$$t_{P_j}(x_j^0, z^0, u_{P_j}(\cdot), u_E(\cdot)) = \min\{t : x_j^t = z^t\}, \quad j = 1,\ldots,m.$$

If there is no such $t$, then $t_{P_j} = +\infty$.

Let

$$t_E(x_1^0,\ldots,x_m^0,z^0,u_{P_1}(\cdot),\ldots,u_{P_m}(\cdot),u_E(\cdot)) = \min\{t_{P_1},\ldots,t_{P_m}\}.$$

Denote by $K_E$ the payoff function of the evader. It is equal to the time of the evader's meeting with the first of the pursuers multiplied by some number $\gamma > 0$. Here $\gamma$ is the price of a unit of time. So,

$$K_E(x_1^0,\ldots,x_m^0,z^0,u_{P_1}(\cdot),\ldots,u_{P_m}(\cdot),u_E(\cdot)) = \gamma \times t_E(x_1^0,\ldots,x_m^0,z^0,u_{P_1}(\cdot),\ldots,u_{P_m}(\cdot),u_E(\cdot)).$$

The payoff function of player $P_j$ ($j = 1,2,\ldots,m$) is given by

$$K_{P_j}(x_1^0,\ldots,x_m^0,z^0,u_{P_1}(\cdot),\ldots,u_{P_m}(\cdot),u_E(\cdot)) = -\gamma \times t_{P_j}(x_1^0,\ldots,x_m^0,z^0,u_{P_1}(\cdot),\ldots,u_{P_m}(\cdot),u_E(\cdot)).$$

The evader gets the total time of the pursuit. Each pursuer gets a negative value of his time of pursuit. The game ends as soon as at least one of the pursuers catches the evader.
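The payoff rule can be illustrated with a minimal Python sketch. The function name `payoffs` and the numerical capture times are hypothetical and serve only as an illustration; a pursuer who never meets the evader is given an infinite capture time, as in the definition of the capture time above.

```python
import math

def payoffs(capture_times, gamma):
    """Return (K_E, [K_P1, ..., K_Pm]) for given individual capture times.

    capture_times[j] is the meeting time of pursuer j with the evader
    (math.inf if they never meet); gamma > 0 is the price of a time unit.
    """
    t_E = min(capture_times)                    # the game ends at the first meeting
    K_E = gamma * t_E                           # evader's payoff
    K_P = [-gamma * t for t in capture_times]   # each pursuer's payoff
    return K_E, K_P

# Hypothetical capture times for three pursuers; the third never catches E.
K_E, K_P = payoffs([2.0, 3.5, math.inf], gamma=1.5)
print(K_E, K_P)
```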

The objective of each player in the game is to maximize his own payoff function. In other words, each pursuer has a reason to meet the evader before the other pursuers do, and the evader wants to be caught as late as possible. So, we define the nonzero-sum pursuit game as a normal form game as follows

$$\Gamma(x_1^0,\ldots,x_m^0,z^0) = \langle N, \{U_j\}_{j\in N}, \{K_j\}_{j\in N}\rangle,$$

where $N = \{P_1,\ldots,P_m,E\}$ is the set of players, $U_j$ is the set of admissible strategies of player $j$ and $K_j$ is the payoff function of the $j$-th player ($j \in N$).

As a solution of this game we consider a concept of Nash equilibrium.

Here the discrimination of player $E$ is essential, since the pursuers choose the directions of their movement depending on the choice of the evader.

If $u_E = (u_E^1, u_E^2)$ is a control variable of the evader, then movement with the velocity vector $u_{P_j} = \left(\sqrt{\alpha_j^2 - (u_E^2)^2},\; u_E^2\right)$, written in coordinates whose first axis is directed from $P_j$ to $E$, is called the parallel pursuit strategy ($\Pi$-strategy) of the pursuer $P_j$, $j = 1,\ldots,m$.

It is known that if pursuer $P$ uses the $\Pi$-strategy and evader $E$ uses a piecewise open-loop strategy, then their meeting happens within the Apollonius circle.

The Apollonius circle is the set of points $M$ such that

$$\frac{|E^0 M|}{\beta} = \frac{|P^0 M|}{\alpha}.$$

Fig. 1: Apollonius circle
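The connection between parallel pursuit and the Apollonius circle can be checked numerically. The sketch below is only an illustration under stated assumptions: the evader keeps a constant velocity of maximal speed, the pursuer matches the evader's velocity component perpendicular to the line of sight and spends the rest of his speed on closing the distance, and the function names and numbers are hypothetical.

```python
import math

def pi_strategy_velocity(p, e, u_e, alpha):
    """Pursuer velocity under parallel pursuit (assumed form of the strategy)."""
    dx, dy = e[0] - p[0], e[1] - p[1]
    dist = math.hypot(dx, dy)
    lx, ly = dx / dist, dy / dist                 # unit vector from P to E
    along = u_e[0] * lx + u_e[1] * ly             # evader speed along the line of sight
    px, py = u_e[0] - along * lx, u_e[1] - along * ly   # perpendicular component
    closing = math.sqrt(alpha ** 2 - (px ** 2 + py ** 2))
    return (px + closing * lx, py + closing * ly)

def meeting(p, e, u_e, alpha):
    """Meeting time and point when E moves with constant velocity u_e."""
    u_p = pi_strategy_velocity(p, e, u_e, alpha)
    dist = math.hypot(e[0] - p[0], e[1] - p[1])
    # The relative velocity is directed along the line of sight,
    # so the distance shrinks at a constant rate.
    rate = math.hypot(u_p[0] - u_e[0], u_p[1] - u_e[1])
    t = dist / rate
    return t, (e[0] + u_e[0] * t, e[1] + u_e[1] * t)

# Hypothetical data: alpha = 2, beta = 1, evader runs "upwards" at full speed.
P0, E0, alpha, beta = (0.0, 0.0), (3.0, 0.0), 2.0, 1.0
t_meet, M = meeting(P0, E0, (0.0, beta), alpha)
EM = math.hypot(M[0] - E0[0], M[1] - E0[1])
PM = math.hypot(M[0] - P0[0], M[1] - P0[1])
print(t_meet, EM / beta, PM / alpha)
```

In this example the two printed ratios coincide, so the meeting point satisfies the defining relation of the Apollonius circle.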

The Apollonius point is the point on the Apollonius circle that is farthest from the evader's position.

Construct the Apollonius circles for the case of two pursuers. The Apollonius circle $A_i = A(x_i^0, z^0)$ for the pursuit game $\Gamma_{P_i,E}(x_i^0, z^0)$ between $P_i$ and $E$ is defined in the following way

$$A_i = A(x_i^0, z^0) = S(O_i^0, R_i^0),$$

where

$$|E^0 O_i^0| = d_i = \beta^2 \times \frac{\|x_i^0 - z^0\|}{\alpha_i^2 - \beta^2}, \qquad R_i^0 = \alpha_i \times \beta \times \frac{\|x_i^0 - z^0\|}{\alpha_i^2 - \beta^2}.$$

Similarly, $A_j = A(x_j^0, z^0) = S(O_j^0, R_j^0)$ is the Apollonius circle for the pursuit game $\Gamma_{P_j,E}(x_j^0, z^0)$ between $P_j$ and $E$, where

$$|E^0 O_j^0| = d_j = \beta^2 \times \frac{\|x_j^0 - z^0\|}{\alpha_j^2 - \beta^2}, \qquad R_j^0 = \alpha_j \times \beta \times \frac{\|x_j^0 - z^0\|}{\alpha_j^2 - \beta^2}.$$

Fig. 2: Apollonius circles
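The formulas for the centre and the radius can be turned into a short Python sketch. It assumes (this is a standard geometric fact, not spelled out in the text) that the centre lies on the line through the pursuer and the evader, beyond the evader; the function name `apollonius_circle` and the numbers are hypothetical.

```python
import math

def apollonius_circle(x, z, alpha, beta):
    """Centre and radius of the Apollonius circle for pursuer x and evader z."""
    D = math.hypot(x[0] - z[0], x[1] - z[1])
    k = alpha ** 2 - beta ** 2
    d = beta ** 2 * D / k            # distance from the evader to the centre
    R = alpha * beta * D / k         # radius
    ux, uy = (z[0] - x[0]) / D, (z[1] - x[1]) / D   # direction from pursuer to evader
    return (z[0] + d * ux, z[1] + d * uy), R

x0, z0, alpha, beta = (0.0, 0.0), (3.0, 0.0), 2.0, 1.0
center, R = apollonius_circle(x0, z0, alpha, beta)

# Every point M of the circle should satisfy |EM| / beta = |PM| / alpha.
max_gap = 0.0
for step in range(12):
    phi = 2.0 * math.pi * step / 12
    M = (center[0] + R * math.cos(phi), center[1] + R * math.sin(phi))
    EM = math.hypot(M[0] - z0[0], M[1] - z0[1])
    PM = math.hypot(M[0] - x0[0], M[1] - x0[1])
    max_gap = max(max_gap, abs(EM / beta - PM / alpha))

# Apollonius point: the point of the circle farthest from the evader.
farthest = math.hypot(center[0] - z0[0], center[1] - z0[1]) + R
print(center, R, max_gap, farthest / beta)
```

For these data the time needed by the evader to reach the Apollonius point, `farthest / beta`, equals $\|x^0 - z^0\| / (\alpha - \beta)$.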

1.1. A cooperative form of the game $\Gamma(x_1^0,\ldots,x_m^0,z^0)$

Let us assume that the utilities of the players are transferable, i.e. the players are in such conditions that the total payoff earned by a coalition $S \subset N$ can be arbitrarily divided between the members of the coalition. It is interesting to consider all possible variants of cooperation between the players in this game. Suppose that each pursuer tries to come to an agreement with the evader and promises to divide their joint payoff between the two of them. It is important to find out whether such behavior is favorable for the players, namely, whether each player increases his own payoff by cooperating with the other players.

This game can be interpreted in the following way: imagine that the evader has something that each pursuer needs to have. It can be a kind of good or information.

Moreover, the information is supposed to disappear once any of the pursuers reaches the evader. Thus, each pursuer wants to get it before the others do. It is therefore of interest to consider all possible forms of cooperation between the players in this game, assuming the payoffs to be transferable. It would be helpful to know what the best way for the pursuers to "share" the evader is: to cooperate with each other, to try to win the evader over to their side, or to form the grand coalition of n players. For this purpose, with every game $\Gamma(x_1^0,\ldots,x_m^0,z^0)$ we associate the corresponding game in characteristic function form $\Gamma_v(x_1^0,\ldots,x_m^0,z^0)$.

Now we introduce a cooperative form of the game $\Gamma(x_1^0,\ldots,x_m^0,z^0)$.

Let $2^N$ be the set of all subsets of $N$. The function $v : 2^N \to \mathbb{R}^1$ with the following two properties

1. $v(\emptyset) = 0$,

2. $v(S \cup R) \ge v(S) + v(R)$, $S, R \subset N$, $S \cap R = \emptyset$,

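is called the characteristic function (c.f.). Condition 2 is the superadditivity property. For any coalition $S \subset N$ we define the characteristic function as the guaranteed payoff of $S$:

$$v(S) = \max_{u_S} \min_{u_{N\setminus S}} \sum_{i\in S} K_i(x_1^0,\ldots,x_m^0,z^0,u_S,u_{N\setminus S}),$$

where $u_S$ and $u_{N\setminus S}$ are vectors of admissible strategies of coalitions $S$ and $N \setminus S$, respectively.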
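The superadditivity condition can be verified mechanically for any finite game. The following Python sketch checks it by enumerating all pairs of disjoint coalitions; the helper names and the numerical game (a two-pursuer game with made-up coalition worths) are hypothetical and are not taken from the paper.

```python
from itertools import combinations

def coalitions(players):
    """All subsets of the player set, including the empty one."""
    for r in range(len(players) + 1):
        for c in combinations(players, r):
            yield frozenset(c)

def is_superadditive(v, players, tol=1e-12):
    """Check v(S | R) >= v(S) + v(R) for all disjoint S, R."""
    subsets = list(coalitions(players))
    for S in subsets:
        for R in subsets:
            if S & R:                      # skip overlapping coalitions
                continue
            if v[S | R] < v[S] + v[R] - tol:
                return False
    return True

players = ("P1", "P2", "E")
# Hypothetical worths of the coalitions (illustration only).
v = {
    frozenset(): 0.0,
    frozenset(["P1"]): -3.0, frozenset(["P2"]): -4.0, frozenset(["E"]): 2.0,
    frozenset(["P1", "P2"]): -4.0,
    frozenset(["P1", "E"]): 0.0, frozenset(["P2", "E"]): 0.0,
    frozenset(players): -1.0,
}
print(is_superadditive(v, players))

# Lowering the worth of the grand coalition breaks the property.
v_bad = dict(v)
v_bad[frozenset(players)] = -5.0
print(is_superadditive(v_bad, players))
```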

In [Tarashnina, 2002] the characteristic function of the game $\Gamma(x_1^0,x_2^0,z^0)$ is defined as

where t is the minimal total pursuit time.

In this case the goal of the players is to choose strategies that maximize the total payoff, that is, minimize the total pursuit time $t_E$. These strategies make all the players move to the point $\tilde z$, which is the intersection point of the Apollonius circles nearest to the initial position of $E$.

Definition 1. The trajectory $(x_1(\cdot),\ldots,x_m(\cdot),z(\cdot))$ of system (1)-(2) such that

$$K_N(x_1(\cdot),\ldots,x_m(\cdot),z(\cdot)) = \sum_{i\in N} K_i(x_1(\cdot),\ldots,x_m(\cdot),z(\cdot)) = v(N; x_1^0,\ldots,x_m^0,z^0)$$

is called a cooperative trajectory in the game $\Gamma(x_1^0,\ldots,x_m^0,z^0)$.

To sum up, the non-cooperative solution, which is a Nash equilibrium, prescribes that all the players move to the point $\bar z$, which is the farthest intersection point of the Apollonius circles, whereas according to the cooperative solution all the players should move to the point $\tilde z$.

In the case of the pursuit game with two pursuers considered above, we construct the characteristic function using the values of the corresponding zero-sum games. However, already for a game with three pursuers it is not possible to construct the characteristic function in this way, since the value of the zero-sum pursuit game with three pursuers is not known (finding a solution of such a game is a complicated and still unsolved problem).

Consider a game with m pursuers and one evader. We assume that all the Apollonius circles have a nonempty common intersection. If this is not so and the i-th circle contains the j-th one, then we regard player i as a "dummy". The "dummy" player cannot influence the process of pursuit; therefore, the game is reduced to a game with $m - 1$ pursuers.

Fig. 3: The game $\Gamma(x_1^0,\ldots,x_5^0,z^0)$

In the game $\Gamma(x_1^0,\ldots,x_m^0,z^0)$, as the worth of a coalition $S$ we take the guaranteed payoff of $S$ against $N\setminus S$.

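Denote

$$g_i^0 = \gamma \times \frac{\|x_i^0 - z^0\|}{\alpha_i - \beta}, \qquad g^0 = \gamma \times \frac{\|z^0 - \bar z\|}{\beta}, \qquad g_{i,j}^0 = \gamma \times \frac{\|z^0 - \bar z_{i,j}\|}{\beta},$$

$$g_{i,j,k}^0 = \gamma \times \frac{\|z^0 - \bar z_{i,j,k}\|}{\beta}, \qquad \tilde g_{i,j}^0 = \gamma \times \frac{\|z^0 - \tilde z_{i,j}\|}{\beta}, \qquad g^* = \gamma \times \frac{\|z^0 - \tilde z\|}{\beta}$$

and, using this approach, we construct the characteristic function for the game $\Gamma(x_1^0,\ldots,x_m^0,z^0)$ in the following form:

$$v(\{P_i\}, x_i^0, z^0) = -g_i^0, \quad i = 1,\ldots,m,$$

$$v(\{E\}, x_i^0, z^0) = g^0,$$

$$v(\{P_i,P_j\}, x_i^0, x_j^0, z^0) = -2 g_{i,j}^0,$$

$$v(\{P_i,E\}, x_i^0, z^0) = 0, \quad i = 1,\ldots,m,$$

$$v(\{P_i,P_j,P_k\}, x_i^0, x_j^0, x_k^0, z^0) = -3 g_{i,j,k}^0, \qquad (3)$$

$$v(\{P_i,P_j,E\}, x_i^0, x_j^0, z^0) = -\tilde g_{i,j}^0,$$

$$\ldots$$

$$v(\{P_1,\ldots,P_m\}, x_1^0,\ldots,x_m^0, z^0) = -m \times g^0,$$

$$v(\{P_1,\ldots,P_m,E\}, x_1^0,\ldots,x_m^0, z^0) = -g^*,$$

where $\bar z$ and $\tilde z$ are defined by the formulas

$$\bar z = \arg\max_{z\in A_1\cap\ldots\cap A_m} \|z^0 - z\|,$$

$$\bar z_{i,j} = \arg\max_{z\in A_i\cap A_j} \|z^0 - z\|, \quad i,j = 1,\ldots,m, \; i \ne j,$$

$$\bar z_{i,j,k} = \arg\max_{z\in A_i\cap A_j\cap A_k} \|z^0 - z\|, \quad i,j,k = 1,\ldots,m, \; i \ne j \ne k, \qquad (4)$$

$$\tilde z_{i,j} = \arg\min_{z\in A_i\cap A_j} \|z^0 - z\|, \quad i,j = 1,\ldots,m, \; i \ne j,$$

$$\tilde z = \arg\min_{z\in A_1\cap\ldots\cap A_m} \|z^0 - z\|.$$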

Definition 2. The pair $(N, v(S; x_1^0,\ldots,x_m^0,z^0))$, where $N = \{P_1,\ldots,P_m,E\}$ is the set of players and $v$ is the characteristic function defined by (3) and (4), is called a cooperative differential game in characteristic function form and is denoted by $\Gamma_v(x_1^0,\ldots,x_m^0,z^0)$.

Proposition 1. In the differential TU game of pursuit $\Gamma_v(x_1^0,\ldots,x_m^0,z^0)$ the constructed characteristic function (3) is superadditive for any initial positions and velocities of the players $E, P_1,\ldots,P_m$.

The superadditivity property can be verified by checking the corresponding inequalities directly.

Definition 3. The cooperative n-person game $(N,v)$ is equivalent to the game $(N,v')$ if there exist a positive number $k$ and $n$ arbitrary real numbers $c_i$ ($i \in N$) such that for any coalition $S \subset N$

$$v'(S) = k v(S) + \sum_{i\in S} c_i.$$

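In fact, by setting $k = 1$, $c_{P_i} = g_i(0) = g_i^0$, $c_E = 0$, we construct the game $\Gamma_{v'}(x_1^0,\ldots,x_m^0,z^0)$, which is equivalent to the game $\Gamma_v(x_1^0,\ldots,x_m^0,z^0)$. In this case the characteristic function $v'$ has the form

$$v'(\{P_i\}, x_i^0, z^0) = 0, \quad i = 1,\ldots,m,$$

$$v'(\{E\}, x_i^0, z^0) = g^0,$$

$$v'(\{P_i,P_j\}, x_i^0, x_j^0, z^0) = g_i^0 + g_j^0 - 2 g_{i,j}^0, \quad i,j = 1,\ldots,m, \; i \ne j,$$

$$v'(\{P_i,E\}, x_i^0, z^0) = g_i^0, \quad i = 1,\ldots,m,$$

$$v'(\{P_i,P_j,P_k\}, x_i^0, x_j^0, x_k^0, z^0) = g_i^0 + g_j^0 + g_k^0 - 3 g_{i,j,k}^0, \qquad (5)$$

$$v'(\{P_i,P_j,E\}, x_i^0, x_j^0, z^0) = g_i^0 + g_j^0 - \tilde g_{i,j}^0,$$

$$\ldots$$

$$v'(\{P_1,\ldots,P_m\}, x_1^0,\ldots,x_m^0, z^0) = g_1^0 + \ldots + g_m^0 - m \times g^0,$$

$$v'(\{P_1,\ldots,P_m,E\}, x_1^0,\ldots,x_m^0, z^0) = g_1^0 + \ldots + g_m^0 - g^*.$$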

We shall examine this game by using the dominance relation. Recall that the imputation $\xi$ dominates the imputation $\eta$ with respect to the coalition $S$ ($\xi \succ_S \eta$) if the following conditions hold

$$\xi_i > \eta_i, \quad i \in S,$$

$$\xi(S) = \sum_{i\in S} \xi_i \le v(S).$$

The following theorem is needed for the sequel.

Theorem 1. Suppose $(N,v)$ and $(N,v')$ are two equivalent games. Then the map $\xi \to \xi'$, where

$$\xi'_i = k\xi_i + c_i, \quad i \in N,$$

establishes a one-to-one mapping between the imputation set of the game $(N,v)$ and the imputation set of the game $(N,v')$ such that $\xi \succ_S \eta$ implies $\xi' \succ_S \eta'$.
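The passage to an equivalent game is a purely mechanical transformation, which the following Python sketch illustrates. The helper names are hypothetical, and the two-pursuer game with worths built from made-up values $g_1^0 = 3$, $g_2^0 = 4$, $g^0 = 2$, $g^* = 1$ is an assumption used only for illustration.

```python
def equivalent_game(v, k, c):
    """Return v' with v'(S) = k * v(S) + sum of c[i] over i in S."""
    return {S: k * worth + sum(c[i] for i in S) for S, worth in v.items()}

def map_imputation(xi, k, c):
    """Image of an imputation under the map xi_i -> k * xi_i + c_i."""
    return {i: k * xi[i] + c[i] for i in xi}

g1, g2, g0, g_star = 3.0, 4.0, 2.0, 1.0      # hypothetical values
N = frozenset(["P1", "P2", "E"])
v = {
    frozenset(["P1"]): -g1, frozenset(["P2"]): -g2, frozenset(["E"]): g0,
    frozenset(["P1", "P2"]): -2 * g0,
    frozenset(["P1", "E"]): 0.0, frozenset(["P2", "E"]): 0.0,
    N: -g_star,
}
c = {"P1": g1, "P2": g2, "E": 0.0}           # shifts chosen as in the text, k = 1
v_prime = equivalent_game(v, 1.0, c)

xi = {"P1": -2.0, "P2": -1.0, "E": 2.0}      # an imputation of the original game
xi_prime = map_imputation(xi, 1.0, c)
print(v_prime[frozenset(["P1"])], v_prime[N], xi_prime)
```

With this choice of shifts every single pursuer has zero worth in the transformed game, and the image of an imputation again distributes exactly the worth of the grand coalition.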

1.3. Existence of the core in the cooperative pursuit game $\Gamma_v(x_1^0,\ldots,x_m^0,z^0)$

It follows from the superadditivity condition that it is advantageous for the players to form the grand coalition $N$ and obtain the maximal total payoff $v(N; x_1^0,\ldots,x_m^0,z^0)$ that is possible in the game $\Gamma_v(x_1^0,\ldots,x_m^0,z^0)$. Various methods for "equitable" distribution of the total profit between the players are considered as optimality principles. The set of distributions that satisfies an optimality principle is called a solution of the cooperative game (in the sense of this optimality principle).

Let us describe the imputation set in the game $\Gamma_v(x_1^0,\ldots,x_m^0,z^0)$. Denote by $\xi = (\xi_{P_1},\ldots,\xi_{P_m},\xi_E)$ an imputation. The imputation set is defined as follows

$$E_v(x_1^0,\ldots,x_m^0,z^0) = \Big\{\xi : \xi_{P_i} \ge v(P_i), \; i = 1,\ldots,m, \; \xi_E \ge v(E), \; \sum_{i\in N}\xi_i = v(N)\Big\}$$

or

$$E_v(x_1^0,\ldots,x_m^0,z^0) = \Big\{\xi : \xi_{P_i} \ge -g_i^0, \; i = 1,\ldots,m, \; \xi_E \ge g^0, \; \sum_{i\in N}\xi_i = -g^*\Big\}. \qquad (6)$$

As an optimality principle in $\Gamma_v(x_1^0,\ldots,x_m^0,z^0)$ we take the core. An analytical description of the core is provided by the following theorem, which was independently proved by [Bondareva, 1963] and [Shapley, 1967].

Theorem 2. For the imputation $\xi = (\xi_1,\xi_2,\ldots,\xi_n)$ to belong to the core it is necessary and sufficient that the inequality

$$\xi(S) = \sum_{i\in S}\xi_i \ge v(S)$$

holds for all $S \subset N$.

By Theorem 2, for the imputation $\xi$ to belong to the core it is necessary and sufficient that the following system of inequalities holds

$$\xi_{P_i} + \xi_{P_j} \ge -2 \cdot g_{i,j}^0,$$

$$\xi_{P_k} + \xi_E \ge 0, \quad k = 1,\ldots,m,$$

$$\xi_{P_i} + \xi_{P_j} + \xi_{P_k} \ge -3 \cdot g_{i,j,k}^0,$$

$$\xi_{P_i} + \xi_{P_j} + \xi_E \ge -\tilde g_{i,j}^0, \qquad (7)$$

$$\ldots$$

$$\xi_{P_1} + \ldots + \xi_{P_m} \ge -m \cdot g^0,$$

$$\underbrace{\xi_{P_i} + \ldots + \xi_{P_k}}_{m-1} + \xi_E \ge -\tilde g_{i,\ldots,k}^0, \quad i,\ldots,k = 1,\ldots,m, \; i \ne \ldots \ne k.$$
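For a small game the core inequalities can be checked by direct enumeration of coalitions. The Python sketch below does this for a two-pursuer game; the helper name `violated_coalitions` and the numerical values $g_1^0 = 3$, $g_2^0 = 4$, $g^0 = 2$, $g^* = 1$ are hypothetical and serve only as an illustration of the system of inequalities.

```python
from itertools import combinations

def violated_coalitions(xi, v, players, tol=1e-9):
    """List the coalitions S with xi(S) < v(S); also flag an inefficient split."""
    bad = []
    for r in range(1, len(players)):
        for S in combinations(players, r):
            if sum(xi[p] for p in S) < v[frozenset(S)] - tol:
                bad.append(S)
    if abs(sum(xi[p] for p in players) - v[frozenset(players)]) > tol:
        bad.append(tuple(players))          # total differs from v(N)
    return bad

players = ("P1", "P2", "E")
g1, g2, g0, g_star = 3.0, 4.0, 2.0, 1.0     # hypothetical values
v = {
    frozenset(["P1"]): -g1, frozenset(["P2"]): -g2, frozenset(["E"]): g0,
    frozenset(["P1", "P2"]): -2 * g0,
    frozenset(["P1", "E"]): 0.0, frozenset(["P2", "E"]): 0.0,
    frozenset(players): -g_star,
}

inside = {"P1": -g0, "P2": -g_star, "E": g0}    # candidate (-g0, -g*, g0)
outside = {"P1": -3.0, "P2": -4.0, "E": 6.0}    # pursuers get too little jointly
print(violated_coalitions(inside, v, players))
print(violated_coalitions(outside, v, players))
```

An empty list means that the vector belongs to the core; a non-empty list names the coalitions that would object.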

In [Tarashnina, 2002] it is proved that in the cooperative game with two pursuers and one evader the core is nonempty for any initial positions of the players. In this paper we show that this is also true in the general case of an n-person game. The following proposition holds.

Proposition 2. In the differential TU game of pursuit $\Gamma_v(x_1^0,\ldots,x_m^0,z^0)$ there exists the non-empty core for any initial positions of the players.

Proof. Summing the inequalities (7) and multiplying the result by $(-1)$, we obtain

$$-(C_m^1 + \ldots + C_m^{m-1}) \times (\xi_{P_1} + \ldots + \xi_{P_m} + \xi_E) \le 2\underbrace{(g_{1,2}^0 + g_{1,3}^0 + \ldots)}_{C_m^2} + 3\underbrace{(g_{1,2,3}^0 + g_{1,2,4}^0 + \ldots)}_{C_m^3} + \underbrace{(\tilde g_{1,2}^0 + \tilde g_{1,3}^0 + \ldots)}_{C_m^2} +$$

$$+ 4\underbrace{(g_{1,2,3,4}^0 + g_{1,2,3,5}^0 + \ldots)}_{C_m^4} + 2\underbrace{(\tilde g_{1,2,3}^0 + \tilde g_{1,2,4}^0 + \ldots)}_{C_m^3} + \ldots + m \times g^0 + (m-2) \times \underbrace{(\tilde g_{1,2,\ldots,m-1}^0 + \tilde g_{1,2,\ldots,m-2,m}^0 + \ldots)}_{C_m^{m-1}}. \qquad (8)$$

Let us simplify the left-hand side of the inequality. It is known that

$$C_m^0 + C_m^1 + C_m^2 + \ldots + C_m^{m-1} + C_m^m = 2^m,$$

then

$$C_m^1 + C_m^2 + \ldots + C_m^{m-1} = 2^m - 2.$$

By construction, we have $v(N) = \xi_{P_1} + \ldots + \xi_{P_m} + \xi_E = -g^*$. Compare $v(N)$ and the worths of the other coalitions. It is obvious that the following relations hold

$$g^* = \frac{\gamma}{\beta}\min_{z\in A_1\cap\ldots\cap A_m}\|z^0 - z\| \le \ldots \le \frac{\gamma}{\beta}\min_{z\in A_i\cap A_j}\|z^0 - z\|,$$

$$g^0 = \frac{\gamma}{\beta}\max_{z\in A_1\cap\ldots\cap A_m}\|z^0 - z\| \le \ldots \le \frac{\gamma}{\beta}\max_{z\in A_i\cap A_j}\|z^0 - z\|.$$

Therefore, $g^* \le g^0$. This means that it is possible to replace the right-hand side of inequality (8) by the following estimate:

$$(2C_m^2 + 3C_m^3 + C_m^2 + 4C_m^4 + 2C_m^3 + \ldots + m \times C_m^m + (m-2) \times C_m^{m-1}) \times g^*.$$

Hence,

$$(2^m - 2) \cdot g^* \le (2C_m^2 + 3C_m^3 + C_m^2 + 4C_m^4 + 2C_m^3 + \ldots + m \times C_m^m + (m-2) \times C_m^{m-1}) \times g^*,$$

and

$$2^m - 2 \le 2C_m^2 + 3C_m^3 + C_m^2 + 4C_m^4 + 2C_m^3 + \ldots + m \times C_m^m + (m-2) \times C_m^{m-1}. \qquad (9)$$

Now it remains to prove that inequality (9) holds. One can verify that

$$C_m^1 + 2C_m^2 + 3C_m^3 + \ldots + m \times C_m^m = m \times 2^{m-1}.$$

Then

$$2C_m^2 + 3C_m^3 + \ldots + m \times C_m^m = m \times (2^{m-1} - 1).$$

It is obvious that

$$2 \times (2^{m-1} - 1) \le m \times (2^{m-1} - 1).$$

Consequently, inequality (9) is fulfilled. This implies that inequality (8) is satisfied for any initial positions of the players. Hence, system (7), which describes the core, is consistent. This means that the core of the cooperative game described above is non-empty.

It can be easily checked that the vector $\eta^0 = (-g^0, -g^0, \ldots, -g^0, -g^*, (m-1) \times g^0)$ is an imputation from the core. This completes the proof.
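The combinatorial identities used in the proof are easy to confirm numerically. The short Python sketch below checks them for a range of m; the function name `check_identities` is hypothetical, and `math.comb` requires Python 3.8 or later.

```python
from math import comb

def check_identities(m):
    """Check the binomial identities of the proof for a given m >= 2."""
    # C_m^0 + ... + C_m^m = 2^m
    assert sum(comb(m, k) for k in range(m + 1)) == 2 ** m
    # C_m^1 + 2 C_m^2 + ... + m C_m^m = m * 2^(m-1)
    assert sum(k * comb(m, k) for k in range(1, m + 1)) == m * 2 ** (m - 1)
    # 2 C_m^2 + 3 C_m^3 + ... + m C_m^m = m * (2^(m-1) - 1)
    tail = sum(k * comb(m, k) for k in range(2, m + 1))
    assert tail == m * (2 ** (m - 1) - 1)
    # The final comparison: 2^m - 2 <= m * (2^(m-1) - 1)
    return 2 ** m - 2 <= tail

results = [check_identities(m) for m in range(2, 21)]
print(all(results))
```

For m = 2 the final comparison holds with equality, and for larger m it is strict.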

2. Time-consistency of the core

We focus our attention on time-consistency of the core $C_v(x_1^0,\ldots,x_m^0,z^0)$ in the game $\Gamma_v(x_1^0,\ldots,x_m^0,z^0)$.

Let an optimality principle be chosen in the game $\Gamma_v(x_1^0,\ldots,x_m^0,z^0)$, and let it be the core. The solution of this game constructed at the initial moment $t = 0$ is $C_v(x_1^0,\ldots,x_m^0,z^0)$. It follows from Proposition 2 that $C_v(x_1^0,\ldots,x_m^0,z^0) \ne \emptyset$. Recall that $(x_1(\cdot),\ldots,x_m(\cdot),z(\cdot))$ is the cooperative trajectory in the game $\Gamma_v(x_1^0,\ldots,x_m^0,z^0)$.

We study the behavior of the set $C_v(x_1^0,\ldots,x_m^0,z^0)$ along the cooperative optimal trajectory $(x_1(\cdot),\ldots,x_m(\cdot),z(\cdot))$.

To this end we introduce the notion of a current subgame. At each current state $(x_1(t),\ldots,x_m(t),z(t))$ a current subgame $\Gamma_v(x_1(t),\ldots,x_m(t),z(t))$ is defined like the game $\Gamma_v(x_1^0,\ldots,x_m^0,z^0)$ with the only difference that it starts at the current state lying on the cooperative trajectory and has duration $(t_{\tilde z} - t)$. In the subgame $\Gamma_v(x_1(t),\ldots,x_m(t),z(t))$ we define the characteristic function as it was done for the original game:

$$v(\{P_i\}, x_i(t), z(t)) = -\gamma \times \frac{\|x_i(t) - z(t)\|}{\alpha_i - \beta}, \quad i = 1,\ldots,m,$$

$$v(\{E\}, x_i(t), z(t)) = \gamma \times \frac{\|z(t) - \bar z(t)\|}{\beta},$$

$$v(\{P_i,P_j\}, x_i(t), x_j(t), z(t)) = -2\gamma \times \frac{\|z(t) - \bar z_{i,j}(t)\|}{\beta},$$

$$v(\{P_i,E\}, x_i(t), z(t)) = 0, \quad i = 1,\ldots,m,$$

$$v(\{P_i,P_j,P_k\}, x_i(t), x_j(t), x_k(t), z(t)) = -3\gamma \times \frac{\|z(t) - \bar z_{i,j,k}(t)\|}{\beta},$$

$$v(\{P_i,P_j,E\}, x_i(t), x_j(t), z(t)) = -\gamma \times \frac{\|z(t) - \tilde z_{i,j}(t)\|}{\beta},$$

$$\ldots$$

$$v(\{P_1,\ldots,P_m\}, x_1(t),\ldots,x_m(t), z(t)) = -m \times \gamma \times \frac{\|z(t) - \bar z(t)\|}{\beta},$$

$$v(\{P_1,\ldots,P_m,E\}, x_1(t),\ldots,x_m(t), z(t)) = -\gamma \times (t_{\tilde z} - t),$$

where

$$\bar z(t) = \arg\max_{z\in A_1(t)\cap\ldots\cap A_m(t)} \|z(t) - z\|,$$

$$\bar z_{i,j}(t) = \arg\max_{z\in A_i(t)\cap A_j(t)} \|z(t) - z\|, \quad i,j = 1,\ldots,m, \; i \ne j,$$

$$\bar z_{i,j,k}(t) = \arg\max_{z\in A_i(t)\cap A_j(t)\cap A_k(t)} \|z(t) - z\|, \quad i,j,k = 1,\ldots,m, \; i \ne j \ne k,$$

$$\tilde z_{i,j}(t) = \arg\min_{z\in A_i(t)\cap A_j(t)} \|z(t) - z\|,$$

$$\tilde z(t) = \arg\min_{z\in A_1(t)\cap\ldots\cap A_m(t)} \|z(t) - z\|,$$

and $A_i(t)$ are the Apollonius circles in the games $\Gamma_{P_i,E}(x_i(t), z(t))$.

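Let us consider the functions

$$g_i(t) = \gamma \frac{\|x_i(t) - z(t)\|}{\alpha_i - \beta}, \quad i = 1,\ldots,m; \qquad g(t) = \gamma \frac{\|z(t) - \bar z(t)\|}{\beta}.$$

These functions are continuous and monotonically decreasing in $t$ on the interval $[0, t_{\tilde z}]$.

Remark 1. From the definition of the Apollonius circle it follows that

$$g_i(t) = \gamma \left(1 - \frac{t}{t_{\tilde z}}\right) \frac{\|x_i^0 - z^0\|}{\alpha_i - \beta}, \quad i = 1,\ldots,m.$$

Proposition 3 (proved by V. Reshetilova in her diploma thesis). The function $g(t)$ is linear with respect to $t$, i.e.

$$g(t) = \gamma \left(1 - \frac{t}{t_{\tilde z}}\right) \frac{\|z^0 - \bar z\|}{\beta}.$$

The characteristic function of the game $\Gamma_v(x_1(t),\ldots,x_m(t),z(t))$ has the form

$$v(\{P_i\}, x_i(t), z(t)) = -g_i(t), \quad i = 1,\ldots,m,$$

$$v(\{E\}, x_i(t), z(t)) = g(t),$$

$$v(\{P_i,P_j\}, x_i(t), x_j(t), z(t)) = -2 g_{i,j}(t),$$

$$v(\{P_i,E\}, x_i(t), z(t)) = 0, \quad i = 1,\ldots,m,$$

$$v(\{P_i,P_j,P_k\}, x_i(t), x_j(t), x_k(t), z(t)) = -3 g_{i,j,k}(t),$$

$$v(\{P_i,P_j,E\}, x_i(t), x_j(t), z(t)) = -\tilde g_{i,j}(t),$$

$$\ldots$$

$$v(\{P_1,\ldots,P_m\}, x_1(t),\ldots,x_m(t), z(t)) = -m \times g(t),$$

$$v(\{P_1,\ldots,P_m,E\}, x_1(t),\ldots,x_m(t), z(t)) = -g^*(t).$$

The imputation set in the game $\Gamma_v(x_1(t),\ldots,x_m(t),z(t))$ is of the form

$$E_v(x_1(t),\ldots,x_m(t),z(t)) = \Big\{\xi^t : \xi^t_{P_i} \ge -g_i(t), \; i = 1,\ldots,m, \; \xi^t_E \ge g(t); \; \sum_{i\in N}\xi^t_i = -g^*(t)\Big\}.$$

The core of the current game is defined as follows

$$C_v(x_1(t),\ldots,x_m(t),z(t)) = \{\xi^t : \xi^t \in E_v(x_1(t),\ldots,x_m(t),z(t)), \; \xi^t \text{ satisfies system (10)}\},$$

where system (10) is

$$\xi^t_{P_i} + \xi^t_{P_j} \ge -2 \times g_{i,j}(t), \quad i,j = 1,\ldots,m, \; i \ne j,$$

$$\xi^t_{P_k} + \xi^t_E \ge 0, \quad k = 1,\ldots,m,$$

$$\xi^t_{P_i} + \xi^t_{P_j} + \xi^t_E \ge -\tilde g_{i,j}(t), \quad i,j = 1,\ldots,m, \; i \ne j, \qquad (10)$$

$$\ldots$$

$$\xi^t_{P_1} + \xi^t_{P_2} + \ldots + \xi^t_{P_m} \ge -m \times g(t),$$

$$\underbrace{\xi^t_{P_i} + \xi^t_{P_j} + \ldots + \xi^t_{P_k}}_{m-1} + \xi^t_E \ge -\tilde g_{i,j,\ldots,k}(t), \quad i,j,\ldots,k = 1,\ldots,m, \; i \ne j \ne \ldots \ne k.$$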

Suppose $C_v(x_1(t),\ldots,x_m(t),z(t)) \ne \emptyset$ along the cooperative trajectory for any $t \in [0, t_{\tilde z}]$. If this condition is not satisfied, then it is impossible for the players to adhere to the chosen optimality principle, since at the very first instant $t$ when $C_v(x_1(t),\ldots,x_m(t),z(t)) = \emptyset$ the players have no possibility to follow this principle. As we assumed above, at the initial state $(x_1^0,\ldots,x_m^0,z^0)$ the players agree upon the imputation $\xi^0 \in C_v(x_1^0,\ldots,x_m^0,z^0)$ such that the share of the i-th player is equal to $\xi_i^0$. Let the payoff of player i (his share) on the time interval $[0, t]$ be $\xi_i(x_1(t),\ldots,x_m(t),z(t))$. Then on the remaining time interval $[t, t_{\tilde z}]$, according to the imputation $\xi^0$, he is to receive the gain

$$\xi_i^t = \xi_i^0 - \xi_i(x_1(t),\ldots,x_m(t),z(t)).$$

For the original agreement (the imputation $\xi^0$) to remain in force at the instant $t$ it is essential that the vector $\xi^t = (\xi^t_{P_1},\ldots,\xi^t_{P_m},\xi^t_E)$ belongs to the set $C_v(x_1(t),\ldots,x_m(t),z(t))$. If this condition is satisfied at each instant $t \in [0, t_{\tilde z}]$, then the imputation $\xi^0$ is realized. This is the conceptual meaning of time-consistency. Following Petrosjan, the definition of time-consistency of an imputation in the game $\Gamma_v(x_1^0,\ldots,x_m^0,z^0)$ has the following form.

Definition 4. The imputation $\xi^0 \in C_v(x_1^0,\ldots,x_m^0,z^0)$ is called time-consistent in the time-optimal game $\Gamma_v(x_1^0,\ldots,x_m^0,z^0)$ if the following conditions are satisfied:

1. $C_v(x_1(t),\ldots,x_m(t),z(t)) \ne \emptyset$ along the cooperative optimal trajectory $(x_1(t),\ldots,x_m(t),z(t))$ at each instant $t$, $0 \le t \le t_{\tilde z}$;

2. there exists an integrable function $\tau(t) = (\tau_{P_1}(t),\ldots,\tau_{P_m}(t),\tau_E(t))$ on $[0, t_{\tilde z}]$ such that $\tau_i(t) \ge 0$ ($i \in N$) for each $t \in [0, t_{\tilde z}]$ and

$$\xi^0 - \xi(x_1(t),\ldots,x_m(t),z(t)) \in C_v(x_1(t),\ldots,x_m(t),z(t)), \qquad (11)$$

where

$$\xi(x_1(t),\ldots,x_m(t),z(t)) = \big(\xi_{P_1}(x_1(t),\ldots,x_m(t),z(t)), \ldots, \xi_{P_m}(x_1(t),\ldots,x_m(t),z(t)), \xi_E(x_1(t),\ldots,x_m(t),z(t))\big)$$

and

$$\xi_i(x_1(t),\ldots,x_m(t),z(t)) = \int_0^t \tau_i(y)\,dy, \quad i \in N.$$

Remark 2. The cooperative differential game $\Gamma_v(x_1^0,\ldots,x_m^0,z^0)$ has a time-consistent solution $C_v(x_1^0,\ldots,x_m^0,z^0)$ if each imputation $\xi^0 \in C_v(x_1^0,\ldots,x_m^0,z^0)$ is time-consistent.

The following theorem holds.

Theorem 3. In the cooperative differential time-optimal pursuit game $\Gamma_v(x_1^0,\ldots,x_m^0,z^0)$ there exists the non-empty core $C_v(x_1^0,\ldots,x_m^0,z^0)$ that is time-consistent.

Proof. Consider the family of the current subgames

$$\{\Gamma_v(x_1(t),\ldots,x_m(t),z(t)), \; 0 \le t \le t_{\tilde z}\}.$$

Now our aim is to show that $C_v(x_1(t),\ldots,x_m(t),z(t)) \ne \emptyset$ for each $t \in [0, t_{\tilde z}]$. Summing the inequalities (10) and multiplying by $(-1)$, we obtain

$$-(C_m^1 + \ldots + C_m^{m-1}) \times (\xi_{P_1} + \xi_{P_2} + \ldots + \xi_{P_m} + \xi_E) \le 2\underbrace{(g_{1,2}(t) + g_{1,3}(t) + \ldots)}_{C_m^2} + 3\underbrace{(g_{1,2,3}(t) + g_{1,2,4}(t) + \ldots)}_{C_m^3} + \underbrace{(\tilde g_{1,2}(t) + \tilde g_{1,3}(t) + \ldots)}_{C_m^2} +$$

$$+ 4\underbrace{(g_{1,2,3,4}(t) + g_{1,2,3,5}(t) + \ldots)}_{C_m^4} + 2\underbrace{(\tilde g_{1,2,3}(t) + \tilde g_{1,2,4}(t) + \ldots)}_{C_m^3} + \ldots + m \times g(t) + (m-2) \times \underbrace{(\tilde g_{1,2,\ldots,m-1}(t) + \tilde g_{1,2,\ldots,m-2,m}(t) + \ldots)}_{C_m^{m-1}}. \qquad (12)$$

Note that

$$g(t) = \frac{\gamma}{\beta}\|z(t) - \bar z(t)\| = \frac{\gamma}{\beta}\max_{z\in A_1(t)\cap\ldots\cap A_m(t)}\|z(t) - z\| \le \ldots \le g_{i,j,k}(t) = \frac{\gamma}{\beta}\max_{z\in A_i(t)\cap A_j(t)\cap A_k(t)}\|z(t) - z\| \le g_{i,j}(t) = \frac{\gamma}{\beta}\max_{z\in A_i(t)\cap A_j(t)}\|z(t) - z\|$$

and

$$g^*(t) = \frac{\gamma}{\beta}\|z(t) - \tilde z(t)\| = \frac{\gamma}{\beta}\min_{z\in A_1(t)\cap\ldots\cap A_m(t)}\|z(t) - z\| \le \ldots \le \tilde g_{i,j,k}(t) = \frac{\gamma}{\beta}\min_{z\in A_i(t)\cap A_j(t)\cap A_k(t)}\|z(t) - z\| \le \tilde g_{i,j}(t) = \frac{\gamma}{\beta}\min_{z\in A_i(t)\cap A_j(t)}\|z(t) - z\|,$$

where $A_i(t) \subset A_i$ is the Apollonius circle in the game $\Gamma_{P_i,E}(x_i(t), z(t))$, $i = 1,\ldots,m$. Obviously, $g^*(t) \le g(t)$ for all $t \in [0, t_{\tilde z}]$. Therefore, inequality (12) can be written in the following form

$$(2^m - 2) \times g^*(t) \le (2C_m^2 + 3C_m^3 + C_m^2 + 4C_m^4 + 2C_m^3 + \ldots + m \times C_m^m + (m-2) \times C_m^{m-1}) \times g^*(t).$$

Then inequality (12) holds for all $t \in [0, t_{\tilde z}]$. (The proof of this fact is similar to the proof of Proposition 2.) It is clear that there exists an imputation

$$\xi^t = (-g(t), -g(t), \ldots, -g^*(t), (m-1) \times g(t)),$$

which belongs to $E_v(x_1(t),\ldots,x_m(t),z(t))$ and satisfies system (10). So, $C_v(x_1(t),\ldots,x_m(t),z(t)) \ne \emptyset$ for all $t \in [0, t_{\tilde z}]$.

Now it remains to check condition 2 in Definition 4. According to Remark 2, we must prove that condition (11) holds for all imputations from the core. For convenience we pass to the equivalent game $\Gamma_{v'}(x_1^0,\ldots,x_m^0,z^0)$ (see system (5)). The core in this game can have various forms depending on the initial positions of the players. In order to prove time-consistency of the core it is sufficient to consider the case of the widest core. This core $C_{v'}(x_1^0,\ldots,x_m^0,z^0)$ is the convex hull of the imputations

$$\eta^1 = (g_1^0 - g^0, g_2^0 - g^0, \ldots, g_{m-1}^0 - g^0, g_m^0 - g^*, (m-1)g^0),$$

$$\eta^2 = (g_1^0 - g^0, g_2^0 - g^0, \ldots, g_{m-1}^0 - g^*, g_m^0 - g^0, (m-1)g^0),$$

$$\ldots$$

$$\eta^{m-1} = (g_1^0 - g^0, g_2^0 - g^*, \ldots, g_{m-1}^0 - g^0, g_m^0 - g^0, (m-1)g^0),$$

$$\eta^m = (g_1^0 - g^*, g_2^0 - g^0, \ldots, g_{m-1}^0 - g^0, g_m^0 - g^0, (m-1)g^0),$$

$$\eta^{m+1} = (0, g_2^0 - g^*, \ldots, g_{m-1}^0, g_m^0, g_1^0),$$

$$\eta^{m+2} = (g_1^0 - g^*, 0, \ldots, g_{m-1}^0, g_m^0, g_2^0),$$

$$\ldots$$

$$\eta^{2m-1} = (g_1^0, g_2^0, \ldots, 0, g_m^0 - g^*, g_{m-1}^0),$$

$$\eta^{2m} = (g_1^0, g_2^0, \ldots, g_{m-1}^0 - g^*, 0, g_m^0), \qquad (13)$$

$$\eta^{2m+1} = (0, g_1^0 + g_2^0 - 2g^0, \ldots, g_{m-1}^0 - g^0, g_m^0 - g^0, mg^0 - g^*),$$

$$\ldots$$

$$(0, g_2^0 - g^0, \ldots, g_1^0 + g_{m-1}^0 - 2g^0, g_m^0 - g^0, mg^0 - g^*),$$

$$(0, g_2^0 - g^0, \ldots, g_{m-1}^0 - g^0, g_1^0 + g_m^0 - 2g^0, mg^0 - g^*),$$

$$(g_1^0 + g_2^0 - 2g^0, 0, \ldots, g_{m-1}^0 - g^0, g_m^0 - g^0, mg^0 - g^*),$$

$$\ldots$$

$$\eta^{p-1} = (g_1^0 - g^0, g_2^0 - g^0, \ldots, 0, g_{m-1}^0 + g_m^0 - 2g^0, mg^0 - g^*),$$

$$\eta^p = (g_1^0 - g^0, g_2^0 - g^0, \ldots, g_{m-1}^0 + g_m^0 - 2g^0, 0, mg^0 - g^*),$$

where $p = 2m + 2 \times C_m^2$.

Hence, any imputation $\eta^0 \in C_{v'}(x_1^0,\ldots,x_m^0,z^0)$ can be represented as

$$\eta^0 = \sum_{j=1}^p \lambda_j \eta^j, \qquad \sum_{j=1}^p \lambda_j = 1, \qquad \lambda_j \ge 0 \text{ for } 1 \le j \le p. \qquad (14)$$

By substituting (13) into (14) and denoting by $s_1, s_2, \ldots, s_{m+1}, s_{m+2}$ the vectors of coefficients, built from $\lambda_1,\ldots,\lambda_p$, that multiply $g_1^0, g_2^0, \ldots, g_m^0$, $g^0$ and $g^*$, respectively, we have $\eta^0 = g_1^0 s_1 + g_2^0 s_2 + \ldots + g_m^0 s_m + g^0 s_{m+1} - g^* s_{m+2}$. The main idea is to prove that $\eta^0$ is time-consistent, namely, to find an integrable vector function $\tau(t)$ on $[0, t_{\tilde z}]$ such that $\tau_i(t) \ge 0$, $i \in N$, and condition (11) holds.

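Indeed, at the last moment $t = t_{\tilde z}$ the core $C_{v'}(x_1(t_{\tilde z}),\ldots,x_m(t_{\tilde z}),z(t_{\tilde z})) = \{0\}$ as a solution of the current game $\Gamma_{v'}(x_1(t_{\tilde z}),\ldots,x_m(t_{\tilde z}),z(t_{\tilde z}))$ with integral payoffs and zero duration. Thus, from condition (11) it follows that

$$\eta(x_1(t_{\tilde z}),\ldots,x_m(t_{\tilde z}),z(t_{\tilde z})) = \int_0^{t_{\tilde z}} \tau(y)\,dy = \eta^0. \qquad (15)$$

On account of (15), and recalling that $g^* = \gamma t_{\tilde z}$, we can put $\tau(y) = \frac{g_1^0}{t_{\tilde z}} s_1 + \frac{g_2^0}{t_{\tilde z}} s_2 + \ldots + \frac{g^0}{t_{\tilde z}} s_{m+1} - \gamma s_{m+2}$. Finally, according to (15), we have

$$\int_0^{t_{\tilde z}} \tau(y)\,dy = \int_0^{t_{\tilde z}} \left[\frac{g_1^0}{t_{\tilde z}} s_1 + \frac{g_2^0}{t_{\tilde z}} s_2 + \ldots + \frac{g^0}{t_{\tilde z}} s_{m+1} - \gamma s_{m+2}\right] dy = g_1^0 s_1 + g_2^0 s_2 + \ldots + g^0 s_{m+1} - g^* s_{m+2} = \eta^0.$$

Now the aim is to show that the imputation $\eta^0 - \eta(x_1(t),\ldots,x_m(t),z(t))$ belongs to the core of the current game $\Gamma_{v'}(x_1(t),\ldots,x_m(t),z(t))$ for any $t \in [0, t_{\tilde z}]$. Substituting the vector function $\tau(t)$ into (11) and taking into account Proposition 3 and Remark 1, we obtain

$$\eta^0 - \eta(x_1(t),\ldots,x_m(t),z(t)) = \eta^0 - \int_0^t \tau(y)\,dy = \left[g_1^0 s_1 + g_2^0 s_2 + \ldots + g^0 s_{m+1} - g^* s_{m+2}\right]\left(1 - \frac{t}{t_{\tilde z}}\right) = \eta^t$$

for all $t \in [0, t_{\tilde z}]$.

It is not hard to prove that $\eta^t$ belongs to the core $C_{v'}(x_1(t),\ldots,x_m(t),z(t))$ of the game $\Gamma_{v'}(x_1(t),\ldots,x_m(t),z(t))$.

So, condition 2 of Definition 4 holds for all $t \in [0, t_{\tilde z}]$ and for all $\eta^0 \in C_{v'}(x_1^0,\ldots,x_m^0,z^0)$. Since $\Gamma_v(x_1^0,\ldots,x_m^0,z^0)$ and $\Gamma_{v'}(x_1^0,\ldots,x_m^0,z^0)$ are equivalent, we have $\eta_i^0 = \xi_i^0 + c_i$, $i = 1,\ldots,m+1$, with $c_{P_i} = g_i^0$ and $c_E = 0$. Therefore, all our conclusions are also true for the game $\Gamma_v(x_1^0,\ldots,x_m^0,z^0)$. This completes the proof of the theorem.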
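The mechanism of the proof can be illustrated numerically for two pursuers. The Python sketch below is not the general argument: it assumes that every coalition worth of the equivalent game shrinks by the factor $(1 - t/t_{\tilde z})$ along the cooperative trajectory (as Remark 1 and Proposition 3 state for $g_i(t)$ and $g(t)$), takes made-up values $g_1^0 = 3$, $g_2^0 = 4$, $g^0 = 2$, $g^* = 1$, $\gamma = 1$, and uses a constant payment rate; all names are hypothetical.

```python
from itertools import combinations

def in_core(x, v, players, tol=1e-9):
    """Efficiency plus x(S) >= v(S) for every proper coalition S."""
    if abs(sum(x[p] for p in players) - v[frozenset(players)]) > tol:
        return False
    for r in range(1, len(players)):
        for S in combinations(players, r):
            if sum(x[p] for p in S) < v[frozenset(S)] - tol:
                return False
    return True

players = ("P1", "P2", "E")
g1, g2, g0, g_star, gamma = 3.0, 4.0, 2.0, 1.0, 1.0   # hypothetical values
T = g_star / gamma                                    # cooperative capture time

# Equivalent game v' at the initial state (two pursuers).
v0 = {
    frozenset(["P1"]): 0.0, frozenset(["P2"]): 0.0, frozenset(["E"]): g0,
    frozenset(["P1", "P2"]): g1 + g2 - 2 * g0,
    frozenset(["P1", "E"]): g1, frozenset(["P2", "E"]): g2,
    frozenset(players): g1 + g2 - g_star,
}
eta0 = {"P1": g1 - g0, "P2": g2 - g_star, "E": g0}    # an imputation from the core
tau = {p: eta0[p] / T for p in players}               # constant payment rate

def subgame(t):
    """Worths of the current subgame under the linear-scaling assumption."""
    scale = 1.0 - t / T
    return {S: scale * worth for S, worth in v0.items()}

def remaining(t):
    """What is still owed to the players at time t: eta0 minus paid amounts."""
    return {p: eta0[p] - t * tau[p] for p in players}

times = [0.0, 0.25, 0.5, 0.75, 1.0]
consistent = all(in_core(remaining(t), subgame(t), players) for t in times)
print(consistent, all(rate >= 0 for rate in tau.values()))
```

In this example the payment rates are non-negative and the remaining shares stay in the core of every current subgame, which is what condition 2 of Definition 4 requires.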

Acknowledgments

The author is grateful to D. Kuzyutin and S. Tarashnina for useful discussions on the subject.

References

Isaacs R. 1965. Differential Games: a mathematical theory with applications to warfare and pursuit, Control and Optimization. New York: Wiley.

Bondareva O. 1963. Some applications of methods of linear programming to cooperative games theory. Problems of Cybernetics, 10: 119-140 (in Russian).

Chikrii A.A. 1992. Conflictly Controlled Processes. Naukova Dumka, Kiev (in Russian).

Chistyakov S. 1993. Dynamic aspect of solving of classical cooperative games. Report of Russian Academy of Science. Vol. 330, N 6: 707-709 (in Russian).

Melikjan A.A. 1981. Optimal interaction in two-evaders game. Techn. Cybernetics, 2 (in Russian).

Petrosjan L. 1993. Differential Games of Pursuit. World Scientific Publishing, Singapore.

Petrosjan L., Zenkevich N. 1996. Game Theory. World Scientific Publishing, Singapore.

Petrosjan L., Kuzyutin D. 2008. Consistent solutions of position games. Saint-Petersburg University Press (in Russian).

Petrosjan L., Tomskii G. 1983. Geometry of Simple Pursuit. Nauka, Novosibirsk (in Russian).

Pecherskii S., Yanovskaya E. 2004. Cooperative games: solutions and axioms. European University Press, Saint-Petersburg (in Russian).

Scarf H.E. 1967. The core of an n-person game. Econometrica, 35: 50-69.

Shapley L. 1967. On balanced sets and cores. Naval Research Logistic Quarterly, 14: 453-460.

Tarashnina S. 1998. Nash equilibria in a differential pursuit game with one pursuer and m evaders. Game Theory and Applications. N.Y., Nova Science Publ. Vol. III: 115-123.

Tarashnina S. 2002. Time-Consistent Solution of a Cooperative Group Pursuit Game. International Game Theory Review. Vol. 4, 3: 301-317.
