Yaroslavna B. Pankratova1 and Svetlana I. Tarashnina2
1 St.Petersburg State University,
Faculty of Applied Mathematics and Control Processes, Universitetsky pr. 35, St.Petersburg, 198504, Russia
International Banking Institute,
Nevsky prospect 60, St.Petersburg, 191023, Russia
E-mail: [email protected]
2 St.Petersburg State University,
Faculty of Applied Mathematics and Control Processes, Universitetsky pr. 35, St.Petersburg, 198504, Russia
E-mail: [email protected]
Abstract In this paper we study a game of group pursuit in which the players move on a plane with bounded velocities. The game is a nonzero-sum simple pursuit game between a pursuer and m evaders acting independently of each other. The case of complete information is considered, and we assume that the evaders are discriminated. Two approaches to formalizing this pursuit problem are considered: noncooperative and cooperative. In the noncooperative case we construct a Nash equilibrium, and in the cooperative case we construct the core. We prove that the core is non-empty for any initial positions of the players.
Keywords: group pursuit game, Nash equilibrium, realizability area, TU-game, core.
1. Introduction
The process of pursuit represents a typical conflict situation. When only two players are involved in the process of pursuit, we deal with a classical zero-sum differential pursuit game. These games grew out of military problems and were developed by Isaacs (1965).
When more than two players participate in a game and the players' objectives are not strictly opposite, it is reasonable to consider such a game as a nonzero-sum one. This approach to solving a group pursuit problem was introduced in (Petrosjan and Shirjaev, 1981) and further applied in (Tarashnina, 1998) and (Pankratova and Tarashnina, 2004).
The players' goals are not always strictly opposed, and we want to illustrate how differential games can be used for solving different kinds of problems. In this setting "capture" can be understood simply as a meeting of the players in order to deliver some goods or information. In other words, the players do not aim to destroy each other. Moreover, players in a nonzero-sum game may cooperate with each other to obtain a maximal profit.
We investigate a nonzero-sum group pursuit game using two different approaches. We construct a game in normal form and its TU-cooperative version and find their solutions.
2. Nonzero-sum group pursuit game
In this work we study a pursuit game in which m + 1 players, the pursuer $P$ and the evaders $E_1, \dots, E_m$, move on a plane with constant speeds and can change the direction of their motion at each time instant (simple motion). We consider the case of complete information: at each time instant $t > 0$ every player knows the moment $t$, his own position and the positions of all other players. Additionally, we assume that the pursuer uses strategies with discrimination against the evaders: at each instant $t$ the pursuer $P$ knows the velocity vectors chosen by the evaders at that instant.
The players start their motion at the moment $t = 0$ from the initial positions

$$z_P^0 = (x_P^0, y_P^0), \quad z_i^0 = (x_i^0, y_i^0), \quad i = 1, \dots, m.$$

Let $\alpha$ and $\beta_i$ be the speeds of $P$ and $E_i$ ($i = 1, \dots, m$), respectively. Suppose that $\alpha > \max_{i=1,\dots,m} \beta_i$. Denote by $E_i^t = z_i^t = (x_i^t, y_i^t)$ and $P^t = z_P^t = (x_P^t, y_P^t)$ the current positions of evader $E_i$ and pursuer $P$ at the moment $t > 0$, respectively.
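The motion of the players is described by the following system of differential equations

$$\dot z_P = u_P, \quad u_P \in U_P, \qquad \dot z_i = u_{E_i}, \quad u_{E_i} \in U_{E_i}, \quad i = 1, \dots, m, \tag{1}$$

with initial conditions

$$z_P(0) = z_P^0, \quad z_i(0) = z_i^0, \quad i = 1, \dots, m, \tag{2}$$

where $z_P, z_1, \dots, z_m \in R^2$. The vectors $u_P \in U_P$ and $u_{E_i} \in U_{E_i}$ are the control variables of $P$ and $E_i$ ($i = 1, \dots, m$), respectively. The sets of control variables $U_P$ and $U_{E_i}$ have the following forms

$$U_P = \{u_P = (u_P^1, u_P^2) : (u_P^1)^2 + (u_P^2)^2 = \alpha^2\},$$

$$U_{E_i} = \{u_{E_i} = (u_{E_i}^1, u_{E_i}^2) : (u_{E_i}^1)^2 + (u_{E_i}^2)^2 = \beta_i^2\}, \quad i = 1, \dots, m.$$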
We need to explain how the players choose their control variables throughout the game according to the incoming information. Define a strategy of an evader as a function of time and the current positions of the players. A strategy of player $E_i$ is a function $u_{E_i}(t, z_P^t, z_1^t, \dots, z_m^t)$ with values in $U_{E_i}$. The evaders use piecewise open-loop strategies. Denote by $U_{E_i}$ the set of admissible strategies of player $E_i$, $i = 1, \dots, m$.
A strategy of player $P$ is a function of time, the players' positions and the velocity vectors of the evaders, i.e.

$$u_P(t, z_P^t, z_1^t, \dots, z_m^t, u_{E_1}, \dots, u_{E_m}).$$

This means that the class of admissible strategies of the pursuer consists of strategies with discrimination (counterstrategies).
The game is played as follows: at the initial time instant the pursuer dictates a certain behaviour to the evaders $E_1, \dots, E_m$ and chooses some pursuit order. In other words, the pursuer fixes a pursuit order and calculates the total pursuit time, taking into account that the evaders use the prescribed behaviour. After that, $P$ successively pursues the evaders according to the chosen order and changes it as soon as any of the evaders chooses a direction of motion different from the one dictated by the pursuer. So the pursuer punishes a deviating evader by changing the pursuit order and starting to pursue that evader. If a group of evaders deviates, the pursuer punishes any one member of this group.
Let $\Pi$ be the set of all possible pursuit orders. Now we define the notion of a punishment strategy of the pursuer.
Definition 1. We say that the triple $u_P^{\pi} = (\pi, u_P, p)$ is a punishment strategy of pursuer $P$, where
- $\pi(z_P^0, z_1^0, \dots, z_m^0, u_{E_1}, \dots, u_{E_m})$ is a pursuit order chosen by the pursuer at the initial instant $t = 0$ for some fixed strategy profile of the evaders $u_{E_1}, \dots, u_{E_m}$;
- $u_P(t, z_P^t, z_1^t, \dots, z_m^t, u_{E_1}, \dots, u_{E_m})$, $t > 0$, is a pursuit strategy of $P$ that consists in successive pursuit of the evaders according to the chosen order;
- $p = p(t, u_{E_1}, \dots, u_{E_m})$ is an element of punishment that consists in changing the pursuit order at the moment $t$ by starting the pursuit of a deviating evader in case any of the evaders chooses a direction of motion different from $(u_{E_1}, \dots, u_{E_m})$ dictated by the pursuer.
Denote by $U_P = \{u_P^{\pi}\}$ the set of punishment strategies of the pursuer.
Evader $E_i$ is considered caught if the positions of $P$ and $E_i$ coincide at some time instant. We say that the game is over if the pursuer captures all the evaders. Let $\pi = \{1, \dots, i, \dots, m\}$ be a pursuit order chosen by pursuer $P$.
Denoting by $K_P$ the payoff function of $P$ and by $K_{E_i}$ the payoff function of evader $E_i$, $i = 1, \dots, m$, we have

$$K_{E_i}(u_P, u_{E_1}, \dots, u_{E_i}, \dots, u_{E_m}) = \sum_{k \le i,\ k = 1, \dots, m} T_k^{\pi}, \tag{3}$$

where $T_k^{\pi}$ is the time the pursuer spends on capturing evader $E_k$ ($k = 1, \dots, m$), counted from the previous capture, under the pursuit order $\pi \in \Pi$. Here $i$ is the number of evader $E_i$ in the pursuit order $\pi = \{1, \dots, i, \dots, m\}$ and $k$ ($k \le i$) is the number of an evader pursued before $E_i$, with $E_i$ itself included.
The payoff of $P$ is defined as the negative of the payoff of the evader that is caught last. Thus,

$$K_P(u_P, u_{E_1}, \dots, u_{E_m}) = -T^{\pi}, \tag{4}$$

where $T^{\pi} = \sum_{k=1}^{m} T_k^{\pi}$ is the total pursuit time and $\pi$ is the chosen pursuit order.
So, we define the nonzero-sum pursuit game as a normal form game as follows

$$\Gamma(z_P^0, z_1^0, \dots, z_m^0) = \langle N, \{U_i\}_{i \in N}, \{K_i\}_{i \in N} \rangle, \tag{5}$$

where $N = \{P, E_1, \dots, E_m\}$ is the set of players, $U_i$ is the set of admissible strategies of player $i$ and $K_i$ is the payoff function of player $i$ ($i \in N$), defined by (3) and (4). Each game depends on the choice of the initial positions of the players. Let us fix the players' initial positions and consider the game $\Gamma(z_P^0, z_1^0, \dots, z_m^0)$.
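The payoffs (3) and (4) can be illustrated with a short sketch. It assumes that the capture times $T_k^{\pi}$ are already known for a given order; the function name `payoffs` and the data layout are hypothetical and serve only as an illustration. The sample step times 2, 13.23 and 11.34 are the ones listed for the order {1, 3, 2} in Example 1 of this paper.

```python
def payoffs(order, step_times):
    """Payoffs (3) and (4) for one pursuit order.

    order      -- evader indices in the order of pursuit, e.g. (1, 3, 2)
    step_times -- T_k for each position in the order, i.e. the time spent
                  on each capture, counted from the previous capture
    Returns (dict evader -> K_E, K_P).
    """
    elapsed = 0.0
    evader_payoff = {}
    for evader, dt in zip(order, step_times):
        elapsed += dt                      # running sum of capture times
        evader_payoff[evader] = elapsed    # K_E = time until own capture
    return evader_payoff, -elapsed         # K_P = minus total pursuit time


if __name__ == "__main__":
    k_e, k_p = payoffs((1, 3, 2), (2.0, 13.23, 11.34))
    print({i: round(t, 2) for i, t in k_e.items()}, round(k_p, 2))
```

Each evader's payoff is the running total up to his own capture, so the evader caught last always receives exactly the negative of the pursuer's payoff.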
3. Nash equilibrium in the game $\Gamma(z_P^0, z_1^0, \dots, z_m^0)$
In nonzero-sum games there are a number of solution concepts that are based on additional assumptions about the players' behaviour and the structure of the game. One of them is the well-known concept of Nash equilibrium. In the game under consideration there exists a whole family of Nash equilibria that includes some which are extremely adverse to the evaders' interests and some which are favourable for them, as well as all intermediate equilibria. Different kinds of Nash equilibria in the game $\Gamma(z_P^0, z_1^0, \dots, z_m^0)$ were constructed in (Petrosjan and Shirjaev, 1981), (Tarashnina, 1998), (Pankratova and Tarashnina, 2004). Here we consider the extreme Nash equilibrium that is the most disadvantageous for the evaders among all the equilibria.
The strategy set of pursuer $P$ consists of the strategies $u_P^{\pi}$ corresponding to the pursuit orders $\pi \in \Pi$. The pursuer aims to minimize the total pursuit time, and each evader wants to avoid his own capture as long as possible and does not care about the other evaders. Denote by $\pi^*$ the pursuit order which minimizes the total pursuit time and by $u_P^{\pi^*}$ the corresponding strategy of the pursuer.
Let $E_{j'}$ be the evader who is currently pursued, $j' \in \{1, \dots, m\}$, and let $E_j$ be the $j$-th evader in the line of pursuit among the ones not yet caught, $j \in \{1, \dots, m\}$, $j > j'$.
Now let us describe two types of behaviour of evader $E_i$ ($i = 1, \dots, m$):
- $E_{j'}$, $j' \in \{1, \dots, m\}$, uses the behaviour $[u_{E_i}^{j'}]$ that prescribes moving along the straight line connecting his own and the pursuer's current positions in the direction away from $P$ (to the current capture point $N_{j'}$).
- $E_j$, $j \in \{1, \dots, m\}$, uses the behaviour $[u_{E_i}^{j}]$ that prescribes moving along the straight line to the capture point of the currently pursued evader $E_{j'}$, $j > j'$, namely to the current capture point $N_{j'}$, where $N_{j'} = P^{T_{j'}}$.
It is obvious that at some moment $t_{E_i} > 0$ during the game each evader $E_i$ changes its type from $E_j$ to $E_{j'}$. So the strategy $u_{E_i}^*(t, \cdot)$ of evader $E_i$ ($i = 2, \dots, m$) can be described as

$$u_{E_i}^*(t, \cdot) = \begin{cases} [u_{E_i}^{j}], & 0 \le t < t_{E_i}, \\ [u_{E_i}^{j'}], & t \ge t_{E_i}. \end{cases}$$

During the game each evader, except $E_1$, successively applies both types of behaviour. Player $E_1$ uses only the type $[u_{E_1}^{j'}]$, i.e. his strategy is $u_{E_1}^*(t, \cdot) = [u_{E_1}^{j'}]$, $t \ge 0$. Suppose that $T_0 = 0$, $N_0 = P^0 = (0, 0)$.
The following theorem (Tarashnina, 1998) gives the conditions that support the described Nash equilibrium $(u_P^{\pi^*}, u_{E_1}^*, \dots, u_{E_m}^*)$ in the game $\Gamma(z_P^0, z_1^0, \dots, z_m^0)$.
Theorem 1. In the game $\Gamma(z_P^0, z_1^0, \dots, z_m^0)$, if the conditions
(6)
and
a ~ A
a — (3.
i1
N i-2E?L\2
+
>
N i-2ET*-
i = 2,m.
(7)
hold for all $i = 1, \dots, m$, then there exists a Nash equilibrium that is constructed as follows:
1. $E_i$ ($i = 1, \dots, m$) chooses the strategy $u_{E_i}^*$ that dictates to him
- according to the behaviour $[u_{E_i}^{j'}]$, to move along the straight line connecting the current positions of $E_{j'}$ and $P$ at the moment $T_{i-1}$ in the direction away from $P$, if $i = j'$, where $j' \in \{1, \dots, m\}$;
- according to the behaviour $[u_{E_i}^{j}]$, to move along the straight line to the capture point of the currently pursued evader $E_{j'}$, $j' \in \{1, \dots, m\}$, $j > j'$, i.e. to the point $N_{j'}$, where $N_{j'} = P^{T_{j'}}$, if $i = j$, where $j > j'$.
2. $P$ chooses the strategy $u_P^{\pi^*}$ that minimizes the total pursuit time if each $E_i$ ($i = 1, \dots, m$) adheres to the strategy $u_{E_i}^*$, and $P$ changes the pursuit order as soon as any of the evaders $E_i$ ($i > j'$) not yet caught deviates from the strategy $u_{E_i}^*$, pursuing the deviating evader first.
Now we introduce the notion of a realizability area. For this purpose we associate with each evader $E_i$ an area $\Omega_i$ of initial positions of the evaders that support the Nash equilibrium and refer to it as the realizability area of the punishment strategy of the pursuer with respect to evader $E_i$ ($i = 1, \dots, m$).
In words, the area $\Omega_i$ is the set of all initial positions of $E_i$ such that, when located there, evader $E_i$ has to adhere to the strategy $u_{E_i}^*$ dictated to him by the pursuer.
Definition 2. The punishment strategy of pursuer $P$ is called realizable with respect to evader $E_i$, $i \in \{2, \dots, m\}$, if the lifetime of evader $E_i$ when he adheres to the strategy $u_{E_i}^*$ is larger than when he deviates, i.e. inequality (7) holds for the fixed $i$.
Definition 3. The punishment strategy of pursuer $P$ is called realizable in the game $\Gamma(z_P^0, z_1^0, \dots, z_m^0)$ if inequality (7) holds for all $i = 2, \dots, m$.
Some illustrative examples of the construction of realizability areas of the punishment strategy are presented in (Pankratova and Tarashnina, 2004).
4. Cooperative pursuit game $\Gamma_v(z_P^0, z_1^0, \dots, z_m^0)$
Let us suppose that the players in the game can form coalitions. We construct a cooperative game between pursuer $P$ and evaders $E_1, \dots, E_m$ under the assumption that the players use the strategies described in the previous section without the threat of punishment.
Assume that the utility of any player is transferable.
Let $2^N$ be the set of all subsets of $N$. A function $v: 2^N \to R^1$ with the following two properties
1. $v(\emptyset) = 0$, where $\emptyset$ is the empty set,
2. $v(S \cup R) \ge v(S) + v(R)$ for all $R, S \subset N$ with $S \cap R = \emptyset$,
is called the characteristic function of the game. Condition 2 is the superadditivity property.
For any coalition $S \subset N$ we define the characteristic function as follows

$$v(S) = \max_{u_S} \min_{u_{N \setminus S}} \sum_{i \in S} K_i(u_S, u_{N \setminus S}),$$

where $u_S$ and $u_{N \setminus S}$ are vectors of admissible strategies of the coalitions $S$ and $N \setminus S$, respectively. Using this approach, we construct the characteristic function $v$ for the game $\Gamma(z_P^0, z_1^0, \dots, z_m^0)$.
Consider an arbitrary permutation $\pi$ of the ordered set of indices $M = \{1, 2, \dots, m\}$. With this permutation we associate a substitution $k_\pi$, i.e. $k_\pi : M \to M$. This means that $k \in M$ goes to $k_\pi \in M$ under the permutation $\pi$.
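The characteristic function of the game has the form

$$v(\{E_{i_1}\}; z_P^0, z_1^0, \dots, z_m^0) = \min_{\pi \in \Pi} \Big\{ \sum_{k_\pi \le i_1} T_k^{\pi} \Big\}, \quad i_1 = 1, \dots, m,$$

$$v(\{E_{i_1}, E_{i_2}\}; z_P^0, z_1^0, \dots, z_m^0) = \min_{\pi \in \Pi} \Big\{ \sum_{k_\pi \le i_1} T_k^{\pi} + \sum_{k_\pi \le i_2} T_k^{\pi} \Big\}, \quad i_1, i_2 = 1, \dots, m, \ i_1 \ne i_2,$$

$$v(\{E_{i_1}, E_{i_2}, E_{i_3}\}; z_P^0, z_1^0, \dots, z_m^0) = \min_{\pi \in \Pi} \Big\{ \sum_{k_\pi \le i_1} T_k^{\pi} + \sum_{k_\pi \le i_2} T_k^{\pi} + \sum_{k_\pi \le i_3} T_k^{\pi} \Big\}, \quad i_1, i_2, i_3 = 1, \dots, m, \ i_1 \ne i_2 \ne i_3,$$

$$\dots$$

$$v(\{E_1, \dots, E_m\}; z_P^0, z_1^0, \dots, z_m^0) = \min_{\pi \in \Pi} \Big\{ T_{i_1}^{\pi} + \sum_{k_\pi = i_1}^{i_2} T_k^{\pi} + \dots + \sum_{k_\pi = i_1}^{i_{m-1}} T_k^{\pi} + \sum_{k_\pi = i_1}^{i_m} T_k^{\pi} \Big\},$$

$$v(\{P\}; z_P^0, z_1^0, \dots, z_m^0) = \max_{\pi \in \Pi} \{ -T^{\pi} \},$$

$$v(\{P, E_{i_1}\}; z_P^0, z_1^0, \dots, z_m^0) = \max_{\pi \in \Pi} \Big\{ -T^{\pi} + \sum_{k_\pi \le i_1} T_k^{\pi} \Big\} = 0, \quad i_1 = 1, \dots, m,$$

$$v(\{P, E_{i_1}, E_{i_2}\}; z_P^0, z_1^0, \dots, z_m^0) = \max_{\pi \in \Pi} \Big\{ -T^{\pi} + \sum_{k_\pi \le i_1} T_k^{\pi} + \sum_{k_\pi \le i_2} T_k^{\pi} \Big\}, \quad i_1, i_2 = 1, \dots, m, \ i_1 \ne i_2,$$

$$\dots$$

$$v(\{P, E_1, \dots, E_m\}; z_P^0, z_1^0, \dots, z_m^0) = \max_{\pi \in \Pi} \Big\{ -T^{\pi} + T_{i_1}^{\pi} + \sum_{k_\pi = i_1}^{i_2} T_k^{\pi} + \dots + \sum_{k_\pi = i_1}^{i_{m-1}} T_k^{\pi} + \sum_{k_\pi = i_1}^{i_m} T_k^{\pi} \Big\} = \max_{\pi \in \Pi} \Big\{ T_{i_1}^{\pi} + \sum_{k_\pi = i_1}^{i_2} T_k^{\pi} + \dots + \sum_{k_\pi = i_1}^{i_{m-1}} T_k^{\pi} \Big\}.$$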
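For simplicity denote (a tilde marks the quantities that belong to coalitions containing $P$)

$$T_{i_1} = \min_{\pi \in \Pi} \Big\{ \sum_{k_\pi \le i_1} T_k^{\pi} \Big\},$$

$$T_{i_1 i_2} = \min_{\pi \in \Pi} \Big\{ \sum_{k_\pi \le i_1} T_k^{\pi} + \sum_{k_\pi \le i_2} T_k^{\pi} \Big\},$$

$$T_{i_1 i_2 i_3} = \min_{\pi \in \Pi} \Big\{ \sum_{k_\pi \le i_1} T_k^{\pi} + \sum_{k_\pi \le i_2} T_k^{\pi} + \sum_{k_\pi \le i_3} T_k^{\pi} \Big\},$$

$$\dots$$

$$T_{i_1 i_2 \dots i_m} = \min_{\pi \in \Pi} \Big\{ T_{i_1}^{\pi} + \sum_{k_\pi = i_1}^{i_2} T_k^{\pi} + \dots + \sum_{k_\pi = i_1}^{i_{m-1}} T_k^{\pi} + \sum_{k_\pi = i_1}^{i_m} T_k^{\pi} \Big\},$$

$$T = \min_{\pi \in \Pi} \{ T^{\pi} \}, \tag{8}$$

$$\tilde T_{i_1} = \max_{\pi \in \Pi} \Big\{ -T^{\pi} + \sum_{k_\pi \le i_1} T_k^{\pi} \Big\},$$

$$\tilde T_{i_1 i_2} = \max_{\pi \in \Pi} \Big\{ -T^{\pi} + \sum_{k_\pi \le i_1} T_k^{\pi} + \sum_{k_\pi \le i_2} T_k^{\pi} \Big\},$$

$$\tilde T_{i_1 i_2 i_3} = \max_{\pi \in \Pi} \Big\{ -T^{\pi} + \sum_{k_\pi \le i_1} T_k^{\pi} + \sum_{k_\pi \le i_2} T_k^{\pi} + \sum_{k_\pi \le i_3} T_k^{\pi} \Big\},$$

$$\dots$$

$$T^* = \max_{\pi \in \Pi} \Big\{ T_{i_1}^{\pi} + \sum_{k_\pi = i_1}^{i_2} T_k^{\pi} + \dots + \sum_{k_\pi = i_1}^{i_{m-1}} T_k^{\pi} \Big\}. \tag{9}$$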
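The characteristic function $v$ can be written in the following form (a tilde marks the values of coalitions that contain $P$):

$$v(\{P\}; z_P^0, z_1^0, \dots, z_m^0) = -T,$$

$$v(\{E_{i_1}\}; z_P^0, z_1^0, \dots, z_m^0) = T_{i_1}, \quad i_1 = 1, \dots, m,$$

$$v(\{P, E_{i_1}\}; z_P^0, z_1^0, \dots, z_m^0) = \tilde T_{i_1} = 0, \quad i_1 = 1, \dots, m,$$

$$v(\{E_{i_1}, E_{i_2}\}; z_P^0, z_1^0, \dots, z_m^0) = T_{i_1 i_2}, \quad i_1, i_2 = 1, \dots, m, \ i_1 \ne i_2,$$

$$v(\{P, E_{i_1}, E_{i_2}\}; z_P^0, z_1^0, \dots, z_m^0) = \tilde T_{i_1 i_2}, \quad i_1, i_2 = 1, \dots, m, \ i_1 \ne i_2,$$

$$v(\{E_{i_1}, E_{i_2}, E_{i_3}\}; z_P^0, z_1^0, \dots, z_m^0) = T_{i_1 i_2 i_3}, \quad i_1, i_2, i_3 = 1, \dots, m, \ i_1 \ne i_2 \ne i_3,$$

$$v(\{P, E_{i_1}, E_{i_2}, E_{i_3}\}; z_P^0, z_1^0, \dots, z_m^0) = \tilde T_{i_1 i_2 i_3}, \quad i_1, i_2, i_3 = 1, \dots, m, \ i_1 \ne i_2 \ne i_3,$$

$$\dots$$

$$v(\{E_1, \dots, E_m\}; z_P^0, z_1^0, \dots, z_m^0) = T_{1 \dots m},$$

$$v(\{P, E_1, \dots, E_m\}; z_P^0, z_1^0, \dots, z_m^0) = T^*.$$

Here and below we use the shorthand $v(\{E_{i_1}, \dots, E_{i_k}\}; z_P^0, z_1^0, \dots, z_m^0) = v(E_{i_1}, \dots, E_{i_k})$ and $v(\{P, E_{i_1}, \dots, E_{i_k}\}; z_P^0, z_1^0, \dots, z_m^0) = v(P, E_{i_1}, \dots, E_{i_k})$.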
Definition 4. The pair $(N, v(S; z_P^0, z_1^0, \dots, z_m^0), S \subseteq N)$, where $N$ is the set of players and $v$ is the characteristic function defined by (8)-(9), is called a cooperative pursuit game in characteristic function form and is denoted by $\Gamma_v(z_P^0, z_1^0, \dots, z_m^0)$.
Example 1. Let us construct the characteristic function for a pursuit game with one pursuer and three evaders according to formulas (8) and (9). Let $\alpha = 1$ and $\beta_i = 1/2$, $i = 1, 2, 3$.
Fix the initial positions of the players: $P^0 = (0, 0)$, $E_1^0 = (1, 0)$, $E_2^0 = (-2, 4)$, $E_3^0 = (5, 7)$. Note that for the chosen initial positions of the players the punishment strategy of $P$ is realizable. First of all, we compose a table with the players' payoffs for all six pursuit orders $\pi_1, \dots, \pi_6$.
Table 1: The players' payoffs for different pursuit orders.

| Pursuit order | $K_{E_1}$ | $K_{E_2}$ | $K_{E_3}$ | $T^{\pi} = -K_P$ |
|---|---|---|---|---|
| $\pi_1 = \{1,2,3\}$ | 2 | 2 + 9.31 = 11.31 | 2 + 9.31 + 9.09 = 20.4 | 20.4 |
| $\pi_2 = \{1,3,2\}$ | 2 | 2 + 13.23 + 11.34 = 26.57 | 2 + 13.23 = 15.23 | 26.57 |
| $\pi_3 = \{2,1,3\}$ | 8.94 + 9.92 = 18.86 | 8.94 | 8.94 + 9.92 + 26.03 = 44.89 | 44.89 |
| $\pi_4 = \{2,3,1\}$ | 8.94 + 9.7 + 9.73 = 28.37 | 8.94 | 8.94 + 9.7 = 18.64 | 28.37 |
| $\pi_5 = \{3,1,2\}$ | 17.2 + 16.08 = 33.28 | 17.2 + 16.08 + 33.52 = 66.8 | 17.2 | 66.8 |
| $\pi_6 = \{3,2,1\}$ | 17.2 + 14.04 + 27.23 = 58.47 | 17.2 + 14.04 = 31.24 | 17.2 | 58.47 |
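The first entry of every pursuit order in Table 1 can be cross-checked by hand. The sketch below assumes that the first evader flees straight away from the pursuer and that the pursuer heads directly at him, so that the gap closes at the rate $\alpha - \beta$; the function name `first_capture_time` is hypothetical. Later captures depend on how the remaining evaders have moved in the meantime and are not covered by this sketch.

```python
import math

ALPHA = 1.0  # pursuer speed in Example 1
BETA = 0.5   # evader speed in Example 1


def first_capture_time(pursuer, evader, alpha=ALPHA, beta=BETA):
    """Capture time of the first pursued evader.

    Assumption: both players move along the line through their initial
    positions (the evader away from the pursuer), so the distance between
    them shrinks at the constant rate alpha - beta.
    """
    gap = math.dist(pursuer, evader)
    return gap / (alpha - beta)


if __name__ == "__main__":
    p0 = (0.0, 0.0)
    for name, e0 in (("E1", (1, 0)), ("E2", (-2, 4)), ("E3", (5, 7))):
        print(name, round(first_capture_time(p0, e0), 2))
```

The three values agree with the first capture times 2, 8.94 and 17.2 used in the table.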
The characteristic function, according to formulas (8) and (9), has the following form (the entries are listed in the order $\pi_1, \dots, \pi_6$):
v(P) = max{-20.4; -26.57; -44.89; -28.37; -66.8; -58.47} = -20.4,
v(E1) = min{2; 2; 18.86; 28.37; 33.28; 58.47} = 2,
v(E2) = min{11.31; 26.57; 8.94; 8.94; 66.8; 31.24} = 8.94,
v(E3) = min{20.4; 15.23; 44.89; 18.64; 17.2; 17.2} = 15.23,
v(P,E1) = max{-20.4 + 2; -26.57 + 2; -44.89 + 18.86; -28.37 + 28.37; -66.8 + 33.28; -58.47 + 58.47} = 0,
v(P,E2) = max{-20.4 + 11.31; -26.57 + 26.57; -44.89 + 8.94; -28.37 + 8.94; -66.8 + 66.8; -58.47 + 31.24} = 0,
v(P,E3) = max{-20.4 + 20.4; -26.57 + 15.23; -44.89 + 44.89; -28.37 + 18.64; -66.8 + 17.2; -58.47 + 17.2} = 0,
v(E1,E2) = min{2 + 11.31; 2 + 26.57; 18.86 + 8.94; 28.37 + 8.94; 33.28 + 66.8; 58.47 + 31.24} = 13.31,
v(E1,E3) = min{2 + 20.4; 2 + 15.23; 18.86 + 44.89; 28.37 + 18.64; 33.28 + 17.2; 58.47 + 17.2} = 17.23,
v(E2,E3) = min{11.31 + 20.4; 26.57 + 15.23; 8.94 + 44.89; 8.94 + 18.64; 66.8 + 17.2; 31.24 + 17.2} = 27.58,
v(P,E1,E2) = max{-20.4 + 2 + 11.31; -26.57 + 2 + 26.57; -44.89 + 18.86 + 8.94; -28.37 + 28.37 + 8.94; -66.8 + 33.28 + 66.8; -58.47 + 58.47 + 31.24} = 33.28,
v(P,E1,E3) = max{-20.4 + 2 + 20.4; -26.57 + 2 + 15.23; -44.89 + 18.86 + 44.89; -28.37 + 28.37 + 18.64; -66.8 + 33.28 + 17.2; -58.47 + 58.47 + 17.2} = 18.86,
v(P,E2,E3) = max{-20.4 + 11.31 + 20.4; -26.57 + 26.57 + 15.23; -44.89 + 8.94 + 44.89; -28.37 + 8.94 + 18.64; -66.8 + 66.8 + 17.2; -58.47 + 31.24 + 17.2} = 17.2,
v(E1,E2,E3) = min{2 + 11.31 + 20.4; 2 + 26.57 + 15.23; 18.86 + 8.94 + 44.89; 28.37 + 8.94 + 18.64; 33.28 + 66.8 + 17.2; 58.47 + 31.24 + 17.2} = 33.71,
v(P,E1,E2,E3) = max{-20.4 + 2 + 11.31 + 20.4; -26.57 + 2 + 26.57 + 15.23; -44.89 + 18.86 + 8.94 + 44.89; -28.37 + 28.37 + 8.94 + 18.64; -66.8 + 33.28 + 66.8 + 17.2; -58.47 + 58.47 + 31.24 + 17.2} = 50.48.
Finally, we obtain the cooperative pursuit game in characteristic function form:
v(P) = -20.4,
v(E1) = 2, v(E2) = 8.94, v(E3) = 15.23,
v(P,E1) = 0, v(P,E2) = 0, v(P,E3) = 0,
v(E1,E2) = 13.31, v(E1,E3) = 17.23, v(E2,E3) = 27.58,
v(P,E1,E2) = 33.28, v(P,E1,E3) = 18.86, v(P,E2,E3) = 17.2,
v(E1,E2,E3) = 33.71, v(P,E1,E2,E3) = 50.48.
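The values above can be reproduced mechanically from Table 1. The following is a minimal sketch, not the authors' code: the dictionary `TABLE` and the function `v` are illustrative names, and the payoffs are the rounded numbers of the table. For a coalition of evaders the sum of their payoffs is minimized over the six orders; if the coalition also contains $P$, the total pursuit time is subtracted and the result is maximized.

```python
# Rows of Table 1: pursuit order -> (K_E1, K_E2, K_E3, total time T)
TABLE = {
    (1, 2, 3): (2.0, 11.31, 20.4, 20.4),
    (1, 3, 2): (2.0, 26.57, 15.23, 26.57),
    (2, 1, 3): (18.86, 8.94, 44.89, 44.89),
    (2, 3, 1): (28.37, 8.94, 18.64, 28.37),
    (3, 1, 2): (33.28, 66.8, 17.2, 66.8),
    (3, 2, 1): (58.47, 31.24, 17.2, 58.47),
}


def v(evaders, with_pursuer=False):
    """Value of the coalition made of the given evaders (and P if requested).

    evaders      -- tuple of evader indices, e.g. (1, 3)
    with_pursuer -- True if the pursuer P belongs to the coalition
    """
    totals = []
    for row in TABLE.values():
        total = sum(row[i - 1] for i in evaders)
        if with_pursuer:
            total -= row[3]          # K_P = -T for this order
        totals.append(total)
    best = max(totals) if with_pursuer else min(totals)
    return round(best, 2)


if __name__ == "__main__":
    print(v((), True), v((1,)), v((2, 3)), v((1, 2), True), v((1, 2, 3), True))
```

Rounding to two decimals matches the precision of the table and removes floating-point noise from the sums.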
In this example we can see that the characteristic function of the game is superadditive.
Below we prove that this is true for any number of evaders and any initial positions of the players.
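For the example, superadditivity can also be confirmed by brute force over all pairs of disjoint coalitions. The sketch below hard-codes the fifteen values of Example 1; the names `V` and `superadditivity_violations` are hypothetical, and a small tolerance absorbs floating-point error in the sums.

```python
from itertools import combinations


def c(*names):
    """Shorthand for building a coalition as a frozenset of player names."""
    return frozenset(names)


# Characteristic function of Example 1 (values rounded as in the text).
V = {
    c(): 0.0,
    c("P"): -20.4,
    c("E1"): 2.0, c("E2"): 8.94, c("E3"): 15.23,
    c("P", "E1"): 0.0, c("P", "E2"): 0.0, c("P", "E3"): 0.0,
    c("E1", "E2"): 13.31, c("E1", "E3"): 17.23, c("E2", "E3"): 27.58,
    c("P", "E1", "E2"): 33.28, c("P", "E1", "E3"): 18.86,
    c("P", "E2", "E3"): 17.2,
    c("E1", "E2", "E3"): 33.71,
    c("P", "E1", "E2", "E3"): 50.48,
}


def superadditivity_violations(v, tol=1e-9):
    """Return all disjoint pairs (S, T) with v(S) + v(T) > v(S union T)."""
    coalitions = [s for s in v if s]          # skip the empty coalition
    bad = []
    for s, t in combinations(coalitions, 2):
        if s & t:
            continue                          # only disjoint coalitions count
        if v[s] + v[t] > v[s | t] + tol:
            bad.append((set(s), set(t)))
    return bad


if __name__ == "__main__":
    print(superadditivity_violations(V))
```

An empty list means that no pair of disjoint coalitions loses by merging.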
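Theorem 2. In the game $\Gamma_v(z_P^0, z_1^0, \dots, z_m^0)$ the characteristic function $v$ constructed by formulas (8) and (9) is superadditive.
Proof. In order to prove the theorem we have to show that the inequality

$$v(S) + v(T) \le v(S \cup T)$$

holds for all coalitions $S, T \subset N$, $S \cap T = \emptyset$.
In fact, the following inequalities are fulfilled:

$$\begin{aligned} T_{i_1} = \min_{\pi \in \Pi} \Big\{ \sum_{k_\pi \le i_1} T_k^{\pi} \Big\} &\le T_{i_1 i_2} = \min_{\pi \in \Pi} \Big\{ \sum_{k_\pi \le i_1} T_k^{\pi} + \sum_{k_\pi \le i_2} T_k^{\pi} \Big\} \\ &\le T_{i_1 i_2 i_3} = \min_{\pi \in \Pi} \Big\{ \sum_{k_\pi \le i_1} T_k^{\pi} + \sum_{k_\pi \le i_2} T_k^{\pi} + \sum_{k_\pi \le i_3} T_k^{\pi} \Big\} \le \dots \\ &\le T_{i_1 \dots i_{m-1}} = \min_{\pi \in \Pi} \Big\{ \sum_{k_\pi \le i_1} T_k^{\pi} + \dots + \sum_{k_\pi \le i_{m-1}} T_k^{\pi} \Big\} \\ &\le T_{i_1 \dots i_m} = \min_{\pi \in \Pi} \Big\{ \sum_{k_\pi \le i_1} T_k^{\pi} + \dots + \sum_{k_\pi \le i_m} T_k^{\pi} \Big\}. \end{aligned}$$

It can easily be shown that

$$\begin{aligned} -T = \max_{\pi \in \Pi} \{ -T^{\pi} \} &\le \tilde T_{i_1 i_2} = \max_{\pi \in \Pi} \Big\{ -T^{\pi} + \sum_{k_\pi \le i_1} T_k^{\pi} + \sum_{k_\pi \le i_2} T_k^{\pi} \Big\} \le \dots \\ &\le \tilde T_{i_1 \dots i_{m-1}} = \max_{\pi \in \Pi} \Big\{ -T^{\pi} + \sum_{k_\pi \le i_1} T_k^{\pi} + \dots + \sum_{k_\pi \le i_{m-1}} T_k^{\pi} \Big\} \\ &\le T^* = \max_{\pi \in \Pi} \Big\{ -T^{\pi} + T_{i_1}^{\pi} + \sum_{k_\pi = i_1}^{i_2} T_k^{\pi} + \dots + \sum_{k_\pi = i_1}^{i_{m-1}} T_k^{\pi} + \sum_{k_\pi = i_1}^{i_m} T_k^{\pi} \Big\}. \end{aligned}$$

So, we have

$$T_{i_1} \le T_{i_1 i_2} \le T_{i_1 i_2 i_3} \le \dots \le T_{i_1 \dots i_{m-1}} \le T_{i_1 \dots i_m}, \tag{10}$$

$$-T \le \tilde T_{i_1 i_2} \le \tilde T_{i_1 i_2 i_3} \le \dots \le \tilde T_{i_1 i_2 \dots i_{m-1}} \le T^*. \tag{11}$$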
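Let $S = \{P\}$ and $T = \{E_{i_1}\}$. Since $T = \min_{\pi \in \Pi} \{T^{\pi}\} \ge T_{i_1}$, we have $v(P) + v(E_{i_1}) = -T + T_{i_1} \le 0 = v(P, E_{i_1})$, $i_1 = 1, \dots, m$.
For $S = \{E_{i_1}\}$ and $T = \{E_{i_2}\}$ we have $v(E_{i_1}) + v(E_{i_2}) = T_{i_1} + T_{i_2} \le T_{i_1 i_2} = v(E_{i_1}, E_{i_2})$, $i_1, i_2 = 1, \dots, m$, $i_1 \ne i_2$. This holds because the minimum of a sum is not less than the sum of the minima (cf. (10)).
In the estimates below $\pi^*$ denotes a pursuit order at which the maximum appearing on the left-hand side is attained.
For $S = \{P\}$ and $T = \{E_{i_1}, E_{i_2}\}$ we have

$$\begin{aligned} v(P) + v(E_{i_1}, E_{i_2}) &= -T + T_{i_1 i_2} = \max_{\pi \in \Pi} \{ -T^{\pi} \} + \min_{\pi \in \Pi} \Big\{ \sum_{k_\pi \le i_1} T_k^{\pi} + \sum_{k_\pi \le i_2} T_k^{\pi} \Big\} \\ &= -T^{\pi^*} + \min_{\pi \in \Pi} \Big\{ \sum_{k_\pi \le i_1} T_k^{\pi} + \sum_{k_\pi \le i_2} T_k^{\pi} \Big\} \le -T^{\pi^*} + \sum_{k_{\pi^*} \le i_1} T_k^{\pi^*} + \sum_{k_{\pi^*} \le i_2} T_k^{\pi^*} \\ &\le \max_{\pi \in \Pi} \Big\{ -T^{\pi} + \sum_{k_\pi \le i_1} T_k^{\pi} + \sum_{k_\pi \le i_2} T_k^{\pi} \Big\} = \tilde T_{i_1 i_2} = v(P, E_{i_1}, E_{i_2}), \end{aligned}$$

$i_1, i_2 = 1, \dots, m$, $i_1 \ne i_2$.
Now consider $S = \{E_{i_1}\}$ and $T = \{P, E_{i_2}\}$. Then

$$\begin{aligned} v(E_{i_1}) + v(P, E_{i_2}) &= T_{i_1} + \tilde T_{i_2} = \min_{\pi \in \Pi} \Big\{ \sum_{k_\pi \le i_1} T_k^{\pi} \Big\} + \max_{\pi \in \Pi} \Big\{ -T^{\pi} + \sum_{k_\pi \le i_2} T_k^{\pi} \Big\} \\ &= \min_{\pi \in \Pi} \Big\{ \sum_{k_\pi \le i_1} T_k^{\pi} \Big\} - T^{\pi^*} + \sum_{k_{\pi^*} \le i_2} T_k^{\pi^*} \le -T^{\pi^*} + \sum_{k_{\pi^*} \le i_1} T_k^{\pi^*} + \sum_{k_{\pi^*} \le i_2} T_k^{\pi^*} \\ &\le \max_{\pi \in \Pi} \Big\{ -T^{\pi} + \sum_{k_\pi \le i_1} T_k^{\pi} + \sum_{k_\pi \le i_2} T_k^{\pi} \Big\} = \tilde T_{i_1 i_2} = v(P, E_{i_1}, E_{i_2}), \end{aligned}$$

$i_1, i_2 = 1, \dots, m$, $i_1 \ne i_2$.
For $S = \{P, E_{i_1}\}$ and $T = \{E_{i_2}, E_{i_3}\}$ we have

$$\begin{aligned} v(P, E_{i_1}) + v(E_{i_2}, E_{i_3}) &= \tilde T_{i_1} + T_{i_2 i_3} = \max_{\pi \in \Pi} \Big\{ -T^{\pi} + \sum_{k_\pi \le i_1} T_k^{\pi} \Big\} + \min_{\pi \in \Pi} \Big\{ \sum_{k_\pi \le i_2} T_k^{\pi} + \sum_{k_\pi \le i_3} T_k^{\pi} \Big\} \\ &= -T^{\pi^*} + \sum_{k_{\pi^*} \le i_1} T_k^{\pi^*} + \min_{\pi \in \Pi} \Big\{ \sum_{k_\pi \le i_2} T_k^{\pi} + \sum_{k_\pi \le i_3} T_k^{\pi} \Big\} \\ &\le -T^{\pi^*} + \sum_{k_{\pi^*} \le i_1} T_k^{\pi^*} + \sum_{k_{\pi^*} \le i_2} T_k^{\pi^*} + \sum_{k_{\pi^*} \le i_3} T_k^{\pi^*} \\ &\le \max_{\pi \in \Pi} \Big\{ -T^{\pi} + \sum_{k_\pi \le i_1} T_k^{\pi} + \sum_{k_\pi \le i_2} T_k^{\pi} + \sum_{k_\pi \le i_3} T_k^{\pi} \Big\} = \tilde T_{i_1 i_2 i_3} = v(P, E_{i_1}, E_{i_2}, E_{i_3}), \end{aligned}$$

$i_1, i_2, i_3 = 1, \dots, m$, $i_1 \ne i_2 \ne i_3$.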
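Now consider two coalitions each of which includes only evaders: $E_l = \{E_{i_1}, \dots, E_{i_l}\}$ and $E_s = \{E_{j_1}, \dots, E_{j_s}\}$, $E_l \cap E_s = \emptyset$, $i_k, j_q = 1, \dots, m$, $i_k \ne j_q$, $i_1 \ne \dots \ne i_l$, $j_1 \ne \dots \ne j_s$, $k = 1, \dots, l$ and $q = 1, \dots, s$. For these coalitions we get

$$\begin{aligned} v(E_{i_1}, \dots, E_{i_l}) + v(E_{j_1}, \dots, E_{j_s}) &= T_{i_1 \dots i_l} + T_{j_1 \dots j_s} \\ &= \min_{\pi \in \Pi} \Big\{ \sum_{k_\pi \le i_1} T_k^{\pi} + \dots + \sum_{k_\pi \le i_l} T_k^{\pi} \Big\} + \min_{\pi \in \Pi} \Big\{ \sum_{k_\pi \le j_1} T_k^{\pi} + \dots + \sum_{k_\pi \le j_s} T_k^{\pi} \Big\} \\ &\le \min_{\pi \in \Pi} \Big\{ \sum_{k_\pi \le i_1} T_k^{\pi} + \dots + \sum_{k_\pi \le i_l} T_k^{\pi} + \sum_{k_\pi \le j_1} T_k^{\pi} + \dots + \sum_{k_\pi \le j_s} T_k^{\pi} \Big\} \\ &= T_{i_1 \dots i_l j_1 \dots j_s} = v(E_{i_1}, \dots, E_{i_l}, E_{j_1}, \dots, E_{j_s}). \end{aligned}$$

It remains to consider the coalitions $S = \{P, E_{i_1}, \dots, E_{i_l}\}$ and $T = \{E_{j_1}, \dots, E_{j_s}\}$, $i_k, j_q = 1, \dots, m$, $i_k \ne j_q$, $i_1 \ne \dots \ne i_l$, $j_1 \ne \dots \ne j_s$, $k = 1, \dots, l$ and $q = 1, \dots, s$. Then, with $\pi^*$ denoting an order at which the maximum below is attained,

$$\begin{aligned} v(P, E_{i_1}, \dots, E_{i_l}) + v(E_{j_1}, \dots, E_{j_s}) &= \tilde T_{i_1 \dots i_l} + T_{j_1 \dots j_s} \\ &= \max_{\pi \in \Pi} \Big\{ -T^{\pi} + \sum_{k_\pi \le i_1} T_k^{\pi} + \dots + \sum_{k_\pi \le i_l} T_k^{\pi} \Big\} + \min_{\pi \in \Pi} \Big\{ \sum_{k_\pi \le j_1} T_k^{\pi} + \dots + \sum_{k_\pi \le j_s} T_k^{\pi} \Big\} \\ &= -T^{\pi^*} + \sum_{k_{\pi^*} \le i_1} T_k^{\pi^*} + \dots + \sum_{k_{\pi^*} \le i_l} T_k^{\pi^*} + \min_{\pi \in \Pi} \Big\{ \sum_{k_\pi \le j_1} T_k^{\pi} + \dots + \sum_{k_\pi \le j_s} T_k^{\pi} \Big\} \\ &\le -T^{\pi^*} + \sum_{k_{\pi^*} \le i_1} T_k^{\pi^*} + \dots + \sum_{k_{\pi^*} \le i_l} T_k^{\pi^*} + \sum_{k_{\pi^*} \le j_1} T_k^{\pi^*} + \dots + \sum_{k_{\pi^*} \le j_s} T_k^{\pi^*} \\ &\le \max_{\pi \in \Pi} \Big\{ -T^{\pi} + \sum_{k_\pi \le i_1} T_k^{\pi} + \dots + \sum_{k_\pi \le i_l} T_k^{\pi} + \sum_{k_\pi \le j_1} T_k^{\pi} + \dots + \sum_{k_\pi \le j_s} T_k^{\pi} \Big\} \\ &= \tilde T_{i_1 \dots i_l j_1 \dots j_s} = v(P, E_{i_1}, \dots, E_{i_l}, E_{j_1}, \dots, E_{j_s}). \end{aligned}$$

Finally, we consider $S = \{P\}$ and $T = \{E_{i_1}, E_{i_2}, \dots, E_{i_m}\}$. Here, with $\pi^*$ denoting an order that minimizes the total pursuit time,

$$\begin{aligned} v(P) + v(E_{i_1}, E_{i_2}, \dots, E_{i_m}) &= -T + T_{i_1 i_2 \dots i_m} \\ &= \max_{\pi \in \Pi} \{ -T^{\pi} \} + \min_{\pi \in \Pi} \Big\{ T_{i_1}^{\pi} + \sum_{k_\pi = i_1}^{i_2} T_k^{\pi} + \dots + \sum_{k_\pi = i_1}^{i_{m-1}} T_k^{\pi} + \sum_{k_\pi = i_1}^{i_m} T_k^{\pi} \Big\} \\ &\le -T^{\pi^*} + T_{i_1}^{\pi^*} + \sum_{k_{\pi^*} = i_1}^{i_2} T_k^{\pi^*} + \dots + \sum_{k_{\pi^*} = i_1}^{i_{m-1}} T_k^{\pi^*} + \sum_{k_{\pi^*} = i_1}^{i_m} T_k^{\pi^*} \\ &\le \max_{\pi \in \Pi} \Big\{ -T^{\pi} + T_{i_1}^{\pi} + \sum_{k_\pi = i_1}^{i_2} T_k^{\pi} + \dots + \sum_{k_\pi = i_1}^{i_{m-1}} T_k^{\pi} + \sum_{k_\pi = i_1}^{i_m} T_k^{\pi} \Big\} \\ &= T^* = v(P, E_1, \dots, E_m). \end{aligned}$$

This completes the proof.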
It follows from the superadditivity of v that it is profitable for the players to form the maximal coalition N and obtain the maximal total payoff that is possible in the game.
There exist various methods for distribution of the total payoff between the players in a cooperative TU-game. In our paper we consider the core as a solution concept of the game.
5. The core in the game $\Gamma_v(z_P^0, z_1^0, \dots, z_m^0)$
Let us describe the imputation set in the game $\Gamma_v(z_P^0, z_1^0, \dots, z_m^0)$. Denote by $\xi = (\xi_P, \xi_{E_1}, \dots, \xi_{E_m})$ an imputation in the game. The imputation set is defined as follows

$$E_v(z_P^0, z_1^0, \dots, z_m^0) = \Big\{ \xi : \xi_{E_i} \ge T_i, \ \xi_P \ge -T, \ \sum_{i \in N} \xi_i = T^* \Big\}. \tag{12}$$
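The following result follows from (Bondareva, 1963) and (Shapley, 1967). For an imputation $\xi$ to belong to the core of the game $\Gamma_v(z_P^0, z_1^0, \dots, z_m^0)$ it is necessary and sufficient that the following system of inequalities holds (a tilde marks the values of coalitions that contain $P$):

$$\begin{aligned} & \xi_P \ge -T, \\ & \xi_{E_{i_1}} \ge T_{i_1}, \quad i_1 = 1, \dots, m, \\ & \xi_P + \xi_{E_{i_1}} \ge \tilde T_{i_1} = 0, \quad i_1 = 1, \dots, m, \\ & \xi_{E_{i_1}} + \xi_{E_{i_2}} \ge T_{i_1 i_2}, \quad i_1, i_2 = 1, \dots, m, \ i_1 \ne i_2, \\ & \xi_P + \xi_{E_{i_1}} + \xi_{E_{i_2}} \ge \tilde T_{i_1 i_2}, \quad i_1, i_2 = 1, \dots, m, \ i_1 \ne i_2, \\ & \xi_{E_{i_1}} + \xi_{E_{i_2}} + \xi_{E_{i_3}} \ge T_{i_1 i_2 i_3}, \quad i_1, i_2, i_3 = 1, \dots, m, \ i_1 \ne i_2 \ne i_3, \\ & \xi_P + \xi_{E_{i_1}} + \xi_{E_{i_2}} + \xi_{E_{i_3}} \ge \tilde T_{i_1 i_2 i_3}, \quad i_1, i_2, i_3 = 1, \dots, m, \ i_1 \ne i_2 \ne i_3, \\ & \dots \\ & \xi_P + \xi_{E_{i_1}} + \dots + \xi_{E_{i_{m-1}}} \ge \tilde T_{i_1 \dots i_{m-1}}, \quad i_1, \dots, i_{m-1} = 1, \dots, m, \ i_1 \ne \dots \ne i_{m-1}, \\ & \xi_{E_{i_1}} + \dots + \xi_{E_{i_m}} \ge T_{i_1 \dots i_m}. \end{aligned} \tag{13}$$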
Denote by $C_v(z_P^0, z_1^0, \dots, z_m^0)$ the core of the game $\Gamma_v(z_P^0, z_1^0, \dots, z_m^0)$.
The following theorem holds.
Theorem 3. In the cooperative pursuit game $\Gamma_v(z_P^0, z_1^0, \dots, z_m^0)$ the core is non-empty for any initial positions of the players.
Proof. First of all, we show that any imputation from the core satisfies the system (13). Suppose that the imputation $\xi = (\xi_P, \xi_{E_1}, \dots, \xi_{E_m})$ belongs to the core $C_v(z_P^0, z_1^0, \dots, z_m^0)$. We have to show that system (13) is consistent.
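Summing the inequalities of system (13), we obtain

$$\begin{aligned} (1 + C_m^1 + C_m^2 + \dots + C_m^{m-1}) \cdot (\xi_P + \xi_{E_1} + \dots + \xi_{E_m}) \ge{} & -T + \sum_{k=1}^{m} T_k + \sum_{k=1}^{m} \tilde T_k \\ & + \underbrace{(T_{12} + T_{13} + \dots + T_{m-1,m})}_{C_m^2} + \underbrace{(\tilde T_{12} + \tilde T_{13} + \dots + \tilde T_{m-1,m})}_{C_m^2} \\ & + \underbrace{(T_{123} + T_{124} + \dots)}_{C_m^3} + \underbrace{(\tilde T_{123} + \tilde T_{124} + \dots)}_{C_m^3} \\ & + \underbrace{(T_{1234} + T_{1235} + \dots)}_{C_m^4} + \underbrace{(\tilde T_{1234} + \tilde T_{1235} + \dots)}_{C_m^4} + \dots \\ & + \underbrace{(T_{12 \dots m-1} + T_{12 \dots m-2, m} + \dots)}_{C_m^{m-1}} + \underbrace{(\tilde T_{12 \dots m-1} + \tilde T_{12 \dots m-2, m} + \dots)}_{C_m^{m-1}} \\ & + \underbrace{T_{12 \dots m}}_{C_m^m}. \end{aligned} \tag{14}$$

Let us consider the left-hand side of inequality (14). Taking into account that $v(N) = T^*$ and $1 + C_m^1 + \dots + C_m^{m-1} = 2^m - 1$, we have

$$\begin{aligned} (2^m - 1) \cdot T^* \ge{} & \sum_{k=1}^{m} T_k + \sum_{k=1}^{m} \tilde T_k \\ & + \underbrace{(T_{12} + T_{13} + \dots + T_{m-1,m})}_{C_m^2} + \underbrace{(\tilde T_{12} + \tilde T_{13} + \dots + \tilde T_{m-1,m})}_{C_m^2} \\ & + \underbrace{(T_{123} + T_{124} + \dots)}_{C_m^3} + \underbrace{(\tilde T_{123} + \tilde T_{124} + \dots)}_{C_m^3} \\ & + \underbrace{(T_{1234} + T_{1235} + \dots)}_{C_m^4} + \underbrace{(\tilde T_{1234} + \tilde T_{1235} + \dots)}_{C_m^4} + \dots \\ & + \underbrace{(T_{12 \dots m-1} + T_{12 \dots m-2, m} + \dots)}_{C_m^{m-1}} + \underbrace{(\tilde T_{12 \dots m-1} + \tilde T_{12 \dots m-2, m} + \dots)}_{C_m^{m-1}} \\ & - T + T_{1 \dots m}. \end{aligned} \tag{15}$$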
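Let us consider the following pairs of values, each consisting of the value of a coalition and the value of its complement in $N$ (a tilde marks the coalition that contains $P$):
$T_{j_i}$ and $\tilde T_{j_1 \dots j_{i-1} j_{i+1} \dots j_m}$ ($m = C_m^1$ pairs);
$\tilde T_{j_i}$ and $T_{j_1 \dots j_{i-1} j_{i+1} \dots j_m}$ ($m = C_m^{m-1}$ pairs);
$T_{j_i j_k}$ and $\tilde T_{j_1 \dots j_{i-1} j_{i+1} \dots j_{k-1} j_{k+1} \dots j_m}$ ($C_m^2$ pairs);
$\tilde T_{j_i j_k}$ and $T_{j_1 \dots j_{i-1} j_{i+1} \dots j_{k-1} j_{k+1} \dots j_m}$ ($C_m^2$ pairs);
...
$-T$ and $T_{1 \dots m}$ (1 pair).
By superadditivity of the game, we get

$$\begin{aligned} & T_{j_i} + \tilde T_{j_1 \dots j_{i-1} j_{i+1} \dots j_m} \le T^*, \\ & \tilde T_{j_i} + T_{j_1 \dots j_{i-1} j_{i+1} \dots j_m} \le T^*, \\ & T_{j_i j_k} + \tilde T_{j_1 \dots j_{i-1} j_{i+1} \dots j_{k-1} j_{k+1} \dots j_m} \le T^*, \\ & \tilde T_{j_i j_k} + T_{j_1 \dots j_{i-1} j_{i+1} \dots j_{k-1} j_{k+1} \dots j_m} \le T^*, \\ & \dots \\ & -T + T_{1 \dots m} \le T^*. \end{aligned}$$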
The right-hand side of (15) can be estimated as follows:

$$\begin{aligned} & \underbrace{\big( (T_1 + \tilde T_{23 \dots m}) + \dots + (T_m + \tilde T_{12 \dots m-1}) \big)}_{C_m^1} + \underbrace{\big( (T_{12} + \tilde T_{34 \dots m}) + \dots + (T_{m-1,m} + \tilde T_{12 \dots m-2}) \big)}_{C_m^2} \\ & + \underbrace{\big( (T_{123} + \tilde T_{4 \dots m}) + \dots + (T_{m-2,m-1,m} + \tilde T_{1 \dots m-3}) \big)}_{C_m^3} + \dots \\ & + \underbrace{\big( (\tilde T_{123} + T_{4 \dots m}) + \dots + (\tilde T_{m-2,m-1,m} + T_{1 \dots m-3}) \big)}_{C_m^{m-3}} + \underbrace{\big( (\tilde T_{12} + T_{34 \dots m}) + \dots + (\tilde T_{m-1,m} + T_{12 \dots m-2}) \big)}_{C_m^{m-2}} \\ & + \underbrace{\big( (\tilde T_1 + T_{23 \dots m}) + \dots + (\tilde T_m + T_{12 \dots m-1}) \big)}_{C_m^{m-1}} + (-T + T_{12 \dots m}) \\ & \le (C_m^1 + C_m^2 + C_m^3 + \dots + C_m^{m-1} + C_m^m) \cdot T^*. \end{aligned}$$

It remains to show that the following inequality holds:

$$(2^m - 1) \cdot T^* \ge (C_m^1 + C_m^2 + C_m^3 + \dots + C_m^{m-1} + C_m^m) \cdot T^*. \tag{16}$$

If the last inequality is true, then (15) is fulfilled. Taking into account that $C_m^1 + C_m^2 + \dots + C_m^{m-1} + C_m^m = 2^m - 1$, we get

$$(2^m - 1) \cdot T^* \ge (2^m - 1) \cdot T^*.$$

It is obvious that the last inequality holds for any $m$. So, inequality (15) is satisfied. This means that inequality (14) is also satisfied for any initial positions of the players. Hence, system (13) is consistent.
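The two binomial sums used in this counting step are easy to confirm numerically. The sketch below is only a sanity check of the identities $1 + C_m^1 + \dots + C_m^{m-1} = 2^m - 1$ and $C_m^1 + \dots + C_m^m = 2^m - 1$; the function name `coalition_counts` is hypothetical.

```python
from math import comb


def coalition_counts(m):
    """Return both binomial sums that appear in (14)-(16) for m evaders."""
    left = sum(comb(m, k) for k in range(0, m))        # 1 + C(m,1) + ... + C(m,m-1)
    right = sum(comb(m, k) for k in range(1, m + 1))   # C(m,1) + ... + C(m,m)
    return left, right


if __name__ == "__main__":
    for m in range(1, 8):
        print(m, coalition_counts(m), 2 ** m - 1)
```

Both sums omit exactly one of the $2^m$ binomial terms, each equal to 1, which is why they coincide.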
It remains to show that any vector satisfying the system (13) is an imputation of the game $\Gamma_v(z_P^0, z_1^0, \dots, z_m^0)$. Indeed, it can be easily checked that the vector $\xi = (T^* - \sum_{i=1}^{m} T_i, T_1, T_2, \dots, T_{m-1}, T_m)$ satisfies system (13) and is an imputation of this game. This completes the proof.
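System (13) can be tested numerically for any candidate vector. The sketch below uses the rounded characteristic function of Example 1; the names `V` and `core_violations` are hypothetical. With these data the vector $(T^* - \sum T_i, T_1, T_2, T_3) = (24.31, 2, 8.94, 15.23)$ meets the efficiency and single-player conditions but falls short for the evader coalitions $\{E_1, E_2\}$, $\{E_2, E_3\}$ and $\{E_1, E_2, E_3\}$, whereas the hand-picked vector $(16.48, 5, 12, 17)$, which is not taken from the paper, satisfies every inequality, so the core of the example is non-empty.

```python
def c(*names):
    """Shorthand for building a coalition as a frozenset of player names."""
    return frozenset(names)


# Characteristic function of Example 1 (values rounded as in the text).
V = {
    c("P"): -20.4,
    c("E1"): 2.0, c("E2"): 8.94, c("E3"): 15.23,
    c("P", "E1"): 0.0, c("P", "E2"): 0.0, c("P", "E3"): 0.0,
    c("E1", "E2"): 13.31, c("E1", "E3"): 17.23, c("E2", "E3"): 27.58,
    c("P", "E1", "E2"): 33.28, c("P", "E1", "E3"): 18.86,
    c("P", "E2", "E3"): 17.2,
    c("E1", "E2", "E3"): 33.71,
    c("P", "E1", "E2", "E3"): 50.48,
}


def core_violations(xi, v, tol=1e-9):
    """List the conditions of system (13) that the payoff vector xi breaks.

    xi -- dict player name -> payoff; v -- dict coalition -> value.
    """
    grand = frozenset(xi)
    bad = []
    if abs(sum(xi.values()) - v[grand]) > tol:
        bad.append("efficiency")               # payoffs must sum to v(N)
    for coalition, value in v.items():
        if coalition == grand:
            continue
        if sum(xi[p] for p in coalition) < value - tol:
            bad.append(tuple(sorted(coalition)))
    return bad


if __name__ == "__main__":
    print(core_violations({"P": 16.48, "E1": 5.0, "E2": 12.0, "E3": 17.0}, V))
    print(core_violations({"P": 24.31, "E1": 2.0, "E2": 8.94, "E3": 15.23}, V))
```

The check is exhaustive over the fourteen proper coalitions, so an empty list certifies core membership for the rounded data.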
6. Conclusion
The cooperative and noncooperative approaches considered here for group pursuit games with one pursuer and m evaders yield a variety of solutions and allow the same problem to be examined from different points of view. This paper extends the application area of group pursuit games.
References
Bondareva, O. (1963). Some applications of methods of linear programming to cooperative games theory. Problems of Cybernetics, 10, 119-140 (in Russian).
Isaacs, R. (1965). Differential Games: a mathematical theory with applications to warfare and pursuit, Control and Optimization. New York: Wiley.
Pankratova, Y. (2007). Some cases of cooperation in differential pursuit games. Contributions to Game Theory and Management. Collected papers presented on the International Conference Game Theory and Management / Editors L.A. Petrosjan, N. A. Zenkevich. St.Petersburg. Graduated School of Management, SPbGU, pp. 361-380.
Pankratova, Ya. B. (2010). A Solution of a cooperative differential group pursuit game. Diskretnyi Analiz i Issledovanie Operatsii. Vol. 17, N 2, pp. 57-78 (in Russian).
Pankratova, Ya. and Tarashnina, S. (2004). How many people can be controlled in a group pursuit game. Theory and Decision. Kluwer Academic Publishers. 56, pp. 165-181.
Petrosjan, L.A. and Shirjaev, V. D. (1981). Simultaneous pursuit of several evaders by one pursuer. Vestnik Leningrad Univ. Math. Vol 13.
Petrosjan, L. and Tomskii, G. (1983). Geometry of Simple Pursuit. Nauka, Novosibirsk (in Russian).
Scarf, H. E. (1967). The core of an n-person game. Econometrica, 35, 50-69.
Shapley, L. (1967). On balanced sets and cores. Naval Research Logistics Quarterly 14, 453-460.
Tarashnina, S. (1998). Nash equilibria in a differential pursuit game with one pursuer and m evaders. Game Theory and Applications. N.Y. Nova Science Publ. Vol. III, pp. 115-123.
Tarashnina, S. (2002). Time-consistent solution of a cooperative group pursuit game. International Game Theory Review. Vol. 4, pp. 301-317.