UDC 519.83 Vestnik SPbGU. Applied Mathematics. Computer Science... 2024. Vol. 20. Iss. 1
MSC 91A25
A note on cooperative differential games with pairwise interactions*
Y. He, L. A. Petrosyan
St. Petersburg State University, 7—9, Universitetskaya nab., St. Petersburg, 199034, Russian Federation
For citation: He Y., Petrosyan L. A. A note on cooperative differential games with pairwise interactions. Vestnik of Saint Petersburg University. Applied Mathematics. Computer Science. Control Processes, 2024, vol. 20, iss. 1, pp. 91-108. https://doi.org/10.21638/11701/spbu10.2024.108
In this paper, a differential game with pairwise interactions on a network is proposed. Specifically, the vertices are players and the edges are connections between them. We consider the cooperative case. A special characteristic function is introduced and its convexity is proved. The core is used as a cooperative optimality principle. The characteristic function allows the construction of time-consistent (dynamically stable) solutions, such as the Shapley value and the core. Finally, the results are illustrated by an example.
Keywords: cooperative games, differential network games, pairwise interactions, characteristic function, the Shapley value, time-consistency.
1. Introduction. Pairwise interaction games are a new and important part of modern game theory. The first study of pairwise interaction in the non-cooperative case on a network structure was carried out in [1]. Pairwise interaction games are a proper subclass of the usual graphical or network games [2]. In the case of two strategies, pairwise interaction games on the complete graph can be modeled as congestion games [3].
To the best of the authors' knowledge, research on cooperative games with pairwise interaction has been carried out in [4-8]. In [4], a multistage network game with pairwise interaction was considered in cooperative form for the first time: players play bimatrix games with their neighbors in the network, and sufficient conditions for strong time consistency of the core are formulated. In [5], for a particular class of symmetric networks (star networks), a simplified formula for calculating the components of the Shapley value is obtained, and conditions for strong time consistency of the core are derived. In [6], for multistage cooperative games with pairwise interaction, an analogue of the core is constructed and its strong time consistency is proved. Alternative approaches to constructing the characteristic function for games with pairwise interaction are considered in [7]. In [8], a new characteristic function with lower computational complexity than the classical one is constructed, the IDP-core is proposed, and its strong time consistency is proved.
Since most real-life game situations are dynamic rather than static, network differential games have become a field that attracts theoretical and technical developments [9-11]. Differential game models for studying congested traffic networks were developed in [12]. Cooperative differential games on networks are developed in [13]. A time-consistent Shapley value and τ-value solution in a class of differential network games was proposed in [13]. Later, Tur and Petrosyan gave a simplified formula for the calculation of the Shapley value [14]. They also proved that the core is strongly time-consistent [15]. In cooperative differential game theory, it is important for the solution to be dynamically stable (time-consistent), meaning that players have no incentive to break the agreement as the game develops. The notion of time consistency of differential game solutions was first introduced in [16,17]. The cooperative game model is based on the characteristic function, and in [18] a new characteristic function was introduced for cooperative differential games on networks.
* This research was supported by the Russian Science Foundation (project N 22-11-00051, https://rscf.ru/project/22-11-00051/).
© St. Petersburg State University, 2024
In this paper, we consider a new class of games that is a subclass of differential network games, namely differential games with pairwise interactions. For this class of games, we prove the convexity of the introduced characteristic function. Convexity does not hold in general, and it guarantees that the Shapley value belongs to the core. In addition, the Shapley value is time-consistent, which is an exception in differential and dynamic cooperative game theory. The theory is illustrated by a nontrivial three-person differential pollution control game with pairwise interactions. The results of a numerical simulation for this game are also presented.
The rest of the paper is organized as follows. In Section 2, a class of differential network games with pairwise interaction is described. In Section 3, a special characteristic function is introduced. In Section 4, the core and the Shapley value for the considered class of games are introduced. In Section 5, an illustrative example with different differential games on the various links of the network is considered. In Section 6, conclusions are drawn.
2. A class of differential network games with pairwise interaction. Consider a class of n-person differential network games with pairwise interaction over the time horizon [t0,T]. The players are connected to a network system. Let N = {1,2,...,n} denote the set of players in the network. The nodes of the network are used to represent the players in the network.
A pair $(N, L)$ is called a network, where $N$ is a set of nodes and $L \subset N \times N$ is a given set of arcs. Note that $\mathrm{arc}(i, i) \notin L$. If $\mathrm{arc}(i, j) \in L$, the link $i \leftrightarrow j$ connects players $i$ and $j$, and $j \in K(i)$. All connections are assumed to be undirected. We denote the set of players connected to player $i$ by $K(i) = \{j : \mathrm{arc}(i, j) \in L\}$, for $i \in N$, $i \neq j$, and $\bar K(i) = K(i) \cup \{i\}$.
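The network notation above can be made concrete with a small sketch. The three-player complete network and the helper name `neighbours` are illustrative assumptions, not part of the model description; the code only shows how the neighbor sets $K(i)$ follow from an undirected arc list.

```python
# Hypothetical three-player network in which every pair of players is linked.
N = {1, 2, 3}
L = {(1, 2), (1, 3), (2, 3)}  # undirected arcs, each stored once


def neighbours(i, arcs):
    """Return K(i): the players joined to player i by an arc (i itself excluded)."""
    return {b if a == i else a for (a, b) in arcs if i in (a, b)}


# Neighbor sets for every player in the network.
K = {i: neighbours(i, L) for i in N}
print(K)
```

Storing each undirected arc once and resolving the direction inside `neighbours` keeps the arc list free of duplicates, which matches the assumption that all connections are undirected.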
The state dynamics of the game are given by
$$\dot x_{ij}(\tau) = f_{ij}(x_{ij}(\tau), u_{ij}(\tau), u_{ji}(\tau)), \quad x_{ij}(t_0) = x_{ij}^0 \tag{1}$$
for $\tau \in [t_0, T]$ and $i \in N$, $j \in K(i)$.
Here $x_{ij}(\tau) \in R^m$ is the state variable of player $i$ interacting with player $j \in K(i)$ at time $\tau$, and $u_{ij}(\tau) \in U_{ij}$, $U_{ij} \subset \mathrm{Comp}\,R^l$, is the control variable of player $i$ interacting with player $j$. Every player $i$ plays a differential game with player $j$ according to the network structure. The function $f_{ij}(x_{ij}(\tau), u_{ij}(\tau), u_{ji}(\tau))$ is continuously differentiable in $x_{ij}(\tau)$, $u_{ij}(\tau)$ and $u_{ji}(\tau)$.
Define the payoff of player $i$ on each link (arc) $i \leftrightarrow j$ by
$$K_{ij}(x_{ij}^0, u_{ij}, u_{ji}, T - t_0) = \int_{t_0}^{T} h_i^j(x_{ij}(\tau), u_{ij}(\tau))\, d\tau.$$
Because player $i$ plays several different differential games, each dynamic equation contains player $i$'s control and the control of the neighbor who plays that differential game with him. The payoff of player $i$ depends not only on his own controls $u_i(\tau) = (u_{ij}(\tau), j \in K(i))$ and trajectories $x_i(\tau) = (x_{ij}(\tau), j \in K(i))$, but also on the controls of his neighbors, $u_j(\tau) = (u_{ji}(\tau), i \in K(j))$. Denote $u(\tau) = (u_1(\tau), \dots, u_i(\tau), \dots, u_n(\tau))$, where $u_i(\tau) = (u_{ij}(\tau), j \in K(i))$ is the control variable of player $i$ in the network structure. We use $x_0 = (x_1^0, \dots, x_i^0, \dots, x_n^0)$ to denote the vector of initial conditions, where $x_i^0 = (x_{ij}(t_0), j \in K(i))$ is the set of initial conditions of player $i$.
The payoff function of player $i$ is given by
$$H_i(x_0, u_i, u_j, T - t_0) = \sum_{j \in K(i)} K_{ij}(x_{ij}^0, u_{ij}, u_{ji}, T - t_0) = \sum_{j \in K(i)} \int_{t_0}^{T} h_i^j(x_{ij}(\tau), u_{ij}(\tau))\, d\tau. \tag{2}$$
Here the term $h_i^j(x_{ij}(\tau), u_{ij}(\tau))$ is the instantaneous gain that player $i$ obtains through the network link with player $j$. We also suppose that $h_i^j(x_{ij}(\tau), u_{ij}(\tau))$ is nonnegative.
3. The characteristic function. The game $\Gamma(x_0, T - t_0)$ is defined on the network $(N, L)$, with system dynamics (1) and players' payoffs (2). Player $i \in N$, choosing the control variables $u_{ij}$ from his sets of feasible controls, seeks to maximize his objective functional (2).
Suppose that the players can cooperate to achieve the maximum total payoff
$$\max_{u_1, \dots, u_n} \sum_{i \in N} \sum_{j \in K(i)} \int_{t_0}^{T} h_i^j(x_{ij}(\tau), u_{ij}(\tau))\, d\tau$$
subject to dynamics (1), and that the corresponding optimal cooperative strategies of the players $\bar u(\tau) = (\bar u_1(\tau), \dots, \bar u_i(\tau), \dots, \bar u_n(\tau))$, where $\bar u_i(\tau) = (\bar u_{ij}(\tau), j \in K(i))$, exist. Denote the corresponding cooperative trajectory of player $i$ by $\bar x_{ij}(\tau)$, $i \in N$, $j \in K(i)$. The trajectory $\bar x(\tau) = (\bar x_1(\tau), \dots, \bar x_i(\tau), \dots, \bar x_n(\tau))$ is called the optimal cooperative trajectory, where $\bar x_i(\tau) = (\bar x_{ij}(\tau), j \in K(i))$.
Then the maximal joint payoff can be expressed as
$$\sum_{i \in N} \sum_{j \in K(i)} \int_{t_0}^{T} h_i^j(\bar x_{ij}(\tau), \bar u_{ij}(\tau))\, d\tau.$$
In [18], a new characteristic function is introduced:
$$V(S; x_0, T - t_0) = \sum_{i \in S} \sum_{j \in K(i) \cap S} \int_{t_0}^{T} h_i^j(\bar x_{ij}(\tau), \bar x_{ji}(\tau))\, d\tau + \alpha(S) \sum_{i \in S} \sum_{j \in K(i) \cap (N \setminus S)} \int_{t_0}^{T} h_i^j(\bar x_{ij}(\tau), \bar x_{ji}(\tau))\, d\tau$$
for $S \subset N$. The value of the characteristic function for each coalition is calculated as the joint payoff of the players from the coalition, plus the payoffs (multiplied by a discount factor depending on $S$) obtained on links with players that do not belong to the coalition $S$ but are connected to players from $S$. We consider a special case of this characteristic function, in which $\alpha(S) = \alpha$ does not depend on the coalition $S$, $\alpha \in [0, 1)$.
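Definition 1. The characteristic function $V(S; x_0, T - t_0)$ is defined as
$$V(S; x_0, T - t_0) = \sum_{i \in S} \sum_{j \in K(i) \cap S} \int_{t_0}^{T} h_i^j(\bar x_{ij}(\tau), \bar u_{ij}(\tau))\, d\tau + \alpha \sum_{i \in S} \sum_{j \in K(i) \cap (N \setminus S)} \int_{t_0}^{T} h_i^j(\bar x_{ij}(\tau), \bar u_{ij}(\tau))\, d\tau \tag{3}$$
for $S \subset N$. Here $\alpha \in [0, 1)$. Note that every player $i$ from the coalition $S$ plays independent pairwise differential games with the players $j \in K(i) \cap S$ and also with players outside the coalition $S$ ($j \in K(i) \cap (N \setminus S)$). From (3), for the coalitions $\{i\}$, $\emptyset$, $N$ we get
$$V(\{i\}; x_0, T - t_0) = \alpha \sum_{j \in K(i),\, j \neq i} \int_{t_0}^{T} h_i^j(\bar x_{ij}(\tau), \bar u_{ij}(\tau))\, d\tau, \tag{4}$$
$$V(\emptyset; x_0, T - t_0) = 0,$$
$$V(N; x_0, T - t_0) = \max_{u_1, \dots, u_n} \sum_{i \in N} \sum_{j \in K(i)} \int_{t_0}^{T} h_i^j(x_{ij}(\tau), u_{ij}(\tau))\, d\tau = \sum_{i \in N} \sum_{j \in K(i)} \int_{t_0}^{T} h_i^j(\bar x_{ij}(\tau), \bar u_{ij}(\tau))\, d\tau.$$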
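A minimal sketch of how (3) can be evaluated once the link payoffs along the cooperative trajectory are known. The dictionary `w`, holding one number per ordered pair $(i, j)$ (the integral of $h_i^j$ over $[t_0, T]$), and all numerical values are hypothetical; they are chosen only to make the arithmetic easy to follow.

```python
from itertools import combinations


def char_value(S, w, alpha):
    """Evaluate formula (3) for coalition S.

    w[(i, j)] is the payoff of player i on link i <-> j along the
    cooperative trajectory (hypothetical numbers in this sketch).
    """
    S = frozenset(S)
    # Links whose both endpoints are inside the coalition count in full.
    inside = sum(v for (i, j), v in w.items() if i in S and j in S)
    # Links from a coalition member to an outsider are discounted by alpha.
    outside = sum(v for (i, j), v in w.items() if i in S and j not in S)
    return inside + alpha * outside


# Hypothetical link payoffs on a complete three-player network.
w = {(1, 2): 4.0, (2, 1): 6.0, (1, 3): 3.0, (3, 1): 5.0, (2, 3): 2.0, (3, 2): 8.0}
alpha = 0.1
players = (1, 2, 3)

V = {
    frozenset(c): char_value(c, w, alpha)
    for r in range(len(players) + 1)
    for c in combinations(players, r)
}
for coalition, value in V.items():
    print(sorted(coalition), round(value, 6))
```

For the grand coalition every link is internal, so the value reduces to the sum of all link payoffs, and for a singleton only the discounted term remains, in line with (4).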
Definition 2. The characteristic function $V(S; x_0, T - t_0)$ is called convex (or supermodular) if for any coalitions $S_1, S_2 \subseteq N$ the following condition holds:
$$V(S_1 \cup S_2; x_0, T - t_0) \geq V(S_1; x_0, T - t_0) + V(S_2; x_0, T - t_0) - V(S_1 \cap S_2; x_0, T - t_0).$$
A game is called convex if its characteristic function is convex.
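The supermodularity condition can be tested by brute force for small games. The sketch below assumes the characteristic function is given as a dictionary keyed by `frozenset` coalitions; the two example games are hypothetical and serve only to show one case that passes and one that fails.

```python
from itertools import combinations


def is_convex(V, players, tol=1e-9):
    """Check V(S1 | S2) + V(S1 & S2) >= V(S1) + V(S2) for all coalition pairs."""
    subsets = [
        frozenset(c)
        for r in range(len(players) + 1)
        for c in combinations(players, r)
    ]
    return all(
        V[a | b] + V[a & b] >= V[a] + V[b] - tol
        for a in subsets
        for b in subsets
    )


players = (1, 2, 3)

# Hypothetical game in which larger coalitions gain disproportionately.
V_convex = {
    frozenset(): 0.0,
    frozenset({1}): 0.7, frozenset({2}): 0.8, frozenset({3}): 1.3,
    frozenset({1, 2}): 10.5, frozenset({1, 3}): 9.2, frozenset({2, 3}): 11.1,
    frozenset({1, 2, 3}): 28.0,
}

# Hypothetical game in which every nonempty coalition earns the same amount.
V_flat = {s: (0.0 if not s else 1.0) for s in V_convex}

print(is_convex(V_convex, players))
print(is_convex(V_flat, players))
```

The check enumerates all pairs of coalitions, so it is practical only for a handful of players; it is meant as a numerical sanity check, not a replacement for a proof.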
Proposition 1. The characteristic function $V(S; x_0, T - t_0)$ defined by formula (3), $S \subset N$, is convex.
Proof. See Appendix.
Proposition 2. The characteristic function $V(S; x_0, T - t_0)$ defined by formula (3), $S \subset N$, is time consistent. From (3), we obtain
$$V(S; x_0, T - t_0) = \sum_{i \in S} \sum_{j \in K(i) \cap S} \int_{t_0}^{t} h_i^j(\bar x_{ij}(\tau), \bar u_{ij}(\tau))\, d\tau + \alpha \sum_{i \in S} \sum_{j \in K(i) \cap (N \setminus S)} \int_{t_0}^{t} h_i^j(\bar x_{ij}(\tau), \bar u_{ij}(\tau))\, d\tau + V(S; \bar x(t), T - t), \tag{5}$$
where $\alpha \in [0, 1)$. Equation (5) shows the time consistency property of the cooperative-trajectory characteristic function $V(S; x_0, T - t_0)$.
4. Cooperative game, the Shapley value. Next, we need to determine the rule for allocating the maximum total payoff among the players. This paper considers the core and the Shapley value as the optimality principles. We denote the set of all imputations by $L(x_0, T - t_0)$:
$$L(x_0, T - t_0) = \Big\{ \xi(x_0, T - t_0) = (\xi_1(x_0, T - t_0), \dots, \xi_n(x_0, T - t_0)) : V(N; x_0, T - t_0) = \sum_{i \in N} \xi_i(x_0, T - t_0),\ \xi_i(x_0, T - t_0) \geq V(\{i\}; x_0, T - t_0) \Big\}$$
for $i \in N$.
4.1. The core.
Definition 3. The core $C(x_0, T - t_0)$ is the subset of the imputation set $L(x_0, T - t_0)$ defined as
$$C(x_0, T - t_0) = \Big\{ \xi(x_0, T - t_0) \in L(x_0, T - t_0) : \sum_{i \in S} \xi_i(x_0, T - t_0) \geq V(S; x_0, T - t_0) \Big\}$$
for $S \subset N$.
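Membership in the core can be checked directly from Definition 3. The sketch below assumes the characteristic function is stored as a dictionary keyed by `frozenset`; the game values and the two candidate imputations are hypothetical.

```python
from itertools import combinations


def in_core(xi, V, players, tol=1e-9):
    """Return True if the payoff vector xi satisfies all core inequalities."""
    grand = frozenset(players)
    # Efficiency: the whole value V(N) is distributed.
    if abs(sum(xi[i] for i in players) - V[grand]) > tol:
        return False
    # Coalitional rationality for every proper nonempty coalition.
    for r in range(1, len(players)):
        for c in combinations(players, r):
            if sum(xi[i] for i in c) < V[frozenset(c)] - tol:
                return False
    return True


players = (1, 2, 3)
# Hypothetical characteristic function values.
V = {
    frozenset({1}): 0.7, frozenset({2}): 0.8, frozenset({3}): 1.3,
    frozenset({1, 2}): 10.5, frozenset({1, 3}): 9.2, frozenset({2, 3}): 11.1,
    frozenset({1, 2, 3}): 28.0,
}

print(in_core({1: 9.0, 2: 9.0, 3: 10.0}, V, players))
print(in_core({1: 27.0, 2: 0.5, 3: 0.5}, V, players))
```

The second vector is efficient but gives player 2 less than the value of the singleton coalition, so it is rejected at the first violated inequality.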
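4.2. The Shapley value. Using the newly defined characteristic function, we introduce the Shapley value imputation in this subsection:
$$Sh_i(x_0, T - t_0) = \sum_{S \subseteq N,\, S \ni i} \frac{(|S| - 1)!\,(n - |S|)!}{n!} \left[ V(S; x_0, T - t_0) - V(S \setminus \{i\}; x_0, T - t_0) \right] \tag{6}$$
for $i \in N$. From (6), we get
$$Sh_i(x_0, T - t_0) = \sum_{S \subseteq N,\, S \ni i} \frac{(|S| - 1)!\,(n - |S|)!}{n!} \Bigg[ \sum_{l \in S} \sum_{j \in K(l) \cap S} \int_{t_0}^{T} h_l^j(\bar x_{lj}(\tau), \bar u_{lj}(\tau))\, d\tau + \alpha \sum_{l \in S} \sum_{j \in K(l) \cap (N \setminus S)} \int_{t_0}^{T} h_l^j(\bar x_{lj}(\tau), \bar u_{lj}(\tau))\, d\tau$$
$$- \sum_{l \in S \setminus \{i\}} \sum_{j \in K(l) \cap (S \setminus \{i\})} \int_{t_0}^{T} h_l^j(\bar x_{lj}(\tau), \bar u_{lj}(\tau))\, d\tau - \alpha \sum_{l \in S \setminus \{i\}} \sum_{j \in K(l) \cap (N \setminus (S \setminus \{i\}))} \int_{t_0}^{T} h_l^j(\bar x_{lj}(\tau), \bar u_{lj}(\tau))\, d\tau \Bigg]. \tag{7}$$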
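Formula (6) translates into a short enumeration over coalitions. The sketch below assumes the characteristic function is supplied as a dictionary keyed by `frozenset` (including the empty coalition); the numerical values are hypothetical and are not taken from the example in Section 5.

```python
from itertools import combinations
from math import factorial


def shapley(V, players):
    """Shapley value from formula (6): weighted marginal contributions."""
    n = len(players)
    result = {}
    for i in players:
        others = [p for p in players if p != i]
        total = 0.0
        for r in range(n):
            for c in combinations(others, r):
                without_i = frozenset(c)
                with_i = without_i | {i}
                s = len(with_i)
                weight = factorial(s - 1) * factorial(n - s) / factorial(n)
                total += weight * (V[with_i] - V[without_i])
        result[i] = total
    return result


players = (1, 2, 3)
# Hypothetical characteristic function values.
V = {
    frozenset(): 0.0,
    frozenset({1}): 0.7, frozenset({2}): 0.8, frozenset({3}): 1.3,
    frozenset({1, 2}): 10.5, frozenset({1, 3}): 9.2, frozenset({2, 3}): 11.1,
    frozenset({1, 2, 3}): 28.0,
}

sh = shapley(V, players)
print(sh)
print(sum(sh.values()))  # should reproduce V(N) up to rounding
```

The enumeration visits every coalition containing player $i$, so its cost grows exponentially with the number of players; for the three-player example in this paper that is not a concern.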
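Theorem 1. The Shapley value imputation in (7) satisfies the time consistency property.
Proof. By direct computation we get the formula
$$Sh_i(x_0, T - t_0) = \sum_{S \subseteq N,\, S \ni i} \frac{(|S| - 1)!\,(n - |S|)!}{n!} \Bigg[ \sum_{l \in S} \sum_{j \in K(l) \cap S} \int_{t_0}^{t} h_l^j(\bar x_{lj}(\tau), \bar u_{lj}(\tau))\, d\tau + \alpha \sum_{l \in S} \sum_{j \in K(l) \cap (N \setminus S)} \int_{t_0}^{t} h_l^j(\bar x_{lj}(\tau), \bar u_{lj}(\tau))\, d\tau$$
$$- \sum_{l \in S \setminus \{i\}} \sum_{j \in K(l) \cap (S \setminus \{i\})} \int_{t_0}^{t} h_l^j(\bar x_{lj}(\tau), \bar u_{lj}(\tau))\, d\tau - \alpha \sum_{l \in S \setminus \{i\}} \sum_{j \in K(l) \cap (N \setminus (S \setminus \{i\}))} \int_{t_0}^{t} h_l^j(\bar x_{lj}(\tau), \bar u_{lj}(\tau))\, d\tau \Bigg] + Sh_i(\bar x(t), T - t)$$
for $i \in N$, which exhibits the time consistency property of the Shapley value imputation $Sh_i(\bar x(t), T - t)$ for $t \in [t_0, T]$.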
In most cases, the Shapley value does not satisfy this condition [19-21].
5. Example. Consider the following game-theoretic model of a differential network game. The network structure is shown in Figure 1. There are three players, representing three national or regional factories that participate in the game on the network, N = {1, 2, 3}.
Figure 1. Network structure
Consider arc(1, 2) (a similar game is considered in [22]). Regions 1 and 2 play a pollution game. Each region has an industrial production site. Production is assumed to be proportional to the pollution $u_{12}$ and $u_{21}$. Thus the strategy of each player is to choose the amount of pollutants emitted into the atmosphere, $u_{12} \in [0, b_{12}]$, $b_{12} > 0$, $u_{21} \in [0, b_{21}]$, $b_{21} > 0$. Here $A_{12}$ is the subsidy that the government pays to factory 1 at each moment, and $d_{12} x_{12}(t)$ is the penalty that the environment department imposes on factory 1 at each moment. The dynamics of players 1 and 2 on arc(1, 2) are described by
$$\dot x_{12}(t) = u_{12}(t) + u_{21}(t), \quad x_{12}(t_0) = x_{12}^0, \quad t \in [t_0, T], \tag{8}$$
$$\dot x_{21}(t) = u_{21}(t) + u_{12}(t), \quad x_{21}(t_0) = x_{21}^0, \quad t \in [t_0, T]. \tag{9}$$
The payoff of each player in the pairwise interaction game on arc(1, 2) is defined as
$$K_{12}(x_{12}^0, u_{12}(t), u_{21}(t), T - t_0) = \int_{t_0}^{T} \Big[ \big(b_{12} - \tfrac{1}{2} u_{12}(t)\big) u_{12}(t) - d_{12} x_{12}(t) + A_{12} \Big] dt,$$
$$K_{21}(x_{21}^0, u_{12}(t), u_{21}(t), T - t_0) = \int_{t_0}^{T} \Big[ \big(b_{21} - \tfrac{1}{2} u_{21}(t)\big) u_{21}(t) - d_{21} x_{21}(t) + A_{21} \Big] dt.$$
For arc(1, 3) (a similar game is considered in [23]), we examine another pollution game. The pollution emitted by players 1 and 3 is denoted by $u_{13}$ and $u_{31}$, where $u_{13} \in [0, b_{13}]$, $b_{13} > 0$, $u_{31} \in [0, b_{31}]$, $b_{31} > 0$. Let $x_{13}(t)$ and $x_{31}(t)$ denote the stock of pollution accumulated by time $t$. The dynamics of players 1 and 3 on arc(1, 3) are described by
$$\dot x_{13}(t) = u_{13}(t) + u_{31}(t) - \delta x_{13}, \quad x_{13}(t_0) = x_{13}^0, \quad t \in [t_0, T], \tag{10}$$
$$\dot x_{31}(t) = u_{13}(t) + u_{31}(t) - \delta x_{31}, \quad x_{31}(t_0) = x_{31}^0, \quad t \in [t_0, T], \tag{11}$$
where $\delta$ is the absorption coefficient corresponding to the natural purification of the atmosphere; we assume that $\delta > 0$. Here we do not consider the additional cost. The payoff of each player in the pairwise interaction game on arc(1, 3) is defined as
$$K_{13}(x_{13}^0, u_{13}(t), u_{31}(t), T - t_0) = \int_{t_0}^{T} \Big[ \big(b_{13} - \tfrac{1}{2} u_{13}(t)\big) u_{13}(t) - d_{13} x_{13}(t) + A_{13} \Big] dt,$$
$$K_{31}(x_{31}^0, u_{13}(t), u_{31}(t), T - t_0) = \int_{t_0}^{T} \Big[ \big(b_{31} - \tfrac{1}{2} u_{31}(t)\big) u_{31}(t) - d_{31} x_{31}(t) + A_{31} \Big] dt.$$
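For arc(2, 3) (a similar game is considered in [24]), we examine another pollution game. The dynamics of the stock of pollution for each player on arc(2, 3) are described by
$$\dot x_{23}(t) = \mu (u_{23}(t) + u_{32}(t)) - \varepsilon x_{23}(t), \quad x_{23}(t_0) = x_{23}^0, \quad t \in [t_0, T], \tag{12}$$
$$\dot x_{32}(t) = \mu (u_{32}(t) + u_{23}(t)) - \varepsilon x_{32}(t), \quad x_{32}(t_0) = x_{32}^0, \quad t \in [t_0, T]. \tag{13}$$
Here $\mu > 0$ is the marginal influence of the players' emissions on the pollution accumulation, and $\varepsilon > 0$, $\varepsilon \neq \delta$, is the rate of natural absorption. The payoff of each player on arc(2, 3) is defined as
$$K_{23}(x_{23}^0, u_{23}(t), u_{32}(t), T - t_0) = \int_{t_0}^{T} \Big[ \big(b_{23} - \tfrac{1}{2} u_{23}(t)\big) u_{23}(t) - d_{23} x_{23}(t) + A_{23} \Big] dt,$$
$$K_{32}(x_{32}^0, u_{23}(t), u_{32}(t), T - t_0) = \int_{t_0}^{T} \Big[ \big(b_{32} - \tfrac{1}{2} u_{32}(t)\big) u_{32}(t) - d_{32} x_{32}(t) + A_{32} \Big] dt.$$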
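In the network game, where each player has several links, the payoff of each player is defined as
$$H_1(x_1^0, u_{12}(t), u_{21}(t), u_{13}(t), u_{31}(t), T - t_0) = \int_{t_0}^{T} \Big[ \big(b_{12} - \tfrac{1}{2} u_{12}(t)\big) u_{12}(t) - d_{12} x_{12}(t) + A_{12} \Big] dt + \int_{t_0}^{T} \Big[ \big(b_{13} - \tfrac{1}{2} u_{13}(t)\big) u_{13}(t) - d_{13} x_{13}(t) + A_{13} \Big] dt,$$
$$H_2(x_2^0, u_{12}(t), u_{21}(t), u_{23}(t), u_{32}(t), T - t_0) = \int_{t_0}^{T} \Big[ \big(b_{21} - \tfrac{1}{2} u_{21}(t)\big) u_{21}(t) - d_{21} x_{21}(t) + A_{21} \Big] dt + \int_{t_0}^{T} \Big[ \big(b_{23} - \tfrac{1}{2} u_{23}(t)\big) u_{23}(t) - d_{23} x_{23}(t) + A_{23} \Big] dt,$$
$$H_3(x_3^0, u_{13}(t), u_{31}(t), u_{23}(t), u_{32}(t), T - t_0) = \int_{t_0}^{T} \Big[ \big(b_{31} - \tfrac{1}{2} u_{31}(t)\big) u_{31}(t) - d_{31} x_{31}(t) + A_{31} \Big] dt + \int_{t_0}^{T} \Big[ \big(b_{32} - \tfrac{1}{2} u_{32}(t)\big) u_{32}(t) - d_{32} x_{32}(t) + A_{32} \Big] dt,$$
subject to dynamics (8)-(13).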
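Under cooperation, the players maximize the total payoff
$$V(N; x_0, T - t_0) = \max_{u_{12}, u_{21}, u_{13}, u_{31}, u_{23}, u_{32}} \sum_{i \in N} \sum_{j \in K(i)} \int_{t_0}^{T} \Big[ \big(b_{ij} - \tfrac{1}{2} u_{ij}\big) u_{ij} - d_{ij} x_{ij} + A_{ij} \Big] dt.$$
We use the Pontryagin Maximum Principle (PMP) to solve this optimization problem. First, write down the Hamiltonian function:
$$H(x_0, T - t_0, u(t), \psi) = \sum_{i \in \{1, 2, 3\}} \sum_{j \in K(i)} \Big[ \big(b_{ij} - \tfrac{1}{2} u_{ij}\big) u_{ij} - d_{ij} x_{ij} + A_{ij} \Big] + \psi_{12}(u_{12} + u_{21}) + \psi_{21}(u_{21} + u_{12})$$
$$+ \psi_{13}(u_{13} + u_{31} - \delta x_{13}) + \psi_{31}(u_{31} + u_{13} - \delta x_{31}) + \psi_{23}(\mu(u_{23} + u_{32}) - \varepsilon x_{23}) + \psi_{32}(\mu(u_{32} + u_{23}) - \varepsilon x_{32}).$$
Here the adjoint variables $\psi_{ij}(t)$ satisfy the boundary conditions
$$\psi_{ij}(T) = 0.$$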
Taking the first derivative with respect to $u_{12}$, we get the expression for the optimal control:
$$\bar u_{12}(t) = b_{12} + (\psi_{12} + \psi_{21}).$$
The canonical system is written as
$$\dot x_{12} = u_{12} + u_{21} = b + 2(\psi_{12} + \psi_{21}), \tag{14}$$
$$\dot \psi_{12} = d_{12}, \quad \dot \psi_{21} = d_{21},$$
where $b = b_{12} + b_{21}$.
Recall that the initial condition is $x_{12}(t_0) = x_{12}^0$. Using also the boundary conditions $\psi_{12}(T) = \psi_{21}(T) = 0$, we get
$$\psi_{12}(t) = -d_{12}(T - t),$$
$$\psi_{21}(t) = -d_{21}(T - t).$$
Substituting this solution into the differential equation (14), we obtain the expression for $\bar x_{12}(t)$:
$$\bar x_{12}(t) = d t^2 - d t_0^2 + (b - 2dT)t + (-b + 2Td)t_0 + x_{12}^0,$$
where $d = d_{12} + d_{21}$. The optimal control is
$$\bar u_{12}(t) = b_{12} - d(T - t).$$
Similarly, we get the optimal trajectories:
$$\bar x_{21}(t) = d t^2 - d t_0^2 + (b - 2dT)t + (-b + 2Td)t_0 + x_{21}^0,$$
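$$\bar x_{13}(t) = C_{13} e^{-\delta t} + \frac{b}{\delta} + \frac{e^{-\delta(T - t)} d}{\delta^2} - \frac{2d}{\delta^2},$$
where $C_{13} = e^{\delta t_0} \Big( x_{13}^0 - \frac{b}{\delta} - \frac{e^{-\delta(T - t_0)} d}{\delta^2} + \frac{2d}{\delta^2} \Big)$, $b = b_{13} + b_{31}$, $d = d_{13} + d_{31}$;
$$\bar x_{31}(t) = C_{31} e^{-\delta t} + \frac{b}{\delta} + \frac{e^{-\delta(T - t)} d}{\delta^2} - \frac{2d}{\delta^2},$$
where $C_{31} = e^{\delta t_0} \Big( x_{31}^0 - \frac{b}{\delta} - \frac{e^{-\delta(T - t_0)} d}{\delta^2} + \frac{2d}{\delta^2} \Big)$;
$$\bar x_{23}(t) = C_{23} e^{-\varepsilon t} + \frac{\mu b}{\varepsilon} + \frac{e^{-\varepsilon(T - t)} d \mu^2}{\varepsilon^2} - \frac{2 d \mu^2}{\varepsilon^2},$$
where $C_{23} = e^{\varepsilon t_0} \Big( x_{23}^0 - \frac{\mu b}{\varepsilon} - \frac{e^{-\varepsilon(T - t_0)} d \mu^2}{\varepsilon^2} + \frac{2 d \mu^2}{\varepsilon^2} \Big)$, $d = d_{23} + d_{32}$, $b = b_{23} + b_{32}$;
$$\bar x_{32}(t) = C_{32} e^{-\varepsilon t} + \frac{\mu b}{\varepsilon} + \frac{e^{-\varepsilon(T - t)} d \mu^2}{\varepsilon^2} - \frac{2 d \mu^2}{\varepsilon^2},$$
where $C_{32} = e^{\varepsilon t_0} \Big( x_{32}^0 - \frac{\mu b}{\varepsilon} - \frac{e^{-\varepsilon(T - t_0)} d \mu^2}{\varepsilon^2} + \frac{2 d \mu^2}{\varepsilon^2} \Big)$, $d = d_{23} + d_{32}$, $b = b_{23} + b_{32}$. The corresponding optimal controls are
$$\bar u_{21}(t) = b_{21} - d(T - t), \quad d = d_{12} + d_{21},$$
$$\bar u_{13}(t) = b_{13} - \frac{d}{\delta}\big(1 - e^{-\delta(T - t)}\big), \quad \bar u_{31}(t) = b_{31} - \frac{d}{\delta}\big(1 - e^{-\delta(T - t)}\big), \quad d = d_{13} + d_{31},$$
$$\bar u_{23}(t) = b_{23} - \frac{\mu d}{\varepsilon}\big(1 - e^{-\varepsilon(T - t)}\big), \quad \bar u_{32}(t) = b_{32} - \frac{\mu d}{\varepsilon}\big(1 - e^{-\varepsilon(T - t)}\big), \quad d = d_{23} + d_{32}.$$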
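As a sanity check on the arc(1, 2) solution, the closed-form trajectory $\bar x_{12}(t)$ can be compared with a direct numerical integration of (8) under the controls $\bar u_{12}(t) = b_{12} - d(T - t)$ and $\bar u_{21}(t) = b_{21} - d(T - t)$. The sketch below uses the arc(1, 2) parameter values assumed in the numerical part of this example; the midpoint integrator and the function names are illustrative choices, not the method used to produce the reported results.

```python
# Arc (1, 2) parameters as assumed in the numerical part of the example.
b12, b21 = 200.0, 250.0
d12, d21 = 1.0, 1.5
t0, T = 0.0, 5.0
x12_0 = 10.0
b, d = b12 + b21, d12 + d21


def u12(t):
    """Optimal emission of player 1 on arc (1, 2)."""
    return b12 - d * (T - t)


def u21(t):
    """Optimal emission of player 2 on arc (1, 2)."""
    return b21 - d * (T - t)


def x12_closed(t):
    """Closed-form cooperative trajectory on arc (1, 2)."""
    return d * t**2 - d * t0**2 + (b - 2 * d * T) * t + (-b + 2 * T * d) * t0 + x12_0


def x12_numeric(t_end, steps=1000):
    """Integrate dx/dt = u12 + u21 with the midpoint rule."""
    h = (t_end - t0) / steps
    x, t = x12_0, t0
    for _ in range(steps):
        x += h * (u12(t + h / 2) + u21(t + h / 2))
        t += h
    return x


for t in (0.0, 2.5, 5.0):
    print(t, x12_closed(t), x12_numeric(t))
```

Because the right-hand side is linear in $t$, the midpoint rule reproduces the quadratic closed form up to rounding, and the controls stay inside $[0, b_{12}]$ and $[0, b_{21}]$ for these parameter values.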
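The values of the characteristic function for the remaining coalitions are
$$V(\{1\}; x_0, T - t_0) = \alpha \Big( \int_{t_0}^{T} \Big[ \big(b_{12} - \tfrac{1}{2} \bar u_{12}(t)\big) \bar u_{12}(t) - d_{12} \bar x_{12}(t) + A_{12} \Big] dt + \int_{t_0}^{T} \Big[ \big(b_{13} - \tfrac{1}{2} \bar u_{13}(t)\big) \bar u_{13}(t) - d_{13} \bar x_{13}(t) + A_{13} \Big] dt \Big),$$
$$V(\{2\}; x_0, T - t_0) = \alpha \Big( \int_{t_0}^{T} \Big[ \big(b_{21} - \tfrac{1}{2} \bar u_{21}(t)\big) \bar u_{21}(t) - d_{21} \bar x_{21}(t) + A_{21} \Big] dt + \int_{t_0}^{T} \Big[ \big(b_{23} - \tfrac{1}{2} \bar u_{23}(t)\big) \bar u_{23}(t) - d_{23} \bar x_{23}(t) + A_{23} \Big] dt \Big),$$
$$V(\{3\}; x_0, T - t_0) = \alpha \Big( \int_{t_0}^{T} \Big[ \big(b_{31} - \tfrac{1}{2} \bar u_{31}(t)\big) \bar u_{31}(t) - d_{31} \bar x_{31}(t) + A_{31} \Big] dt + \int_{t_0}^{T} \Big[ \big(b_{32} - \tfrac{1}{2} \bar u_{32}(t)\big) \bar u_{32}(t) - d_{32} \bar x_{32}(t) + A_{32} \Big] dt \Big),$$
$$V(\{1, 2\}; x_0, T - t_0) = \int_{t_0}^{T} \Big[ \big(b_{12} - \tfrac{1}{2} \bar u_{12}(t)\big) \bar u_{12}(t) - d_{12} \bar x_{12}(t) + A_{12} \Big] dt + \int_{t_0}^{T} \Big[ \big(b_{21} - \tfrac{1}{2} \bar u_{21}(t)\big) \bar u_{21}(t) - d_{21} \bar x_{21}(t) + A_{21} \Big] dt$$
$$+ \alpha \int_{t_0}^{T} \Big[ \big(b_{13} - \tfrac{1}{2} \bar u_{13}(t)\big) \bar u_{13}(t) - d_{13} \bar x_{13}(t) + A_{13} \Big] dt + \alpha \int_{t_0}^{T} \Big[ \big(b_{23} - \tfrac{1}{2} \bar u_{23}(t)\big) \bar u_{23}(t) - d_{23} \bar x_{23}(t) + A_{23} \Big] dt,$$
$$V(\{1, 3\}; x_0, T - t_0) = \int_{t_0}^{T} \Big[ \big(b_{13} - \tfrac{1}{2} \bar u_{13}(t)\big) \bar u_{13}(t) - d_{13} \bar x_{13}(t) + A_{13} \Big] dt + \int_{t_0}^{T} \Big[ \big(b_{31} - \tfrac{1}{2} \bar u_{31}(t)\big) \bar u_{31}(t) - d_{31} \bar x_{31}(t) + A_{31} \Big] dt$$
$$+ \alpha \int_{t_0}^{T} \Big[ \big(b_{12} - \tfrac{1}{2} \bar u_{12}(t)\big) \bar u_{12}(t) - d_{12} \bar x_{12}(t) + A_{12} \Big] dt + \alpha \int_{t_0}^{T} \Big[ \big(b_{32} - \tfrac{1}{2} \bar u_{32}(t)\big) \bar u_{32}(t) - d_{32} \bar x_{32}(t) + A_{32} \Big] dt,$$
$$V(\{2, 3\}; x_0, T - t_0) = \int_{t_0}^{T} \Big[ \big(b_{23} - \tfrac{1}{2} \bar u_{23}(t)\big) \bar u_{23}(t) - d_{23} \bar x_{23}(t) + A_{23} \Big] dt + \int_{t_0}^{T} \Big[ \big(b_{32} - \tfrac{1}{2} \bar u_{32}(t)\big) \bar u_{32}(t) - d_{32} \bar x_{32}(t) + A_{32} \Big] dt$$
$$+ \alpha \int_{t_0}^{T} \Big[ \big(b_{21} - \tfrac{1}{2} \bar u_{21}(t)\big) \bar u_{21}(t) - d_{21} \bar x_{21}(t) + A_{21} \Big] dt + \alpha \int_{t_0}^{T} \Big[ \big(b_{31} - \tfrac{1}{2} \bar u_{31}(t)\big) \bar u_{31}(t) - d_{31} \bar x_{31}(t) + A_{31} \Big] dt.$$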
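Remark. The instantaneous payoff in the game is $\big(b_{ij} - \tfrac{1}{2} u_{ij}(t)\big) u_{ij}(t) - d_{ij} x_{ij}(t) + A_{ij}$. Since $\big(b_{ij} - \tfrac{1}{2} u_{ij}(t)\big) u_{ij}(t) \geq 0$ for $u_{ij} \in [0, b_{ij}]$, if $A_{ij} \geq \max_{x_{ij}(t)} (d_{ij} x_{ij}(t))$, $t \in [t_0, T]$, then all instantaneous payoffs of each player are non-negative at any time $t$. Because $u_{ij}(t) \in [0, b_{ij}]$, $u_{ji}(t) \in [0, b_{ji}]$, if we treat the controls as constants and integrate the differential equations (8)-(13), we obtain:
$$x_{12}(t) = (u_{12} + u_{21})(t - t_0) + x_{12}^0, \quad x_{21}(t) = (u_{12} + u_{21})(t - t_0) + x_{21}^0,$$
$$x_{13}(t) = e^{-\delta(t - t_0)} x_{13}^0 + \frac{u_{13} + u_{31}}{\delta} \big(1 - e^{-\delta(t - t_0)}\big), \quad x_{31}(t) = e^{-\delta(t - t_0)} x_{31}^0 + \frac{u_{13} + u_{31}}{\delta} \big(1 - e^{-\delta(t - t_0)}\big),$$
$$x_{23}(t) = e^{-\varepsilon(t - t_0)} x_{23}^0 + \frac{\mu(u_{23} + u_{32})}{\varepsilon} \big(1 - e^{-\varepsilon(t - t_0)}\big), \quad x_{32}(t) = e^{-\varepsilon(t - t_0)} x_{32}^0 + \frac{\mu(u_{23} + u_{32})}{\varepsilon} \big(1 - e^{-\varepsilon(t - t_0)}\big).$$
Additional conditions (in each line, $b$ denotes the sum of the two emission bounds on the corresponding arc):
$$A_{12} \geq \max_{x_{12}} (d_{12} x_{12}(t)) = d_{12} \big[ b(T - t_0) + x_{12}^0 \big],$$
$$A_{21} \geq \max_{x_{21}} (d_{21} x_{21}(t)) = d_{21} \big[ b(T - t_0) + x_{21}^0 \big],$$
$$A_{13} \geq \max_{x_{13}} (d_{13} x_{13}(t)) = d_{13} \Big[ x_{13}^0 e^{-\delta(T - t_0)} + \frac{b}{\delta} \big(1 - e^{-\delta(T - t_0)}\big) \Big],$$
$$A_{31} \geq \max_{x_{31}} (d_{31} x_{31}(t)) = d_{31} \Big[ x_{31}^0 e^{-\delta(T - t_0)} + \frac{b}{\delta} \big(1 - e^{-\delta(T - t_0)}\big) \Big],$$
$$A_{23} \geq \max_{x_{23}} (d_{23} x_{23}(t)) = d_{23} \Big[ x_{23}^0 e^{-\varepsilon(T - t_0)} + \frac{\mu b}{\varepsilon} \big(1 - e^{-\varepsilon(T - t_0)}\big) \Big],$$
$$A_{32} \geq \max_{x_{32}} (d_{32} x_{32}(t)) = d_{32} \Big[ x_{32}^0 e^{-\varepsilon(T - t_0)} + \frac{\mu b}{\varepsilon} \big(1 - e^{-\varepsilon(T - t_0)}\big) \Big].$$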
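The lower bounds on the subsidies $A_{ij}$ can be evaluated numerically. The sketch below plugs in the parameter values assumed in the numerical part of this example, reading the absorption coefficient on arc(1, 3) as 0.3, the emission multiplier on arc(2, 3) as 0.8, and the absorption rate on arc(2, 3) as 0.08; the function name `subsidy_bound` and the dictionary layout are illustrative assumptions.

```python
from math import exp

# Parameter values as assumed in the numerical part of the example.
t0, T = 0.0, 5.0
delta, mu, eps = 0.3, 0.8, 0.08
b = {(1, 2): 200.0, (2, 1): 250.0, (1, 3): 300.0, (3, 1): 400.0, (2, 3): 350.0, (3, 2): 500.0}
d = {(1, 2): 1.0, (2, 1): 1.5, (1, 3): 2.5, (3, 1): 3.0, (2, 3): 2.5, (3, 2): 3.5}
x0 = {(1, 2): 10.0, (2, 1): 20.0, (1, 3): 25.0, (3, 1): 30.0, (2, 3): 50.0, (3, 2): 40.0}


def subsidy_bound(i, j):
    """Right-hand side of the additional condition on A_ij.

    The state is evaluated at time T with both players emitting at their
    upper bounds, which is where the penalty d_ij * x_ij is largest here.
    """
    b_sum = b[(i, j)] + b[(j, i)]
    if {i, j} == {1, 2}:
        x_max = b_sum * (T - t0) + x0[(i, j)]
    elif {i, j} == {1, 3}:
        decay = exp(-delta * (T - t0))
        x_max = x0[(i, j)] * decay + b_sum / delta * (1 - decay)
    else:
        decay = exp(-eps * (T - t0))
        x_max = x0[(i, j)] * decay + mu * b_sum / eps * (1 - decay)
    return d[(i, j)] * x_max


for link in sorted(b):
    print(link, round(subsidy_bound(*link), 3))
```

Comparing the printed bounds with the chosen $A_{ij}$ values gives a quick check that the non-negativity assumption on the instantaneous payoffs is met for a given parameter set.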
We now compute the core and the Shapley value. Assume the following values of the parameters: $b_{12} = 200$, $b_{21} = 250$, $b_{13} = 300$, $b_{31} = 400$, $b_{23} = 350$, $b_{32} = 500$, $d_{12} = 1$, $d_{21} = 1.5$, $d_{13} = 2.5$, $d_{31} = 3$, $d_{23} = 2.5$, $d_{32} = 3.5$, $\delta = 0.3$, $\mu = 0.8$, $\varepsilon = 0.08$, $\alpha = 0.1$, $t_0 = 0$, $t = 2.5$, $T = 5$, $x_{12}^0 = 10$, $x_{21}^0 = 20$, $x_{13}^0 = 25$, $x_{31}^0 = 30$, $x_{23}^0 = 50$, $x_{32}^0 = 40$, $A_{12} = 2260$, $A_{21} = 3405$, $A_{13} = 4545.69$, $A_{31} = 5458.17$, $A_{23} = 7089.5$, $A_{32} = 9901.535$, $V(N; x_0, T - t_0) = 1.975334 \cdot 10^6$. Then we calculate
$$V(\{1\}; x_0, T - t_0) = 3.383075 \cdot 10^4,$$
$$V(\{2\}; x_0, T - t_0) = 4.807020 \cdot 10^4,$$
$$V(\{3\}; x_0, T - t_0) = 1.156325 \cdot 10^5,$$
$$V(\{1, 2\}; x_0, T - t_0) = 3.254166 \cdot 10^5,$$
$$V(\{1, 3\}; x_0, T - t_0) = 8.226729 \cdot 10^5,$$
$$V(\{2, 3\}; x_0, T - t_0) = 1.024778 \cdot 10^6,$$
$$Sh(x_0; T - t_0) = (4.921934 \cdot 10^5,\ 6.003659 \cdot 10^5,\ 8.827752 \cdot 10^5).$$
The numerical results are displayed in Figures 2-5. In Figure 2, we plot the optimal controls $\bar u_{ij}$ and the optimal trajectories $\bar x_{ij}$, $i \in \{1, 2, 3\}$. Three graphical interpretations of the obtained results are given. Figures 3-5 show the domains corresponding to the feasible imputation set $L(x_0, T - t_0)$ and the core $C(x_0, T - t_0)$ constructed using $V(S; x_0, T - t_0)$. The shaded region represents the core, and the blank star represents the Shapley value imputation. Figure 3 represents the game on the time interval from $t_0$ to $T$. Figure 4 shows the subgame on the time interval $[t_0, t]$, and Figure 5 shows the subgame on the time interval $[t, T]$. In our case, $t_0$, $t$, $T$ take the values given above. The resulting Shapley value imputation belongs to the core of the initial game.
Table. Values of the characteristic function
S         V(S, x0, t - t0)    V(S, x(t), T - t)    V(S, x0, T - t0)
N         1.003462 · 10^6     9.941128 · 10^5      1.975334 · 10^6
{1}       1.733632 · 10^4     1.787801 · 10^4      1.649443 · 10^4
{2}       2.448433 · 10^4     2.358587 · 10^4      4.807020 · 10^4
{3}       5.852549 · 10^4     5.710699 · 10^4      1.156325 · 10^5
{1, 2}    1.666546 · 10^5     1.587619 · 10^5      3.254166 · 10^5
{1, 3}    4.180839 · 10^5     4.045889 · 10^5      8.226729 · 10^5
{2, 3}    5.190691 · 10^5     5.057093 · 10^5      1.024778 · 10^6
To illustrate the time consistency, we choose the Shapley value as the cooperative solution. Then the payoffs of the players over the time period $[0, t]$ are $(2.508643 \cdot 10^5,\ 3.049309 \cdot 10^5,\ 4.476661 \cdot 10^5)$, and over the time period $[t, T]$ they are $(2.584275 \cdot 10^5,\ 2.954349 \cdot 10^5,\ 4.351089 \cdot 10^5)$. Furthermore, as the Table shows, the values of the characteristic function are also time consistent.
Figure 3. Cooperative game at $[t_0, T]$ (× is the Shapley value, also in Figures 4, 5). Labelled vertices: (920452.18, 24484.33, 58525.49), (17336.32, 927600.19, 58525.49)
Figure 4. Cooperative game at $[t_0, t]$. Labelled vertices: (913419.94, 23585.87, 57106.99), (17878.01, 919127.8, 57106.99)
Figure 5. Cooperative game at $[t, T]$. Labelled vertices: (861075.3, 48070.2, 115632.5), (16494.43, 892651.07, 115632.5)
6. Conclusion. In this paper, we studied differential games with pairwise interactions, a new class of games in game theory. This class makes it possible to introduce a new characteristic function for the game. The convexity of the characteristic function is proved. For the cooperative case, we considered the Shapley value and the core as solutions. The key feature of this research is the differential game with pairwise interactions, in which each player can play several different differential games. Finally, the results are illustrated by an example.
Appendix. Proof of Proposition 1. We introduce the additional notation
$$w_{ij} = \int_{t_0}^{T} h_i^j(\bar x_{ij}(\tau), \bar u_{ij}(\tau))\, d\tau.$$
Using (3), we can rewrite:
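$$V(S_1 \cup S_2; x_0, T - t_0) = \sum_{\substack{i \in S_1 \cup S_2 \\ j \in K(i) \cap (S_1 \cup S_2)}} w_{ij} + \alpha \sum_{\substack{i \in S_1 \cup S_2 \\ j \in K(i) \cap (N \setminus (S_1 \cup S_2))}} w_{ij} =$$
$$= \sum_{\substack{i \in S_1 \setminus S_2 \\ j \in K(i) \cap (S_1 \setminus S_2)}} w_{ij} + \sum_{\substack{i \in S_1 \setminus S_2 \\ j \in K(i) \cap (S_2 \setminus S_1)}} w_{ij} + \sum_{\substack{i \in S_1 \setminus S_2 \\ j \in K(i) \cap (S_1 \cap S_2)}} w_{ij} + \sum_{\substack{i \in S_2 \setminus S_1 \\ j \in K(i) \cap (S_2 \cap S_1)}} w_{ij} +$$
$$+ \sum_{\substack{i \in S_2 \setminus S_1 \\ j \in K(i) \cap (S_2 \setminus S_1)}} w_{ij} + \sum_{\substack{i \in S_2 \setminus S_1 \\ j \in K(i) \cap (S_1 \setminus S_2)}} w_{ij} + \sum_{\substack{i \in S_1 \cap S_2 \\ j \in K(i) \cap (S_1 \setminus S_2)}} w_{ij} + \sum_{\substack{i \in S_1 \cap S_2 \\ j \in K(i) \cap (S_1 \cap S_2)}} w_{ij} +$$
$$+ \sum_{\substack{i \in S_1 \cap S_2 \\ j \in K(i) \cap (S_2 \setminus S_1)}} w_{ij} + \alpha \sum_{\substack{i \in S_1 \cap S_2 \\ j \in K(i) \cap (N \setminus (S_1 \cup S_2))}} w_{ij} + \alpha \sum_{\substack{i \in S_1 \setminus S_2 \\ j \in K(i) \cap (N \setminus (S_1 \cup S_2))}} w_{ij} + \alpha \sum_{\substack{i \in S_2 \setminus S_1 \\ j \in K(i) \cap (N \setminus (S_1 \cup S_2))}} w_{ij}, \tag{15}$$
$$V(S_1; x_0, T - t_0) = \sum_{\substack{i \in S_1 \\ j \in K(i) \cap S_1}} w_{ij} + \alpha \sum_{\substack{i \in S_1 \\ j \in K(i) \cap (N \setminus S_1)}} w_{ij} = \sum_{\substack{i \in S_1 \setminus S_2 \\ j \in K(i) \cap (S_1 \cap S_2)}} w_{ij} + \sum_{\substack{i \in S_1 \setminus S_2 \\ j \in K(i) \cap (S_1 \setminus S_2)}} w_{ij} + \sum_{\substack{i \in S_1 \cap S_2 \\ j \in K(i) \cap (S_1 \cap S_2)}} w_{ij} +$$
$$+ \sum_{\substack{i \in S_1 \cap S_2 \\ j \in K(i) \cap (S_1 \setminus S_2)}} w_{ij} + \alpha \sum_{\substack{i \in S_1 \setminus S_2 \\ j \in K(i) \cap (N \setminus (S_1 \cup S_2))}} w_{ij} + \alpha \sum_{\substack{i \in S_1 \setminus S_2 \\ j \in K(i) \cap (S_2 \setminus S_1)}} w_{ij} + \alpha \sum_{\substack{i \in S_1 \cap S_2 \\ j \in K(i) \cap (N \setminus (S_1 \cup S_2))}} w_{ij} + \alpha \sum_{\substack{i \in S_1 \cap S_2 \\ j \in K(i) \cap (S_2 \setminus S_1)}} w_{ij}, \tag{16}$$
$$V(S_2; x_0, T - t_0) = \sum_{\substack{i \in S_2 \\ j \in K(i) \cap S_2}} w_{ij} + \alpha \sum_{\substack{i \in S_2 \\ j \in K(i) \cap (N \setminus S_2)}} w_{ij} = \sum_{\substack{i \in S_2 \setminus S_1 \\ j \in K(i) \cap (S_2 \setminus S_1)}} w_{ij} + \sum_{\substack{i \in S_2 \setminus S_1 \\ j \in K(i) \cap (S_2 \cap S_1)}} w_{ij} + \sum_{\substack{i \in S_2 \cap S_1 \\ j \in K(i) \cap (S_2 \cap S_1)}} w_{ij} +$$
$$+ \sum_{\substack{i \in S_2 \cap S_1 \\ j \in K(i) \cap (S_2 \setminus S_1)}} w_{ij} + \alpha \sum_{\substack{i \in S_2 \setminus S_1 \\ j \in K(i) \cap (N \setminus (S_2 \cup S_1))}} w_{ij} + \alpha \sum_{\substack{i \in S_2 \setminus S_1 \\ j \in K(i) \cap (S_1 \setminus S_2)}} w_{ij} + \alpha \sum_{\substack{i \in S_2 \cap S_1 \\ j \in K(i) \cap (N \setminus (S_2 \cup S_1))}} w_{ij} + \alpha \sum_{\substack{i \in S_2 \cap S_1 \\ j \in K(i) \cap (S_1 \setminus S_2)}} w_{ij}, \tag{17}$$
$$V(S_1 \cap S_2; x_0, T - t_0) = \sum_{\substack{i \in S_1 \cap S_2 \\ j \in K(i) \cap (S_1 \cap S_2)}} w_{ij} + \alpha \sum_{\substack{i \in S_1 \cap S_2 \\ j \in K(i) \cap (N \setminus (S_1 \cap S_2))}} w_{ij} =$$
$$= \sum_{\substack{i \in S_1 \cap S_2 \\ j \in K(i) \cap (S_1 \cap S_2)}} w_{ij} + \alpha \sum_{\substack{i \in S_1 \cap S_2 \\ j \in K(i) \cap (N \setminus (S_1 \cup S_2))}} w_{ij} + \alpha \sum_{\substack{i \in S_1 \cap S_2 \\ j \in K(i) \cap (S_1 \setminus S_2)}} w_{ij} + \alpha \sum_{\substack{i \in S_1 \cap S_2 \\ j \in K(i) \cap (S_2 \setminus S_1)}} w_{ij}. \tag{18}$$
Subtracting the expressions (16), (17) from (15) and adding (18), we obtain
$$V(S_1 \cup S_2; x_0, T - t_0) - V(S_1; x_0, T - t_0) - V(S_2; x_0, T - t_0) + V(S_1 \cap S_2; x_0, T - t_0) = (1 - \alpha) \Bigg( \sum_{\substack{i \in S_1 \setminus S_2 \\ j \in K(i) \cap (S_2 \setminus S_1)}} w_{ij} + \sum_{\substack{i \in S_2 \setminus S_1 \\ j \in K(i) \cap (S_1 \setminus S_2)}} w_{ij} \Bigg) \geq 0.$$
The inequality follows from the non-negativity of payoffs and from $\alpha \in [0, 1)$. The statement of the proposition is proved.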
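The closing identity of the proof can be checked numerically. The sketch below draws random non-negative link payoffs on a hypothetical complete network with five players and compares the left-hand side $V(S_1 \cup S_2) - V(S_1) - V(S_2) + V(S_1 \cap S_2)$ with the cross-link term multiplied by $(1 - \alpha)$ for every pair of coalitions. The network size, the random payoffs, and the value of $\alpha$ are assumptions made only for this check.

```python
import random
from itertools import combinations


def char_value(S, w, alpha):
    """Characteristic function (3) written in terms of link payoffs w[(i, j)]."""
    inside = sum(v for (i, j), v in w.items() if i in S and j in S)
    outside = sum(v for (i, j), v in w.items() if i in S and j not in S)
    return inside + alpha * outside


def convexity_gap(S1, S2, w, alpha):
    """V(S1 | S2) - V(S1) - V(S2) + V(S1 & S2)."""
    return (char_value(S1 | S2, w, alpha) - char_value(S1, w, alpha)
            - char_value(S2, w, alpha) + char_value(S1 & S2, w, alpha))


def cross_link_term(S1, S2, w, alpha):
    """(1 - alpha) times the payoffs on links between S1 \\ S2 and S2 \\ S1."""
    only1, only2 = S1 - S2, S2 - S1
    cross = sum(v for (i, j), v in w.items()
                if (i in only1 and j in only2) or (i in only2 and j in only1))
    return (1 - alpha) * cross


random.seed(0)
players = range(1, 6)
# Hypothetical non-negative link payoffs on a complete five-player network.
w = {(i, j): random.uniform(0.0, 10.0) for i in players for j in players if i != j}
alpha = 0.3
subsets = [frozenset(c) for r in range(6) for c in combinations(players, r)]

max_dev = max(abs(convexity_gap(a, b, w, alpha) - cross_link_term(a, b, w, alpha))
              for a in subsets for b in subsets)
min_gap = min(convexity_gap(a, b, w, alpha) for a in subsets for b in subsets)
print(max_dev, min_gap)
```

A deviation at the level of floating-point rounding and a non-negative minimum gap are what the proof predicts; a passing run on random data supports the algebra but does not replace it.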
References
1. Dyer M., Mohanaraj V. Pairwise-interaction games. International Colloquium on Automata, Languages, and Programming. Berlin, Heidelberg, Springer, 2011, pp. 159-170.
2. Cheng S. F., Reeves D. M., Vorobeychik Y., Wellman M. P. Notes on equilibria in symmetric games. Proceedings of the 6th International Workshop on game theoretic and decision theoretic agents, GTDT, Research Collection School of Computing and Information Systems, 2004, pp. 71-78.
3. Bulgakova M. A. Reshenija setevyh igr s poparnym vzaimodejstviem [Solutions of network games with pairwise interactions]. Vestnik of Saint Petersburg University. Applied Mathematics. Computer Science. Control Processes, 2019, vol. 15, iss. 1, pp. 147-156. https://doi.org/10.21638/11702/spbu10.2019.112 (In Russian)
4. Bulgakova M. A., Petrosyan L. A. Kooperativnye setevye igry s poparnym vzaimodejstviem [Cooperative network games with pairwise interactions]. Mathematical Game Theory and its Applications, 2015, vol. 7, iss. 4, pp. 7-18. (In Russian)
5. Petrosyan L. A., Bulgakova M. A., Sedakov A. A. Time-consistent solutions for two-stage network games with pairwise interactions. Mobile Networks and Applications, 2021, vol. 26, iss. 2, pp. 491-500. https://doi.org/10/1007/s1136-018-1127-7
6. Bulgakova M. A., Petrosyan L. A. Ob odnoj mnogoshagovoj neantagonisticheskoj igre na seti [About one multi-stage non-cooperative game on the network]. Vestnik of Saint Petersburg University. Applied Mathematics. Computer Science. Control Processes, 2019, vol. 15, iss. 4, pp. 603-615. https://doi.org/10.21638/11701/spbu10.2019.415 (In Russian)
7. Bulgakova M. A., Petrosyan L. A. About strongly time-consistency of core in the network game with pairwise interactions. International Conference Stability and Oscillations of Nonlinear Control Systems (Pyatnitskiy's Conference), IEEE, 2016, pp. 1-4.
8. Bulgakova M. A., Petrosyan L. A. Multistage games with pairwise interactions on complete graph. Automation and Remote Control, 2020, vol. 81, iss. 8, pp. 1539-1550.
9. Petrosyan L. A., Sedakov A. A. One-way flow two-stage network games. Vestnik of Saint Petersburg University. Applied Mathematics. Computer Science. Control Process, 2014, vol. 10, iss. 4, pp. 72-81.
10. Mazalov V., Chirkova J. V. Networking games: network forming games and games on networks. London, Academic Press, 2019, 322 p.
11. Sun P., Parilina E. M. Two stage network games modeling the Belt and Road Initiative. Vestnik of Saint Petersburg University. Applied Mathematics. Computer Science. Control Processes, 2022, vol. 18, iss. 1, pp. 87-98. https://doi.org/10.21638/11701/spbu10.2022.107
12. Wie B. W. A differential game model of Nash equilibrium on a congested traffic network. Networks, 1993, vol. 23, iss. 6, pp. 557-565.
13. Petrosyan L. A., Yeung D. W. K., Pankratova Y. B. Dynamic cooperative games on networks. International Conference on Mathematical Optimization Theory and Operations Research. Cham, Springer Publ., 2021, pp. 403-416. https://doi.org/10.1007-3-030-86433-0-28
14. Tur A. V., Petrosyan L. A. Cooperative optimality principles in differential games on networks. Autom. Remote Control, 2021, vol. 82, pp. 1095-1106. https://doi.org/10.1134/S0005117921060096
15. Tur A. V., Petrosyan L. A. The core of cooperative differential games on networks. International Conference on Mathematical Optimization Theory and Operations Research. Cham, Springer, 2022, pp. 295-314. https://doi.org/10.1007/978-3-031-09607-5-21
16. Petrosyan L. A. Ustojchivost' reshenij v differencial'nyh igrah so mnogimi uchastnikami [Stability of solutions of differential games with many participants]. Vestnik of Leningrad State University, 1977, vol. 19, pp. 46-52. (In Russian)
17. Petrosyan L. A., Danilov N. A. Ustojchivye reshenija v neantagonisticheskih differencial'nyh igrah s transferabel'nymi vyishryshami [Time consistent solutions of non-zero-sum differential games with transferable payoffs]. Vestnik of Leningrad State University, 1979, vol. 1, pp. 46-52. (In Russian)
18. Petrosyan L. A., Yeung D. W. K., Pankratova Y. B. Characteristic functions in cooperative differential games on network. Journal of Dynamics and Games, 2024, vol. 11, iss. 2, pp. 115-130. https://doi.org/10.3934/jdg.2023017
19. Gromova E. V. The Shapley value as a sustainable cooperative solution in differential games of three players. Recent Advances in Game Theory and Applications. Cham, Birkhauser, 2016, pp. 67-89. https://doi.org/10.1007/978-3-319-43838-27
20. Petrosyan L. A., Zaccour G. Time-consistent Shapley value allocation of pollution cost reduction. Journal of Economic Dynamics and Control, 2003, vol. 27, iss. 3, pp. 381-398. https://doi.org/10.1016/S0165-1889(01)00053-7
21. Petrosyan L. A., Yeung D. W. K. The Shapley value for differential network games: Theory and application. Journal of Dynamics and Games, 2020, vol. 8, iss. 2, pp. 151-166. https://doi.org/10.3934/jdg.2020021
22. Breton M., Zaccour G., Zahaf M. A differential game of joint implementation of environmental projects. Automatica, 2005, vol. 41, iss. 10, pp. 1737-1749.
23. Gromova E., Tur A., Barsuk P. A pollution control problem for the aluminum production in eastern Siberia: Differential game approach. Stability and Control Processes, SCP 2020. Lecture Notes in Control and Information Sciences — Proceedings. Eds N. Smirnov, A. Golovkina. Cham, Springer, 2020, pp. 399-407. https://doi.org/10.1007/978-3-030-87966-2-44
24. Su S., Parilina E. M. Can partial cooperation between developed and developing countries be stable? Operations Research Letters, 2023, vol. 51, iss. 3, pp. 370-377. https://doi.org/10.1016/j.orl.2023.05.003
Received: November 20, 2023. Accepted: December 26, 2023.
Authors' information:
Yang He — Postgraduate Student; [email protected]
Leon A. Petrosyan — Dr. Sci. in Physics and Mathematics, Professor; [email protected]