CC BY


Contributions to Game Theory and Management, XV, 32—40

The τ-value in Multistage Games with Pairwise Interactions*

Mariia A. Bulgakova

St. Petersburg State University, Faculty of Applied Mathematics and Control Processes, 7/9, Universitetskaya nab., St. Petersburg 199034, Russia

E-mail: m.bulgakova@spbu.ru

Abstract We consider multistage bimatrix games with pairwise interactions. At the first stage, players choose their neighbours and form a network. At the later stages, bimatrix games between neighbours in the network take place. As a solution we consider the τ-value (Tijs, 1987). Earlier we calculated the coefficient λ of the τ-value for the two-stage game. Now we consider the general case of the one-stage game with any number of players and any number of links. We assume the following: N is the set of players, |N| ≥ 2, and the network g may be of any type. It is also assumed that there is not necessarily a path between every pair of vertices. We also consider conditions for time-consistency of the τ-value in the two-stage game.

Keywords: cooperative games, network games, dynamic games, τ-value, pairwise interaction, time-consistency.

1. Introduction

The theory of dynamic cooperative network games is an important part of modern game theory that is used to construct solutions in conflict-controlled processes on networks. This theory covers finding the cooperative trajectory, the strategies that generate it, and the payoff along the cooperative trajectory, as well as the distribution of the payoff among the players and the analysis of the dynamic stability of solutions (Petrosyan and Danilov, 1979).

The network illustrates the connections between players and the possibility of cooperation. The main attention is given to the cooperative behavior of the players, that is, behavior under which the total payoff of the players is maximal.

The principle of pairwise interaction in cooperative games was considered in (Bulgakova, 2019). This principle implies splitting the game into a family of simultaneous games between pairs of players, the vertices of the same edge in the network. In (Bulgakova, 2021) the Shapley value for the two-stage game with pairwise interactions was found. Two-stage network games and the mechanism of network formation were also considered in (Petrosyan et al., 2013). The paper (Bulgakova et al., 2018) is devoted to finding the basic solutions of two-stage cooperative games with pairwise interaction and to the investigation of the supermodularity of the characteristic function in a special class of networks, namely the star network. This property significantly increases the value of a solution such as the Shapley value, since in a convex game it always belongs to the core. In this regard, in games based on pairwise interaction this solution is of particular interest, also because the corresponding calculations can be simplified. Given the construction of the characteristic function, another solution is also of interest, namely the τ-value proposed in (Tijs, 1987). The Shapley value in dynamic network games with shock, where after the first network formation stage a particular player may, with a given probability, stop influencing other players by removing all her links and receiving zero payoffs, is considered in (Petrosyan and Sedakov, 2016). Differential games on networks with partner sets are considered in (Petrosyan et al., 2021). Cooperation in dynamic network games is also investigated in (Gao and Pankratova, 2017).

*This research was supported by the Russian Science Foundation grant No. 22-11-00051, https://rscf.ru/en/project/22-11-00051/

https://doi.org/10.21638/11701/spbu31.2022.03

In this paper, we consider a special type of multistage cooperative games on a network, distinguished by the way the characteristic function is constructed. The cooperative trajectory is found, and then the characteristic function is calculated taking into account the cooperative strategies of the players. As a solution we consider the τ-value; the coefficient λ is calculated for the subgame starting from the second stage. This coefficient is the same for any number of players and for any network structure in the given game, which greatly simplifies the calculation of this solution. The question of time-consistency of the τ-value is also considered, and conditions for time-consistency are found.

2. The Model

Let $N$ be a finite set of players who make decisions in two stages, $|N| = n \ge 2$. Denote the state of the game by $z$. The game starts in the state $z_1$, where every player $i \in N$ chooses his behavior $b_i = (b_{i1}, \dots, b_{in})$, an $n$-dimensional vector of communication offers to other players (Bulgakova, 2014; Petrosyan et al., 2013).

The following notation will be used: $M_i \subseteq N \setminus \{i\}$ is the set of players to whom player $i \in N$ can offer a link, and $a_i \in \{0, \dots, n - 1\}$ is the maximal number of connections that he can support at the same time. If $M_i = N \setminus \{i\}$, player $i$ can offer links to all other players. If $a_i = n - 1$, player $i$ can support any number of connections.

For every behavior $b_i$ there exists a subset of realized link offers $Q_i \subseteq M_i$ that satisfies the following conditions:

$$b_{ij} = \begin{cases} 1, & \text{if } j \in Q_i, \\ 0, & \text{otherwise}, \end{cases} \quad (1)$$

under the additional restriction

$$\sum_{j \in N} b_{ij} \le a_i. \quad (2)$$

Condition (2) means that the number of possible links is restricted for every player. Obviously, $|Q_i| \le a_i$ as well.

The link $ij$ is realized if and only if $b_{ij} = b_{ji} = 1$. The formed links $ij$ create the edges of the network $g$, whose vertices are the players; i.e., if $b_{ij} = b_{ji} = 1$, the network $g$ has an edge with vertices $i$ and $j$.
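The formation rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `form_network`, `neighbours` and `within_limits` are hypothetical, players are numbered 1..n, and the offer vectors and limits are the ones from the example in Section 5 (with the limit of player 3 read as 2).

```python
from itertools import combinations


def form_network(offers):
    """Build the set of realized links from link offers.

    offers[i] is the offer vector b_i of player i (players are numbered 1..n),
    so offers[i][j - 1] == 1 means that i offers a link to j.
    A link ij is realized only if both players offer it to each other.
    """
    players = sorted(offers)
    return {
        (i, j)
        for i, j in combinations(players, 2)
        if offers[i][j - 1] == 1 and offers[j][i - 1] == 1
    }


def neighbours(edges, i):
    """Return N_i(g): the players that share an edge with player i."""
    return {b if a == i else a for a, b in edges if i in (a, b)}


def within_limits(offers, limits):
    """Check restriction (2): player i sends at most limits[i] offers."""
    return all(sum(offers[i]) <= limits[i] for i in offers)


# Offer vectors and limits taken from the example in Section 5.
offers = {1: (0, 1, 1, 1), 2: (1, 0, 0, 0), 3: (1, 0, 0, 1), 4: (1, 0, 1, 0)}
limits = {1: 3, 2: 1, 3: 2, 4: 3}
g = form_network(offers)
print(sorted(g))
```

Storing each edge once as a pair with the smaller index first keeps the network undirected, which matches the requirement that an offer must be mutual.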

Denote by $N_i(g)$ the neighbours of player $i$ in the network $g$, i.e. $N_i(g) = \{j \in N \setminus \{i\} : ij \in g\}$. For brevity, we sometimes write $N_i$ instead of $N_i(g)$. The result of the players' choices in the first state is the network $g(b_1, \dots, b_n)$. After its formation, the players move to the state $z_2(g)$, which is determined by the network (the network fixes the neighbour sets $N_i$ and hence the rule of interaction between players). In the second state $z_2(g)$, neighbours in the network play pairwise simultaneous bimatrix games, after which the players receive their payoffs and the game ends. In other words, a two-stage game $\Gamma_{z_1}(g)$ takes place, which is a special case of multistage non-zero-sum games. Adapting the definition of a strategy to this case, a strategy is a rule that determines, for each player, the set of his "desired" neighbours in the first state, namely the vector $b_i^1$, and his behavior $b_i^2$ in each bimatrix game in the second state, in accordance with the network formed in the first state. Denote by $u_i = (b_i^1, b_i^2)$, $i \in N$, the strategy of player $i$ in the two-stage game $\Gamma_{z_1}(g)$. The payoff of player $i$ is $h_i(z_2)$, where $(z_1, z_2)$ is the trajectory realized by the strategy profile $u = (u_1(\cdot), \dots, u_n(\cdot))$ in the game $\Gamma_{z_1}(g)$. Since the players receive no payoffs in the first state, the payoff function in the game $\Gamma_{z_1}(g)$ with starting position $z_1$ is defined as follows:

$$K_i(z_1; u) = K_i(z_1; u_1(\cdot), \dots, u_n(\cdot)) = h_i(z_2).$$

In the second state, the game is a family of pairwise simultaneous bimatrix games $\{\gamma_{ij}\}$ between neighbours in the network. Namely, let $i \in N$, $j \in N_i$. Then player $i$ plays with $j$ the bimatrix game $\gamma_{ij}$ with payoff matrices $A_{ij}$ and $C_{ij}$ of players $i$ and $j$, respectively:

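$$A_{ij} = \begin{pmatrix} a^{ij}_{11} & a^{ij}_{12} & \dots & a^{ij}_{1k} \\ a^{ij}_{21} & a^{ij}_{22} & \dots & a^{ij}_{2k} \\ \vdots & \vdots & \ddots & \vdots \\ a^{ij}_{m1} & a^{ij}_{m2} & \dots & a^{ij}_{mk} \end{pmatrix}, \quad (3)$$

$$C_{ij} = \begin{pmatrix} c^{ij}_{11} & c^{ij}_{12} & \dots & c^{ij}_{1k} \\ c^{ij}_{21} & c^{ij}_{22} & \dots & c^{ij}_{2k} \\ \vdots & \vdots & \ddots & \vdots \\ c^{ij}_{m1} & c^{ij}_{m2} & \dots & c^{ij}_{mk} \end{pmatrix}, \quad (4)$$

$$a^{ij}_{pl} \ge 0, \quad c^{ij}_{pl} \ge 0, \quad p = 1, \dots, m, \quad l = 1, \dots, k.$$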

The constants $m$ and $k$ are the same for all $i$ and $j$. When the game $\gamma_{ji}$ takes place, i.e. player $i$ is the second player, he plays with the matrix $C_{ji}$, which is equal to $A_{ij}$, and player $j$, who is now the first player, plays with the payoff matrix $A_{ji}$ or, equivalently, $C_{ij}$. Denote by $\Gamma_{z_2}(g)$ the subgame of the game $\Gamma_{z_1}(g)$ that takes place in the state $z_2$. Consider this game in cooperative form. We define the characteristic function for each subset (coalition) $S \subset N$ as the lower (maximin) value of the zero-sum two-person game between the coalition $S$ and the complementary coalition $N \setminus S$, constructed on the basis of $\Gamma_{z_2}(g)$; the payoff of the coalition $S$ is the sum of the payoffs of the players in $S$. The superadditivity of the characteristic function follows from its definition. The following notation is used:

$$w_{ij} = \max_{p} \min_{l} a^{ij}_{pl}, \quad p = 1, \dots, m; \ l = 1, \dots, k, \quad (5)$$

$$w_{ji} = \max_{l} \min_{p} c^{ij}_{pl}, \quad p = 1, \dots, m; \ l = 1, \dots, k, \quad (6)$$
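The maximin values (5) and (6) can be computed directly from one bimatrix. The sketch below is illustrative only: the function name `maximin_values` and the nested-list representation are assumptions, and the sample entries are those listed for players 3 and 4 in Section 5, read row by row (an assumed layout).

```python
def maximin_values(bimatrix):
    """Return (w_ij, w_ji) for one bimatrix game.

    bimatrix[p][l] is the pair (a, c): the payoff of the row player i and of
    the column player j when i plays row p and j plays column l.
    """
    # (5): player i picks the row whose worst-case payoff is largest.
    w_ij = max(min(a for a, _ in row) for row in bimatrix)
    # (6): player j picks the column whose worst-case payoff is largest.
    w_ji = max(min(c for _, c in column) for column in zip(*bimatrix))
    return w_ij, w_ji


# Entries listed for players 3 and 4 in Section 5, read row by row (assumed layout).
game_34 = [[(5, 5), (4, 7)],
           [(4, 4), (5, 0)]]
print(maximin_values(game_34))
```

Under this layout both players are guaranteed a payoff of 4, which agrees with the values of $w_{34}$ and $w_{43}$ reported in the example.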

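and $v(z_2; S)$, $S \subseteq N$, is the lower value of the zero-sum game based on $\Gamma_{z_2}(g)$. The function $v(z_2; S)$ is calculated by the following formulas:

$$v(z_2; \emptyset) = 0, \quad (7)$$

$$v(z_2; \{i\}) = \sum_{j \in N_i} w_{ij}, \quad (8)$$

$$v(z_2; S) = \frac{1}{2} \sum_{i \in S} \sum_{j \in N_i \cap S} \max_{p,l} (a^{ij}_{pl} + c^{ij}_{pl}) + \sum_{i \in S} \sum_{k \in N_i \setminus S} w_{ik}, \quad S \subset N, \quad (9)$$

$$v(z_2; N) = \frac{1}{2} \sum_{i \in N} \sum_{j \in N_i} \max_{p,l} (a^{ij}_{pl} + c^{ij}_{pl}). \quad (10)$$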
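Formulas (7)-(10) can be evaluated edge by edge. The following sketch assumes a hypothetical function `v_z2` and dictionary-based inputs; the numbers are the values of the joint maxima and maximin payoffs stated in the example of Section 5.

```python
def v_z2(coalition, edges, m, w):
    """Lower value v(z2; S) of coalition S in the second-stage subgame.

    edges     : set of pairs (i, j) with i < j, the links of the network g
    m[(i, j)] : maximal joint payoff max_{p,l}(a_pl + c_pl) on the edge ij
    w[(i, k)] : maximin payoff of player i in the game against neighbour k
    """
    S = set(coalition)
    total = 0
    for i, j in edges:
        if i in S and j in S:
            # The double sum in (9) visits each inner edge twice, and the
            # factor 1/2 cancels that, so every inner edge counts once.
            total += m[(i, j)]
        elif i in S:
            total += w[(i, j)]  # i is inside the coalition, j is outside
        elif j in S:
            total += w[(j, i)]  # j is inside the coalition, i is outside
    return total


# Values stated in the example of Section 5.
edges = {(1, 2), (1, 3), (1, 4), (3, 4)}
m = {(1, 2): 9, (1, 3): 7, (1, 4): 7, (3, 4): 11}
w = {(1, 2): 2, (1, 3): 2, (1, 4): 2, (2, 1): 2,
     (3, 1): 2, (4, 1): 2, (3, 4): 4, (4, 3): 4}

print(v_z2({1, 2, 3, 4}, edges, m, w))  # grand coalition
print(v_z2({3, 4}, edges, m, w))
```

Iterating over edges rather than over ordered pairs of players avoids the factor 1/2 altogether and covers (7), (8) and (10) as special cases of (9).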

Consider the cooperative form of the two-stage game $\Gamma_{z_1}(g)$. Suppose that the players choose strategies $\bar{u}_i$, $i \in N$, that maximize their total payoff in the game $\Gamma_{z_1}(g)$, i.e.

$$\sum_{i \in N} K_i(z_1; \bar{u}_1, \dots, \bar{u}_n) = \max_{u} \sum_{i \in N} K_i(z_1; u_1, \dots, u_n).$$

We call the strategy profile $\bar{u} = (\bar{u}_1, \dots, \bar{u}_n)$ the cooperative behavior and the corresponding trajectory $(z_1, z_2)$ the cooperative trajectory, which consists of two states in this case.

As before, for a coalition $S \subset N$ define the characteristic function $v(z_1; S)$ as the lower value of the two-player zero-sum game between the coalition $S$, which plays as the first (maximizing) player, and the complementary coalition $N \setminus S$, which plays as the second (minimizing) player. In this case, the best first-stage behavior for the minimizing player is not to create connections with players from the coalition $S$. This reduces the payoff of the coalition $S$ by the value $\sum_{i \in S} \sum_{k \in N_i \setminus S} w_{ik}$. Note that here, too, the payoff of the coalition $S$ is the sum of the payoffs of its members, and a strategy of $S$ is an element of the Cartesian product of the strategy sets of the players in $S$.

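Denote by $v(z_1; S)$, $S \subseteq N$, the lower value of the zero-sum game based on $\Gamma_{z_1}(g)$.

The function $v(z_1; S)$ is defined by the following expressions:

$$v(z_1; \{i\}) = 0, \quad v(z_1; \emptyset) = 0, \quad (11)$$

$$v(z_1; S) = \max_{g} \left( \frac{1}{2} \sum_{i \in S} \sum_{j \in N_i(g) \cap S} \max_{p,l} (a^{ij}_{pl} + c^{ij}_{pl}) \right), \quad S \subset N, \quad (12)$$

$$v(z_1; N) = v(z_2; N) = \max_{g} \left( \frac{1}{2} \sum_{i \in N} \sum_{j \in N_i} \max_{p,l} (a^{ij}_{pl} + c^{ij}_{pl}) \right). \quad (13)$$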

3. The τ-value

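Earlier we showed that the game with pairwise interactions considered here is convex.

The components of the τ-value of a convex game are calculated by the formula

$$\tau_i(N, v) = \lambda \big( v(N) - v(N \setminus \{i\}) \big) + (1 - \lambda) v(\{i\}), \quad (14)$$

where the coefficient $\lambda$ is determined from the equation

$$\sum_{j \in N} \Big( \lambda \big( v(N) - v(N \setminus \{j\}) \big) + (1 - \lambda) v(\{j\}) \Big) = v(N). \quad (15)$$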
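Equation (15) is linear in $\lambda$, so it can be solved directly: $\lambda = \big(v(N) - \sum_j v(\{j\})\big) / \big(\sum_j (v(N) - v(N \setminus \{j\})) - \sum_j v(\{j\})\big)$ whenever the denominator is nonzero. A minimal sketch of (14)-(15) follows; the function name `tau_value` is hypothetical, and the test game $v(S) = |S|^2$ is an assumption chosen only because it is convex and easy to check by hand.

```python
from fractions import Fraction


def tau_value(players, v):
    """tau-value of a convex game computed from (14) and (15).

    v maps a frozenset of players to the worth of that coalition.
    Returns (lam, tau), where tau is a dict player -> payoff.
    """
    N = frozenset(players)
    # Marginal contribution of each player to the grand coalition.
    upper = {i: Fraction(v(N) - v(N - {i})) for i in N}
    # Individual worth of each player.
    lower = {i: Fraction(v(frozenset({i}))) for i in N}
    gap = sum(upper.values()) - sum(lower.values())
    if gap == 0:
        raise ValueError("lambda is not uniquely determined by (15)")
    lam = (Fraction(v(N)) - sum(lower.values())) / gap
    tau = {i: lam * upper[i] + (1 - lam) * lower[i] for i in N}
    return lam, tau


# Hypothetical convex game used only for illustration: v(S) = |S|^2.
lam, tau = tau_value({1, 2, 3}, lambda S: len(S) ** 2)
print(lam, tau)
```

Exact rational arithmetic via `Fraction` is used so that the efficiency condition $\sum_i \tau_i = v(N)$ can be checked without rounding error.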

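In (Bulgakova, 2021) we obtained that the coefficient $\lambda$ of the τ-value in the two-stage game with pairwise interactions $\Gamma_{z_1}(g)$ is equal to $\frac{1}{2}$, and the formula for the τ-value has the form

$$\tau_i(N, v) = \frac{1}{2} \sum_{j \in N_i} \max_{p,l} (a^{ij}_{pl} + c^{ij}_{pl}). \quad (16)$$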

Now consider the τ-value as a solution of the subgame $\Gamma_{z_2}(g)$.

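To calculate the coefficient $\lambda$ from (15), we need the difference $v(N) - v(N \setminus \{j\})$. Substituting the characteristic function of $\Gamma_{z_2}(g)$ from (9) and (10) into the difference, we get

$$v(N) - v(N \setminus \{j\}) = \frac{1}{2} \sum_{i \in N} \sum_{k \in N_i} \max_{p,l} (a^{ik}_{pl} + c^{ik}_{pl}) - \left( \frac{1}{2} \sum_{i \in N, i \ne j} \sum_{k \in N_i \setminus \{j\}} \max_{p,l} (a^{ik}_{pl} + c^{ik}_{pl}) + \sum_{t \in N_j} w_{tj} \right). \quad (17)$$

The first term on the right-hand side contains the second one:

$$\frac{1}{2} \sum_{i \in N} \sum_{k \in N_i} \max_{p,l} (a^{ik}_{pl} + c^{ik}_{pl}) = \frac{1}{2} \sum_{i \in N, i \ne j} \sum_{k \in N_i \setminus \{j\}} \max_{p,l} (a^{ik}_{pl} + c^{ik}_{pl}) + \sum_{t \in N_j} \max_{p,l} (a^{tj}_{pl} + c^{tj}_{pl}).$$

After expanding the brackets in (17) we get

$$v(N) - v(N \setminus \{j\}) = \sum_{t \in N_j} \max_{p,l} (a^{tj}_{pl} + c^{tj}_{pl}) - \sum_{t \in N_j} w_{tj}.$$

Summing this expression over $j$ gives

$$\sum_{j \in N} \big( v(N) - v(N \setminus \{j\}) \big) = \sum_{j \in N} \left( \sum_{t \in N_j} \max_{p,l} (a^{tj}_{pl} + c^{tj}_{pl}) - \sum_{t \in N_j} w_{tj} \right) = 2 v(N) - \sum_{j \in N} v(\{j\}),$$

since every edge is counted twice in the first double sum, and by (8) the values $w_{tj}$ taken over all ordered pairs of neighbours add up to $\sum_{j \in N} v(\{j\})$.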

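Substituting the calculated difference into (15), we get

$$\lambda \Big( 2 v(N) - \sum_{j \in N} v(\{j\}) \Big) + (1 - \lambda) \sum_{j \in N} v(\{j\}) = v(N).$$

Grouping the terms,

$$2 \lambda v(N) - 2 \lambda \sum_{j \in N} v(\{j\}) + \sum_{j \in N} v(\{j\}) = v(N),$$

$$2 \lambda \Big( v(N) - \sum_{j \in N} v(\{j\}) \Big) = v(N) - \sum_{j \in N} v(\{j\}),$$

$$\lambda = \frac{1}{2}.$$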

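Substituting the calculated coefficient $\lambda$ into (14), we get

$$\tau_i(N, v) = \frac{1}{2} \big( v(N) - v(N \setminus \{i\}) \big) + \frac{1}{2} v(\{i\}),$$

and, substituting the value of the difference $v(N) - v(N \setminus \{i\})$,

$$\tau_i(N, v) = \frac{1}{2} \left( \sum_{t \in N_i} \max_{p,l} (a^{ti}_{pl} + c^{ti}_{pl}) - \sum_{t \in N_i} w_{ti} \right) + \frac{1}{2} v(\{i\}).$$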

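Thus we have proved the following result.

Theorem 1. In the subgame $\Gamma_{z_2}(g)$ the coefficient for calculating the τ-value is $\lambda = \frac{1}{2}$, and the component of the τ-value for an arbitrary player $i$ is

$$\tau_i(N, v) = \frac{1}{2} \left( \sum_{t \in N_i} \max_{p,l} (a^{ti}_{pl} + c^{ti}_{pl}) - \sum_{t \in N_i} w_{ti} \right) + \frac{1}{2} v(\{i\}). \quad (18)$$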
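The closed forms (16) and (18) need only the joint maxima on the edges and the maximin values. The sketch below evaluates both for the data of the example in Section 5; the function names `tau_two_stage` and `tau_subgame` and the dictionary layout are assumptions made for illustration.

```python
from fractions import Fraction


def tau_two_stage(i, nbrs, m):
    """Formula (16): half of the joint payoffs on the edges of player i."""
    return Fraction(sum(m[frozenset((i, t))] for t in nbrs[i]), 2)


def tau_subgame(i, nbrs, m, w):
    """Formula (18), i.e. (14) with lambda = 1/2 in the second-stage subgame."""
    joint = sum(m[frozenset((i, t))] for t in nbrs[i])
    against_i = sum(w[(t, i)] for t in nbrs[i])  # neighbours' maximin against i
    own = sum(w[(i, t)] for t in nbrs[i])        # v(z2; {i}) by formula (8)
    return Fraction(joint - against_i, 2) + Fraction(own, 2)


# Network and values stated in the example of Section 5.
nbrs = {1: {2, 3, 4}, 2: {1}, 3: {1, 4}, 4: {1, 3}}
m = {frozenset((1, 2)): 9, frozenset((1, 3)): 7,
     frozenset((1, 4)): 7, frozenset((3, 4)): 11}
w = {(1, 2): 2, (1, 3): 2, (1, 4): 2, (2, 1): 2,
     (3, 1): 2, (4, 1): 2, (3, 4): 4, (4, 3): 4}

tau_z1 = [tau_two_stage(i, nbrs, m) for i in sorted(nbrs)]
tau_z2 = [tau_subgame(i, nbrs, m, w) for i in sorted(nbrs)]
print(tau_z1, tau_z2)
```

Both vectors sum to 34, the worth of the grand coalition in the example, which is a quick sanity check of efficiency.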

4. Time-consistency

Denote the set of all imputations in the game $\Gamma_{z_t}(g)$ by

$$M[v(z_t)] = \Big\{ x = (x_1, \dots, x_n) : \sum_{i=1}^{n} x_i = v(z_t; N), \ x_i \ge v(z_t; \{i\}), \ i \in N \Big\}.$$

As a solution, we consider the τ-value $\tau[v(z_t)] = (\tau_1[v(z_t)], \dots, \tau_n[v(z_t)])$, $t = 1, 2$.

Following cooperative game theory, the maximum total payoff of all players $v(z_1; N)$ under cooperation must be divided among the players after the end of the game. To do this, the characteristic function $v(z_1; S)$ is used, according to which an imputation is a vector $\xi[v(z_1)] = (\xi_1[v(z_1)], \dots, \xi_n[v(z_1)])$ that satisfies, first, the efficiency condition $\sum_{i \in N} \xi_i[v(z_1)] = v(z_1; N)$ and, second, the individual rationality condition $\xi_i[v(z_1)] \ge v(z_1; \{i\})$ for every $i \in N$. The set of all imputations in the game $\Gamma_{z_1}(g)$ is $M[v(z_1)]$. A cooperative solution $I[v(z_1)]$ is a rule that assigns a subset of $M[v(z_1)]$ to the cooperative game $\Gamma_{z_1}(g)$.

Before the game $\Gamma_{z_1}(g)$ starts, the players agree on a cooperative trajectory $(z_1, z_2)$, i.e. a trajectory that leads to the maximum total payoff $v(z_1; N)$, and it is assumed that the players share this payoff in accordance with a chosen imputation $\xi[v(z_1)]$ from the adopted cooperative solution $I[v(z_1)]$. This means that in the game $\Gamma_{z_1}(g)$ every player $i \in N$ expects his payoff to be equal to $\xi_i[v(z_1)]$. If the players recalculate the solution after the network formation stage (at the second stage), the recalculated set $I[v(z_2)]$ may differ from the previous set $I[v(z_1)]$, because the characteristic function in the subgame $\Gamma_{z_2}(g)$ is different. As a result, some of the players may leave the cooperative agreement and deviate from their cooperative strategies. Next, a mechanism will be used that provides consistency against deviation from the cooperative solution $I[v(z_1)]$. The cooperative solution $I[v(z_1)]$ in the two-stage game is time consistent [9] if for any imputation $\xi[v(z_1)] \in M[v(z_1)]$ there exists an imputation $\xi[v(z_2)] \in I[v(z_2)]$ such that

$$\xi[v(z_1)] = \xi[v(z_2)], \quad (19)$$

since the players do not receive payoffs at the network formation stage. Otherwise, the cooperative solution is time inconsistent.

Based on the above definitions, we can conclude that the τ-value $\tau[v(z_1)]$ is a time consistent cooperative solution if

$$\tau[v(z_1)] = \tau[v(z_2)]. \quad (20)$$

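Substituting the formulas (16) and (18) for the τ-value of the game $\Gamma_{z_1}(g)$ and of the subgame $\Gamma_{z_2}(g)$ into (20), we get

$$\frac{1}{2} \sum_{t \in N_i} \max_{p,l} (a^{ti}_{pl} + c^{ti}_{pl}) = \frac{1}{2} \left( \sum_{t \in N_i} \max_{p,l} (a^{ti}_{pl} + c^{ti}_{pl}) - \sum_{t \in N_i} w_{ti} \right) + \frac{1}{2} v(z_2; \{i\}).$$

After cancelling like terms we obtain the following condition for time-consistency of the τ-value in the two-stage network game with pairwise interactions $\Gamma_{z_1}(g)$:

$$\sum_{t \in N_i} w_{ti} = v(z_2; \{i\}), \quad \forall i \in N. \quad (21)$$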

In other words, the payoff that an arbitrary player $i$ can guarantee individually, without cooperating with other players, must be equal to the total of the payoffs that his neighbours in the network guarantee themselves when playing against player $i$. One can also impose the constraint that the individual payoffs of players $i$ and $j$ playing a bimatrix game along a common edge are the same, i.e. $w_{ij} = w_{ji}$. In this case condition (21) is satisfied, which guarantees the time consistency of the τ-value, but such a restriction is only a particular, more natural case of the obtained condition.
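Condition (21) is easy to test numerically once the maximin values are known. The following sketch assumes a hypothetical helper `is_time_consistent`; the first data set uses the values stated in the example of Section 5, and the second one changes a single value (a made-up perturbation) to show a violation.

```python
def is_time_consistent(nbrs, w):
    """Check condition (21) for every player.

    nbrs[i]   : set of neighbours of player i in the network
    w[(i, t)] : maximin payoff of player i in the game against neighbour t
    """
    for i, neighbours_of_i in nbrs.items():
        against_i = sum(w[(t, i)] for t in neighbours_of_i)  # left side of (21)
        own = sum(w[(i, t)] for t in neighbours_of_i)        # v(z2; {i}) by (8)
        if against_i != own:
            return False
    return True


# Network and maximin values stated in the example of Section 5.
nbrs = {1: {2, 3, 4}, 2: {1}, 3: {1, 4}, 4: {1, 3}}
w = {(1, 2): 2, (1, 3): 2, (1, 4): 2, (2, 1): 2,
     (3, 1): 2, (4, 1): 2, (3, 4): 4, (4, 3): 4}

# Hypothetical perturbation: player 2 guarantees 3 instead of 2 against player 1.
w_perturbed = dict(w)
w_perturbed[(2, 1)] = 3

print(is_time_consistent(nbrs, w), is_time_consistent(nbrs, w_perturbed))
```

The example data satisfy the stronger edge-wise requirement $w_{ij} = w_{ji}$, so the check passes; the perturbed data break the balance for players 1 and 2.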

5. Example

To illustrate our results, consider a four-person two-stage game with $N = \{1, 2, 3, 4\}$. The sets of players to whom each player can offer a connection are $M_1 = M_4 = N$, $M_2 = M_3 = \{1, 4\}$, and the restrictions on the total number of connections are $a_1 = a_4 = 3$, $a_3 = 2$, $a_2 = 1$.

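Payoff matrices for all possible pairs of players are given:

$$(A_{12}, C_{21}) = \begin{pmatrix} (3; 1) & (1; 4) \\ (5; 4) & (2; 2) \end{pmatrix}, \quad (A_{14}, C_{41}) = \begin{pmatrix} (2; 2) & (3; 0) \\ (1; 3) & (6; 1) \end{pmatrix},$$

$$(A_{34}, C_{43}) = \begin{pmatrix} (5; 5) & (4; 7) \\ (4; 4) & (5; 0) \end{pmatrix}, \quad (A_{13}, C_{31}) = \begin{pmatrix} (2; 5) & (4; 1) \\ (3; 2) & (1; 1) \end{pmatrix},$$

$$(A_{24}, C_{42}) = \begin{pmatrix} (1; 0) & (0; 1) \\ (1; 1) & (0; 0) \end{pmatrix}.$$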

To maximize the total payoff at the second stage of the game, the players should choose the following vectors of communication offers:

b1 = (0,1,1,1), b2 = (1,0,0,0), b3 = (1,0,0,1), b4 = (1,0,1,0).

After these behaviors are chosen, the network with edges 12, 13, 14 and 34 is formed (see Figure 1):

Fig. 1. Network on the second stage of game.

Now calculate the values $w_{ij}$ to construct the characteristic function $v(z_2; S)$. These values are equal to zero in the two-stage game $\Gamma_{z_1}(g)$.

$$w_{12} = 2, \ w_{13} = 2, \ w_{14} = 2, \ w_{21} = 2, \ w_{31} = 2, \ w_{41} = 2, \ w_{34} = 4, \ w_{43} = 4.$$

Also calculate the maximal total payoff for every pair of players, $m_{ij} = \max_{p,l} (a^{ij}_{pl} + c^{ij}_{pl})$. These values are the same in both the game $\Gamma_{z_1}(g)$ and the subgame $\Gamma_{z_2}(g)$:

$$m_{12} = 9, \ m_{13} = 7, \ m_{14} = 7, \ m_{24} = 2, \ m_{34} = 11.$$

The value of the characteristic function for the grand coalition $N$ is the same in both games:

$$v(z_1; N) = v(z_2; N) = m_{12} + m_{13} + m_{14} + m_{34} = 34.$$

Condition (21) for time-consistency of the τ-value holds.

For player 1: $v(z_2; \{1\}) = w_{21} + w_{31} + w_{41} = 6$.

For player 2: $v(z_2; \{2\}) = w_{12} = 2$.

For player 3: $v(z_2; \{3\}) = w_{13} + w_{43} = 6$.

For player 4: $v(z_2; \{4\}) = w_{34} + w_{14} = 6$.

Calculate the τ-value for the games $\Gamma_{z_1}(g)$ and $\Gamma_{z_2}(g)$ using formulas (16) and (18):

$$\tau[v(z_1)] = (11.5, \ 4.5, \ 9, \ 9),$$

$$\tau[v(z_2)] = (11.5, \ 4.5, \ 9, \ 9).$$

According to (20), the τ-value in this two-stage network game with pairwise interactions is time-consistent.

6. Conclusion

We proved an important property, the time-consistency of the cooperative solution under condition (21), for two-stage games with pairwise interactions. This was possible because of the specific form of the characteristic function, where the game is considered as a family of small bimatrix games between neighbours in the network. This result can be generalized to multistage games using the imputation distribution procedure (IDP) (Petrosyan and Danilov, 1979).

References

Bulgakova, M. A. (2019). Solutions of network games with pairwise interactions. Vestnik of Saint Petersburg University. Applied Mathematics. Computer Science. Control Processes, 15(1), 147-156.

Bulgakova, M. A. (2021). Non-zero sum network games with pairwise interactions. Contributions to Game Theory and Management, 14, 38-48.

Gao, H. and Pankratova, Y. (2017). Cooperation in dynamic network games. Contributions to Game Theory and Management, 10, 42-67.

Petrosyan, L. A. (1977). Stability of solutions in n-person differential games. Vestnik of Leningrad University. Series 1. Mathematics. Mechanics. Astronomy, 19, 46-52.

Petrosyan, L. A., Bulgakova, M. A. and Sedakov, A. A. (2018). Time-Consistent Solutions for Two-Stage Network Games with Pairwise Interactions. Mobile Networks and Applications, 26, 491-500.

Petrosyan, L. A. and Danilov, N. N. (1979). Stability of solutions of non-zero-sum game with transferable payoffs. Vestnik of Leningrad University. Series 1. Mathematics. Mechanics. Astronomy, 1, 52-59.

Petrosyan, L. A., Sedakov, A. A. and Bochkarev, A. A. (2013). Two-stage network games. Mathematical game theory and applications, 5(4), 84-104.

Petrosyan, L. A. and Sedakov, A. A. (2016). The Subgame-Consistent Shapley Value for Dynamic Network Games with Shock. Dynamic Games and Applications, 6(4), 520-537.

Petrosyan, L. A., Yeung, D. and Pankratova, Y. B. (2021). Cooperative Differential Games with Partner Sets on Networks. Trudy Inst. Mat. i Mekh. UrO RAN, 27(3), 286-295.

Tijs, S. H. (1987). An axiomatization of the τ-value. Mathematical Social Sciences, 13, 177-181.
