CC BY

Contributions to Game Theory and Management, X, 42—67

Cooperation in Dynamic Network Games*

Hongwei Gao1 and Yaroslavna Pankratova2

1 College of Mathematics and Statistics, Qingdao University, Qingdao 266071, China
E-mail: gaohongwei@qdu.edu.cn
2 Saint Petersburg State University, 7/9 Universitetskaya nab., Saint Petersburg 199034, Russia
E-mail: y.pankratova@spbu.ru

Abstract This paper reviews research on dynamic network games that has been carried out at Saint Petersburg State University since 2009. We focus on the problem of cooperation in dynamic network models, noting time and subgame inconsistency of cooperative solutions. The problem of stable cooperation is also covered.

Keywords: dynamic games, cooperation, network, pairwise interactions, time-consistency.

1. Two-stage games

In this section we introduce basic definitions and analyze how mutual links connecting players can influence players' behavior. Such links define a network. A two-stage game is considered as a basic model in which players form a network at the first stage, and at the second stage players choose their controls. The network game is given in a strategic setting, and following (Kuhn, 1953) a strategy of a player is a rule that uniquely defines his behavior at both stages of the game (player's behavior at the second stage depends on the network formed at the first stage).

It is supposed that the payoff to each player depends on his behavior at the second stage and on the behavior of his "neighbors" in the network formed at the first stage of the game. A similar setting, modeled as a two-stage network game, is considered in (Goyal and Vega-Redondo, 2005; Jackson and Watts, 2002). In these papers, the authors consider a model in which players form a network at the first stage, and at the second stage players are involved in a 2 x 2 coordination game which is the same for all players. Another two-stage model on a network, a location-price game, is considered in (Lu et al., 2010).

This model is based on papers studying processes of network formation, network evolution during the game, as well as research on allocation rules and their properties for a fixed network (Bala and Goyal, 2000; Dutta et al., 1998; Goyal and Vega-Redondo, 2005; Feng et al., 2014; Igarashi and Yamamoto, 2013). In (Bala and Goyal, 2000) a Nash network is considered as a solution for the strategic setting, and the network evolution is modeled as a convergent stochastic process. In (Petrosyan and Sedakov, 2009) the network evolution is constructed as the result of players' actions, and the solution is considered in the sense of subgame perfectness. Other solution concepts for games on networks are studied regardless of network formation mechanisms (Dutta et al., 1998; Jackson and Wolinsky, 1996).

* This research was supported by the Russian Foundation for Basic Research (grant No 1751-53030).

In the papers mentioned above, the problem of time consistency is not studied. This problem was first raised by Petrosyan (Petrosyan, 1977) for cooperative differential games, and later a special mechanism of stage payments, an imputation distribution procedure, was designed to overcome time inconsistency of cooperative solution concepts (Petrosyan and Danilov, 1979). Time-consistent solutions for differential games under deterministic and stochastic dynamics can be found in (Yeung and Petrosyan, 2006; Petrosjan, 2006; Yeung and Petrosyan, 2012). The time consistency problem arises not only in cooperative differential games but also in other classes of cooperative dynamic games. In particular, it arises in cooperative two-stage network games (Petrosyan, Sedakov and Bochkarev, 2013), where time inconsistency of the Shapley value is proved. A stricter property of cooperative solution concepts, strong time consistency (Petrosyan, 2005), is also studied.

1.1. The model

The following model was proposed in (Petrosyan, Sedakov and Bochkarev, 2013). Let $N = \{1,\dots,n\}$ be a finite set of players who can interact with each other. Interaction between two players means the existence of a link connecting them and, therefore, communication between them. Conversely, the absence of a link between two players means the absence of any communication between them. Under these assumptions, cooperation of players is said to be restricted by a communication structure (or a network). A pair $(N, g)$ is called a network, where $N$ is a set of nodes (it coincides with the set of players) and $g \subseteq N \times N$ is a set of links. If $(i, j) \in g$, there is a link connecting players $i$ and $j$ and, therefore, generating communication between them in the network. Below, to simplify notation, the network is identified with the set of its links and denoted by $g$, and a link $(i, j)$ in the network is denoted by $ij$. It is supposed that all links are undirected, so $ij = ji$.

Consider a two-stage problem. At the first stage each player chooses his partners, that is, the players with whom he wants to form links. By choosing partners and establishing links, players form a network. Having formed the network, each player chooses a control influencing his payoff at the second stage. Consider the problem in detail.

First stage: network formation Given the player set $N$, define the link formation rule in a standard way: links, and therefore a network, are formed as a result of players' simultaneous choices. Let $M_i \subseteq N \setminus \{i\}$ be the set of players to whom player $i \in N$ can offer a mutual link, and let $a_i \in \{0,\dots,n-1\}$ be the maximal number of links which player $i$ can maintain (and, therefore, can offer). The behavior of player $i \in N$ at the first stage is an n-dimensional profile $g_i = (g_{i1},\dots,g_{in})$ whose entries are defined as

\[ g_{ij} = \begin{cases} 1, & \text{if player } i \text{ offers a link to } j \in M_i, \\ 0, & \text{otherwise,} \end{cases} \tag{1} \]

subject to the constraint

\[ \sum_{j \in N} g_{ij} \le a_i. \tag{2} \]

The condition $g_{ii} = 0$, $i \in N$, excludes loops from the network, whereas (2) shows that the number of possible links is limited. If $M_i = N \setminus \{i\}$, player $i$ can offer a link to any player, whereas if $a_i = n - 1$, he can maintain any number of links.

The set of all possible behaviors of player $i \in N$ at the first stage satisfying (1)-(2) is denoted by $G_i$. The Cartesian product $\prod_{i \in N} G_i$ is the set of behavior profiles at the first stage. It is supposed that players choose their behaviors at the first stage simultaneously and independently of each other. In particular, player $i \in N$ chooses $g_i \in G_i$, and as a result the behavior profile $(g_1,\dots,g_n)$ is formed. Under these assumptions, an undirected link $ij = ji$ is established in network $g$ if and only if $g_{ij} = g_{ji} = 1$, i.e., $g$ consists of the links that were offered by both players.
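The link formation rule can be illustrated with a minimal Python sketch. The function name `form_network`, the 0-based player indices, and the sample offers are assumptions made for this example only; the capacity check mirrors the bound on the number of links a player may offer.

```python
from itertools import combinations

def form_network(offers, capacity):
    """Return the set of undirected links {i, j} with g_ij = g_ji = 1.

    offers[i][j] is 1 if player i offers a link to player j, else 0.
    capacity[i] is a_i, the maximal number of links player i may offer.
    """
    n = len(offers)
    for i in range(n):
        if offers[i][i] != 0:
            raise ValueError("loops are excluded: g_ii must be 0")
        if sum(offers[i]) > capacity[i]:
            raise ValueError(f"player {i} offers more than a_i links")
    # A link is established only when both players offered it.
    return {frozenset((i, j))
            for i, j in combinations(range(n), 2)
            if offers[i][j] == 1 and offers[j][i] == 1}

# Hypothetical three-player profile: only the offer between 0 and 1 is mutual.
offers = [[0, 1, 1],
          [1, 0, 0],
          [0, 1, 0]]
network = form_network(offers, capacity=[2, 2, 2])
```

In this sample profile the one-sided offers (0 to 2, and 2 to 1) do not produce links, which is the point of the mutual-consent rule.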

Second stage: choosing control Having formed the network, players choose their behaviors at the second stage. Define the neighbors of player $i$ in network $g$ as the elements of the set $N_i(g) = \{j \in N \setminus \{i\} : ij \in g\}$. Players are allowed to reconsider their first-stage decisions: each player may break links formed at the first stage.

Define the components of an n-dimensional profile $d_i(g)$ as follows:

\[ d_{ij}(g) = \begin{cases} 1, & \text{if player } i \text{ does not break the link formed at the first stage with player } j \in N_i(g) \text{ in network } g, \\ 0, & \text{otherwise.} \end{cases} \tag{3} \]

The set of profiles $d_i(g)$ satisfying (3) is denoted by $D_i(g)$, $i \in N$. Profile $(d_1(g),\dots,d_n(g))$ affects network $g$ formed at the first stage by removing some links: applied to network $g$, it changes the network structure and forms a new network, denoted by $g^d$. Network $g^d$ is obtained from $g$ by removing links $ij$ such that either $d_{ij}(g) = 0$ or $d_{ji}(g) = 0$.

Moreover, at the second stage player $i \in N$ chooses control $u_i$ from a finite set $U_i$. For example, in (Goyal and Vega-Redondo, 2005; Jackson and Watts, 2002) $U_i$ is a set of strategies of player $i$ in a 2 x 2 symmetric coordination game which $i \in N$ plays with his neighbors; in (Corbae and Duffy, 2008) $U_i$ is a set of strategies of player $i$ in a 2 x 2 stag-hunt game; in (Xie et al., 2013) $U_i$ is a set of strategies in a prisoner's dilemma game ("cooperate" and "defect") which $i$ also plays with his neighbors in the network. The sets $U_1,\dots,U_n$ are not specified, so they can be of a general structure. We only require that the sets be finite.

Then the behavior of player $i \in N$ at the second stage is a pair $(d_i(g), u_i)$: it defines, on the one hand, the links to be removed, $d_i(g)$, and, on the other hand, the control $u_i$.

The payoff function $K_i$ of player $i \in N$ depends on both the new network $g^d$ and the controls $u_i$, $i \in N$. Specifically, it depends on player $i$'s behavior at the second stage as well as on the behavior of his neighbors in network $g^d$, i.e., $K_i(u_i, u_{N_i(g^d)})$ is a nonnegative real-valued function defined on $U_i \times \prod_{j \in N_i(g^d)} U_j$. Here $u_{N_i(g^d)}$ denotes the profile of controls $u_j$ chosen by all neighbors $j \in N_i(g^d)$ of player $i$ in network $g^d$. Assume that functions $K_i$, $i \in N$, satisfy the following property:

(P): For any two networks $g$ and $g'$ s.t. $g' \subseteq g$, controls $(u_i, u_{N_i(g)}) \in U_i \times \prod_{j \in N_i(g)} U_j$, and player $i$, the inequality $K_i(u_i, u_{N_i(g)}) \ge K_i(u_i, u_{N_i(g')})$ holds.

1.2. Cooperation in two-stage network games

Now we describe cooperation in the two-stage network game. We will answer three main questions: What is a cooperative solution in the game? Can it be realized? Is it strong time consistent? To answer these questions, we first analyze an auxiliary case whose results will be used below.

Two-stage Network Game: Cooperation at the Second Stage In this section it is supposed that players' behavior profile $(g_1,\dots,g_n)$, $g_i \in G_i$, $i \in N$, chosen at the first stage is fixed, and it forms network $g$. At the second stage players jointly choose $n$ pairs $(d_i^*(g), u_i^*) \in D_i(g) \times U_i$, $i \in N$, maximizing the sum of players' payoffs.

Proposition 1 (Petrosyan, Sedakov and Bochkarev, 2013). The maximal sum of players' payoffs can be calculated by the formula:

\[ \sum_{i \in N} K_i(u_i^*, u^*_{N_i(g^{d^*})}) = \max_{u_i \in U_i,\, i \in N} \sum_{i \in N} K_i(u_i, u_{N_i(g)}). \tag{4} \]

The next problem is to allocate the maximal sum of players' payoffs among the players. After the allocation procedure, the game ends. To allocate the maximal sum of players' payoffs, a cooperative TU-game $(N, v(g))$ is constructed. The characteristic function $v(g)$ in this game is defined for any subset $S \subseteq N$ (a coalition) as follows:

\[ \begin{aligned} v(g, N) &= \sum_{i \in N} K_i(u_i^*, u^*_{N_i(g)}), \\ v(g, S) &= \max_{u_i \in U_i,\, i \in S} \sum_{i \in S} K_i(u_i, u_{N_i(g) \cap S}), \\ v(g, \emptyset) &= 0, \end{aligned} \]

provided that network $g$ is fixed.

In (Petrosyan, Sedakov and Bochkarev, 2013; Gao et al., 2017) the characteristic function was defined as in (Von Neumann and Morgenstern, 1944). Following this idea, the value v(g, S) is the maximal payoff that coalition S can guarantee for itself (the maxmin value) in a zero-sum game between two players: coalition S maximizing its payoff, and its complement N \ S minimizing the payoff to S, provided that network g is fixed. However in (Petrosyan, Sedakov and Bochkarev, 2013) a simplified form of such characteristic function was proposed for the first time.

Proposition 2. If payoff functions $K_i$, $i \in N$, are nonnegative and satisfy property (P), the maximal payoff that coalition $S$ can guarantee for itself is calculated by the formula

\[ v(g, S) = \max_{u_i \in U_i,\, i \in S} \sum_{i \in S} K_i(u_i, u_{N_i(g) \cap S}). \tag{5} \]

Note that under these assumptions the value $v(g, S)$, $S \subseteq N$, can be calculated as a solution of the maximization problem (5). Finding this solution is simpler than solving the maxmin problem in the general case.

For a singleton $\{i\}$, its value is defined in the following way:

\[ v(g, \{i\}) = \max_{u_i \in U_i} K_i(u_i), \tag{6} \]

and it does not depend on the network.
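The simplified characteristic function can be evaluated by direct enumeration when the control sets are small. The following Python sketch is only an illustration under stated assumptions: the function name `coalition_value`, the data layout, and the sample payoff (which is nonnegative and does not decrease when links are added) are all hypothetical and not taken from the cited papers.

```python
from itertools import product

def coalition_value(S, links, controls, payoff):
    """v(g, S): brute-force maximum of the coalition's total payoff.

    S        -- iterable of players in the coalition
    links    -- set of frozenset({i, j}) describing network g
    controls -- dict player -> finite list of admissible controls U_i
    payoff   -- payoff(i, u_i, neighbor_controls) -> nonnegative number,
                where neighbor_controls maps neighbors inside S to controls
    """
    S = sorted(S)
    if not S:
        return 0
    best = None
    for choice in product(*(controls[i] for i in S)):
        u = dict(zip(S, choice))
        total = 0
        for i in S:
            # Only neighbors that belong to S enter the payoff.
            nbrs = {j: u[j] for j in S
                    if j != i and frozenset((i, j)) in links}
            total += payoff(i, u[i], nbrs)
        best = total if best is None else max(best, total)
    return best

links = {frozenset((0, 1)), frozenset((1, 2))}
controls = {0: [0, 1], 1: [0, 1], 2: [0, 1]}

def payoff(i, u_i, neighbor_controls):
    # Hypothetical payoff: 1 for choosing control 1, plus 2 per matching neighbor.
    matches = sum(1 for u_j in neighbor_controls.values() if u_j == u_i)
    return (1 if u_i == 1 else 0) + 2 * matches

v_single = coalition_value({0}, links, controls, payoff)
v_pair = coalition_value({0, 1}, links, controls, payoff)
v_all = coalition_value({0, 1, 2}, links, controls, payoff)
```

The enumeration is exponential in the coalition size, so it is meant for checking small examples rather than for large networks.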

An imputation is an n-dimensional profile $\xi(g) = (\xi_1(g),\dots,\xi_n(g))$ satisfying both the efficiency condition and the individual rationality condition:

\[ \sum_{i \in N} \xi_i(g) = v(g, N), \qquad \xi_i(g) \ge v(g, \{i\}), \quad i \in N. \]

Let the set of imputations in game $(N, v(g))$ be denoted by $I(v(g))$.

A cooperative solution concept in TU-game $(N, v(g))$ with fixed network $g$ is a rule that uniquely assigns a subset $CSC(v(g)) \subseteq I(v(g))$ to game $(N, v(g))$. For example, if the cooperative solution concept is the core $C(v(g))$, then

\[ CSC(v(g)) = C(v(g)) = \Big\{ \xi(g) \in I(v(g)) : \sum_{i \in S} \xi_i(g) \ge v(g, S),\ S \subset N \Big\}. \]
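Membership in the core can be checked directly from its definition. The sketch below is a minimal illustration; the function name `in_core`, the tolerance parameter, and the three-player values are assumptions chosen for the example.

```python
from itertools import combinations

def in_core(xi, v, tol=1e-9):
    """Check whether allocation xi (indexed by player 0..n-1) lies in the core.

    v maps frozenset coalitions to their values; the empty coalition may be omitted.
    """
    n = len(xi)
    players = range(n)
    # Efficiency: the grand coalition's value is fully distributed.
    if abs(sum(xi) - v[frozenset(players)]) > tol:
        return False
    # Coalitional rationality for every proper non-empty coalition.
    for size in range(1, n):
        for S in combinations(players, size):
            if sum(xi[i] for i in S) < v[frozenset(S)] - tol:
                return False
    return True

# Hypothetical characteristic function of a three-player game.
v = {frozenset((0,)): 1, frozenset((1,)): 1, frozenset((2,)): 1,
     frozenset((0, 1)): 6, frozenset((1, 2)): 6, frozenset((0, 2)): 2,
     frozenset((0, 1, 2)): 11}
inside = in_core([3, 5, 3], v)
outside = in_core([1, 1, 9], v)
```

The second allocation is efficient but gives coalition {0, 1} less than its value, so it fails the check.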

Two-stage Network Game: Cooperation at Both Stages Suppose now that players jointly choose their behaviors at both stages of the game. Acting as one player and choosing $g_i \in G_i$ and $(d_i(g), u_i) \in D_i(g) \times U_i$, $i \in N$, the grand coalition $N$ maximizes the value

\[ \sum_{i \in N} K_i(u_i, u_{N_i(g^d)}). \tag{7} \]

Let the maximum be attained when players' behavior profiles $g_i^*$, $u_i^*$, $i \in N$, are chosen, where profile $(g_1^*,\dots,g_n^*)$ forms network $g^*$. Here, as in (4), to maximize the sum of players' payoffs, players should not remove links from the network. Therefore, any profile $d_i(g)$ coincides with $g_i$ for any player $i \in N$ and any network $g$. Let

\[ \sum_{i \in N} K_i(u_i^*, u^*_{N_i(g^*)}) = \max_{g_i \in G_i,\, i \in N}\ \max_{u_i \in U_i,\, i \in N} \sum_{i \in N} K_i(u_i, u_{N_i(g)}). \]

Again to allocate the maximal sum of players' payoffs according to some imputation, a cooperative TU-game (N, V) is constructed. The characteristic function V is defined similarly to function v(g) considered in Subsection 1.2.

Proposition 3. In the cooperative two-stage network game the superadditive characteristic function $V(\cdot)$ in the sense of von Neumann and Morgenstern is defined as:

\[ \begin{aligned} V(N) &= \sum_{i \in N} K_i(u_i^*, u^*_{N_i(g^*)}), \\ V(S) &= \max_{g_i \in G_i,\, i \in S}\ \max_{u_i \in U_i,\, i \in S} \sum_{i \in S} K_i(u_i, u_{N_i(g) \cap S}), \\ V(\emptyset) &= 0. \end{aligned} \]

For a singleton $\{i\}$, its value is defined in the following way:

\[ V(\{i\}) = \max_{u_i \in U_i} K_i(u_i). \tag{8} \]

An imputation in the cooperative two-stage network game is an n-dimensional profile $\xi = (\xi_1,\dots,\xi_n)$ satisfying $\sum_{i \in N} \xi_i = V(N)$ and $\xi_i \ge V(\{i\})$ for all $i \in N$. Let the set of imputations in game $(N, V)$ be denoted by $I(V)$.

A cooperative solution concept in cooperative TU-game $(N, V)$ is a rule that uniquely assigns a subset $CSC(V) \subseteq I(V)$ to game $(N, V)$. For example, if the cooperative solution concept is the core $C(V)$, then

\[ CSC(V) = C(V) = \Big\{ \xi \in I(V) : \sum_{i \in S} \xi_i \ge V(S),\ S \subset N \Big\}. \]

1.3. Time-consistent and strong time-consistent cooperative solutions

The problem of time consistency and strong time consistency was systematically developed in (Petrosyan and Sedakov, 2014; Gao et al., 2017). Suppose that at the beginning of the game players jointly decide to choose behavior profiles $g_i^*$, $u_i^*$, $i \in N$, to maximize the sum in (7), and then allocate it according to a specified cooperative solution concept $CSC(V)$ which realizes an imputation $\xi = (\xi_1,\dots,\xi_n)$. This means that in the cooperative two-stage network game player $i \in N$ should receive the amount $\xi_i$ as his payoff. What will happen if after the first stage (after choosing the profiles $g_1^*,\dots,g_n^*$) player $i \in N$ recalculates the imputation according to the same cooperative solution concept? The behavior profile $(g_1^*,\dots,g_n^*)$ at the first stage forms network $g^*$; therefore, after recalculation of the imputation (according to the same cooperative solution concept as $\xi$), player $i$'s payoff will be $\xi_i(g^*)$, based on the values of characteristic function $v(g^*, S)$ for all $S \subseteq N$.

The definition of a time-consistent imputation was adapted for two-stage network games in (Petrosyan, Sedakov and Bochkarev, 2013). In the aforementioned paper it was shown for the first time that the Shapley value, the $\tau$-value, and the core are time-inconsistent cooperative solutions in this class of games.

Definition 1. An imputation $\xi \in CSC(V)$ is time consistent if there exists an imputation $\xi(g^*) \in CSC(v(g^*))$ such that the following equality holds for all players:

\[ \xi_i = \xi_i(g^*), \quad i \in N. \tag{9} \]

A cooperative solution concept $CSC(V)$ is time consistent if any imputation $\xi \in CSC(V)$ is time consistent.

Equality (9) means that if we choose a cooperative solution concept $CSC(V)$ at the first stage and according to it calculate the imputation $\xi$, defining players' payoffs, and then at the second stage recalculate players' payoffs according to the same cooperative solution concept $CSC(v(g^*))$, i.e., calculate a new imputation $\xi(g^*)$ subject to the formed network $g^*$, players' payoffs will not change.

Proposition 4. Any cooperative solution concept based only on values $V(N)$, $V(\{i\})$, $i \in N$, is time consistent.

Remark 1. Using the previous proposition, note that the CIS-value $(CIS_1,\dots,CIS_n)$ (Driessen and Funaki, 1991) calculated by the formula

\[ CIS_i = V(\{i\}) + \frac{1}{n}\Big( V(N) - \sum_{j \in N} V(\{j\}) \Big), \quad i \in N, \]

is a time-consistent cooperative solution concept.
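The CIS-value gives each player his individual value plus an equal share of the surplus that remains after all individual values are paid. A minimal Python sketch follows; the function name `cis_value` and the sample numbers are assumptions for illustration.

```python
def cis_value(v_grand, v_single):
    """CIS_i = V({i}) + (V(N) - sum_j V({j})) / n.

    v_grand  -- V(N), the value of the grand coalition
    v_single -- list of individual values V({i}), one per player
    """
    n = len(v_single)
    surplus = v_grand - sum(v_single)
    return [v_i + surplus / n for v_i in v_single]

# Hypothetical values: V(N) = 12 and individual values 1, 2, 3.
cis = cis_value(12, [1, 2, 3])
```

Because the formula uses only V(N) and the individual values, its inputs are the same whether they are taken from V or from v(g*), which is the reason the remark invokes Proposition 4.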

Since in most games condition (9) is not satisfied, the time consistency problem arises: player $i \in N$, who initially expected his payoff to be equal to $\xi_i$, can receive a different payoff $\xi_i(g^*)$. To avoid this situation, a stage payments mechanism for $\xi$, an imputation distribution procedure (Petrosyan and Danilov, 1979), is proposed. The definition of the imputation distribution procedure was also adapted for two-stage network games.

Definition 2. An imputation distribution procedure for $\xi$ in the cooperative two-stage network game is a matrix

\[ \beta = \begin{pmatrix} \beta_{11} & \beta_{12} \\ \vdots & \vdots \\ \beta_{n1} & \beta_{n2} \end{pmatrix}, \]

where

\[ \xi_i = \beta_{i1} + \beta_{i2}, \quad i \in N. \]

The value $\beta_{ik}$ is a payment to player $i$ at stage $k = 1, 2$. Therefore, the following payment scheme is applied: at the first stage of the game player $i \in N$ receives payment $\beta_{i1}$, and at the second stage he receives payment $\beta_{i2}$, so that his total payment over both stages, $\beta_{i1} + \beta_{i2}$, equals the component $\xi_i$ of the imputation which he initially expected to get in the game as his payoff.

Definition 3. Imputation distribution procedure $\beta$ for $\xi$ is time consistent if

\[ \xi_i - \beta_{i1} = \xi_i(g^*) \quad \text{for all } i \in N. \]

It is obvious that the time-consistent imputation distribution procedure for $\xi = (\xi_1,\dots,\xi_n)$ in the cooperative two-stage network game can be defined as follows:

\[ \begin{aligned} \beta_{i1} &= \xi_i - \xi_i(g^*), \\ \beta_{i2} &= \xi_i(g^*), \quad i \in N. \end{aligned} \tag{10} \]
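The time-consistent stage payments can be computed in one line per player: the second-stage payment is the recalculated imputation component, and the first-stage payment is the remainder. The Python sketch below uses a hypothetical function name and hypothetical sample imputations.

```python
def time_consistent_idp(xi, xi_second_stage):
    """Return rows (beta_i1, beta_i2): first-stage remainder and recalculated share.

    xi              -- imputation chosen at the start of the game
    xi_second_stage -- imputation recalculated in the subgame on network g*
    """
    if len(xi) != len(xi_second_stage):
        raise ValueError("both imputations must have one entry per player")
    return [(x - x2, x2) for x, x2 in zip(xi, xi_second_stage)]

# Hypothetical imputations for three players.
xi = [4, 5, 3]
xi_g = [3, 6, 3]
beta = time_consistent_idp(xi, xi_g)
```

Note that a first-stage payment may be negative (player 1 in the sample), which means the player pays at the first stage and is compensated at the second.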

If the cooperative solution concept $CSC(V)$ assigns multiple allocations (for example, the core), a stricter property, strong time consistency, can be used. In (Gao et al., 2017) the definition of a strongly time-consistent solution (the core) was adapted, and the corresponding definitions and propositions were presented.

Definition 4. An imputation $\xi \in CSC(V)$ is strong time consistent if the following inclusion is satisfied:

\[ CSC(v(g^*)) \subseteq CSC(V). \tag{11} \]

A cooperative solution concept $CSC(V)$ is strong time consistent if any imputation $\xi \in CSC(V)$ is strong time consistent.

Therefore, the core $C(V)$ is strong time consistent if $C(v(g^*)) \subseteq C(V)$. The next result directly follows from Proposition 4.

Proposition 5. Any cooperative solution concept based only on values $V(N)$, $V(\{i\})$, $i \in N$, is strong time consistent.

The proof of the statement is very similar to the proof of Proposition 4 replacing equality (9) with inclusion (11).

For cooperative solution concepts which are not strong time consistent, one can also introduce an imputation distribution procedure.

Definition 5. Imputation distribution procedure $\beta$ for $\xi$ is strong time consistent if

\[ (\beta_{11},\dots,\beta_{n1}) \oplus CSC(v(g^*)) \subseteq CSC(V), \tag{12} \]

where $a \oplus A = \{a + a' : a' \in A\}$, $a \in \mathbb{R}^n$, $A \subseteq \mathbb{R}^n$.

Unfortunately, for strong time-consistent imputation distribution procedures it is impossible even to derive formulas similar to (10) in general. However, for the core, one can provide conditions for the existence of strong time-consistent imputation distribution procedures. Note that inclusion (12) for the core can be rewritten as

\[ (\beta_{11},\dots,\beta_{n1}) \oplus C(v(g^*)) \subseteq C(V). \tag{13} \]

Proposition 6 (Gao et al., 2017). Let a set $C(W)$ be an analog of the core in the game with characteristic function $W(S) = V(S) - v(g^*, S)$, $S \subseteq N$, i.e.,

\[ C(W) = \Big\{ (\xi_1,\dots,\xi_n) : \sum_{i \in S} \xi_i \ge W(S),\ S \subset N;\ \sum_{i \in N} \xi_i = W(N) = 0 \Big\}, \]

and let this set be non-empty. Then an imputation distribution procedure $\beta$ for an imputation from the core $C(V)$ satisfying the conditions

\[ \begin{aligned} (\beta_{11},\dots,\beta_{n1}) &\in C(W), \\ (\beta_{12},\dots,\beta_{n2}) &\in C(v(g^*)), \end{aligned} \tag{14} \]

is strong time consistent.

1.4. Two-stage games on directed networks

Now we consider the case of directed networks (Petrosyan and Sedakov, 2014). Since the network can be directed, the characteristic function has to be redefined. Let the resulting network $g$ consist of directed links $(i, j)$ s.t. $g_{ij} = 1$. Define the closure of network $g$ as an undirected network $\bar g$ where $\bar g_{ij} = \max\{g_{ij}, g_{ji}\}$. Similarly to the previous case, the payoff function of player $i$ depends on network $g^d$, his control, and controls $u_j$, $j \in N_i(\bar g^d)$, of his neighbors in the closure $\bar g^d$:

\[ K_i(u_i, u_{N_i(\bar g^d)}) : U_i \times \prod_{j \in N_i(\bar g^d)} U_j \to \mathbb{R}, \quad i \in N. \]
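The closure operation simply symmetrizes the adjacency matrix of the directed network. A minimal Python sketch, with hypothetical function names and a hypothetical three-player network, is:

```python
def closure(g):
    """Undirected closure of a directed network: g_bar[i][j] = max(g[i][j], g[j][i])."""
    n = len(g)
    return [[max(g[i][j], g[j][i]) for j in range(n)] for i in range(n)]

def neighbors(g_bar, i):
    """Neighbors of player i in the undirected closure g_bar."""
    return {j for j, linked in enumerate(g_bar[i]) if linked and j != i}

# Hypothetical directed network: links 0 -> 1 and 2 -> 0.
g = [[0, 1, 0],
     [0, 0, 0],
     [1, 0, 0]]
g_bar = closure(g)
nbrs_of_0 = neighbors(g_bar, 0)
```

In the closure a one-sided link is enough to make two players neighbors, in contrast to the mutual-consent rule used for undirected networks.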

When players act cooperatively, they should choose $g_i \in G_i$ and $(d_i(g), u_i) \in D_i(g) \times U_i$, $i \in N$, to maximize the joint payoff

\[ \sum_{i \in N} K_i(u_i, u_{N_i(\bar g^d)}). \tag{15} \]

Again, to allocate the maximal sum of players' payoffs according to some solution concept, one needs to construct a cooperative TU-game $(N, V)$. Note that

\[ V(N) = \sum_{i \in N} K_i(u_i^*, u^*_{N_i(\bar g^*)}). \]

Consider a non-empty coalition $S \subset N$. Denote by $g_S$ the network formed by profiles $g_i$, $i \in N$, s.t. $g_j = (0,\dots,0)$ for all $j \in N \setminus S$. Let $\bar g_S$ be the closure of $g_S$. For any controls $u_i$, $i \in S$, let controls $\tilde u_j(u_S)$, $j \in N \setminus S$, where $u_S = \{u_i\}$, $i \in S$, solve the following optimization problem:

\[ \sum_{i \in S} K_i\big(u_i, u_{N_i(\bar g_S) \cap S}, \tilde u_{(N \setminus S) \cap N_i(\bar g_S)}(u_S)\big) = \min_{u_j,\ j \in (N \setminus S) \cap N_i(\bar g_S)} \sum_{i \in S} K_i\big(u_i, u_{N_i(\bar g_S) \cap S}, u_{(N \setminus S) \cap N_i(\bar g_S)}\big). \]

Here $u_{N_i(\bar g_S) \cap S}$ is the profile of controls chosen by all neighbors of player $i$ from coalition $S$ in the network $\bar g_S$, and $\tilde u_{(N \setminus S) \cap N_i(\bar g_S)}(u_S)$ is the profile of controls chosen by all players from coalition $N \setminus S$ who are neighbors of player $i$ in the network $\bar g_S$. The next proposition is the analog of Proposition 3.

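Proposition 7 (Petrosyan and Sedakov, 2014). Suppose that functions $K_i$, $i \in N$, are non-negative and satisfy property (P). Then for all $S \subset N$ we have

\[ V(S) = \max_{g_i \in G_i,\, i \in S}\ \max_{u_i \in U_i,\, i \in S} \sum_{i \in S} K_i\big(u_i, u_{N_i(\bar g_S) \cap S}, \tilde u_{(N \setminus S) \cap N_i(\bar g_S)}(u_S)\big). \]

In a similar way one can determine the characteristic function $v(g^*, S)$ for $S \subseteq N$. Note that

\[ v(g^*, N) = \sum_{i \in N} K_i(u_i^*, u^*_{N_i(\bar g^*)}) = V(N). \]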

The following result becomes the analog of Proposition 2 for the case of directed networks.

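Proposition 8. If functions $K_i$, $i \in N$, are non-negative and satisfy property (P), the value $v(g^*, S)$ can be calculated by the formula

\[ v(g^*, S) = \max_{u_i \in U_i,\, i \in S} \sum_{i \in S} K_i\big(u_i, u_{N_i(\bar g^*_S) \cap S}, \tilde u_{(N \setminus S) \cap N_i(\bar g^*_S)}(u_S)\big), \]

where $\tilde u_j(u_S)$, $j \in N \setminus S$, solve the following optimization problem:

\[ \sum_{i \in S} K_i\big(u_i, u_{N_i(\bar g^*_S) \cap S}, \tilde u_{(N \setminus S) \cap N_i(\bar g^*_S)}(u_S)\big) = \min_{u_j,\ j \in (N \setminus S) \cap N_i(\bar g^*_S)} \sum_{i \in S} K_i\big(u_i, u_{N_i(\bar g^*_S) \cap S}, u_{(N \setminus S) \cap N_i(\bar g^*_S)}\big), \]

and $\bar g^*_S$ is the closure of network $g^*_S$, formed by profiles $g_i^*$, $i \in N$, s.t. $g_j^* = (0,\dots,0)$ for all $j \in N \setminus S$.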

1.5. Two-stage games with pairwise interactions

In (Bulgakova and Petrosyan, 2015; Bulgakova and Petrosyan, 2016) two-stage cooperative network games with pairwise interactions were proposed. The first stage is a network formation stage. At the second stage players play bimatrix games with their partners according to the network realized at the first stage.

Description of the model The model under consideration was introduced in (Bulgakova and Petrosyan, 2015). Let $N$ be a finite set of players, $|N| = n \ge 2$. At the first stage $z_1$ each player $i \in N$ chooses his behavior $b_i$, an n-dimensional vector of offers to connect with other players. The result of the first stage is a network $g(b_1,\dots,b_n)$. At the second stage $z_2(g)$, which depends on the network chosen at the first stage, neighbors in the network play pairwise simultaneous bimatrix games, after which the players receive their payoffs and the game ends.

Consider the first stage of the game. As mentioned above, at the first stage players choose behaviors $b_i = (b_{i1},\dots,b_{in})$ with components

\[ b_{ij} = \begin{cases} 1, & \text{if } j \in M_i, \\ 0, & \text{otherwise.} \end{cases} \tag{16} \]

Connection $ij$ takes place if and only if $b_{ij} = b_{ji} = 1$. For brevity, denote $N_i(g)$ by $N_i$. After the network is formed, players pass to the second stage $z_2(g)$. At the second stage an n-person game between all participants of the network takes place. This game is a family of simultaneous pairwise bimatrix games $\gamma_{ij}$ between neighbors. Namely, let $i \in N$, $j \in N$, $i \ne j$, $j \in N_i$; then $i$ plays with $j$ a bimatrix game $\gamma_{ij}$ with payoff matrices $A_{ij}$ and $B_{ji}$ for players $i$ and $j$, respectively:

\[ A_{ij} = \begin{pmatrix} a^{ij}_{11} & a^{ij}_{12} & \cdots & a^{ij}_{1k} \\ a^{ij}_{21} & a^{ij}_{22} & \cdots & a^{ij}_{2k} \\ \vdots & \vdots & \ddots & \vdots \\ a^{ij}_{m1} & a^{ij}_{m2} & \cdots & a^{ij}_{mk} \end{pmatrix}, \qquad B_{ji} = \begin{pmatrix} b^{ji}_{11} & b^{ji}_{12} & \cdots & b^{ji}_{1k} \\ b^{ji}_{21} & b^{ji}_{22} & \cdots & b^{ji}_{2k} \\ \vdots & \vdots & \ddots & \vdots \\ b^{ji}_{m1} & b^{ji}_{m2} & \cdots & b^{ji}_{mk} \end{pmatrix}, \]

\[ a^{ij}_{pl} \ge 0, \quad b^{ji}_{pl} \ge 0, \quad p = 1,\dots,m; \quad l = 1,\dots,k. \]

Characteristic function in two-stage game The characteristic function in the two-stage game with pairwise interactions under consideration is defined using the ideas from (Petrosyan, Sedakov and Bochkarev, 2013). Moreover, due to the structure of interactions, the expression of the characteristic function is found in a closed form.

Game $\Gamma_{z_2}$ can be considered in cooperative form. The characteristic function is defined in a straightforward way. Denote the maximal guaranteed gain (maxmin) of player $i$ ($j$) playing with neighbor $j$ ($i$) by

\[ w_{ij} = \max_{p} \min_{l} a^{ij}_{pl}, \qquad w_{ji} = \max_{l} \min_{p} b^{ji}_{pl}, \qquad p = 1,\dots,m, \quad l = 1,\dots,k. \tag{17} \]
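The maxmin values and the best joint payoff of a pair can be read off the two payoff matrices directly. The sketch below is an illustration only: the function names and the sample 2 x 2 matrices are hypothetical, with player i choosing rows and player j choosing columns.

```python
def maxmin_values(A, B):
    """Return (w_ij, w_ji) for one bimatrix game.

    A[p][l] is the payoff of player i (row player), B[p][l] the payoff of
    player j (column player), for row p and column l.
    """
    w_ij = max(min(row) for row in A)                  # max over rows of row minimum
    w_ji = max(min(B[p][l] for p in range(len(B)))     # max over columns of column minimum
               for l in range(len(B[0])))
    return w_ij, w_ji

def joint_max(A, B):
    """Return the largest total payoff a_pl + b_pl over all cells (p, l)."""
    return max(a + b for row_a, row_b in zip(A, B) for a, b in zip(row_a, row_b))

# Hypothetical 2 x 2 bimatrix game between neighbors i and j.
A = [[3, 0],
     [1, 2]]
B = [[2, 1],
     [3, 4]]
w_ij, w_ji = maxmin_values(A, B)
pair_total = joint_max(A, B)
```

For these matrices the pair can jointly secure 6, which exceeds the sum of what the two players can guarantee separately (1 + 2).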

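The values of the characteristic function $v(z_2; S)$ are equal to:

\[ \begin{aligned} v(z_2; \{ij\}) &= \max_{p,l}\big(a^{ij}_{pl} + b^{ji}_{pl}\big) + \sum_{r \in N_i \setminus \{j\}} w_{ir} + \sum_{q \in N_j \setminus \{i\}} w_{jq}, \quad j \in N_i, \\ v(z_2; \{ij\}) &= v(z_2; \{ji\}) = 0, \quad j \in N \setminus N_i, \\ v(z_2; \{i\}) &= \sum_{j \in N_i} w_{ij}, \\ v(z_2; S) &= \frac{1}{2} \sum_{i, j \in S,\ j \in N_i} \max_{p,l}\big(a^{ij}_{pl} + b^{ji}_{pl}\big) + \sum_{i \in S,\ k \in N_i \setminus S} w_{ik}, \quad S \subset N, \\ v(z_2; N) &= \frac{1}{2} \sum_{i = 1,\ i \ne j,\ j \in N_i}^{n} \max_{p,l}\big(a^{ij}_{pl} + b^{ji}_{pl}\big). \end{aligned} \]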
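The closed form for a general coalition can be sketched as follows: links inside the coalition contribute the best joint payoff of the pair, and links to outsiders contribute the maxmin value of the insider. The Python code below is a minimal sketch under that reading; the function name, the data layout, and the three-player numbers are assumptions for illustration.

```python
def v_second_stage(S, links, joint, w):
    """v(z2; S): cooperative gains inside S plus maxmin gains against outsiders.

    links -- set of frozenset({i, j}) describing the network
    joint -- dict frozenset({i, j}) -> max over cells of a_pl^{ij} + b_pl^{ji}
    w     -- dict (i, j) -> maxmin value w_ij of player i against neighbor j
    """
    S = set(S)
    # Each unordered link inside S is counted once, which equals one half
    # of the sum over ordered pairs.
    inside = sum(joint[e] for e in links if e <= S)
    # For links with exactly one endpoint in S, the insider gets his maxmin value.
    outside = sum(w[(i, k)]
                  for e in links if len(e & S) == 1
                  for i in e & S for k in e - S)
    return inside + outside

# Hypothetical complete network on three players.
links = {frozenset((0, 1)), frozenset((0, 2)), frozenset((1, 2))}
joint = {frozenset((0, 1)): 6, frozenset((0, 2)): 5, frozenset((1, 2)): 4}
w = {(0, 1): 1, (1, 0): 2, (0, 2): 1, (2, 0): 1, (1, 2): 0, (2, 1): 2}
v_one = v_second_stage({0}, links, joint, w)
v_two = v_second_stage({0, 1}, links, joint, w)
v_grand = v_second_stage({0, 1, 2}, links, joint, w)
```

On connected pairs and singletons the function agrees with the special cases listed for those coalitions.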

In this case we define the value of the characteristic function for coalition $S \subset N$ as the lower value of the zero-sum game between $S$ and $N \setminus S$ in game $\Gamma_{z_2}$, and superadditivity follows from this.

As before, for $S \subset N$ define the characteristic function $v(z_1; S)$ as the lower value of the zero-sum game between coalition $S$, acting as player I (maximizing), and coalition $N \setminus S$, acting as player II (minimizing), where the payoff of player $S$ is the sum of payoffs of the players in $S$, and a strategy of player $S$ is an element of the Cartesian product of the strategy sets of the players from $S$. For the minimizing player the best behavior is to eliminate all connections with the maximizing player (because each connection yields nonnegative payoffs). Hence we get:

\[ \begin{aligned} v(z_1; \{i\}) &= 0, \qquad v(z_1; \emptyset) = 0, \\ v(z_1; \{ij\}) &= \begin{cases} \max_{p,l}\big(a^{ij}_{pl} + b^{ji}_{pl}\big), & j \in N_i, \\ 0, & j \in N \setminus N_i, \end{cases} \\ v(z_1; S) &= \frac{1}{2} \sum_{i \in S,\ j \in N_i \cap S} v(z_1; \{ij\}), \\ v(z_1; N) &= v(z_2; N). \end{aligned} \]

1.6. The core in two-stage three-person game

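With the use of the characteristic function considered in (Bulgakova and Petrosyan, 2015), we examine the core as a solution in a three-person game. Define also the core $C(z) \subset I(v)$ in game $\Gamma$ and suppose that $C(z) \ne \emptyset$ for $z = z_1, z_2$. For the second stage $z_2$ we have the following values of the characteristic function:

\[ \begin{aligned} v(z_2; \emptyset) &= 0, \\ v(z_2; \{1\}) &= w_{13} + w_{12}, \quad v(z_2; \{2\}) = w_{21} + w_{23}, \quad v(z_2; \{3\}) = w_{31} + w_{32}, \\ v(z_2; \{12\}) &= \max_{p,l}\big(a^{12}_{pl} + b^{21}_{pl}\big) + w_{13} + w_{23}, \\ v(z_2; \{13\}) &= \max_{p,l}\big(a^{13}_{pl} + b^{31}_{pl}\big) + w_{12} + w_{32}, \\ v(z_2; \{23\}) &= \max_{p,l}\big(a^{23}_{pl} + b^{32}_{pl}\big) + w_{21} + w_{31}, \\ v(z_2; N) &= \max_{p,l}\big(a^{12}_{pl} + b^{21}_{pl}\big) + \max_{p,l}\big(a^{13}_{pl} + b^{31}_{pl}\big) + \max_{p,l}\big(a^{23}_{pl} + b^{32}_{pl}\big). \end{aligned} \]

Introduce the notation: $A_{12} = \max_{p,l}(a^{12}_{pl} + b^{21}_{pl})$, $D_1 = w_{23} + w_{13}$, $A_{13} = \max_{p,l}(a^{13}_{pl} + b^{31}_{pl})$, $D_2 = w_{12} + w_{32}$, $A_{23} = \max_{p,l}(a^{23}_{pl} + b^{32}_{pl})$, $D_3 = w_{21} + w_{31}$. Imputation $x = (x_1, x_2, x_3)$ belongs to the core $C(z_2)$ when the following conditions are satisfied:

\[ \begin{cases} x_1 + x_2 \ge v(z_2; \{12\}), \\ x_1 + x_3 \ge v(z_2; \{13\}), \\ x_2 + x_3 \ge v(z_2; \{23\}), \\ x_1 \ge v(z_2; \{1\}), \quad x_2 \ge v(z_2; \{2\}), \quad x_3 \ge v(z_2; \{3\}), \\ x_1 + x_2 + x_3 = v(z_2; N). \end{cases} \]

This system, which defines the structure of the core $C(z_2)$, can be rewritten in the form:

\[ \begin{cases} x_1 + x_2 \ge A_{12} + D_1, \\ x_1 + x_3 \ge A_{13} + D_2, \\ x_2 + x_3 \ge A_{23} + D_3, \\ x_1 + x_2 + x_3 = v(z_2; N). \end{cases} \]

Consider the core $C(z_1)$ of the two-stage game $\Gamma$ and rewrite it in the new notation:

\[ \begin{cases} x_1 + x_2 \ge v(z_1; \{12\}), \\ x_1 + x_3 \ge v(z_1; \{13\}), \\ x_2 + x_3 \ge v(z_1; \{23\}), \\ x_1 + x_2 + x_3 = v(z_1; N), \end{cases} \qquad \text{that is,} \qquad \begin{cases} x_1 + x_2 \ge A_{12}, \\ x_1 + x_3 \ge A_{13}, \\ x_2 + x_3 \ge A_{23}, \\ x_1 + x_2 + x_3 = v(z_1; N). \end{cases} \]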

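Strong time consistency of the core Using an IDP $\beta$, where $\beta_i^k$ is the payment to player $i$ at stage $k = 1, 2$ and $x_i = \beta_i^1 + \beta_i^2$, we get:

\[ \begin{aligned} \beta_1^1 + \beta_1^2 + \beta_2^1 + \beta_2^2 &\ge A_{12}, \\ \beta_1^1 + \beta_1^2 + \beta_3^1 + \beta_3^2 &\ge A_{13}, \\ \beta_2^1 + \beta_2^2 + \beta_3^1 + \beta_3^2 &\ge A_{23}. \end{aligned} \tag{18} \]

For strong time consistency these inequalities must hold under the following additional conditions:

\[ \begin{aligned} \beta_1^2 + \beta_2^2 &\ge A_{12} + D_1, \\ \beta_1^2 + \beta_3^2 &\ge A_{13} + D_2, \\ \beta_2^2 + \beta_3^2 &\ge A_{23} + D_3. \end{aligned} \tag{19} \]

Fix $\beta^1 = (\beta_1^1, \beta_2^1, \beta_3^1)$; then for strong time consistency conditions (18) must hold for all $\beta^2 = (\beta_1^2, \beta_2^2, \beta_3^2)$ satisfying conditions (19). Also, since $v(z_2; N) = v(z_1; N)$, we get $\beta_1^1 + \beta_2^1 + \beta_3^1 = 0$. If (18) holds for the minimal values of the sums $\beta_1^2 + \beta_2^2$, $\beta_1^2 + \beta_3^2$, $\beta_2^2 + \beta_3^2$ allowed by (19), then it holds for all other values as well. Substituting these minimal values and using $\beta_1^1 + \beta_2^1 = -\beta_3^1$, $\beta_1^1 + \beta_3^1 = -\beta_2^1$, $\beta_2^1 + \beta_3^1 = -\beta_1^1$, we get:

\[ \begin{aligned} -\beta_3^1 + A_{12} + D_1 &\ge A_{12}, \\ -\beta_2^1 + A_{13} + D_2 &\ge A_{13}, \\ -\beta_1^1 + A_{23} + D_3 &\ge A_{23}. \end{aligned} \tag{20} \]

Hence we obtain conditions for strong time consistency of the core $C(z_1)$ in game $\Gamma$.

Proposition 9. Suppose that the following conditions are satisfied:

\[ \begin{aligned} \beta_3^1 &\le D_1, \\ \beta_2^1 &\le D_2, \\ \beta_1^1 &\le D_3 \end{aligned} \tag{21} \]

(i.e., there exists $\beta^1$ which satisfies (21)). Then the core $C(z_1)$ is strongly time-consistent.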
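The conditions of Proposition 9 are easy to test numerically for given first-stage payments. The following Python sketch assumes players are numbered 1 to 3 and that the first-stage payments sum to zero, as derived above; the function name, the tolerance, and the sample maxmin values are hypothetical.

```python
def satisfies_condition_21(beta1, w, tol=1e-9):
    """Check beta_3^1 <= D1, beta_2^1 <= D2, beta_1^1 <= D3 for three players.

    beta1 -- first-stage payments (beta_1^1, beta_2^1, beta_3^1), expected to sum to 0
    w     -- dict (i, j) -> maxmin value w_ij, players numbered 1..3
    """
    b1, b2, b3 = beta1
    D1 = w[(2, 3)] + w[(1, 3)]
    D2 = w[(1, 2)] + w[(3, 2)]
    D3 = w[(2, 1)] + w[(3, 1)]
    sums_to_zero = abs(b1 + b2 + b3) < tol
    return sums_to_zero and b3 <= D1 and b2 <= D2 and b1 <= D3

# Hypothetical maxmin values: D1 = 1, D2 = 3, D3 = 3.
w = {(1, 2): 1, (2, 1): 2, (1, 3): 1, (3, 1): 1, (2, 3): 0, (3, 2): 2}
ok = satisfies_condition_21((0, 0, 0), w)
too_large = satisfies_condition_21((-1, -1, 2), w)
```

Since all maxmin values are nonnegative, zero first-stage payments always pass the check, which illustrates why a suitable vector of first-stage payments exists.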

2. Dynamic games with shock

The papers (Gao et al., 2017; Petrosyan and Sedakov, 2016) are devoted to repeated games with a finite number of rounds. In this framework, the first round is the network formation stage, where players form a network by choosing their neighbors. All the subsequent rounds have a similar structure: observing the network, each player may reconsider his set of neighbors (he can only make the set smaller), and after that the player selects an admissible control. Players' decisions made in the current round do not influence the structure of the game in any of the subsequent rounds. What does influence it is a so-called "shock", an external factor of a stochastic nature. There are different types of shocks. For instance, in (Corbae and Duffy, 2008), the shock changes the sets of players' actions. In the setting considered here, it is supposed that the shock makes a particular player inactive in the game. Moreover, it is assumed that the shock may appear in each round after the network formation stage, but once the shock appears, it will never appear in subsequent rounds of the game.

One-shot network games are studied in (Bala and Goyal, 2000; Galeotti et al., 2006; Haller, 2012) where Nash equilibrium is considered as a solution. The model is based on a cooperative two-stage network formation game (Petrosyan, Sedakov and Bochkarev, 2013). It is worth noting that for static games (or two-stage games), similar settings involving network formation as well as a strategic component are well-studied for coordination games and prisoner's dilemma games (Goyal and Vega-Redondo, 2005; Jackson and Wolinsky, 1996; Xie et al., 2013). Dynamic aspects of network formation including stochastic elements or cooperative behavior, are considered, for instance, in (Feri, 2007; Fosco and Mengel, 2011; Feri and Melendez-Jimenez, 2013; Jackson and Watts, 2002).

The papers (Gao et al., 2017; Petrosyan and Sedakov, 2016) also cover the problem of subgame consistency of a cooperative solution in repeated network games, namely, subgame consistency of the dynamic Shapley value (Shapley, 1953). It is known that the Shapley value is an efficient cooperative solution. The Shapley value is subgame consistent if, for any player, his entry of the Shapley value equals the sum of his cumulative individual stage payoffs up to an arbitrary round and his entry of the Shapley value in the subgame starting from this round, provided that all the players follow the cooperative agreement. The notion of subgame consistency was introduced in (Petrosjan, 2006) for a cooperative stochastic game. Inconsistency of the cooperative solution may break the cooperative agreement, but by means of a specially designed imputation distribution procedure (Petrosyan and Danilov, 1979), the cooperative agreement can be kept throughout the game.

As an application of the proposed theory, one can imagine a wireless network in which a pair of wireless agents (players) can transmit data to each other. Data transmission is successful if transmit power of the players is greater than a threshold, i.e. if they are "connected". Thus the first stage can be interpreted as a stage at which players choose their transmit power. Then observing the network, agents can reduce their transmit power (if it makes sense) and select the transmission capacity according to a demand, while the shock can make the particular agent inactive in the network. In a cooperative scenario one may focus on finding a policy that maximizes the expected total profit of the network according to its topology and transmission capacities chosen by the agents.

2.1. The model

Because (Gao et al., 2017) is important from our perspective, we describe the model in detail. We consider a dynamic game with more than two stages. Let $l + 1$ be the length of the game: it consists of one network formation stage and $l$ rounds. When the game is deterministic, the theory of two-stage games extends easily to the $(l + 1)$-stage game. For this reason a stochastic element called a "shock", which influences the network structure, is introduced. The shock, which may appear between rounds with a given probability $p$, is characterized by a discrete random variable $\omega$ that takes only $l + 1$ values. If $\omega = 0$, the shock does not appear in the game, whereas $\omega = t$ specifies the game round before which the shock appears. It is supposed that the probability of the shock in round $t$ equals

$$\Pr(\omega = t) = (1 - p)^{t-1} p \quad \text{for } t \in \{1, \ldots, l\},$$

and the probability that the shock does not appear in the game is $\Pr(\omega = 0) = (1 - p)^{l}$.
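The distribution of the shock time can be tabulated directly. The following is a minimal sketch; the function name `shock_distribution` and the numerical values of `p` and `l` are illustrative assumptions, not part of the model.

```python
def shock_distribution(p: float, l: int) -> list[float]:
    """Return [Pr(w=0), Pr(w=1), ..., Pr(w=l)] for shock probability p and l rounds."""
    probs = [(1 - p) ** l]  # w = 0: the shock never appears
    # w = t: no shock before rounds 1..t-1, then the shock before round t
    probs += [(1 - p) ** (t - 1) * p for t in range(1, l + 1)]
    return probs


probs = shock_distribution(p=0.2, l=5)
total = sum(probs)  # the l + 1 probabilities add up to one
```

The check that the probabilities sum to one confirms that the $l + 1$ values of the random variable exhaust all cases.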

Consider a tuple $(\tau_1, \ldots, \tau_l)$ where $\tau_t \in \{0, 1, \ldots, t\}$ for all $t \in \{1, \ldots, l\}$. We associate the realization $\omega = t$ with a unique tuple $(\tau_1(\omega), \ldots, \tau_l(\omega))$ such that $\tau_1(t) = \ldots = \tau_{t-1}(t) = 0$ and $\tau_t(t) = \ldots = \tau_l(t) = t$. Below, to simplify notation, the dependence on $\omega$ is left out.
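As an illustration of this correspondence, a short sketch that maps a realization of the shock time to its tuple is given below. The function name `tau_tuple` is a hypothetical helper introduced only for this example.

```python
def tau_tuple(w: int, l: int) -> tuple[int, ...]:
    """Tuple (tau_1, ..., tau_l) associated with the realization w of the shock time."""
    if w == 0:
        return (0,) * l  # the shock never appears
    # entries are 0 before round w and equal to w from round w onwards
    return tuple(0 if t < w else w for t in range(1, l + 1))


example = tau_tuple(3, 5)  # shock before round 3 in a game with 5 rounds
```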

Game stages Again we distinguish the network formation stage and other subsequent stages as in (Petrosyan, Sedakov and Bochkarev, 2013).

Network formation stage. This stage does not differ from that of the two-stage model. Let network $g$ be formed at this stage. Denote by $m \in N$ the player who has more neighbors in $g$ than any other player, i.e.

$$m = \arg\max_{i \in N} |N_i(g)|. \tag{22}$$

If more than one player satisfies (22), we select one of them for further consideration.

Round 1. At the beginning of the round the shock appears with probability $p$, i.e. with this probability player $m$ becomes inactive in the network. In other words, all links involving player $m$ are eliminated from network $g$, yet this player still belongs to set $N$ and receives zero payoffs. Note that in round 1, $\tau_1$ can take only two values: 0 and 1.

Let $g_{-m}$ denote the network in which all the links with player $m \in N$ are deleted, i.e. $g_{-m} = g \setminus \{(j, m) \in g : j \in N_m(g)\}$. Thus we have a network

$$g^{1,\tau_1} = \begin{cases} g, & \tau_1 = 0, \\ g_{-m}, & \tau_1 = 1. \end{cases}$$
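The selection of the most connected player and the effect of the shock on the network can be sketched as follows. The representation of an undirected network as a set of two-element `frozenset` links, the helper names, and the tie-breaking rule (smallest label) are assumptions made for illustration; the text only requires that one of the maximizers be selected.

```python
def neighbors(g, i):
    """Players linked to i in network g (a set of frozenset({i, j}) links)."""
    return {j for link in g if i in link for j in link if j != i}


def most_connected(g, players):
    """Player with the largest number of neighbors; ties go to the smallest label (assumption)."""
    return max(sorted(players), key=lambda i: len(neighbors(g, i)))


def remove_player_links(g, m):
    """Network obtained from g by deleting every link that involves player m."""
    return {link for link in g if m not in link}


g = {frozenset({1, 2}), frozenset({1, 3}), frozenset({2, 3}), frozenset({1, 4})}
m = most_connected(g, {1, 2, 3, 4})
g_after_shock = remove_player_links(g, m)
```

In this example player 1 has three neighbors, so the shock would leave only the link between players 2 and 3.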

After observing the network $g^{1,\tau_1}$, players are allowed to reconsider its structure: specifically, players can only delete some "ineffective" links. For this purpose, we introduce $n$-dimensional vectors $d_i(g^{1,\tau_1}) = (d_{i1}(g^{1,\tau_1}), \ldots, d_{in}(g^{1,\tau_1}))$, $i \in N$, which describe an updated network:

$$d_{ij}(g^{1,\tau_1}) = \begin{cases} 1, & \text{if } i \text{ keeps the link with player } j \in N_i(g^{1,\tau_1}) \text{ in } g^{1,\tau_1}, \\ 0, & \text{otherwise.} \end{cases} \tag{23}$$

Let $D_i(g^{1,\tau_1}) = \{d_i(g^{1,\tau_1}) : d_i(g^{1,\tau_1}) \text{ satisfies (23)}\}$, $i \in N$. The profile $d(g^{1,\tau_1}) = (d_1(g^{1,\tau_1}), \ldots, d_n(g^{1,\tau_1}))$ updates network $g^{1,\tau_1}$; the new network, denoted by $g^{d,1,\tau_1}$, consists of the links $(i, j)$ such that $d_{ij}(g^{1,\tau_1}) = d_{ji}(g^{1,\tau_1}) = 1$.
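The update rule says that a link survives only if both of its endpoints keep it. A minimal sketch of this rule is given below, assuming (for illustration only) that a network is a set of two-element `frozenset` links and that each player's decision is stored as the set of neighbors he keeps; the name `updated_network` is hypothetical.

```python
def updated_network(g, kept):
    """Keep a link {i, j} of g only if i keeps j and j keeps i.

    kept[i] is the set of players whose links player i decides to keep.
    """
    new_g = set()
    for link in g:
        i, j = tuple(link)
        if j in kept.get(i, set()) and i in kept.get(j, set()):
            new_g.add(link)
    return new_g


g = {frozenset({1, 2}), frozenset({2, 3}), frozenset({1, 3})}
kept = {1: {2, 3}, 2: {1}, 3: {1, 2}}  # player 2 drops the link with player 3
g_updated = updated_network(g, kept)
```

Here the link between players 2 and 3 disappears although player 3 wanted to keep it, since deleting a link requires the decision of only one endpoint.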

At the same time, each player chooses a control from a given set. In particular, player $i \in N$ chooses $u_i^{1,\tau_1} \in U_i$. Then the behavior of player $i$ in round 1 is a pair $(d_i(g^{1,\tau_1}), u_i^{1,\tau_1})$. A payoff to player $i \in N$ is defined by a real-valued payoff function $K_i$ which depends on the updated network $g^{d,1,\tau_1}$, the control $u_i^{1,\tau_1}$ of player $i$, and the controls $u_j^{1,\tau_1}$, $j \in N_i(g^{d,1,\tau_1})$, of his neighbors, i.e. $K_i(u_i^{1,\tau_1}, u^{1,\tau_1}_{N_i(g^{d,1,\tau_1})})$.

Having received the payoffs in this round, players proceed to the next round, which has a similar structure. Consider an intermediate round $t \in \{2, \ldots, l\}$ and suppose that we have an updated network $g^{d,t-1,\tau_{t-1}}$ after round $t - 1$.

Round $t$, $t \in \{2, \ldots, l\}$. At the beginning of this round the shock appears with probability $p$ (if it did not appear before), in which case player $m$ becomes inactive. If the shock appeared in previous rounds, nothing happens. Thus we have a network just prior to round $t$:

$$g^{t,\tau_t} = \begin{cases} g^{d,t-1,\tau_{t-1}}, & \tau_t \in \{0, 1, \ldots, t-1\}, \\ g^{d,t-1,\tau_{t-1}}_{-m}, & \tau_t = t. \end{cases}$$

After observing the network $g^{t,\tau_t}$, players are allowed to reconsider its link structure. For this purpose, we introduce $n$-dimensional vectors $d_i(g^{t,\tau_t}) = (d_{i1}(g^{t,\tau_t}), \ldots, d_{in}(g^{t,\tau_t}))$, $i \in N$, which describe an updated network:

$$d_{ij}(g^{t,\tau_t}) = \begin{cases} 1, & \text{if } i \text{ keeps the link with player } j \in N_i(g^{t,\tau_t}) \text{ in } g^{t,\tau_t}, \\ 0, & \text{otherwise.} \end{cases} \tag{24}$$

Let $D_i(g^{t,\tau_t}) = \{d_i(g^{t,\tau_t}) : d_i(g^{t,\tau_t}) \text{ satisfies (24)}\}$, $i \in N$. The profile $d(g^{t,\tau_t}) = (d_1(g^{t,\tau_t}), \ldots, d_n(g^{t,\tau_t}))$ updates network $g^{t,\tau_t}$; the new network, denoted by $g^{d,t,\tau_t}$, consists of the links $(i, j)$ such that $d_{ij}(g^{t,\tau_t}) = d_{ji}(g^{t,\tau_t}) = 1$.

At the same time, each player chooses a control from a given set. In particular, player $i \in N$ chooses $u_i^{t,\tau_t} \in U_i$. Then the behavior of player $i$ in round $t$ is a pair $(d_i(g^{t,\tau_t}), u_i^{t,\tau_t})$.

In round $t$, a payoff to player $i \in N$ is defined by the same real-valued payoff function $K_i(u_i^{t,\tau_t}, u^{t,\tau_t}_{N_i(g^{d,t,\tau_t})})$.

Having received the payoffs in this round, players proceed to round $t + 1$ unless $t = l$, in which case the game ends.

Strategies To formalize the game, define strategies of players.

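Definition 6. A strategy $x_i = \{x_i^{\omega}\}$ of player $i \in N$ is a rule that assigns a profile

$$x_i^{\omega} = \big(g_i, (d_i(g^{1,\tau_1}), u_i^{1,\tau_1}), \ldots, (d_i(g^{l,\tau_l}), u_i^{l,\tau_l})\big)$$

to each value $\omega \in \{0, 1, \ldots, l\}$.

Recall that $\omega$ defines the profile $(\tau_1, \ldots, \tau_l)$ in a unique way; therefore, for any $\omega = t$, $t \in \{0, 1, \ldots, l\}$, and player $i \in N$, we get

$$x_i^{t} = \begin{cases} \big(g_i, (d_i(g^{1,0}), u_i^{1,0}), \ldots, (d_i(g^{t-1,0}), u_i^{t-1,0}), (d_i(g^{t,t}), u_i^{t,t}), \ldots, (d_i(g^{l,t}), u_i^{l,t})\big), & i \neq m, \\ \big(g_m, (d_m(g^{1,0}), u_m^{1,0}), \ldots, (d_m(g^{t-1,0}), u_m^{t-1,0}), (0, u_m^{t,t}), \ldots, (0, u_m^{l,t})\big), & i = m. \end{cases}$$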

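Let $X_i$ denote the set of strategies of player $i \in N$. Given a value $\omega = t$, consider a profile $x^t = (x_1^t, \ldots, x_n^t)$. A payoff to player $i \in N$ for $\omega = t$ equals

$$K_i(x^t) = \begin{cases} \sum\limits_{j=1}^{t-1} K_i(u_i^{j,0}, u^{j,0}_{N_i(g^{d,j,0})}) + \sum\limits_{j=t}^{l} K_i(u_i^{j,t}, u^{j,t}_{N_i(g^{d,j,t})}), & i \neq m, \\ \sum\limits_{j=1}^{t-1} K_m(u_m^{j,0}, u^{j,0}_{N_m(g^{d,j,0})}), & i = m. \end{cases}$$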

Then the payoff $E_i$ to player $i \in N$ in the whole game is defined as his expected payoff, provided that the strategy profile $x = (x_1, \ldots, x_n) \in X_1 \times \ldots \times X_n$ is chosen:

$$E_i(x) = \sum_{t=0}^{l} \Pr(\omega = t) K_i(x^t).$$

2.2. Cooperation in the dynamic game with shock

In the previous section we specified rules of the repeated game and formalized it. Now the repeated game is considered from the perspective of classical cooperative theory.

Characteristic function Under the cooperative agreement, the value $V(N)$ can be determined using the ideas of (Petrosyan, Sedakov and Bochkarev, 2013). Since all the players aim at maximizing their total expected payoff, let

$$V(N) = \max_{x \in X_1 \times \ldots \times X_n} \sum_{i \in N} E_i(x) = \sum_{i \in N} E_i(x^*). \tag{25}$$

A strategy profile $x^* = (x_1^*, \ldots, x_n^*)$ whose entries $x_i^*$, $i \in N$, solve (25) is called the cooperative strategy profile.

Proposition 10. Let functions $K_i$, $i \in N$, satisfy property (P). Then the cooperative strategy $x_i^* = \{x_i^{*\omega}\}$ of player $i \in N$ for a given $\omega = t$, $t \in \{0, 1, \ldots, l\}$, is of the form

$$x_i^{*t} = \big(g_i^*, (g_i^*, u_i^{*0}), \ldots, (g_i^*, u_i^{*0}), (g_i^*, u_i^{*t}), \ldots, (g_i^*, u_i^{*t})\big) \quad \text{for all } i \in N \setminus \{m\}$$

and

$$x_m^{*t} = \big(g_m^*, (g_m^*, u_m^{*0}), \ldots, (g_m^*, u_m^{*0}), (0, u_m^{*t}), \ldots, (0, u_m^{*t})\big).$$

From the previous statement, we conclude that only two networks are possible in the cooperative framework: network $g^*$, which is not changed until the shock appears, and network $g^*_{-m}$, which is not changed after the shock has appeared.

The next result is an extension of the result introduced in (Gao et al., 2017), provided that functions $K_i$, $i \in N$, satisfy property (P). The statement connects the value $V(S)$ for a given coalition $S \subseteq N$ with the values of "local" (or stage) characteristic functions in the games played in each round. More specifically, given a round $t$ and a number $\tau_t \in \{0, 1, \ldots, t\}$, we define stage characteristic functions as in (Gao et al., 2017): $v(S)$ before the shock and $\bar{v}(S)$ after the shock:

$$v(S) = \max_{u_i \in U_i,\, i \in S} \sum_{i \in S} K_i(u_i, u_{N_i(g) \cap S}), \tag{26}$$

$$\bar{v}(S) = \max_{u_i \in U_i,\, i \in S \setminus \{m\}} \sum_{i \in S \setminus \{m\}} K_i(u_i, u_{N_i(g) \cap S}). \tag{27}$$

Proposition 11. The value $V(S)$ can be found from the recurrence equation

$$V(S) = p\,l\,\bar{v}(S) + (1 - p)V_1(S),$$

where

$$V_t(S) = v(S) + p(l - t)\bar{v}(S) + (1 - p)V_{t+1}(S)$$

for $t = 1, \ldots, l - 1$ with boundary condition $V_l(S) = v(S)$.
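The recurrence of Proposition 11 can be evaluated backwards from the last round. The sketch below does this for a single coalition and compares the result with the expectation computed directly over the shock time; the function names and the numerical values are illustrative assumptions, with `v` and `v_bar` standing for the stage values of the coalition before and after the shock.

```python
def coalition_value(v: float, v_bar: float, p: float, l: int) -> float:
    """V(S) from the backward recurrence, given the stage values v(S) and v_bar(S)."""
    V_t = v  # boundary condition V_l(S) = v(S)
    for t in range(l - 1, 0, -1):
        V_t = v + p * (l - t) * v_bar + (1 - p) * V_t
    return p * l * v_bar + (1 - p) * V_t


def coalition_value_direct(v: float, v_bar: float, p: float, l: int) -> float:
    """The same value as an explicit expectation over the shock time w."""
    total = (1 - p) ** l * l * v  # w = 0: all l rounds are played before any shock
    for w in range(1, l + 1):
        # w - 1 rounds before the shock, l - w + 1 rounds after it
        total += (1 - p) ** (w - 1) * p * ((w - 1) * v + (l - w + 1) * v_bar)
    return total


value = coalition_value(v=3.0, v_bar=1.0, p=0.3, l=6)
```

Both functions agree, which reflects that the recurrence simply conditions on whether the shock appears before the current round.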

2.3. Cooperative solution

Having determined the values $V(S)$ for all $S \subseteq N$, one can define an imputation, which is an $n$-dimensional vector showing how the maximal total expected payoff $V(N)$ is allocated among the players. Let the Shapley value $\Phi = (\Phi_1, \ldots, \Phi_n)$ be taken as the solution. Its entries, for all $i \in N$, are determined by the formula

$$\Phi_i = \sum_{S \subseteq N,\, i \in S} \alpha_S \left[V(S) - V(S \setminus \{i\})\right], \tag{28}$$

where $\alpha_S = (|N| - |S|)!\,(|S| - 1)!/|N|!$.
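For a small number of players the Shapley value can be computed by direct enumeration of coalitions. The following is a minimal sketch of that formula; the function name `shapley_value` and the toy characteristic function used in the example are assumptions for illustration and do not come from the network model.

```python
from itertools import combinations
from math import factorial


def shapley_value(players, V):
    """Shapley value of the game (players, V); V maps a frozenset coalition to its value."""
    n = len(players)
    result = {}
    for i in players:
        others = [j for j in players if j != i]
        total = 0.0
        for k in range(len(others) + 1):
            for rest in combinations(others, k):
                S = frozenset(rest) | {i}  # coalition containing player i
                alpha = factorial(n - len(S)) * factorial(len(S) - 1) / factorial(n)
                total += alpha * (V(S) - V(S - {i}))
        result[i] = total
    return result


# toy symmetric game: a coalition of size s is worth s squared
phi = shapley_value([1, 2, 3], lambda S: float(len(S) ** 2))
```

In the toy game each of the three symmetric players receives one third of the grand coalition's value. The enumeration is exponential in the number of players, so it is only practical for small games.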

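Let $\varphi = (\varphi_1, \ldots, \varphi_n)$ and $\bar\varphi = (\bar\varphi_1, \ldots, \bar\varphi_n)$ be the "local" Shapley values calculated for the characteristic functions $v$ and $\bar{v}$ respectively:

$$\varphi_i = \sum_{S \subseteq N,\, i \in S} \alpha_S \left[v(S) - v(S \setminus \{i\})\right],$$

$$\bar\varphi_i = \sum_{S \subseteq N,\, i \in S} \alpha_S \left[\bar{v}(S) - \bar{v}(S \setminus \{i\})\right].$$

Note that $\bar\varphi_m = 0$.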

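Proposition 12 (Gao et al., 2015). The Shapley value $\Phi$ in (28) can be found in an explicit form by means of the Shapley values $\varphi$ and $\bar\varphi$ in stage games:

$$\Phi_i = \frac{(1-p)\left[1-(1-p)^{l}\right]}{p}\,\varphi_i + \left(l - \frac{(1-p)\left[1-(1-p)^{l}\right]}{p}\right)\bar\varphi_i, \quad i \neq m, \tag{29}$$

$$\Phi_m = \frac{(1-p)\left[1-(1-p)^{l}\right]}{p}\,\varphi_m. \tag{30}$$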
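The coefficient $(1-p)[1-(1-p)^l]/p$ in Proposition 12 can be read as the expected number of rounds played before the shock, and the coefficient of the post-shock value is the expected number of remaining rounds. A short numerical sketch of this interpretation follows; the function names and the test values are illustrative assumptions.

```python
def pre_shock_weight(p: float, l: int) -> float:
    """Closed-form coefficient of the pre-shock stage Shapley value."""
    return (1 - p) * (1 - (1 - p) ** l) / p


def expected_pre_shock_rounds(p: float, l: int) -> float:
    """Expected number of rounds played before the shock, computed from the shock distribution."""
    total = (1 - p) ** l * l  # no shock at all: every round is a pre-shock round
    for w in range(1, l + 1):
        total += (1 - p) ** (w - 1) * p * (w - 1)  # shock before round w
    return total


weight_before = pre_shock_weight(0.25, 8)
weight_after = 8 - weight_before  # expected number of rounds played after the shock
```

Because the Shapley value is linear in the characteristic function, weighting the two stage Shapley values by these expected round counts is consistent with the recurrence for $V(S)$.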

2.4. Subgame-consistency problem

The problem of subgame consistency of cooperative solutions was considered in (Yeung and Petrosyan, 2006; Yeung and Petrosyan, 2012) for cooperative differential games. In cooperative dynamic network games this problem was examined in (Gao et al., 2017; Petrosyan and Sedakov, 2016). Before the game $\Gamma_C$ starts, players agree on choosing the cooperative strategies $x_1^*, \ldots, x_n^*$ from (25), i.e. the strategies that maximize the total expected payoff, and on allocating the value $V(N)$ according to the Shapley value $\Phi$. This means that in $\Gamma_C$ each player $i \in N$ expects his payoff to be equal to $\Phi_i$. If players recalculate the Shapley value after the network formation stage (after choosing $g_1^*, \ldots, g_n^*$), it turns out that the recalculated Shapley value differs from the "original" $\Phi$. This may break the cooperative agreement, since some players may refuse to use their cooperative strategies. We study the problem in detail.

Characteristic function and the Shapley value in a subgame Similarly to (26) and (27), we define characteristic functions before the shock, $v(g, S)$, and after the shock, $\bar{v}(g, S)$, for any $S \subseteq N$, provided that network $g$ has formed (which is the case after the network formation stage):

$$v(g, S) = \max_{u_i \in U_i,\, i \in S} \sum_{i \in S} K_i(u_i, u_{N_i(g) \cap S}), \tag{31}$$

$$\bar{v}(g, S) = \max_{u_i \in U_i,\, i \in S \setminus \{m\}} \sum_{i \in S \setminus \{m\}} K_i(u_i, u_{N_i(g) \cap S}). \tag{32}$$

Consider a game round $t \in \{1, \ldots, l\}$ and $\tau \in \{0, 1, \ldots, t\}$. Let $\Gamma_C^{t,\tau} = (N, V^{t,\tau})$ denote a subgame of the game $\Gamma_C$. The characteristic function $V^{t,\tau}$ for any $S \subseteq N$ is defined similarly to $V(S)$ as

$$V^{t,\tau}(S) = \begin{cases} (l - t + 1)\,\bar{v}(g_{-m}, S) & \text{for } \tau \in \{1, \ldots, t\}, \\ v(g, S) + p(l - t)\,\bar{v}(g_{-m}, S) + (1 - p)V^{t+1,0}(S) & \text{for } \tau = 0, \end{cases}$$

with boundary condition $V^{l,0}(S) = v(g, S)$.

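The entries of the Shapley value $\Phi^{t,\tau} = (\Phi_1^{t,\tau}, \ldots, \Phi_n^{t,\tau})$ in the subgame $\Gamma_C^{t,\tau}$ can be determined by the formula

$$\Phi_i^{t,\tau} = \sum_{S \subseteq N,\, i \in S} \alpha_S \left[V^{t,\tau}(S) - V^{t,\tau}(S \setminus \{i\})\right]. \tag{33}$$

Note that $\Phi_m^{t,\tau} = 0$ for all $\tau \in \{1, \ldots, t\}$.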

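Let $\varphi(g) = (\varphi_1(g), \ldots, \varphi_n(g))$ and $\bar\varphi(g) = (\bar\varphi_1(g), \ldots, \bar\varphi_n(g))$ denote the "local" Shapley values calculated for the characteristic functions $v(g, S)$ and $\bar{v}(g, S)$ respectively:

$$\varphi_i(g) = \sum_{S \subseteq N,\, i \in S} \alpha_S \left[v(g, S) - v(g, S \setminus \{i\})\right],$$

$$\bar\varphi_i(g) = \sum_{S \subseteq N,\, i \in S} \alpha_S \left[\bar{v}(g, S) - \bar{v}(g, S \setminus \{i\})\right].$$

Note that $\bar\varphi_m(g) = 0$. Then we get a result similar to Proposition 12.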

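Proposition 13 (Gao et al., 2015). The Shapley value $\Phi^{t,\tau}$ in (33), $t \in \{1, \ldots, l\}$, can be found in an explicit form by means of the Shapley values $\varphi(g^*)$ and $\bar\varphi(g^*_{-m})$ in stage games:

$$\Phi_i^{t,0} = \left(1 + \frac{(1-p)\left[1-(1-p)^{l-t}\right]}{p}\right)\varphi_i(g^*) + \left(l - t - \frac{(1-p)\left[1-(1-p)^{l-t}\right]}{p}\right)\bar\varphi_i(g^*_{-m}), \quad i \neq m, \tag{34}$$

$$\Phi_i^{t,\tau} = (l - t + 1)\,\bar\varphi_i(g^*_{-m}), \quad \tau \in \{1, \ldots, t\},\ i \neq m, \tag{35}$$

$$\Phi_m^{t,0} = \left(1 + \frac{(1-p)\left[1-(1-p)^{l-t}\right]}{p}\right)\varphi_m(g^*), \tag{36}$$

$$\Phi_m^{t,\tau} = 0, \quad \tau \in \{1, \ldots, t\}. \tag{37}$$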
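The weights in the explicit form for a subgame without a prior shock can be checked against a backward recursion of the same shape as the one used for $V(S)$. The sketch below assumes that the recursion terminates in round $l$ with weight one on the pre-shock stage value and weight zero on the post-shock one (this boundary condition is an assumption of the example); the function names are hypothetical.

```python
def subgame_weights_recursive(p: float, l: int, t: int) -> tuple[float, float]:
    """Weights of the pre-shock and post-shock stage values in the subgame from round t."""
    a, b = 1.0, 0.0  # assumed boundary: in round l only the pre-shock stage value is earned
    for s in range(l - 1, t - 1, -1):
        # round s is played before the shock; the shock may hit before round s + 1
        a, b = 1 + (1 - p) * a, p * (l - s) + (1 - p) * b
    return a, b


def subgame_weights_explicit(p: float, l: int, t: int) -> tuple[float, float]:
    """Closed-form weights matching the coefficients of the explicit formula."""
    a = 1 + (1 - p) * (1 - (1 - p) ** (l - t)) / p
    return a, (l - t) - (1 - p) * (1 - (1 - p) ** (l - t)) / p


a_rec, b_rec = subgame_weights_recursive(0.25, 7, 3)
a_exp, b_exp = subgame_weights_explicit(0.25, 7, 3)
```

The two weights always add up to $l - t + 1$, the number of rounds remaining in the subgame.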

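The entries of the expected Shapley value $\hat\Phi^t = (\hat\Phi_1^t, \ldots, \hat\Phi_n^t)$ in the remaining rounds of the game starting from round $t \in \{1, \ldots, l\}$, provided that the shock has not appeared yet, have the following form (Gao et al., 2015):

$$\hat\Phi_i^t = (1-p)\Phi_i^{t,0} + p\,\Phi_i^{t,t} = \frac{(1-p)\left[1-(1-p)^{l-t+1}\right]}{p}\,\varphi_i(g^*) + \left(l - t + 1 - \frac{(1-p)\left[1-(1-p)^{l-t+1}\right]}{p}\right)\bar\varphi_i(g^*_{-m}), \quad i \neq m,$$

$$\hat\Phi_m^t = (1-p)\Phi_m^{t,0} = \frac{(1-p)\left[1-(1-p)^{l-t+1}\right]}{p}\,\varphi_m(g^*).$$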

Thus, we come to the following observation. In game $\Gamma_C$ players agree on choosing cooperative strategies $x_i^*$, $i \in N$, and allocating the value $V(N)$ according to the Shapley value $\Phi$ determined by formulas (29) and (30). After forming the network $g^*$ prescribed by the cooperative strategies, players may recalculate the Shapley value, which becomes the expected Shapley value $\hat\Phi^1$ in the game starting from round 1. Therefore, there may exist a player $i \in N$ such that $\Phi_i \neq \hat\Phi_i^1$. This fact means "inconsistency" of the Shapley value. The Shapley value would be subgame consistent if for any player the following statement were true: the entry $\Phi_i$ equals the sum of the cumulative individual stage payoffs to player $i$ up to round $t$ and his entry of the Shapley value in the subgame starting from this round, provided that all the players follow their cooperative strategies $x_i^*$, $i \in N$.

Mechanism of stage payments Since the Shapley value is subgame inconsistent (Gao et al., 2015; Petrosyan and Sedakov, 2016), we replace players' stage payoffs with new payments specified below. Denote the payments to player $i \in N$ in all rounds in $\Gamma_C$ by $\beta_i = \{\beta_i^0, \beta_i^{t,\tau}\}$, $t \in \{1, \ldots, l\}$, $\tau \in \{0, 1, \ldots, t\}$. Here $\beta_i^0$ is a payment to player $i$ at the network formation stage, and $\beta_i^{t,\tau}$ is a payment to player $i$ in round $t$, provided that the shock appears in round $\tau$ for $\tau > 0$ (or that the shock has not appeared by round $t$ for $\tau = 0$).

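Definition 7. An imputation distribution procedure (IDP) of the Shapley value $\Phi = (\Phi_1, \ldots, \Phi_n)$ is a profile $\beta = (\beta_1, \ldots, \beta_n)$ such that

$$\Phi_i = \beta_i^0 + \sum_{t=0}^{l} \Pr(\omega = t) \sum_{j=1}^{l} \beta_i^{j,\tau_j}, \quad i \in N. \tag{38}$$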

Definition 8. An IDP $\beta$ of the Shapley value $\Phi$ is subgame consistent if for all $i \in N$ and $t \in \{1, \ldots, l\}$ we have

$$\sum_{j=t}^{l} \beta_i^{j,\tau} = \Phi_i^{t,\tau}, \quad \tau \in \{1, \ldots, t\}, \tag{39}$$

$$\beta_i^{t,0} + \sum_{\tau=t+1}^{l} \Pr(\omega = \tau \mid t) \sum_{j=t+1}^{l} \beta_i^{j,\tau_j} + \Pr(\omega = 0 \mid t) \sum_{j=t+1}^{l} \beta_i^{j,0} = \Phi_i^{t,0},$$

i.e. if the stage payment to player $i$ plus his expected payments in the game from this stage until the end equals $\Phi_i^{t,\tau}$ for all $t$.

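Proposition 14. The subgame consistent IDP $\beta$ for the Shapley value $\Phi$ is of the form:

$$\beta_i^0 = \Phi_i - \hat\Phi_i^1, \quad i \in N,$$

$$\beta_i^{t,0} = \varphi_i(g^*), \quad t \in \{1, \ldots, l\},\ i \neq m,$$

$$\beta_i^{t,\tau} = \bar\varphi_i(g^*_{-m}), \quad t \in \{1, \ldots, l\},\ \tau \neq 0,\ i \neq m,$$

$$\beta_m^{t,0} = \varphi_m(g^*), \quad t \in \{1, \ldots, l\},$$

$$\beta_m^{t,\tau} = 0, \quad t \in \{1, \ldots, l\},\ \tau \neq 0,$$

where $\hat\Phi_i^1$ is player $i$'s entry of the expected Shapley value in the game starting from round 1, recalculated after the network $g^*$ has formed.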

The designed subgame consistent IDP prescribes the following mechanism of stage payments: in each round players are paid according to their local Shapley values, whereas at the network formation stage each player $i \in N$ is paid the difference between his entry $\Phi_i$ of the Shapley value and its expected value recalculated after the network formation stage, a difference which does not always equal zero.

In (Petrosyan and Sedakov, 2016), a more general model is considered in which there is a second network formation stage where all players except the one affected by the shock can revise the network again. Alternative studies, for example (Butenko and Petrosyan, 2014; Butenko and Petrosyan, 2015), differ from the model under consideration in that the shock influences not a particular player but particular links or players' action spaces. These models are applied to a three-stage problem; however, they can also be extended to a game with an arbitrary number of stages.

3. Strategic support of cooperation

This problem was studied in (Petrosyan and Sedakov, 2015). For players, cooperation is preferable to non-cooperative behavior, as cooperative behavior can be more beneficial for them. However, in dynamic games, a cooperative agreement creates two major problems. The first one is time inconsistency of any dynamic cooperative solution in general: even if all players agree on the solution at the beginning of the game, a player or a group of players, focusing on her or its cooperative payoffs, might want to revise the solution after some stages (time). To make players indifferent to the revision of the cooperative solution, an imputation distribution procedure (Petrosyan and Danilov, 1979) reallocating players' stage payoffs over time under the cooperative agreement is introduced. The second problem is that the players' cooperative strategies which result in the cooperative payoffs are not a Nash equilibrium in general. This means that there exists a player who will benefit if she stops following her cooperative strategy prescribed by the agreement. It is shown that, by implementing the time-consistent imputation distribution procedure, it becomes possible to find a Nash equilibrium guaranteeing the cooperative payoffs in some class of strategies. However, this can only be done under a specific condition on the parameters of the dynamic game. When the cooperative payoffs result from both a time-consistent imputation distribution procedure and a Nash equilibrium, we can say that the cooperative agreement and, therefore, the cooperation of players is strategically supported (Parilina, 2014; Petrosyan, 2008; Petrosyan and Zenkevich, 2009; Yeung and Petrosyan, 2012).

Here the theory of strategically supported cooperation is developed for dynamic games on networks in which a network structure is a central element (see, for example, (Petrosyan and Sedakov, 2015)). Again during the game, players form a network and choose their control variables, but they can benefit only from their neighbors in the network as in (Petrosyan, Sedakov and Bochkarev, 2013). The theory will also be applied to repeated games (Abreu et al., 1994; Aumann and Shapley, 1994; Myerson, 1997) which is a special class of dynamic games.

4. The model

The game considered in (Petrosyan and Sedakov, 2015) consists of a network formation stage, which is the same as in the previous models, and subsequent stages of a similar structure: at stage $t$ each player $i$ chooses $(d_i(g^t), u_i(g^t))$ and is then rewarded according to his payoff function $\delta^t K_i(u_i(g^t), u_{N_i(d(g^t))}(g^t))$, where $\delta \in (0, 1)$ is a common discount factor. Here we note that for any player $i \in N$, his control $u_i(g^t) \in U_i(g^t)$ depends on the network. After players receive their stage payoffs, the game proceeds to the next stage and the corresponding stage game on the network $g^{t+1}$ given by a single-valued rule $T$: $g^{t+1} = T(g^t, u(g^t))$, where $u(g^t) = (u_1(g^t), \ldots, u_n(g^t))$.

Definition 9. A strategy $\eta_i$ of player $i \in N$ is a rule that uniquely prescribes the behavior $g_i$ of this player at the network formation stage and the behavior $(d_i(g^t), u_i(g^t))$ at game stage $t \geq 1$ on network $g^t \in G(N)$.

Given a strategy profile $\eta = (\eta_1, \ldots, \eta_n)$, one can define the payoff to player $i \in N$ in the game as a function of the strategy profile $\eta$ as

$$K_i(\eta) = \sum_{t=1}^{\infty} \delta^t K_i\big(u_i(g^t), u_{N_i(d(g^t))}(g^t)\big),$$

provided that the discounted sum exists.

Suppose now that players jointly choose strategies $\eta_1, \ldots, \eta_n$ to maximize the sum of their payoffs in the game. The profile $\bar\eta = (\bar\eta_1, \ldots, \bar\eta_n)$ solving the maximization problem

$$\sum_{i \in N} K_i(\bar\eta) = \max_{\eta} \sum_{i \in N} K_i(\eta)$$

(if the maximum exists) is called the cooperative strategy profile, and each element of the profile is a cooperative strategy. Since the game is considered in the cooperative setting, the payoff to a player is prescribed by a cooperative solution. As the solution, we take the Shapley value $\Phi = (\Phi_1, \ldots, \Phi_n)$, which is calculated for the characteristic function

$$V(S) = \begin{cases} \sum\limits_{i \in N} K_i(\bar\eta), & S = N, \\ \max\limits_{\eta_i,\, i \in S}\ \min\limits_{\eta_j,\, j \in N \setminus S}\ \sum\limits_{i \in S} K_i(\eta), & S \subset N,\ S \neq \emptyset, \\ 0, & S = \emptyset, \end{cases}$$

where the approach from (Von Neumann and Morgenstern, 1944) is used. Thus, in the cooperative setting each player should follow her cooperative strategy, and the payoff to each player $i \in N$ in the game equals $\Phi_i$. However, due to time inconsistency of the Shapley value, the player may not get the value $\Phi_i$ as her payoff. It means that the equality

$$\Phi_i = \sum_{t=1}^{\tau-1} \delta^t K_i\big(\bar u_i(\bar g^t), \bar u_{N_i(\bar d(\bar g^t))}(\bar g^t)\big) + \delta^{\tau} \Phi_i(\bar g^{\tau}) \tag{40}$$

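does not hold for at least one $\tau \geq 1$ and player $i \in N$. Here $\bar g^{\tau}$ is the network which the process reaches at stage $\tau$ under cooperation, i.e., $\bar g^{\tau} = T(\bar g^{\tau-1}, \bar u(\bar g^{\tau-1}))$, and $\Phi(\bar g^{\tau})$ is the Shapley value in the infinite-horizon game starting from network $\bar g^{\tau}$ and calculated for the characteristic function $V(\bar g^{\tau}, S)$, given by

$$V(\bar g^{\tau}, S) = \begin{cases} \sum\limits_{i \in N} \sum\limits_{t=\tau}^{\infty} \delta^{t-\tau} K_i\big(\bar u_i(\bar g^t), \bar u_{N_i(\bar d(\bar g^t))}(\bar g^t)\big), & S = N, \\ \max\limits_{\substack{(d_i(g^t), u_i(g^t)),\\ i \in S,\ t \geq \tau}}\ \min\limits_{\substack{(d_j(g^t), u_j(g^t)),\\ j \in N \setminus S,\ t \geq \tau}}\ \sum\limits_{i \in S} \sum\limits_{t=\tau}^{\infty} \delta^{t-\tau} K_i\big(u_i(g^t), u_{N_i(d(g^t))}(g^t)\big) \\ \quad \text{s.t. } g^{t+1} = T(g^t, u(g^t)),\ t \geq \tau,\quad g^{\tau} = \bar g^{\tau}, & S \subset N,\ S \neq \emptyset, \\ 0, & S = \emptyset. \end{cases}$$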

Here the value $V(\bar g^{\tau}, S)$ is the maximal value which coalition $S$ guarantees for itself if its complement $N \setminus S$ acts against it in a zero-sum game, provided that network $\bar g^{\tau}$ is given.

To fulfill condition (40) for all stages, we replace players' stage payoffs with payments (and we also add payments at the network formation stage) according to an imputation distribution procedure (IDP), which reallocates the Shapley value over time and players' stage payoffs at each game stage. In this model the IDP $\beta = \{\beta_i^t\}_{i \in N,\, t \geq 0}$ of the Shapley value $\Phi$ is determined in such a way that $\sum_{t=0}^{\infty} \delta^t \beta_i^t = \Phi_i$. The IDP $\beta$ is time consistent if for all $\tau \geq 1$ and all players we have

$$\Phi_i = \sum_{t=0}^{\tau-1} \delta^t \beta_i^t + \delta^{\tau} \Phi_i(\bar g^{\tau}). \tag{41}$$

Dynamic games on networks of general structure In this section, we formulate the results about strategic support of cooperation in case of a dynamic game of a general form.

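Proposition 15. The time-consistent IDP $\beta$ of the Shapley value $\Phi$ for each $i \in N$ is given by the following expressions:

$$\beta_i^0 = \Phi_i - \delta\,\Phi_i(\bar g^1),$$

$$\beta_i^t = \Phi_i(\bar g^t) - \delta\,\Phi_i(\bar g^{t+1}), \quad t \geq 1,$$

where $\bar g^{t+1} = T(\bar g^t, \bar u(\bar g^t))$.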
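The payments of Proposition 15 telescope, which is what makes the time-consistency condition hold at every stage. The following sketch checks this numerically for one player over a finite prefix of the cooperative trajectory; the function name and the numbers standing for the Shapley value entries along the trajectory are made up for illustration.

```python
def time_consistent_idp(Phi: float, Phi_sub: list[float], delta: float) -> list[float]:
    """Payments beta^0, ..., beta^{T-1} for one player.

    Phi is the player's Shapley value entry in the whole game and
    Phi_sub[t - 1] is his entry in the subgame starting at stage t, t = 1..T.
    """
    beta = [Phi - delta * Phi_sub[0]]  # payment at the network formation stage
    for t in range(1, len(Phi_sub)):
        beta.append(Phi_sub[t - 1] - delta * Phi_sub[t])
    return beta


delta = 0.9
Phi, Phi_sub = 10.0, [9.0, 7.5, 8.0]  # hypothetical values along the cooperative trajectory
beta = time_consistent_idp(Phi, Phi_sub, delta)
# discounted payments up to stage 2 plus the discounted continuation value at stage 3
lhs = sum(delta ** t * b for t, b in enumerate(beta)) + delta ** 3 * Phi_sub[2]
```

The discounted payments collected before a stage plus the discounted continuation value at that stage reproduce the original entry of the Shapley value, whatever the continuation values are.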

In general, the cooperative strategy profile $\bar\eta$ is not a Nash equilibrium; therefore, even when a time-consistent IDP of the Shapley value is implemented, a player may break the cooperative agreement and switch from her cooperative strategy to some other strategy. Below we propose a condition under which the cooperative outcome is supported by a Nash equilibrium. This result is obtained in a class of punishment strategies, which is a sub-class of strategies in the sense of Definition 1. A punishment strategy $\zeta_i$ of player $i \in N$ is determined in such a way that if no one deviates from her cooperative strategy, all players continue to follow these strategies, but if one player $i \in N$ deviates from her cooperative strategy $\bar\eta_i$ at some game stage, the remaining players from $N \setminus \{i\}$ start punishing her from the next game stage onwards and never switch their strategies back to cooperative ones (in other words, starting from the next game stage players $i$ and $N \setminus \{i\}$ are involved in a zero-sum game in which $i$ tries to maximize her future payoff, whereas the coalition $N \setminus \{i\}$, acting as a single player, minimizes it).

Consider the following system of inequalities, which is implicit with respect to $\delta$:

$$\Phi_i \geq \delta V(g, \{i\}), \qquad \Phi_i(g) \geq \tilde K_i(g) + \delta V(T(g, \bar u(g)), \{i\}), \quad \text{for all } i \in N,\ g \in L,$$

where $\tilde K_i(g)$ is the stage payoff to player $i$ if, deviating from her cooperative strategy, she plays a best response to the opponents' cooperative strategies in network $g$, and $L \subseteq G(N)$ is the set of networks generated by the cooperative strategy profile $\bar\eta$. Here $\delta$ implicitly appears in $\Phi_i$, $\Phi_i(g)$, $V(g, \{i\})$, and $V(T(g, \bar u(g)), \{i\})$. The system above is reduced to the following:

$$\delta \leq \min_{i \in N} \min_{g \in L} \left\{ \frac{\Phi_i}{V(g, \{i\})},\ \frac{\Phi_i(g) - \tilde K_i(g)}{V(T(g, \bar u(g)), \{i\})} \right\}. \tag{42}$$

Let there exist $\delta$ satisfying (42).

Proposition 16 (Petrosyan and Sedakov, 2015). For any $\delta$ that solves (42), the strategy profile $(\zeta_1, \ldots, \zeta_n)$ with players' payoffs $\Phi_1, \ldots, \Phi_n$ guaranteed by the time-consistent IDP $\beta$ is a Nash equilibrium.

It is worth noting that the system (42) may not have a solution in $(0, 1)$. In this case, we cannot obtain the cooperative outcome $\Phi_1, \ldots, \Phi_n$ as a result of a Nash equilibrium.

Repeated Network Games Repeated games are a class of dynamic games in which a given normal-form game is played over either a finite or an infinite number of periods. In this part of the review, we suppose that there is one stage at which players create a network structure, and after this stage a normal-form game on the network is repeated over an infinite number of periods. In other words, $U_i(g) = U_i(g') = U_i$ for all $i \in N$ and $g \neq g'$, and for any $g \in G(N)$, we have $T(g, u(g)) = g$.

In (Petrosyan, Sedakov and Bochkarev, 2013; Petrosyan and Sedakov, 2015) the structure of players' cooperative strategies was proposed for two-stage games (one network formation stage and one game stage). Since the considered game is repeated, the structure of players' cooperative strategies in this game is the same. Specifically, at the network formation stage players should choose $\bar g_i$, $i \in N$, and form network $\bar g$, and from this stage on players do not change the network while choosing controls $\bar u_i$, i.e., $\bar d_i(\bar g) = \bar g_i$.

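Assuming that players behave cooperatively, again, the Shapley value is taken as a solution of the game. The entries of the Shapley value $\Phi$ are $\Phi_i = \frac{\delta}{1-\delta}\varphi_i$, $i \in N$, where $\varphi_i$ is the entry of the Shapley value in any of the stage games determined by the characteristic function

$$v(S) = \begin{cases} \sum\limits_{i \in N} K_i(\bar u_i, \bar u_{N_i(\bar g)}), & S = N, \\ \max\limits_{g_i,\, i \in S}\ \max\limits_{u_i \in U_i,\, i \in S}\ \sum\limits_{i \in S} K_i(u_i, u_{N_i(g) \cap S}), & S \subset N,\ S \neq \emptyset, \\ 0, & S = \emptyset. \end{cases}$$

Therefore, $V(S) = \frac{\delta}{1-\delta} v(S)$, $S \subseteq N$. The Shapley value $\Phi(g)$ in the infinite-horizon game starting from network $g$ has a similar form: $\Phi_i(g) = \frac{1}{1-\delta}\varphi_i(g)$, $i \in N$, where $\varphi_i(g)$ is the entry of the Shapley value in any of the stage games determined by the characteristic function

$$v(g, S) = \begin{cases} \sum\limits_{i \in N} K_i(\bar u_i, \bar u_{N_i(g)}), & S = N, \\ \max\limits_{u_i \in U_i,\, i \in S}\ \sum\limits_{i \in S} K_i(u_i, u_{N_i(g) \cap S}), & S \subset N,\ S \neq \emptyset, \\ 0, & S = \emptyset, \end{cases}$$

provided that the network $g$ is given. Therefore, $V(g, S) = \frac{1}{1-\delta} v(g, S)$, $S \subseteq N$.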

Due to time inconsistency of the solution, the allocation is realized with the use of an imputation distribution procedure. In the case of repeated games, the time-consistent IDP $\beta$ for the Shapley value $\Phi$ is of the form:

$$\beta_i^0 = \frac{\delta}{1-\delta}\big(\varphi_i - \varphi_i(\bar g)\big),$$

$$\beta_i^t = \varphi_i(\bar g), \quad i \in N,\ t = 1, 2, \ldots.$$

Under the cooperative agreement, players create network $\bar g$ by the profile $(\bar g_1, \ldots, \bar g_n)$ at the network formation stage and do not change it, choosing $(\bar u_1, \ldots, \bar u_n)$ at each subsequent stage. So, if player $i \in N$ deviates from cooperative behavior at the network formation stage, she gets the value $V(\{i\})$ as her payoff in the game.

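However, if she deviates at a game stage, she will play her best response to the controls of the other players and get the value $\tilde K_i = \max_{u_i \in U_i} K_i(u_i, \bar u_{N_i(\bar g)})$ at this stage, and after the deviation her future payoff will be $\delta V(\bar g, \{i\})$. Therefore, player $i$ will never switch from her cooperative strategy $\bar\eta_i$ to any other strategy if $\Phi_i \geq V(\{i\})$ and $\Phi_i(\bar g) \geq \tilde K_i + \delta V(\bar g, \{i\})$. These two inequalities can be simplified to $\varphi_i \geq v(\{i\})$, which always holds, and

$$\delta \geq 1 - \frac{\varphi_i(\bar g) - v(\bar g, \{i\})}{\tilde K_i - v(\bar g, \{i\})} \quad \text{if } \tilde K_i > v(\bar g, \{i\}).$$

If $\tilde K_i = v(\bar g, \{i\})$, we have $\Phi_i(\bar g) \geq v(\bar g, \{i\}) + \frac{\delta}{1-\delta} v(\bar g, \{i\}) = \frac{1}{1-\delta} v(\bar g, \{i\}) = V(\bar g, \{i\})$, which always holds.

Then we have: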


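Proposition 17 (Petrosyan and Sedakov, 2015). For any $\delta \geq \delta^*$, where

$$\delta^* = \max_{i \in N:\ \tilde K_i > v(\bar g, \{i\})} \left\{ 1 - \frac{\varphi_i(\bar g) - v(\bar g, \{i\})}{\tilde K_i - v(\bar g, \{i\})} \right\}, \tag{43}$$

the strategy profile $(\zeta_1, \ldots, \zeta_n)$ with players' payoffs $\Phi_1, \ldots, \Phi_n$ guaranteed by the time-consistent IDP $\beta$ is a Nash equilibrium.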
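The threshold in (43) is straightforward to evaluate once the stage quantities are known. The sketch below is illustrative only: the dictionaries hold made-up numbers for the stage Shapley value entries, the values of singleton coalitions, and the best-response deviation payoffs, and returning 0.0 when no player has a profitable one-stage deviation is an assumption of this example.

```python
def discount_threshold(phi: dict, v_single: dict, k_dev: dict) -> float:
    """Smallest discount factor above which no player gains from a one-stage deviation.

    phi[i]      - stage Shapley value entry of player i on the cooperative network
    v_single[i] - stage value of the singleton coalition {i} on that network
    k_dev[i]    - stage payoff of player i when best-responding to cooperative controls
    """
    ratios = [
        1 - (phi[i] - v_single[i]) / (k_dev[i] - v_single[i])
        for i in phi
        if k_dev[i] > v_single[i]  # only players with a profitable deviation matter
    ]
    return max(ratios, default=0.0)  # assumption: no binding constraint gives 0.0


phi = {1: 4.0, 2: 3.0}
v_single = {1: 1.0, 2: 1.0}
k_dev = {1: 6.0, 2: 5.0}
delta_star = discount_threshold(phi, v_single, k_dev)
```

In this example the constraint of player 2 is binding, so cooperation is supported for discount factors of at least one half.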

References

Abreu, D., Dutta, P. and Smith, L. (1994). The Folk theorem for repeated games: a NEU condition. Econometrica, 62, 939-948.

Aumann, R. and Shapley, L. (1994). Long-Term Competition—A Game-Theoretic Analysis. In: Megiddo N. (ed) Essays in Game Theory. In Honor of Michael Maschler, Springer-Verlag, pp. 1-15.

Bala, V. and Goyal, S. (2000). A non-cooperative model of network formation. Econometrica, 68(5), 1181-1231.

Bramoulle, Y. and Kranton, R. (2007). Public goods in networks. Journal of Economic Theory, 135(1), 478-494.

Bulgakova, M. and Petrosyan, L. (2015). The Shapley value for the network game with pairwise interactions. International Conference on "Stability and Control Processes" in Memory of V.I. Zubov, SCP, pp. 229-232.

Bulgakova, M. and Petrosyan, L. (2016). About strongly time-consistency of core in the network game with pairwise interactions. Proceedings of 2016 International Conference "Stability and Oscillations of Nonlinear Control Systems" (Pyatnitskiy's Conference), STAB 2016, pp. 229-232.

Butenko, M. and Petrosyan, L. (2014). A combined solution concept in a multistage network game. Proceedings of the XLV International Conference on Control Processes and Stability (CPS14), pp. 452-457.

Butenko, M. and Petrosyan, L. (2015). A two-step solution concept in a network game with shock of a special type. Proceedings of the XLVI International Conference on Control Processes and Stability (CPS15), pp. 573-578.

Corbae, D. and Duffy, J. (2008). Experiments with network formation. Games and Economic Behavior, 64, 81-120.

Driessen, T. S.H. and Funaki, Y. (1991). Coincidence of and collinearity between game theoretic solutions. OR Spektrum, 13(1), 15-30.

Dutta, B., Van den Nouweland, A. and Tijs, S. (1998). Link formation in cooperative situations. International Journal of Game Theory, 27, 245-256.

Feng, X., Zhang, W., Zhang, Y. and Xiong, X. (2014). Information identification in different networks with heterogeneous information sources. Journal of Systems Science and Complexity, 27(1), 92-116.

Feri, F. (2007). Stochastic stability in networks with decay. Journal of Economic Theory, 135, 442-457.

Feri, F. and Melendez-Jimenez, M. (2013). Coordination in evolving networks with endogenous decay. Journal of Evolutionary Economics, 23, 955-1000.

Fosco, C. and Mengel, F. (2011). Cooperation through imitation and exclusion in networks. Journal of Economic Dynamics & Control, 35, 641-658.

Galeotti, A. and Goyal, S. (2010). The Law of the Few. American Economic Review, 100(4), 1468-1492.

Galeotti, A., Goyal, S. and Kamphorst, J. (2006). Network formation with heterogeneous players. Games and Economic Behavior, 54, 353-372.

Gao, H., Dai, Y., Li, W., Song, L. and Lv, T. (2010). One-Way Flow Dynamic Network Formation Games with Coalition-Homogeneous Costs. Contributions to Game Theory and Management, 3, 104-117.

Gao H., Liu Z. and Dai Y. (2011). The Dynamic Procedure of Information Flow Network. Contributions to Game Theory and Management, 4, 172-187.

Gao, H., Petrosyan, L., Qiao, H. and Sedakov, A. (2017). Cooperation in two-stage games on undirected networks. Journal of Systems Science and Complexity, 30(3), 680-693.

Gao, H., Petrosyan, L. and Sedakov, A. (2015). Dynamic Shapley value for repeated network games with shock. Control and Decision Conference (CCDC), 2015 27th Chinese, pp. 6449-6455.

Goyal, S. and Vega-Redondo, F. (2005). Network formation and social coordination. Games and Economic Behavior, 50, 178-207.

Haller, H. (2012). Network extension. Mathematical Social Sciences, 64, 166-172.

Igarashi, A. and Yamamoto, Y. (2013). Computational Complexity of a Solution for Directed Graph Cooperative Games. Journal of the Operations Research Society of China, 1(3), 405-413.

Jackson, M. (2008). Social and economic networks. Princeton: Princeton University Press.

Jackson, M. and Watts, A. (2002). On the formation of interaction networks in social coordination games. Games and Economic Behavior, 41(2), 265-291.

Jackson, M. and Wolinsky, A. (1996). A strategic model of social and economic networks. Journal of Economic Theory, 71, 44-74.

Kuhn, H.W. (1953). Extensive games and the problem of information. Contributions to the Theory of Games II (ed. by Kuhn H.W. and Tucker A.W.), Princeton, 193-216.

Lu, X., Li, J. and Yang, F. (2010). Analyses of location-price game on networks with stochastic customer behavior and its heuristic algorithm. Journal of Systems Science and Complexity, 23(4), 701-714.

Myerson, R. (1997). Game Theory: Analysis of conflict. Harvard University Press.

Parilina, E. (2014). Strategic stability of one-point optimality principles in cooperative stochastic games. Matematicheskaya Teoriya Igr i Ee Prilozheniya, 6(1), 56-72.

Petrosjan, L. A. (2006). Cooperative stochastic games. In: Haurie A., Muto S., Petrosjan L. A., Raghavan T. E. S. (eds) Advances in Dynamic Games: Applications to Economics, Management Science, Engineering, and Environmental Management. Annals of the International Society of Dynamic Games, Basel: Birkhäuser, 52-59.

Petrosyan, L. A. (1977). Stability of solutions in differential games with many participants. Vestnik Leningradskogo Universiteta. Ser 1. Matematika Mekhanika Astronomiya, 19, 46-52.

Petrosyan, L. A. (2005). Cooperative differential games. Annals of the International Society of Dynamic Games. Applications to Economics, Finance, Optimization, and Stochastic Control (ed. by Nowak A.S. and Szajowski K.), Basel, 183-200.

Petrosyan, L. (2008). Strategically supported cooperation. International Game Theory Review, 10(4), 471-480.

Petrosyan, L. A. and Danilov, N.N. (1979). Stability of solutions in non-zero sum differential games with transferable payoffs. Vestnik Leningradskogo Universiteta. Ser 1. Matematika Mekhanika Astronomiya, 1, 52-59.

Petrosyan, L. A. and Sedakov, A.A. (2009). Multistage network games with perfect information. Matematicheskaya teoriya igr i ee prilozheniya, 1(2), 66-81.

Petrosyan, L. and Sedakov, A. (2014). One-way flow two-stage network games. Vestnik of Saint Petersburg State University. Ser 10: Applied Mathematics, Informatics, Control Processes, 4, 72-81.

Petrosyan, L. and Sedakov, A. (2015). Strategic support of cooperation in dynamic games on networks. Proceedings of the International Conference on "Stability and Control Processes" in Memory of V.I. Zubov, SCP, pp. 256-260.

Petrosyan, L. and Sedakov, A. (2016). The Subgame-Consistent Shapley Value for Dynamic Network Games with Shock. Dynamic Games and Applications, 6(4), 520-537.

Petrosyan, L. A., Sedakov, A.A. and Bochkarev, A. O. (2013). Two-stage network games. Matematicheskaya teoriya igr i ee prilozheniya, 5(4), 84-104.

Petrosyan, L. and Zenkevich, N. (2009). Principles of dynamic stability. Matematicheskaya Teoriya Igr i Ee Prilozheniya, 1(1), 106-123.

Shapley, L. S. (1953). A value for N-person games. Contributions to the Theory of Games II (ed. by Kuhn H.W. and Tucker A.W.), Princeton, 307-317.

Von Neumann, J. and Morgenstern, O. (1944). Theory of Games and Economic Behavior. Princeton: Princeton University Press.

Xie, F., Cui, W. and Lin, J. (2013). Prisoners dilemma game on adaptive networks under limited foresight. Complexity, 18, 38-47.

Watts, A. (2001). A Dynamic Model of Network Formation. Games and Economic Behavior, 34, 331-341.

Yeung, D.W.K. and Petrosyan, L. A. (2006). Cooperative Stochastic Differential Games. Springer-Verlag, New York.

Yeung, D. W. K. and Petrosyan, L. A. (2012). Subgame Consistent Economic Optimization. Birkhäuser.
