Leon Petrosyan and Artem Sedakov
St. Petersburg University,
Faculty of Applied Mathematics and Control Processes,
35, Universitetskiy pr., Petrodvorets, St. Petersburg, Russia, 198504 E-mail: spbuoasis7@peterlink.ru, a.sedakov@yahoo.com
Abstract. We consider a multistage network game with perfect information. At each stage of the game a network connecting the players is given.
Each network edge connecting two players carries a utility (the utility the first player derives from the connection with the second player), and players have the right to change the network structure at each stage.
We propose a way of finding the players' optimal behavior in this type of multistage game.
Keywords: network, network games, characteristic function, Shapley value, Nash equilibrium.
1. Introduction
Originally, the classical theory of deterministic multistage games was a theory of noncooperative games in which each participant aimed to maximize his own payoff. A subgame perfect equilibrium was proposed as the players' optimal behavior. Then a cooperative mode of behavior was considered. In that setting it was supposed that all participants jointly choose an n-tuple of strategies which maximizes the total payoff of all players. The main problem in the cooperative setting was the allocation of the total payoff among the players. The notion of an "imputation" was introduced and, among others, the core and the Shapley value were considered as optimality principles. Later, (Petrosjan and Mamkina, 2005) and then (Petrosjan et al., 2006) proposed an approach that combines the methodology of the two previous settings. It was supposed that each participant can choose a coalition to which he wants to belong, and then acts in the interest of the chosen coalition.
In this paper we consider another mode of player behavior by adding a network component to the existing theory of multistage games. This is reasonable because in a system of interactions players are connected with each other. We suppose that during the game players can revise the current system of interactions by breaking existing connections or proposing new ones, since each connection is assumed to carry some utility (positive or negative) for the player. We propose a way of finding the players' optimal behavior.
The paper is organized as follows. In Sec. 2 we define the multistage network game. In particular, the construction of the graph tree of the multistage game and the definition of the network structure in each of its vertices are given in Sec. 2.1. In Sec. 2.2 we define stage payments to each participant in each vertex of the graph tree. The definition of the multistage network game with perfect information and other necessary definitions are stated in Sec. 2.3. Then, in Sec. 3, we propose an algorithm for constructing the players' optimal behavior in the multistage network game. Finally, a numerical example illustrating the proposed algorithm is considered in Sec. 4.
2. The construction of a multistage network game with perfect information
Let $N = \{1, \ldots, n\}$ be a set of players. Construct a game tree (a finite graph tree (Kuhn, 1953)) $K = (X, F)$ with an initial vertex $x_0$. Here $X$ is the set of vertices of the graph $K$, and $F: X \to 2^X$ is a point-to-set mapping which assigns to each vertex $x \in X$ the set $F_x$ of vertices that directly follow $x$. A vertex $x$ of the graph tree $K$ for which $F_x = \emptyset$ is called a terminal vertex.
We represent the set $X$ of vertices of the graph tree $K$ in the usual way as a union of $n+1$ disjoint sets: $X = P_1 \cup \ldots \cup P_n \cup P_{n+1}$, where $P_i$ is the set of personal moves of player $i$, $i \in N$, and $P_{n+1}$ is the set of terminal vertices of the graph tree $K$.
Hereinafter, $i(x)$ denotes the player who makes a decision in a vertex $x$ of the graph tree $K$.
We now describe the stepwise evolution of the game.
2.1. The construction of a graph tree of the network game
Initial step. In the initial vertex $x_0$ of the graph tree $K$ a network $G_{x_0} = (N, \theta(x_0))$ is defined. Denote by $g_{x_0}$ the set of its edges. Here $N$ is the set of network nodes, which coincides with the set of players (each node is identified with a player), and $\theta(x_0): g_{x_0} \to \mathbb{R}$ is a real-valued function which can be interpreted as a utility function.
Step 1. Player $i(x_0)$ has exactly $n$ alternatives in the vertex $x_0$:
– not to take any action, and the game process evolves to a vertex $y_1^1 \in F_{x_0}$;
– break the edge with a player $j \in N$, $j \ne i(x_0)$, if the edge $(i(x_0), j) \in g_{x_0}$, and the game process evolves to a vertex $y_1^j \in F_{x_0}$;
– propose to a player $k$, $k \ne i(x_0)$, a new edge $(i(x_0), k)$, if $(i(x_0), k) \notin g_{x_0}$, and the game process evolves to a vertex $y_1^k \in F_{x_0}$.
Each of the $n$ vertices $y_1^1$, $\{y_1^j\}_j$, $\{y_1^k\}_k$ belongs to $F_{x_0}$. Depending on the choice of player $i(x_0)$, the initial network is changed in the vertices of the set $F_{x_0}$. The set of edges of the new network has the following form:
$$
\begin{aligned}
g_{y_1^1} &= g_{x_0}, && \text{if player } i(x_0) \text{ does not take any action;}\\
g_{y_1^j} &= g_{x_0} \setminus (i(x_0), j), && \text{if player } i(x_0) \text{ breaks the connection with player } j;\\
g_{y_1^k} &= g_{x_0} \cup (i(x_0), k), && \text{if player } i(x_0) \text{ proposes a new connection to player } k.
\end{aligned}
$$
Then for a vertex $x_1 \in F_{x_0} = \{y_1^1, \{y_1^j\}_j, \{y_1^k\}_k\}$ the set of edges $g_{x_1}$ is uniquely defined. If $x_1 \notin P_{n+1}$, we consider the second step for each vertex $x_1 \in F_{x_0}$. This step is fully similar to step 1, so we skip its description and consider step $t$.
Step $t$ ($1 < t \le l$). Suppose we have constructed the graph tree consisting of the vertices which can be reached from the initial vertex $x_0$ in no more than $t-1$ stages. Let $\{x_0, x_1, \ldots, x_{t-1}\}$ be a path in the constructed graph tree starting from the vertex $x_0$ and leading to the vertex $x_{t-1}$ in $t-1$ stages. In all vertices $x_0, x_1, \ldots, x_{t-1}$ the corresponding sets of edges $g_{x_0}, g_{x_1}, \ldots, g_{x_{t-1}}$ are uniquely defined. Define the set $g_{x_t}$.
Player $i(x_{t-1})$ has exactly $n$ alternatives in the vertex $x_{t-1}$:
– not to take any action, and the game process evolves to a vertex $y_t^1 \in F_{x_{t-1}}$;
– break the edge with a player $j \in N$, $j \ne i(x_{t-1})$, if the edge $(i(x_{t-1}), j) \in g_{x_{t-1}}$, and the game process evolves to a vertex $y_t^j \in F_{x_{t-1}}$;
– propose to a player $k$, $k \ne i(x_{t-1})$, a new edge $(i(x_{t-1}), k)$, if $(i(x_{t-1}), k) \notin g_{x_{t-1}}$, and the game process evolves to a vertex $y_t^k \in F_{x_{t-1}}$.
Each of the $n$ vertices $y_t^1$, $\{y_t^j\}_j$, $\{y_t^k\}_k$ belongs to $F_{x_{t-1}}$. Depending on the choice of player $i(x_{t-1})$, the current network is changed in the vertices of the set $F_{x_{t-1}}$. The set of edges of the new network has the following form:
$$
\begin{aligned}
g_{y_t^1} &= g_{x_{t-1}}, && \text{if player } i(x_{t-1}) \text{ does not take any action;}\\
g_{y_t^j} &= g_{x_{t-1}} \setminus (i(x_{t-1}), j), && \text{if player } i(x_{t-1}) \text{ breaks the connection with player } j;\\
g_{y_t^k} &= g_{x_{t-1}} \cup (i(x_{t-1}), k), && \text{if player } i(x_{t-1}) \text{ proposes a new connection to player } k.
\end{aligned}
$$
For a vertex $x_t \in F_{x_{t-1}} = \{y_t^1, \{y_t^j\}_j, \{y_t^k\}_k\}$ the set of edges $g_{x_t}$ is uniquely defined. If $x_t \notin P_{n+1}$, we consider the next step for each vertex $x_t \in F_{x_{t-1}}$, and the construction of the graph tree is fully similar to the previous stages. When $t = l$ the graph tree $K$ is constructed.
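The branching rule used in every step can be sketched in a few lines of Python. This is a minimal sketch rather than the authors' implementation: the function name `successor_edge_sets`, the action labels, and the representation of directed edges as ordered pairs in a `frozenset` are illustrative assumptions. For the player who moves it returns exactly $n$ successor edge sets: one for taking no action and one per other player (the edge is broken if it is present and proposed otherwise). The sample network is the one used in the numerical example of Sec. 4.

```python
from typing import Dict, FrozenSet, Optional, Tuple

Edge = Tuple[int, int]                 # directed edge (from, to); assumed encoding
Action = Tuple[str, Optional[int]]     # hypothetical label: (kind, other player)


def successor_edge_sets(
    edges: FrozenSet[Edge], mover: int, n: int
) -> Dict[Action, FrozenSet[Edge]]:
    """Edge sets of the n vertices that directly follow the current vertex."""
    # Alternative 1: the mover takes no action, the network is unchanged.
    successors: Dict[Action, FrozenSet[Edge]] = {("none", None): edges}
    for other in range(1, n + 1):
        if other == mover:
            continue
        edge = (mover, other)
        if edge in edges:
            # The edge exists, so the only available change is to break it.
            successors[("break", other)] = edges - {edge}
        else:
            # The edge is absent, so the mover may propose it.
            successors[("propose", other)] = edges | {edge}
    return successors


g_x0 = frozenset({(1, 2), (2, 3), (3, 2)})
for action, g in successor_edge_sets(g_x0, mover=1, n=3).items():
    print(action, sorted(g))
```

Keying the result by action makes it easy to attach a label to each arc of the graph tree when the construction is repeated recursively.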
2.2. The definition of an individual payment to the player
Definition 1. Let $S \subseteq N$. A real-valued function $v: X \times 2^N \to \mathbb{R}$, defined on the Cartesian product of the set $X$ and the set of all subsets of the set $N$, and specified as
$$v(y, S) = \sum_{(i,j) \in g_y:\ i, j \in S} \theta_{ij}(y), \tag{1}$$
where $y \in X$, is called the characteristic function. Here $\theta_{ij}(y)$ is the value of the utility function $\theta(y)$ on the edge $(i, j)$ in the network game $G_y = (N, \theta(y))$.
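Formula (1) translates directly into code. The sketch below is illustrative only: the function name `characteristic_value` and the encoding of a network as a dictionary from edges to utilities are assumptions, and the sample data are those of the vertex $x_0$ in the numerical example of Sec. 4.

```python
from typing import Dict, Iterable, Tuple

Edge = Tuple[int, int]


def characteristic_value(theta: Dict[Edge, float], coalition: Iterable[int]) -> float:
    """v(y, S): total utility on the edges of g_y whose endpoints both lie in S."""
    members = set(coalition)
    return sum(u for (i, j), u in theta.items() if i in members and j in members)


# Keys are the edges of g_y, values are the utilities theta_ij(y) (assumed encoding).
theta_x0 = {(1, 2): 2, (2, 3): 1, (3, 2): 1}
print(characteristic_value(theta_x0, {1, 2, 3}))
print(characteristic_value(theta_x0, {1, 2}))
```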
Given the set of players $N$ and the function $v(y, \cdot)$ defined by (1), one can construct a game in characteristic function form. In this game a player derives "worth" only from connections with other players. We now define a payment to a player in the network. For this purpose we select an optimality principle from cooperative game theory (here the Shapley value (Shapley, 1953), because of its uniqueness) and calculate an imputation $\gamma(y) = (\gamma_1(y), \ldots, \gamma_n(y))$ based on this principle. The components of the imputation are calculated as follows:
$$\gamma_k(y) = \sum_{\{S:\ S \subseteq N,\ k \in S\}} \frac{(n-s)!\,(s-1)!}{n!} \left[ v(y, S) - v(y, S \setminus k) \right]. \tag{2}$$
Here $s$ is the cardinality of the set $S$, and $v(y, S)$ is the characteristic function defined by (1).
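Transform the expression in square brackets on the right-hand side of (2). Using (1), for each $y \in X$ and $k \in N$ we get:
$$
\begin{aligned}
v(y, S) - v(y, S \setminus k) &= \sum_{(i,j) \in g_y:\ i,j \in S} \theta_{ij}(y) - \sum_{(i,j) \in g_y:\ i,j \in S \setminus k} \theta_{ij}(y) \\
&= \sum_{(i,k) \in g_y:\ i \in S \setminus k} \theta_{ik}(y) + \sum_{(k,j) \in g_y:\ j \in S \setminus k} \theta_{kj}(y).
\end{aligned} \tag{3}
$$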
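Taking (3) into account, the components of the Shapley value have the form:
$$\gamma_k(y) = \sum_{\{S:\ S \subseteq N,\ k \in S\}} \frac{(n-s)!\,(s-1)!}{n!} \left[ \sum_{(i,k) \in g_y:\ i \in S \setminus k} \theta_{ik}(y) + \sum_{(k,j) \in g_y:\ j \in S \setminus k} \theta_{kj}(y) \right], \tag{4}$$
where $y \in X$, $k \in N$.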
The value $\sum_{(i,k) \in g_y:\ i \in S \setminus k} \theta_{ik}(y) + \sum_{(k,j) \in g_y:\ j \in S \setminus k} \theta_{kj}(y)$ is the contribution of player $k$ when he joins a coalition $S \setminus k$ and forms the coalition $S$. Here the first summand $\sum_{(i,k) \in g_y:\ i \in S \setminus k} \theta_{ik}(y)$ is the additional utility of the coalition $S \setminus k$ which player $k$ brings in. The second summand $\sum_{(k,j) \in g_y:\ j \in S \setminus k} \theta_{kj}(y)$ is the additional utility of player $k$ which he obtains after joining the coalition $S \setminus k$.
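A direct, brute-force evaluation of formula (2) can serve as a sanity check. The sketch below is an illustration under stated assumptions: the names `v`, `shapley` and `half_incident` and the dictionary encoding of the network are hypothetical, and the sample data are those of the vertex $x_0$ in Sec. 4. It also compares the result with a shortcut that follows from (4) when the network has no self-loops: for an edge joining $k$ and another player $i$, the weights $(n-s)!(s-1)!/n!$ of all coalitions containing both $i$ and $k$ sum to $1/2$, so each player receives half of the utility carried by the edges incident to him.

```python
from itertools import combinations
from math import factorial
from typing import Dict, Set, Tuple

Edge = Tuple[int, int]


def v(theta: Dict[Edge, float], coalition: Set[int]) -> float:
    """Characteristic function (1) for a network given as {edge: utility}."""
    return sum(u for (i, j), u in theta.items() if i in coalition and j in coalition)


def shapley(theta: Dict[Edge, float], n: int) -> Dict[int, float]:
    """Shapley value by formula (2): enumerate every coalition S containing k."""
    players = range(1, n + 1)
    gamma = {}
    for k in players:
        others = [p for p in players if p != k]
        total = 0.0
        for size in range(len(others) + 1):
            for rest in combinations(others, size):
                S = set(rest) | {k}
                s = len(S)
                weight = factorial(n - s) * factorial(s - 1) / factorial(n)
                total += weight * (v(theta, S) - v(theta, S - {k}))
        gamma[k] = total
    return gamma


def half_incident(theta: Dict[Edge, float], n: int) -> Dict[int, float]:
    """Shortcut (assumes no self-loops): half the utility on edges incident to k."""
    return {k: 0.5 * sum(u for e, u in theta.items() if k in e) for k in range(1, n + 1)}


theta_x0 = {(1, 2): 2, (2, 3): 1, (3, 2): 1}
print(shapley(theta_x0, 3))
print(half_incident(theta_x0, 3))
```

The enumeration costs on the order of $2^n$ coalition evaluations per player, so the shortcut is the practical choice for larger networks, while the enumeration remains useful as a check.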
Suppose a path $\{x_0, \ldots, x_l\}$ is realized in the game. The total payoff of a player $i \in N$ along this path is defined as follows:
$$\sum_{x \in \{x_0, \ldots, x_l\}} \gamma_i(x), \quad i \in N,$$
where $\gamma_i(x)$ is computed by formula (4) in the network game $G_x = (N, \theta(x))$.
2.3. The definition of a multistage network game with perfect information
Definition 2. The n-person multistage network game with perfect information is a graph tree $K$ with the following properties:
– the set of vertices $X$ is divided into $n+1$ disjoint sets $P_1, P_2, \ldots, P_n, P_{n+1}$. Here $P_i$, $i \in N$, is the set of personal moves of player $i$, and $P_{n+1} = \{x : F_x = \emptyset\}$ is the set of terminal vertices;
– in each vertex $x \in X$ a network $G_x = (N, \theta(x))$ is uniquely defined, where $N$ is the set of nodes (the set of players), and $\theta(x): g_x \to \mathbb{R}$ is a utility function.
Definition 3. A strategy $u_i(\cdot)$ of a player $i \in N$ is a mapping which assigns to each vertex $x \in P_i$ a vertex $y \in F_x$.
For each n-tuple of strategies (strategy profile) $u(\cdot) = (u_1(\cdot), \ldots, u_n(\cdot))$ in the game on the graph tree $K$ define a payoff function in the following way. Let the strategy profile $u(\cdot)$ generate a path $\{x_0, x_1, \ldots, x_l\}$ from the initial vertex $x_0$ to a terminal vertex $x_l$. Then the payoff function of player $i$ has the form:
$$H_i(u(\cdot)) = \sum_{x \in \{x_0, \ldots, x_l\}} \gamma_i(x), \quad i \in N.$$
Here $\gamma_i(x)$ is the payment to player $i$. The payment corresponds to the $i$-th component of the Shapley value, which is computed for $v(x, \cdot)$ defined for the network game $G_x = (N, \theta(x))$ in the vertex $x$ (see Sec. 2.2).
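Definition 4. A strategy profile $u^*(\cdot) = (u_1^*(\cdot), \ldots, u_i^*(\cdot), \ldots, u_n^*(\cdot))$ is called a Nash equilibrium in a multistage network game on a graph tree $K$ with the initial vertex $x_0$ if
$$H_i(u^*(\cdot) \,\|\, u_i(\cdot)) \le H_i(u^*(\cdot))$$
for each $i \in N$ and each admissible $u_i$. Here $(u^*(\cdot) \,\|\, u_i(\cdot))$ denotes the profile $u^*(\cdot)$ in which the strategy of player $i$ is replaced by $u_i(\cdot)$.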
3. The construction of Nash equilibrium in the multistage network game
Suppose that the game length is equal to $l + 1$. To define the optimal behavior of each player we use the concept of Nash equilibrium in a finite multistage game with perfect information.
Introduce the Bellman function $\varphi_i^t$ as the payoff of player $i$ in a Nash equilibrium (Nash, 1951) in the $(l - t)$-stage game (suppose $\varphi_i^{l+1} = 0$). The values of the Bellman function are defined in the usual way by backward induction (solving the Bellman equation with boundary conditions at the terminal vertices) in each vertex of the graph tree $K$.
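In this case the boundary condition has the form:
$$\varphi_i^l(x_l) = \gamma_i(x_l), \quad i \in N,$$
for each terminal vertex $x_l \in P_{n+1}$.
In an intermediate vertex $x_t$ of the graph tree $K$ the Bellman function satisfies the following functional equation:
$$
\begin{aligned}
\varphi^t_{i(x_t)}(x_t) &= \max_{y \in F_{x_t}} \left( \gamma_{i(x_t)}(x_t) + \varphi^{t+1}_{i(x_t)}(y) \right) \\
&= \gamma_{i(x_t)}(x_t) + \max_{y \in F_{x_t}} \varphi^{t+1}_{i(x_t)}(y) \\
&= \gamma_{i(x_t)}(x_t) + \varphi^{t+1}_{i(x_t)}(\bar y),
\end{aligned} \tag{5}
$$
where $\bar y \in F_{x_t}$ is a vertex at which the maximum is attained. For a player $j \ne i(x_t)$ the values of the Bellman function are obtained from the condition:
$$\varphi_j^t(x_t) = \gamma_j(x_t) + \varphi_j^{t+1}(\bar y). \tag{6}$$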
Solving the Bellman equation we obtain the values $\varphi_i^t$, $t = 0, \ldots, l$, $i \in N$. At $t = 0$ the equation is completely solved. We call the n-dimensional profile $(\varphi_1^0(x_0), \ldots, \varphi_n^0(x_0))$ the value of the multistage network game.
Together with the value of the multistage network game we obtain the players' optimal strategies, which constitute a subgame perfect equilibrium: in each nonterminal vertex $x \in X$ of the graph tree $K$ player $i(x)$ chooses a vertex $\bar y \in F_x$ in accordance with the rule (5). In the equilibrium a path from the initial vertex to a terminal one is realized. We call such a path the optimal path in the multistage network game.
Theorem 1. An n-tuple of strategies $u^*(\cdot) = (u_1^*(\cdot), \ldots, u_n^*(\cdot))$, where for each vertex $x \notin P_{n+1}$ the strategy $u_i^*(\cdot)$, $i \in N$, is defined as $u_i^*(x) = \bar y$, and $\bar y$ can be found using (5), constitutes a subgame perfect equilibrium.
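The backward induction behind (5) and (6) can be sketched as a short recursive procedure. The code below is a hedged illustration, not the authors' algorithm as published: the function name `backward_induction`, the dictionary-based tree encoding, the 0-based player indices, and the toy tree with its payments are all hypothetical. When several successors maximize the mover's continuation payoff, Python's `max` simply keeps the first one, which is an arbitrary tie-breaking assumption.

```python
from typing import Dict, Hashable, List, Sequence, Tuple

Vertex = Hashable


def backward_induction(
    children: Dict[Vertex, List[Vertex]],   # F_x for every nonterminal vertex
    mover: Dict[Vertex, int],               # i(x), as a 0-based player index
    gamma: Dict[Vertex, Sequence[float]],   # stage payments in every vertex
    root: Vertex,
) -> Tuple[Dict[Vertex, Tuple[float, ...]], Dict[Vertex, Vertex]]:
    """Return the Bellman values and the equilibrium choice in each decision vertex."""
    value: Dict[Vertex, Tuple[float, ...]] = {}
    strategy: Dict[Vertex, Vertex] = {}

    def solve(x: Vertex) -> None:
        successors = children.get(x, [])
        if not successors:
            # Boundary condition: in a terminal vertex the value is the payment.
            value[x] = tuple(gamma[x])
            return
        for y in successors:
            solve(y)
        i = mover[x]
        # Rule (5): the mover maximizes his own continuation payoff.
        best = max(successors, key=lambda y: value[y][i])
        strategy[x] = best
        # Rules (5) and (6): every player adds his stage payment in x.
        value[x] = tuple(g + f for g, f in zip(gamma[x], value[best]))

    solve(root)
    return value, strategy


# Hypothetical two-player tree: "a" (player 0) -> "b" (player 1) or "c"; "b" -> "d" or "e".
children = {"a": ["b", "c"], "b": ["d", "e"]}
mover = {"a": 0, "b": 1}
gamma = {"a": (0, 0), "b": (1, 0), "c": (2, 5), "d": (0, 3), "e": (3, 1)}
value, strategy = backward_induction(children, mover, gamma, "a")
print(value["a"], strategy)
```

Note that the strategy is recorded in every decision vertex, including vertices off the realized path, which is what subgame perfection requires.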
Remark. Let a vertex $\bar y \in F_{x_t}$ maximize the function $\varphi^{t+1}_{i(x_t)}(y)$ in (5). Suppose that a vertex $\tilde y \in F_{x_t}$ ($\tilde y \ne \bar y$) is also a maximum point of this function. Obviously, the following equality holds:
$$\varphi^{t+1}_{i(x_t)}(\bar y) = \varphi^{t+1}_{i(x_t)}(\tilde y),$$
which leads to the same value $\varphi^t_{i(x_t)}(x_t)$. Therefore, the player who makes a decision in the vertex $x_t$ (player $i(x_t)$) may choose any vertex $y \in F_{x_t}$ which maximizes the function $\varphi^{t+1}_{i(x_t)}(y)$ in (5).
But, generally, in the vertices $\bar y$ and $\tilde y$ the following inequality is true for each player $j \in N$, $j \ne i(x_t)$:
$$\varphi_j^{t+1}(\bar y) \ne \varphi_j^{t+1}(\tilde y).$$
This means that player $i(x_t)$, choosing a point from the set
$$I(x_t) = \arg\max_{y \in F_{x_t}} \varphi^{t+1}_{i(x_t)}(y), \tag{7}$$
influences the decision of the following player (because of the difference between the values of the Bellman functions of the players who move after $i(x_t)$ at the points of the set $I(x_t)$). This implies that in general the optimal path in the multistage network game is not unique.
Nonuniqueness of the optimal trajectory can be avoided by introducing the notion of an indifferent Nash equilibrium in the multistage game with perfect information (Petrosjan and Mamkina, 2005).
Since in general $|I(x_t)| > 1$, player $i(x_t)$ is supposed to choose each vertex from the set $I(x_t)$ with equal probability, i.e. $p_{x_t}(y) = 1/|I(x_t)|$ for each $y \in I(x_t)$. Then for an intermediate vertex $x_t$ of the graph tree $K$, $\varphi^t_{i(x_t)}$ satisfies the following equation (similar to (5)):
$$\varphi^t_{i(x_t)}(x_t) = \gamma_{i(x_t)}(x_t) + \sum_{y \in I(x_t)} p_{x_t}(y)\, \varphi^{t+1}_{i(x_t)}(y). \tag{8}$$
For a player $j \ne i(x_t)$, the value of $\varphi_j^t$ can be calculated as follows:
$$\varphi_j^t(x_t) = \gamma_j(x_t) + \sum_{y \in I(x_t)} p_{x_t}(y)\, \varphi_j^{t+1}(y). \tag{9}$$
Solving the Bellman equation we obtain the values $\varphi_i^t$, $t = 0, \ldots, l$, $i \in N$. At $t = 0$ the equation is completely solved. We also call the n-dimensional profile $(\varphi_1^0(x_0), \ldots, \varphi_n^0(x_0))$ the value of the multistage network game.
Similarly to Theorem 1, the following theorem is true.
Theorem 2. An n-tuple of strategies $u^{IE}(\cdot) = (u_1^{IE}(\cdot), \ldots, u_n^{IE}(\cdot))$, where for each vertex $x \notin P_{n+1}$ the strategy
$$u_i^{IE}(x) = \{p_x(y)\}, \quad y \in I(x), \quad p_x(y) = \frac{1}{|I(x)|}, \quad i \in N,$$
constitutes a subgame perfect equilibrium. Here $y$ can be found using (7)-(9).
4. Numerical example
To illustrate the algorithm for constructing a Nash equilibrium in the network game we give a numerical example.
Consider a 3-stage network game. Let $N = \{1, 2, 3\}$ be the set of players. Construct a graph tree $K$ with the initial vertex $x_0$.
Fig. 1. Network $G_{x_0}$
Let the network shown in Fig. 1 be given in $x_0$.
The set of edges is $g_{x_0} = \{(1, 2), (2, 3), (3, 2)\}$. The utility matrix $\theta(x_0)$ has the form:
$$\theta(x_0) = \begin{pmatrix} 0 & 2 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}.$$
Suppose that Pl. 1 makes a move in the initial vertex. He has 3 alternatives: (1) not to take any action (the game process moves to the vertex $x_1$); (2) break the connection with Pl. 2 (the game process moves to the vertex $x_2$); (3) propose a connection to Pl. 3 (the game process moves to the vertex $x_3$). We get:
$$
\begin{aligned}
g_{x_1} &= g_{x_0}, && \text{if Pl. 1 chooses the first alternative in } x_0;\\
g_{x_2} &= g_{x_0} \setminus (1, 2), && \text{if Pl. 1 chooses the second alternative in } x_0;\\
g_{x_3} &= g_{x_0} \cup (1, 3), && \text{if Pl. 1 chooses the third alternative in } x_0.
\end{aligned}
$$
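Suppose that in $x_1$, $x_2$, $x_3$ the utility matrices are:
$$\theta(x_1) = \begin{pmatrix} 0 & -3 & -1 \\ 2 & 0 & 2 \\ 5 & 1 & 0 \end{pmatrix}, \quad \theta(x_2) = \theta(x_3) = \begin{pmatrix} 0 & 3 & -2 \\ -1 & 0 & 1 \\ 3 & 1 & 0 \end{pmatrix}.$$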
Let $x_1$ and $x_3$ be terminal vertices, and let $x_2$ be a personal position of Pl. 2.
In $x_2$ Pl. 2 has 3 alternatives: (1) not to take any action (the game process moves to the vertex $x_4$); (2) propose a connection to Pl. 1 (the game process moves to the vertex $x_5$); (3) break the connection with Pl. 3 (the game process moves to the vertex $x_6$). We get:
$$
\begin{aligned}
g_{x_4} &= g_{x_2}, && \text{if Pl. 2 chooses the first alternative in } x_2;\\
g_{x_5} &= g_{x_2} \cup (2, 1), && \text{if Pl. 2 chooses the second alternative in } x_2;\\
g_{x_6} &= g_{x_2} \setminus (2, 3), && \text{if Pl. 2 chooses the third alternative in } x_2.
\end{aligned}
$$
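Suppose that in $x_4$, $x_5$, $x_6$ the utility matrices are:
$$\theta(x_4) = \begin{pmatrix} 0 & -3 & -1 \\ 2 & 0 & 2 \\ 5 & 1 & 0 \end{pmatrix}, \quad \theta(x_5) = \theta(x_6) = \begin{pmatrix} 0 & -1 & -1 \\ -1 & 0 & 2 \\ 2 & 4 & 0 \end{pmatrix}.$$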
Let $x_4$ and $x_6$ be terminal vertices, and let $x_5$ be a personal position of Pl. 3.
In $x_5$ Pl. 3 has 3 alternatives: (1) not to take any action (the game process moves to the vertex $x_7$); (2) propose a connection to Pl. 1 (the game process moves to the vertex $x_8$); (3) break the connection with Pl. 2 (the game process moves to the vertex $x_9$). We get:
$$
\begin{aligned}
g_{x_7} &= g_{x_5}, && \text{if Pl. 3 chooses the first alternative in } x_5;\\
g_{x_8} &= g_{x_5} \cup (3, 1), && \text{if Pl. 3 chooses the second alternative in } x_5;\\
g_{x_9} &= g_{x_5} \setminus (3, 2), && \text{if Pl. 3 chooses the third alternative in } x_5.
\end{aligned}
$$
Suppose that in $x_7$, $x_8$, $x_9$ the utility matrices are:
$$\theta(x_7) = \theta(x_8) = \theta(x_9) = \begin{pmatrix} 0 & -3 & -1 \\ 2 & 0 & 2 \\ 5 & 1 & 0 \end{pmatrix}.$$
Let $x_7$, $x_8$, $x_9$ be terminal vertices.
Then the sets of personal positions $P_1$, $P_2$, $P_3$ and the set of terminal positions $P_4$ have the form:
$$P_1 = \{x_0\}, \quad P_2 = \{x_2\}, \quad P_3 = \{x_5\}, \quad P_4 = \{x_1, x_3, x_4, x_6, x_7, x_8, x_9\},$$
and the graph tree $K$ is shown in Fig. 2.
Fig. 2. Graph tree $K$
First, compute the individual payments to each player in each vertex of the graph tree $K$.
Consider $x_0$. Construct the characteristic function by the rule (1):
$$v(x_0, \{1, 2, 3\}) = 4, \quad v(x_0, \{1, 2\}) = 2, \quad v(x_0, \{1, 3\}) = 0, \quad v(x_0, \{2, 3\}) = 2,$$
$$v(x_0, \{1\}) = v(x_0, \{2\}) = v(x_0, \{3\}) = 0.$$
Individual payments to the players in $x_0$ are computed in accordance with the Shapley value (4). We get:
$$\gamma(x_0) = (1, 2, 1).$$
Individual payments to the players in the other vertices of the graph tree $K$ are computed similarly. We give the final values:
$$
\begin{aligned}
\gamma(x_1) &= (-1.5, 0, 1.5), & \gamma(x_6) &= (0, 2, 2),\\
\gamma(x_2) &= (0, 1, 1), & \gamma(x_7) &= (1, 2.5, 1.5),\\
\gamma(x_3) &= (0.5, 2.5, 0), & \gamma(x_8) &= (3.5, 2.5, 4),\\
\gamma(x_4) &= (0, 1.5, 1.5), & \gamma(x_9) &= (1, 2, 1).\\
\gamma(x_5) &= (-0.5, 2.5, 3), & &
\end{aligned}
$$
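The listed payments can be recomputed from the edge sets and utility matrices of the example. The sketch below is only an illustration: the function name `stage_payments` and the per-vertex dictionaries (edge to utility, restricted to the edges actually present in each $g_x$) are assumptions. It uses the shortcut that, for a network without self-loops, formula (4) gives each player half of the utility carried by the edges incident to him.

```python
from typing import Dict, Tuple

Edge = Tuple[int, int]


def stage_payments(theta: Dict[Edge, float], n: int) -> Tuple[float, ...]:
    """Payments (gamma_1, ..., gamma_n): half the utility on edges incident to each player."""
    return tuple(
        0.5 * sum(u for edge, u in theta.items() if k in edge)
        for k in range(1, n + 1)
    )


# For each vertex: the edges of g_x with the corresponding entries of the utility matrix.
networks = {
    "x0": {(1, 2): 2, (2, 3): 1, (3, 2): 1},
    "x1": {(1, 2): -3, (2, 3): 2, (3, 2): 1},
    "x2": {(2, 3): 1, (3, 2): 1},
    "x3": {(1, 2): 3, (1, 3): -2, (2, 3): 1, (3, 2): 1},
    "x4": {(2, 3): 2, (3, 2): 1},
    "x5": {(2, 1): -1, (2, 3): 2, (3, 2): 4},
    "x6": {(3, 2): 4},
    "x7": {(2, 1): 2, (2, 3): 2, (3, 2): 1},
    "x8": {(2, 1): 2, (2, 3): 2, (3, 2): 1, (3, 1): 5},
    "x9": {(2, 1): 2, (2, 3): 2},
}

for name, theta in networks.items():
    print(name, stage_payments(theta, n=3))
```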
After computing the payments to the players in each vertex of the graph tree $K$, computing the equilibrium in the multistage network game presents no difficulty. The procedure is fully similar to the construction of a Nash equilibrium in a multistage game with perfect information, with the difference that in the classical setting the players' payoffs are defined in the terminal vertices and are equal to zero in the intermediate ones. The desired equilibrium in the multistage network game is found by using (4)-(6).
The players' equilibrium strategies are:
$$u_1^*(x_0) = x_2, \quad u_2^*(x_2) = x_5, \quad u_3^*(x_5) = x_8.$$
In the equilibrium $(u_1^*, u_2^*, u_3^*)$ the optimal path $\{x_0, x_2, x_5, x_8\}$ from the initial vertex $x_0$ to the terminal vertex $x_8$ is realized.
Along the optimal path the game evolves in the following way. At the initial stage the network $G_{x_0}$ shown in Fig. 1 is given. Then Pl. 1 breaks the connection with Pl. 2. This leads to the network $G_{x_2}$ shown in Fig. 3. Then Pl. 2 makes a move and proposes a connection to Pl. 1. This leads to the network $G_{x_5}$ shown in Fig. 4. Finally, Pl. 3 ends the game by proposing a connection to Pl. 1. This leads to the network $G_{x_8}$ shown in Fig. 5.
Fig. 3. Network $G_{x_2}$
Fig. 4. Network $G_{x_5}$
Fig. 5. Network $G_{x_8}$
The value of the multistage network game equals $(4, 8, 9)$, and the stage payments to the players are as follows:
$$\gamma(x_0) = (1, 2, 1), \quad \gamma(x_2) = (0, 1, 1), \quad \gamma(x_5) = (-0.5, 2.5, 3), \quad \gamma(x_8) = (3.5, 2.5, 4).$$
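The equilibrium of the example can be retraced step by step from the stage payments. The following sketch is illustrative: the variable names, the 0-based player indices, and the hard-coded processing order of the decision vertices (deepest first) are assumptions made for this particular tree. It applies rules (5) and (6) in $x_5$, $x_2$ and $x_0$, then follows the chosen successors from $x_0$ and checks that the stage payments along the path add up to the value of the game.

```python
# Stage payments gamma(x) taken from the example.
gamma = {
    "x0": (1, 2, 1), "x1": (-1.5, 0, 1.5), "x2": (0, 1, 1),
    "x3": (0.5, 2.5, 0), "x4": (0, 1.5, 1.5), "x5": (-0.5, 2.5, 3),
    "x6": (0, 2, 2), "x7": (1, 2.5, 1.5), "x8": (3.5, 2.5, 4), "x9": (1, 2, 1),
}
# Graph tree K: successors of each decision vertex and the player who moves there.
children = {"x0": ["x1", "x2", "x3"], "x2": ["x4", "x5", "x6"], "x5": ["x7", "x8", "x9"]}
mover = {"x0": 0, "x2": 1, "x5": 2}   # 0-based indices of Pl. 1, Pl. 2, Pl. 3

# Boundary condition: in terminal vertices the Bellman value is the stage payment.
value = {x: g for x, g in gamma.items() if x not in children}
choice = {}

# Decision vertices are processed from the deepest one up to the root.
for x in ["x5", "x2", "x0"]:
    i = mover[x]
    best = max(children[x], key=lambda y: value[y][i])             # rule (5)
    choice[x] = best
    value[x] = tuple(g + f for g, f in zip(gamma[x], value[best]))  # rules (5), (6)

# Follow the equilibrium choices from the initial vertex.
path = ["x0"]
while path[-1] in choice:
    path.append(choice[path[-1]])

# Total payoffs along the path must coincide with the value of the game.
totals = tuple(sum(gamma[x][i] for x in path) for i in range(3))

print(path, value["x0"], totals)
```

In this example no decision vertex has two successors that are equally good for the mover, so the arbitrary tie-breaking of `max` never comes into play.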
References
Bellman, R. (1957). Dynamic Programming. Princeton University Press: Princeton, NJ.
Dutta, B., van den Nouweland, A., Tijs, S. (1998). Link formation in cooperative situations. International Journal of Game Theory, 27, 245-256.
Kuhn, H. W. (1953). Extensive Games and the Problem of Information. Ann. Math Studies, 28, 193-216.
Myerson, R. (1977). Graphs and cooperation in games. Mathematics of Operations Research, 2, 225-229.
Nash, J. (1951). Non-cooperative Games. Ann. of Math., 54, 286-295.
Petrosjan, L. A., Kusyutin, D. V. (2000). Games in extensive form: optimality and stability. St. Petersburg Univ. Press, St. Petersburg.
Petrosjan, L. A., Mamkina, S. I. (2005). Value for the Games with Changing Coalitional Structure. Games Theory and Applications, 10, 141-152.
Petrosyan, L., Sedakov, A. (2009). Multistage networking games with full information. Upravlenie bolshimi sistemami, 26.1, 121-138 (in Russian).
Petrosjan, L. A., Sedakov, A. A., Syurin, A. N. (2006). Multistage Games with Coalitional Structure. Viestnik of St. Petersburg Univ., Sec. 10, Vol. 3, 97-110.
Shapley, L. S. (1953). A Value for n-Person Games. Contributions to the Theory of Games II, Princeton: Princeton University Press, 307-317.