Scientific article: 'New characteristic function for multistage dynamic games' (field: Mathematics)

CC BY
Keywords
MULTISTAGE GAME / CHARACTERISTIC FUNCTION / TIME-CONSISTENCY / STRONG TIME-CONSISTENCY

Abstract of the scientific article in mathematics. Authors: Pankratova Yaroslavna B., Petrosyan Leon A.

Finite-stage dynamic n-person games with transferable payoffs are considered. The cooperative version of the game is defined, and a new approach to constructing characteristic functions in multistage games, based on the characteristic functions of the stage games, is proposed. It is proved that the values of this new characteristic function dominate the values of the characteristic function constructed using the min-max approach. This allows constructing a subcore of the classical core in the multistage game under consideration and guarantees that the new approach leads to time-consistent (L. Petrosyan, G. Zaccour, 2003; L. Petrosyan, 1991) and in some cases strongly time-consistent (L. Petrosyan, 1993) solutions. An example is provided showing the construction of the newly defined characteristic function and the time consistency and strong time consistency of the core.


New characteristic function for multistage dynamic games (Russian-language abstract)

The paper considers finite-stage dynamic n-person games with transferable payoffs. For such games a cooperative version of the game is developed, and a new approach to constructing the characteristic function is proposed, based on the characteristic functions defined in the stage (simultaneous) games. It is shown that the values of the new characteristic function for each coalition exceed the values of the characteristic function constructed using the maximin approach. This makes it possible to use the new characteristic function to construct a subcore of the multistage game under consideration. Conditions are obtained which guarantee that the new approach leads to a time-consistent solution (see the works of L. Petrosyan and G. Zaccour, 2003, and L. Petrosyan, 1993) and, in some cases, to a strongly time-consistent solution that coincides with the subcore (paper by L. Petrosyan, 1993). A test example of determining the new characteristic function is given, and the strong time consistency of the subcore constructed with its help is shown.

Text of the scientific article 'New characteristic function for multistage dynamic games'

UDC 519.837 Vestnik of Saint Petersburg University. Applied Mathematics. Computer Science... 2018. Vol. 14. Iss. 4

MSC 91A20, 91A25

New characteristic function for multistage dynamic games*

Y. B. Pankratova, L. A. Petrosyan

St. Petersburg State University, 7—9, Universitetskaya nab., St. Petersburg, 199034, Russian Federation

For citation: Pankratova Y. B., Petrosyan L. A. New characteristic function for multistage dynamic games. Vestnik of Saint Petersburg University. Applied Mathematics. Computer Science. Control Processes, 2018, vol. 14, iss. 4, pp. 316-324. https://doi.org/10.21638/11702/spbu10.2018.404


Keywords: multistage game, characteristic function, time-consistency, strong time-consistency.

Introduction. There are many ways to define a characteristic function in cooperative games. For one-stage games, the recent literature defines the characteristic function axiomatically. This approach, however, is difficult to use for multistage dynamic games, because the evolution of the characteristic function plays an important role in solving the cooperative version of a dynamic game. In this paper we define the characteristic function of a multistage game through the characteristic functions of its stage games; this approach appears to be efficient from the computational point of view. Moreover, the newly defined characteristic function dominates the classical characteristic function in each stage game. This makes it possible to construct time-consistent and strongly time-consistent solution concepts using a non-negative imputation distribution procedure (IDP).

Main result. In the paper we consider a finite multistage game $G(z_1)$, which starts from the one-stage game $\Gamma(z_1)$ at the vertex $z_1$ of the game tree $G$. Denote by $\Gamma(z)$ the one-stage game at the vertex $z$ of the game tree:

$$\Gamma(z) = \langle N;\ U_1^z, \ldots, U_i^z, \ldots, U_n^z;\ K_1^z, \ldots, K_i^z, \ldots, K_n^z \rangle. \tag{1}$$

In formula (1), $N$ is the set of players, which is the same for all games $\Gamma(z)$, $z \in G$; $U_i^z$ is the set of strategies of player $i \in N$, and $K_i^z$ is the payoff function of player $i$. We consider the case when the game $\Gamma(z)$ is finite, i.e. the sets $U_i^z$ are finite. During the game $G(z_1)$ a finite sequence of one-stage games

$$\Gamma(z_1), \ldots, \Gamma(z_k), \ldots, \Gamma(z_l)$$

is realized. If the stage game $\Gamma(z_k)$ takes place, the next stage game $\Gamma(z_{k+1})$ occurs at the vertex $z_{k+1} = T(z_k; u_1^{z_k}, \ldots, u_n^{z_k})$, which depends on the vertex $z_k$ and on the strategies $u^{z_k} = (u_1^{z_k}, \ldots, u_n^{z_k})$ chosen by the players in the stage game $\Gamma(z_k)$. A strategy of player $i \in N$ in the game $G(z_1)$ is a mapping $u_i(\cdot)$ that determines the choice of player $i$ in each possible one-stage game, i.e. $u_i(z) = u_i^z \in U_i^z$. Each strategy profile $u = (u_1(\cdot), \ldots, u_n(\cdot))$ uniquely determines a trajectory $z = (z_1, \ldots, z_l)$ and the payoffs of the players, which are the sums of their payoffs in the corresponding stage games.

* This work was supported by the Russian Foundation for Basic Research (grant N 17-51-53030). © Saint Petersburg State University, 2018

A strategy profile $u = (u_1(\cdot), \ldots, u_n(\cdot))$ that maximizes the sum of the players' payoffs in the game $G(z_1)$ is called a cooperative strategy profile. It generates a sequence of one-stage games $\Gamma(z_1), \Gamma(z_2), \ldots, \Gamma(z_l)$ and the corresponding path $z = (z_1, \ldots, z_l)$, which we call the cooperative trajectory.

We are interested in the subgames of the game $G(z_1)$ along the cooperative trajectory,

$$G(z_1), G(z_2), \ldots, G(z_l),$$

each of which starts from the one-stage game $\Gamma(z_k)$, $k = 1, \ldots, l$.

Consider the cooperative version of the game $G(z_1)$ and of the subgames $G(z_k)$, $k = 1, \ldots, l$.

Let $\bar V(z_k, S)$, $S \subset N$, be the characteristic function of the subgame $G(z_k)$ defined in the classical sense [1], i.e. as the lower value of the zero-sum game based on $G(z_k)$ between coalition $S$ (the first player) and coalition $N \setminus S$ (the second player), where the payoff of coalition $S$ equals the sum of the payoffs of its members. In the same classical way, define the characteristic function $V(z, S)$, $S \subset N$, of the one-stage game $\Gamma(z)$ as the lower value of the zero-sum game associated with $\Gamma(z)$ between coalition $S$ as the first player and coalition $N \setminus S$ as the second.

Define $W(S)$, $S \subset N$, as follows:

$$W(S) = \max_{z} V(z, S), \quad S \subset N.$$
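The two definitions above can be illustrated with a minimal Python sketch. It assumes that the lower value is taken in pure strategies (maximum over the coalition's joint strategies of the minimum over the complement's joint strategies); the function names `lower_value` and `W`, the vertex labels `za`, `zb` and the two 2-player payoff tables are hypothetical and are not taken from the paper.

```python
from itertools import product

def lower_value(strategy_sets, payoff, coalition):
    """Pure-strategy maxmin of `coalition` against its complement.

    strategy_sets: dict player -> tuple of strategies
    payoff: dict strategy profile (ordered by player) -> tuple of payoffs
    """
    players = sorted(strategy_sets)
    inside = [p for p in players if p in coalition]
    outside = [p for p in players if p not in coalition]
    best = None
    for own in product(*(strategy_sets[p] for p in inside)):
        worst = None
        for other in product(*(strategy_sets[p] for p in outside)):
            choice = dict(zip(inside, own))
            choice.update(zip(outside, other))
            profile = tuple(choice[p] for p in players)
            # Coalition payoff = sum of the payoffs of its members.
            total = sum(payoff[profile][players.index(p)] for p in inside)
            worst = total if worst is None else min(worst, total)
        best = worst if best is None else max(best, worst)
    return best

def W(stage_games, strategy_sets, coalition):
    """W(S) = max over stage games z of V(z, S)."""
    return max(lower_value(strategy_sets, game, coalition)
               for game in stage_games.values())

# Hypothetical data: two stage games for players 1 and 2.
STRATEGIES = {1: (1, 2), 2: (1, 2)}
STAGE_GAMES = {
    "za": {(1, 1): (3, 3), (1, 2): (0, 4), (2, 1): (4, 0), (2, 2): (1, 1)},
    "zb": {(1, 1): (2, 2), (1, 2): (2, 0), (2, 1): (0, 2), (2, 2): (5, 1)},
}

for S in ({1}, {2}, {1, 2}):
    print(sorted(S), W(STAGE_GAMES, STRATEGIES, S))
```

For the grand coalition the complement is empty, so the inner loop runs once and the routine returns the maximal joint payoff of the stage game.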

Denote by $\bar C(z_k)$ the core of the multistage game $G(z_k)$, by $C(z_k)$ the core of the one-stage game $\Gamma(z_k)$ realized at stage $k$ at the vertex $z_k$, and by $D(z_k)$ the set of imputations $\alpha^{z_k} = (\alpha_1^{z_k}, \ldots, \alpha_n^{z_k})$ in $\Gamma(z_k)$ satisfying the conditions

$$\sum_{i \in S} \alpha_i^{z_k} \geq W(S), \quad S \subset N,\ S \neq N,$$

$$\sum_{i \in N} \alpha_i^{z_k} = V(z_k, N),$$

where $V(z_k, N)$ is the sum of the players' payoffs in the game $\Gamma(z_k)$ when the players use the cooperative strategy profile in $G(z_k)$. Suppose that $\bar C(z_k) \neq \emptyset$, $C(z_k) \neq \emptyset$ and $D(z_k) \neq \emptyset$. A necessary condition for the last of these is the inequality

$$W(S) \leq \min_{z} V(z, N).$$

Define

$$\bar W(z_k, S) = (l - k + 1)\, W(S), \tag{2}$$

where $l$ is the number of stages in the game $G(z_1)$. Formula (2) defines a new characteristic function in the multistage game, based on $W(S)$, the analogue of the characteristic function of the one-stage game $\Gamma(z)$. Consider the set $\bar D(z_k)$ of imputations $\bar\alpha^{z_k} = (\bar\alpha_1^{z_k}, \ldots, \bar\alpha_n^{z_k})$ in the game $G(z_k)$ such that

$$\sum_{i \in S} \bar\alpha_i^{z_k} \geq \bar W(z_k, S), \quad S \subset N,\ S \neq N,$$

$$\sum_{i \in N} \bar\alpha_i^{z_k} = \bar V(z_k, N).$$

Suppose that the condition

$$\max_{z} V(z, S) = W(S) \leq V(z, N) = V(N), \quad z \in G(z_1),\ S \neq N,$$

is satisfied. The following theorem holds.

Theorem 1. The inclusion

$$\bar D(z_k) \subset \bar C(z_k) \tag{3}$$

is true.

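Proof. The core $\bar C(z_k)$ is the set of all imputations $\alpha^{z_k} = (\alpha_1^{z_k}, \ldots, \alpha_n^{z_k})$ in the cooperative version of the multistage subgame $G(z_k)$ such that

$$\sum_{i \in S} \alpha_i^{z_k} \geq \bar V(z_k, S), \quad S \subset N,\ S \neq N, \qquad \sum_{i \in N} \alpha_i^{z_k} = \bar V(z_k, N).$$

Hence, to prove (3) it is sufficient to prove

$$\bar V(z_k, S) \leq \bar W(z_k, S), \quad k = 1, \ldots, l,\ S \subset N. \tag{4}$$

We prove (4) by induction on the number of stages in the subgame $G(z_k)$. Suppose first that the subgame has one stage; then it is the subgame $G(z_l)$, which coincides with the one-stage game $\Gamma(z_l)$. In this case $\bar V(z_l, S) = V(z_l, S) \leq W(S) = \bar W(z_l, S)$, since the subgame $G(z_l)$ contains only the vertex $z_l$. Consider the less trivial case of two stages. For the lower value $\bar V(z_{l-1}, S)$ we have the following analogue of the Bellman equation:

$$\bar V(z_{l-1}, S) = \max_{u_i,\, i \in S}\ \min_{u_j,\, j \in N \setminus S} \Big( \sum_{i \in S} K_i^{z_{l-1}}(u_1, \ldots, u_n) + \bar V(z_l, S) \Big),$$

where $z_l = T(z_{l-1}; u_1, \ldots, u_n)$, or

$$\bar V(z_{l-1}, S) = \max_{u_i,\, i \in S}\ \min_{u_j,\, j \in N \setminus S} \Big( \sum_{i \in S} K_i^{z_{l-1}}(u_1, \ldots, u_n) + \bar V(T(z_{l-1}; u_1, \ldots, u_n), S) \Big).$$

But

$$\bar V(z_l, S) = V(z_l, S) \leq \max_{z} V(z, S) = W(S),$$

and we get

$$\begin{aligned}
\bar V(z_{l-1}, S) &\leq \max_{u_i,\, i \in S}\ \min_{u_j,\, j \in N \setminus S} \Big( \sum_{i \in S} K_i^{z_{l-1}}(u_1, \ldots, u_n) + W(S) \Big) \\
&= W(S) + \max_{u_i,\, i \in S}\ \min_{u_j,\, j \in N \setminus S} \sum_{i \in S} K_i^{z_{l-1}}(u_1, \ldots, u_n) \\
&= W(S) + V(z_{l-1}, S) \leq W(S) + \max_{z} V(z, S) = 2W(S) = \bar W(z_{l-1}, S).
\end{aligned}$$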

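Suppose the theorem is true for all subgames with $l - k$ stages, i.e. for the subgames $G(z_{k+1})$. Then we can write

$$\bar V(z_k, S) = \max_{u_i,\, i \in S}\ \min_{u_j,\, j \in N \setminus S} \Big( \sum_{i \in S} K_i^{z_k}(u_1, \ldots, u_n) + \bar V(T(z_k; u_1, \ldots, u_n), S) \Big).$$

By the induction hypothesis

$$\bar V(T(z_k; u_1, \ldots, u_n), S) = \bar V(z_{k+1}, S) \leq \bar W(z_{k+1}, S) = (l - k)\, W(S).$$

This gives us

$$\begin{aligned}
\bar V(z_k, S) &\leq \max_{u_i,\, i \in S}\ \min_{u_j,\, j \in N \setminus S} \Big( \sum_{i \in S} K_i^{z_k}(u_1, \ldots, u_n) + (l - k)\, W(S) \Big) \\
&= (l - k)\, W(S) + \max_{u_i,\, i \in S}\ \min_{u_j,\, j \in N \setminus S} \sum_{i \in S} K_i^{z_k}(u_1, \ldots, u_n) \\
&\leq (l - k)\, W(S) + \max_{z} V(z, S) = (l - k)\, W(S) + W(S) \\
&= (l - k + 1)\, W(S) = \bar W(z_k, S).
\end{aligned}$$

The theorem is proved.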
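The Bellman-type recursion used in the proof can be run on a small instance. The sketch below is only an illustration under stated assumptions: lower values are taken in pure strategies, and the two-stage, two-player game (vertex labels `z1`, `z2`, `z2p`, payoff tables `GAME_A`, `GAME_B`, and all function names) is hypothetical rather than the paper's example. It computes the classical value of each subgame by backward recursion and compares it with $(l - k + 1)W(S)$, as in inequality (4).

```python
from itertools import product

PLAYERS = (1, 2)
STRATS = (1, 2)

# Hypothetical stage payoffs: profile (u1, u2) -> (K1, K2).
GAME_A = {(1, 1): (3, 3), (1, 2): (0, 4), (2, 1): (4, 0), (2, 2): (1, 1)}
GAME_B = {(1, 1): (2, 2), (1, 2): (2, 0), (2, 1): (0, 2), (2, 2): (5, 1)}

# vertex -> (stage payoffs, transition map T); None marks the last stage.
TREE = {
    "z1": (GAME_A, {(1, 1): "z2", (2, 2): "z2", (1, 2): "z2p", (2, 1): "z2p"}),
    "z2": (GAME_A, None),
    "z2p": (GAME_B, None),
}
DEPTH = {"z1": 1, "z2": 2, "z2p": 2}  # stage index k of each vertex
L = 2                                  # number of stages l

def maxmin(coalition, objective):
    """max over the coalition's strategies of min over the complement's."""
    inside = [p for p in PLAYERS if p in coalition]
    outside = [p for p in PLAYERS if p not in coalition]
    best = None
    for own in product(STRATS, repeat=len(inside)):
        worst = None
        for other in product(STRATS, repeat=len(outside)):
            choice = dict(zip(inside, own))
            choice.update(zip(outside, other))
            value = objective(tuple(choice[p] for p in PLAYERS))
            worst = value if worst is None else min(worst, value)
        best = worst if best is None else max(best, worst)
    return best

def stage_sum(payoffs, profile, coalition):
    return sum(payoffs[profile][PLAYERS.index(p)] for p in coalition)

def stage_value(vertex, coalition):
    """V(z, S): lower value of the one-stage game at `vertex`."""
    payoffs, _ = TREE[vertex]
    return maxmin(coalition, lambda u: stage_sum(payoffs, u, coalition))

def subgame_value(vertex, coalition):
    """Classical value of the subgame via the Bellman-type recursion."""
    payoffs, step = TREE[vertex]
    def objective(u):
        tail = 0 if step is None else subgame_value(step[u], coalition)
        return stage_sum(payoffs, u, coalition) + tail
    return maxmin(coalition, objective)

def W(coalition):
    return max(stage_value(z, coalition) for z in TREE)

def W_bar(vertex, coalition):
    return (L - DEPTH[vertex] + 1) * W(coalition)

for z in TREE:
    for S in ((1,), (2,), (1, 2)):
        print(z, S, subgame_value(z, S), "<=", W_bar(z, S))
```

In this toy instance the bound holds with equality for the grand coalition and is strict for the single players at `z1`, which matches the direction of inequality (4).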

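Definition 1. The finite sequence of vectors $\beta = (\beta_1, \ldots, \beta_i, \ldots, \beta_n)$ is called an imputation distribution procedure (IDP) in $G(z_1)$ for an imputation $\alpha = (\alpha_1, \ldots, \alpha_n)$ if

$$\alpha_i = \sum_{k=1}^{l} \beta_{ik}, \quad i \in N,$$

where $\beta_i = (\beta_{i1}, \ldots, \beta_{ik}, \ldots, \beta_{il})$ and $k$ is the corresponding stage in the game $G(z_1)$.

In what follows, $\beta_k = (\beta_{1k}, \ldots, \beta_{nk})$ denotes the vector of payments at stage $k$. The IDP $\beta = (\beta_1, \ldots, \beta_k, \ldots, \beta_l)$ is called time-consistent [2-5] for an imputation $\bar\alpha^{z_1} \in \bar D(z_1)$ if for any $1 \leq k \leq l$

$$\sum_{m=k}^{l} \beta_m = \bar\alpha^{z_k} \in \bar D(z_k).$$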

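It is clear that if we take a sequence of imputations $\bar\alpha^{z_1}, \bar\alpha^{z_2}, \ldots, \bar\alpha^{z_k}, \ldots, \bar\alpha^{z_l}$, $\bar\alpha^{z_k} \in \bar D(z_k)$, $k = 1, \ldots, l$, and define

$$\beta_k = \bar\alpha^{z_k} - \bar\alpha^{z_{k+1}}, \quad k = 1, \ldots, l - 1, \qquad \beta_l = \bar\alpha^{z_l}, \tag{5}$$

then the IDP $\beta = (\beta_1, \ldots, \beta_k, \ldots, \beta_l)$ is a time-consistent IDP for $\bar\alpha^{z_1} \in \bar D(z_1)$. In some cases it is important that $\beta_k \geq 0$, $k = 1, \ldots, l$. In general, however, the non-negativity of the IDP cannot be guaranteed. In our case the following theorem holds.

Theorem 2. For any $\bar\alpha^{z_1} \in \bar D(z_1)$ there exists a non-negative time-consistent IDP $\beta$ ($\beta \geq 0$).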

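Proof. For each $1 \leq k \leq l$ we have

$$\bar\alpha^{z_1} = \sum_{m=1}^{k} \beta_m + \bar\alpha^{z_{k+1}}$$

(with $\bar\alpha^{z_{l+1}} = 0$), which follows from (5) and the time consistency of the IDP $\beta$ for the imputation $\bar\alpha^{z_1}$. Take $\beta_k = \bar\alpha^{z_1}/l$, $k = 1, \ldots, l$, and prove that $\beta = (\beta_1, \ldots, \beta_k, \ldots, \beta_l)$ is a time-consistent IDP for $\bar\alpha^{z_1}$. Define

$$\bar\alpha^{z_k} = \frac{l - k + 1}{l}\, \bar\alpha^{z_1}$$

and prove that $\bar\alpha^{z_k} \in \bar D(z_k)$, or

$$\sum_{i \in S} \frac{l - k + 1}{l}\, \bar\alpha_i^{z_1} \geq (l - k + 1)\, W(S). \tag{6}$$

Since $\bar\alpha^{z_1} \in \bar D(z_1)$, we have

$$\sum_{i \in S} \bar\alpha_i^{z_1} \geq l\, W(S), \qquad \sum_{i \in N} \bar\alpha_i^{z_1} = l\, W(N). \tag{7}$$

Multiplying both sides of (7) by $\frac{l - k + 1}{l}$ we get (6), which means that $\bar\alpha^{z_k} \in \bar D(z_k)$, $k = 1, \ldots, l$. Then

$$\beta_k = \bar\alpha^{z_k} - \bar\alpha^{z_{k+1}} = \frac{\bar\alpha^{z_1}}{l} \geq 0.$$

The theorem is proved.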
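The contrast between an IDP built by formula (5) from an arbitrary sequence of imputations and the uniform IDP of Theorem 2 can be seen in a small numerical sketch. Everything here is an assumption made for illustration: a two-player game with $l = 3$, hypothetical values $W(\{1\}) = W(\{2\}) = 2$, $W(N) = 6$, and helper names such as `in_D_bar` and `uniform_idp` that do not come from the paper.

```python
from fractions import Fraction

L = 3                      # number of stages (hypothetical)
W_SINGLE = {1: 2, 2: 2}    # hypothetical W({1}), W({2})
W_GRAND = 6                # hypothetical W(N) = V(N)

def in_D_bar(x, k):
    """Membership of x in the set defined through (l - k + 1) W(S)."""
    scale = L - k + 1
    return (sum(x.values()) == scale * W_GRAND
            and all(x[i] >= scale * W_SINGLE[i] for i in x))

def idp_from_sequence(imputations):
    """Formula (5): beta_k = alpha^{z_k} - alpha^{z_{k+1}}, beta_l = alpha^{z_l}."""
    beta = []
    for k, a in enumerate(imputations):
        nxt = imputations[k + 1] if k + 1 < len(imputations) else {i: 0 for i in a}
        beta.append({i: a[i] - nxt[i] for i in a})
    return beta

def uniform_idp(alpha):
    """Theorem 2: beta_k = alpha / l at every stage."""
    return [{i: Fraction(v, L) for i, v in alpha.items()} for _ in range(L)]

def tail(beta, k):
    """Sum of the stage payments beta_k, ..., beta_l (k is 1-based)."""
    return {i: sum(b[i] for b in beta[k - 1:]) for i in beta[0]}

# A sequence of imputations, one per stage, each in the corresponding set.
sequence = [{1: 7, 2: 11}, {1: 8, 2: 4}, {1: 3, 2: 3}]
by_formula_5 = idp_from_sequence(sequence)
uniform = uniform_idp(sequence[0])

print(by_formula_5[0])   # first-stage payment contains a negative component
print([in_D_bar(tail(uniform, k), k) for k in (1, 2, 3)])
```

Exact fractions are used so that the efficiency equality is checked without rounding. The first IDP is time-consistent by construction but pays player 1 a negative amount at stage 1, while the uniform IDP stays non-negative and keeps every tail sum in the required set.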

Definition 2. The set $\bar D(z_1)$ is called strongly time-consistent if for each imputation $\bar\alpha^{z_1} \in \bar D(z_1)$ there exists an IDP $\beta = (\beta_1, \ldots, \beta_k, \ldots, \beta_l)$ such that

$$\bar D(z_1) \supset \sum_{j=1}^{k} \beta_j \oplus \bar D(z_{k+1}), \quad k = 1, \ldots, l - 1,$$

where the operation $\oplus$ means $a \oplus B = \{a + b : b \in B\}$.

Theorem 3. Suppose that $V(z, N) = V(N)$ is the same in all stage games. Under this condition the set $\bar D(z_1)$ is time-consistent and strongly time-consistent.

The theorem also holds in the general case, without the requirement that $V(z, N)$ be the same for all stage games $\Gamma(z)$, but the proof is more difficult.

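If the condition of Theorem 3 holds, then $\bar W(z_1, N) = l\, W(N)$ in $G(z_1)$. Suppose $\xi^1 \in \bar D(z_1)$; then

$$\sum_{i \in S} \xi_i^1 \geq \bar W(z_1, S) = l\, W(S), \qquad \sum_{i \in N} \xi_i^1 = \bar W(z_1, N) = l\, W(N). \tag{8}$$

From (8) we have

$$\sum_{i \in S} \frac{\xi_i^1}{l} \geq W(S), \qquad \sum_{i \in N} \frac{\xi_i^1}{l} = \frac{\bar W(z_1, N)}{l} = W(N).$$

Define $\beta_{ik}$, $k = 1, \ldots, l$, as

$$\beta_{ik} = \frac{\xi_i^1}{l};$$

then we get that $\beta_k = (\beta_{1k}, \ldots, \beta_{nk}) \in D(z_k)$, $\beta_{ik} \geq 0$. Consider the expression

$$\sum_{j=1}^{k} \beta_j \oplus \bar D(z_{k+1})$$

and let $\xi^{k+1} \in \bar D(z_{k+1})$. Then we can construct an IDP $\beta'$ for the imputation $\xi^{k+1}$,

$$\beta'_{im} = \frac{\xi_i^{k+1}}{l - k}, \quad m = k + 1, \ldots, l,$$

and construct the vector

$$\hat\xi = \sum_{j=1}^{k} \beta_j + \sum_{m=k+1}^{l} \beta'_m = \frac{k}{l}\, \xi^1 + \xi^{k+1}.$$

We have

$$\sum_{i \in S} \hat\xi_i = \frac{k}{l} \sum_{i \in S} \xi_i^1 + \sum_{i \in S} \xi_i^{k+1} \geq \frac{k}{l}\, l\, W(S) + (l - k)\, W(S) = k\, W(S) + (l - k)\, W(S) = l\, W(S),$$

and in the same way $\sum_{i \in N} \hat\xi_i = k\, W(N) + (l - k)\, W(N) = l\, W(N)$, so we get that $\hat\xi \in \bar D(z_1)$.

Since the imputation $\xi^{k+1} \in \bar D(z_{k+1})$ was arbitrary,

$$\sum_{j=1}^{k} \beta_j \oplus \bar D(z_{k+1}) \subset \bar D(z_1),$$

which proves the strong time-consistency of $\bar D(z_1)$.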

Example. Consider a three-person, three-stage game $G(z_1)$ starting from the vertex $z_1$, at which the stage game $\Gamma(z_1)$ is played. For each player $i \in N = \{1, 2, 3\}$ the set of strategies consists of two elements, which for simplicity we denote by $\{1, 2\}$. The payoff functions of the players $i \in N$ in $\Gamma(z_1)$ are defined as shown in Fig. 1.

Figure 1. Game $\Gamma(z_1)$. For either strategy of player 3, the payoffs are given by the matrix below, with the strategies of player 1 in rows and the strategies of player 2 in columns:

(5,5,5)    (0,10,5)
(10,0,6)   (1,1,5)

Note that if the players choose the profile (1,1,1), the payoffs are equal to (5,5,5). If at the first stage players 1 and 2 choose the profile (1,1) or (2,2) and player 3 chooses an arbitrary strategy $k = 1, 2$, the game passes to the vertex

$$z_2 = T(z_1; 1, 1, k) = T(z_1; 2, 2, k), \quad k = 1, 2,$$

and at stage $z_2$ the game $\Gamma(z_2) = \Gamma(z_1)$ is played (the game $\Gamma(z_1)$ is repeated). If players 1 and 2 choose the profile (1,2) or (2,1) and player 3 chooses strategy 1 or 2, the game passes to the vertex

$$z_2' = T(z_1; 1, 2, k) = T(z_1; 2, 1, k), \quad k = 1, 2,$$

and at stage $z_2'$ the new stage game $\Gamma(z_2')$ is played. The payoff functions of the players $i \in N$ are defined as shown in Fig. 2.

Figure 2. Game $\Gamma(z_2')$

In our example, in the games $\Gamma(z_1)$, $\Gamma(z_2)$ player 3 cannot change the payoffs of players 1 and 2. This is done for simplicity.

If at stages $z_2$, $z_2'$ players 1 and 2 choose the profile (1,1) or (2,2), the game passes to the stage $z_3$, where the stage game $\Gamma(z_3)$ is played (Fig. 3). Otherwise the stage $z_3'$ is realized, with the game $\Gamma(z_3')$ played at this stage (Fig. 4).

Figure 3. Game $\Gamma(z_3)$

Figure 4. Game $\Gamma(z_3')$

[Payoff matrices of Figs. 3 and 4; entries shown: (5,5,5), (6,0,10), (0,10,…), (1,1,5); (5,5,0), (10,0,5), (0,0,5), (1,1,10).]

Thus, four different stage games can occur in the game $G(z_1)$. The Table gives the values of the characteristic function for each of these stage games.

Table. The values of characteristic functions

Stage game         V(1)   V(2)   V(3)   V(1,2)   V(1,3)   V(2,3)   V(1,2,3)
Γ(z_2) = Γ(z_1)     1      1      5      10        6        6        16
Γ(z_2')             0      0      5      10        5        6        16
Γ(z_3)              1      0      0      10       11       10        16
Γ(z_3')             0      0      0      10        9        5        16


By definition of W we have in this case

W (1) = 1, W (2) = 1, W (3) = 5,

W({1, 2}) = 10, W({1, 3}) = 11, W({2,3}) = 10, W({1, 2,3}) = 16.
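These values are the column-wise maxima of the Table, and the same data give the multistage function $(l - k + 1)W(S)$ for $l = 3$. A minimal Python check, in which the dictionary keys are merely labels chosen for this sketch:

```python
# Rows of the Table: stage game -> values V(S) in the column order below.
COALITIONS = ["1", "2", "3", "12", "13", "23", "123"]
TABLE = {
    "z2 (= z1)": [1, 1, 5, 10, 6, 6, 16],
    "z2'":       [0, 0, 5, 10, 5, 6, 16],
    "z3":        [1, 0, 0, 10, 11, 10, 16],
    "z3'":       [0, 0, 0, 10, 9, 5, 16],
}
STAGES = 3  # number of stages l in the example

# W(S) = max over stage games of V(z, S): the maximum of each column.
W = {S: max(row[j] for row in TABLE.values()) for j, S in enumerate(COALITIONS)}

# New characteristic function of the subgame starting at stage k.
W_bar = {k: {S: (STAGES - k + 1) * w for S, w in W.items()}
         for k in range(1, STAGES + 1)}

print(W)
print(W_bar[1])
```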

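All possible trajectories in $G(z_1)$ are cooperative, since $V(N) = 16$ in each stage game $\Gamma(z)$. Thus, we have

$$\bar W(z_1, S) = 3W(S), \quad \bar W(z_2, S) = 2W(S), \quad \bar W(z_3, S) = W(S).$$

Denoting for simplicity $\bar W(z_1, S)$ by $\bar W(S)$, we get

$$\bar W(1) = 3 \cdot 1 = 3, \quad \bar W(2) = 3 \cdot 1 = 3, \quad \bar W(3) = 3 \cdot 5 = 15,$$

$$\bar W(\{1,2\}) = 3 \cdot 10 = 30, \quad \bar W(\{1,3\}) = 3 \cdot 11 = 33, \quad \bar W(\{2,3\}) = 3 \cdot 10 = 30,$$

$$\bar W(\{1,2,3\}) = 3 \cdot 16 = 48.$$

The analogue of the core $\bar D(z_1)$ in this case coincides with the set of solutions of the following system:

$$\alpha_1 \geq 3, \quad \alpha_2 \geq 3, \quad \alpha_3 \geq 15, \quad \alpha_1 + \alpha_2 \geq 30, \quad \alpha_1 + \alpha_3 \geq 33, \quad \alpha_2 + \alpha_3 \geq 30, \quad \alpha_1 + \alpha_2 + \alpha_3 = 48.$$

It is clear that the set $\bar D(z_1)$ is not empty (one can take $\alpha_1 = 15$, $\alpha_2 = 15$, $\alpha_3 = 18$ or $\alpha_1 = 16$, $\alpha_2 = 15$, $\alpha_3 = 17$). For each $\alpha \in \bar D(z_1)$ we can define the IDP $\beta_k = \alpha/3$, $k = 1, 2, 3$; it is then easily seen that $\bar D(z_1)$ is strongly time-consistent.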
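The claims of the example can be spot-checked numerically. The sketch below uses the values of $W(S)$ stated above and the two imputations mentioned in the text; the helper names and the test points (10,10,12) and (5,5,6), chosen here as elements of the sets for stages 2 and 3, are assumptions of this illustration. It checks finitely many points and is therefore not a substitute for the proof of Theorem 3.

```python
from fractions import Fraction

N = frozenset({1, 2, 3})
L = 3
W = {frozenset({1}): 1, frozenset({2}): 1, frozenset({3}): 5,
     frozenset({1, 2}): 10, frozenset({1, 3}): 11, frozenset({2, 3}): 10,
     N: 16}

def in_D_bar(x, k):
    """Membership in the set defined by (l - k + 1) W(S) for stage k."""
    scale = L - k + 1
    if sum(x.values()) != scale * W[N]:
        return False
    return all(sum(x[i] for i in S) >= scale * W[S] for S in W if S != N)

def uniform_idp(alpha):
    """beta_k = alpha / l for every stage k."""
    return [{i: Fraction(a, L) for i, a in alpha.items()} for _ in range(L)]

def tail(beta, k):
    """beta_k + ... + beta_l (k is 1-based)."""
    return {i: sum(b[i] for b in beta[k - 1:]) for i in beta[0]}

def head_plus(beta, k, xi):
    """beta_1 + ... + beta_k + xi, an element of the shifted set."""
    return {i: sum(b[i] for b in beta[:k]) + xi[i] for i in xi}

XI_2 = {1: 10, 2: 10, 3: 12}   # hypothetical point of the stage-2 set
XI_3 = {1: 5, 2: 5, 3: 6}      # hypothetical point of the stage-3 set

for alpha in ({1: 15, 2: 15, 3: 18}, {1: 16, 2: 15, 3: 17}):
    beta = uniform_idp(alpha)
    time_consistent = all(in_D_bar(tail(beta, k), k) for k in (1, 2, 3))
    strong_spot_check = (in_D_bar(head_plus(beta, 1, XI_2), 1)
                         and in_D_bar(head_plus(beta, 2, XI_3), 1))
    print(alpha, time_consistent, strong_spot_check)
```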

Conclusion. In this paper we have sought conditions under which a subset of imputations from the core of a multistage game is time-consistent and strongly time-consistent. For this purpose we introduce a new characteristic function, which dominates the values of the characteristic function in the sense of the papers [6-9]. With the help of this newly defined characteristic function we construct the core and, if it is not empty, prove its time consistency and strong time consistency. The condition is rather strong, but it holds in multistage games whose stage games do not differ much from each other and have the same maximal joint payoff.

References

1. Von Neumann J., Morgenstern O. Theory of games and economic behavior. Princeton, Princeton University Press, 1953, 666 p.

2. Basar T. Time consistency and robustness of equilibria in noncooperative dynamic games. Dynamic Policy Games in Economics. Eds by F. Van der Ploeg, A. de Zeeuw. Amsterdam, North-Holland, Elsevier Science Publ., 1989, pp. 9-54.

3. Beard R., McDonald S. Time consistent fair water sharing agreements. Ann. Intern. Soc. Dyn. Games. Boston, Birkhauser Publ., 2007, vol. 9, pp. 393-410.

4. Marin-Solano J. Time-consistent equilibria in a differential game model with time inconsistent preferences and partial cooperation. Dynamic Games in Economics. Berlin, Springer Publ., 2014, pp. 219-238.

5. Yeung D. W. K., Petrosyan L. A. Cooperative stochastic differential games. New York, Springer-Verlag Publ., 2006, 253 p.

6. Petrosyan L. A. Strongly time-consistent differential optimality principles. Vestnik of Saint Petersburg University. Series Mathematics. Mechanics. Astronomy, 1993, no. 4, pp. 35-40.

7. Petrosyan L. A., Gromova E. V. On an approach to constructing a characteristic function in cooperative differential games. Automation and Remote Control, 2017, vol. 78, no. 9, pp. 1680-1692.

8. Petrosyan L., Zaccour G. Time-consistent Shapley value allocation of pollution cost reduction. Journal of Economic Dynamics and Control, 2003, vol. 27, no. 3, pp. 381-398.

9. Petrosyan L. A. Time consistency of the optimality principles in non-zero sum differential games. Lecture Notes in Control and Information Sciences, 1991, vol. 157, pp. 299-311.

Received: August 28, 2018. Accepted: September 25, 2018.

Author's information:

Yaroslavna B. Pankratova, PhD in Physics and Mathematics; y.pankratova@spbu.ru

Leon A. Petrosyan, Dr. Sci. in Physics and Mathematics, Professor; l.petrosyan@spbu.ru


Contact information:

Pankratova Yaroslavna Borisovna, Candidate of Physical and Mathematical Sciences, Senior Lecturer; y.pankratova@spbu.ru

Petrosyan Leon Aganesovich, Doctor of Physical and Mathematical Sciences, Professor; l.petrosyan@spbu.ru
