CC BY

Contributions to Game Theory and Management, XII, 151-158

Stochastic n-person Prisoner's Dilemma: the Time-Consistency of Core and Shapley Value

Aleksandra L. Grinikh

St. Petersburg State University, 7/9 Universitetskaya nab., St. Petersburg, 199034, Russia E-mail: st062331@student.spbu.ru

Abstract. A cooperative finite-stage dynamic n-person prisoner's dilemma is considered. A time-consistent subset of the core is proposed. The Shapley value for the stochastic model of the n-person prisoner's dilemma is calculated in explicit form.

Keywords: n-person prisoner's dilemma, coalition, dynamic game, core, Shapley value, time consistency.

1. Introduction

In recent years, considerable attention has been paid to modelling human interaction with game theory, and dynamic games in particular are being studied intensively. Dynamic games can play an important role in problems of politics, the economics of monopolies and the distribution of market power, among others. One of the fundamental models of game theory is the "prisoner's dilemma". Hamburger (Hamburger, 1973) considered multi-agent behaviour effects through the "n-person prisoner's dilemma".

A large number of players makes the model richer, since even the characteristic function is less trivial than in the two-player model.

The existing literature on repeated and dynamic models of the "n-person prisoner's dilemma" is extensive and focuses mainly on theoretical analysis and on empirical results.

In this paper we use the new characteristic function introduced by Petrosyan (Petrosyan, 2019). Using this characteristic function, the time-consistent subset of the core for the dynamic n-person prisoner's dilemma is constructed.

2. The "n-person prisoner's dilemma" model description

A game $\Gamma = (N, X_1, \ldots, X_n, H_1(x_1, \ldots, x_n), \ldots, H_n(x_1, \ldots, x_n))$ is a static "n-person prisoner's dilemma" game, where $N$ is the set of players, $|N| = n$. We denote by $x_i \in \{C, D\} = X_i$ the pure strategies of each player $i \in N$, where $C$ means "to cooperate" and $D$ means "to defect". The payoff function $H_i(x_1, \ldots, x_i, \ldots, x_n)$, $\forall i \in N$, depends linearly on the number $x$ of players who have chosen the "to cooperate" strategy:

$$H_i(x_1, \ldots, x_n) = \begin{cases} C_i(x) = a_1 x + b_1, & \forall x \in (0, n], \text{ if } x_i = C, \\ D_i(x) = a_2 x + b_2, & \forall x \in [0, n), \text{ if } x_i = D. \end{cases}$$

This function meets the following requirements:

1. $D_i(x - 1) > C_i(x)$, $\forall x \in [1, n]$, i.e. the strategy "to defect" strictly dominates the strategy "to cooperate";

2. $C_i(n) > D_i(0)$, so the strategy profile $(C, \ldots, C)$ is Pareto efficient in contrast to $(D, \ldots, D)$.

3. $D_i(x) > D_i(0)$, $\forall x \in (0, n - 1]$, and $C_i(x) > C_i(1)$, $\forall x \in (1, n]$; therefore, every player receives more when $x$ players cooperate than in the absence of cooperating players.

4. $C_i(x) = C_j(x)$ and $D_i(x) = D_j(x)$ for all $i, j \in N$, which means that the players are symmetric.

Straffin (Straffin, 1993) introduces some of these principles in his book. A considerable amount of literature writes the payoff function in tabular form. The linear form adopted here, however, simplifies the calculation of the characteristic function and, in turn, the construction of the core and the computation of the Shapley value. Moreover, this form of the payoff function is useful for future research.
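The four requirements can be checked mechanically for a given set of coefficients. The sketch below is only an illustration: the class name `LinearDilemma` is hypothetical, and the coefficients $a_1 = 10$, $b_1 = 70$, $a_2 = 15$, $b_2 = 85$ are an assumption fitted to the payoffs of Table 1 in Example 1 (they are not stated explicitly in the text). Strict inequalities in requirement 3 are tested only for $x \geq 1$ and $x \geq 2$ respectively.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinearDilemma:
    """Symmetric n-person prisoner's dilemma with linear payoffs (illustrative helper)."""
    n: int
    a1: float  # slope of C(x)
    b1: float  # intercept of C(x)
    a2: float  # slope of D(x)
    b2: float  # intercept of D(x)

    def C(self, x):
        """Payoff of a cooperator when x players cooperate."""
        return self.a1 * x + self.b1

    def D(self, x):
        """Payoff of a defector when x players cooperate."""
        return self.a2 * x + self.b2

    def payoff(self, profile, i):
        """Payoff of player i in a profile such as ("C", "D", "C")."""
        x = profile.count("C")
        return self.C(x) if profile[i] == "C" else self.D(x)

    def is_dilemma(self):
        """Check requirements 1-3; requirement 4 (symmetry) holds by construction."""
        n = self.n
        dominance = all(self.D(x - 1) > self.C(x) for x in range(1, n + 1))
        pareto = self.C(n) > self.D(0)
        gain_from_cooperators = (
            all(self.D(x) > self.D(0) for x in range(1, n))
            and all(self.C(x) > self.C(1) for x in range(2, n + 1))
        )
        return dominance and pareto and gain_from_cooperators


# Coefficients assumed to reproduce Table 1 of Example 1.
game = LinearDilemma(n=3, a1=10, b1=70, a2=15, b2=85)
print(game.is_dilemma())                 # True
print(game.payoff(("C", "D", "C"), 1))   # 115: player 2 defects while two others cooperate
```

Lowering the defection slope to `a2=5` breaks the dominance requirement, so `is_dilemma()` then returns `False`.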

3. The core of the "n-person prisoner's dilemma"

Consider a dynamic game $\Gamma_f$ which is played during $K_f$ steps. This game is built from a set of $f$ static games $(\gamma_1, \ldots, \gamma_f)$, each of which can be described by the model of the "n-person prisoner's dilemma". At each stage of the game $\Gamma_f$ the games are realised with probabilities $p_1, \ldots, p_f$, $\sum_{j=1}^{f} p_j = 1$.

The payoff functions for each possible static game are:

$$H_i^{\gamma_j}(x_1, \ldots, x_i, \ldots, x_n) = \begin{cases} C_i^{\gamma_j}(x^{\gamma_j}) = a_1^{\gamma_j} x^{\gamma_j} + b_1^{\gamma_j}, & \forall x^{\gamma_j} \in (0, n], \text{ if } x_i^{\gamma_j} = C, \\ D_i^{\gamma_j}(x^{\gamma_j}) = a_2^{\gamma_j} x^{\gamma_j} + b_2^{\gamma_j}, & \forall x^{\gamma_j} \in [0, n), \text{ if } x_i^{\gamma_j} = D, \end{cases}$$

where $x^{\gamma_j}$ is the number of players who play the $C$ strategy. Let

$$V^{\gamma_j}(N) = \max_{x_1, \ldots, x_i, \ldots, x_n} \sum_{i \in N} H_i^{\gamma_j}(x_1, \ldots, x_i, \ldots, x_n)$$

be equal for all $\gamma_j$, $j \in [1, f]$.

Definition 1. The core of the game $\Gamma_f$ is the set of possible allocations $(\alpha_1, \ldots, \alpha_n)$ which satisfy the following conditions:

1. individual rationality: $\alpha_i \geq V^{\Gamma_f}(i)$, $\forall i \in N$;

2. coalitional rationality: $\sum_{i \in S} \alpha_i \geq V^{\Gamma_f}(S)$, $\forall S \subset N$;

3. efficiency: $\sum_{i \in N} \alpha_i = V^{\Gamma_f}(N)$.

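The value of the characteristic function for each individual player at each stage of $\Gamma_f$ equals

$$V^{\gamma_j}(i) = D^{\gamma_j}(0) = b_2^{\gamma_j}.$$

Suppose that $S$ is a coalition, $S \subset N$, $|S| = s$. It can guarantee the payoff

$$V^{\gamma_j}(S) = \max_{r \in [0, s]} \left( r \left( a_1^{\gamma_j} r + b_1^{\gamma_j} \right) + (s - r) \left( a_2^{\gamma_j} r + b_2^{\gamma_j} \right) \right), \quad \forall S \subset N.$$

$V(N)$ is the same in all games $(\gamma_1, \ldots, \gamma_f)$ and is equal to

$$V^{\gamma_j}(N) = \max_{s \in [0, n]} \left( s \left( a_1^{\gamma_j} s + b_1^{\gamma_j} \right) + (n - s) \left( a_2^{\gamma_j} s + b_2^{\gamma_j} \right) \right).$$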

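So, the core of the game $\Gamma_f$ is the set of allocations which meet the following conditions:

$$\sum_{i=1}^{n} \alpha_i = K_f \sum_{j=1}^{f} p_j \max_{s \in [0, n]} \left( a_1^{\gamma_j} s^2 + b_1^{\gamma_j} s + a_2^{\gamma_j} n s - a_2^{\gamma_j} s^2 + b_2^{\gamma_j} n - b_2^{\gamma_j} s \right);$$

$$\sum_{i \in S} \alpha_i \geq K_f \sum_{j=1}^{f} p_j \max_{r \in [0, s]} \left( a_1^{\gamma_j} r^2 + b_1^{\gamma_j} r + a_2^{\gamma_j} s r - a_2^{\gamma_j} r^2 + b_2^{\gamma_j} s - b_2^{\gamma_j} r \right), \quad \forall S \subset N.$$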
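The right-hand sides of the core conditions can be evaluated by direct enumeration over the number $r$ of cooperators inside a coalition. The following is a minimal sketch, not the author's procedure: the function names are hypothetical, the coefficient tuples `(a1, b1, a2, b2)` are an assumption fitted to Tables 1-3 of Example 1, and the probabilities `probs` are invented purely for illustration.

```python
def coalition_value(s, a1, b1, a2, b2):
    """V(S) for |S| = s: the best number r of cooperators inside S,
    under the assumption that every player outside S defects."""
    return max(r * (a1 * r + b1) + (s - r) * (a2 * r + b2) for r in range(s + 1))


def core_bounds(games, probs, n, stages):
    """K_f * sum_j p_j * V^{gamma_j}(S) for every coalition size s = 1..n."""
    return {
        s: stages * sum(p * coalition_value(s, *g) for g, p in zip(games, probs))
        for s in range(1, n + 1)
    }


# (a1, b1, a2, b2) per static game, assumed from Tables 1-3 of Example 1.
games = [(10, 70, 15, 85), (50, -50, 15, 75), (10, 70, 12, 86)]
probs = [0.5, 0.25, 0.25]  # hypothetical probabilities, not given in the text

bounds = core_bounds(games, probs, n=3, stages=5)
print(bounds)  # {1: 413.75, 2: 862.5, 3: 1500.0}
```

The entry for $s = n$ is the total to be shared (efficiency), while the entries for $s < n$ are lower bounds on what any coalition of that size must receive.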

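Definition 2. Define $W(S)$, $S \subset N$, as follows: $W(S) = \max_j V(\gamma_j, S)$. Denote by $D(\gamma_j)$ the set of imputations $\alpha^{\gamma_j} = \left( \alpha_1^{\gamma_j}, \ldots, \alpha_n^{\gamma_j} \right)$ in $\gamma_j$ satisfying the conditions

$$\sum_{i \in S} \alpha_i^{\gamma_j} \geq W(S), \quad S \subset N, \ S \neq N,$$

$$\sum_{i \in N} \alpha_i^{\gamma_j} = V(\gamma_j, N),$$

where $V(\gamma_j, N)$ is the maximum sum of the players' payoffs in the game $\Gamma_f$ (Petrosyan and Pankratova, 2018).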

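Definition 3. A set $D(\bar{\gamma}_1)$ is called strongly time consistent in $\Gamma_f$ if

1. $D(\bar{\gamma}_{l+1}) \neq \emptyset$, $l \in [1, K_f - 1]$;

2. there exists an imputation distribution procedure $\beta = (\beta_1, \ldots, \beta_j, \ldots, \beta_{K_f})$ such that $D(\bar{\gamma}_1) \supset \sum_{j=1}^{l} \beta_j \oplus D(\bar{\gamma}_{l+1})$ for all allocations $\alpha^{\bar{\gamma}_1} \in D(\bar{\gamma}_1)$.

Here

$$\sum_{j=1}^{l} \beta_j \oplus D(\bar{\gamma}_{l+1}) = \left\{ \sum_{j=1}^{l} \beta_j \oplus d(\bar{\gamma}_{l+1}) : d(\bar{\gamma}_{l+1}) \in D(\bar{\gamma}_{l+1}) \right\}$$

(Petrosyan and Grauer, 2004).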

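Theorem 1 (Petrosyan-Pankratova's subset of the core). In the 3-person game $\Gamma_f$, the set $D$ of allocations satisfying the inequalities

$$\alpha_i^{\Gamma_f} \geq K_f \max_{j \in [1, f]} b_2^{\gamma_j},$$

$$\alpha_i^{\Gamma_f} + \alpha_k^{\Gamma_f} \geq K_f \max \left( \max_{j \in [1, f]} \left( 4a_1^{\gamma_j} + 2b_1^{\gamma_j} \right); \ \max_{j \in [1, f]} \left( a_1^{\gamma_j} + b_1^{\gamma_j} + a_2^{\gamma_j} + b_2^{\gamma_j} \right); \ \max_{j \in [1, f]} \left( 2b_2^{\gamma_j} \right) \right),$$

$$\alpha_i^{\Gamma_f} + \alpha_k^{\Gamma_f} + \alpha_m^{\Gamma_f} = K_f \max_{j \in [1, f]} \left( 9a_1^{\gamma_j} + 3b_1^{\gamma_j} \right), \quad \forall i, k, m \in N, \ i \neq k \neq m,$$

is a strongly time-consistent subset of the core.

Proof. The value of the characteristic function for a singleton in the game $\gamma_j$, $\forall j \in [1, f]$, is

$$V^{\gamma_j}(i) = H_i^{\gamma_j}(D, D, D) = b_2^{\gamma_j}, \quad \forall i \in N.$$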

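All of the players are symmetric, so the value of the characteristic function for a two-person coalition is the maximum of the three sums $\sum_{i \in S} H_i^{\gamma_j}(D, D, D)$, $\sum_{i \in S} H_i^{\gamma_j}(C, D, D)$ and $\sum_{i \in S} H_i^{\gamma_j}(C, C, D)$, where $S = \{1, 2\}$. Therefore,

$$V^{\gamma_j}(1, 2) = V^{\gamma_j}(1, 3) = V^{\gamma_j}(2, 3) = \max \left( 4a_1^{\gamma_j} + 2b_1^{\gamma_j}; \ a_1^{\gamma_j} + b_1^{\gamma_j} + a_2^{\gamma_j} + b_2^{\gamma_j}; \ 2b_2^{\gamma_j} \right).$$

For the grand coalition, the profile $(C, C, C)$ yields more than $(D, D, D)$ by requirement 2, so

$$V^{\gamma_j}(N) = \max \left\{ 9a_1^{\gamma_j} + 3b_1^{\gamma_j}; \ 4a_1^{\gamma_j} + 2b_1^{\gamma_j} + 2a_2^{\gamma_j} + b_2^{\gamma_j}; \ a_1^{\gamma_j} + b_1^{\gamma_j} + 2a_2^{\gamma_j} + 2b_2^{\gamma_j} \right\}.$$

Assume that

$$W(S) = \max_{j \in [1, f]} V(\gamma_j, S), \quad S \subset N.$$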

Define $\bar{\gamma}_j$ as the subgame of $\Gamma_f$ with the starting point $j$, where $j \in [1, K_f]$. Thus, the subgame $\bar{\gamma}_1$ coincides with $\Gamma_f$.

Next define a new characteristic function $\overline{W}(\bar{\gamma}_j, S)$:

$$\overline{W}(\bar{\gamma}_j, S) = (K_f - j + 1) W(S),$$

where $j \in [1, K_f]$.

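Since $V^{\gamma_j}(N) = \max_{x_1, \ldots, x_i, \ldots, x_n} \sum_{i \in N} H_i^{\gamma_j}(x_1, \ldots, x_i, \ldots, x_n)$ is equal for all $\gamma_j$, $j \in [1, f]$, we obtain

$$\alpha_i^{\Gamma_f} \geq K_f \max_{j \in [1, f]} b_2^{\gamma_j},$$

$$\alpha_i^{\Gamma_f} + \alpha_k^{\Gamma_f} \geq \overline{W}(S), \quad \text{where } |S| = 2 \text{ and}$$

$$\overline{W}(S) = K_f \max \left( \max_{j \in [1, f]} \left( 4a_1^{\gamma_j} + 2b_1^{\gamma_j} \right); \ \max_{j \in [1, f]} \left( a_1^{\gamma_j} + b_1^{\gamma_j} + a_2^{\gamma_j} + b_2^{\gamma_j} \right); \ \max_{j \in [1, f]} \left( 2b_2^{\gamma_j} \right) \right),$$

$$\sum_{i \in N} \alpha_i^{\Gamma_f} = K_f \max \left\{ 9a_1^{\gamma_j} + 3b_1^{\gamma_j}; \ 4a_1^{\gamma_j} + 2b_1^{\gamma_j} + 2a_2^{\gamma_j} + b_2^{\gamma_j}; \ a_1^{\gamma_j} + b_1^{\gamma_j} + 2a_2^{\gamma_j} + 2b_2^{\gamma_j} \right\}, \quad \forall j \in [1, f].$$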

These inequalities prove that $D$ is a subset of the core.

Moreover, since $W(N)$ is equal for all $\gamma_j$ in $\Gamma_f$, $\overline{W}(N) = K_f W(N)$.

Therefore, we can define the imputation distribution procedure (IDP) as

$$\beta_i = \frac{\alpha_i^{\bar{\gamma}_1}}{K_f}.$$

Then, for all subgames of $\Gamma_f$,

$$\beta_{ij} = \frac{\alpha_i^{\bar{\gamma}_j}}{K_f - j + 1},$$

where $j \in [1, K_f]$.

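The allocation $\alpha$ can be written using the IDP (see Petrosyan, 1993), which gives us

$$\sum_{j \in [1, K_f]} \sum_{i \in S} \frac{\alpha_i^{\bar{\gamma}_j}}{K_f - j + 1} = \sum_{j=1}^{l} \sum_{i \in S} \frac{\alpha_i^{\bar{\gamma}_j}}{K_f - j + 1} + \sum_{j=l+1}^{K_f} \sum_{i \in S} \frac{\alpha_i^{\bar{\gamma}_j}}{K_f - j + 1} \geq$$

$$\geq l W(S) + (K_f - l) W(S) = \overline{W}(S).$$

Hence, we have constructed the strongly time-consistent Petrosyan-Pankratova $D$-subset of the core.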
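The role of the IDP can be illustrated numerically. The sketch below assumes the simplest case, in which the per-stage payment $\beta_i = \alpha_i / K_f$ is constant, so the amount still owed to player $i$ in the subgame starting at stage $j$ is $(K_f - j + 1) \beta_i$. The function names are hypothetical, and the values of $W$ by coalition size (86, 180, 300) are taken from Example 1 of this paper.

```python
from itertools import combinations


def idp(alpha, stages):
    """Constant per-stage payments beta_i = alpha_i / K_f (an assumed, simplest IDP)."""
    return [a / stages for a in alpha]


def stays_in_d(alpha, w_by_size, stages, tol=1e-9):
    """Check that the remaining allocation satisfies the D-subset
    conditions in every subgame j = 1..K_f."""
    beta = idp(alpha, stages)
    n = len(alpha)
    for j in range(1, stages + 1):
        remaining = stages - j + 1
        rest = [remaining * b for b in beta]  # allocation of the subgame starting at j
        # Efficiency in the subgame: the total equals (K_f - j + 1) * W(N).
        if abs(sum(rest) - remaining * w_by_size[n]) > tol:
            return False
        # Coalitional bounds: sum over S is at least (K_f - j + 1) * W(S).
        for size in range(1, n):
            for coalition in combinations(range(n), size):
                if sum(rest[i] for i in coalition) < remaining * w_by_size[size] - tol:
                    return False
    return True


W = {1: 86, 2: 180, 3: 300}  # W(S) by |S|, as in Example 1
print(idp((430, 470, 600), stages=5))         # [86.0, 94.0, 120.0]
print(stays_in_d((430, 470, 600), W, 5))      # True
print(stays_in_d((400, 500, 600), W, 5))      # False: player 1 gets less than 86 per stage
```

Under this assumption the coalition bounds scale with the number of remaining stages in exactly the same way as the remaining allocation, which is why membership at the first stage carries over to every subgame.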

Example 1 ($D$-subset of the core). Consider a dynamic 3-person prisoner's dilemma $\Gamma_f$, where $V^{\gamma_j}(N)$ are equal for all $j \in [1, f]$. Let $K_f = 5$, $f = 3$.

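Table 1. Game $\gamma_1$ (the 1st player chooses the row, the 2nd the column and the 3rd the page)

| 1st | 2nd: C (3rd: C) | 2nd: D (3rd: C) | 2nd: C (3rd: D) | 2nd: D (3rd: D) |
|---|---|---|---|---|
| C | (100, 100, 100) | (90, 115, 90) | (90, 90, 115) | (80, 100, 100) |
| D | (115, 90, 90) | (100, 100, 80) | (100, 80, 100) | (85, 85, 85) |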

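Table 2. Game $\gamma_2$ (the 1st player chooses the row, the 2nd the column and the 3rd the page)

| 1st | 2nd: C (3rd: C) | 2nd: D (3rd: C) | 2nd: C (3rd: D) | 2nd: D (3rd: D) |
|---|---|---|---|---|
| C | (100, 100, 100) | (50, 105, 50) | (50, 50, 105) | (0, 90, 90) |
| D | (105, 50, 50) | (90, 90, 0) | (90, 0, 90) | (75, 75, 75) |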

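Table 3. Game $\gamma_3$ (the 1st player chooses the row, the 2nd the column and the 3rd the page)

| 1st | 2nd: C (3rd: C) | 2nd: D (3rd: C) | 2nd: C (3rd: D) | 2nd: D (3rd: D) |
|---|---|---|---|---|
| C | (100, 100, 100) | (90, 110, 90) | (90, 90, 110) | (80, 98, 98) |
| D | (110, 90, 90) | (98, 98, 80) | (98, 80, 98) | (86, 86, 86) |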

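The values of the characteristic functions of the games $(\gamma_1, \gamma_2, \gamma_3)$ for each coalition are:

Table 4. The values of the characteristic functions of the games $\gamma_1$-$\gamma_3$

| S | {1} | {2} | {3} | {1, 2} | {1, 3} | {2, 3} | {1, 2, 3} |
|---|---|---|---|---|---|---|---|
| $V^{\gamma_1}$ | 85 | 85 | 85 | 180 | 180 | 180 | 300 |
| $V^{\gamma_2}$ | 75 | 75 | 75 | 150 | 150 | 150 | 300 |
| $V^{\gamma_3}$ | 86 | 86 | 86 | 180 | 180 | 180 | 300 |

Then we construct the $D$-subset of the core for the game $\Gamma_f$:

$$\alpha_i \geq 430, \quad \forall i \in N;$$

$$\alpha_i + \alpha_j \geq 900, \quad \forall i, j \in N, \ i \neq j;$$

$$\alpha_i + \alpha_j + \alpha_k = 1500, \quad \forall i, j, k \in N, \ i \neq j \neq k.$$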

Consequently, the $D$-subset of the core for this dynamic 3-person prisoner's dilemma contains imputations such as (430, 470, 600), (500, 500, 500), (430, 535, 535), etc.
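The numbers of Example 1 can be reproduced by brute force. The sketch below computes each coalition's guaranteed (maxmin) payoff by enumerating all strategy profiles, then derives the thresholds 430, 900 and 1500. It is an illustration only: the helper names are hypothetical, and the coefficient tuples `(a1, b1, a2, b2)` are an assumption fitted to the payoffs of Tables 1-3.

```python
from itertools import combinations, product

# (a1, b1, a2, b2) for each static game, assumed from Tables 1-3.
GAMES = {
    "gamma_1": (10, 70, 15, 85),
    "gamma_2": (50, -50, 15, 75),
    "gamma_3": (10, 70, 12, 86),
}
N_PLAYERS, K_F = 3, 5


def payoff(profile, i, a1, b1, a2, b2):
    """Payoff of player i when profile.count("C") players cooperate."""
    x = profile.count("C")
    return a1 * x + b1 if profile[i] == "C" else a2 * x + b2


def maxmin_value(coalition, coeffs, n=N_PLAYERS):
    """Best total payoff the coalition can guarantee against any play of the outsiders."""
    outsiders = [i for i in range(n) if i not in coalition]
    best = float("-inf")
    for own in product("CD", repeat=len(coalition)):
        worst = float("inf")
        for other in product("CD", repeat=len(outsiders)):
            profile = [None] * n
            for i, move in zip(coalition, own):
                profile[i] = move
            for i, move in zip(outsiders, other):
                profile[i] = move
            worst = min(worst, sum(payoff(profile, i, *coeffs) for i in coalition))
        best = max(best, worst)
    return best


def table4():
    """Characteristic function by coalition size (players are symmetric)."""
    return {name: {s: maxmin_value(tuple(range(s)), c) for s in (1, 2, 3)}
            for name, c in GAMES.items()}


def thresholds():
    """K_f * W(S) with W(S) = max_j V(gamma_j, S), by coalition size."""
    t = table4()
    return {s: K_F * max(t[g][s] for g in GAMES) for s in (1, 2, 3)}


def in_d_subset(alpha):
    """Membership test for the D-subset of Example 1."""
    w = thresholds()
    n = len(alpha)
    if sum(alpha) != w[n]:
        return False
    return all(sum(alpha[i] for i in coalition) >= w[len(coalition)]
               for size in range(1, n)
               for coalition in combinations(range(n), size))


print(table4()["gamma_2"])            # {1: 75, 2: 150, 3: 300}
print(thresholds())                   # {1: 430, 2: 900, 3: 1500}
print(in_d_subset((430, 470, 600)))   # True
```

Enumerating the outsiders' moves confirms that defection by the outsiders is the worst case for the coalition, which matches the closed-form expression for $V^{\gamma_j}(S)$.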

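4. The Shapley value of stochastic "n-person prisoner's dilemma"

Definition 4. The Shapley value for the game $\Gamma_f$ is an imputation $\left( Sh_1^{\Gamma_f}, \ldots, Sh_n^{\Gamma_f} \right)$ of the payoff $V^{\Gamma_f}(N)$ such that

$$Sh_i^{\Gamma_f} = \sum_{S \subset N, \ i \in S} \frac{(|S| - 1)! \, (n - |S|)!}{n!} \left[ V^{\Gamma_f}(S) - V^{\Gamma_f}(S \setminus \{i\}) \right]$$

(Shapley, 1953).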

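It is well known that the Shapley value satisfies:

1. efficiency:

$$\sum_{i \in N} Sh_i^{\Gamma_f}(V) = V(N);$$

2. symmetry: if players $i$ and $j$ are symmetric with respect to $V^{\Gamma_f}$, then

$$Sh_i^{\Gamma_f}\left( V^{\Gamma_f} \right) = Sh_j^{\Gamma_f}\left( V^{\Gamma_f} \right);$$

3. additivity: for two games $V^{\Gamma_f}$ and $W^{\Gamma_f}$,

$$Sh_i^{\Gamma_f}\left( V^{\Gamma_f} \right) + Sh_i^{\Gamma_f}\left( W^{\Gamma_f} \right) = Sh_i^{\Gamma_f}\left( V^{\Gamma_f} + W^{\Gamma_f} \right);$$

4. null player: if $V^{\Gamma_f}(S \cup \{i\}) - V^{\Gamma_f}(S) = 0$, $\forall S \subset N \setminus \{i\}$, then

$$Sh_i^{\Gamma_f}\left( V^{\Gamma_f} \right) = 0.$$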
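The definition and the first two axioms can be checked on a small symmetric game. The sketch below evaluates the Shapley formula directly; the function name `shapley` is hypothetical, and the characteristic function by coalition size (85, 180, 300) is taken from the row of game $\gamma_1$ in Table 4.

```python
from itertools import combinations
from math import factorial


def shapley(n, v):
    """Shapley value by the defining formula.
    v maps a frozenset of players to its worth, with v(empty set) = 0."""
    values = []
    for i in range(n):
        others = [k for k in range(n) if k != i]
        total = 0.0
        for size in range(n):
            for rest in combinations(others, size):
                coalition = frozenset(rest) | {i}
                s = len(coalition)
                weight = factorial(s - 1) * factorial(n - s) / factorial(n)
                total += weight * (v(coalition) - v(coalition - {i}))
        values.append(total)
    return values


# Worth of a coalition depends only on its size (symmetric players), as in Table 4.
WORTH_BY_SIZE = {0: 0, 1: 85, 2: 180, 3: 300}


def v(coalition):
    return WORTH_BY_SIZE[len(coalition)]


sh = shapley(3, v)
print([round(x, 6) for x in sh])   # [100.0, 100.0, 100.0]
```

Efficiency and symmetry together force the equal split $V(N)/n = 100$ here, which is the shortcut used for the stochastic game.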

Consider now the generalisation of the game $\Gamma_f$ in which $V(N)$ may differ between stages. Define the values of the characteristic functions of each game $\gamma_j$, $j \in [1, f]$, for the coalition $N$:

$$V^{\gamma_j}(N) = \max_{x_1, \ldots, x_i, \ldots, x_n} \sum_{i \in N} H_i^{\gamma_j}(x_1, \ldots, x_i, \ldots, x_n).$$

Assume that $S_j$ is a coalition, $|S_j| = s_j$, whose members select the strategy $D$. We call this coalition the deviating coalition. Then $|N \setminus S_j| = n - s_j$.

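The maximum of the characteristic function for the game $\gamma_j$ is achieved when the benefit from the deviation of each additional player

$$D_i^{\gamma_j}(x = n - (s_j + 1)) - C_i^{\gamma_j}(x = n - s_j) = \left( a_2^{\gamma_j} (n - s_j - 1) + b_2^{\gamma_j} \right) - \left( a_1^{\gamma_j} (n - s_j) + b_1^{\gamma_j} \right)$$

is less than the amount of losses of all other players, including those who have already deviated,

$$\sum_{i=1}^{n - s_j - 1} \left( C_i^{\gamma_j}(x^*) - C_i^{\gamma_j}(x^{**}) \right) + \sum_{i=1}^{s_j} \left( D_i^{\gamma_j}(x^*) - D_i^{\gamma_j}(x^{**}) \right),$$

where $x^* = n - s_j$ and $x^{**} = n - s_j - 1$.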

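Solving this condition for $s_j$ gives the maximum values of the characteristic functions for each of the $f$ possible realizations of $\gamma_j$ in $\Gamma_f$:

$$a_1^{\gamma_j} n - a_1^{\gamma_j} s_j - a_1^{\gamma_j} + a_2^{\gamma_j} s_j \geq a_2^{\gamma_j} n - a_2^{\gamma_j} s_j - a_2^{\gamma_j} + b_2^{\gamma_j} - a_1^{\gamma_j} n + a_1^{\gamma_j} s_j - b_1^{\gamma_j}.$$

Then, the number of deviating players is

$$s_j = \frac{\left( 2a_1^{\gamma_j} - a_2^{\gamma_j} \right) n + \left( a_2^{\gamma_j} - b_2^{\gamma_j} + b_1^{\gamma_j} - a_1^{\gamma_j} \right)}{2a_1^{\gamma_j} - 2a_2^{\gamma_j}}, \quad \forall \gamma_j \in \Gamma_f.$$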
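The algebra behind this expression can be spelled out. The following is a sketch under the linear payoffs defined above, with the superscript $\gamma_j$ on the coefficients dropped for brevity. Each remaining cooperator and each earlier deviator loses a constant amount when one more player deviates:

$$C_i(x^*) - C_i(x^{**}) = a_1, \qquad D_i(x^*) - D_i(x^{**}) = a_2,$$

so the total loss of the other players is $a_1 (n - s_j - 1) + a_2 s_j$. Requiring this loss to be at least the deviator's gain $a_2 (n - s_j - 1) + b_2 - a_1 (n - s_j) - b_1$ and collecting the terms in $s_j$ gives

$$(2a_1 - a_2) n + (a_2 - b_2 + b_1 - a_1) \geq (2a_1 - 2a_2) s_j,$$

and the boundary case of equality yields the stated value of $s_j$.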

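Therefore, the values of the characteristic functions at each stage of the game $\Gamma_f$ can be written as

$$V^{\gamma_j}(N) = \left( a_1^{\gamma_j} (n - s_j) + b_1^{\gamma_j} \right) (n - s_j) + \left( a_2^{\gamma_j} (n - s_j) + b_2^{\gamma_j} \right) s_j, \quad j \in [1, f],$$

where

$$s_j = \frac{\left( 2a_1^{\gamma_j} - a_2^{\gamma_j} \right) n + \left( a_2^{\gamma_j} - b_2^{\gamma_j} + b_1^{\gamma_j} - a_1^{\gamma_j} \right)}{2a_1^{\gamma_j} - 2a_2^{\gamma_j}}.$$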

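Due to the efficiency and symmetry axioms, the expected value of the characteristic function of $\Gamma_f$ for the grand coalition is shared equally among the players:

$$Sh_i\left( V^{\Gamma_f} \right) = \frac{K_f}{n} \sum_{j=1}^{f} \left( \left( a_1^{\gamma_j} - a_2^{\gamma_j} \right) s_j^2 + \left( a_2^{\gamma_j} n + b_2^{\gamma_j} - 2a_1^{\gamma_j} n - b_1^{\gamma_j} \right) s_j + a_1^{\gamma_j} n^2 + b_1^{\gamma_j} n \right) p_j$$

for $\forall i \in N$, where

$$s_j = \frac{\left( 2a_1^{\gamma_j} - a_2^{\gamma_j} \right) n + \left( a_2^{\gamma_j} - b_2^{\gamma_j} + b_1^{\gamma_j} - a_1^{\gamma_j} \right)}{2a_1^{\gamma_j} - 2a_2^{\gamma_j}}.$$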

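Furthermore, at each stage of $\Gamma_f$ the Shapley value is equal to

$$Sh_i\left( V^{\gamma_j} \right) = \frac{1}{n} \sum_{j=1}^{f} \left( \left( a_1^{\gamma_j} - a_2^{\gamma_j} \right) s_j^2 + \left( a_2^{\gamma_j} n + b_2^{\gamma_j} - 2a_1^{\gamma_j} n - b_1^{\gamma_j} \right) s_j + a_1^{\gamma_j} n^2 + b_1^{\gamma_j} n \right) p_j$$

for $\forall i \in N$, where

$$s_j = \frac{\left( 2a_1^{\gamma_j} - a_2^{\gamma_j} \right) n + \left( a_2^{\gamma_j} - b_2^{\gamma_j} + b_1^{\gamma_j} - a_1^{\gamma_j} \right)}{2a_1^{\gamma_j} - 2a_2^{\gamma_j}}.$$

It does not change during the transition from one stage of the game to the next, given that the probabilities of each of the possible games $(\gamma_1, \ldots, \gamma_f)$ remain the same at all stages. Accordingly, the Shapley value of this game is time-consistent and belongs to the $D$-subset of the core.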
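The equal-split argument can be illustrated with a short computation. The sketch below is not the closed-form expression for $s_j$: it finds the best number of deviating players by enumeration over $s = 0, \ldots, n$, which sidesteps the question of how a non-integer or out-of-range $s_j$ should be treated. The function names are hypothetical, the coefficients are an assumption fitted to Tables 1-3 of Example 1, and the probabilities are invented for illustration.

```python
def grand_coalition_value(n, a1, b1, a2, b2):
    """V(N): total payoff maximised over the number s of deviating players."""
    return max(
        (a1 * (n - s) + b1) * (n - s) + (a2 * (n - s) + b2) * s
        for s in range(n + 1)
    )


def stochastic_shapley(n, stages, games, probs):
    """Return (per-stage share, share over the whole game) for each player:
    the expected V(N) is split equally by efficiency and symmetry."""
    expected = sum(p * grand_coalition_value(n, *g) for g, p in zip(games, probs))
    per_stage = expected / n
    return per_stage, stages * per_stage


games = [(10, 70, 15, 85), (50, -50, 15, 75), (10, 70, 12, 86)]  # assumed from Tables 1-3
probs = [0.5, 0.25, 0.25]  # hypothetical probabilities

per_stage, total = stochastic_shapley(n=3, stages=5, games=games, probs=probs)
print(per_stage, total)  # 100.0 500.0
```

For the data of Example 1 the result is the imputation (500, 500, 500), which satisfies the bounds 430, 900 and 1500 of the $D$-subset.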

5. Conclusion

The payoff function is constructed for an arbitrary number of players in the n-person prisoner's dilemma. This payoff function can be used for a multistage dynamic game in which n players participate, as in the standard "prisoner's dilemma".

The purpose of this study is to define the time-consistent subset of the core. Petrosyan's characteristic function is computed, which makes it possible to find the Petrosyan-Pankratova subset of the core.

Moreover, the Shapley value of the stochastic version of the n-person prisoner's dilemma is constructed in accordance with the obtained payoff function.

References

Hamburger, H. (1973). N-person prisoner's dilemma. Journal of Mathematical Sociology, 3(1), 27-48.

Petrosyan, L. A. (1993). Differential games of pursuit (Vol. 2). World Scientific, 312.

Petrosyan, L. A. and Grauer, L.V. (2004). Multistage games. Journal of applied mathematics and mechanics, 68(4), 597-605.

Petrosyan, L. A. and Pankratova, Y. B. (2018). New characteristic function for multistage dynamic games. Vestnik of Saint Petersburg University, 14(4), 316-324.

Petrosyan, L. et al. (2019). Strong Strategic Support of Cooperation in Multistage Games. International Game Theory Review (IGTR), 21(1), 1-12.

Shapley, L. S. (1953). A value for n-person games. Contributions to the Theory of Games, 2(28), 307-317.

Straffin, P. D. (1993). Game theory and strategy. MAA, 36.
