CC BY

On the Consistency of Weak Equilibria in Multicriteria Extensive Games

Denis Kuzyutin

St. Petersburg University,
Faculty of Applied Mathematics and Control Processes,
Bibliotechnaya pl. 2, St. Petersburg, 198504, Russia

International Banking Institute,
Nevsky pr., 60, St. Petersburg, 191011, Russia

E-mail: d.kuzyutin@ya.ru

Abstract This paper considers properties of weak equilibria in multicriteria n-person extensive games. It is shown that the set of subgame perfect weak equilibria in multicriteria games with perfect information is non-empty; however, one cannot use the backwards induction procedure (in the direct way) to construct equilibria in a multicriteria extensive game. Furthermore, we prove that weak equilibria satisfy time consistency in multicriteria extensive games (with perfect or incomplete information).

Keywords: multicriteria games, extensive games, equilibria, time consistency

1. Introduction

We deal with so-called multicriteria games (or games with vector payoffs), in which every player takes several criteria into account. Shapley (1959) defined the notion of an equilibrium point for (two-person) games with vector payoffs and showed the correspondence between such equilibria and the Nash equilibria (Nash, 1950) of so-called trade-off games.

The basic results for Nash equilibria in n-person extensive games with incomplete information were elaborated in (Kuhn, 1953). The main results concerning time consistency of optimality principles in extensive games were summarized in (Petrosjan and Kuzyutin, 2008).

Some interesting properties of equilibria in different classes of multicriteria extensive games were established in (Borm et al., 1999), (Petrosjan and Puerto, 2002), (Fahretdinova, 2002), (Kuzyutin and Nikitina, 2011).

The main purpose of this paper is to extend some results concerning Nash equilibria in n-person extensive unicriterium games to weak equilibria in multicriteria extensive games (with perfect and incomplete information).

Section 2 introduces the main notation used for extensive games. Section 3 contains a brief summary of the decomposition of extensive games and strategies.

The example in section 4 shows one undesirable property of weak equilibria in multicriteria extensive games: if we have an equilibrium $\varphi^x$ in the subgame and an equilibrium $\varphi^D$ in the corresponding factor-game, the "composite behavior" $\varphi = \{(\varphi_i^D, \varphi_i^x)\}_{i=1}^{n}$ does not necessarily satisfy the equilibrium condition in the original extensive game.

The existence theorem (for weak equilibria in pure strategies in multicriteria extensive games with perfect information) is proved in section 5. A slight modification of the backwards induction procedure (which allows one to construct subgame perfect weak equilibria) is also presented in this section.

The time consistency of weak equilibria in multicriteria extensive games (with perfect or incomplete information) is proved in sections 6 and 7.

2. Multicriteria n-person extensive games with perfect information

We use the following notation (Kuhn, 1953; Petrosjan and Kuzyutin, 2008):

— $\Gamma = \{N, K, P, A, h\}$: a finite multicriteria n-person game (or game with vector payoffs) in extensive form with perfect information;

— $N = \{1, \ldots, n\}$: the set of players in $\Gamma$;

— $K$: the game tree (with initial node $x_0$), which consists of the set $Z$ of all terminal nodes (endpoints) and the set $X = K \setminus Z$ of all intermediate nodes;

— $x < y$ means that the (unique) path from $x_0$ to $y$ contains $x$, and $x \ne y$;

— $S(x)$: the set of all immediate "successors" of the node $x$; $S(x) = \emptyset$ for all $x \in Z$;

— $S^{-1}(x)$: the unique immediate "predecessor" of the node $x$, i.e. $x \in S(S^{-1}(x))$; $S^{-1}(x_0) = \emptyset$;

— $Z(x)$: the set $\{y \in Z \mid x < y\}$, i.e. the set of terminal nodes that can be reached from $x$;

— $\omega = \{x_0, x_1, x_2, \ldots, x_l\}$: a play (or trajectory) of length $l$, where

$$x_0 < x_1 < \ldots < x_l, \quad x_l \in Z, \quad x_{j-1} = S^{-1}(x_j), \quad j = 1, \ldots, l;$$

— $P_i$: the set of all nodes where player $i$ moves, with

$$\bigcup_{i \in N} P_i = K \setminus Z;$$

— $A$: the "choice partition", i.e.

$$A_j = \{x \in K \setminus Z \mid |S(x)| = j\};$$

— $h_i(z) = (h_{i|1}(z), \ldots, h_{i|r(i)}(z))$: the payoff vector of player $i$ at the terminal node $z \in Z$.

A pure strategy $\varphi_i$ of player $i$ is a function (with domain $P_i$) that determines for every node $x \in P_i$ some choice or alternative $y \in S(x)$.

Denote by $\Phi_i$ the set of all pure strategies of player $i \in N$ in $\Gamma$. A strategy profile $\varphi = (\varphi_1, \ldots, \varphi_n)$ determines a unique play $\omega = \{x_0, x_1, \ldots, x_l\}$ in $\Gamma$, where $\varphi_i(x_k) = x_{k+1}$ if $x_k \in P_i$, and $x_l \in Z$, and, correspondingly, a collection of all players' vector payoffs $\{h_i(x_l)\}_{i \in N}$.

Since there is a one-to-one mapping between the set of all plays $\omega$ and the set $Z$ of all terminal nodes, we use the following notation:

$$h_i(\omega) = h_i(x_l), \quad \text{where } \omega = \{x_0, x_1, \ldots, x_l\},\ x_l \in Z.$$

Denote by $H_i$ the $r(i)$-vector-valued payoff function that assigns to each strategy profile $\varphi = (\varphi_1, \ldots, \varphi_n)$ the corresponding vector payoff of player $i$:

$$H_i : \prod_{j=1}^{n} \Phi_j \to \mathbb{R}^{r(i)}. \tag{1}$$

Note that player $i$ in the multicriteria game $\Gamma$ tries to maximize $r(i)$ scalar criteria (i.e. all the components of his vector-valued payoff function $H_i(\varphi) = (H_{i|1}(\varphi), \ldots, H_{i|r(i)}(\varphi))$).

Denote by $MG^P(n, K, r(1), \ldots, r(n))$ the class of all finite n-person multicriteria extensive games with perfect information and vector payoffs (1).
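The notation of this section can be made concrete with a small data structure. The following is only a minimal sketch: the tree, the node names (`x0`, `y`, `z1`, ...), the payoff numbers and the helper names `play` and `H` are hypothetical and chosen for illustration. A strategy profile is stored as one dictionary from intermediate nodes to chosen alternatives, which is possible because every node belongs to exactly one player.

```python
from typing import Dict, List, Tuple

Node = str
Vector = Tuple[float, ...]

# Hypothetical 2-person tree: children[x] maps each alternative at x to a node of S(x).
children: Dict[Node, Dict[str, Node]] = {
    "x0": {"L": "y", "R": "z1"},
    "y": {"L": "z2", "R": "z3"},
}
# owner[x] = i means x belongs to P_i; here P_1 = {x0}, P_2 = {y}.
owner: Dict[Node, int] = {"x0": 1, "y": 2}
# payoff[z][i] is the vector h_i(z) at terminal node z (made-up numbers, r(1) = r(2) = 2).
payoff: Dict[Node, Dict[int, Vector]] = {
    "z1": {1: (2.0, 2.0), 2: (1.0, 0.0)},
    "z2": {1: (3.0, 3.0), 2: (0.0, 2.0)},
    "z3": {1: (0.0, 4.0), 2: (2.0, 1.0)},
}


def play(profile: Dict[Node, str], root: Node = "x0") -> List[Node]:
    """Return the play generated by a pure strategy profile, starting at root."""
    path = [root]
    while path[-1] in children:  # stop once a terminal node is reached
        node = path[-1]
        path.append(children[node][profile[node]])
    return path


def H(i: int, profile: Dict[Node, str], root: Node = "x0") -> Vector:
    """Vector payoff H_i of player i: h_i at the endpoint of the generated play."""
    return payoff[play(profile, root)[-1]][i]


profile = {"x0": "L", "y": "R"}  # merged choices of both players
omega = play(profile)
h1 = H(1, profile)
```

Storing the profile per node rather than per player keeps the sketch short; the strategy of player $i$ is recovered as the restriction of the dictionary to the nodes owned by $i$.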

3. The decomposition of extensive games and strategies

In a game $\Gamma$ with perfect information every intermediate node $x \in K \setminus Z$ generates the subgame $\Gamma^x = \{N^x, K^x, P^x, A^x, h^x\}$, whose components are just the restrictions (Kuhn, 1953; Petrosjan and Kuzyutin, 2008) of the corresponding components of the original game $\Gamma$ onto the subtree $K^x$ (the tree of the subgame $\Gamma^x$).

In particular,

$$h_i^x(y) = h_i(y) \quad \forall y \in Z(x),\ \forall i \in N. \tag{2}$$

Denote by $\Phi_i^x$ the set of all pure strategies of player $i$ in the subgame $\Gamma^x$. A strategy profile $\varphi^x \in \prod_{i=1}^{n} \Phi_i^x$ generates a unique play $\omega^x = \{x, \ldots, x_m\}$ in the subgame and, hence, the collection of players' vector payoffs:

$$H_i^x : \prod_{j=1}^{n} \Phi_j^x \to \mathbb{R}^{r(i)}, \quad i \in N. \tag{3}$$

Let $x \in K \setminus Z$, $x \ne x_0$. For every strategy profile $\varphi^x$ in the subgame $\Gamma^x$ denote by $\Gamma^D = \Gamma^D(\varphi^x)$ the so-called factor-game on the tree $K^D = \{x\} \cup (K \setminus K^x)$.

Note that $\{x\} \cup (Z \setminus Z(x))$ is the set of terminal nodes in the factor-game, and

$$h_i^D(x) = H_i^x(\varphi^x), \quad i \in N. \tag{4}$$

Denote by $\Phi_i^D$ the set of all pure strategies of player $i$ in the factor-game $\Gamma^D$. A strategy profile $\varphi^D \in \prod_{i=1}^{n} \Phi_i^D$ generates a unique play $\omega^D = \{x_0, \ldots, x_k\}$ in the factor-game $\Gamma^D$ and, hence, the collection of players' vector payoffs:

$$H_i^D : \prod_{j=1}^{n} \Phi_j^D \to \mathbb{R}^{r(i)}, \quad i \in N. \tag{5}$$

The decomposition of the original extensive game $\Gamma$ at the node $x$ into the subgame $\Gamma^x$ and the factor-game $\Gamma^D$ generates the corresponding decomposition of pure (and mixed) strategies (Kuhn, 1953; Petrosjan and Kuzyutin, 2008). The decomposition of a pure strategy $\varphi_i \in \Phi_i$ at the intermediate node $x$ into a pure strategy $\varphi_i^x \in \Phi_i^x$ in the subgame $\Gamma^x$ and a pure strategy $\varphi_i^D \in \Phi_i^D$ in the factor-game $\Gamma^D$ means that:

— $\varphi_i^x$ is the restriction of $\varphi_i$ onto the set $P_i^x$;

— $\varphi_i^D$ is the restriction of $\varphi_i$ onto the set $P_i^D$ of all nodes of player $i$ in the factor-game $\Gamma^D$.

Note that $P_i = P_i^x \cup P_i^D$, and, hence, one can compose a pure strategy $\varphi_i = (\varphi_i^D, \varphi_i^x) \in \Phi_i$ of player $i$ in the original game $\Gamma$ from his strategies $\varphi_i^x \in \Phi_i^x$ and $\varphi_i^D \in \Phi_i^D$ in the subgame $\Gamma^x$ and the factor-game $\Gamma^D$, respectively.
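The decomposition and composition of a pure strategy amount to restricting a function to two disjoint sets of nodes and gluing the restrictions back together. A minimal sketch follows, assuming a hypothetical three-level tree and hypothetical helper names (`subtree`, `decompose`, `compose`); a strategy is represented as a dictionary from the player's nodes to alternatives.

```python
from typing import Dict, Set, Tuple

Node = str
Strategy = Dict[Node, str]  # node of the player -> chosen alternative

# Hypothetical tree: x0 and x belong to player 1, y belongs to player 2.
children: Dict[Node, Dict[str, Node]] = {
    "x0": {"L": "y", "R": "z1"},
    "y": {"L": "x", "R": "z2"},
    "x": {"L": "z3", "R": "z4"},
}


def subtree(x: Node) -> Set[Node]:
    """Nodes of K^x: x together with everything reachable from it."""
    seen: Set[Node] = set()
    stack = [x]
    while stack:
        node = stack.pop()
        seen.add(node)
        stack.extend(children.get(node, {}).values())
    return seen


def decompose(strategy: Strategy, x: Node) -> Tuple[Strategy, Strategy]:
    """Split a pure strategy into its factor-game part and its subgame part at x."""
    inside = subtree(x)
    phi_x = {n: a for n, a in strategy.items() if n in inside}
    phi_D = {n: a for n, a in strategy.items() if n not in inside}
    return phi_D, phi_x


def compose(phi_D: Strategy, phi_x: Strategy) -> Strategy:
    """Glue the two parts back together; their domains are disjoint."""
    return {**phi_D, **phi_x}


phi_1 = {"x0": "R", "x": "L"}  # a pure strategy of player 1
phi_1_D, phi_1_x = decompose(phi_1, "y")
```

The node at which the game is cut is terminal in the factor-game, so a choice made at that node itself is assigned to the subgame part.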

4. Subgame perfect weak equilibrium in multicriteria extensive game

Let $x, y \in \mathbb{R}^t$, and let $y > x$ mean that $y_i > x_i$ for all $i = 1, \ldots, t$. The vector $x \in M \subset \mathbb{R}^t$ is weak Pareto efficient (or undominated) in $M$ if $\{y \in \mathbb{R}^t \mid y > x\} \cap M = \emptyset$. In this case we use the following notation: $x \in WPO(M)$.

Given a strategy profile $\varphi = (\varphi_1, \ldots, \varphi_n) = (\varphi_i, \varphi_{-i})$ in the finite n-person extensive multicriteria game with perfect information $\Gamma \in MG^P(n, K, r(1), \ldots, r(n))$, denote by

$$M_i(\Gamma, \varphi_{-i}) = \{H_i(\varphi_i, \varphi_{-i}),\ \varphi_i \in \Phi_i\} \tag{6}$$

the set of all attainable vector payoffs of player $i$ (due to an arbitrary choice of his strategy $\varphi_i \in \Phi_i$).

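Definition 1. The strategy profile $\varphi = (\varphi_1, \ldots, \varphi_n)$ is called a (weak) equilibrium (Borm et al., 1999) in the multicriteria game $\Gamma \in MG^P(n, K, r(1), \ldots, r(n))$ iff

$$H_i(\varphi_i, \varphi_{-i}) \in WPO(M_i(\Gamma, \varphi_{-i})) \quad \forall i \in N. \tag{7}$$

We let $ME(\Gamma)$ denote the set of all weak equilibria in $\Gamma$. Note that (7) is equivalent to the following condition:

$$(\varphi_1, \ldots, \varphi_n) \in ME(\Gamma) \iff \forall i \in N \ \nexists\, \psi_i \in \Phi_i : H_i(\psi_i, \varphi_{-i}) > H_i(\varphi_i, \varphi_{-i}). \tag{8}$$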

Definition 2. The strategy profile $\varphi \in ME(\Gamma)$ is called a subgame perfect weak equilibrium in $\Gamma$ iff

$$\varphi^x \in ME(\Gamma^x) \quad \forall x \in K \setminus Z. \tag{9}$$

Denote by $SPME(\Gamma)$ the set of all subgame perfect weak equilibria in $\Gamma$.

One should note that in the case $r(i) = 1$ for all $i \in N$ condition (8) coincides with the usual Nash equilibrium requirement (Nash, 1950) in a unicriterium game.
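Conditions (7) and (8) can be checked by brute force on a small tree: enumerate every pure strategy of one player, collect the attainable vector payoffs, and test whether any of them strictly dominates the current payoff in every component. The sketch below assumes a hypothetical 2-person game with made-up payoffs; the function names are illustrative only.

```python
from itertools import product
from typing import Dict, List, Tuple

Node = str
Vector = Tuple[float, ...]
Profile = Dict[Node, str]

# Hypothetical game: player 1 moves at x0, player 2 moves at y.
children: Dict[Node, Dict[str, Node]] = {
    "x0": {"L": "y", "R": "z1"},
    "y": {"L": "z2", "R": "z3"},
}
owner: Dict[Node, int] = {"x0": 1, "y": 2}
payoff: Dict[Node, Dict[int, Vector]] = {
    "z1": {1: (2.0, 2.0), 2: (1.0, 0.0)},
    "z2": {1: (3.0, 3.0), 2: (0.0, 2.0)},
    "z3": {1: (0.0, 4.0), 2: (2.0, 1.0)},
}


def strictly_dominates(y: Vector, x: Vector) -> bool:
    """y > x in the sense used above: strictly better in every component."""
    return all(a > b for a, b in zip(y, x))


def is_wpo(x: Vector, M: List[Vector]) -> bool:
    """x is weak Pareto efficient in M if no element of M strictly dominates it."""
    return not any(strictly_dominates(y, x) for y in M)


def H(i: int, profile: Profile, root: Node = "x0") -> Vector:
    """Vector payoff of player i at the endpoint of the play generated from root."""
    node = root
    while node in children:
        node = children[node][profile[node]]
    return payoff[node][i]


def attainable(i: int, profile: Profile, root: Node = "x0") -> List[Vector]:
    """The set M_i: payoffs player i can reach by changing only his own choices."""
    own = [n for n in children if owner[n] == i]
    return [
        H(i, {**profile, **dict(zip(own, choice))}, root)
        for choice in product(*(children[n] for n in own))
    ]


def is_weak_equilibrium(profile: Profile, root: Node = "x0") -> bool:
    """Condition (7) for every player."""
    return all(
        is_wpo(H(i, profile, root), attainable(i, profile, root))
        for i in set(owner.values())
    )
```

With these numbers the profile `{"x0": "L", "y": "R"}` passes the check, because the alternative payoff `(2.0, 2.0)` of player 1 does not exceed `(0.0, 4.0)` in both components, while `{"x0": "R", "y": "L"}` fails, because player 1 can move from `(2.0, 2.0)` to `(3.0, 3.0)`.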

Let us now recall an important result established in (Kuhn, 1953) (using the decomposition of extensive games and players' strategies): if we have a Nash equilibrium $\varphi^x$ in the subgame $\Gamma^x$ and a Nash equilibrium $\varphi^D$ in the corresponding factor-game $\Gamma^D(\varphi^x)$, then the "composite behavior" $\varphi = \{(\varphi_i^D, \varphi_i^x)\}_{i=1}^{n}$ forms a Nash equilibrium in the original game $\Gamma$.

This basic result is valid not only for games with perfect information (and pure strategies) but for games with incomplete information as well (when players use mixed strategies in the general case). More precisely, the following theorem holds (Kuhn, 1953; Petrosjan and Kuzyutin, 2008).

Theorem 1. Let $\Gamma$ be an n-person extensive (unicriterium) game (with perfect or incomplete information), let $x$ be some intermediate node, let $\varphi^x = (\varphi_1^x, \ldots, \varphi_n^x)$ be a Nash equilibrium (in mixed strategies in the general case) in the subgame $\Gamma^x$, and let $\varphi^D = (\varphi_1^D, \ldots, \varphi_n^D)$ be a Nash equilibrium in the factor-game $\Gamma^D(\varphi^x)$.

If the strategy $\varphi_i$ of every player $i$ admits the decomposition into $\varphi_i^x$ and $\varphi_i^D$ in the subgame $\Gamma^x$ and the factor-game $\Gamma^D$, respectively, then the strategy profile $\varphi = (\varphi_1, \ldots, \varphi_n)$ forms a Nash equilibrium in the original game $\Gamma$.

This fact, in particular, allows one to use the backwards induction procedure to construct a subgame perfect equilibrium in a unicriterium multistage game with perfect information (Petrosjan and Kuzyutin, 2008).

However, the following example shows that the same conclusion is not valid for weak equilibria in multicriteria extensive games.

Example 1. Consider the multicriteria 2-person game with perfect information $\Gamma = (2, K, r(1) = 2, r(2) = 2)$ with the game tree $K$ presented in Fig. 1.

Figure 1: 2-person multicriteria game $\Gamma$.

The players' vector payoffs are written near every endpoint; $P_1 = \{x_0, x\}$, $P_2 = \{y, \bar{y}\}$.

The players' strategies $\varphi_1^y(x) = R$ (the Right alternative at the node $x$) and $\varphi_2^y(y) = L$ form a weak equilibrium in the subgame $\Gamma^y$:

$$\varphi^y = (\varphi_1^y, \varphi_2^y) \in ME(\Gamma^y).$$

Note that in the factor-game $\Gamma^D(\varphi^y)$ the node $y$ is a terminal node with player 1's vector payoff $h_1^D(y) = H_1^y(\varphi^y)$.

It is also clear that the strategy profile $\varphi_1^D(x_0) = R$, $\varphi_2^D(\bar{y}) = L$ is a weak equilibrium in the factor-game $\Gamma^D(\varphi^y)$:

$$\varphi^D = (\varphi_1^D, \varphi_2^D) \in ME(\Gamma^D(\varphi^y)).$$

However, the "composite" strategy profile $\varphi = (\varphi_1, \varphi_2)$, where $\varphi_i = (\varphi_i^D, \varphi_i^y)$, $i = 1, 2$, does not satisfy the equilibrium requirement (8), because

$$H_1(\varphi_1, \varphi_2) \notin WPO(M_1(\Gamma, \varphi_2)).$$

Indeed, consider the first player's strategy $\psi_1(x_0) = L$, $\psi_1(x) = L$. Then

$$H_1(\psi_1, \varphi_2) > H_1(\varphi_1, \varphi_2).$$
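The effect described in Example 1 can be reproduced numerically. The sketch below does not use the payoffs of Fig. 1; it assumes a hypothetical tree of the same shape (player 1 moves at `x0` and `x`, player 2 at `y` and `yb`) with made-up payoff vectors chosen so that the subgame profile and the factor-game profile are weak equilibria while their composition is not. All function names are illustrative.

```python
from itertools import product

# Hypothetical tree and payoffs (NOT the numbers of Fig. 1).
children = {
    "x0": {"L": "y", "R": "yb"},
    "y": {"L": "x", "R": "za"},
    "yb": {"L": "zc", "R": "zd"},
    "x": {"L": "zb", "R": "ze"},
}
owner = {"x0": 1, "x": 1, "y": 2, "yb": 2}
payoff = {
    "za": {1: (0, 0), 2: (0, 0)},
    "zb": {1: (4, 4), 2: (2, 2)},
    "zc": {1: (3, 3), 2: (1, 1)},
    "zd": {1: (0, 0), 2: (0, 0)},
    "ze": {1: (1, 5), 2: (1, 1)},
}


def endpoint(tree, profile, root):
    """Terminal node reached from root when the profile is followed in the tree."""
    node = root
    while node in tree:
        node = tree[node][profile[node]]
    return node


def is_weak_equilibrium(tree, pay, profile, root):
    """Condition (8): no player has a deviation that is strictly better in all criteria."""
    for i in (1, 2):
        own = [n for n in tree if owner[n] == i]
        current = pay[endpoint(tree, profile, root)][i]
        for choice in product(*(tree[n] for n in own)):
            deviation = {**profile, **dict(zip(own, choice))}
            other = pay[endpoint(tree, deviation, root)][i]
            if all(a > b for a, b in zip(other, current)):
                return False
    return True


def subtree(x):
    """Nodes of the subtree rooted at x."""
    seen, stack = set(), [x]
    while stack:
        node = stack.pop()
        seen.add(node)
        stack.extend(children.get(node, {}).values())
    return seen


def factor_game(x, phi_x):
    """Cut the tree at x; x becomes terminal with the payoffs generated by phi_x, as in (4)."""
    inside = subtree(x)
    tree_D = {n: c for n, c in children.items() if n not in inside}
    pay_D = {z: v for z, v in payoff.items() if z not in inside}
    pay_D[x] = payoff[endpoint(children, phi_x, x)]
    return tree_D, pay_D


phi_y = {"y": "L", "x": "R"}      # profile in the subgame rooted at y
tree_D, pay_D = factor_game("y", phi_y)
phi_D = {"x0": "R", "yb": "L"}    # profile in the factor-game
composite = {**phi_D, **phi_y}

sub_ok = is_weak_equilibrium(children, payoff, phi_y, "y")
factor_ok = is_weak_equilibrium(tree_D, pay_D, phi_D, "x0")
composite_ok = is_weak_equilibrium(children, payoff, composite, "x0")
```

In this hypothetical instance player 1 receives `(1, 5)` in the subgame and `(3, 3)` in the factor-game, and neither local deviation dominates; the joint deviation to `L` at both `x0` and `x` yields `(4, 4)`, which strictly dominates `(3, 3)`. The vector order is only partial, which is why the two local comparisons do not chain as they do in the scalar case.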

5. The construction of SPME in multicriteria game

Unfortunately, one cannot use the backwards induction procedure (in the direct way) to construct subgame perfect weak equilibria in a multicriteria game $\Gamma \in MG^P(n, K, r(1), \ldots, r(n))$.

To prove that the set $ME(\Gamma)$, $\Gamma \in MG^P(n, K, r(1), \ldots, r(n))$, is non-empty, let us consider the auxiliary unicriterium game $\Gamma^T$. The only difference between the original multicriteria game $\Gamma$ and $\Gamma^T$ is that every player in $\Gamma^T$ tries to maximize only the first criterion of his original vector payoff function. Thus, the payoff function of player $i$ in $\Gamma^T$ is

$$H_i^T(\varphi) = H_{i|1}(\varphi) \quad \forall i \in N,\ \forall (\varphi_1, \ldots, \varphi_n) \in \prod_{j=1}^{n} \Phi_j. \tag{10}$$

Note that $\Gamma^T$ is a usual (unicriterium) n-person extensive game with perfect information.

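Lemma 1. Let $\bar\varphi = (\bar\varphi_1, \ldots, \bar\varphi_n)$ be a Nash equilibrium (in pure strategies) in the unicriterium extensive game $\Gamma^T$ with payoff function (10) that corresponds to the multicriteria game $\Gamma \in MG^P(n, K, r(1), \ldots, r(n))$. Then

$$\bar\varphi = (\bar\varphi_1, \ldots, \bar\varphi_n) \in ME(\Gamma).$$

Proof. By the NE requirement we have

$$H_{i|1}(\varphi_i, \bar\varphi_{-i}) \le H_{i|1}(\bar\varphi_i, \bar\varphi_{-i}) \quad \forall i \in N,\ \forall \varphi_i \in \Phi_i.$$

Hence,

$$H_{i|1}(\bar\varphi_i, \bar\varphi_{-i}) = \max_{\varphi_i \in \Phi_i} H_{i|1}(\varphi_i, \bar\varphi_{-i}).$$

Thus, there is no strategy $\varphi_i \in \Phi_i$ for which the following strict vector inequality holds (it would already fail in the first component):

$$H_i(\varphi_i, \bar\varphi_{-i}) > H_i(\bar\varphi_i, \bar\varphi_{-i}).$$

In that case the strategy profile $\bar\varphi$ satisfies the ME requirement (8). Hence, $\bar\varphi \in ME(\Gamma)$. □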

Lemma 2. If $\bar\varphi = (\bar\varphi_1, \ldots, \bar\varphi_n)$ is a subgame perfect equilibrium in $\Gamma^T$ with payoff function (10), then

$$\bar\varphi \in SPME(\Gamma).$$

Using Lemmas 1 and 2 and the fact that every finite n-person extensive game with perfect information possesses a subgame perfect equilibrium (SPE) in pure strategies, we get the following result.

Theorem 2. Every finite n-person extensive multicriteria game $\Gamma \in MG^P(n, K, r(1), \ldots, r(n))$ with perfect information possesses a subgame perfect weak equilibrium $\bar\varphi \in SPME(\Gamma)$ in pure strategies.

Corollary 1. The set $ME(\Gamma)$ of all weak equilibria (in pure strategies) in a finite n-person multicriteria extensive game $\Gamma \in MG^P(n, K, r(1), \ldots, r(n))$ with perfect information is non-empty.

To construct the set $SPME(\Gamma)$ in a finite n-person multicriteria extensive game $\Gamma$ with perfect information one can use another auxiliary unicriterium game (the so-called "trade-off unicriterium game") suggested in (Shapley, 1959). Let

$$\lambda(i) \in \Lambda_{r(i)} = \{\lambda \in \mathbb{R}^{r(i)} \mid \lambda_j \ge 0,\ \lambda_1 + \ldots + \lambda_{r(i)} = 1\}$$

denote the "trade-off vector" of player $i$, and let

$$H_i^\lambda(\varphi) = \sum_{j=1}^{r(i)} \lambda_j(i) \cdot H_{i|j}(\varphi) \tag{11}$$

denote the payoff function of player $i$ in the auxiliary unicriterium trade-off game $\Gamma^\lambda$.

Note that $\Gamma^T$ is a special case of the trade-off game $\Gamma^\lambda$, with $\lambda_1(i) = 1$, $\lambda_j(i) = 0$, $j \ne 1$.

Let $NE(\Gamma^\lambda)$ denote the set of all Nash equilibria in the trade-off game $\Gamma^\lambda$. It was proved in (Shapley, 1959) that the set $ME(\Gamma)$ of all weak equilibria in an n-person multicriteria game $\Gamma$ coincides with the set of all Nash equilibria in all auxiliary trade-off games $\Gamma^\lambda$, i.e.

$$ME(\Gamma) = \Big\{\varphi \in NE(\Gamma^\lambda) \ \Big|\ \lambda = (\lambda(1), \ldots, \lambda(n)) \in \prod_{i=1}^{n} \Lambda_{r(i)}\Big\}.$$

Using this basic result and Lemmas 1 and 2 we can propose the following technique to construct the set $ME(\Gamma)$ of weak equilibria (in pure strategies) in a finite n-person multicriteria extensive game $\Gamma \in MG^P(n, K, r(1), \ldots, r(n))$ with perfect information:

1) for every player $i \in N$ choose an arbitrary trade-off vector $\lambda(i) \in \Lambda_{r(i)}$;

2) apply the backwards induction procedure to the auxiliary unicriterium trade-off game $\Gamma^\lambda$ with payoff functions (11) to construct all subgame perfect equilibria $\varphi \in SPE(\Gamma^\lambda)$ in pure strategies. All these strategy profiles $\varphi$ are subgame perfect weak equilibria in the original multicriteria game $\Gamma \in MG^P(n, K, r(1), \ldots, r(n))$.
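The two steps above can be sketched in a few lines. This is an illustration under stated assumptions rather than a full implementation: the tree, payoffs and trade-off vectors are hypothetical, and the function returns a single subgame perfect equilibrium of the scalarized game (ties are broken in favour of the first alternative found), whereas the technique in the text speaks of constructing all of them.

```python
from typing import Dict, Tuple

Node = str
Vector = Tuple[float, ...]

# Hypothetical game: player 1 moves at x0, player 2 moves at y.
children: Dict[Node, Dict[str, Node]] = {
    "x0": {"L": "y", "R": "z1"},
    "y": {"L": "z2", "R": "z3"},
}
owner: Dict[Node, int] = {"x0": 1, "y": 2}
payoff: Dict[Node, Dict[int, Vector]] = {
    "z1": {1: (2.0, 2.0), 2: (1.0, 0.0)},
    "z2": {1: (3.0, 3.0), 2: (0.0, 2.0)},
    "z3": {1: (0.0, 4.0), 2: (2.0, 1.0)},
}


def scalarize(lam: Vector, vector: Vector) -> float:
    """Trade-off payoff (11): weighted sum of the criteria."""
    return sum(l * v for l, v in zip(lam, vector))


def backward_induction(lam: Dict[int, Vector], root: Node = "x0") -> Dict[Node, str]:
    """One subgame perfect equilibrium of the scalarized trade-off game."""
    profile: Dict[Node, str] = {}

    def solve(node: Node) -> Dict[int, Vector]:
        # Terminal node: return the vector payoffs of all players.
        if node not in children:
            return payoff[node]
        i = owner[node]
        best_action, best_outcome, best_score = None, None, float("-inf")
        for action, child in children[node].items():
            outcome = solve(child)                  # continuation payoffs after this move
            score = scalarize(lam[i], outcome[i])   # mover's scalarized payoff
            if score > best_score:
                best_action, best_outcome, best_score = action, outcome, score
        profile[node] = best_action
        return best_outcome

    solve(root)
    return profile


lam_first = {1: (1.0, 0.0), 2: (1.0, 0.0)}    # only the first criterion counts
lam_second = {1: (0.0, 1.0), 2: (0.0, 1.0)}   # only the second criterion counts
spe_first = backward_induction(lam_first)
spe_second = backward_induction(lam_second)
```

Different trade-off vectors generally select different profiles; here the two extreme weightings give `{"x0": "R", "y": "R"}` and `{"x0": "L", "y": "L"}`. A maximizer of a non-negative weighted sum cannot be strictly dominated in every component, which is the reason each output satisfies (8) in every subgame.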

6. Time consistency of weak equilibria in multicriteria extensive games with perfect information

The strategy profile $\varphi \in ME(\Gamma)$ generates a unique play (trajectory) $\omega$ on the game tree $K$ in the multicriteria extensive game $\Gamma \in MG^P(n, K, r(1), \ldots, r(n))$ with perfect information. Let $G(\varphi)$ denote the set of all subgames along the play $\omega$, i.e. $G(\varphi) = \{\Gamma^x \mid x \in \omega\}$.

Definition 3. The set $ME(\Gamma)$ (the optimality principle ME) satisfies the time consistency property (Petrosjan and Kuzyutin, 2008) if for every weak equilibrium $\varphi \in ME(\Gamma)$ and every subgame $\Gamma^x \in G(\varphi)$ the following inclusion holds: $\varphi^x \in ME(\Gamma^x)$.

Theorem 3. The set $ME(\Gamma)$ of all weak equilibria in pure strategies in an n-person multicriteria extensive game $\Gamma \in MG^P(n, K, r(1), \ldots, r(n))$ with perfect information satisfies the time consistency property.

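Proof. Let $\varphi \in ME(\Gamma)$, i.e. condition (8) holds. Suppose that $\varphi^x \notin ME(\Gamma^x)$ in some subgame $\Gamma^x \in G(\varphi)$. Then there exists a strategy $\psi_i^x \in \Phi_i^x$ of some player $i$ such that

$$H_i^x(\psi_i^x, \varphi_{-i}^x) > H_i^x(\varphi^x) = H_i(\varphi),$$

where the equality holds because $x$ lies on the play $\omega$ generated by $\varphi$.

At the same time

$$H_i^x(\psi_i^x, \varphi_{-i}^x) = H_i(\psi_i, \varphi_{-i}), \quad \text{where } \psi_i = (\varphi_i^D, \psi_i^x) \in \Phi_i.$$

Hence we have constructed a strategy $\psi_i$ of player $i \in N$ in the original game $\Gamma$ such that

$$H_i(\psi_i, \varphi_{-i}) > H_i(\varphi_i, \varphi_{-i}).$$

However, the last vector inequality contradicts (8).

Hence, the set $ME(\Gamma)$, $\Gamma \in MG^P(n, K, r(1), \ldots, r(n))$, is time consistent. □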
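Time consistency only constrains the subgames along the realized play, which is weaker than subgame perfection. The sketch below illustrates the difference on a hypothetical 2-person game with made-up payoffs: the chosen profile is a weak equilibrium and passes the check at every node of its play, yet it is not a weak equilibrium in a subgame that the play never reaches. Function names are illustrative.

```python
from itertools import product

# Hypothetical game: player 1 moves at x0, player 2 moves at y.
children = {"x0": {"L": "y", "R": "z1"}, "y": {"L": "z2", "R": "z3"}}
owner = {"x0": 1, "y": 2}
payoff = {
    "z1": {1: (2, 2), 2: (0, 0)},
    "z2": {1: (1, 1), 2: (3, 3)},
    "z3": {1: (0, 4), 2: (1, 1)},
}


def play(profile, root):
    """Nodes visited from root when every player follows the profile."""
    path = [root]
    while path[-1] in children:
        path.append(children[path[-1]][profile[path[-1]]])
    return path


def is_weak_equilibrium(profile, root):
    """Condition (8) in the subgame rooted at root.

    The full profile is passed in; choices outside the subtree never
    influence the play from root, so this equals testing the restriction.
    """
    for i in set(owner.values()):
        own = [n for n in children if owner[n] == i]
        current = payoff[play(profile, root)[-1]][i]
        for choice in product(*(children[n] for n in own)):
            deviation = {**profile, **dict(zip(own, choice))}
            other = payoff[play(deviation, root)[-1]][i]
            if all(a > b for a, b in zip(other, current)):
                return False
    return True


def time_consistent_along_play(profile, root="x0"):
    """Check the inclusion of Definition 3 at every intermediate node of the play."""
    return all(
        is_weak_equilibrium(profile, x)
        for x in play(profile, root)
        if x in children
    )


phi = {"x0": "R", "y": "R"}
in_ME = is_weak_equilibrium(phi, "x0")
consistent = time_consistent_along_play(phi)
perfect_at_y = is_weak_equilibrium(phi, "y")   # y is off the play x0 -> z1
```

Here player 2's choice `R` at `y` is strictly dominated by `L`, but since player 1 moves to `z1`, the node `y` is not on the play and Theorem 3 makes no claim about it.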

7. Weak equilibria in mixed strategies in multicriteria extensive games with incomplete information

Now let us consider the class $MG(n, K, r(1), \ldots, r(n))$ of finite n-person extensive games $\Gamma = \{N, K, P, A, U, h\}$ with incomplete information (Kuhn, 1953) and with vector payoffs. Here $U$ denotes the collection of all players' information sets. Note that a mixed strategy profile $\varphi$ in an extensive game $\Gamma = \{N, K, P, A, U, h\}$ with incomplete information generates, in the general case, a whole set $\Omega(\varphi)$ of plays (trajectories) $\omega$ on the game tree $K$; let $p(\omega, \varphi)$ denote the probability that the play $\omega$ is realized in $\Gamma$ if all players use the mixed strategies $\varphi_i$, $i \in N$.

Note that an intermediate node $x$ generates a subgame $\Gamma^x$ (the subgame on the tree $K^x$) of the game $\Gamma$ with incomplete information iff every information set in $\Gamma$ is either included in $K^x$ or does not intersect $K^x$.

The decomposition of an extensive game $\Gamma$ with incomplete information at the node $x$ into the factor-game $\Gamma^D$ and the subgame $\Gamma^x$ generates the corresponding decomposition of mixed strategies (Kuhn, 1953; Petrosjan and Kuzyutin, 2008). In that case the following proposition holds.

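Lemma 3. Every pair $\varphi_i^x$ and $\varphi_i^D$ of mixed strategies of player $i$ in $\Gamma^x$ and $\Gamma^D$ can be obtained as the result of the decomposition of some mixed strategy $\varphi_i$ in the original game $\Gamma$. Moreover, for each play $\omega$ in $\Gamma$ which contains $x$, the following condition holds:

$$p(\omega, \varphi) = p(\omega_x, \varphi^D) \cdot p(\omega^x, \varphi^x), \tag{12}$$

where $\varphi^D = (\varphi_1^D, \ldots, \varphi_n^D)$ is the strategy profile in $\Gamma^D$, $\varphi^x = (\varphi_1^x, \ldots, \varphi_n^x)$ is the strategy profile in the subgame $\Gamma^x$, $\omega = \{x_0, \ldots, x, \ldots, x_l\}$, $x_l \in Z$, is the play (trajectory) in $\Gamma$, $\omega_x = \{x_0, \ldots, x\}$ is the play in $\Gamma^D$, $\omega^x = \{x, \ldots, x_l\}$ is the play in $\Gamma^x$, and $p(\omega_x, \varphi^D) = p(x, \varphi^D)$ is the probability of reaching the node $x$ if all players use the mixed strategies $\varphi_i^D$, $i \in N$, in the factor-game $\Gamma^D$.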
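The product formula (12) can be illustrated numerically. The sketch below makes a simplifying assumption that is not part of the lemma: randomization is described by behavior strategies, i.e. an independent probability distribution over successors at every node, rather than by mixed strategies over whole pure strategies. The tree, the probabilities and the helper name `path_probability` are hypothetical.

```python
import math
from typing import Dict, List

Node = str

# Hypothetical behavior profile: behavior[node][successor] is the probability
# that the player moving at node chooses that successor.
behavior: Dict[Node, Dict[Node, float]] = {
    "x0": {"x": 0.25, "z0": 0.75},
    "x": {"z1": 0.5, "z2": 0.5},
}


def path_probability(path: List[Node], behavior: Dict[Node, Dict[Node, float]]) -> float:
    """Probability that the given sequence of nodes is realized."""
    prob = 1.0
    for node, successor in zip(path, path[1:]):
        prob *= behavior[node][successor]
    return prob


omega = ["x0", "x", "z1"]           # a play that passes through x
cut = omega.index("x")
omega_D = omega[: cut + 1]          # part of the play inside the factor-game
omega_x = omega[cut:]               # part of the play inside the subgame

lhs = path_probability(omega, behavior)
rhs = path_probability(omega_D, behavior) * path_probability(omega_x, behavior)
```

Under this assumption the identity is just the factorization of a product of edge probabilities at the cut node; the lemma states the analogous identity for mixed strategies that admit the decomposition.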

As was proved in (Fahretdinova, 2002), the set $SPME(\Gamma)$ of all subgame perfect weak equilibria (in mixed strategies) in a finite n-person extensive multicriteria game with incomplete information is non-empty.

Moreover, note that one can apply the technique for SPME construction (which we suggested in section 5) in multicriteria extensive games with incomplete information as well.

Let $\varphi \in ME(\Gamma)$, $\Gamma \in MG(n, K, r(1), \ldots, r(n))$, generate the set $\Omega(\varphi)$ of optimal plays $\omega$ on the game tree $K$, and let $G(\varphi)$ be the set of all possible subgames $\Gamma^x$ along the "optimal game evolution", i.e. $x \in \omega$, $\omega \in \Omega(\varphi)$.

Theorem 4. The set $ME(\Gamma)$ of all weak equilibria (in mixed strategies) in the game $\Gamma \in MG(n, K, r(1), \ldots, r(n))$ with incomplete information satisfies the time consistency property.

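Proof. $\varphi \in ME(\Gamma)$ iff no player $i$ has a mixed strategy $\psi_i$ such that

$$H_i(\psi_i, \varphi_{-i}) > H_i(\varphi_i, \varphi_{-i}). \tag{13}$$

Let $\Gamma^x \in G(\varphi)$, i.e. $x \in \omega_n$, $\omega_n \in \Omega(\varphi)$, $x \ne x_0$. Note that the set of all optimal trajectories $\{\omega_n\}$ generated by $\varphi$ can be divided into two subsets: $\{\eta_m\} = \{\omega \mid x \in \omega\}$ and $\{\chi_k\} = \{\omega \mid \omega \text{ does not contain } x\}$, with $\{\eta_m\} \cap \{\chi_k\} = \emptyset$.

Then

$$H_i(\varphi) = \sum_m p(\eta_m, \varphi) \cdot h_i(\eta_m) + \sum_k p(\chi_k, \varphi) \cdot h_i(\chi_k). \tag{14}$$

Let $\varphi^D = (\varphi_1^D, \ldots, \varphi_n^D)$ be the result of the decomposition of the strategy profile $\varphi$ corresponding to the factor-game $\Gamma^D = \Gamma^D(\varphi^x)$, and let

$$p(\eta_x, \varphi^D) = p(x, \varphi^D) = p(x, \varphi)$$

be the probability of reaching the node $x$ (or the probability of the play $\eta_x = \{x_0, \ldots, x\}$) in the factor-game $\Gamma^D$ when all players use the strategies $\varphi_i^D$, $i \in N$.

Suppose that the time consistency condition is violated in the subgame $\Gamma^x$, i.e. $\varphi^x \notin ME(\Gamma^x)$. Then for some player $i \in N$ there exists a strategy $\psi_i^x$ in $\Gamma^x$ such that

$$H_i^x(\psi_i^x, \varphi_{-i}^x) > H_i^x(\varphi_i^x, \varphi_{-i}^x). \tag{15}$$

Let the strategy profile $(\psi_i^x, \varphi_{-i}^x)$ generate the set of plays $\{\xi_\alpha^x\}$ in the subgame, which are realized with positive probabilities $p(\xi_\alpha^x, (\psi_i^x, \varphi_{-i}^x))$, and let $\eta_m^x$ denote the part of the play $\eta_m$ that lies in the subgame $\Gamma^x$. Then we can rewrite inequality (15) as

$$\sum_\alpha p(\xi_\alpha^x, (\psi_i^x, \varphi_{-i}^x)) \cdot h_i^x(\xi_\alpha^x) > \sum_m p(\eta_m^x, \varphi^x) \cdot h_i^x(\eta_m^x). \tag{16}$$

Taking Lemma 3 into account, the pair $\psi_i^x$ and $\varphi_i^D$ of mixed strategies of player $i$ in $\Gamma^x$ and $\Gamma^D$ can be obtained as the result of the decomposition of some strategy $\beta_i = (\varphi_i^D, \psi_i^x)$ in $\Gamma$. Moreover (recall that by (2) the payoffs $h_i^x$ and $h_i$ coincide on $Z(x)$):

$$H_i(\beta_i, \varphi_{-i}) = \sum_\alpha p(\eta_x, \varphi^D) \cdot p(\xi_\alpha^x, (\psi_i^x, \varphi_{-i}^x)) \cdot h_i(\xi_\alpha^x) + \sum_k p(\chi_k, \varphi) \cdot h_i(\chi_k). \tag{17}$$

Now let us multiply both sides of inequality (16) by the positive value $p(\eta_x, \varphi^D)$ and then add $\sum_k p(\chi_k, \varphi) \cdot h_i(\chi_k)$ to both sides of the resulting vector inequality.

Taking (17) and (14) into account, we finally obtain:

$$H_i(\beta_i, \varphi_{-i}) > H_i(\varphi_i, \varphi_{-i}).$$

This inequality contradicts (13).

Hence, the set $ME(\Gamma)$ in mixed strategies satisfies the time consistency property in n-person multicriteria extensive games with incomplete information. □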
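The last step of the proof can be written out as a single chain. The following is a sketch in the notation of (14), (16) and (17), with these symbols assumed: $\eta_x$ is the play from $x_0$ to $x$, $\eta_m^x$ and $\xi_\alpha^x$ are plays inside the subgame, $\psi_i^x$ is the deviation of player $i$ in the subgame, and $\beta_i = (\varphi_i^D, \psi_i^x)$ is the composed deviation. It also uses the factorization $p(\eta_m, \varphi) = p(\eta_x, \varphi^D)\, p(\eta_m^x, \varphi^x)$ from (12) and the identity $h_i^x = h_i$ on $Z(x)$ from (2).

```latex
\begin{aligned}
H_i(\beta_i, \varphi_{-i})
  &= p(\eta_x, \varphi^D) \sum_{\alpha} p\big(\xi_\alpha^x, (\psi_i^x, \varphi_{-i}^x)\big)\, h_i^x(\xi_\alpha^x)
     + \sum_{k} p(\chi_k, \varphi)\, h_i(\chi_k) \\
  &> p(\eta_x, \varphi^D) \sum_{m} p(\eta_m^x, \varphi^x)\, h_i^x(\eta_m^x)
     + \sum_{k} p(\chi_k, \varphi)\, h_i(\chi_k) \\
  &= \sum_{m} p(\eta_m, \varphi)\, h_i(\eta_m)
     + \sum_{k} p(\chi_k, \varphi)\, h_i(\chi_k)
   = H_i(\varphi).
\end{aligned}
```

The strict inequality in the second line is (16) multiplied by $p(\eta_x, \varphi^D)$, which is positive because $x$ lies on a play realized with positive probability; adding the same vector to both sides preserves the strict componentwise order.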

Acknowledgments. The author expresses his gratitude to L. A. Petrosjan for useful discussions on the subject.

References

Borm, P., Megen, F. and Tijs, S. (1999). A perfectness concept for multicriteria games. Mathematical Methods of Operations Research, 49, 401-412.

Fahretdinova, V. (2002). Positional games with vector payoffs (in Russian). MKO-10, 151-153.

Kuhn, H. (1953). Extensive games and the problem of information. Annals of Mathematics Studies, 28, 193-213.

Kuzyutin, D. and Nikitina, M. (2011). On the consistency of equilibria in multicriteria extensive games. The Fifth International Conference ”Game Theory and Management”, SPbGU, 144-145.

Nash, J. (1950). Equilibrium points in n-person games. Proc. Nat. Acad. Sci., USA, 36, 48-49.

Petrosjan, L. and Kuzyutin, D. (2008). Consistent solutions of positional games (in Russian). Saint Petersburg University Press.

Petrosjan, L. and Puerto, J. (2002). Folk theorems in multicriteria repeated n-person games. Sociedad de Estadística e Investigación Operativa, Top, Vol. 10, No. 2, 275-287.

Shapley, L. (1959). Equilibrium points in games with vector payoffs. Naval Research Logistics Quarterly, P. 57-76.

Zhukovskiy, V. and Salukvadze, M. (1994). The vector-valued maximin. N.Y., Academic Press.
