
Contributions to Game Theory and Management, XIV, 290-301

On the Existence and Determining Stationary Nash Equilibria for Switching Controller Stochastic Games

Dmitrii Lozovanu1 and Stefan Pickl2

1 Institute of Mathematics and Computer Science of Moldova Academy of Sciences,

Academiei 5, Chisinau, MD-2028, Moldova, E-mail: lozovanu@math.md

2 Institute for Theoretical Computer Science, Mathematics and Operations Research,

Universität der Bundeswehr München, 85577 Neubiberg-München, Germany, E-mail: stefan.pickl@unibw.de

Abstract In this paper we consider the problem of the existence and determining stationary Nash equilibria for switching controller stochastic games with discounted and average payoffs. The set of states and the set of actions in the considered games are assumed to be finite. For a switching controller stochastic game with discounted payoffs we show that all stationary equilibria can be found by using an auxiliary continuous noncooperative static game in normal form in which the payoffs are quasi-monotonic (quasi-convex and quasi-concave) with respect to the corresponding strategies of the players. Based on this we propose an approach for determining the optimal stationary strategies of the players. In the case of average payoffs for a switching controller stochastic game we also formulate an auxiliary noncooperative static game in normal form with quasi-monotonic payoffs and show that such a game possesses a Nash equilibrium if the corresponding switching controller stochastic game has a stationary Nash equilibrium.

Keywords: Stochastic game, Switching control, Stationary strategies, Stationary Nash equilibrium

1. Introduction

A switching controller stochastic game is a stochastic game in which the transition probabilities in each state are controlled by only one of the players. So, the set of states of an m-player switching controller stochastic game can be divided into m disjoint subsets, where each subset represents the set of states in which one of the players governs the transition probabilities. The problem of determining stationary Nash equilibria in switching controller stochastic games has recently been studied by Bayraktar et al., 2016; Dubey et al., 2017; Krishnamurthy and Neogy, 2020. For a switching controller stochastic game with discounted payoffs stationary equilibria exist because stationary Nash equilibria exist for discounted stochastic games in general. Schultz, 1992 showed that the optimal stationary strategies of the players in a discounted switching controller stochastic game can be found by solving an auxiliary linear complementarity problem. In the case of switching controller stochastic games with average payoffs stationary Nash equilibria are known to exist only for some special classes of games; however, for an arbitrary such game Thuijsman and Raghavan, 1997 showed the existence of $\varepsilon$-Nash equilibria. In general, the question of the existence of stationary Nash equilibria for an average stochastic game is an open problem.

In this paper we propose some new results concerning the determination of stationary Nash equilibria in switching controller stochastic games with discounted and average payoffs. We show that all stationary equilibria for a discounted switching controller stochastic game can be obtained as Nash equilibria of an auxiliary noncooperative continuous static game in normal form where the payoffs are quasi-monotonic (quasi-convex and quasi-concave) with respect to the corresponding strategies of the players. Based on this we propose an approach for determining the optimal stationary strategies of the players for a discounted switching controller stochastic game. For an average switching controller stochastic game we also formulate an auxiliary noncooperative static game with quasi-monotonic payoffs and show that such a game has a Nash equilibrium if the corresponding average switching controller stochastic game has a stationary Nash equilibrium.

https://doi.org/10.21638/11701/spbu31.2021.21

2. Formulation of the Basic Game Models

An m-player switching controller stochastic game consists of the following elements:

- a finite set of states X;

- a finite set of actions $A^i(x)$ in each state $x \in X$ for an arbitrary player $i \in \{1,2,\dots,m\}$;

- a reward $r^i_{x,a}$ for each player $i \in \{1,2,\dots,m\}$ in each state $x \in X$ for an arbitrary action vector $a = (a^1, a^2, \dots, a^m)$, where $a^i \in A^i(x)$, $i = 1,2,\dots,m$;

- a partition $X = X_1 \cup X_2 \cup \dots \cup X_m$ of the set of states $X$, where $X_i$ represents the set of controllable states of player $i \in \{1,2,\dots,m\}$;

- a transition probability function $p^i \colon X_i \times \bigcup_{x \in X_i} A^i(x) \times X \to [0,1]$ for each $i \in \{1,2,\dots,m\}$ that gives the transition probabilities $p^{i,a^i}_{x,y}$ of player $i$ from an arbitrary $x \in X_i$ and each $a^i \in A^i(x)$ to an arbitrary $y \in X$, where $\sum_{y \in X} p^{i,a^i}_{x,y} = 1$, $\forall x \in X_i$, $\forall a^i \in A^i(x)$;

- a starting state $x_0 \in X$.

The game starts in the state $x_0$ at the moment of time $t = 0$, where the players simultaneously and independently fix their actions $a^i_0 \in A^i(x_0)$, $i = 1,2,\dots,m$. After that the players receive the corresponding rewards $r^i_{x_0,a_0}$, $i = 1,2,\dots,m$ in $x_0$ for the given action vector $a_0 = (a^1_0, a^2_0, \dots, a^m_0)$. Then the game passes randomly to a state $x_1 \in X$ according to the probability distribution $\{p^{i,a^i_0}_{x_0,y}\}_{y \in X}$, where $i$ is the player for which $x_0 \in X_i$. At the moment of time $t = 1$ the players observe the state $x_1 \in X$, again simultaneously and independently select their actions $a^i_1 \in A^i(x_1)$, $i = 1,2,\dots,m$ in the state $x_1$, and receive the corresponding rewards $r^i_{x_1,a_1}$, $i = 1,2,\dots,m$ for the given action vector $a_1 = (a^1_1, a^2_1, \dots, a^m_1)$. Then the game passes randomly to a state $x_2 \in X$ according to the probability distribution $\{p^{i,a^i_1}_{x_1,y}\}_{y \in X}$, where $x_1 \in X_i$. In general, at the moment of time $t$ the players observe the state $x_t \in X$, fix their actions $a^i_t \in A^i(x_t)$, $i = 1,2,\dots,m$ in $x_t$, and receive the corresponding rewards $r^i_{x_t,a_t}$, $i = 1,2,\dots,m$ in $x_t$ for the given action vector $a_t = (a^1_t, a^2_t, \dots, a^m_t)$. Such a play of the game produces a sequence of states and actions $x_0, a_0, x_1, a_1, \dots, x_t, a_t, \dots$ that defines a stream of stage rewards

$$r^1_{x_t,a_t}, \ r^2_{x_t,a_t}, \ \dots, \ r^m_{x_t,a_t}, \quad t = 0,1,2,\dots$$
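The stage-by-stage dynamics described above can be sketched as a short simulation. All game data below (two players, two states, two actions each, the transition rows `p` and rewards `r`) are hypothetical illustration data, and the players here simply randomize uniformly over their actions:

```python
import numpy as np

rng = np.random.default_rng(2)
n_states, T = 2, 5
control = [0, 1]                          # state 0 belongs to X_1, state 1 to X_2
# p[x][a] = next-state distribution when the controller of state x plays action a
p = np.array([[[0.7, 0.3], [0.2, 0.8]],
              [[0.5, 0.5], [0.9, 0.1]]])
r = rng.random((2, n_states, 2, 2))       # r[i][x][a1][a2]: stage reward of player i

x, history = 0, []
for t in range(T):
    a = (int(rng.integers(2)), int(rng.integers(2)))   # simultaneous, independent choices
    stage_rewards = r[:, x, a[0], a[1]]                # r^1_{x_t,a_t}, r^2_{x_t,a_t}
    history.append((x, a, stage_rewards))
    # only the controlling player's action component drives the transition
    x = int(rng.choice(n_states, p=p[x][a[control[x]]]))

assert len(history) == T
```

The recorded `history` is exactly the sequence $x_0, a_0, x_1, a_1, \dots$ together with the resulting stream of stage rewards.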

The average switching controller stochastic game is the game with the payoffs of the players

$$\omega^i_{x_0} = \liminf_{t \to \infty} \mathsf{E}\left(\frac{1}{t} \sum_{\tau=0}^{t-1} r^i_{x_\tau,a_\tau}\right), \quad i = 1,2,\dots,m,$$

where $\mathsf{E}$ is the expectation operator with respect to the probability measure induced by the Markov process with the actions chosen by the players and the given starting state $x_0$. Each player in this game aims to maximize his average reward per transition.

The discounted switching controller stochastic game with a given discount factor $\gamma$ is the game with the payoffs of the players

$$\sigma^i_{x_0} = \mathsf{E}\left(\sum_{t=0}^{\infty} \gamma^t r^i_{x_t,a_t}\right), \quad i = 1,2,\dots,m.$$

Each player in this game aims to maximize his discounted sum of stage rewards.

We will study these games in the case when the players use stationary strategies for selecting the actions in the states.

3. Switching Controller Stochastic Games in Stationary Strategies

A strategy $s^i$ of player $i \in \{1,2,\dots,m\}$ in a switching controller stochastic game is a mapping that provides for every state $x_t \in X$ a probability distribution over the set of actions $A^i(x_t)$. If these probabilities take only the values 0 and 1, then $s^i$ is called a pure strategy; otherwise $s^i$ is called a mixed strategy. If these probabilities depend only on the state $x_t = x \in X$ (i.e. $s^i$ does not depend on $t$), then $s^i$ is called a stationary strategy; otherwise $s^i$ is called a non-stationary strategy.

Thus, we can identify the set of stationary strategies $S^i$ of player $i$ with the set of solutions of the following system:

$$\sum_{a^i \in A^i(x)} s^i_{x,a^i} = 1, \ \forall x \in X; \qquad s^i_{x,a^i} \ge 0, \ \forall x \in X, \ \forall a^i \in A^i(x),$$

in which the basic solutions correspond to the set of pure stationary strategies.

Let $s = (s^1, s^2, \dots, s^m) \in S = S^1 \times S^2 \times \dots \times S^m$ be a profile of stationary strategies (pure or mixed) of the players. Taking into account that the transition probabilities in each state are controlled by only one of the players, the dynamics of the game is determined by the stochastic matrix $P^s = (p^s_{x,y})$, where the elements $p^s_{x,y}$ are calculated as follows:

$$p^s_{x,y} = \sum_{a^i \in A^i(x)} s^i_{x,a^i} \, p^{i,a^i}_{x,y}, \quad \text{for } x \in X_i \text{ and } y \in X, \ i = 1,2,\dots,m. \qquad (1)$$
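Formula (1) amounts, in each state, to mixing the controlling player's transition rows with his strategy. A minimal numerical sketch, with hypothetical transition data and uniform strategies:

```python
import numpy as np

rng = np.random.default_rng(0)
n_states, n_actions = 3, 2

# Hypothetical transition rows p^{i,a}_{x,.} of the player controlling each
# state x (whichever player that is, only his data enter formula (1)).
p = {}
for x in range(n_states):
    rows = rng.random((n_actions, n_states))
    p[x] = rows / rows.sum(axis=1, keepdims=True)   # each row sums to 1

# Stationary strategy of the controlling player in each state: here uniform.
s = {x: np.full(n_actions, 1.0 / n_actions) for x in range(n_states)}

# Formula (1): p^s_{x,y} = sum_a s_{x,a} * p^{a}_{x,y}
P_s = np.array([s[x] @ p[x] for x in range(n_states)])

assert np.allclose(P_s.sum(axis=1), 1.0)   # P^s is a stochastic matrix
```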

If $Q^s = (q^s_{x,y})$ is the limiting probability matrix of $P^s$, then the average payoffs per transition $\omega^1_{x_0}(s), \omega^2_{x_0}(s), \dots, \omega^m_{x_0}(s)$ for the players are determined as follows:

$$\omega^i_{x_0}(s) = \sum_{y \in X} q^s_{x_0,y} \, r^i_{y,s}, \quad i = 1,2,\dots,m, \qquad (2)$$

where

$$r^i_{y,s} = \sum_{(a^1,a^2,\dots,a^m) \in A(y)} \ \prod_{k=1}^{m} s^k_{y,a^k} \, r^i_{y,(a^1,a^2,\dots,a^m)} \qquad (3)$$

expresses the average payoff (immediate reward) in the state $y \in X$ of player $i$ when the corresponding stationary strategies $s^1, s^2, \dots, s^m$ have been applied by players $1,2,\dots,m$ in $y$. Here $A(y) = \prod_{i=1}^{m} A^i(y)$.

The functions $\omega^1_{x_0}(s), \omega^2_{x_0}(s), \dots, \omega^m_{x_0}(s)$ on $S = S^1 \times S^2 \times \dots \times S^m$, defined according to (2), (3), determine a game in normal form that we denote by $(\{S^i\}_{i=\overline{1,m}}, \{\omega^i_{x_0}(s)\}_{i=\overline{1,m}})$. This game corresponds to the average switching controller stochastic game in stationary strategies that in extended form is determined by the tuple $(\{X_i\}_{i=\overline{1,m}}, \{A^i(x)\}_{x \in X, i=\overline{1,m}}, \{r^i_{x,a}\}_{i=\overline{1,m}}, \{p^i\}_{i=\overline{1,m}}, x_0)$.

A discounted switching controller stochastic game in stationary strategies with a given discount factor $\gamma$ is defined as follows. Let $s = (s^1, s^2, \dots, s^m)$ be a profile of stationary strategies (pure or mixed) of the players. Then the elements of the probability transition matrix $P^s = (p^s_{x,y})$ induced by $s$ can be calculated according to (1) and we can find the matrix $W^s = (w^s_{x,y})$, where $W^s = (I - \gamma P^s)^{-1}$. After that we can determine the payoffs of the players as follows:

$$\sigma^i_{x_0}(s) = \sum_{y \in X} w^s_{x_0,y} \, r^i_{y,s}, \quad i = 1,2,\dots,m,$$

where $r^i_{y,s}$ is calculated according to (3). These payoffs on $S$ define a normal form game $(\{S^i\}_{i=\overline{1,m}}, \{\sigma^i_{x_0}(s)\}_{i=\overline{1,m}})$ that corresponds to the discounted switching controller stochastic game in stationary strategies. In extended form this game is determined by the tuple $(\{X_i\}_{i=\overline{1,m}}, \{A^i(x)\}_{x \in X, i=\overline{1,m}}, \{r^i_{x,a}\}_{i=\overline{1,m}}, \{p^i\}_{i=\overline{1,m}}, \gamma, x_0)$.
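The evaluation $\sigma^i_{x_0}(s) = \sum_y w^s_{x_0,y} r^i_{y,s}$ with $W^s = (I - \gamma P^s)^{-1}$ can be checked numerically. The induced matrix `P_s` and stage rewards `r_s` below are hypothetical example data, not taken from the paper:

```python
import numpy as np

gamma = 0.9
P_s = np.array([[0.5, 0.5, 0.0],
                [0.1, 0.6, 0.3],
                [0.2, 0.2, 0.6]])      # induced transition matrix, formula (1)
r_s = np.array([1.0, 0.0, 2.0])        # expected stage rewards r_{y,s}, formula (3)

W_s = np.linalg.inv(np.eye(3) - gamma * P_s)
sigma = W_s @ r_s                       # sigma[x] = discounted payoff from state x

# Sanity check: sigma satisfies the fixed-point equation of discounted evaluation.
assert np.allclose(sigma, r_s + gamma * P_s @ sigma)
```

The final assertion verifies that the computed vector satisfies $\sigma = r_s + \gamma P^s \sigma$, which is equivalent to $\sigma = (I - \gamma P^s)^{-1} r_s$.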

In this paper we will also consider an average switching controller stochastic game in which the starting state is chosen randomly according to a given distribution $\theta = \{\theta_x\}$ on $X$. So, for such a game we will assume that the play starts in the state $x \in X$ with probability $\theta_x > 0$, where $\sum_{x \in X} \theta_x = 1$. If the players use stationary strategies, then the payoff functions

$$\omega^i_\theta(s) = \sum_{x \in X} \theta_x \, \omega^i_x(s), \quad i = 1,2,\dots,m$$

on $S$ define a game in normal form $(\{S^i\}_{i=\overline{1,m}}, \{\omega^i_\theta(s)\}_{i=\overline{1,m}})$ that in extended form is determined by the tuple $(\{X_i\}_{i=\overline{1,m}}, \{A^i(x)\}_{x \in X, i=\overline{1,m}}, \{r^i_{x,a}\}_{i=\overline{1,m}}, \{p^i\}_{i=\overline{1,m}}, \theta)$. In the case $\theta_x = 0$, $\forall x \in X \setminus \{x_0\}$, $\theta_{x_0} = 1$, the considered game becomes an average switching controller stochastic game with the fixed starting state $x_0$.

Similarly, for a discounted switching controller stochastic game we can consider the case when the starting state is chosen randomly according to a given distribution $\theta = \{\theta_x\}$ on $X$. In this case we can define the payoffs

$$\sigma^i_\theta(s) = \sum_{x \in X} \theta_x \, \sigma^i_x(s), \quad i = 1,2,\dots,m$$

on $S$ and we obtain the game in normal form $(\{S^i\}_{i=\overline{1,m}}, \{\sigma^i_\theta(s)\}_{i=\overline{1,m}})$ that is determined by the tuple $(\{X_i\}_{i=\overline{1,m}}, \{A^i(x)\}_{x \in X, i=\overline{1,m}}, \{r^i_{x,a}\}_{i=\overline{1,m}}, \{p^i\}_{i=\overline{1,m}}, \gamma, \theta)$.

Note that an average switching controller stochastic game in the case $m = 1$ becomes an average Markov decision problem that is determined by the tuple $(X, \{A(x)\}_{x \in X}, \{r_{x,a}\}, p, \theta)$, where $X = X_1$, $A(x) = A^1(x)$, $r_{x,a} = r^1_{x,a}$ and $a = a^1 \in A(x)$ for $x \in X$. Similarly, a discounted switching controller stochastic game in the case $m = 1$ becomes a discounted Markov decision problem that is determined by the tuple $(X, \{A(x)\}_{x \in X}, \{r_{x,a}\}, p, \gamma, \theta)$.

4. Preliminaries

In this section we present some properties of Markov decision problems with average and expected total discounted reward optimization criteria that we shall use in the following two sections for studying the switching controller stochastic games.

Let us consider a discounted Markov decision problem that is determined by the tuple $(X, \{A(x)\}_{x \in X}, \{r_{x,a}\}, p, \gamma, \theta)$. Lozovanu and Pickl, 2018 showed that this decision problem can be formulated and studied in terms of stationary strategies, because the expected total discounted reward $\sigma_\theta(s)$ for a given stationary strategy $s$, when the starting state $y \in X$ is chosen randomly according to the probability distribution $\theta = \{\theta_y\}_{y \in X}$, can be represented as

$$\sigma_\theta(s) = \sum_{x \in X} \sum_{a \in A(x)} r_{x,a} \, s_{x,a} \, q_x,$$

where $q_x$ for $x \in X$ are determined uniquely from the following system:

$$q_y - \gamma \sum_{x \in X} \sum_{a \in A(x)} s_{x,a} \, p^a_{x,y} \, q_x = \theta_y, \quad \forall y \in X.$$
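This representation can be sketched numerically: solve the linear system for $q$ and evaluate $\sigma_\theta(s)$, then compare with the classical evaluation $\theta^\top (I - \gamma P^s)^{-1} r_s$. The chain and rewards below are hypothetical data, already aggregated over a fixed stationary strategy $s$:

```python
import numpy as np

gamma = 0.9
P_s = np.array([[0.5, 0.5, 0.0],       # p^s_{x,y} = sum_a s_{x,a} p^a_{x,y}
                [0.1, 0.6, 0.3],
                [0.2, 0.2, 0.6]])
r_s = np.array([1.0, 0.0, 2.0])        # r_{x,s} = sum_a r_{x,a} s_{x,a}
theta = np.array([1.0, 0.0, 0.0])      # the play starts in state 0

# q_y - gamma * sum_x p^s_{x,y} q_x = theta_y  <=>  (I - gamma * P_s^T) q = theta
q = np.linalg.solve(np.eye(3) - gamma * P_s.T, theta)
sigma_theta = r_s @ q                  # sigma_theta(s) = sum_x r_{x,s} q_x

# Consistency with the classical evaluation sigma = (I - gamma * P_s)^{-1} r_s:
sigma = np.linalg.solve(np.eye(3) - gamma * P_s, r_s)
assert np.isclose(sigma_theta, theta @ sigma)
```

The check confirms the identity $\theta^\top (I - \gamma P^s)^{-1} r_s = r_s^\top q$ on which the representation rests.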

Furthermore, the following theorem has been proven.

Theorem 1. Let a discounted Markov decision problem be given and consider the function

$$\sigma_\theta(s) = \sum_{x \in X} \sum_{a \in A(x)} r_{x,a} \, s_{x,a} \, q_x,$$

where $q_x$ for $x \in X$ are determined uniquely from the system

$$q_y - \gamma \sum_{x \in X} \sum_{a \in A(x)} s_{x,a} \, p^a_{x,y} \, q_x = \theta_y, \quad \forall y \in X.$$

Then on the set $S$ of the solutions of the system

$$\sum_{a \in A(x)} s_{x,a} = 1, \ \forall x \in X; \qquad s_{x,a} \ge 0, \ \forall x \in X, \ \forall a \in A(x),$$

the function $\sigma_\theta(s)$ depends only on $s_{x,a}$ for $x \in X$, $a \in A(x)$, and $\sigma_\theta(s)$ is quasi-monotonic on $S$, i.e. $\sigma_\theta(s)$ is quasi-convex and quasi-concave on $S$ (see Boyd and Vandenberghe, 2004). Moreover, $\sigma_\theta(s)$ is continuous on $S$.

Thus, a discounted Markov decision problem in stationary strategies can be represented as a problem of maximization of the quasi-monotonic function $\sigma_\theta(s)$ on the polyhedron set $S$.

We can present some similar results for an average Markov decision problem.

Let us consider an average Markov decision problem that is determined by the tuple $(X, \{A(x)\}_{x \in X}, \{r_{x,a}\}, p, \theta)$. Lozovanu, 2018 showed that this decision problem can be formulated and studied in terms of stationary strategies. In this case the expected average reward $\omega_\theta(s)$ for a given stationary strategy $s$, when the starting state $y \in X$ is chosen randomly according to the probability distribution $\theta = \{\theta_y\}_{y \in X}$, can be represented as

$$\omega_\theta(s) = \sum_{x \in X} \sum_{a \in A(x)} r_{x,a} \, s_{x,a} \, q_x,$$

where $q_x$ for $x \in X$ are determined uniquely from the following system of linear equations:

$$q_y - \sum_{x \in X} \sum_{a \in A(x)} s_{x,a} \, p^a_{x,y} \, q_x = 0, \quad \forall y \in X;$$
$$q_y + w_y - \sum_{x \in X} \sum_{a \in A(x)} s_{x,a} \, p^a_{x,y} \, w_x = \theta_y, \quad \forall y \in X.$$
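As a numerical illustration of this linear system, the pair $(q, w)$ can be computed by stacking the two blocks of equations. The transition matrix $P^s$ below (already induced by some fixed strategy $s$) is hypothetical and unichain, so $q$ coincides with the stationary distribution of $P^s$ irrespective of $\theta$:

```python
import numpy as np

P_s = np.array([[0.5, 0.5, 0.0],
                [0.1, 0.6, 0.3],
                [0.2, 0.2, 0.6]])
theta = np.array([1.0, 0.0, 0.0])
n = 3
I = np.eye(n)

# Block system:  (I - P_s^T) q = 0  and  q + (I - P_s^T) w = theta
A = np.block([[I - P_s.T, np.zeros((n, n))],
              [I,         I - P_s.T]])
b = np.concatenate([np.zeros(n), theta])

# The system is singular in w (w is unique only up to additive constants per
# recurrent class), but q is unique; least squares picks a valid solution.
qw, *_ = np.linalg.lstsq(A, b, rcond=None)
q, w = qw[:n], qw[n:]

assert np.isclose(q.sum(), 1.0)   # summing the second block gives sum(q) = sum(theta)
assert np.allclose(q @ P_s, q)    # q is stationary for P^s
```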

For the considered problem the following theorem has been proven.


Theorem 2. Let an average Markov decision problem be given and consider the function

$$\omega_\theta(s) = \sum_{x \in X} \sum_{a \in A(x)} r_{x,a} \, s_{x,a} \, q_x,$$

where $q_x$ for $x \in X$ are determined uniquely from the system of linear equations

$$q_y - \sum_{x \in X} \sum_{a \in A(x)} s_{x,a} \, p^a_{x,y} \, q_x = 0, \quad \forall y \in X;$$
$$q_y + w_y - \sum_{x \in X} \sum_{a \in A(x)} s_{x,a} \, p^a_{x,y} \, w_x = \theta_y, \quad \forall y \in X.$$

Then on the set $S$ of the solutions of the system

$$\sum_{a \in A(x)} s_{x,a} = 1, \ \forall x \in X; \qquad s_{x,a} \ge 0, \ \forall x \in X, \ \forall a \in A(x),$$

the function $\omega_\theta(s)$ depends only on $s_{x,a}$ for $x \in X$, $a \in A(x)$, and $\omega_\theta(s)$ is quasi-monotonic on $S$ (i.e. $\omega_\theta(s)$ is quasi-convex and quasi-concave on $S$).

So, an average Markov decision problem in stationary strategies can be represented as a problem of maximization of a quasi-monotonic function $\omega_\theta(s)$ on the polyhedron set $S$. However, here the function $\omega_\theta(s)$ may be discontinuous on $S$.

The switching controller stochastic game models in stationary strategies with discounted and average payoffs that we present in the next two sections contain the game variants of the corresponding Markov decision problems in stationary strategies formulated above.

5. Determining Stationary Nash Equilibria for a Discounted Switching Controller Stochastic Game

The existence of stationary Nash equilibria for discounted stochastic games with finite state and action spaces has been proven by Fink, 1964 and Takahashi, 1964. Based on a constructive proof of this result, suitable algorithms for determining stationary Nash equilibria for a discounted stochastic game have been elaborated (see Kallenberg, 2016). Lozovanu and Pickl, 2018 proposed the following approach for determining stationary equilibria in a discounted stochastic game with finite state and action spaces.

Let $(X, \{A^i(x)\}_{i=\overline{1,m}}, \{r^i_{x,a}\}_{i=\overline{1,m}}, p, \gamma, \theta)$ be the tuple that determines a discounted stochastic game, where $X$ is the finite set of states; $A^i(x)$ is the set of actions of player $i \in \{1,2,\dots,m\}$ in the state $x \in X$; $r^i_{x,a}$ is the reward of player $i \in \{1,2,\dots,m\}$ in the state $x$ for an action vector $a = (a^1, a^2, \dots, a^m)$ in $x$; $p \colon X \times \prod_{i=1}^{m} A^i(x) \times X \to [0,1]$ is the probability function that gives the transition probabilities $p^a_{x,y}$ from an arbitrary $x \in X$ to an arbitrary $y \in X$ for each action vector $a = (a^1, a^2, \dots, a^m)$ in $x$; $\gamma$ is a given discount factor and $\theta = \{\theta_y\}_{y \in X}$ is a distribution on $X$, where $\theta_y$ expresses the probability that the game starts in the state $y$. The stationary Nash equilibria of this game represent Nash equilibria of the following noncooperative static game in normal form $(\{S^i\}_{i=\overline{1,m}}, \{\sigma^i_\theta(s)\}_{i=\overline{1,m}})$, where each set of strategies $S^i$, $i \in \{1,2,\dots,m\}$, represents the set of solutions of the system

$$\sum_{a^i \in A^i(x)} s^i_{x,a^i} = 1, \ \forall x \in X; \qquad s^i_{x,a^i} \ge 0, \ \forall x \in X, \ \forall a^i \in A^i(x), \qquad (4)$$

and each payoff $\sigma^i_\theta(s^1, s^2, \dots, s^m)$ is defined as follows:

$$\sigma^i_\theta(s^1, s^2, \dots, s^m) = \sum_{x \in X} \sum_{(a^1,a^2,\dots,a^m) \in A(x)} \prod_{k=1}^{m} s^k_{x,a^k} \, r^i_{x,(a^1,a^2,\dots,a^m)} \, q_x, \quad i = 1,2,\dots,m, \qquad (5)$$

where $q_x$, $x \in X$, are determined uniquely from the following system of equations:

$$q_y - \gamma \sum_{x \in X} \sum_{(a^1,a^2,\dots,a^m) \in A(x)} \prod_{k=1}^{m} s^k_{x,a^k} \, p^{(a^1,a^2,\dots,a^m)}_{x,y} \, q_x = \theta_y, \quad \forall y \in X, \qquad (6)$$

for an arbitrary $s = (s^1, s^2, \dots, s^m) \in S = S^1 \times S^2 \times \dots \times S^m$, where $S^i$ represents the set of solutions of system (4).

Each payoff $\sigma^i_\theta(s^1, s^2, \dots, s^m)$ in the considered auxiliary noncooperative game is continuous on $S$ and quasi-monotonic with respect to the strategy $s^i$ on $S^i$. Therefore, based on the results of Dasgupta and Maskin, 1986, the game $(\{S^i\}_{i=\overline{1,m}}, \{\sigma^i_\theta(s)\}_{i=\overline{1,m}})$ has a Nash equilibrium and this equilibrium is a stationary Nash equilibrium of the discounted stochastic game.

Taking into account that a discounted switching controller stochastic game represents a special case of a discounted stochastic game, we can specify the auxiliary game $(\{S^i\}_{i=\overline{1,m}}, \{\sigma^i_\theta(s)\}_{i=\overline{1,m}})$ for this special case. So, if in (6) we take into account that for each set $X_i$ the transition probabilities in the states $x \in X_i$ are controlled only by player $i$, then the payoff $\sigma^i_\theta(s)$ of player $i \in \{1,2,\dots,m\}$ on $S$ is defined as follows:

$$\sigma^i_\theta(s^1, s^2, \dots, s^m) = \sum_{x \in X} \sum_{(a^1,a^2,\dots,a^m) \in A(x)} \prod_{k=1}^{m} s^k_{x,a^k} \, r^i_{x,(a^1,a^2,\dots,a^m)} \, q_x, \quad i = 1,2,\dots,m, \qquad (7)$$

where $q_x$, $x \in X$, are determined uniquely from the following system of equations:

$$q_y - \gamma \sum_{k=1}^{m} \sum_{x \in X_k} \sum_{a^k \in A^k(x)} s^k_{x,a^k} \, p^{k,a^k}_{x,y} \, q_x = \theta_y, \quad \forall y \in X, \qquad (8)$$

for an arbitrary fixed $s = (s^1, s^2, \dots, s^m) \in S = S^1 \times S^2 \times \dots \times S^m$.
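An end-to-end numerical sketch of formulas (7)-(8) for a two-player discounted switching controller game follows. All game data here (four states split between the players, two actions per player, random rewards, transitions and mixed strategies) are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(1)
gamma, n = 0.8, 4
X = [0, 1, 2, 3]
control = [0, 0, 1, 1]                  # X_1 = {0, 1}, X_2 = {2, 3}

# transitions p^{k,a}_{x,.} of the player controlling each state x
p = np.zeros((n, 2, n))
for x in X:
    rows = rng.random((2, n))
    p[x] = rows / rows.sum(axis=1, keepdims=True)

r = rng.random((2, n, 2, 2))            # rewards r^i_{x,(a1,a2)} for both players

s = rng.random((2, n, 2))               # mixed stationary strategies s^i_{x,a}
s /= s.sum(axis=2, keepdims=True)

# coefficients of (8): in each state only the controller's strategy matters
P_s = np.array([s[control[x], x] @ p[x] for x in X])

# expected stage reward of each player: the inner sums of formula (7)
r_s = np.einsum('xa,xb,ixab->ix', s[0], s[1], r)

# q from (8): (I - gamma * P_s^T) q = theta
theta = np.full(n, 1.0 / n)
q = np.linalg.solve(np.eye(n) - gamma * P_s.T, theta)

payoffs = r_s @ q                       # sigma^i_theta(s), formula (7)
assert payoffs.shape == (2,)
```

Summing (8) over $y$ gives $(1-\gamma)\sum_x q_x = \sum_y \theta_y = 1$, so the computed $q$ must satisfy $\sum_x q_x = 1/(1-\gamma)$.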

Each payoff function $\sigma^i_\theta(s^1, s^2, \dots, s^m)$ on $S$ is continuous. Moreover, Lozovanu and Pickl, 2018 showed that each payoff $\sigma^i_\theta(s^1, s^2, \dots, s^m)$ is quasi-monotonic with respect to the strategy $s^i$ on $S^i$. Therefore, based on the results of Fan, 1966 and Dasgupta and Maskin, 1986, the following theorem holds.

Theorem 3. The game $(\{S^i\}_{i=\overline{1,m}}, \{\sigma^i_\theta(s)\}_{i=\overline{1,m}})$ has a Nash equilibrium $s^* = (s^{1*}, s^{2*}, \dots, s^{m*}) \in S = S^1 \times S^2 \times \dots \times S^m$ that is a stationary Nash equilibrium of the discounted switching controller stochastic game determined by the tuple $(\{X_i\}_{i=\overline{1,m}}, \{A^i(x)\}_{i=\overline{1,m}}, \{r^i_{x,a}\}_{i=\overline{1,m}}, p, \gamma, x_0)$. Moreover, this equilibrium is a stationary Nash equilibrium of the discounted switching controller stochastic game for an arbitrary starting state $x \in X$.

So, a stationary Nash equilibrium for a discounted switching controller stochastic game determined by the tuple $(\{X_i\}_{i=\overline{1,m}}, \{A^i(x)\}_{i=\overline{1,m}}, \{r^i_{x,a}\}_{i=\overline{1,m}}, p, \gamma, \theta)$ can be found by computing a Nash equilibrium of the noncooperative static game $(\{S^i\}_{i=\overline{1,m}}, \{\sigma^i_\theta(s)\}_{i=\overline{1,m}})$, where the payoffs are determined according to (7), (8).

6. On Determining Stationary Nash Equilibria for an Average Switching Controller Stochastic Game

In this section we show that if an average switching controller stochastic game has a stationary Nash equilibrium, then all such equilibria can be found from an auxiliary noncooperative static game in normal form $(\{S^i\}_{i=\overline{1,m}}, \{\omega^i_\theta(s)\}_{i=\overline{1,m}})$, where each payoff $\omega^i_\theta(s^1, s^2, \dots, s^m)$ is quasi-monotonic with respect to the corresponding strategies of the players. To prove this we shall use the conditions obtained by Lozovanu and Pickl, 2020 concerning the existence of stationary Nash equilibria for an average stochastic game with finite state and action spaces.

Let $(X, \{A^i(x)\}_{i=\overline{1,m}}, \{r^i_{x,a}\}_{i=\overline{1,m}}, p, \theta)$ be the tuple that determines an average stochastic game, where $X$ is the finite set of states; $A^i(x)$ is the set of actions of player $i \in \{1,2,\dots,m\}$ in the state $x$; $r^i_{x,a}$ is the reward of player $i \in \{1,2,\dots,m\}$ in the state $x$ for an action vector $a = (a^1, a^2, \dots, a^m)$ in $x$; $p \colon X \times \prod_{i=1}^{m} A^i(x) \times X \to [0,1]$ is the probability function that gives the transition probabilities $p^a_{x,y}$ from an arbitrary $x \in X$ to an arbitrary $y \in X$ for each action vector $a = (a^1, a^2, \dots, a^m)$ in $x$; and $\theta = \{\theta_y\}_{y \in X}$ is a distribution on $X$, where $\theta_y$ expresses the probability that the average stochastic game starts in $y$.

Lozovanu and Pickl, 2020 showed that if the average stochastic game determined by the tuple $(X, \{A^i(x)\}_{i=\overline{1,m}}, \{r^i_{x,a}\}_{i=\overline{1,m}}, p, \theta)$ has stationary Nash equilibria, then for such a game an auxiliary noncooperative game $(\{S^i\}_{i=\overline{1,m}}, \{\omega^i_\theta(s)\}_{i=\overline{1,m}})$ with quasi-monotonic payoffs $\omega^i_\theta(s)$, $i = 1,2,\dots,m$, on $S = S^1 \times S^2 \times \dots \times S^m$ can be constructed, for which the Nash equilibria represent stationary Nash equilibria of the average stochastic game. The set of strategies $S^i$ of player $i \in \{1,2,\dots,m\}$ represents the set of solutions of the system

$$\sum_{a^i \in A^i(x)} s^i_{x,a^i} = 1, \ \forall x \in X; \qquad s^i_{x,a^i} \ge 0, \ \forall x \in X, \ \forall a^i \in A^i(x), \qquad (9)$$

and each payoff $\omega^i_\theta(s^1, s^2, \dots, s^m)$ is defined as follows:

$$\omega^i_\theta(s^1, s^2, \dots, s^m) = \sum_{x \in X} \sum_{(a^1,a^2,\dots,a^m) \in A(x)} \prod_{k=1}^{m} s^k_{x,a^k} \, r^i_{x,(a^1,a^2,\dots,a^m)} \, q_x, \quad i = 1,2,\dots,m, \qquad (10)$$

where $q_x$ for $x \in X$ are determined uniquely from the following system of linear equations:

$$q_y - \sum_{x \in X} \sum_{(a^1,a^2,\dots,a^m) \in A(x)} \prod_{k=1}^{m} s^k_{x,a^k} \, p^{(a^1,a^2,\dots,a^m)}_{x,y} \, q_x = 0, \quad \forall y \in X;$$
$$q_y + w_y - \sum_{x \in X} \sum_{(a^1,a^2,\dots,a^m) \in A(x)} \prod_{k=1}^{m} s^k_{x,a^k} \, p^{(a^1,a^2,\dots,a^m)}_{x,y} \, w_x = \theta_y, \quad \forall y \in X, \qquad (11)$$

for a fixed strategy $s = (s^1, s^2, \dots, s^m) \in S$. In general, an average stochastic game may have no stationary Nash equilibria (see Flesch et al., 1997) and then the game $(\{S^i\}_{i=\overline{1,m}}, \{\omega^i_\theta(s)\}_{i=\overline{1,m}})$ has no Nash equilibrium. However, if the game $(\{S^i\}_{i=\overline{1,m}}, \{\omega^i_\theta(s)\}_{i=\overline{1,m}})$ has a Nash equilibrium $s^* = (s^{1*}, s^{2*}, \dots, s^{m*})$, then this equilibrium is a stationary Nash equilibrium of the average stochastic game for an arbitrary starting state $y \in X$. In the unichain case of the average stochastic game, when the matrix $P^s$ is unichain for an arbitrary $s \in S$, the system (11) can be replaced by the system

$$q_y - \sum_{x \in X} \sum_{(a^1,a^2,\dots,a^m) \in A(x)} \prod_{k=1}^{m} s^k_{x,a^k} \, p^{(a^1,a^2,\dots,a^m)}_{x,y} \, q_x = 0, \quad \forall y \in X; \qquad \sum_{x \in X} q_x = 1. \qquad (12)$$

In this case the game $(\{S^i\}_{i=\overline{1,m}}, \{\omega^i_\theta(s)\}_{i=\overline{1,m}})$ has a Nash equilibrium $s^* = (s^{1*}, s^{2*}, \dots, s^{m*})$ that is a stationary Nash equilibrium of the average stochastic game for an arbitrary starting state $y \in X$.

We can now specify the auxiliary game model $(\{S^i\}_{i=\overline{1,m}}, \{\omega^i_\theta(s)\}_{i=\overline{1,m}})$ for an average switching controller stochastic game. If we specify the auxiliary noncooperative game $(\{S^i\}_{i=\overline{1,m}}, \{\omega^i_\theta(s)\}_{i=\overline{1,m}})$ for an average switching controller stochastic game determined by the tuple $(\{X_i\}_{i=\overline{1,m}}, \{A^i(x)\}_{x \in X, i=\overline{1,m}}, \{r^i_{x,a}\}_{i=\overline{1,m}}, p, \theta)$, then we obtain the game where the sets of strategies $S^i$, $i = 1,2,\dots,m$, represent the corresponding sets of solutions of the systems

$$\sum_{a^i \in A^i(x)} s^i_{x,a^i} = 1, \ \forall x \in X; \qquad s^i_{x,a^i} \ge 0, \ \forall x \in X, \ \forall a^i \in A^i(x), \qquad (13)$$

and the payoffs

$$\omega^i_\theta(s^1, s^2, \dots, s^m) = \sum_{x \in X} \sum_{(a^1,a^2,\dots,a^m) \in A(x)} \prod_{k=1}^{m} s^k_{x,a^k} \, r^i_{x,(a^1,a^2,\dots,a^m)} \, q_x, \quad i = 1,2,\dots,m, \qquad (14)$$

where $q_x$ for $x \in X$ are determined uniquely from the following system of linear equations:

$$q_y - \sum_{k=1}^{m} \sum_{x \in X_k} \sum_{a^k \in A^k(x)} s^k_{x,a^k} \, p^{k,a^k}_{x,y} \, q_x = 0, \quad \forall y \in X;$$
$$q_y + w_y - \sum_{k=1}^{m} \sum_{x \in X_k} \sum_{a^k \in A^k(x)} s^k_{x,a^k} \, p^{k,a^k}_{x,y} \, w_x = \theta_y, \quad \forall y \in X, \qquad (15)$$

for a fixed $s = (s^1, s^2, \dots, s^m) \in S$.

Based on the results above we can formulate the following theorem.

Theorem 4. Let an average switching controller stochastic game be determined by the tuple $(\{X_i\}_{i=\overline{1,m}}, \{A^i(x)\}_{x \in X, i=\overline{1,m}}, \{r^i_{x,a}\}_{i=\overline{1,m}}, \{p^i\}_{i=\overline{1,m}}, \theta)$ and let $(\{S^i\}_{i=\overline{1,m}}, \{\omega^i_\theta(s)\}_{i=\overline{1,m}})$ be the auxiliary noncooperative static game with quasi-monotonic payoffs $\omega^i_\theta(s)$, $i = 1,2,\dots,m$, for the average switching controller stochastic game. Then the game $(\{S^i\}_{i=\overline{1,m}}, \{\omega^i_\theta(s)\}_{i=\overline{1,m}})$ has a Nash equilibrium if and only if the average switching controller game has a stationary Nash equilibrium.

So, if an average switching controller stochastic game has stationary Nash equilibria, then such equilibria can be found as Nash equilibria of the auxiliary noncooperative game $(\{S^i\}_{i=\overline{1,m}}, \{\omega^i_\theta(s)\}_{i=\overline{1,m}})$, where $S^i$ and $\omega^i_\theta(s)$, $i = 1,2,\dots,m$, are defined according to (13)-(15). In general, the problem of determining a Nash equilibrium for the game $(\{S^i\}_{i=\overline{1,m}}, \{\omega^i_\theta(s)\}_{i=\overline{1,m}})$ may be difficult because the payoffs in the considered auxiliary game may be discontinuous with respect to the strategies of the players. If the payoffs of the auxiliary game are continuous or graph-continuous, then according to Dasgupta and Maskin, 1986 we obtain that the average switching controller stochastic game possesses a stationary Nash equilibrium. Important classes of average switching controller stochastic games for which stationary Nash equilibria exist are the average stochastic positional games (see Lozovanu, 2018) and the single controller stochastic games (see Rosenberg et al., 2004). For unichain games the payoffs of the auxiliary game are continuous, and therefore in this case the average switching controller stochastic game possesses a stationary Nash equilibrium. In this case the system (15) can be replaced by the following system


$$q_y - \sum_{k=1}^{m} \sum_{x \in X_k} \sum_{a^k \in A^k(x)} s^k_{x,a^k} \, p^{k,a^k}_{x,y} \, q_x = 0, \quad \forall y \in X; \qquad \sum_{y \in X} q_y = 1,$$

and we obtain a simpler game model.
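In the unichain case this reduced system simply pins down the stationary vector $q$ of the induced chain. A numerical sketch with a hypothetical unichain matrix $P^s$ (already induced by some fixed strategy profile $s$):

```python
import numpy as np

P_s = np.array([[0.5, 0.5, 0.0],
                [0.1, 0.6, 0.3],
                [0.2, 0.2, 0.6]])
n = 3

# Stack the n balance equations q_y - sum_x p^s_{x,y} q_x = 0 with the
# normalization row sum_y q_y = 1 and solve by least squares (the balance
# equations alone have rank n - 1).
A = np.vstack([np.eye(n) - P_s.T, np.ones((1, n))])
b = np.concatenate([np.zeros(n), [1.0]])
q, *_ = np.linalg.lstsq(A, b, rcond=None)

assert np.isclose(q.sum(), 1.0)
assert np.allclose(q @ P_s, q)   # q is the stationary distribution of P^s
```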

7. Conclusion

A discounted switching controller stochastic game always possesses stationary Nash equilibria, and all such equilibria can be found as Nash equilibria of the auxiliary noncooperative game with quasi-monotonic payoffs from Section 5. This auxiliary noncooperative game always has a Nash equilibrium because the payoffs are quasi-monotonic with respect to the corresponding strategies of the players and continuous with respect to all strategies of the players. An auxiliary noncooperative static game with quasi-monotonic payoffs can also be considered for an average switching controller stochastic game, as shown in Section 6. However, in this case the payoffs may not be continuous with respect to all strategies of the players, and therefore determining a Nash equilibrium for the auxiliary static game may be difficult. If the payoffs in the auxiliary game are continuous or graph-continuous in the sense of Dasgupta and Maskin, 1986, then the average switching controller stochastic game has a stationary Nash equilibrium. In the unichain case an average switching controller stochastic game always has a stationary Nash equilibrium because the payoffs in the auxiliary static game are continuous with respect to all strategies of the players. In general, the existence and determination of stationary Nash equilibria for an average switching controller stochastic game is a difficult open problem (see Thuijsman and Raghavan, 1997; Flesch et al., 1997). Nevertheless, for an average switching controller stochastic game we can state that the proposed auxiliary noncooperative static game from Section 6 has a Nash equilibrium if and only if the average switching controller stochastic game has a stationary Nash equilibrium. Important classes of average switching controller stochastic games for which stationary Nash equilibria exist are the average stochastic positional games and the single controller stochastic games. So, if we determine a Nash equilibrium for the auxiliary static game, then we determine a stationary Nash equilibrium for the average switching controller stochastic game.

Acknowledgments. The authors are grateful to the anonymous referee for interesting suggestions and remarks that helped to improve the presentation of the paper.

References

Boyd, S., Vandenberghe, L. (2004). Convex Optimization. Cambridge University Press.

Bayraktar, E., Cosso, A., Pham, H. (2016). Robust feedback switching control: Dynamic programming and viscosity solutions. SIAM Journal on Control and Optimization, 54, 5, 2594-2628.

Dasgupta, P. and Maskin, E. (1986). The existence of an equilibrium in discontinuous economic games. The Review of Economic Studies, 53, 1-26.

Dubey, D., Neogy, S.K., Ghorui, D. (2017). Completely mixed strategies for generalized bimatrix and switching controller stochastic games. Dynamic Games and Applications, 7, 4, 535-554.

Fan, K. (1966). Applications of a theorem concerning sets with convex sections. Math. Ann., 163, 189-203.

Fink, A. (1964). Equilibrium in a Stochastic n-Person Game. Journal of Science of the Hiroshima University, 28, 89-93.

Flesch, J., Thuijsman, F., Vrieze, K. (1997). Cyclic Markov equilibria in stochastic games. International Journal of Game Theory, 26, 303-314.

Kallenberg, L. (2016). Markov Decision Processes. University of Leiden.

Krishnamurthy, N., Neogy, S.K. (2020). On Lemke processibility of LCP formulations for solving discounted switching control stochastic games. Annals of Operations Research, 295, 2, 633-644.

Lozovanu, D. (2018). Stationary Nash equilibria for average stochastic positional games. Chapter 9 in: Frontiers of Dynamic Games: Static and Dynamic Game Theory: Foundation and Application (L. Petrosyan et al., eds.), Springer, 139-163.

Lozovanu, D., Pickl, S. (2018). Pure Stationary Nash Equilibria for Discounted Stochastic Positional Games. Contributions to game theory and management, 12, 246-260.

Lozovanu, D., Pickl, S. (2020). On the existence of stationary Nash equilibrium in average stochastic games with finite state and action spaces. Contributions to game theory and management, 13, 304-323.

Rosenberg, D., Solan, E., Vieille, N. (2004). Stochastic games with a single controller and incomplete information. SIAM Journal on Control and Optimization, 43, 1, 86-110.

Schultz, T. (1992). Linear complementarity and discounted switching controller stochastic games. Journal of Optimization Theory and Applications, 73, 89-99.

Takahashi, M. (1964). Equilibrium Points of Stochastic Non-Cooperative n-Person Games. Journal of Science of the Hiroshima University, 28, 95-99.

Thuijsman, F., Raghavan, T. E. S. (1997). Perfect information stochastic games and related classes. International Journal of Game Theory, 26, 403-408.
