D.W.K. Yeung’s Condition for the Coalitional Solution of the Game of Pollution Cost Reduction
Anna V. Iljina1 and Nadezhda V. Kozlovskaya2
1 St.Petersburg State University,
Faculty of Applied Mathematics and Control Processes, Universitetski pr35, St. Petersburg, 198504, Russia E-mail: [email protected]
2 St. Petersburg State University,
Graduate School of Management,
Volkhovsky per 3, St. Petersburg, 199004, Russia E-mail: [email protected]
Abstract In this paper the problem of allocating over time the total cost incurred by coalitions of countries in a coalitional game of pollution reduction is considered. The Nash equilibrium in the game played by coalitions is computed, and then the value of each coalition is allocated among its members according to a given mechanism. The obtained solution is time-consistent and satisfies the irrational-behavior-proofness condition.
Keywords: differential game, cooperative game, dynamic programming, Hamilton-Jacobi-Bellman equation, Shapley value, Nash equilibrium, time-consistency, irrational-behavior-proofness condition.
1. Introduction
The global environmental problem requires the joint effort of many countries. The parties may disagree on how to allocate the costs of reducing emissions or accumulated pollution. Such disagreement can be resolved by negotiations, a mechanism which involves two important elements, i.e. fairness and (strategic) power. Existing multinational joint initiatives like the Kyoto Protocol or pollution permit trading can hardly be expected to offer a long-term solution, because there is no guarantee that participants will always be better off throughout the entire duration of the agreement. More than anything else, it is due to the lack of such guarantees that current cooperative schemes fail to provide an effective means to avert disaster. The Kyoto Protocol specifies emission targets for the period 2008-2012. Such a long-term perspective brings forward an equally challenging task: how to allocate over time the total individual share of the cost so that the countries stick, as time goes by, to the solution agreed upon at the initial time, supposing that the global allocation problem has been solved.
To create a cooperative solution that every party would commit to from beginning to end, the proposed arrangement must remain optimal throughout the period in question. This is a 'classic' game-theoretic problem. Time-consistency means that if one renegotiates the agreement at any intermediate instant of time, assuming that cooperation has prevailed from the initial date till that instant, then one would obtain the same outcome. The notion of time-consistency in cooperative differential games was introduced in the paper (Petrosyan, 1993). Differential games provide an effective tool to study pollution control problems and to analyze the interactions between the participants' strategic behavior and the dynamic evolution of pollution.
In this paper a model of pollution cost reduction is considered. Earlier the problem of global warming was considered by many authors (Dockner et al., 2000, Haurie and Zaccour, 1995, Kaitala and Pohjola, 1995). The model was introduced in the paper (Petrosyan and Zaccour, 2003). It was first used in discrete time (Germain et al., 1998). The approach of this paper is different. A more general coalitional setting is considered, in which not only the grand coalition but also a coalitional partition of players can be formed. This kind of approach has not been considered before for processes modeled as differential games because of fundamental difficulties connected with the construction of the solution. Coalitional values for static games have been studied in a series of papers (Bloch, 1966, Owen, 1997). A recent contribution proposed a characterization of the Owen value for static games under transferable utility (Albizur and Zarzuelo, 2004). The coalitional value for static simultaneous games with transferable payoffs was defined by generalizing the Shapley value to a coalitional framework (Owen, 1997). In particular, the coalitional value was defined by applying the Shapley value first to the coalition partition and then to the cooperative games played inside the resulting coalitions. This approach assumed that coalitions on the first level can cooperate (as players) and form the grand coalition. The game played with coalition partitioning becomes a cooperative one with a specially defined characteristic function. The Shapley value computed for this characteristic function is then the Shapley-Owen value for the game.
The present paper emerges from the idea that it is more natural not to assume that coalitions on the first level can form a grand coalition. At the first step the Nash equilibrium in the game played by coalitions is computed. At the second step the value of each coalition is allocated according to the Shapley value in the form of the PMS-vector, which was derived in the paper (Petrosyan and Mamkina, 2006). This approach gives, for the first time in the game-theoretic literature, the possibility to find a solution of a coalitional differential game.
If the obtained solution is time-consistent and satisfies group and individual rationality, then no rational player will deviate from the chosen coalitional path. If irrational behavior appears later in the game, the concerned player would still be performing better under the coalitional scheme. This condition is called the irrational-behavior-proofness condition or the D.W.K. Yeung's condition. The coalitional solution of the game of pollution cost reduction satisfies the D.W.K. Yeung's condition.
The main result of this paper is the calculation of this solution (PMS-vector), the proof of its time-consistency and the verification of the D.W.K. Yeung's condition.
2. Problem statements
The dynamics of the model follow (Petrosyan and Zaccour, 2003). Let $I$ be the set of countries involved in the game of emission reduction: $I = \{1, 2, \ldots, n\}$. The game starts at the instant of time $t_0$ from the initial state $x_0$. The emission of player $i$ $(i = 1, 2, \ldots, n)$ at time $t \in [t_0, \infty)$ is denoted by $u_i(t)$. Let $x(t)$ denote the stock of accumulated pollution by time $t$. The evolution of this stock is governed by the following differential equation:
\[ \dot{x}(t) = \sum_{i \in I} u_i(t) - \delta x(t), \qquad x(t_0) = x_0, \tag{1} \]
where $\delta$ denotes the natural rate of pollution absorption. For brevity, write $u_i(t) = u_i$ and $x(t) = x$. Let $C_i(u_i)$ be the emission reduction cost incurred by country $i$ when limiting its emission to level $u_i$:
\[ C_i(u_i(t)) = \frac{\gamma}{2}\,(u_i(t) - \bar{u}_i)^2, \qquad 0 \le u_i(t) \le \bar{u}_i, \quad \gamma > 0. \]
$D_i(x(t))$ denotes its damage cost:
\[ D_i(x) = \pi x(t), \qquad \pi > 0. \]
Both functions are continuously differentiable and convex, with $C_i'(u_i) < 0$ and $D_i'(x) > 0$. The payoff function of player $i$ is defined as
\[ K_i(x_0, t_0, u) = \int_{t_0}^{\infty} e^{-\rho(t - t_0)} \big( C_i(u_i(t)) + D_i(x(t)) \big)\, dt, \]
subject to the dynamics (1), where $u = (u_1, \ldots, u_n)$ and $\rho$ is the common social discount rate.
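The payoff just defined can be evaluated in closed form when all emissions are held constant, which gives a quick sanity check of the model. The following is a minimal sketch under that extra assumption (constant emissions are not part of the paper's solution); the function names and all numerical values are hypothetical, with `rho` the discount rate, `delta` the absorption rate, `pi` the damage coefficient and `gamma` the abatement cost coefficient.

```python
import math

def discounted_cost_constant(u_i, u_bar_i, total_emission, x0, pi, gamma, rho, delta):
    """Closed-form K_i when every country keeps its emission constant over time."""
    abatement = 0.5 * gamma * (u_i - u_bar_i) ** 2      # C_i(u_i), constant in t
    steady = total_emission / delta                      # long-run pollution stock
    damage = pi * ((x0 - steady) / (rho + delta) + steady / rho)
    return abatement / rho + damage

def discounted_cost_numeric(u_i, u_bar_i, total_emission, x0, pi, gamma, rho, delta,
                            horizon=400.0, steps=200_000):
    """Midpoint-rule approximation of the same integral, truncated at `horizon`."""
    h = horizon / steps
    steady = total_emission / delta
    total = 0.0
    for j in range(steps):
        s = (j + 0.5) * h                                # elapsed time t - t0
        x = (x0 - steady) * math.exp(-delta * s) + steady  # stock under constant emissions
        total += math.exp(-rho * s) * (0.5 * gamma * (u_i - u_bar_i) ** 2 + pi * x)
    return total * h

# Hypothetical numbers: one country emits 8 out of a cap of 10; all countries together emit 40.
closed = discounted_cost_constant(8.0, 10.0, 40.0, 50.0, 0.3, 2.0, 0.05, 0.1)
approx = discounted_cost_numeric(8.0, 10.0, 40.0, 50.0, 0.3, 2.0, 0.05, 0.1)
```

The two evaluations should agree up to the truncation and quadrature error, which illustrates how the discount rate and the absorption rate enter the damage term.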
Let $(S_1, S_2, \ldots, S_m)$ be a partition of the set $I$ such that $S_i \cap S_j = \emptyset$ for $i \ne j$, $\bigcup_{i=1}^{m} S_i = I$, $|S_i| = n_i$, $\sum_{i=1}^{m} n_i = n$.
Suppose that each player $i \in I$ plays in the interests of the coalition $S_k$ to which he belongs, trying to minimize the sum of the payoffs of its members, i.e.
\[ \min_{u_i,\, i \in S_k} \sum_{i \in S_k} K_i(u, x_0, t_0) = \min_{u_i,\, i \in S_k} \int_{t_0}^{\infty} e^{-\rho(t - t_0)} \sum_{i \in S_k} \{ C_i(u_i(t)) + D_i(x(t)) \}\, dt \]
subject to the dynamics (1).
3. Solution of the problem
By assumption, each player $i \in S_k$ plays in the interests of the coalition $S_k$. Without loss of generality it can be assumed that the coalitions $S_k$ act as players. Then at the first stage a Nash equilibrium is computed. The Nash equilibrium is calculated with the help of the Hamilton-Jacobi-Bellman equation (Dockner et al., 2000). The total cost of coalition $S_k$ is allocated among its players according to the Shapley value of the corresponding subgame $\Gamma(S_k)$. The game $\Gamma(S_k)$ is defined as follows: $S_k$ is the set of players involved in $\Gamma(S_k)$, and $\Gamma(S_k)$ is a cooperative game. The computation of the characteristic function of this game is not standard. When the characteristic function is computed for a coalition $K \subset S_k$, the left-out players stick to their feedback Nash strategies. The payoffs of all players $i \in I$ form a PMS-vector (Petrosyan and Mamkina, 2006). Thus the PMS-vector is defined in the following way:
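Definition 1. The vector
\[ PMS(x,t) = [PMS_1(x,t), PMS_2(x,t), \ldots, PMS_n(x,t)] \]
is a PMS-vector, where $PMS_i(x,t) = Sh_i(S_k,x,t)$ if $i \in S_k$, where
\[ Sh_i(S_k,x,t) = \sum_{M \ni i,\; M \subset S_k} \frac{(n_k - |M|)!\,(|M| - 1)!}{n_k!}\, \big[ V(M,x,t) - V(M \setminus \{i\},x,t) \big], \]
$|M|$ is the number of players in $M$, $V$ is the characteristic function of the game $\Gamma(S_k)$, and $(S_1, S_2, \ldots, S_m)$ is the partition of the set $I$.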
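The Shapley formula in Definition 1 can be evaluated directly by enumerating the subcoalitions of $S_k$. The sketch below is a generic illustration only: the function name `shapley_values` and the three-player toy cost table are hypothetical and are not taken from the paper.

```python
from itertools import combinations
from math import factorial

def shapley_values(players, v):
    """Shapley value of each player for a characteristic function v(frozenset)."""
    n = len(players)
    values = {}
    for i in players:
        others = [p for p in players if p != i]
        total = 0.0
        for size in range(n):                      # size of the coalition M \ {i}
            # weight (n - |M|)! (|M| - 1)! / n!  with |M| = size + 1
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for rest in combinations(others, size):
                without_i = frozenset(rest)
                total += weight * (v(without_i | {i}) - v(without_i))
        values[i] = total
    return values

# Hypothetical 3-player cost game in which v depends only on the coalition size.
toy_cost = {0: 0.0, 1: 10.0, 2: 17.0, 3: 21.0}
result = shapley_values([1, 2, 3], lambda M: toy_cost[len(M)])
```

Because the toy game is symmetric, every player receives the same share, namely the cost of the full coalition divided by the number of players.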
In subsection 3.1 we calculate a Nash equilibrium supposing that the coalitions $S_1, S_2, \ldots, S_m$ act as players. Then in subsection 3.3 we construct the PMS-vector, which we consider as the solution of the constructed coalitional game.
3.1. Computation of the Nash equilibrium
At the first step we compute a Nash equilibrium in the game played by coalitions. Suppose the coalitions $S_k$ act as players. The optimization problem for the coalition $S_k$ is
\[ \min_{u_i,\, i \in S_k} \sum_{i \in S_k} K_i(u,x,t) = \min_{u_i,\, i \in S_k} \int_{t_0}^{\infty} e^{-\rho(t - t_0)} \sum_{i \in S_k} \big( C_i(u_i) + D_i(x) \big)\, dt \]
subject to the pollution accumulation dynamics (1). To obtain a feedback Nash equilibrium, assuming differentiability of the value function, the system of Hamilton-Jacobi-Bellman equations must be satisfied. Denote by $W_{S_k}$ the Bellman function of this problem. The above-mentioned system is given by the following formula:
\[ \rho W_{S_k}(x,t) = \min_{u_i,\, i \in S_k} \Big[ \sum_{i \in S_k} \big( C_i(u_i) + D_i(x) \big) + \frac{\partial W_{S_k}(x,t)}{\partial x} \Big( \sum_{i \in I} u_i(t) - \delta x(t) \Big) \Big], \qquad k = 1, \ldots, m. \tag{2} \]
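Differentiating the right-hand side of (2) with respect to $u_i$ and equating to zero leads to
\[ u_i^N = \bar{u}_i - \frac{1}{\gamma} \frac{\partial W_{S_k}}{\partial x}, \qquad i \in S_k. \tag{3} \]
Substituting $u_i^N$ into (2) we get:
\[ \rho W_{S_k}(x,t) = \frac{n_k}{2\gamma} \Big( \frac{\partial W_{S_k}}{\partial x} \Big)^2 + n_k \pi x + \frac{\partial W_{S_k}}{\partial x} \Big( \sum_{i \in I} \bar{u}_i - \frac{1}{\gamma} \sum_{j=1}^{m} n_j \frac{\partial W_{S_j}}{\partial x} - \delta x \Big). \tag{4} \]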
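It can be shown in the usual way that the linear functions
\[ W_{S_k} = A_{S_k} x + B_{S_k}, \qquad k = 1, 2, \ldots, m, \tag{5} \]
satisfy the equation (2). Now note that
\[ \frac{\partial W_{S_k}}{\partial x} = A_{S_k}. \tag{6} \]
Substituting (5) and (6) into the formula (4) we get the coefficients $A_{S_k}$ and $B_{S_k}$ as follows:
\[ A_{S_k} = \frac{n_k \pi}{\rho + \delta}, \qquad B_{S_k} = \frac{A_{S_k}}{\rho} \Big( \sum_{i \in I} \bar{u}_i - \sum_{j=1}^{m} \frac{n_j}{\gamma} A_{S_j} + \frac{n_k}{2\gamma} A_{S_k} \Big). \tag{7} \]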
If we combine this with (3) and (7) we get
\[ u_i^N = \bar{u}_i - \frac{1}{\gamma} \frac{n_k \pi}{\rho + \delta} \tag{8} \]
for $i \in I$, if $i \in S_k$. As a result we obtain the total cost of coalition $S_k$, $k = 1, \ldots, m$, in the following form:
\[ W_{S_k}(x,t) = \frac{n_k \pi}{\rho(\rho + \delta)} \Big( \rho x + \sum_{i \in I} \bar{u}_i - \sum_{j=1}^{m} \frac{1}{\gamma} \frac{n_j^2 \pi}{\rho + \delta} + \frac{1}{2\gamma} \frac{n_k^2 \pi}{\rho + \delta} \Big). \tag{9} \]
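The coefficients of the linear value function and the equilibrium emission cut in (8) can be checked numerically against the Hamilton-Jacobi-Bellman equation (2). The following is a minimal sketch, assuming the quadratic abatement cost and linear damage of Section 2; the function names, the dictionary layout and the parameter values in the usage lines are hypothetical (`rho` is the discount rate, `delta` the absorption rate, `pi` the damage coefficient, `gamma` the cost coefficient).

```python
def coalition_nash(sizes, u_bar, pi, gamma, rho, delta):
    """Per coalition: emission cut of each member and A, B of W = A*x + B."""
    total_u_bar = sum(u_bar)
    # total reduction of emissions in equilibrium: sum over coalitions of n_j * A_j / gamma
    cut_total = sum(n * n * pi / (rho + delta) for n in sizes) / gamma
    out = []
    for n_k in sizes:
        A = n_k * pi / (rho + delta)
        B = (A / rho) * (total_u_bar - cut_total + n_k * A / (2 * gamma))
        out.append({"cut": A / gamma, "A": A, "B": B})
    return out

def hjb_residual(k, x, sol, sizes, u_bar, pi, gamma, rho, delta):
    """rho*W minus the right-hand side of (2) at the equilibrium strategies."""
    A, B = sol[k]["A"], sol[k]["B"]
    n_k = sizes[k]
    total_emission = sum(u_bar) - sum(n * s["cut"] for n, s in zip(sizes, sol))
    running_cost = n_k * (gamma / 2) * sol[k]["cut"] ** 2 + n_k * pi * x
    return rho * (A * x + B) - (running_cost + A * (total_emission - delta * x))

# Hypothetical example: five countries split into coalitions of sizes 2 and 3.
sizes, u_bar = [2, 3], [10.0] * 5
sol = coalition_nash(sizes, u_bar, pi=0.3, gamma=2.0, rho=0.05, delta=0.1)
```

A residual that vanishes for every stock level `x` indicates that the linear function indeed solves the equation for these parameter values; the larger coalition cuts emissions more per member, since the cut is proportional to the coalition size.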
3.2. Computation of the Shapley value
Recall that the total cost of coalition $S_k$ is allocated among the players according to the Shapley value. Accordingly, we have to find the characteristic function of the game $\Gamma(S_k)$ and the Shapley value. The computation of the characteristic function of this game is not standard. When the characteristic function is computed for a coalition $K \subset S_k$, the left-out players stick to their Nash strategies.
Computation of Nash equilibrium in the game $\Gamma(S_k)$. We can easily compute the Nash equilibrium. Each player seeks to minimize a discounted stream of emission reduction cost and damage cost. We have the following system of optimization problems:
\[ \min_{u_i} K_i(u, x_0, t_0) = \int_{t_0}^{\infty} e^{-\rho(t - t_0)} \{ C_i(u_i) + D_i(x) \}\, dt, \qquad i \in S_k, \tag{10} \]
subject to the dynamics (1). The value function $W_i(x,t)$ of the system of dynamic programming problems (10) must satisfy the following system of Hamilton-Jacobi-Bellman equations:
\[ \rho W_i(x,t) = \min_{u_i} \Big[ C_i(u_i) + D_i(x) + \frac{\partial W_i(x,t)}{\partial x} \Big( \sum_{j \in I} u_j - \delta x(t) \Big) \Big], \qquad i \in S_k. \tag{11} \]
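Differentiating the right-hand side of (11) with respect to $u_i$ and equating to zero leads to the following Nash emission strategies:
\[ u_i^n = \bar{u}_i - \frac{1}{\gamma} \frac{\pi}{\rho + \delta}, \qquad i \in S_k. \tag{12} \]
Recall that the players outside $S_k$ use $u_i = u_i^N$, $i \in S_j$, $j \ne k$, where $u_i^N$ is given by the formula (8). Substituting (12) and (8) into (11) we obtain
\[ \rho W_i(x,t) = \frac{1}{2\gamma} \Big( \frac{\partial W_i}{\partial x} \Big)^2 + \pi x + \frac{\partial W_i}{\partial x} \Big( \sum_{j \in I} \bar{u}_j - \frac{1}{\gamma} \sum_{j=1,\, j \ne k}^{m} \frac{n_j^2 \pi}{\rho + \delta} - \frac{1}{\gamma} \sum_{j \in S_k} \frac{\partial W_j}{\partial x} - \delta x \Big). \tag{13} \]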
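Looking for $W_i$ in a linear form analogous to (5), with coefficients $A_i$ and $B_i$ that are the same for all members of $S_k$, we get
\[ \rho A_i x + \rho B_i = \frac{1}{2\gamma} A_i^2 + \pi x + A_i \Big\{ \sum_{j \in I} \bar{u}_j - \frac{n_k}{\gamma} A_i - \frac{1}{\gamma} \sum_{j=1,\, j \ne k}^{m} \frac{n_j^2 \pi}{\rho + \delta} - \delta x \Big\}. \tag{14} \]
It follows easily that
\[ A_i = \frac{\pi}{\rho + \delta}, \qquad B_i = \frac{A_i}{\rho} \Big( \sum_{j \in I} \bar{u}_j - \frac{1}{\gamma} \sum_{j=1,\, j \ne k}^{m} \frac{n_j^2 \pi}{\rho + \delta} - \frac{n_k}{\gamma} \frac{\pi}{\rho + \delta} + \frac{1}{2\gamma} \frac{\pi}{\rho + \delta} \Big). \tag{15} \]
By assumption,
\[ W_i = A_i x + B_i, \qquad i \in S_k. \tag{16} \]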
Computation of outcomes for all remaining possible coalitions in the game $\Gamma(S_k)$. The characteristic function for an intermediate coalition $L \subset S_k$ is computed from the solution of the following optimization problem:
\[ \min_{u_i,\, i \in L} \sum_{i \in L} K_i(u, x_0, t_0) = \min_{u_i,\, i \in L} \int_{t_0}^{\infty} e^{-\rho(t - t_0)} \sum_{i \in L} \big( C_i(u_i) + D_i(x) \big)\, dt, \qquad L \subset S_k, \tag{17} \]
subject to the dynamics (1). Let $W_L(x,t)$ be the Bellman function of the problem (17). The solution of the problem (17) is equivalent to the solution of the following Hamilton-Jacobi-Bellman equation:
\[ \rho W_L(x,t) = \min_{u_i,\, i \in L} \Big\{ \sum_{i \in L} \big( C_i(u_i) + D_i(x) \big) + \frac{\partial W_L(x,t)}{\partial x} \Big( \sum_{i \in I} u_i - \delta x \Big) \Big\}. \tag{18} \]
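Suppose the players from $I \setminus S_k$ stick to (8) and the players from $S_k \setminus L$ stick to (12). Differentiating the right-hand side of expression (18) with respect to $u_i$, $i \in L$, we get the strategies $u_i^L = \bar{u}_i - \frac{1}{\gamma} \frac{\partial W_L}{\partial x}$, $i \in L$. Substituting $u^L$, $u^N$, $u^n$ into the formula (18) leads to
\[ \rho W_L(x,t) = \frac{l}{2\gamma} \Big( \frac{\partial W_L}{\partial x} \Big)^2 + l \pi x + \frac{\partial W_L}{\partial x} \Big( \sum_{i \in I} \bar{u}_i - \frac{1}{\gamma} \sum_{j=1,\, j \ne k}^{m} \frac{n_j^2 \pi}{\rho + \delta} - \frac{1}{\gamma} \frac{(n_k - l)\pi}{\rho + \delta} - \frac{l}{\gamma} \frac{\partial W_L}{\partial x} - \delta x(t) \Big), \qquad L \subset S_k, \]
where $l = |L|$. Proceeding as in (5) and (6), we get:
\[ \rho A_L x + \rho B_L = \frac{l}{2\gamma} A_L^2 + l \pi x + A_L \Big( \sum_{i \in I} \bar{u}_i - \frac{1}{\gamma} \sum_{j=1,\, j \ne k}^{m} \frac{n_j^2 \pi}{\rho + \delta} - \frac{1}{\gamma} \frac{(n_k - l)\pi}{\rho + \delta} - \frac{l}{\gamma} A_L - \delta x \Big). \]
It can be easily checked that
\[ W_L = A_L x + B_L, \qquad L \subset S_k, \tag{19} \]
where
\[ A_L = \frac{l \pi}{\rho + \delta}, \qquad B_L = \frac{A_L}{\rho} \Big( \sum_{i \in I} \bar{u}_i - \frac{1}{\gamma} \sum_{j=1,\, j \ne k}^{m} \frac{n_j^2 \pi}{\rho + \delta} - \frac{1}{\gamma} \frac{(n_k - l)\pi}{\rho + \delta} - \frac{l}{2\gamma} A_L \Big). \tag{20} \]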
Characteristic function. We have proved that the characteristic function is given by the following formula:
\[ V(M,x,t) = \begin{cases} 0, & M = \emptyset; \\ W_i(x,t), & M = \{i\}; \\ W_{S_k}(x,t), & M = S_k; \\ W_L(x,t), & M = L, \end{cases} \tag{21} \]
where $W_i(x,t)$ is given by (16), $W_L(x,t)$ is given by (19) and $W_{S_k}(x,t)$ is given by (9).
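The Shapley Value. We can see that the values of the characteristic function $V(M,x,t)$ are the same for all coalitions $K$ and $L$ with $|K| = |L|$, since they depend on the emission bounds only through the sum $\sum_{i \in I} \bar{u}_i$. Therefore the Shapley value for the game $\Gamma(S_k)$ equals:
\[ Sh_i(S_k,x,t) = \frac{V(S_k,x,t)}{n_k} = \frac{\pi}{\rho(\rho + \delta)} \Big( \rho x + \sum_{j \in I} \bar{u}_j - \sum_{j=1}^{m} \frac{1}{\gamma} \frac{n_j^2 \pi}{\rho + \delta} + \frac{1}{2\gamma} \frac{n_k^2 \pi}{\rho + \delta} \Big) \tag{22} \]
for any $i \in I$, if $i \in S_k$.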
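The equal split in (22) follows because, in a game whose characteristic function depends only on the coalition size, the Shapley value is the average of the marginal contributions over sizes, and this sum telescopes. A minimal sketch of that argument is given below; the value of a subcoalition of size `l` follows the linear form (19) with the coefficients derived above, while the function names and all parameter values are hypothetical.

```python
import math

def subcoalition_value(l, n_k, other_sizes, total_u_bar, x, pi, gamma, rho, delta):
    """V(L) for |L| = l inside S_k, using W_L = A_L*x + B_L; V(empty set) = 0."""
    if l == 0:
        return 0.0
    a = pi / (rho + delta)
    A_L = l * a
    outside = sum(n * n for n in other_sizes) * a / gamma   # cuts of players in I \ S_k, per (8)
    inside = (n_k - l) * a / gamma                          # cuts of players in S_k \ L, per (12)
    B_L = (A_L / rho) * (total_u_bar - outside - inside - l * A_L / (2 * gamma))
    return A_L * x + B_L

def symmetric_shapley(n_k, value_by_size):
    """Shapley value when V depends only on |M|: mean marginal contribution over sizes."""
    return sum(value_by_size(l) - value_by_size(l - 1) for l in range(1, n_k + 1)) / n_k

# Hypothetical example: S_k has 3 members, the only other coalition has 2.
params = dict(n_k=3, other_sizes=[2], total_u_bar=50.0, x=50.0,
              pi=0.3, gamma=2.0, rho=0.05, delta=0.1)
share = symmetric_shapley(3, lambda l: subcoalition_value(l, **params))
full = subcoalition_value(3, **params)
```

Here `share` should coincide with `full / 3`, the per-member share of the coalition cost.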
3.3. Constructing of the PMS-vector
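We have obtained the Shapley value (22) for any game $\Gamma(S_k)$, where $(S_1, S_2, \ldots, S_m)$ is a coalitional partition of the set of players $I$. Taking into account Definition 1, we obtain the formula for the PMS-vector:
\[ PMS(x_0,t_0) = (PMS_1(x_0,t_0), PMS_2(x_0,t_0), \ldots, PMS_n(x_0,t_0)), \tag{23} \]
where $PMS_i(x_0,t_0) = Sh_i(S_k,x_0,t_0)$ if $i \in S_k$, see (22). Substituting the Nash equilibrium strategy (8) and solving the equation of dynamics (1) we obtain the coalitional trajectory:
\[ x^N(t) = \Big( x_0 - \frac{1}{\delta} \Big( \sum_{i \in I} \bar{u}_i - \sum_{j=1}^{m} \frac{1}{\gamma} \frac{n_j^2 \pi}{\rho + \delta} \Big) \Big) e^{-\delta(t - t_0)} + \frac{1}{\delta} \Big( \sum_{i \in I} \bar{u}_i - \sum_{j=1}^{m} \frac{1}{\gamma} \frac{n_j^2 \pi}{\rho + \delta} \Big). \tag{24} \]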
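Under the equilibrium strategies (8) the total emission is constant, so the stock equation is linear and the trajectory (24) can be compared with a direct numerical integration. The sketch below does this with a crude explicit Euler scheme; the function names, step count and parameter values are hypothetical choices made for illustration.

```python
import math

def net_emission(sizes, u_bar, pi, gamma, rho, delta):
    """Total equilibrium emission: sum of the bounds minus the cuts from (8)."""
    return sum(u_bar) - sum(n * n for n in sizes) * pi / (gamma * (rho + delta))

def trajectory(t, t0, x0, c, delta):
    """Closed-form stock x^N(t) from (24), with c the constant total emission."""
    return (x0 - c / delta) * math.exp(-delta * (t - t0)) + c / delta

def euler_stock(t, t0, x0, c, delta, steps=200_000):
    """Explicit Euler integration of dx/dt = c - delta*x, for comparison only."""
    h = (t - t0) / steps
    x = x0
    for _ in range(steps):
        x += h * (c - delta * x)
    return x

# Hypothetical example: coalitions of sizes 2 and 3, emission bound 10 per country.
c = net_emission([2, 3], [10.0] * 5, pi=0.3, gamma=2.0, rho=0.05, delta=0.1)
exact = trajectory(10.0, 0.0, 50.0, c, 0.1)
approx = euler_stock(10.0, 0.0, 50.0, c, 0.1)
```

As time grows, the stock approaches the steady state `c / delta`, which is lower the coarser the coalition partition, because larger coalitions cut more.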
4. Time-consistency
Time-consistency means that if one renegotiates the agreement at any intermediate instant of time, assuming that the coalitional agreement has prevailed from the initial date till that instant, then one would obtain the same outcome. The notion of time-consistency was introduced in (Petrosyan, 1993) and was used in problems of environmental management (Petrosyan and Zaccour, 2003). We begin with definitions, which were introduced in the paper (Petrosyan and Zaccour, 2003). Consider subgames of our game with initial conditions $(x^N(t), t)$ on the coalitional trajectory and denote by $PMS(x^N(t),t)$ the corresponding PMS-vector:
\[ PMS(x^N(t),t) = (PMS_1(x^N(t),t), PMS_2(x^N(t),t), \ldots, PMS_n(x^N(t),t)), \]
where $PMS_i(x^N(t),t) = Sh_i(S_k, x^N(t), t)$ if $i \in S_k$, see (22).
Definition 2. The vector $\beta(t) = (\beta_1(t), \beta_2(t), \ldots, \beta_n(t))$ is a PMS-vector distribution procedure (PMSDP) if
\[ PMS_i(x_0,t_0) = \int_{t_0}^{\infty} e^{-\rho(t - t_0)} \beta_i(t)\, dt, \qquad i \in I. \]
Definition 3. The vector $\beta(t) = (\beta_1(t), \beta_2(t), \ldots, \beta_n(t))$ is a time-consistent PMSDP if at $(x^N(t), t)$ for any $t \in [t_0, \infty)$ the following condition holds:
\[ PMS_i(x_0,t_0) = \int_{t_0}^{t} e^{-\rho(\tau - t_0)} \beta_i(\tau)\, d\tau + e^{-\rho(t - t_0)} PMS_i(x^N(t),t), \qquad i \in I. \]
If this condition is valid for any $t \in [t_0, \infty)$, then the initial agreement remains unaltered during the game.
Theorem 1. The vector $\beta(t) = (\beta_1(t), \beta_2(t), \ldots, \beta_n(t))$, where $\beta_i(t)$ is given by
\[ \beta_i(t) = \rho\, PMS_i(x^N(t),t) - \frac{d}{dt} PMS_i(x^N(t),t), \]
is a time-consistent PMSDP.
Consider the PMS-vector (23) that was computed in subsection 3.3. Straightforward calculations lead to
\[ \beta_i(t) = \pi x^N(t) + \frac{1}{2\gamma} \frac{n_k^2 \pi^2}{(\rho + \delta)^2}, \tag{25} \]
where $i \in S_k$ and $x^N(t)$ is given by formula (24). It can be readily shown by direct calculation that (25) satisfies Definition 2. This proves that (25) defines a time-consistent PMSDP.
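The time-consistency identity of Definition 3 can be checked numerically for the distribution procedure of Theorem 1. The sketch below is an illustration under stated assumptions: the Shapley share (22) is rewritten as a linear function of the stock, the constant total equilibrium emission is passed in as the hypothetical argument `c`, and the discounted integral is approximated by the midpoint rule; none of the names or numbers come from the paper.

```python
import math

def check_time_consistency(t, t0, x0, n_k, c, pi, gamma, rho, delta, steps=100_000):
    """Return both sides of the identity in Definition 3 for a member of S_k."""
    a = pi / (rho + delta)
    q = n_k ** 2 * pi ** 2 / (2 * gamma * (rho + delta) ** 2)  # member's abatement cost rate

    def x_n(s):       # coalitional trajectory (24)
        return (x0 - c / delta) * math.exp(-delta * (s - t0)) + c / delta

    def pms(x):       # Shapley share (22), written as a*x + constant
        return a * x + (a / rho) * c + q / rho

    def beta(s):      # Theorem 1: rho*PMS - d/dt PMS along the trajectory
        return rho * pms(x_n(s)) - a * (c - delta * x_n(s))

    # midpoint rule for the discounted integral of beta over [t0, t]
    h = (t - t0) / steps
    integral = h * sum(math.exp(-rho * (j + 0.5) * h) * beta(t0 + (j + 0.5) * h)
                       for j in range(steps))
    lhs = pms(x0)
    rhs = integral + math.exp(-rho * (t - t0)) * pms(x_n(t))
    return lhs, rhs

# Hypothetical example: member of a 3-country coalition, total equilibrium emission 37.
lhs, rhs = check_time_consistency(10.0, 0.0, 50.0, 3, 37.0, 0.3, 2.0, 0.05, 0.1)
```

The two returned numbers should agree up to quadrature error for any intermediate time `t`, which is the content of the time-consistency requirement.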
5. The D.W.K. Yeung’s condition
In this paper the irrational-behavior-proofness condition is considered, which is called the D.W.K. Yeung's condition (Yeung, 2006). Consider the solution of the game in the form of the PMS-vector. All players are involved in the game with the coalitional structure. If the initial agreement is broken at some intermediate instant of time $t$, then the cost of every player is still no greater than the cost he would have incurred had the coalitions never been formed. The D.W.K. Yeung's condition reads
\[ V_i(x(t_0), T) \le V_i(x(t), T - t) + \int_{t_0}^{t} \beta_i(\tau)\, d\tau. \]
The D.W.K. Yeung's condition for the problem of emission reduction is described as follows:
\[ V_i(x(t_0)) \ge e^{-\rho(t - t_0)} V_i(x^N(t)) + \int_{t_0}^{t} e^{-\rho(\tau - t_0)} \beta_i(\tau)\, d\tau, \tag{26} \]
where $V_i(x(t_0))$ is the maximal guaranteed payoff of player $i$ with the initial state $x(t_0)$ when he plays individually, and $V_i(x^N(t))$ is the maximal guaranteed payoff with the initial state $x^N(t)$. The second term on the right-hand side of condition (26) is the payoff which player $i$ receives during the time $t - t_0$ according to the distribution procedure $\beta_i(t)$ of the PMS-vector, where $\beta_i(t)$ is given by (25).
In the condition (26) the sign of the inequality is reversed, because every player seeks to decrease total costs. The multiplier $e^{-\rho(\tau - t_0)}$ appears in the condition (26) because a game with discounted payoffs is considered.
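Consider the integral on the right-hand side of inequality (26). Substituting $\beta_i(\tau)$ from (25) and $x^N(\tau)$ from (24) leads to the following:
\[ \int_{t_0}^{t} e^{-\rho(\tau - t_0)} \beta_i(\tau)\, d\tau = \frac{\pi}{\rho + \delta} \Big( 1 - e^{-(\rho + \delta)(t - t_0)} \Big) \Big( x_0 - \frac{1}{\delta} \Big( \sum_{j \in I} \bar{u}_j - \sum_{j=1}^{m} \frac{1}{\gamma} \frac{n_j^2 \pi}{\rho + \delta} \Big) \Big) + \frac{1 - e^{-\rho(t - t_0)}}{\rho} \Big( \frac{\pi}{\delta} \Big( \sum_{j \in I} \bar{u}_j - \sum_{j=1}^{m} \frac{1}{\gamma} \frac{n_j^2 \pi}{\rho + \delta} \Big) + \frac{n_k^2 \pi^2}{2\gamma(\rho + \delta)^2} \Big). \]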
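The value of $V_i(x^N(t)) = V(\{i\}, x^N(t), t)$ is defined by the formula (21). Simplifying the right-hand side of condition (26) leads to:
\[ e^{-\rho(t - t_0)} V_i(x^N(t)) + \int_{t_0}^{t} e^{-\rho(\tau - t_0)} \beta_i(\tau)\, d\tau = \frac{1}{\gamma} \frac{\pi^2}{\rho(\rho + \delta)^2}\, e^{-\rho(t - t_0)} \Big( \frac{n_k^2}{2} - n_k + \frac{1}{2} \Big) + \frac{\pi}{\rho + \delta} x_0 + \frac{\pi}{\rho(\rho + \delta)} \Big( \sum_{j \in I} \bar{u}_j - \sum_{j=1}^{m} \frac{1}{\gamma} \frac{n_j^2 \pi}{\rho + \delta} \Big) + \frac{n_k^2 \pi^2}{2\rho\gamma(\rho + \delta)^2}. \]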
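The left-hand side of the inequality (26) is written as follows:
\[ V_i(x(t_0)) = \frac{\pi}{\rho + \delta} x_0 + \frac{\pi}{\rho(\rho + \delta)} \sum_{j \in I} \bar{u}_j - \frac{\pi^2}{\rho\gamma(\rho + \delta)^2} \Big( \sum_{j=1,\, j \ne k}^{m} n_j^2 + n_k \Big) + \frac{\pi^2}{2\rho\gamma(\rho + \delta)^2}. \]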
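As a result, after cancelling the terms common to both sides and dividing by the positive factor $(1 - e^{-\rho(t - t_0)})\, \pi^2 / (\rho\gamma(\rho + \delta)^2)$ (for $t = t_0$ the condition (26) holds with equality), the proof of inequality (26) is equivalent to the proof of the following inequality:
\[ \sum_{j=1}^{m} n_j^2 - \frac{n_k^2}{2} \ge \sum_{j=1,\, j \ne k}^{m} n_j^2 + n_k - \frac{1}{2}, \qquad \sum_{j=1}^{m} n_j = n. \tag{27} \]
The inequality (27) is equivalent to the following inequality:
\[ \sum_{j=1,\, j \ne k}^{m} (n_j^2 - n_j^2) + \frac{1}{2} \big( n_k^2 - 2 n_k + 1 \big) = \frac{1}{2} (n_k - 1)^2 \ge 0, \]
which is true. We have proved that the condition (26) holds.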
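The verification of condition (26) can be illustrated numerically: the gap between its two sides is computed from the individual value function (15)-(16), the trajectory (24) and the instantaneous cost $\pi x^N(t) + n_k^2 \pi^2 / (2\gamma(\rho+\delta)^2)$ as the distribution procedure, and then compared with the square $(n_k - 1)^2$ that the argument above reduces to. This is a sketch under those assumptions; the function name and the parameter values are hypothetical.

```python
import math

def yeung_gap(t, t0, x0, k, sizes, u_bar, pi, gamma, rho, delta):
    """Left side minus right side of (26) for a member of coalition number k."""
    a = pi / (rho + delta)
    n_k = sizes[k]
    U = sum(u_bar)
    c = U - sum(n * n for n in sizes) * a / gamma             # total equilibrium emission
    others = sum(n * n for j, n in enumerate(sizes) if j != k) * a / gamma

    def x_n(s):       # coalitional trajectory (24)
        return (x0 - c / delta) * math.exp(-delta * (s - t0)) + c / delta

    def v_single(x):  # individual value W_i = A_i*x + B_i, see (15)-(16)
        return a * x + (a / rho) * (U - others - n_k * a / gamma + a / (2 * gamma))

    q = n_k ** 2 * a ** 2 / (2 * gamma)                       # abatement cost rate in beta_i
    s = t - t0
    # closed-form discounted integral of beta_i = pi*x^N + q over [t0, t]
    integral = (a * (x0 - c / delta) * (1 - math.exp(-(rho + delta) * s))
                + (pi * c / delta + q) * (1 - math.exp(-rho * s)) / rho)
    return v_single(x0) - (math.exp(-rho * s) * v_single(x_n(t)) + integral)

# Hypothetical example: coalitions of sizes 2 and 3, deviation considered at t = 5.
args = dict(t=5.0, t0=0.0, x0=50.0, sizes=[2, 3], u_bar=[10.0] * 5,
            pi=0.3, gamma=2.0, rho=0.05, delta=0.1)
gaps = [yeung_gap(k=k, **args) for k in range(2)]
```

In this example both gaps are nonnegative and grow with the coalition size, while a singleton coalition would give a zero gap, in line with the factor $(n_k - 1)^2$.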
References
Albizur, M. and J. Zarzuelo (2004). On coalitional semivalues. Games and Economic Behavior, 2, 221-243.
Bloch, F. (1966). Sequential formation of coalitions with fixed payoff division. Games and Economic Behavior, 14, 90-123.
Dockner, E. J., S. Jorgensen, N. van Long and G. Sorger (2000). Differential Games in Economics and Management Science. Cambridge University Press, 41-85.
Haurie, A. and G. Zaccour (1995). Differential game models of global environment management. Annals of the International Society of Dynamic Games, 2, 3-24.
Kaitala, V. and M. Pohjola (1995). Sustainable international agreements on greenhouse warming: a game theory study. Annals of the International Society of Dynamic Games, 2, 67-88.
Owen, G. (1997). Values of games with a priori unions. In: R. Henn and O. Moeschlin (eds.). Mathematical Economy and Game Theory (Berlin), 78-88.
Petrosyan, L. (1993). Differential Games of Pursuit. World Sci. Pbl., 320.
Petrosyan, L. and G. Zaccour (2003). Time-consistent Shapley value allocation of pollution cost reduction. Journal of Economic Dynamics and Control, 27, 381-398.
Petrosyan, L. and S. Mamkina (2006). Dynamic games with coalitional structures. International Game Theory Review, 8(2), 295-307.
Petrosyan, L. and N. Kozlovskaya (2007). Time-consistent allocation in coalitional game of pollution cost reduction. Computational Economics and Financial and Industrial Systems, A Preprints Volume of the 11th IFAC Symposium, IFAC publications Internet Homepage, http://www.elsevier.com/locate/ifac, 156-160.
Yeung, D.W.K. (2006). An irrational-behavior-proofness condition in cooperative differential games. International Game Theory Review, 8, 739-744.