Contributions to Game Theory and Management, XII, 366-386
Stackelberg Equilibrium of Opinion Dynamics Game in Social Network with Two Influence Nodes
Mengke Zhen
St. Petersburg State University, 7/9 Universitetskaya nab., Saint Petersburg 199034, Russia. E-mail: mengkezhen@163.com
Abstract The alteration of individual opinions in groups over time is a particularly common phenomenon in social life. Taking into account the influence of homogeneous members and of certain especially influential persons, an opinion dynamics game is established. In a social network, two special influence nodes pursue their own goals by influencing the opinions of the other, normal nodes in discrete time. From the perspective of non-cooperation, the Stackelberg equilibrium is selected as the solution of the opinion dynamics game. Given distinct information knowledge, players derive different equilibrium strategies; the open-loop and feedback information configurations are investigated. In this two-person non-cooperative dynamic game, Pontryagin's minimum principle and dynamic programming are adopted to derive the equilibrium levels of influence for the influence nodes and the equilibrium opinions of the other, normal nodes in the network. To compute and compare the various equilibrium concepts under different information structures, numerical results are presented for several scenarios.
Keywords: social network, influence, opinion dynamics, Stackelberg equilibrium
1. Introduction
In complex interpersonal social networks, some influential opinions determine the formation and revision of individuals' opinions. Delving into the opinion dynamics of agents in a community is conducive to a deeper comprehension of the progress of civilization, social development and stabilization; furthermore, it is a crucial foundation of social control. Initially, the question of whether all agents in a group reach an opinion consensus was a hot research topic. One of the most classical models is the DeGroot model (DeGroot, 1974). Establishing the algebra of a Markov chain, DeGroot proposed a systematic framework of opinion dynamics, in which a linear combination of the agents' opinions at the previous stage constitutes an agent's current opinion. The social influence network knowledge structure was enriched by Friedkin and Johnsen (Friedkin and Johnsen, 1990; Friedkin and Johnsen, 1999), whose works describe a social influence process affected by both endogenous opinions and exogenous conditions. The works (Acemoglu and Ozdaglar, 2011; Buechel et al., 2015; Dandekar et al., 2013) extended the theory of opinion dynamics to economics, social and political sciences, engineering and computer sciences. In (Hegselmann and Krause, 2002), the bounded confidence framework for a Friedkin–Johnsen model was developed and a series of simulations was presented to illustrate the theory. The wisdom of groups in the DeGroot model was examined in (Golub and Jackson, 2010), while the stubbornness of agents regarding their initial opinions was considered in (Ghaderi and Srikant, 2014). Recently, (Bure et al., 2015; Bure et al., 2017) constructed a specific structure of the influence matrix to investigate the problem of reaching a consensus in opinion dynamics with three groups of agents influenced by two nodes.
None of the aforementioned works takes a game-theoretic perspective. In fact, distinct game-theoretic approaches can be adopted to analyse opinion dynamics. For instance, the Hegselmann–Krause model was investigated as a well-designed potential game in (Etesami and Başar, 2015), while a controlled DeGroot model of opinion dynamics was explored in (Barabanov et al., 2010; Gubanov et al., 2011).
A special structure of dynamic games is considered in this paper, where influence in the opinion formation process is characterized by a discrete-time linear-quadratic game. The theory of linear-quadratic games is based on optimal control, with system dynamics described by a set of linear differential/difference equations and a criterion described by a quadratic function. In the proposed model, the opinion dynamics is an iteration, linear in the agents' opinions and in the influence levels (control variables) of the players. Each player has his "desired" opinion, and what concerns the players is whether they can optimise their performance by minimizing the associated costs. Specifically, each player's objective is to drive the agents' opinions close to his "desired" opinion, measured in quadratic form, which is close to the setting of (Krawczyk and Tidball, 2006).
There are two distinct directions of investigation in game theory, namely cooperation and non-cooperation. The popular non-cooperative solution concept of Stackelberg equilibrium is examined in this paper. In Stackelberg competition, the follower moves after the leader, and backward induction is adopted to solve the model. The follower can always react optimally after observing the strategy taken by the leader; the leader, on the other hand, should anticipate the predicted best response of the follower and then develop a strategy minimizing his own cost.
Fundamentally, open-loop and feedback configurations are designed to specify different requirements of a control system. Under the open-loop information structure, players make decisions independently of the process state of the system, a fitting case being the reduction of component count and complexity. The term feedback implies that players have some knowledge of the state of the system; its primary advantage is the ability to correct for outside disturbances. This paper assumes that players use only the knowledge of the initial state and the current stage to determine their open-loop strategies, while feedback strategies depend on both the current stage and the current state of the system. The technique for solving the open-loop optimal control problem is Pontryagin's maximum (or minimum) principle, which we express as a system of algebraic equations. Applying dynamic programming theory, the equilibrium under the feedback information structure is obtained by solving a system of recurrence relations. With similar techniques, linear-quadratic differential games are studied in (Wang et al., 2019).
As an illustration and comparison, the symmetric theoretical social network constructed in (Sedakov and Zhen, 2019) is examined, and the results of comparing two non-cooperative solutions, the Nash and the Stackelberg equilibrium, are presented. Indeed, sociological research usually takes a long time, being based on tracking observations of communities. This paper establishes the opinion dynamics game of the Zachary karate club network (Zachary, 1977) lasting 36 months. The administrator and the instructor hold quite different positions on the price of karate lessons, and these two players perform their own actions to achieve their objectives. The comparison is conducted with the administrator and the instructor playing the role of the leader, respectively.
The rest of the paper is organized as follows. In Section 2, the opinion dynamics model in a social network is described as a two-person discrete-time linear-quadratic game. The sequential actions of players described by the Stackelberg equilibrium are explored in Section 3 under both the open-loop and the feedback information structures. Finally, numerical examples illustrate the theoretical findings in Section 4.
2. The opinion dynamics game model
The model of the opinion dynamics game considered in this paper was proposed in (Sedakov and Zhen, 2019). We start with an opinion dynamics game in a social network defined in finite discrete time. Let the set of stages be T = {0, 1, …, T}. Denote the social communication structure in standard terminology by (V, E), where V is a finite set of nodes in the social network and E is a set of edges between the nodes. Individuals in the social network are located in the nodes and communicate with each other through the edges between pairs of nodes. There are two types of participants in the opinion dynamics game, for instance, sellers and consumers in a business relationship network. Thus, we suppose that the set of nodes can be decomposed as V = A ∪ N, A ∩ N = ∅. Each individual in A is called an agent, while each individual in N is called a player or influence node. Therefore, A is the agent set and N is the player set of the network.
First, we illustrate the role of agents in the opinion dynamics game. Suppose there is a subject in the network, such as a new product; while learning more about the new product, each consumer may hold a distinct view of it. In the opinion dynamics game we suppose that each agent i ∈ A in the network has his own opinion on this specific "subject", which can change over time. To measure and display the alterations of agents' opinions, we suppose that opinions are numerical values. Let x_{i0} ∈ X ⊂ R be element i of the initial state vector x_0, i.e., the initial opinion of agent i in the network, and let x_i(t) ∈ X be his opinion at stage t = 1, …, T. As an illustration, let X = [0, 1] in the business relationship network: x_i(t) = 0 implies consumer i has no willingness to buy the new product, and the bigger x_i(t) ∈ X, the more willing consumer i is to buy. Let the state vector x(t) = (x_1(t), …, x_{|A|}(t))' be the opinion profile of agents at stage t = 1, …, T, and denote by x_0 = (x_{10}, …, x_{|A|0})' the initial opinions.
Then we discuss how players affect agents' opinions in the opinion dynamics game. Returning to our illustration, sellers of new products usually run diversified marketing promotion programs so that more consumers know about their new products, and know them better. For simplicity, we assume that there are two influence nodes in the network, i.e., N = {1, 2}. Denote by u_i(t) ∈ U ⊂ R the influence level (or control) of player i ∈ N on the network agents, selected at stage t = 0, …, T − 1. Again as an illustration, consider U = [0, 1] in the business relationship network: u_i(t) = 0 implies seller i makes no effort to influence consumers' opinions on his new product, while u_i(t) ∈ U can be considered as the investment of seller i at stage t = 0, …, T − 1 out of his total assets of 1. Thus, the larger the investment, the greater the impact on consumers.
2.1. Opinion dynamics
A particular consumer forms a new opinion on the product by acquiring information from sellers and from other consumers, as well as from his own experience. In the game, we assume that each agent evaluates his opinion at any stage by aggregating the opinions of the other agents in the network as well as the influence efforts of the players. The opinion of agent i ∈ A evolves according to the following system:
x_i(t+1) = Σ_{j∈A} w_{ij} x_j(t) + b_{i1} u_1(t) + b_{i2} u_2(t),  t = 0, …, T − 1,   (1)

with x_i(0) = x_{i0}. In the transition function (1) from t to t+1, w_{ij} ∈ [0, 1] and b_{ij} ∈ [0, 1] are the levels of trust of agent i ∈ A in the opinion of agent j ∈ A and of player j ∈ N, respectively. It is not necessary that w_{ij} = w_{ji}. Additionally, we assume that Σ_{j∈A} w_{ij} + Σ_{j∈N} b_{ij} = 1 for any agent i ∈ A. The opinions of players are considered to remain constant over time and hence are not included in the model. Let W = {w_{ij}}_{i,j∈A} and b_i = (b_{1i}, …, b_{|A|i})', i ∈ N. Then the opinion dynamics of agents in the network is given by:
x(t + 1) = Wx(t) + b1u1(t) + b2u2(t), t = 0,..., T — 1, x(0) = x0, (2)
where the mapping W x(t) + b_1 u_1(t) + b_2 u_2(t) is continuously differentiable on X^{|A|} and convex on X^{|A|} × U^T.
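As a quick numerical sanity check, the update (2) can be sketched as follows; the 3-agent network, the trust matrix W and the vectors b_1, b_2 are illustrative values (not taken from the paper), chosen so that each row of [W | b_1 | b_2] sums to one, as assumed in the model.

```python
import numpy as np

# Toy sketch of the opinion dynamics (2); W, b1, b2 are illustrative values
# chosen so that each row of [W | b1 | b2] sums to 1, as the model assumes.
W = np.array([[0.5, 0.2, 0.1],
              [0.2, 0.5, 0.1],
              [0.1, 0.2, 0.5]])
b1 = np.array([0.1, 0.1, 0.1])   # trust of each agent in player 1
b2 = np.array([0.1, 0.1, 0.1])   # trust of each agent in player 2

def step(x, u1, u2):
    """One transition x(t+1) = W x(t) + b1 u1(t) + b2 u2(t)."""
    return W @ x + b1 * u1 + b2 * u2

x = np.array([1.0, 0.5, 0.0])    # initial opinions x0 in X = [0, 1]
for t in range(5):               # constant influence levels, for the demo only
    x = step(x, u1=0.4, u2=0.6)
```

Because each row sums to one and the controls lie in U = [0, 1], every updated opinion is a convex combination of numbers in [0, 1] and therefore stays in [0, 1], consistent with the interpretation of X.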
Evidently, each consumer in the considered business relationship network has two different types of channels for integrating information: one channel is between consumers, the other between consumer and seller. In the opinion dynamics game we correspondingly decompose the set of edges E into two disjoint sets E_A and E_N, i.e., E = E_A ∪ E_N, E_A ∩ E_N = ∅, in which E_A describes all connections between agents and E_N describes all connections between pairs "player–agent". Furthermore, the matrix W and the vectors b_1, b_2 are consistent with the edge set E:

(j, i) ∈ E_A if and only if w_{ij} > 0;  (j, i) ∈ E_N if and only if b_{ij} > 0.
2.2. Criteria
Now we characterize the criteria of players in the opinion dynamics game. What are the criteria of sellers in the considered business relationship network? Fixing a desired willingness to buy for consumers and starting from the consumers' initial willingness to buy the product, sellers choose investments at each stage. In the business process, the costs of sellers come from two aspects: (i) how far the consumers' willingness is from the desired willingness; (ii) how much the related investment costs. The principle of sellers is to reduce total costs and thus increase their net income. Let u_i = (u_i(0), …, u_i(T−1)) ∈ U^T be an admissible profile of influence levels over T stages (or a strategy) selected by player i ∈ N, where U is a closed and bounded (thereby compact) subset of R. Taking into account the opinion dynamics (2), the total cost of player i ∈ N is given by:
J_i(u_1, u_2) = Σ_{t=0}^{T−1} ( Σ_{j∈A} (x_j(t) − x̄_i)² + c_i u_i²(t) ) + Σ_{j∈A} (x_j(T) − x̄_i)²,
where x̄_i ∈ X is the given desired opinion of player i, towards which he tries to drive the opinions of all agents in the network by selecting his strategy u_i, and c_i > 0 measures the effort of this player associated with the selection of u_i. The profile of states (x_0, x(1), …, x(T)) satisfying the opinion dynamics (2) is called a state trajectory starting from the initial state; it corresponds to the strategy profile (u_1, u_2) minimizing the payoff functions.
This model is a two-person non-cooperative discrete-time linear-quadratic game. The payoff function of player i ∈ N can be rewritten in a standard form for this class of games:

J_i(u_1, u_2) = Σ_{t=0}^{T−1} ( ½ x(t)'Q_i x(t) + ½ R_i u_i²(t) + q_i' x(t) ) + ½ x(T)'Q_i x(T) + q_i' x(T) + z_i,   (3)

where 1 denotes a vector of ones of size |A|, Q_i = 2I with I an identity matrix of size |A|, R_i = 2c_i, q_i = −2x̄_i 1, and z_i = |A|(T+1)x̄_i² for i ∈ N. The considered payoff functions have the following properties:
(i) J_i is continuous on U^T × U^T, i ∈ N;
(ii) J_i is strictly convex in u_i on U^T, i ∈ N;
(iii) the transition reward of player i ∈ N, i.e., ½ x(t)'Q_i x(t) + ½ R_i u_i²(t) + q_i' x(t), is continuously differentiable on X^{|A|}.
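The equivalence between the raw cost of player i and the standard LQ form (3) can be verified numerically; the trajectory and controls below are arbitrary illustrative numbers, not data from the paper.

```python
import numpy as np

# Check that the raw cost of player i and the standard LQ form (3), with
# Q_i = 2I, R_i = 2c_i, q_i = -2*xbar_i*1 and z_i = |A|(T+1)*xbar_i^2,
# give the same number on an arbitrary trajectory.
rng = np.random.default_rng(0)
nA, T = 3, 4                      # |A| agents, T stages
xbar, ci = 0.5, 0.3               # player i's desired opinion and cost weight
xs = rng.random((T + 1, nA))      # states x(0), ..., x(T)
us = rng.random(T)                # player i's controls u_i(0), ..., u_i(T-1)

raw = sum(((xs[t] - xbar) ** 2).sum() + ci * us[t] ** 2 for t in range(T))
raw += ((xs[T] - xbar) ** 2).sum()

Qi, Ri = 2 * np.eye(nA), 2 * ci
qi, zi = -2 * xbar * np.ones(nA), nA * (T + 1) * xbar ** 2
lq = sum(0.5 * xs[t] @ Qi @ xs[t] + 0.5 * Ri * us[t] ** 2 + qi @ xs[t]
         for t in range(T))
lq += 0.5 * xs[T] @ Qi @ xs[T] + qi @ xs[T] + zi

assert np.isclose(raw, lq)
```

The constant z_i collects the |A| x̄_i² terms produced at each of the T+1 stages when the squares (x_j(t) − x̄_i)² are expanded.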
3. Stackelberg equilibrium
In this section, we investigate the solution when the players move sequentially. We focus on the Stackelberg equilibrium, assuming that player 1 moves first and player 2 moves after him. Moreover, the equilibrium concepts vary under different information structures: how strategies are designed depends on what information is available to the players. Let us briefly overview the classification of information in dynamic games.
(i) Open-loop information structure: at each stage t, the strategy of player i ∈ N is a mapping that depends on the stage t and the initial state x_0. Formally, u_i(t) = u_i^s(t, x_0) ∈ U, where u_i^s(·, x_0) : {0, …, T − 1} → U.

(ii) Feedback information structure, also known as Markovian information structure: at each stage t, the strategy of player i ∈ N depends on the stage t and the current state x(t), i.e., u_i(t) = σ_i^f(t, x(t)) ∈ U, where σ_i^f(·, ·) : {0, …, T − 1} × X^{|A|} → U.
Definition 1. A pair (u_1^*, u_2^*) is a Stackelberg equilibrium with player 1 as the leader if

J_1(u_1^*, u_2^*) = min_{u_1 ∈ U^T} max_{u_2 ∈ R_2(u_1)} J_1(u_1, u_2),

where

R_2(u_1) = { u_2 ∈ U^T | J_2(u_1, u_2) = min_{ũ_2 ∈ U^T} J_2(u_1, ũ_2) }

is the set of best responses of the follower to an arbitrary u_1 ∈ U^T.
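Definition 1 can be made concrete on a one-stage, one-agent toy version of the game; all numbers below are hypothetical, and the equilibrium is found by plain enumeration: for every leader control u_1 on a grid, the follower best-responds, and the leader minimizes J_1 against that reaction (backward induction).

```python
import numpy as np

# Toy one-stage, one-agent Stackelberg game: x(1) = w*x0 + b1*u1 + b2*u2,
# J_i = (x(1) - xbar_i)^2 + c_i*u_i^2. Illustrative numbers only.
w, b1, b2 = 0.8, 0.1, 0.1
x0, xbar1, xbar2, c1, c2 = 0.2, 0.5, 0.6, 0.3, 0.4
U = np.linspace(0.0, 1.0, 201)          # discretization of U = [0, 1]

def J1(u1, u2):
    x1 = w * x0 + b1 * u1 + b2 * u2
    return (x1 - xbar1) ** 2 + c1 * u1 ** 2

def J2(u1, u2):
    x1 = w * x0 + b1 * u1 + b2 * u2
    return (x1 - xbar2) ** 2 + c2 * u2 ** 2

def best_response(u1):                   # R2(u1); a singleton on the grid
    return U[np.argmin([J2(u1, u2) for u2 in U])]

# Leader anticipates the follower's reaction and minimizes his own cost.
u1_star = U[np.argmin([J1(u1, best_response(u1)) for u1 in U])]
u2_star = best_response(u1_star)
```

With these parameters both optimal controls are interior points of [0, 1]: the opinion w·x_0 = 0.16 lies below both desired opinions, so each player invests a moderate positive effort.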
The remainder of this section describes the techniques of Pontryagin's maximum principle (a minimum principle in the considered opinion dynamics game) and dynamic programming, used to derive the open-loop and the feedback Stackelberg equilibria, respectively (Haurie et al., 2012).
3.1. Open-loop Stackelberg equilibrium
For each fixed u_1^s ∈ U^T, the follower needs to minimize J_2(u_1^s, u_2^s) subject to the state dynamics (2). For each i ∈ N, consider the transformed criterion

J̄_i(u_1^s, u_2^s) = J_i(u_1^s, u_2^s) − α_i,

where

α_i = ½ x_0'Q_i x_0 + q_i' x_0 + |A|(T + 1) x̄_i².

Apparently, for each i ∈ N,

J̄_i(u_1^s, u_2^s) = Σ_{t=0}^{T−1} ( ½ x(t+1)'Q_i x(t+1) + ½ R_i (u_i^s(t, x_0))² + q_i' x(t+1) ),   (4)

and the criteria (3) and (4) reach their minima at the same profile (u_1^{s*}, u_2^{s*}). Investigating the model with the transformed criteria (4), the following system of equations can be derived (Başar and Olsder, 1999).
Theorem 1. If (u_1^{s*}, u_2^{s*}) is an open-loop Stackelberg equilibrium, then there exist finite vector sequences γ(1), …, γ(T), λ_1(0), …, λ_1(T−1), λ_2(0), …, λ_2(T−1) and scalars λ_3(0), …, λ_3(T−1) that satisfy the following relations for t = 0, …, T−1:

x^{s*}(t+1) = W x^{s*}(t) + b_1 u_1^{s*}(t, x_0) + b_2 u_2^{s*}(t, x_0),   (5)

∂H_1[x(t), u_1^s(t, x_0), u_2^s(t, x_0), γ(t+1), λ_1(t), λ_2(t), λ_3(t)] / ∂u_1^s(t, x_0)
= b_1'Q_1 x^{s*}(t+1) + R_1 u_1^{s*}(t, x_0) + b_1'[q_1 + λ_1(t)] + b_1'Q_2[W λ_2(t) + b_2 λ_3(t)] = 0,   (6)

∂H_1[x(t), u_1^s(t, x_0), u_2^s(t, x_0), γ(t+1), λ_1(t), λ_2(t), λ_3(t)] / ∂u_2^s(t, x_0)
= b_2'Q_1 x^{s*}(t+1) + b_2'[q_1 + λ_1(t)] + b_2'Q_2[W λ_2(t) + b_2 λ_3(t)] + R_2 λ_3(t) = 0,   (7)

λ_1(t−1) = ∇_{x(t)} H_1[x(t), u_1^s(t, x_0), u_2^s(t, x_0), γ(t+1), λ_1(t), λ_2(t), λ_3(t)]
= W'[Q_1 x^{s*}(t+1) + q_1 + λ_1(t) + Q_2 W λ_2(t) + Q_2 b_2 λ_3(t)],   (8)

λ_2(t+1) = ∂H_1[x(t), u_1^s(t, x_0), u_2^s(t, x_0), γ(t+1), λ_1(t), λ_2(t), λ_3(t)] / ∂γ(t+1)
= W λ_2(t) + b_2 λ_3(t),   (9)

∂H_2[x(t), u_1^s(t, x_0), u_2^s(t, x_0), γ(t+1)] / ∂u_2^s(t, x_0)
= b_2'Q_2 x^{s*}(t+1) + R_2 u_2^{s*}(t, x_0) + b_2'[q_2 + γ(t+1)] = 0,   (10)

γ(t) = W'[Q_2 x^{s*}(t+1) + q_2 + γ(t+1)],   (11)

with the boundary conditions λ_1(T−1) = 0, λ_2(0) = 0, γ(T) = 0, where

H_1[x(t), u_1^s(t, x_0), u_2^s(t, x_0), γ(t+1), λ_1(t), λ_2(t), λ_3(t)] = ½ x(t+1)'Q_1 x(t+1) + ½ R_1 (u_1^s(t, x_0))² + q_1'x(t+1) + λ_1(t)'x(t+1) + λ_2(t)'W'[Q_2 x(t+1) + q_2 + γ(t+1)] + λ_3(t)[b_2'(Q_2 x(t+1) + q_2 + γ(t+1)) + R_2 u_2^s(t, x_0)],

H_2[x(t), u_1^s(t, x_0), u_2^s(t, x_0), γ(t+1)] = ½ x(t+1)'Q_2 x(t+1) + ½ R_2 (u_2^s(t, x_0))² + q_2'x(t+1) + γ(t+1)'x(t+1).

To derive the open-loop Stackelberg equilibrium we need to construct a system of linear equations Ay = B, where y = (u_1^s(0, x_0), …, u_1^s(T−1, x_0), u_2^s(0, x_0), …, u_2^s(T−1, x_0), x_0', x(1)', …, x(T)', γ(1)', …, γ(T)', λ_1(0)', …, λ_1(T−1)', λ_2(0)', …, λ_2(T−1)', λ_3(0), …, λ_3(T−1))' ∈ R^{(4|A|+3)T+|A|}.
As the considered game is a linear-quadratic game, the following statement can be obtained (Başar and Olsder, 1999).

Theorem 2. If the inverse matrix

[I + R_2^{−1} b_2 b_2' P_1(t+1) + R_1^{−1} b_1 b_1' (I + Q_2 b_2 R_2^{−1} b_2')^{−1} P_2(t+1)]^{−1}

exists, then there exists a unique open-loop Stackelberg equilibrium with player 1 acting as the leader, which is given by:

u_1^{s*}(t, x_0) = K_1(t) x^{s*}(t) + L_1(t),
u_2^{s*}(t, x_0) = K_2(t) x^{s*}(t) + L_2(t),

with the unique state trajectory associated with this pair of strategies

x^{s*}(t+1) = Φ(t) x^{s*}(t) + φ(t),  x^{s*}(0) = x_0,

where K_i(t), L_i(t), i ∈ N, Φ(t), φ(t) are as follows:

K_1(t) = −R_1^{−1} b_1' [(I + Q_2 b_2 R_2^{−1} b_2')^{−1} P_2(t+1) Φ(t) + Q_2 (I + b_2 R_2^{−1} b_2' Q_2)^{−1} W P_3(t)],   (12)

L_1(t) = −R_1^{−1} b_1' {(I + Q_2 b_2 R_2^{−1} b_2')^{−1} [P_2(t+1) φ(t) + p_2(t+1) + q_1] + Q_2 (I + b_2 R_2^{−1} b_2' Q_2)^{−1} W p_3(t)},   (13)

K_2(t) = −R_2^{−1} b_2' P_1(t+1) Φ(t),   (14)

L_2(t) = −R_2^{−1} b_2' [P_1(t+1) φ(t) + p_1(t+1) + q_2],   (15)

Φ(t) = [I + R_2^{−1} b_2 b_2' P_1(t+1) + R_1^{−1} b_1 b_1' (I + Q_2 b_2 R_2^{−1} b_2')^{−1} P_2(t+1)]^{−1} × [W − R_1^{−1} b_1 b_1' Q_2 (I + b_2 R_2^{−1} b_2' Q_2)^{−1} W P_3(t)],   (16)

φ(t) = [I + R_2^{−1} b_2 b_2' P_1(t+1) + R_1^{−1} b_1 b_1' (I + Q_2 b_2 R_2^{−1} b_2')^{−1} P_2(t+1)]^{−1} × {−R_1^{−1} b_1 b_1' [(I + Q_2 b_2 R_2^{−1} b_2')^{−1} (p_2(t+1) + q_1) + Q_2 (I + b_2 R_2^{−1} b_2' Q_2)^{−1} W p_3(t)] − R_2^{−1} b_2 b_2' (p_1(t+1) + q_2)}.   (17)

Proof. Rearranging equation (7):

λ_3(t) = −(R_2 + b_2' Q_2 b_2)^{−1} b_2' [Q_1 x(t+1) + q_1 + λ_1(t) + Q_2 W λ_2(t)].   (18)

Plugging (18) into (6):

u_1^s(t, x_0) = −R_1^{−1} b_1' {[Q_1 − Q_2 b_2 (R_2 + b_2' Q_2 b_2)^{−1} b_2' Q_1] x(t+1) + [I − Q_2 b_2 (R_2 + b_2' Q_2 b_2)^{−1} b_2'] [q_1 + λ_1(t)] + Q_2 [I − b_2 (R_2 + b_2' Q_2 b_2)^{−1} b_2' Q_2] W λ_2(t)}.

Plugging (18) into (8):

λ_1(t−1) = W' {[Q_1 − Q_2 b_2 (R_2 + b_2' Q_2 b_2)^{−1} b_2' Q_1] x(t+1) + [I − Q_2 b_2 (R_2 + b_2' Q_2 b_2)^{−1} b_2'] [q_1 + λ_1(t)] + Q_2 [I − b_2 (R_2 + b_2' Q_2 b_2)^{−1} b_2' Q_2] W λ_2(t)},  λ_1(T−1) = 0.

Plugging (18) into (9):

λ_2(t+1) = W λ_2(t) − b_2 (R_2 + b_2' Q_2 b_2)^{−1} b_2' [Q_1 x(t+1) + q_1 + λ_1(t) + Q_2 W λ_2(t)],  λ_2(0) = 0.

Assume that:

γ(t+1) = [P_1(t+1) − Q_2] x(t+1) + p_1(t+1),   (19)
λ_1(t) = [P_2(t+1) − Q_1] x(t+1) + p_2(t+1),   (20)
λ_2(t) = P_3(t) x(t) + p_3(t).   (21)

Plugging (19) into (10):

u_2^s(t, x_0) = −R_2^{−1} b_2' [P_1(t+1) x(t+1) + p_1(t+1) + q_2].

Following the Woodbury matrix identity

(A + UCV)^{−1} = A^{−1} − A^{−1} U (C^{−1} + V A^{−1} U)^{−1} V A^{−1},

the strategy of the leader can be expressed as:

u_1^s(t, x_0) = −R_1^{−1} b_1' {(I + Q_2 b_2 R_2^{−1} b_2')^{−1} [P_2(t+1) x(t+1) + p_2(t+1) + q_1] + Q_2 (I + b_2 R_2^{−1} b_2' Q_2)^{−1} W [P_3(t) x(t) + p_3(t)]}.

If the inverse matrix

[I + R_2^{−1} b_2 b_2' P_1(t+1) + R_1^{−1} b_1 b_1' (I + Q_2 b_2 R_2^{−1} b_2')^{−1} P_2(t+1)]^{−1}

exists, then the state equation takes the following form:

x(t+1) = [I + R_2^{−1} b_2 b_2' P_1(t+1) + R_1^{−1} b_1 b_1' (I + Q_2 b_2 R_2^{−1} b_2')^{−1} P_2(t+1)]^{−1} {[W − R_1^{−1} b_1 b_1' Q_2 (I + b_2 R_2^{−1} b_2' Q_2)^{−1} W P_3(t)] x(t) − R_1^{−1} b_1 b_1' [(I + Q_2 b_2 R_2^{−1} b_2')^{−1} (p_2(t+1) + q_1) + Q_2 (I + b_2 R_2^{−1} b_2' Q_2)^{−1} W p_3(t)] − R_2^{−1} b_2 b_2' (p_1(t+1) + q_2)}.

Denote

x(t+1) = Φ(t) x(t) + φ(t),   (22)

where Φ(t), φ(t) are presented in equations (16), (17). Then denote

u_1^s(t, x_0) = K_1(t) x(t) + L_1(t),  u_2^s(t, x_0) = K_2(t) x(t) + L_2(t),

where K_i(t), L_i(t), i ∈ N are presented in equations (12)–(15). Plugging (19), (22) into (11):

[P_1(t) − Q_2] x(t) + p_1(t) = W' Q_2 [Φ(t) x(t) + φ(t)] + W' [p_1(t+1) + q_2] + W' [P_1(t+1) − Q_2] [Φ(t) x(t) + φ(t)].

Then the following system can be obtained:

P_1(t) − Q_2 = W' P_1(t+1) Φ(t),
p_1(t) = W' [P_1(t+1) φ(t) + p_1(t+1) + q_2],
P_1(T) = Q_2,  p_1(T) = 0.

Equation (18) becomes:

λ_3(t) = −(R_2 + b_2' Q_2 b_2)^{−1} b_2' {[P_2(t+1) Φ(t) + Q_2 W P_3(t)] x(t) + q_1 + P_2(t+1) φ(t) + p_2(t+1) + Q_2 W p_3(t)}.   (23)

Denote

λ_3(t) = M(t) x(t) + N(t),

where

M(t) = −(R_2 + b_2' Q_2 b_2)^{−1} b_2' [P_2(t+1) Φ(t) + Q_2 W P_3(t)],
N(t) = −(R_2 + b_2' Q_2 b_2)^{−1} b_2' [P_2(t+1) φ(t) + p_2(t+1) + q_1 + Q_2 W p_3(t)].

Plugging (20), (21), (22), (23) into (8):

[P_2(t) − Q_1] x(t) + p_2(t) = W' {Q_1 [Φ(t) x(t) + φ(t)] + q_1 + [P_2(t+1) − Q_1] [Φ(t) x(t) + φ(t)] + p_2(t+1) + Q_2 W [P_3(t) x(t) + p_3(t)] + Q_2 b_2 [M(t) x(t) + N(t)]}.

Then the second system can be obtained:

P_2(t) − Q_1 = W' [P_2(t+1) Φ(t) + Q_2 W P_3(t) + Q_2 b_2 M(t)],
p_2(t) = W' [P_2(t+1) φ(t) + p_2(t+1) + q_1 + Q_2 W p_3(t) + Q_2 b_2 N(t)],
P_2(T) = Q_1,  p_2(T) = 0.

Plugging (21), (23) into (9):

P_3(t+1) [Φ(t) x(t) + φ(t)] + p_3(t+1) = W [P_3(t) x(t) + p_3(t)] + b_2 [M(t) x(t) + N(t)].

Finally, the last system takes the form:

P_3(t+1) Φ(t) = W P_3(t) + b_2 M(t),
P_3(t+1) φ(t) + p_3(t+1) = W p_3(t) + b_2 N(t),
P_3(0) = 0,  p_3(0) = 0.  □
3.2. Feedback Stackelberg equilibrium

The value function of player i ∈ N at stage t = 0, 1, …, T−1 is defined as:

V_i^f(t, x(t)) = Σ_{τ=t}^{T−1} ( ½ x^{f*}(τ)'Q_i x^{f*}(τ) + ½ R_i (σ_i^{f*}(τ, x^{f*}(τ)))² + q_i' x^{f*}(τ) ) + ½ x^{f*}(T)'Q_i x^{f*}(T) + q_i' x^{f*}(T),

V_i^f(T, x(T)) = ½ x^{f*}(T)'Q_i x^{f*}(T) + q_i' x^{f*}(T),

where (σ_1^{f*}, σ_2^{f*}) is the feedback Stackelberg equilibrium and (x^{f*}(t), x^{f*}(t+1), …, x^{f*}(T)) is the corresponding equilibrium state trajectory.

Reconsidering the criteria (3) and applying the procedure of dynamic programming, the feedback Stackelberg equilibrium is obtained as follows (Haurie et al., 2012).

Theorem 3. For the proposed opinion dynamics game, a pair of strategies (σ_1^{f*}, σ_2^{f*}) of the form

σ_i^{f*}(t, x) = −p_i^f(t)' x(t) + r_i^f(t),  i ∈ N,

constitutes a feedback Stackelberg equilibrium if and only if there exist functions V_i^f(t, ·) : R^{|A|} → R such that:

V_i^f(t, x) = ½ x(t)'S_i^f(t) x(t) + h_i^f(t)' x(t) + s_i^f(t),  i ∈ N, t ∈ T,

where the matrices S_i^f(t), vectors p_i^f(t), h_i^f(t) and numbers r_i^f(t), s_i^f(t) satisfy:

p_1^f(t) {R_1 + b_1' [I + S_2^f(t+1) b_2 R_2^{−1} b_2']^{−1} S_1^f(t+1) [I + b_2 R_2^{−1} b_2' S_2^f(t+1)]^{−1} b_1} = W' [I + S_2^f(t+1) b_2 R_2^{−1} b_2']^{−1} S_1^f(t+1) [I + b_2 R_2^{−1} b_2' S_2^f(t+1)]^{−1} b_1,

p_2^f(t) [R_2 + b_2' S_2^f(t+1) b_2] + p_1^f(t) b_1' S_2^f(t+1) b_2 = W' S_2^f(t+1) b_2,

r_1^f(t) {R_1 + b_1' [I + S_2^f(t+1) b_2 R_2^{−1} b_2']^{−1} S_1^f(t+1) [I + b_2 R_2^{−1} b_2' S_2^f(t+1)]^{−1} b_1} = −{h_1^f(t+1)' − h_2^f(t+1)' b_2 (R_2 + b_2' S_2^f(t+1) b_2)^{−1} b_2' S_1^f(t+1)} [I + b_2 R_2^{−1} b_2' S_2^f(t+1)]^{−1} b_1,

r_2^f(t) [R_2 + b_2' S_2^f(t+1) b_2] + r_1^f(t) b_1' S_2^f(t+1) b_2 = −h_2^f(t+1)' b_2,

S_i^f(t) = Q_i + R_i p_i^f(t) p_i^f(t)' + [W' − Σ_{j∈N} p_j^f(t) b_j'] S_i^f(t+1) [W − Σ_{j∈N} b_j p_j^f(t)'],

h_i^f(t) = −r_i^f(t) R_i p_i^f(t) + q_i + [W' − Σ_{j∈N} p_j^f(t) b_j'] S_i^f(t+1) Σ_{j∈N} b_j r_j^f(t) + [W' − Σ_{j∈N} p_j^f(t) b_j'] h_i^f(t+1),

s_i^f(t) = ½ R_i [r_i^f(t)]² + ½ (Σ_{j∈N} b_j r_j^f(t))' S_i^f(t+1) (Σ_{j∈N} b_j r_j^f(t)) + h_i^f(t+1)' Σ_{j∈N} b_j r_j^f(t) + s_i^f(t+1),

for t = 0, …, T−1, i ∈ N, with the boundary conditions S_i^f(T) = Q_i, h_i^f(T) = q_i, s_i^f(T) = 0.
Proof. The follower considers the following optimization problem:

V_2^f(t, x) = min_{σ_2(t,x)∈U} {½ x(t)'Q_2 x(t) + ½ R_2 (σ_2(t,x))² + q_2' x(t) + ½ [W x(t) + b_1 σ_1(t,x) + b_2 σ_2(t,x)]' S_2^f(t+1) [W x(t) + b_1 σ_1(t,x) + b_2 σ_2(t,x)] + h_2^f(t+1)' [W x(t) + b_1 σ_1(t,x) + b_2 σ_2(t,x)] + s_2^f(t+1)}.

Minimizing the expression in the braces with respect to σ_2(t,x), we have:

R_2 σ_2(t,x) + b_2' S_2^f(t+1) [W x(t) + b_1 σ_1(t,x) + b_2 σ_2(t,x)] + b_2' h_2^f(t+1) = 0.   (24)

Thus:

σ_2(t,x) = −[R_2 + b_2' S_2^f(t+1) b_2]^{−1} b_2' {S_2^f(t+1) [W x(t) + b_1 σ_1(t,x)] + h_2^f(t+1)}.   (25)

Considering the best response of the follower in the form (25), the leader's value function becomes:

V_1^f(t, x) = min_{σ_1(t,x)∈U} {½ x(t)'Q_1 x(t) + ½ R_1 (σ_1(t,x))² + q_1' x(t) + ½ x⁺(t)' S_1^f(t+1) x⁺(t) + h_1^f(t+1)' x⁺(t) + s_1^f(t+1)},

where x⁺(t) denotes the state reached under the follower's best response:

x⁺(t) = W x(t) + b_1 σ_1(t,x) − b_2 (R_2 + b_2' S_2^f(t+1) b_2)^{−1} [b_2' S_2^f(t+1) (W x(t) + b_1 σ_1(t,x)) + b_2' h_2^f(t+1)].

Similarly, minimizing the expression in the braces with respect to σ_1(t,x):

R_1 σ_1(t,x) + {[I − b_2 (R_2 + b_2' S_2^f(t+1) b_2)^{−1} b_2' S_2^f(t+1)] b_1}' [S_1^f(t+1) x⁺(t) + h_1^f(t+1)] = 0.   (26)

Following the Woodbury matrix identity, we obtain:

I − b_2 [R_2 + b_2' S_2^f(t+1) b_2]^{−1} b_2' S_2^f(t+1) = [I + b_2 R_2^{−1} b_2' S_2^f(t+1)]^{−1}.

Further, the value function satisfies:

½ x(t)'S_i^f(t) x(t) + h_i^f(t)' x(t) + s_i^f(t) = ½ x(t)'Q_i x(t) + ½ R_i (σ_i(t,x))² + q_i' x(t) + ½ [W x(t) + b_1 σ_1(t,x) + b_2 σ_2(t,x)]' S_i^f(t+1) [W x(t) + b_1 σ_1(t,x) + b_2 σ_2(t,x)] + h_i^f(t+1)' [W x(t) + b_1 σ_1(t,x) + b_2 σ_2(t,x)] + s_i^f(t+1).   (27)

Plugging the assumed linear strategies into equations (24), (26) and (27) and equating the coefficients on both sides of each equation, the relations for S_i^f(t), p_i^f(t), h_i^f(t), r_i^f(t) and s_i^f(t) stated in Theorem 3 are derived.  □
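The structure of the recursion above (follower first-order condition, leader optimization against the reaction, quadratic value functions) can be sketched numerically in the scalar case |A| = 1, i.e., a single agent. The sketch below does not reproduce the matrix algebra of Theorem 3; it solves the same stagewise problem directly, rebuilding each quadratic value function V_i(t, x) = A_i x² + B_i x + C_i by fitting a parabola through three points. All parameter values are illustrative, not taken from the paper.

```python
import math

# Scalar feedback Stackelberg recursion, |A| = 1, with the LQ data
# Q_i = 2, R_i = 2c_i, q_i = -2*xbar_i from (3). Illustrative numbers.
w, b1, b2 = 0.8, 0.1, 0.1          # trust weights, w + b1 + b2 = 1
xbar = {1: 0.5, 2: 0.6}
R = {1: 0.6, 2: 0.8}               # R_i = 2 c_i
Q = {1: 2.0, 2: 2.0}
q = {i: -2 * xbar[i] for i in (1, 2)}
T = 12

def stage_laws(V1, V2):
    """One backward step: stagewise Stackelberg control laws u_i(x)."""
    A1, B1, _ = V1                 # V_i(t+1, x) = A_i x^2 + B_i x + C_i
    A2, B2, _ = V2
    D = R[2] + 2 * A2 * b2 ** 2    # follower FOC denominator
    a, e, g = R[2] * w / D, R[2] * b1 / D, -B2 * b2 ** 2 / D  # x+ = a x + e u1 + g
    p1 = -2 * A1 * a * e / (R[1] + 2 * A1 * e ** 2)
    r1 = -e * (2 * A1 * g + B1) / (R[1] + 2 * A1 * e ** 2)
    u1 = lambda x: p1 * x + r1
    u2 = lambda x: -(2 * A2 * b2 * (w * x + b1 * u1(x)) + B2 * b2) / D
    return u1, u2

def fit_quadratic(f):
    """Recover (A, B, C) of an exactly quadratic f(x) = A x^2 + B x + C."""
    c0 = f(0.0)
    return ((f(1.0) + f(-1.0)) / 2 - c0, (f(1.0) - f(-1.0)) / 2, c0)

def cost_to_go(i, u1, u2, Vnext):
    """Stage cost plus continuation value along the stage laws, as (A, B, C)."""
    Ai, Bi, Ci = Vnext
    ui = u1 if i == 1 else u2
    def val(x):
        xp = w * x + b1 * u1(x) + b2 * u2(x)
        return (0.5 * Q[i] * x ** 2 + q[i] * x + 0.5 * R[i] * ui(x) ** 2
                + Ai * xp ** 2 + Bi * xp + Ci)
    return fit_quadratic(val)

V = {i: (0.5 * Q[i], q[i], 0.0) for i in (1, 2)}      # terminal V_i(T, x)
laws = []
for _ in range(T):                                    # t = T-1, ..., 0
    u1, u2 = stage_laws(V[1], V[2])
    V = {i: cost_to_go(i, u1, u2, V[i]) for i in (1, 2)}
    laws.append((u1, u2))
laws.reverse()

x = 1.0                                               # forward simulation
for u1, u2 in laws:
    x = w * x + b1 * u1(x) + b2 * u2(x)
```

Because every ingredient of the stage cost is quadratic in x under linear control laws, the three-point fit is exact, and the quadratic coefficients of both value functions remain positive throughout the backward pass.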
4. Numerical simulation

4.1. Example 1
First, let us revisit the symmetric opinion dynamics network game examined in (Sedakov and Zhen, 2019). The network is composed of A = {1, …, 10} and N = {Pl. 1, Pl. 2} with a symmetric connection structure: each agent has the same degree of three, and each player influences five of the agents. The game lasts twelve periods, so T = 12. Players employ influence levels δ_1, δ_2 ∈ (0, 1) on each connected agent; the influences demonstrated in the matrix W and the vectors b_1, b_2 are divided equally among all connections. We use the same parameter values as in (Sedakov and Zhen, 2019): the desired opinions of the players are x̄_1 = 0.5, x̄_2 = 0.6, the influence costs are c_1 = 0.3, c_2 = 0.4, the influence levels in the low and high scenarios are δ_1 = 0.1, δ_2 = 0.05 and δ_1 = 0.4, δ_2 = 0.35, respectively, and the initial opinion profile of agents is x_0 = (1, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1)'. Hereafter, we consider six equilibria:
(i) OLSEL implies open-loop Stackelberg equilibrium in low scenario,
(ii) FBSEL implies feedback Stackelberg equilibrium in low scenario,
(iii) FBNEL implies feedback Nash equilibrium in low scenario,
(iv) OLSEH implies open-loop Stackelberg equilibrium in high scenario,
(v) FBSEH implies feedback Stackelberg equilibrium in high scenario,
(vi) FBNEH implies feedback Nash equilibrium in high scenario.
Table 1. Payoffs in different equilibria

         Pl. 1    Pl. 2
OLSEL    1.5502   4.2825
FBSEL    1.8143   3.0788
FBNEL    1.7106   3.7534
OLSEH    1.7760   4.2298
FBSEH    2.6928   2.5586
FBNEH    1.6710   4.3867
Fig. 1. Equilibrium strategies in low scenario

Fig. 2. Initial and terminal opinions under equilibria in low scenario
The equilibrium strategies of players, the equilibrium trajectories and the terminal opinions of agents in both the low and high scenarios are presented in Fig. 1–8. As can be seen from Table 1, player 1 incurs the lowest expense J_1(u_1^{s*}, u_2^{s*}) = 1.5502 when employing the open-loop Stackelberg equilibrium strategy in the low scenario, while player 2 reaches his lowest expense J_2(σ_1^{f*}, σ_2^{f*}) = 2.5586 under the feedback Stackelberg equilibrium strategy in the high scenario. Compared to the Nash equilibrium in the low scenario, player 1 incurs more under the same feedback information structure, whereas player 2 incurs less. Players show the same preferences in the high scenario when compared to the Nash equilibrium. The terminal opinions of agents in the six equilibria are as follows:
x(T)^{OLSEL} = (0.3742, 0.3742, 0.3952, 0.3980, 0.3983, 0.4150, 0.4112, 0.4117, 0.4412, 0.4342)',
x(T)^{FBSEL} = (0.3946, 0.3946, 0.4210, 0.4208, 0.4212, 0.4316, 0.4311, 0.4314, 0.4600, 0.4596)',
x(T)^{FBNEL} = (0.3790, 0.3790, 0.4017, 0.4025, 0.4028, 0.4139, 0.4124, 0.4127, 0.4394, 0.4370)',
x(T)^{OLSEH} = (0.4486, 0.4485, 0.4325, 0.4486, 0.4467, 0.5222, 0.5041, 0.5065, 0.5341, 0.4838)',
x(T)^{FBSEH} = (0.5048, 0.5049, 0.5569, 0.5483, 0.5496, 0.4945, 0.5035, 0.5026, 0.5379, 0.5637)',
x(T)^{FBNEH} = (0.4517, 0.4516, 0.4273, 0.4431, 0.4411, 0.5159, 0.4986, 0.5007, 0.5166, 0.4681)'.
Compared to the Nash equilibrium, in both the low and high scenarios agents hold terminal opinions most similar to it under the open-loop Stackelberg equilibrium. Furthermore, under the feedback Stackelberg influences agents hold higher terminal opinions than under the other two equilibria in the low scenario. Unlike the low scenario, in the high scenario agents 1, 2, 3, 4, 5 and 10 show a large difference in terminal opinions between the feedback Stackelberg and the feedback Nash equilibrium strategies. What is more, agents 6, 7, 8 and 9 hold similar terminal opinions under all three influences.
Fig. 3. Open-loop Stackelberg equilibrium trajectories in low scenario

Fig. 4. Feedback Nash and Stackelberg equilibrium trajectories in low scenario

Fig. 5. Equilibrium strategies in high scenario

Fig. 6. Initial and terminal opinions under equilibria in high scenario

Fig. 7. Open-loop Stackelberg equilibrium trajectories in high scenario

Fig. 8. Feedback Nash and Stackelberg equilibrium trajectories in high scenario
4.2. Example 2: Zachary network
Fig. 9. Zachary karate club network
Now consider a social network from (Zachary, 1977). Zachary observed a university-based karate club for a period of three years, from 1970 to 1972. The relationships among the 34 individuals (i.e., V = {1, …, 34}) interacting in contexts outside karate classes, workouts and club meetings are presented in Fig. 9. The karate club's chief administrator (player 1, i.e., node 34 in Fig. 9) and the instructor (player 2, i.e., node 1 in Fig. 9) had an incipient conflict over the price of karate lessons, so N = {1, 34}. The administrator preferred stable prices, while the instructor wished to raise prices substantially. As time passed, there was a series of increasingly sharp factional confrontations over the price of lessons. Below we explore the behaviours of all members of the club under the proposed framework of the opinion dynamics game.
The following procedure (Avrachenkov et al., 2017) estimates the matrix W and the vectors b_1, b_2. Denote by N_i = {j ∈ V | (i, j) ∈ E} the set of neighbours of player (or agent) i ∈ V, and let d_i = |N_i| be the degree of player (or agent) i ∈ V. Define the following function for i ∈ V:

f_{ij} = 1 − d_i d_j / (2m)  if j ∈ N_i,  and  f_{ij} = 0  if j ∉ N_i,

where m = ½ Σ_{l∈V} d_l is the total number of edges in the network. Based on the function above, assume:

w_{ij} = f_{ij} / Σ_{k∈V} f_{ik},  ∀ i, j ∈ A,    b_{ij} = f_{ij} / Σ_{k∈V} f_{ik},  ∀ i ∈ A, j ∈ N.
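The degree-based estimation above can be sketched on a toy graph. The 4-node network below is a hypothetical stand-in for the Zachary network, and the rule f_ij = 1 − d_i d_j/(2m) for neighbours follows the reconstruction given above; rows are then normalized so that each node's trust weights sum to one.

```python
# Toy sketch of the degree-based weight estimation: f_ij = 1 - d_i*d_j/(2m)
# for j in N_i (0 otherwise), followed by row normalization so that each
# row of the combined trust matrix sums to 1. Hypothetical 4-node graph.
edges = [(0, 1), (1, 2), (2, 3), (1, 3)]
n = 4
adj = [[0] * n for _ in range(n)]
for i, j in edges:
    adj[i][j] = adj[j][i] = 1
deg = [sum(row) for row in adj]        # node degrees d_i
m = sum(deg) / 2                       # total number of edges, m = |E|

f = [[(1 - deg[i] * deg[j] / (2 * m)) if adj[i][j] else 0.0
      for j in range(n)] for i in range(n)]
trust = [[f[i][j] / sum(f[i]) for j in range(n)] for i in range(n)]
```

Note that f is symmetric by construction, while the row-normalized trust matrix generally is not, in line with the remark that w_ij need not equal w_ji.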
Let the desired proportions of changing the price of karate lessons be 0.4 for the administrator and 0.8 for the instructor. Consider both the administrator and the instructor attempting to influence all trainees in the club with the same cost of controls c_1 = c_2 = 40 over a horizon of 36 months, i.e., T = 36. Suppose U = [0, 1].
Fig. 10. Stackelberg equilibrium strategies          Fig. 11. Initial and terminal opinions under Stackelberg equilibria
Fig. 12. Open-loop Stackelberg equilibrium trajectories          Fig. 13. Feedback Stackelberg equilibrium trajectories
Administrator acts as leader. Let the set of states be X = [0, 1]. As the administrator is the leader, x1 = 0.4, x2 = 0.8. Assume all trainees in the club (i.e., nodes 2 to 33 in Fig. 9, so A = {2, …, 33}) have the following initial opinions:
x0 = 0.1 × 1_32 + (x1 × 1_7, …, x1 x2/(x1 + x2), …, x2, x1 × 1_3, …, x2 × 1_2, …, x2, x1, x2 × 1_2, 0, 0, x2 × 1_5, …, x2)',

where 1_k is the column vector of ones of size k.
As we can see in Fig. 10, the administrator has lower controls than the instructor in both open-loop and feedback Stackelberg equilibria. The administrator needs to exert stronger control in the feedback equilibrium than in the open-loop case, while the instructor needs to exert stronger control in the open-loop equilibrium than in the feedback case, except for the last two months. From the numerical results, we find that both players have non-monotonic controls.
The corresponding equilibrium terminal opinions of all trainees in the karate club are as follows (see Fig. 11):
x(T)OLSE = (0.0790, 0.0920, 0.0765, 0.0590, 0.0581, 0.0581, 0.0776, 0.0863, 0.1239, 0.0590, 0.0870, 0.0707, 0.0741, 0.1367, 0.1367, 0.0963, 0.0744, 0.1367, 0.0750, 0.1367, 0.0744, 0.1367, 0.1204, 0.1607, 0.1628, 0.1437, 0.1139, 0.1110, 0.1224, 0.1078, 0.0899, 0.1114)',

x(T)FBSE = (0.0806, 0.0930, 0.0783, 0.0621, 0.0610, 0.0610, 0.0796, 0.0868, 0.1238, 0.0621, 0.0910, 0.0738, 0.0753, 0.1359, 0.1359, 0.1010, 0.0772, 0.1359, 0.0761, 0.1359, 0.0772, 0.1359, 0.1200, 0.1600, 0.1620, 0.1423, 0.1139, 0.1113, 0.1219, 0.1082, 0.0901, 0.1114)'.

The opinions of all trainees following the influences of both administrator and instructor under different information structures are presented in Figs. 12-13. From the equilibrium trajectories we see that, under the same cost of controls, the players' controls deviate only slightly between the two equilibrium concepts, so the equilibrium opinions of the agents are close to each other under the different equilibria. Despite these similarities, the players have preferences over the information structure. The administrator and instructor incur total equilibrium expenses as their payoffs, as shown in Table 2.

Table 2. Payoffs in Stackelberg equilibria

                     Open-loop    Feedback
J1(u*_1s, u*_2s)      73.3216     74.1277
J2(u*_1s, u*_2s)     486.6995    483.4364

Instructor acts as leader. As the instructor is now the leader, x1 = 0.8, x2 = 0.4, and all trainees in the club have the same profile of initial opinions as in the previous case.
The levels of influence of the administrator and the instructor are presented in Fig. 14, and the corresponding equilibrium terminal opinions of all trainees in the karate club are as follows (see Fig. 15):
x(T)OLSE = (0.1035, 0.0968, 0.1073, 0.1260, 0.1226, 0.1226, 0.1119, 0.0750, 0.0874, 0.1260, 0.1735, 0.1338, 0.0843, 0.0764, 0.0764, 0.1951, 0.1309, 0.0764, 0.0805, 0.0764, 0.1309, 0.0764, 0.0725, 0.1194, 0.1151, 0.0689, 0.0806, 0.0828, 0.0694, 0.0828, 0.0708, 0.0773)',

x(T)FBSE = (0.1064, 0.0991, 0.1105, 0.1307, 0.1270, 0.1270, 0.1153, 0.0763, 0.0884, 0.1307, 0.1793, 0.1386, 0.0864, 0.0764, 0.0764, 0.2015, 0.1354, 0.0764, 0.0825, 0.0764, 0.1354, 0.0764, 0.0728, 0.1206, 0.1160, 0.0683, 0.0815, 0.0839, 0.0696, 0.0840, 0.0718, 0.0781)'.
Fig. 14. Stackelberg equilibrium strategies          Fig. 15. Initial and terminal opinions under Stackelberg equilibria
Fig. 16. Open-loop Stackelberg equilibrium trajectories          Fig. 17. Feedback Stackelberg equilibrium trajectories
The opinions of all trainees following the influences of both administrator and instructor under different information structures are presented in Figs. 16-17. The administrator and instructor incur total equilibrium expenses as their payoffs, as shown in Table 3. Apparently, both of them prefer the feedback information structure.
As shown in Tables 2 and 3, the instructor, acting as leader, prefers the feedback Stackelberg equilibrium with a much lower expense of 71.3626, which is not the same information structure as in the previous case. In contrast, the administrator incurs a much higher expense of 484.6106 than in the previous case.

Table 3. Payoffs in Stackelberg equilibria

                     Open-loop    Feedback
J1(u*_1s, u*_2s)     484.6106    485.5769
J2(u*_1s, u*_2s)      72.7073     71.3626
5. Conclusion
This paper investigated a two-person discrete-time opinion dynamics game in a social network. The non-cooperative Stackelberg equilibrium was explored under open-loop and feedback information structures. The theoretical statements were characterized with support from Pontryagin's minimum principle and dynamic programming. For the numerical simulations, a symmetric opinion dynamics network with agents of three types was examined for comparison; moreover, the Zachary karate club network was modeled as an opinion dynamics game. Compared to the Nash equilibrium in the low scenario of the first example, player 1 incurs more under the same open-loop information structure, whereas player 2 incurs less. Furthermore, the players have the same preferences in the high scenario as under the Nash equilibrium. In the Zachary network, the equilibria were obtained under the different leaderships of the administrator and the instructor. It turns out that both administrator and instructor prefer acting as the leader under the feedback information structure.
References
Acemoglu, D. and A. Ozdaglar (2011). Opinion dynamics and learning in social networks. Dynamic Games and Applications, 1(1), 3-49.
Avrachenkov, K. E., A. Y. Kondratev and V. V. Mazalov (2017). Cooperative game theory approaches for network partitioning. International Computing and Combinatorics Conference, 591-602.
Barabanov, I. N., N. A. Korgin, D.A. Novikov and A. G. Chkhartishvili (2010). Dynamic models of informational control in social networks. Automation and Remote Control, 71(11), 2417-2426.
Basar, T. and G. J. Olsder (1999). Dynamic Noncooperative Game Theory. SIAM: Philadelphia.
Buechel, B., T. Hellmann and S. Klößner (2015). Opinion dynamics and wisdom under conformity. Journal of Economic Dynamics and Control, 52, 240-257.
Bure, V. M., E. M. Parilina and A. A. Sedakov (2015). Consensus in social networks with heterogeneous agents and two centers of influence. "Stability and Control Processes" in Memory of V. I. Zubov (SCP), 2015 International Conference, 233-236.
Bure, V. M., E. M. Parilina and A. A. Sedakov (2017). Consensus in a social network with two principals. Automation and Remote Control, 78(8), 1489-1499.
Dandekar, P., A. Goel and D. T. Lee (2013). Biased assimilation, homophily, and the dynamics of polarization. Proceedings of the National Academy of Sciences, 110(15), 5791-5796.
DeGroot, M. H. (1974). Reaching a consensus. Journal of the American Statistical Association, 69(345), 118-121.
Etesami, S. R. and T. Başar (2015). Game-theoretic analysis of the Hegselmann-Krause model for opinion dynamics in finite dimensions. IEEE Transactions on Automatic Control, 60(7), 1886-1897.
Friedkin, N. E. and E. C. Johnsen (1990). Social influence and opinions. Journal of Mathematical Sociology, 15(3-4), 193-206.
Friedkin, N. E. and E. C. Johnsen (1999). Social influence networks and opinion change. Advances in Group Processes, 16, 1-29.
Ghaderi, J. and R. Srikant (2014). Opinion dynamics in social networks with stubborn agents: Equilibrium and convergence rate. Automatica, 50(12), 3209-3215.
Golub, B. and M. O. Jackson (2010). Naive learning in social networks and the wisdom of crowds. American Economic Journal: Microeconomics, 2(1), 112-149.
Gubanov, D.A., D. A. Novikov and A. G. Chkhartishvili (2011). Informational influence and information control models in social networks. Automation and Remote Control, 72(7), 1557-1597.
Haurie, A., J. B. Krawczyk and G. Zaccour (2012). Games and Dynamic Games. World Scientific Publishing Company: Singapore.
Hegselmann, R. and U. Krause (2002). Opinion dynamics and bounded confidence models, analysis, and simulation. Journal of Artificial Societies and Social Simulation, 5(3).
Krawczyk, J. B. and M. Tidball (2006). A discrete-time dynamic game of seasonal water allocation. Journal of Optimization Theory and Applications, 128(2), 411-429.
Sedakov, A. A. and M. Zhen (2019). Opinion dynamics game in a social network with two influence nodes. Vestnik of Saint Petersburg University. Applied Mathematics. Computer Science. Control Processes, 15(1), 118-125.
Wang, C., H. Han and J. Han (2019). A new network feature affects the intervention performance on public opinion dynamic networks. Scientific Reports, 9(1), 5089.
Zachary, W. W. (1977). An Information Flow Model for Conflict and Fission in Small Groups. Journal of Anthropological Research, 33(4), 452-473.