The Environment Protecting Dynamics. An Evolutionary Game Theory approach
Paloma Zapata-Lillo
Department of Mathematics School of Science Universidad Nacional Autonoma de Mexico e-mail address: pzl@fciencias.unam.mx
Abstract. The action of large human conglomerates is behind many of the environmental catastrophes of the last decades. However, these same conglomerates are the main actors in some important environment-protection battles. People may build social organizations and thereby accumulate a protecting stock (laws, protocols, social conscience, etc.). Frequently, those social movements grow and become strong, but they do not last forever; they often follow a kind of cycle. We study the emergence and dynamics of social organizations through the repetition of a game in an evolutionary context. The history of the conflict goes through stages. The stochastically stable equilibrium reached in each stage determines which organization is formed and what stock is accumulated for the next stage, leading to a new game and new equilibria.
Keywords: Prisoner's Dilemma, Public Good, Organization that Forces Cooperation, Organization Game, Strict Nash Equilibria, Learning Dynamics, Myopic People, Sample of Information, Mistakes, Markov Process, Perturbation, Stochastically Stable Equilibria, Accumulation, Learning-Accumulation Dynamics, Depreciation.
Introduction
We study the dynamic behavior of large populations facing some kind of environmental problem. Inside most such communities there are destructive trends: selfish individual interests may be too aggressive toward the environment. However, in the real world, many of those populations face the undesired effects of their negative trends and try to achieve cooperation. In order to do that, they create organizations (such as community assemblies, trade unions, and ecologist organizations) that force their members to cooperate and that prevent others' destructive actions. Why do those organizations emerge?
All around the world, numerous communities organize themselves to resist the anti-environment tendencies of the modern world. They can be found in Asia, Africa, and Latin America, as well as in developed countries. Those organizations are not imposed by forces outside the community. On the contrary, they emerge from the free decisions of uncoordinated individuals. Some of those communities remain organized for centuries, for example, in underdeveloped countries. Nature and environmental protection are part of their cultures. Of course, there are people who choose to free ride, but a nucleus in charge of protecting the environment and other communal interests will surely emerge. This nucleus might change from time to time, and its size increases and decreases in the long run. In contrast, other communities or social groups act as movements. That is to say, an organization of part of their population fights to reach obligatory protective conditions (laws, protocols, conscience). When those conditions are achieved, fewer and fewer people want to organize themselves and cooperate. However, laws and social forces wear down over time, and after some periods of apathy, people realize that laws have become only paper and that it is necessary to do something again. That is to say, there is a cycle as follows: people form an organization that grows in time, it disappears when its goals are reached, and it reemerges when those gains decrease over time. This is similar to trade-union dynamics. Workers build strong organizations to fight for better wages and other benefits. However, when those conditions are established in contracts as an accumulated public wealth, people lose interest in collective action, and the organization becomes a bureaucratic apparatus that is useful only to its leaders, until it is necessary to renew the organization.
We consider laws, protocols and conscience as a stock of a public good that can be accumulated and can also depreciate (laws and social forces wear down over time). That good is public, that is, it is non-excludable. In this paper we study the dynamics of accumulation and depreciation of that special stock and its role in determining the pattern that the organization follows: patterns in which the community remains organized with high probability, possibly with changes in the size and composition of the organization, or patterns with organization-disorganization cycles.
Conflicts similar to those have been studied in the literature using a very simple game-theoretical model that nevertheless captures the core of the problem. That model is known as the n-person prisoner's dilemma. [Hardin, 1964], who generalized the model to n players, called it the Tragedy of the Commons. Authors who have modeled that kind of conflict through prisoner's dilemmas include, among many others, [Weissing and Ostrom, 1991], [Okada and Sakakibara, 1991], [Maruta and Okada, 2001], [Glance and Huberman, 1993, 1997]. The conditions under which people build organizations to overcome a prisoner's dilemma have been studied by, among others, [Ostrom, 1990, 1998], [Weissing and Ostrom, 1991], [Okada and Sakakibara, 1991], [Maruta and Okada, 2001].
“Collective action” is the name applied in the literature to the process by which social organizations form and operate. [Olson, 1965], who pointed out the essential question related to the emergence of organizations to overcome social dilemmas, suggested that a social organization emerges to allow people to enjoy some common goods. But those goods are public goods, and nobody can be excluded from enjoying them. He asked why people decide to make the effort of participating in those organizations. [Ostrom, 1990, 1998] studied the ways in which communities involved in common pool resource dilemmas (of the prisoner's dilemma kind) overcame those dilemmas. She considered a repetition process in which individuals have bounded rationality. Repetition of the conflict might lead people to design norms in order to overcome the dilemma when there is good communication and trust among people. Then some level of cooperation is reached. Repetition of the common pool resource dilemma, in her paper, is a game similar to the repeated prisoner's dilemma that [Axelrod, 1984] analyzed.
In this paper, we follow an evolutionary approach à la [Kandori, Mailath, Rob, 1993], [Kandori, Mailath, 1995] and [Young, 1993, 1998]. Why should an evolutionary approach be meaningful in this setting? Because we want to study large populations that are involved in similar social conflicts, and we want to find the social patterns that emerge in the long run as a result of the behavior of an enormous number of individuals who are repeatedly involved in that kind of conflict. We have to build a realistic model of the way real people learn from experience, with many limitations, and adjust their actions in order to improve their utility given others' behavior. Then we have to study the social pattern that emerges from that process. Evolutionary game theory has been applied, using an approach similar to the one just described, to the study of how a society sets up its conventions, its ethical concepts and its social institutions. See, for example, [Binmore, 1995, 1998, 2003], [Young, 1993, 1998]. Some authors, such as [Glance and Huberman, 1997], have studied the evolution of cooperative behaviors in the prisoner's dilemma and other dilemmas. [Maruta and Okada, 2001] follow Young's approach (adaptive play) in the non-accumulation case. Accumulation and depreciation drive a process that is the history of the development of laws protecting the environment and nature. It is a process that goes through stages. In each of those stages, the public stock (laws, etc.) does not change, so a large population is involved in an organization game with fixed stock. That is, the members of a community consider whether they should voluntarily form an organization that forces them to cooperate. We study the emergent organization dynamics when the conflict is repeated over the long run within a stage. Encounters of that game occur between small groups of members of that population drawn at random.
The process of repetition of that organization game is a perturbed Markov process; it is not a repeated game. Bounded rationality implies that individuals are only able to gather small amounts of information in a stochastic way and to improve their payoffs in response, without necessarily achieving the optimum. They also make mistakes in a stochastic way. Finally, the equilibria of the process are stochastic, and they describe which social behavior patterns occur with the highest probability for a given public stock k. There are two cases, depending on the size of the public stock. In the first one, the stochastically stable equilibria reached in the stage correspond to some of the strict Nash equilibria, which means that a minimal organization, determined by k, emerges. In the other case, no organization is formed in any stochastically stable equilibrium. Accumulation in a stage is then determined by the stochastically stable equilibrium achieved in that stage. Accumulation and depreciation move the community to another stage; that is, the community becomes involved in an organization game in which the stock has changed through the accumulation achieved in the previous stage and through depreciation. Which pattern do those accumulation dynamics follow?
The paper is organized as follows. In section 1, we describe the prisoner's dilemma that shows the destructive trends existing inside the community. We also study the organization game, through which people can overcome those negative trends; it is similar to that of [Maruta and Okada, 2001]. In the last part of that section, we explain the process by which protective laws and conscience develop, which goes through stages in which people might organize in order to obtain stronger laws and to improve their awareness. In section 3, we establish the dynamics occurring in each stage as a perturbed Markov process that expresses a learning dynamics through which the members of the community reach a stochastically stable equilibrium. That equilibrium might mean an organized activity that allows people to achieve better environmental protection, or it might mean apathy and passivity among the members of the community. In section 4, we study which patterns of organization-accumulation could be established through the stages that occur in the long run. Finally, we present some conclusions and some open questions for future work.
1. The Models
1.1. The Prisoner Dilemma
In order to develop the ideas in a specific way, let us think of a community $N = \{1, 2, \dots, n\}$ that is involved in a conflict about its water resources. This sort of conflict is very common in many countries. Let H be a large company that wants to establish an industrial or tourist project that might concentrate all the regional water resources in its hands. If carried out, this project might also pollute the groundwater reserves. However, the individuals who supported H would obtain some income from the company. Let us assume that the community members are players. Each player has two possible strategies: cooperation (C), which means opposing the company's project, or non-cooperation (NC), which means supporting the project and taking advantage of it. Cooperation creates protective forces in two senses: a) Past cooperation has accumulated a kind of stock (laws, social conscience, etc.) that gives each person individual rights over water resources. Because the laws and social forces that protect the regional water resources guarantee each community member's right to obtain some quantity of water, we consider those laws and forces as a stock that protects community members' rights over water. This stock has a size that measures the quality of those rights. Let $k$ be the stock accumulated up to a stage $t$; we will say that each member of the community obtains an extra payoff $f_i(k)$ due to the stock $k$. b) Current cooperation makes laws and conscience more effective today and yields more accumulation for tomorrow. That is, the individuals who choose C in a period create protective tools that reinforce the laws, and everyone obtains an extra payoff: if $s$ persons choose C, each member of the community receives utility $f_s(s)$ from those $s$ persons. Those protective forces benefit everybody, because laws and social forces are public goods, that is, they are not excludable. We will assume that $f_i$ and $f_s$ are strictly increasing functions with $f_s(0) = 0$ and $f_i(0) = 0$. On the other hand, we denote by $\gamma$ the utility each individual expects to obtain from H. An individual who decided not to support the project and to act against it would obtain a smaller utility; we will say that the difference, $d_c$, is the cost of cooperation.
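The previous assumptions can be expressed as a strategic game that has $N$ as its set of players. Each player $j$ has $D_j = \{\text{Cooperation (C)}, \text{Defection (NC)}\}$ as its set of pure strategies. Let $\sigma = (\sigma_1, \sigma_2, \dots, \sigma_n)$ be a profile of pure strategies in $D = D_1 \times D_2 \times \dots \times D_n$. The payoff obtained by any player $j$ is given by the function $\varphi_j$, defined on $D$ as
\[
\varphi_j(\sigma) =
\begin{cases}
\gamma - d_c + f_i(k) + f_s(s_\sigma + 1) & \text{if } \sigma_j = C, \\
\gamma + f_i(k) + f_s(s_\sigma) & \text{if } \sigma_j = NC,
\end{cases}
\tag{1}
\]
where $s_\sigma = \#\{\, i \in N \mid i \neq j \text{ and } \sigma_i = C \,\}$.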
If we assumed $f_s(s+1) - f_s(s) > d_c$ for each $s$, it would be better for each $j \in N$ to cooperate, without anyone forcing them to do it and independently of the decisions of the rest of the people. Universal cooperation would then be the only Nash equilibrium of the game. In that case, each individual looking only after their own welfare would contribute to the construction of the common welfare, confirming Adam Smith's conjecture.
We are more interested here, however, in the case where $f_s(s+1) - f_s(s) < d_c$ for each $0 \le s < n$. Then the strategy Defection (NC) is dominant and the only Nash equilibrium is (NC, NC, ..., NC). In this situation, if $f_s(n) > d_c$, which we also assume, we face a tragedy similar to Hardin's. We refer to this game as $G = (N, \{D_j\}, \varphi)$. In $G$, no one would choose an ecologist attitude without something forcing them to do it, although there is a natural number $l^*$, $n > l^* > 1$, such that if $l^*$ members of the community cooperated, the utility of each person in $N$ would be larger than $\gamma + f_i(k)$. However, a new game arises inside the community, in which each member considers how to obtain the advantages of cooperation, even by making it obligatory. That new game is the organization game.
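The dilemma described above can be checked numerically. The following is a minimal sketch, assuming a linear benefit function $f_s(s) = b\,s$ and illustrative parameter values (`b`, `gamma`, `d_c`, `f_i_k` are hypothetical choices, not values from the paper). It verifies that NC is dominant while universal cooperation still gives everyone more than universal defection.

```python
def f_s(s, b=0.5):
    """Hypothetical benefit from s current cooperators: linear, strictly increasing, f_s(0) = 0."""
    return b * s


def payoff_G(j, profile, gamma=2.0, d_c=1.0, f_i_k=0.5):
    """Payoff of player j in G; profile is a tuple of "C" / "NC" (all numbers are illustrative)."""
    # s counts the cooperators other than j
    s = sum(1 for i, a in enumerate(profile) if i != j and a == "C")
    if profile[j] == "C":
        return gamma - d_c + f_i_k + f_s(s + 1)
    return gamma + f_i_k + f_s(s)


def nc_is_dominant(n):
    """True if NC is strictly better than C for player 0 against every number of other cooperators."""
    for s in range(n):
        rest = ("C",) * s + ("NC",) * (n - 1 - s)
        if payoff_G(0, ("NC",) + rest) <= payoff_G(0, ("C",) + rest):
            return False
    return True


n = 5
all_c, all_nc = ("C",) * n, ("NC",) * n
print("NC dominant:", nc_is_dominant(n))
print("full cooperation beats full defection:", payoff_G(0, all_c) > payoff_G(0, all_nc))
```

Because the game is symmetric, checking player 0 is enough. With these numbers the marginal benefit `b` is below `d_c`, while $f_s(n)$ exceeds `d_c`, which are exactly the two assumptions of the text.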
1.2. Model I: The Organization Game
The Organization Game that we consider is similar to that of [Maruta and Okada, 2001], but we assume that it takes place in a stage in which the community has accumulated a stock $k$. Let us denote that game by $G_k$. Then
\[
G_k = \bigl(N, \{D_j = \{\text{Participation (P)}, \text{No Participation (NP)}\}\}_{j \in N}, \varphi^k\bigr). \tag{2}
\]
The organization could be, for instance, an assembly open to all members of the community who would like to participate in it. The objective of such an organization would be to achieve the cooperation of its members in order to allow better conditions for the water resources. Members of the organization must cooperate. People who do not belong to the organization are free riders. We consider that the enforcement system has a cost $d_o(k)$.
Our assumptions for establishing the payoff function are:
1) If no group in the community forms an assembly, everyone earns $\gamma + f_i(k)$.
2) $s^*_k$ is the size of the minimum group $S$ that can constitute an effective organization, in the sense that, if that group cooperated, each member of the community would obtain more than $\gamma + f_i(k)$. We denote by $f_s(s) - d_o(k)$ the gain that each person obtains if $s$ persons choose P and the accumulated stock is $k$. It is increasing in $s$ and decreasing in $k$. Here $d_o(k)$ is the organization cost, an increasing function of $k$ (it depends on $f_i(k)$). If the number of people who choose P is less than $s^*_k$, the assembly will not form. That is, $s^*_k$ is the minimum integer such that $f_s(s^*_k) > d_c + d_o(k)$; $s^*_k$ is an increasing function of $k$.
3) If $\sigma_j = P$ but $s_\sigma < s^*_k - 1$, player $j$ earns the same payoff as a free rider, $\gamma + f_i(k)$.
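Then the payoff function $\varphi^k_j$ of player $j$ is
\[
\varphi^k_j(\sigma) =
\begin{cases}
\gamma - d_c + f_i(k) + f_s(s_\sigma + 1) - d_o(k) & \text{if } s_\sigma \ge s^*_k - 1 \text{ and } \sigma_j = P, \\
\gamma + f_i(k) & \text{if } s_\sigma < s^*_k - 1 \text{ and } \sigma_j = P, \\
\gamma + f_i(k) + f_s(s_\sigma) - d_o(k) & \text{if } s_\sigma \ge s^*_k \text{ and } \sigma_j = NP, \\
\gamma + f_i(k) & \text{if } s_\sigma < s^*_k \text{ and } \sigma_j = NP,
\end{cases}
\tag{3}
\]
where $s_\sigma = \#\{\, i \in N \mid i \neq j \text{ and } \sigma_i = P \,\}$.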
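To make the payoff function of the organization game concrete, here is a minimal sketch under assumed functional forms: $f_s(s) = 0.5\,s$, $f_i(k) = 0.5\,k$ and $d_o(k) = 0.25 + 0.25\,k$ are hypothetical, as are the constants `GAMMA` and `D_C`. The helper `min_org_size` computes the minimal effective size $s^*_k$ (returning `None` when it exceeds the number of players), and `payoff_Gk` follows the four cases of the payoff definition.

```python
GAMMA, D_C = 2.0, 1.0  # hypothetical income from H and cost of cooperation


def f_s(s):
    return 0.5 * s            # assumed benefit from s participants


def f_i(k):
    return 0.5 * k            # assumed benefit from the accumulated stock


def d_o(k):
    return 0.25 + 0.25 * k    # assumed organization cost, increasing in k


def min_org_size(k, n):
    """Smallest s in 1..n with f_s(s) > d_c + d_o(k); None if no group of n players is large enough."""
    for s in range(1, n + 1):
        if f_s(s) > D_C + d_o(k):
            return s
    return None


def payoff_Gk(j, profile, k):
    """Payoff of player j in G_k; profile is a tuple of "P" / "NP"."""
    s_star = min_org_size(k, len(profile))
    others = sum(1 for i, a in enumerate(profile) if i != j and a == "P")
    base = GAMMA + f_i(k)
    if s_star is None:
        return base                                   # no organization can form
    if profile[j] == "P" and others >= s_star - 1:
        return base - D_C + f_s(others + 1) - d_o(k)  # member of an effective organization
    if profile[j] == "NP" and others >= s_star:
        return base + f_s(others) - d_o(k)            # free rider on an effective organization
    return base                                       # organization does not form


profile = ("P", "P", "P", "P", "NP", "NP")
print(min_org_size(1, 6), payoff_Gk(0, profile, 1), payoff_Gk(4, profile, 1))
```

Under these assumptions the minimal size grows with the stock, which matches the text's statement that $s^*_k$ increases with $k$.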
The game has two important properties, which are stated in the following two propositions.
Proposition 1. If the community is “large” ($n > s^*_k$), a profile of pure strategies $\sigma^*$ is a strict Nash equilibrium of the game $G_k$ if and only if the number of players who choose P in $\sigma^*$ is $s^*_k$. If $n < s^*_k$, there is no strict Nash equilibrium of the game $G_k$.
Proof.
Let us assume $k$ is the accumulated stock. Let $\sigma$ be a profile in which the number of people who choose P is smaller than $s^*_k - 1$. No organization appears in that profile, and everybody earns $\gamma + f_i(k)$. If any player $j$ changed its strategy and the others did not, it would still not be possible to form an organization, and $j$ would continue earning $\gamma + f_i(k)$. Hence $\sigma$ is a Nash equilibrium, whether $s^*_k$ is smaller than $n$, as assumed, or larger than or equal to $n$. These equilibria are not strict.
Let $\sigma$ be a profile in which exactly $s^*_k$ people choose P, whoever they are. An organization of size $s^*_k$ is formed. Let $j$ be a player who chose P. Then $j$ earns $\gamma - d_c + f_i(k) + f_s(s^*_k) - d_o(k)$, but if it changes its strategy and the others do not, the organization cannot be formed, and $j$ earns $\gamma + f_i(k)$, which is smaller because $f_s(s^*_k) > d_c + d_o(k)$. Let $i$ be a player who chose NP in $\sigma$. Then $i$ earns $\gamma + f_i(k) + f_s(s^*_k) - d_o(k)$; if it changes its strategy and the others do not, it earns $\gamma - d_c + f_i(k) + f_s(s^*_k + 1) - d_o(k)$, which is smaller because $f_s(s^*_k + 1) - f_s(s^*_k) < d_c$. Then $\sigma$ is a strict Nash equilibrium.
Let us examine the other profiles. Let $\sigma$ be a profile of pure strategies in which the number of people who chose P is $s^*_k - 1$. No organization appears in that profile, and each player earns $\gamma + f_i(k)$. If $j$ chose NP in $\sigma$ and changes its strategy while the others do not, there are $s^*_k$ players who choose P, and an organization appears. Then $j$ earns $\gamma - d_c + f_i(k) + f_s(s^*_k) - d_o(k)$, which is larger than $\gamma + f_i(k)$. That is to say, $\sigma$ is not a Nash equilibrium.
Let $\sigma$ be a profile in which the number of people who chose P is larger than $s^*_k$. Let $j$ be a player who chose P. An organization of size $s_\sigma + 1$ appears, and $j$ earns $\gamma - d_c + f_i(k) + f_s(s_\sigma + 1) - d_o(k)$; but if it changes its strategy and the others do not, an organization of size $s_\sigma$ appears. Then $j$ earns $\gamma + f_i(k) + f_s(s_\sigma) - d_o(k)$, which is larger than $\gamma - d_c + f_i(k) + f_s(s_\sigma + 1) - d_o(k)$. Then $\sigma$ is not a Nash equilibrium. There is no other Nash equilibrium in pure strategies in the organization game.
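The characterization in Proposition 1 can be checked by brute force on a small instance. The sketch below assumes a linear $f_s(s) = b\,s$ and hypothetical numbers for which the minimal organization size is 4 (`s_star` is passed in by hand and must be consistent with the chosen parameters). It enumerates all profiles and keeps those where every unilateral deviation strictly lowers the deviator's payoff.

```python
from itertools import product


def payoff(action, others, s_star, gamma=2.0, d_c=1.0, f_i_k=0.5, d_o_k=0.5, b=0.5):
    """Organization-game payoff with illustrative numbers; others = participants besides the player."""
    if action == "P" and others >= s_star - 1:
        return gamma - d_c + f_i_k + b * (others + 1) - d_o_k
    if action == "NP" and others >= s_star:
        return gamma + f_i_k + b * others - d_o_k
    return gamma + f_i_k


def is_strict_nash(profile, s_star):
    """True if every player strictly loses by switching strategy unilaterally."""
    for j, a in enumerate(profile):
        others = sum(1 for i, x in enumerate(profile) if i != j and x == "P")
        deviation = "NP" if a == "P" else "P"
        if payoff(a, others, s_star) <= payoff(deviation, others, s_star):
            return False
    return True


def strict_equilibria(n, s_star):
    return [p for p in product(("P", "NP"), repeat=n) if is_strict_nash(p, s_star)]


# With b = 0.5, d_c = 1.0, d_o_k = 0.5 the minimal size is 4, since 0.5 * 4 > 1.5 >= 0.5 * 3.
for p in strict_equilibria(5, 4):
    print(p)
```

In this instance every strict equilibrium has exactly four participants, and calling `strict_equilibria(3, 4)` (a community smaller than the minimal size) returns an empty list, in line with the second part of the proposition.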
Proposition 2. If the community is “large” ($n > s^*_k$), the Organization Game $G_k$ is weakly acyclic for each $k$.
Proof.
Let us assume $k$ is the accumulated stock. Let $\sigma(1)$ be a profile of pure strategies in which $w$, the number of people who chose P, is larger than $s^*_k$. Let $j_1$ be a player who chose P in $\sigma(1)$. The payoff of $j_1$ in $\sigma(1)$ is $\gamma - d_c + f_i(k) + f_s(s_{\sigma(1)} + 1) - d_o(k)$, and the best reply of $j_1$ to $\sigma(1)$ is NP. Let $\sigma(2)$ be the profile in which $j_1$ chooses NP and each of the other players chooses the same strategy as in $\sigma(1)$. If the number of people who chose P in $\sigma(2)$ is $s^*_k$, then $\sigma(2)$ is a strict Nash equilibrium. If that number is larger than $s^*_k$, we can proceed by induction to build $\{\sigma(1), \sigma(2), \dots, \sigma(r)\}$ such that $s^*_k$ persons choose P in $\sigma(r)$ and, for each $i = 2, \dots, r$, there is a player $j_{i-1}$ who changes its strategy from P in $\sigma(i-1)$ to NP in $\sigma(i)$, NP being a best reply of $j_{i-1}$ to $\sigma(i-1)$, while all the other players keep their strategies. Thus $\sigma(r)$ is a strict Nash equilibrium that is reached in a finite number of steps.
Let $\sigma(1)$ be a profile of pure strategies in which $w$, the number of people who chose P, is smaller than $s^*_k$. Let $j_1$ be a player who chose NP in $\sigma(1)$. The payoff of that player in $\sigma(1)$ is $\gamma + f_i(k)$. If it changed its strategy and $w + 1$ were smaller than $s^*_k$, it would again earn $\gamma + f_i(k)$. If it changed its strategy and $w + 1$ were equal to $s^*_k$, it would earn $\gamma - d_c + f_i(k) + f_s(s^*_k) - d_o(k)$. In both cases, P is a best reply of $j_1$ to $\sigma(1)$. Let $\sigma(2)$ be the profile in which $j_1$ chooses P and each of the other players chooses the same strategy as in $\sigma(1)$. If the number of people who chose P in $\sigma(2)$ is $s^*_k$, then $\sigma(2)$ is a strict Nash equilibrium. If that number is smaller than $s^*_k$, the construction of the sequence can again proceed by induction. In a finite number of steps a strict Nash equilibrium is reached.
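The construction used in the proof of Proposition 2 can be written as a short procedure. The sketch below is only an illustration under assumed numbers (linear $f_s$, hypothetical parameters, and `s_star = 4` consistent with them): starting from any profile, one player at a time switches to a best reply until exactly `s_star` players participate. The internal `assert` checks that each switch is indeed a (possibly weak) best reply.

```python
def payoff(action, others, s_star, gamma=2.0, d_c=1.0, f_i_k=0.5, d_o_k=0.5, b=0.5):
    """Organization-game payoff with illustrative numbers; others = participants besides the player."""
    if action == "P" and others >= s_star - 1:
        return gamma - d_c + f_i_k + b * (others + 1) - d_o_k
    if action == "NP" and others >= s_star:
        return gamma + f_i_k + b * others - d_o_k
    return gamma + f_i_k


def best_reply_path(profile, s_star):
    """Sequence of profiles ending with exactly s_star participants (requires len(profile) >= s_star)."""
    current = list(profile)
    path = [tuple(current)]
    while current.count("P") != s_star:
        w = current.count("P")
        # too many participants: one of them leaves; too few: one outsider joins
        old, new = ("P", "NP") if w > s_star else ("NP", "P")
        j = current.index(old)
        others = w - 1 if old == "P" else w
        # the switch must be a best reply of player j to the current profile
        assert payoff(new, others, s_star) >= payoff(old, others, s_star)
        current[j] = new
        path.append(tuple(current))
    return path


for step in best_reply_path(("NP",) * 6, 4):
    print(step)
```

The path from the fully disorganized profile takes four switches, and the path from full participation with six players takes two; both end at a profile with exactly four participants.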
We have established a strategic game, and we consider that its solution is reached when the game is repeated in an evolutionary context. In the organization game, each $j$ in $N$ must choose whether or not to take part in an organization $S$.
More important than the Nash equilibria of the game are the equilibria in the long run, when the conflict is repeated period after period by people with limited rationality, as happens in the real world.
1.3. Model I: The Learning-Accumulation Process
We will study organization and accumulation dynamics. In those processes, time is considered in two senses. In the first sense, the history of the process is the history of the development, stagnation and regression of laws, protocols and people's conscience, that is, the history of the accumulation and depreciation of the public good stock in the long run. As we said above, the establishment of laws or decrees related to environmental protection is considered a sort of accumulation of an environment-protecting stock. That long-run process goes through stages. The public good that protects the environment is accumulated at the end of each stage. That is, people's cooperation produces, at the end of each stage, a growth of that social capital stock, which survives into the following stage, although it may depreciate. In the second sense, within each of those stages, the repetition of the organization game and the learning dynamics (an evolutionary process) occur again and again over short periods of time. During a stage the accumulated stock does not change. People learn from the experience of past periods in a limited way. They have, for example, incomplete information about other people's attitudes, and they make mistakes with a small probability. They change their behavior accordingly. Then a stochastically stable equilibrium is reached. That equilibrium determines how many people participate in the organization and what the accumulated stock will be in the next stage. At the end of each stage t, the “stock” might be accumulated or not. Besides, the stock that had been accumulated up to the beginning of stage t depreciates, and the following stage begins with a new accumulated stock, which is the depreciated old stock plus the stock accumulated in t.
It is possible that the process, in the long run, leads to an organization that achieves a “stock”, and then, keeps it forever, after repeated accumulation and depreciation, but this fact has very small probability. It is more probable that the process follows a kind of cycle. It could be an organization-disorganization cycle or a cycle where organization is changing its size. In both cases, the dynamics turns around a given stock.
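The stage-to-stage update described above (depreciated old stock plus the stock accumulated in the stage) can be illustrated with a small simulation. Everything numeric here is an assumption made for the sketch: the linear depreciation rate `delta`, the accumulation `gain` per participant, the linear $f_s$ and $d_o$, and the rule that each stage ends with an organization of the minimal size when one is feasible and with no organization otherwise. The paper does not specify these functional forms.

```python
def min_org_size(k, n, d_c=1.0, b=0.5):
    """Smallest s in 1..n with b*s > d_c + d_o(k), using the assumed cost d_o(k) = 0.25 + 0.25*k."""
    for s in range(1, n + 1):
        if b * s > d_c + 0.25 + 0.25 * k:
            return s
    return None


def simulate(k0, n, stages, delta=0.2, gain=0.5):
    """Return a list of (stock at start of stage, organization size reached in the stage)."""
    k, history = k0, []
    for _ in range(stages):
        s = min_org_size(k, n) or 0        # 0 means that no organization forms in this stage
        history.append((round(k, 3), s))
        # assumed update: depreciated old stock plus accumulation proportional to participation
        k = (1 - delta) * k + gain * s
    return history


for stock, size in simulate(k0=0.0, n=5, stages=12):
    print(stock, size)
```

With these particular numbers the organization grows during the first stages, disappears once the stock makes the minimal size exceed the community size, and reappears after the stock has depreciated, which is one way an organization-disorganization cycle can arise.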
In the following two subsections, we will study what happens in a stage when the community has reached a stock k, and which dynamics could arise in the long run.
2. Learning Process
It is necessary to model the organization process using an adequate learning dynamics. It is a gradual process followed by the members of a community involved in the water conflict under consideration. That is, many conflicts similar to the organization game take place daily among different groups of members of the community, in which the decision to act individually or in an organized way is involved. Day by day such conflicts arise in one neighborhood or another: in the fields, at the market, in schools, or in other settings. Water has many different effects on the community, and people have to adopt a position on the issue, often changing their way of thinking and acting according to their own and others' experiences over time. Of course, no person has full information about other people's behavior or attitudes. Nobody gathers information systematically; instead, it is received at random. According to the information obtained, people make the best decisions they can, although sometimes they make mistakes. That is, the emergence and evolution of an organization are spontaneous orders produced by the repeated actions of the members of a large population.
2.1. Model I: The Repetition of the Conflict by a Great Population
Let us assume that, in a stage t, there is an accumulated stock k and the game $G_k$ is repeated in that stage. As we said, the people who play that repeated game are not always the same. The process takes place among the members of a large population N (for example, the people of the community). Let m be the size of N. Encounters of the organization game (disputes over water rights) within a set of n people from N (different ones each time) happen daily, and they involve the decision of acting individually or in an organized way. Sometimes the conflict arises between neighbors, or among co-workers, etc. Water disputes are present in many different ways, and people have to make choices about them. People change their way of thinking and act according to their own and others' experiences. That is, we assume there is a large population N that contains many water conflicts, each expressed, in a period, by the game $G_k$. There are many conflicts of this kind throughout the long stage, period after period. We also assume that people from N learn by experience, but they are myopic and make mistakes in each period, and we study the stochastically stable equilibria of that process. That is the approach of [Kandori et al., 1993, 1995] and [Young, 1993, 1998].
We consider that the population $N$ is partitioned into subpopulations $N_1, N_2, \dots, N_n$, one for each player. For each $j$, $N_j$ is large. We assume that partition so that non-symmetric profiles of pure strategies are part of the states of the dynamic process. Besides, in this way it will be easier to introduce later the assumption that players might not have the same participation trends. A group of $n$ persons, one from each $N_i$, can be the players of $G_k$.
In each period, only one encounter happens. Player j knows about others' attitudes in the previous period (that is, some of the strategies people chose in the last period) and uses that experience to choose its strategy for the next period. Each one makes mistakes with small probability. We will study the behavior of people within the stage t.
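Let us assume that everybody in $N$ has chosen a strategy in a period, and denote by $z^j_i$ the strategy (P or NP) chosen by member $i$ of $N_j$. The strategic structure of society is then $z$, where
\[
z = \bigl(z^1 = (z^1_1, z^1_2, \dots, z^1_{m_1}), \dots, z^n = (z^n_1, z^n_2, \dots, z^n_{m_n})\bigr), \qquad \sum_{i=1}^{n} m_i = m, \tag{4}
\]
and $m_i$ denotes the size of $N_i$.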
Let Z be the set of strategic structures of society.
People learn from their own and others' experience. However, they are not very rational: they are myopic and have many limitations. According to their experiences, each $j$ obtains a sample of size $r$ from $z$, that is, a vector
\[
z(r) = \bigl(z^1(r) = (z^1_{i_1}, z^1_{i_2}, \dots, z^1_{i_r}), \dots, z^n(r) = (z^n_{j_1}, z^n_{j_2}, \dots, z^n_{j_r})\bigr). \tag{5}
\]
Then $j$ chooses a best reply to the sample $z(r)$. That is, $j$ considers the profile of mixed strategies $\bar{z}(r)$ determined by $z(r)$ and chooses a best reply to it according to $G_k$, where
\[
\bar{z}(r) = \frac{1}{r}\bigl((x^1, r - x^1), \dots, (x^n, r - x^n)\bigr)
\]
and $x^j$ is the number of times P appears in $z^j(r)$.
Each $z(r)$ may be drawn as a sample with positive probability.
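The sampling step can be sketched as follows. A player draws $r$ strategies from each subpopulation, turns the counts of P into a mixed-strategy profile, and picks the action with the higher expected payoff in the organization game. The payoff numbers, the linear $f_s$, the hand-supplied `s_star`, and the tie-breaking rule (NP on ties) are all assumptions of this sketch, and the function names are hypothetical.

```python
import random


def payoff(action, others, s_star, gamma=2.0, d_c=1.0, f_i_k=0.5, d_o_k=0.5, b=0.5):
    """Organization-game payoff with illustrative numbers; others = participants besides the player."""
    if action == "P" and others >= s_star - 1:
        return gamma - d_c + f_i_k + b * (others + 1) - d_o_k
    if action == "NP" and others >= s_star:
        return gamma + f_i_k + b * others - d_o_k
    return gamma + f_i_k


def others_distribution(probs):
    """Distribution of the number of other players choosing P, each independently with its own probability."""
    dist = [1.0]
    for p in probs:
        new = [0.0] * (len(dist) + 1)
        for count, weight in enumerate(dist):
            new[count] += weight * (1 - p)
            new[count + 1] += weight * p
        dist = new
    return dist


def best_reply_to_sample(j, z, r, s_star, rng):
    """Best reply of player j to a random sample of size r from each subpopulation of z."""
    sample = [rng.sample(sub, r) for sub in z]            # z(r): r observations per subpopulation
    freq = [s.count("P") / r for s in sample]             # empirical probability of P, x^i / r
    dist = others_distribution([p for i, p in enumerate(freq) if i != j])
    expected = {a: sum(w * payoff(a, c, s_star) for c, w in enumerate(dist)) for a in ("P", "NP")}
    return "P" if expected["P"] > expected["NP"] else "NP"   # assumed tie-break: NP


z = [["NP"] * 10] + [["P"] * 10] * 3 + [["NP"] * 10] * 2
print(best_reply_to_sample(0, z, r=3, s_star=4, rng=random.Random(0)))
```

In the example, three of the other five subpopulations participate unanimously, so player 0 is pivotal for reaching the minimal size of four and its best reply is to participate.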
We build $h$ as a correspondence from $Z$ to $Z$ such that, if $z' \in h(z)$, there are at most $n$ coordinates of $z'$, one for each $j = 1, 2, \dots, n$, that differ from the corresponding coordinates of $z$. Besides, if $z'^j_i$ differs from $z^j_i$, the strategy $z'^j_i$ is a best reply of player $j$ to some sample $z(r)$ according to $G_k$.
Definition 1. The learning dynamics of the process is a correspondence $h$ from $Z$ to $Z$ such that, if $z'$ is in $h(z)$, there are m samples of $z$ of size $r$, one for each player, such that for m persons of $N$, each from one of the $N_i$, $z'^j_i$ is $j$'s best reply to his sample. The other coordinates of $z$ do not change.
If the state of the process is $z$, there might not be a sequence of states $\{z_1 = z, z_2, \dots, z_{s-1}, z_s = z'\}$ such that $z_{i+1} \in h(z_i)$ for $i = 1, \dots, s-1$. That is, it might not be possible to reach some state $z'$ in a finite number of periods when only the learning dynamics $h$ is at work. However, as we said, each member of $N$ makes mistakes with a small probability. We thus obtain a perturbation of the dynamics $h$ that allows any $z'$ to be reached from each $z$ in a finite number of periods.
Definition 2. A mistake of player $j$ in the state $z$ is a strategy that is not a best reply to any sample $z(r)$.
It is clear that any state $z'$ can be reached from $z$ in a finite number of periods when a perturbation of the learning dynamics $h$ is at work. That is, there is a sequence of states $\{z_1 = z, z_2, \dots, z_{s-1}, z_s = z'\}$ such that, for $i = 1, \dots, s-1$, $z_{i+1}$ is obtained through deviations of at most m players from one of the strategic structures in $h(z_i)$.
Let us represent the learning dynamics and its perturbations by Markov matrices.
Definition 3. A Markov matrix $Q$ of order $|Z|$ represents the learning dynamics $h$ if, for each pair $(z, z')$,
\[
Q_{zz'} =
\begin{cases}
\lambda_{zz'} & \text{if } z' \in h(z), \\
0 & \text{if } z' \notin h(z),
\end{cases}
\tag{6}
\]
where $\lambda_{zz'}$ is the probability of choosing a combination of samples of $z$ that determines $z'$.
We consider that each member of $N$ makes mistakes with a small probability $\eta$, and that the mistakes of each person are independent of those of the others.
Definition 4. For each $\eta \in (0, a]$, a Markov matrix $Q(\eta)$ represents the perturbed learning dynamics $Q$ with mistake rate $\eta$ if
\[
Q_{zz'}(\eta) = (1-\eta)^n Q_{zz'} + \sum_{\substack{y \in h(z) \\ y \neq z'}} \; \sum_{J_{yz'} \neq \emptyset} \alpha^{J_{yz'}}_{yz'} \, \eta^{|J_{yz'}|} (1-\eta)^{n - |J_{yz'}|} \, Q_{zy},
\]
where $\alpha^{J}_{xx'}$ is the probability of going from $x$ to $x'$ when only the members of $J$ make mistakes, and $J_{xx'}$ is such that $\alpha^{J_{xx'}}_{xx'}$ is positive.
Proposition 3. For each $\eta \in (0, a]$, $Q(\eta)$ is regular. That is,
a) For each pair $(z, z')$, $Q_{zz'}(\eta)$ tends to $Q_{zz'}$ as $\eta$ tends to 0.
b) There is a unique distribution vector $q(\eta)$ such that $q(\eta) Q(\eta) = q(\eta)$.
c) For each pair $(z, z')$ such that $Q_{zz'}(\eta) > 0$, there is a non-negative integer $v_{zz'}$ such that the limit of $Q_{zz'}(\eta) / \eta^{v_{zz'}}$, as $\eta$ tends to 0, exists and is positive.
Proof.
a) $Q_{zz'}(\eta)$ is a polynomial in $\eta$ in which $Q_{zz'}$ is the coefficient of $\eta^0$; hence $Q_{zz'}(\eta)$ tends to $Q_{zz'}$ as $\eta$ tends to 0.
b) Consider two states $z$ and $z'$ ($z \neq z'$). For each such pair there is a sequence $\{z_1 = z, z_2, \dots, z_{s-1}, z_s = z'\}$ such that $Q_{z_i z_{i+1}}(\eta)$ is positive; that is, $Q(\eta)$ is irreducible. Then there is a unique distribution vector $q(\eta)$ such that $q(\eta) Q(\eta) = q(\eta)$.
c) If $(z, z')$ is such that $Q_{zz'}(\eta) > 0$, let $v_{zz'}$ be the minimum exponent of $\eta$ in $Q_{zz'}(\eta)$ that corresponds to a positive coefficient. Then $Q_{zz'}(\eta) / \eta^{v_{zz'}}$ is a polynomial in which the coefficient of $\eta^0$ is positive. Hence the limit of $Q_{zz'}(\eta) / \eta^{v_{zz'}}$, as $\eta$ tends to 0, exists and is positive.
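The three properties can be seen on a toy chain. The following sketch uses a hypothetical two-state perturbed matrix (not the matrix of the organization game): both states are absorbing without mistakes, leaving state 0 requires two simultaneous mistakes and leaving state 1 requires one. For every positive mistake rate the chain is irreducible, its stationary distribution is unique, and that distribution concentrates on state 0 as the mistake rate goes to 0.

```python
def perturbed_matrix(eta):
    """Hypothetical 2-state chain: exit from state 0 has probability eta**2, exit from state 1 has eta."""
    return [[1 - eta ** 2, eta ** 2],
            [eta, 1 - eta]]


def stationary(Q):
    """Stationary distribution of an irreducible 2-state chain, in closed form."""
    p, q = Q[0][1], Q[1][0]          # p: leave state 0, q: leave state 1
    return [q / (p + q), p / (p + q)]


for eta in (0.1, 0.01, 0.001):
    pi = stationary(perturbed_matrix(eta))
    print(eta, [round(x, 6) for x in pi])
```

Here the exponents playing the role of $v_{zz'}$ are 2 for the transition from state 0 to state 1 and 1 for the reverse transition; the state that is harder to leave by mistake is the one that keeps the probability mass in the limit.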
That proposition says that the learning process perturbed by mistakes is regular in the sense defined by Young in his paper on conventions [1993]. In that paper, Young works with the concept of the stochastic potential of each state of a perturbed Markov process $P^\varepsilon$. The stochastic potential of the state $z$ is the minimum of the costs of all the $z$-trees in the graph $(V, A)$, where $V = Z$, $A = \{(z, z') \mid P^\varepsilon_{zz'} > 0\}$, and the cost $c(z, z')$ is the minimum number of mistakes needed to go from $z$ to $z'$. The cost of a $z$-tree is the sum of the costs $c(z, z')$ over all edges $(z, z')$ of the $z$-tree. Young's theorem (1993) on the stochastically stable equilibria of a regular perturbed Markov process is the following.
Theorem 1 [Young, 1993]. Let $P^{\varepsilon}$ be a regular perturbed Markov process, and let $\mu^{\varepsilon}$ be the unique stationary distribution of $P^{\varepsilon}$ for each $\varepsilon > 0$. Then $\lim_{\varepsilon \to 0} \mu^{\varepsilon} = \mu^{0}$ exists, and $\mu^{0}$ is a stationary distribution of $P^{0}$. The stochastically stable states are precisely those which have minimum stochastic potential.
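The stochastic potential can be computed by brute force when the state space is small. The sketch below enumerates all $z$-trees of a hypothetical three-state example; the cost matrix is invented for illustration, and the enumeration is only practical for a handful of states.

```python
from itertools import product

def reaches_root(state, edges, root):
    """Follow the unique outgoing edges from `state`; True if the walk ends at `root`."""
    seen = set()
    while state != root:
        if state in seen:          # a cycle that never reaches the root
            return False
        seen.add(state)
        state = edges[state]
    return True

def stochastic_potential(cost, root):
    """Minimum total cost over all trees directed towards `root` (the root-trees)."""
    states = range(len(cost))
    others = [s for s in states if s != root]
    best = float("inf")
    # A root-tree gives every non-root state exactly one outgoing edge.
    for successors in product(states, repeat=len(others)):
        edges = dict(zip(others, successors))
        if any(s == t for s, t in edges.items()):
            continue               # no self-loops
        if all(reaches_root(s, edges, root) for s in others):
            best = min(best, sum(cost[s][t] for s, t in edges.items()))
    return best

# Hypothetical costs: cost[z][w] = minimum number of mistakes to go from z to w.
COST = [[0, 1, 3],
        [2, 0, 1],
        [3, 2, 0]]

if __name__ == "__main__":
    potentials = [stochastic_potential(COST, z) for z in range(3)]
    print(potentials)  # state 2 has the minimum potential in this example
```

For these costs the potentials are 4, 3 and 2, so state 2 is the one selected by the minimum stochastic potential criterion.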
Which strategic structures of society are the stochastically stable equilibria of the learning process of the organization game?
Young (1993, 1998) studied the special case of a weakly acyclic game $G$ under an adaptive play process with sufficiently incomplete samples. He found that the stochastically stable equilibria, in that case, are among the states that correspond to strict Nash equilibria, namely those with minimum stochastic potential. Let $L_\Gamma$ denote the maximum, over all profiles of pure strategies, of the length of the shortest path in the best-reply graph of $G$ that joins the profile to a strict Nash equilibrium. The theorem is then expressed as follows.
Theorem 2 [Young, 1993]. Let $G$ be a weakly acyclic $n$-person game, and let $P$ be the unperturbed adaptive play process with memory size $a$ and sample size $r$. If $r <$ ..., the process converges with probability one to a convention from any initial state. The stochastically stable states are those that correspond to strict Nash equilibria with minimum stochastic potential.
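Weak acyclicity and the path length $L_\Gamma$ can be checked mechanically for small games. The following sketch builds the best-reply graph in the usual sense (an edge joins two profiles when exactly one player switches to a best reply) and measures the longest of the shortest paths to a sink, that is, to a strict Nash equilibrium. The two example games and all function names are assumptions chosen for illustration.

```python
from collections import deque
from itertools import product

def best_reply_graph(payoff, sizes):
    """Map each pure profile to the profiles reached when one player best-replies.

    payoff(i, profile) is player i's payoff; sizes[i] is the number of strategies of i.
    """
    graph = {}
    for s in product(*(range(m) for m in sizes)):
        successors = []
        for i, m in enumerate(sizes):
            values = [payoff(i, s[:i] + (x,) + s[i + 1:]) for x in range(m)]
            best = max(values)
            successors += [s[:i] + (x,) + s[i + 1:]
                           for x in range(m) if x != s[i] and values[x] == best]
        graph[s] = successors
    return graph

def longest_shortest_path(graph):
    """Max over profiles of the shortest path length to a sink, or None if the
    game is not weakly acyclic (some profile cannot reach a sink)."""
    reverse = {s: [] for s in graph}
    for s, successors in graph.items():
        for t in successors:
            reverse[t].append(s)
    sinks = [s for s, successors in graph.items() if not successors]
    dist = {s: 0 for s in sinks}
    queue = deque(sinks)
    while queue:                      # multi-source breadth-first search
        t = queue.popleft()
        for s in reverse[t]:
            if s not in dist:
                dist[s] = dist[t] + 1
                queue.append(s)
    return max(dist.values()) if len(dist) == len(graph) else None

def coordination(i, s):
    """Hypothetical 2x2 coordination game with common payoffs."""
    return [[2, 0], [0, 1]][s[0]][s[1]]

def matching_pennies(i, s):
    """Hypothetical zero-sum game without pure equilibria."""
    v = 1 if s[0] == s[1] else -1
    return v if i == 0 else -v

if __name__ == "__main__":
    print(longest_shortest_path(best_reply_graph(coordination, [2, 2])))
    print(longest_shortest_path(best_reply_graph(matching_pennies, [2, 2])))
```

The coordination game is weakly acyclic with path length 1, whereas matching pennies has no sink at all, so the function returns `None` for it.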
The learning dynamics we are considering is not an adaptive play: the states are strategic structures of society instead of histories of size $m$. However, it has the same properties that Young's adaptive play has with respect to weakly acyclic games. Hence it is possible to establish the same result when $G_k$ is weakly acyclic, that is, when $n > s^*_k$ (the community is large enough). If $r <$ ..., it is possible to find a positive integer $M$ and to prove that, from each state $z$, the process reaches, with positive probability, a state that corresponds to a strict Nash equilibrium in at most $M$ periods. Then, if the community is large enough, the stochastically stable equilibria correspond to strict Nash equilibria; that is, an organization is formed. If instead the population is small, $G_k$ is not weakly acyclic, and no organization would form.
Theorem 3. a) If $n > s^*_k$ and $r <$ ..., a social strategic structure is a stochastically stable equilibrium if and only if all the members of exactly $s^*_k$ subpopulations choose P and the rest of the community choose NP. b) If $m < s^*_k$, no organization is formed in a stochastically stable equilibrium.
Proof.
The proof of a) follows the steps of Young's (1993), because $G_k$ is weakly acyclic when $n > s^*_k$.
i) Suppose the process is in state $z$ in period $t$. With positive probability, in period $t + i - 1$, for $i = 1, 2, \ldots, r$, the players are the members $r + i$ of every subpopulation $N_j$. With positive probability, in each of those periods all of them take as sample the first $r$ persons in each of the vectors of $z$. Let $\zeta^1$ be that sample. Each player then switches, for the next period, to a best reply to $\zeta^1$. Let $a$ be the profile of those best replies, and assume that member $r + i$ of $N_j$ chooses $a_j$, player $j$'s best reply to $\zeta^1$, in periods $i = 1, 2, \ldots, r$. This generates a state $z(1)$, reachable with positive probability, in which the decisions of the first $r$ members of each $N_j$ correspond to $\zeta^1$ and the decisions of members $r + 1, \ldots, 2r$ correspond to the profile $a$. Let $\zeta^2$ be the vector of size $r$ all of whose coordinates are $a$.
ii) Consider a sequence $\{a^0 = a, a^1, a^2, \ldots, a^l = a^*\}$ in the best-reply graph that joins $a$ to the strict Nash equilibrium $a^*$. For $i = 1, 2, \ldots, r$, let the members $2r + i$ of each $N_j$ be the players in period $2r + i$. Let $j_1$ be the unique player who changes strategy from $a$ to $a^1$ in the best-reply graph. The persons who represent $j_1$ in each of the $r$ periods considered take $\zeta^2$ as sample; the other players take $\zeta^1$. This generates a state $z(2)$, reachable with positive probability, in which the decisions of the first $r$ members of each $N_j$ correspond to $\zeta^1$, the decisions of members $r + 1, r + 2, \ldots, 2r$ correspond to $\zeta^2$, and the decisions of members $2r + 1, \ldots, 3r$ correspond to the profile $a^1$. Let $\zeta^3$ be the vector of size $r$ all of whose coordinates are $a^1$.
iii) By induction, one constructs the sequence $\{z, z(1), \ldots, z(l + 2)\}$ such that the decisions of the first $r$ members of each $N_j$ correspond to $\zeta^1$ and, for $i = 3, \ldots, l + 2$, the decisions of members $(i-1)r + 1, (i-1)r + 2, \ldots, ir$ correspond to $\zeta^i$, the vector of size $r$ all of whose coordinates are $a^{i-2}$. The last of these samples, $\zeta^{l+2}$, has all its coordinates equal to $a^l = a^*$; denote it by $\zeta^*$.
iv) Once $z(l + 2)$, which contains the sample $\zeta^*$, is reached, suppose that each person who represents one of the players in the following periods takes $\zeta^*$ as sample. Then, period by period, the coordinates of the states reached change to $a^*$. The process without mistakes reaches the state $z^*$, in which every member of each $N_j$ plays $a^*_j$, in at most $\max_j |N_j| + r(L_\Gamma + 1)$ periods.
On the other hand, all the strict Nash equilibria correspond to similar strategic structures of society: in each of them, all the members of exactly $s^*_k$ subpopulations choose P and the rest of the community choose NP. Hence all of those strategic structures are stochastically stable equilibria.
Proof of b): If $m < s^*_k$, the process without mistakes is connected, so all of the states are stochastically stable equilibria. In each of them no organization is formed.
The stochastically stable equilibria tell us that, in each stage, if the population is large enough, people form an organization and achieve a larger "accumulated stock", and if the population is small, there is no accumulation in that stage. To complete the dynamics of the stock, it is necessary to consider that the stock depreciates. The stochastically stable equilibria and depreciation determine which game will be played at the next stage.
3. Accumulation in the long run
We have seen that people, by the end of period $t$, might accumulate a new stock that continues acting in the next period. We assume that new stock does not depreciate, but that the stock that existed at the beginning of $t$ depreciates before period $t + 1$ begins. The conflict in stage $t$ is the game $G_k$, where $k$ is the stock accumulated up to $t$. At the end of $t$, if $s^*_k < m$, stock is accumulated through the cooperation of $s^*_k$ persons. If $s^*_k > m$, there is no accumulation, and $k$ only suffers depreciation.
Depreciation: The stock depreciates because laws and social forces wear down over time. Let us assume that the accumulated stock depreciates at a rate $p$, $p \in (0,1)$. If at the beginning of stage $t$ there were $k$ units of stock, and $\Delta k$ units are accumulated during $t$, then in $t + 1$ there will be $k' = (1 - p)\, k + \Delta k$ units of stock.
If, in period $t$, $k$ is the accumulated stock and the rate of depreciation is $p$, the stock that will be reached with probability almost one in $t + 1$ is
$$k' = \begin{cases} k(1-p) & \text{if } s^*_k > m, \\ k(1-p) + h - h' & \text{if } s^*_k < m, \end{cases} \qquad (7)$$
where $h$ is the stock accumulated by $s^*_k$ persons, and $h'$ is the loss in stock due to the organization cost.
In the long run, the community might remain organized, although the number of people that participate may increase and decrease. This is the case when a stock $k$ is reached such that $s^*_k < m$ but $\Delta k$ is smaller than or equal to $pk$. On the other hand, when it is possible to reach a stock $k$ such that $s^*_k > m$, there is no group inside the community that can allow people to earn more than $\gamma + f_i(k)$, and periods of disorganization occur until depreciation leads to a stock $k'$ such that $s^*_{k'} < m$. That is, an organization-disorganization cycle is provoked.
Example. Let us consider $f_i(k) = \beta k$, $f_{ac}(s) = \beta s$, $d_0(k) = \beta(\gamma + \beta k)$, where $\beta$ is the unitary (marginal) "productivity" of the public good. We consider the corresponding game $G$. If $\beta > 1$, $G$ is not a prisoner's dilemma: individual selfish interests lead people to cooperate, and they always cooperate without an enforcing organization. On the contrary, if $\beta < 1$, $G$ is a prisoner's dilemma. Let us study the behavior of the size of the stock and of the organization that emerges in the stochastically stable equilibria for a stock that is more or less productive ($\beta = 0.5$) and for a stock that is not productive ($\beta = 0.075$). The remaining parameters $(\gamma, p, d_c, n)$ do not change.
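Then $s^*_k$ is the smallest integer such that
$$s^*_k > \frac{d_c}{\beta} + \gamma + \beta k \qquad (8)$$
and
$$k' = \begin{cases} k(1-p) & \text{if } s^*_k > m, \\ k(1-p) + s^*_k - (\gamma + \beta k) & \text{if } s^*_k < m. \end{cases} \qquad (9)$$
Assume that $\gamma = 1$, $p = 0.1$, $d_c = 0.6$, $n = 10$.

$\beta = 0.5$

| Stock | Organization size |
|---|---|
| $k_0 = 0$ | $s_0 = 3$ |
| $k_1 = 2$ | $s_2 = 4$ |
| $k_2 = 3.8$ | $s_{3.8} = 5$ |
| $k_3 = 5.52$ | $s_{5.52} = 5$ |
| ... | ... |
| $k_{19} = 13.330$ | $s_{13.330} = 9$ |
| $k_{20} = 13.332$ | $s_{13.332} = 9$ |
| $k_{21} = 13.333$ | $s_{13.333} = 9$ |
| $k_{22} = 13.334$ | $s_{13.334} = 9$ |
| $k_{23} = 13.334$ | $s_{13.334} = 9$ |

$\beta = 0.075$

| Stock | Organization size |
|---|---|
| $k_0 = 0$ | $s_0 = 10$ |
| $k_1 = 9$ | $s_9 = 10$ |
| $k_2 = 16.425$ | 0 |
| $k_3 = 14.783$ | 0 |
| $k_4 = 13.305$ | $s_{13.305} = 10$ |
| ... | ... |
| $k_{14} = 13$ | $s_{13} = 10$ |
| $k_{15} = 19.725$ | 0 |
| $k_{16} = 17.753$ | 0 |
| $k_{17} = 15.978$ | 0 |
| $k_{18} = 14.380$ | 0 |
| $k_{19} = 12.942$ | $s_{12.942} = 10$ |
| ... | ... |
| $k_{45} = 19.622$ | 0 |
| $k_{46} = 17.660$ | 0 |
| $k_{47} = 15.824$ | 0 |
| $k_{48} = 14.305$ | 0 |
| $k_{49} = 12.330$ | $s_{12.330} = 10$ |
| $k_{50} = 19.622$ | 0 |

Cycle: 5 stages.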
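The accumulation dynamics of the example can be simulated stage by stage. The sketch below is a minimal reading of the example under stated assumptions: the organization size is taken as the smallest integer greater than $d_c/\beta + \gamma + \beta k$, an organization is assumed to form whenever that size does not exceed the community size (the tabulated stages with size 10 and $n = 10$ suggest this), and $m$ and $n$ are treated as the same community size. The function names are illustrative. Exact fractions are used so that the integer threshold is not affected by floating-point rounding.

```python
from fractions import Fraction
from math import floor

def organization_size(k, beta, gamma, d_c):
    """Smallest integer strictly greater than d_c/beta + gamma + beta*k."""
    return floor(d_c / beta + gamma + beta * k) + 1

def next_stock(k, beta, gamma, d_c, p, n):
    """One stage of the stock dynamics; returns (next stock, organization size or 0)."""
    s = organization_size(k, beta, gamma, d_c)
    if s > n:                       # assumption: no organization can form
        return k * (1 - p), 0
    return k * (1 - p) + s - (gamma + beta * k), s

def simulate(beta, stages, gamma=Fraction(1), d_c=Fraction("0.6"),
             p=Fraction("0.1"), n=10):
    """List of (stock at the start of the stage, organization size in the stage)."""
    k, path = Fraction(0), []
    for _ in range(stages):
        k_next, s = next_stock(k, beta, gamma, d_c, p, n)
        path.append((k, s))
        k = k_next
    return path

if __name__ == "__main__":
    for beta in (Fraction("0.5"), Fraction("0.075")):
        print("beta =", float(beta))
        for t, (k, s) in enumerate(simulate(beta, 25)):
            print(f"  k_{t} = {float(k):.3f}  size = {s}")
```

With $\beta = 0.5$ the simulated stock settles near $8/0.6 \approx 13.333$ with an organization of 9 members, and with $\beta = 0.075$ an organization of 10 members forms once every five stages, followed by four stages of pure depreciation.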
4. Conclusions
The presented model only intends to study the qualitative behavior of communities with environmental problems. Nonetheless, in spite of its limitations, we can study some essential aspects in the trend of those communities to organize themselves, and in the relationship between those trends and the accumulation of the community wealth. We stress as well that the model dynamics reproduces qualitative properties that are seen in the real dynamics of the emerging organizations inside different social sectors.
Let us review the qualitative conclusions obtained with this dynamic model.
1. It is clear that large human conglomerates are among the main actors responsible for the modern environmental catastrophe. Other important actors are the big corporations and the national states, particularly the world powers, as they try to satisfy their huge economic and political interests. Our model supports the view that these human conglomerates are also main actors in facing the catastrophe. They may act both by forcing themselves to change their own environment-damaging activities and by stopping those of the big corporations and state powers. If this is true, the Evolutionary Game approach is the proper one to discover which behavior patterns will emerge in the long run. Many policies are inspired by the idea that prohibitive laws, punishments and several other projects can by themselves abolish the negative behavior patterns. If these patterns are Long Run Equilibria of the process in which the communities are involved, that idea may be chimerical, and those laws and projects may not achieve their objectives. On the other hand, the environment-protecting laws, decrees and punishments can be very useful if they strengthen the virtuous Long Run Equilibria of the process in which the community is involved. They may even transform the game conditions in order to obtain virtuous equilibria.
2. The qualitative analysis carried out in this work lets us draw the following conclusions: a) If the direct payoff that the members of the community receive from the environment is very small relative to the cost necessary to protect it, they will act aggressively against the environment, either directly or indirectly by abandoning it. b) If, on the other hand, the benefit is large enough (the larger the better), a nucleus will emerge from inside the community in charge of protecting the environment and other communal interests, either continuously or cyclically, watching over it and stopping negative activities both from other members of the community and from actors with a bigger damaging capacity. This nucleus will also maintain the unity of the members of the community among themselves, as well as their link with their place of origin, reversing some of the trends to emigrate.
3. We think that this approach can also be useful to study other sorts of social problems, such as the rise of crime in a society, migration, etc. In our opinion, they also have their origin in the destruction of communal and social links, which provokes different social dilemmas of the same kind as the prisoner's dilemma. In order to enrich the model, it would be interesting to build a single dynamics that models learning and accumulation at the same time, in such a way that it would only be necessary to consider the equilibria of that process. In this paper, the community must reach an equilibrium of the learning dynamics in each stage in order to build the accumulation dynamics. In order to enrich the learning process, it could be interesting to consider that in the communities (especially in the big ones) the individuals may have a much bigger probability of contacting their neighbors, establishing an Organization Game with characteristics of a Spatial Game such as those of Young (1998). Another interesting topic to study is the evolution of free riders due to their opportunities to accumulate wealth and power. They may have a set of strategies different from that of the rest. This is also true for those representing the enforcement system, who may escape the control of the community. That is, they may become new players who seek to use the power they have obtained for their own benefit.
References
Axelrod R. 1984. The Evolution of Cooperation. New York. Basic Books.
Binmore K. 1995 and 1998. Game Theory and the Social Contract. Vol. 1. Playing Fair. Vol. 2. Just Playing. MIT Press. Cambridge, Massachusetts: London.
Freidlin M., Wentzell A. 1984. Random Perturbations of Dynamical Systems. Springer-Verlag: New York.
Glance N. S., Huberman B. A. 1993. The Outbreak of Cooperation. Journal of Mathematical Sociology, 17 (4): 281-302.
Glance N. S., Huberman B. A. 1997. Training and Turnover in the Evolution of Organizations. Organization Science, 8 (1): 84-96.
Hardin G. 1968. The Tragedy of the Commons. Science, 162: 1243-1248.
Kandori M., Mailath G., Rob R. 1993. Learning, Mutation and Long Run Equilibria in Games. Econometrica, 61: 29-56.
Kandori M., Rob R. 1995. Evolution of Equilibria in the Long Run: A General Theory and Applications. Journal of Economic Theory, 65: 383-414.
Okada A., Sakakibara K. 1991. The Emergence of the State: A Game Theoretic Approach to Theory of Social Contract. The Economic Studies Quarterly, 42 (4): 315-333.
Olson M. 1965. The Logic of Collective Action. Cambridge, MA: Harvard University Press.
Ostrom E. 1990. Governing the Commons: The Evolution of Institutions for Collective Action. Cambridge: Cambridge University Press.
Ostrom E. 1998. A Behavioral Approach to the Rational Choice Theory of Collective Action. American Political Science Review, 92 (1): 1-22.
Takayama A. 1985. Mathematical Economics. The Dryden Press. Hinsdale, Illinois.
Weissing F., Ostrom E. 1991. Irrigation Institutions and the Games Irrigators Play: Rule Enforcement Without Guards. In: Selten R. (ed.). Game Equilibrium Models II. Methods, Morals, and Markets. Springer-Verlag, Berlin, Heidelberg: 188-262.
Young P. 1993. The Evolution of Conventions. Econometrica. 61: 57-84.
Young P. 1998. Individual Strategy and Social Structure: An Evolutionary Theory of Institutions. Princeton University Press. Princeton, New Jersey.