Dynamic logic versus GTS: A case study¹

Gabriel Sandu

Abstract. In this paper I will compare several solutions to a well-known puzzle: Monty Hall. This will enable us to illustrate various styles of logical reasoning, and in particular to compare dynamic logic with game-theoretical approaches.

Keywords: game-theoretical semantics, IF logic, Monty Hall, dynamic epistemic logic, conditional probabilities

1 Monty Hall: Formulation of the problem

There are two formulations of the puzzle. The first one is more general:

Monty Hall shows the contestant C three closed doors: behind one of them there is a prize, the other two are empty. C chooses a door. Monty Hall opens one of the other doors, which is empty. Then she asks C whether he would like to switch doors and choose the remaining closed one. Is it in C's interest to do it? (Richard Isaac, The Pleasures of Probability, 1995, p. 3)

The second formulation mentions a particular door chosen by the contestant:

Monty Hall (MH) hides a prize behind one of three doors, door 1, door 2, and door 3. The Contestant (C) has to guess it. Suppose his guess is door 1. Monty Hall, who knows the location of the prize and will not open that door, opens door 3 and reveals that there is no prize behind it. She then asks C whether he wishes to change from his initial guess to door 2. Will changing to door 2 improve C's chances of winning the prize? (Grinstead and Snell, Introduction to Probabilities, 1998)

¹I am indebted to Antonina Nepejvoda for the supercompilation of the Monty Hall puzzle.

The second formulation of the problem 'asks for the conditional probability that C wins if she switches doors, given that she has chosen door 1 and that Monty Hall has chosen door 3' (Grinstead and Snell). The first formulation, on the other hand, is about the comparative probabilities of two kinds of strategies for C, the 'switch' strategy and the 'stay' strategy:

We say that C is using the 'stay' strategy if she picks a door, and, if offered a chance to switch to another door, declines to do so (i.e., he stays with his original choice). Similarly, we say that C is using the 'switch' strategy if he picks a door, and, if offered a chance to switch to another door, takes the offer. Now suppose that C decides in advance to play the 'stay' strategy. Her only action in this case is to pick a door (and decline an invitation to switch, if one is offered). What is the probability that she wins a car? The same question can be asked about the 'switch' strategy. (Grinstead and Snell, Introduction to Probabilities, p. 137)

It should come as no surprise that the second formulation lends itself naturally to a solution in terms of conditional probabilities and updates in dynamic epistemic logic (DEL). The first formulation, on the other hand, suggests a game-theoretical solution. I will give one.

2 The conditional probabilities account

We consider the second variant of the puzzle. We start with some abbreviations: Di abbreviates 'the prize is behind door i'; B abbreviates 'Monty Hall opens door 3'. We make the initial assumption that the prize is equally likely to be behind each of the three doors. Hence

P(D1) = P(D2) = P(D3) = 1/3.

The second assumption is that when she has a choice, Monty Hall opens one of the two available doors at random. In our particular situation, in which C chose door 1, door 2 and door 3 are by symmetry equally likely to be opened. Hence P(B) = 1/2.

The other probabilities are calculated as follows.

• When D1, Monty Hall is free to open door 2 or door 3: P(B/D1) = 1/2;

• When D2, Monty Hall has to open door 3: P(B/D2) = 1

• When D3, Monty Hall has to open door 2: P(B/D3) = 0.

In order to solve the puzzle, we have to calculate three conditional probabilities:

a) P(D1/B): the probability that the prize is behind door 1, given that Monty Hall opened door 3;

b) P(D2/B): the probability that the prize is behind door 2, given that Monty Hall opened door 3;

c) P(D3/B): the probability that the prize is behind door 3, given that Monty Hall opened door 3.

Using Bayes' theorem

P(A/B) = (P(B/A) × P(A)) / P(B)

we obtain

P(D1/B) = (P(B/D1) × P(D1)) / P(B) = (1/2 × 1/3) / (1/2) = 1/3

P(D2/B) = (P(B/D2) × P(D2)) / P(B) = (1 × 1/3) / (1/2) = 2/3

P(D3/B) = (P(B/D3) × P(D3)) / P(B) = (0 × 1/3) / (1/2) = 0

Thus the answer to the initial question is: Yes, C should switch to door 2.
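For readers who want to replay the arithmetic, the calculation above can be checked in a few lines of Python. The sketch below is only my own illustration of the computation just given (it is not part of the original solution); the priors and likelihoods are the ones stated above.

    # Priors: the prize is equally likely to be behind each door.
    priors = {1: 1/3, 2: 1/3, 3: 1/3}            # P(D1), P(D2), P(D3)
    # Likelihoods of B = "Monty Hall opens door 3", given that C chose door 1.
    likelihood_B = {1: 1/2, 2: 1.0, 3: 0.0}      # P(B/D1), P(B/D2), P(B/D3)

    # Total probability: P(B) = sum_i P(B/Di) * P(Di) = 1/2.
    p_B = sum(likelihood_B[i] * priors[i] for i in priors)

    # Bayes' theorem: P(Di/B) = P(B/Di) * P(Di) / P(B).
    posteriors = {i: likelihood_B[i] * priors[i] / p_B for i in priors}

    print(p_B)         # 0.5
    print(posteriors)  # {1: 0.33..., 2: 0.66..., 3: 0.0}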

3 Conditional probabilities: trees

In order to facilitate comparison, we shall present the same solution given above using trees. (This is also the solution in Grinstead and Snell 1998.) The tree consists of 12 maximal branches, which correspond to all the possible choices of Monty Hall and the Contestant. Each maximal branch has the form (x, y, z), where:

• x stands for the door with the prize,

• y stands for the door chosen by C, and

• z stands for the door opened by Monty Hall.

In addition we have the following restrictions:

• If x = y, then z can take two possible values; and

• If x ≠ y, then z can take only one value.

Thus the sequence (1,2,3) represents the history:

1. MH hides the prize behind door 1; C chooses door 1; MH opens door 3.

It is customary in this setting to represent events as sets of branches of the tree. For instance, the event C1 of C's choosing door 1 corresponds to

C1 = {(1,1,2), (1,1,3), (2,1,3), (3,1,2)},

the event B of Monty Hall's opening door 3 corresponds to

B = {(1,1,3), (1,2,3), (2,1,3), (2,2,3)},

and the event C1 ∩ B of C's choosing door 1 and Monty Hall opening door 3 corresponds to

C1 ∩ B = {(2,1,3), (1,1,3)}.

Next, we endow the tree with a probability structure. First, we make the same assumptions as earlier:

• The events of the car being hidden behind door 1, door 2, and door 3 are equiprobable

• The events of C's choosing door 1, door 2, and door 3 are equiprobable.

• Whenever Monty Hall has a choice to open one of two doors, she chooses at random; and when she can open only one door, the probability is 1.

We can now calculate the probabilities of the events which interest us.

• The probability of the event C1 ∩ B = {(2,1,3), (1,1,3)}:

P({(2,1,3), (1,1,3)}) = P(2,1,3) + P(1,1,3) = (1/3 × 1/3 × 1) + (1/3 × 1/3 × 1/2) = 1/6

• The probability of the event D1 ∩ C1 ∩ B = {(1,1,3)}:

P(D1 ∩ C1 ∩ B) = 1/3 × 1/3 × 1/2 = 1/18

• The probability of the event D2 ∩ C1 ∩ B = {(2,1,3)}:

P(D2 ∩ C1 ∩ B) = 1/3 × 1/3 × 1 = 1/9

Finally, we apply Bayes' law to compute the probability that the car is behind door 1 given that C chose door 1 and Monty Hall opened door 3, P(D1/C1 ∩ B), and the probability that the car is behind door 2 given that C chose door 1 and Monty Hall opened door 3, P(D2/C1 ∩ B). We have

P(D1/C1 ∩ B) = P(D1 ∩ C1 ∩ B) / P(C1 ∩ B) = (1/18) / (1/6) = 1/3

and

P(D2/C1 ∩ B) = P(D2 ∩ C1 ∩ B) / P(C1 ∩ B) = (1/9) / (1/6) = 2/3.

We have obtained the same solution as above.
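The tree computation is equally easy to mechanize. The following Python sketch (my own illustration, not part of the paper) enumerates the twelve maximal branches with the weights described above and recovers the same two conditional probabilities.

    from itertools import product

    # (x, y, z): prize door, C's guess, door opened by Monty Hall.
    branches = {}
    for x, y in product([1, 2, 3], repeat=2):
        options = [z for z in (1, 2, 3) if z != x and z != y]
        for z in options:
            # uniform prize, uniform first guess, Monty Hall randomizes when free
            branches[(x, y, z)] = (1/3) * (1/3) * (1 / len(options))

    def prob(event):
        """Probability of an event, i.e. a set of maximal branches."""
        return sum(p for b, p in branches.items() if b in event)

    C1_B = {b for b in branches if b[1] == 1 and b[2] == 3}   # C1 ∩ B
    D1_C1_B = {b for b in C1_B if b[0] == 1}                  # D1 ∩ C1 ∩ B
    D2_C1_B = {b for b in C1_B if b[0] == 2}                  # D2 ∩ C1 ∩ B

    print(prob(C1_B))                    # 1/6
    print(prob(D1_C1_B) / prob(C1_B))    # 1/3
    print(prob(D2_C1_B) / prob(C1_B))    # 2/3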

4 Dynamic (Update) logic

4.1 Product updates

We consider the formulation of Monty Hall in which C chooses door 1. This carves out a subtree from the big tree above which consists of four maximal branches:

(1,1,2)

(1,1,3)

(2,1,3)

(3,1,2)

In update logic this tree is seen as generated in three stages.

1. First MH puts the prize behind one of the three doors. This generates an epistemic model M1 which corresponds to the first layer in the tree.

2. M1 is then updated with C's action a1: C chooses door 1. The result is the product model M2 which corresponds to the second layer.

3. Finally MH (publicly) opens some door. This updates M2 with two possible actions, a2 (= she opens door 2) and a3 (= she opens door 3). The result is the product model M3 which corresponds to the third layer of the tree.

Each action is associated with a set of preconditions which specify in which circumstances (possible worlds) it may be performed. C's and Monty Hall's actions are governed by the following principles which determine their preconditions:

a) C may choose any of the three doors

b) Monty Hall can open only a door that C did not choose, and where the car is not hidden.

Now some of the details.

The epistemic model M1 has the form

M1 = (W1, R1C, R1MH)

where

• W1 = {w1, w2, w3} (w1 represents the world where the car is behind door 1, etc.)

• R1MH = {(w, w) : w ∈ W1} (Monty Hall's actions are accessible to herself)

• R1C = W1 × W1 (Monty Hall's actions are not accessible to the contestant C).

At stage (2), M1 is updated with the action model A1 = (V1, Q1C, Q1MH), where V1 = {a1}. Given that a1 is a public action, both accessibility relations Q1C and Q1MH are V1 × V1. From (a) we know that Pre(a1) = W1. Hence

M2 = M1 × A1 = (W2, R2C, R2MH)

where

• W2 = W1 × V1.

• R2C = W2 × W2 (all the worlds in W2 remain indistinguishable to C)

• R2MH = {((w, a1), (w, a1)) : w ∈ W1} (Monty Hall knows exactly where she is).

Let us abbreviate the possible worlds in W2 by:

v1 = (w1, a1)

v2 = (w2, a1)

v3 = (w3, a1)

Finally, the product model M2 is updated with the action model

A2 = (V2, Q2C, Q2MH)

where


• V2 = {a2, a3} (a2 = Monty Hall opens door 2, etc.).

• Q2C = Q2MH = {(a2, a2), (a3, a3)}.

From (b) we know that Pre(a2) = {v1, v3} and Pre(a3) = {v1, v2}. The result of the update is the product model

M3 = M2 × A2 = (W3, R3C, R3MH)

where W3 consists of the pairs in W2 × V2 that satisfy the preconditions, and the accessibility relations R3C and R3MH inherit the uncertainties from M2.

Let us abbreviate the possible worlds of W3 by:

x = (w1, a1, a2)

y = (w1, a1, a3)

z = (w2, a1, a3)

u = (w3, a1, a2)

In order to give a solution to the puzzle, we need to establish what C knows at this stage, i.e. R3C. Given that a2 and a3 are public actions, C knows, after a2 is performed, that she could be either in x or in u, i.e. R3Cxu and R3Cux (plus the corresponding reflexivity conditions). And after a3 is performed, she knows she can be either in y or in z, that is, R3Cyz and R3Czy (plus the corresponding reflexivity conditions). Graphically:

w1                w2         w3
|                 |          |
a1                a1         a1
/    \            |          |
a2    a3          a3         a2
|     |           |          |
x     y           z          u
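The three updates can also be spelled out as a small computation. The Python sketch below is a rough illustration of the bookkeeping only (not a general DEL implementation); the world and action names mirror those used above, and preconditions are represented simply as sets of worlds.

    # Stage 1: M1 -- Monty Hall hides the prize.
    W1 = ['w1', 'w2', 'w3']                     # wi: the prize is behind door i

    # Stage 2: update with the public action a1 = "C chooses door 1".
    Pre1 = {'a1': set(W1)}                      # Pre(a1) = W1
    W2 = [(w, a) for w in W1 for a, pre in Pre1.items() if w in pre]

    # Stage 3: update with a2 = "MH opens door 2" and a3 = "MH opens door 3".
    # Rule (b) gives Pre(a2) = {v1, v3} and Pre(a3) = {v1, v2}.
    Pre2 = {'a2': {('w1', 'a1'), ('w3', 'a1')},
            'a3': {('w1', 'a1'), ('w2', 'a1')}}
    W3 = [v + (a,) for v in W2 for a, pre in Pre2.items() if v in pre]

    print(W2)   # the three worlds v1, v2, v3
    print(W3)   # the four worlds x, y, z, u, in the order listed above

    # C's accessibility in M3: since the opening is public, two worlds are
    # indistinguishable for C exactly when they end with the same action.
    R3_C = {(u, v) for u in W3 for v in W3 if u[-1] == v[-1]}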

4.2 Product updates with probabilities

Earlier on, we endowed trees with a probability structure. We now do the same for product update models. I follow very closely van Benthem (2003).

For epistemic models M, we consider, for each agent i, the equivalence classes Di,s = {t : Ri st}. Probability functions Pi,s are defined over the probability space Di,s. For simplicity, we take these functions to be uniform: all the worlds in the set Di,s are equiprobable. Following van Benthem, we simplify matters even more in finite models and assume that the functions Pi,s assign probabilities Pi,s(w) to single worlds w. We can then use sums of these values to assign probabilities to propositions, viewed as the sets of worlds where they are true. Then we can interpret Pi,s(p) as the probabilistic value assigned to p by the agent i in the possible world s. In case this value is 1, this will correspond to the assertion Ki p.

Next, we assign probabilities to actions in the universe of the action models A. This is done relative to a state s. The basic notion is Pi,s(a): the probability that the agent i assigns to action a in the world s. In our example we assume that all this has been settled in some way or another, giving us agents' probabilities for worlds, and also for actions at worlds.

Finally we are ready to handle the puzzle. We are interested in the last update. Given that Monty Hall's action of opening a door is a public one, reference to the agent i does not matter, and we shall be concerned with probability functions of the form Ps(a). We are interested in computing the relevant probabilities in

M3 = M2 × A2 = (W3, R3C, R3MH).

More specifically, we are interested in the probabilities the agents assign to the possible worlds in W3. As mentioned, these worlds have the form (v, a), where v = (wi, a1) is a world of W2 and a ∈ V2.

The central notion is PC,(v,a)(v', b): the probability agent C assigns to the world (v', b) in the world (v, a). In order to compute it, we need to know the probability PC,v(v') that C assigns to the world v' in v, and the probability Pv'(b) assigned to the action b in the world v'. But this is not enough, for the action b could have been performed from any other world u indistinguishable (for agent C) from v. So we also need the probabilities PC,v(u) for every u such that RC vu, together with the probabilities Pu(b). Then we use the formula:

PC,(v,a)(v', b) = (PC,v(v') × Pv'(b)) / Σ{u : RC vu} (PC,v(u) × Pu(b))

Thus in our case we need to compute the value of

PC,v1(v1) = PC,(w1,a1)(w1,a1)

and that of

PC,v1(v2) = PC,(w1,a1)(w2,a1).

We have

PC,(w1,a1)(w1,a1) = (PC,w1(w1) × Pw1(a1)) / (PC,w1(w1) × Pw1(a1) + PC,w1(w2) × Pw2(a1) + PC,w1(w3) × Pw3(a1)) = (1/3 × 1) / (1/3 × 1 + 1/3 × 1 + 1/3 × 1) = 1/3

A similar computation yields

PC,v1(v2) = PC,(w1,a1)(w2,a1) = 1/3.

Finally we are interested in

PC,y(y) = PC,(v1,a3)(v1,a3)

and

PC,y(z) = PC,(v1,a3)(v2,a3).

The first one represents the probability that C assigns in the (actual) world (v1, a3) (the prize is behind door 1, C chooses door 1, Monty Hall opens door 3) to the very same world; the second one represents the probability that C assigns in the world (v1, a3) to the world (v2, a3), which is identical to the actual world except for the prize being behind door 2. We have

PC,(v1,a3)(v1,a3) = (PC,v1(v1) × Pv1(a3)) / (PC,v1(v1) × Pv1(a3) + PC,v1(v2) × Pv2(a3)) = (1/3 × 1/2) / (1/3 × 1/2 + 1/3 × 1) = 1/3

Similarly

PC,(v1,a3)(v2,a3) = (PC,v1(v2) × Pv2(a3)) / (PC,v1(v1) × Pv1(a3) + PC,v1(v2) × Pv2(a3)) = (1/3 × 1) / (1/3 × 1/2 + 1/3 × 1) = 2/3

We recover the same result as earlier: it is rational for C to switch to door 2.
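The update rule used in this computation can again be checked mechanically. The Python sketch below (an illustration under the uniform-prior assumption of the text, not part of the paper) encodes C's prior over {v1, v2, v3} and the action probabilities, and applies the formula displayed above.

    # C's prior over the worlds of M2, and the action probabilities Pv(a).
    P_prior = {'v1': 1/3, 'v2': 1/3, 'v3': 1/3}
    P_act = {'v1': {'a2': 1/2, 'a3': 1/2},   # prize behind door 1: MH randomizes
             'v2': {'a2': 0.0, 'a3': 1.0},   # prize behind door 2: MH must open 3
             'v3': {'a2': 1.0, 'a3': 0.0}}   # prize behind door 3: MH must open 2

    def posterior(v_target, a):
        """P_{C,(v,a)}(v_target, a); here all of v1, v2, v3 are C-indistinguishable."""
        num = P_prior[v_target] * P_act[v_target][a]
        den = sum(P_prior[u] * P_act[u][a] for u in P_prior)
        return num / den

    print(posterior('v1', 'a3'))   # 1/3, i.e. PC,(v1,a3)(v1,a3)
    print(posterior('v2', 'a3'))   # 2/3, i.e. PC,(v1,a3)(v2,a3)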

5 Game-theoretical solutions

We consider the first, more general formulation of the puzzle. To my knowledge, there is no full-fledged game-theoretical solution in the literature.

I will first describe a solution, due to Isaac (1995), which comes close to a game-theoretical one.

5.1 Isaac's solution

Isaac represents the puzzle as consisting abstractly of the succession of three actions:

a) C chooses one of the three doors

b) Monty Hall opens one of the two remaining doors, the one without a prize

c) C switches doors

followed by a label W or L which shows whether C won or lost. The door where the prize is hidden is denoted by 1, the other two by 2 and 3.

Thus the sequence (1, 2, 3, L) should be read:

C chooses the door where the prize is; MH opens door 2; C switches to door 3; C loses.

Notice that the stage in the puzzle at which Monty Hall hides the prize is not explicitly represented. On the other hand, there is an extra layer which represents C's action of switching doors and another extra layer which specifies who lost or won. Notice also that the labels 1, 2 and 3 are not rigid: they do not designate any concrete door.

When we think of C's action of switching doors, 4 possible situations can occur:

• (C chose the door where the prize is; Monty Hall opens the other door, 2; C switches doors; C loses): (1,2,3,L)

• Identical with the previous one, except that the last two choices are reversed: (1,3,2,L)

• (C chose one of the doors without the prize, say 2; Monty Hall opens the other door without the prize, 3; C switches to 1; C wins): (2,3,1,W)

• Identical with the previous one, except that the first two choices are reversed: (3,2,1,W)

We now have to endow the space

{(1, 2, 3, L), (1, 3, 2, L), (2, 3,1, W), (3, 2,1, W)}

with probabilities.

It is reasonable to assume that the probability that C chooses the door where the prize is equals the probability that he chooses door 2 (without the prize) and the probability that he chooses door 3. The first event corresponds to

{(1,2,3,L), (1,3,2,L)}

and the last two to {(2,3,1,W)} and {(3,2,1,W)}. So we assume that

P({(1,2,3,L), (1,3,2,L)}) = 1/3

P({(2,3,1,W)}) = 1/3

P({(3,2,1,W)}) = 1/3

We do not know the probabilities P(1, 2, 3, L) and P(1, 3, 2, L) but we shall not need them. What we are interested in is the event 'C wins' which corresponds to

{(2, 3,1,W), (3,2,1,W)}

and the event 'C loses', which actually turns out to be the same event as 'C chose the door where the prize is'. Obviously

P({(2,3,1,W), (3,2,1,W)}) = P({(2, 3,1,W)}) + P({(3, 2,1, W)}) = 2/3

We now have an answer to our initial puzzle. Using the 'switch' strategy, C will win with probability 2/3 and lose with probability 1/3.

A similar argument will represent the 'stick to the same door' strategy by

{(1, 2,1,W), (1, 3,1,W), (2, 3, 2,L), (3, 2, 3,L)}

By an argument similar to the previous one, we see that the probability that C wins (= {(1,2,1,W), (1,3,1,W)}) is 1/3, whereas the probability that C loses is 2/3.
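Isaac's counting argument is easy to replay programmatically. The sketch below is my own illustration of it (with door 1 standing, as above, for the door with the prize); it computes the winning probabilities of the two strategies directly from the distribution of C's first choice.

    def win_probability(switch):
        """Probability that C wins when door 1 (by convention) hides the prize."""
        p_win = 0.0
        for first in (1, 2, 3):        # C's first choice, each with probability 1/3
            # Monty Hall opens a door distinct from door 1 and from C's choice;
            # which one she opens does not affect who wins.
            wins = (first != 1) if switch else (first == 1)
            p_win += (1/3) * (1.0 if wins else 0.0)
        return p_win

    print(win_probability(switch=True))    # 2/3
    print(win_probability(switch=False))   # 1/3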

Isaac's conclusion is: switching doors gives C a probability of 2/3 to win the car, whereas sticking to his initial choice will give him a probability of 1/3 to win the car. Notice that:

• The solution is general: it concerns the first variant of the Monty Hall puzzle.

• The solution does not appeal to conditional probabilities.

• There is a layer in the representation which makes explicit C's second guess.

We shall incorporate these elements in our game-theoretical solution.

5.2 A game-theoretical solution


We shall formulate Monty Hall as an extensive, finite win-lose game of imperfect information played by two players: the Contestant C tries to identify the door with the prize, whereas his opponent Monty Hall tries to deceive him. The game tree will extend the tree we introduced in connection with the conditional probabilities approach. Maximal branches will now have the form (x, y, z, t), with an extra

term t to stand for the final choice of C. In this setting the maximal sequence (1,1,2,1) represents the possible play of the game:

MH hides the prize behind door 1; C chooses door 1; MH opens door 2; C chooses door 1.

This play is a win for C if the last element of the sequence is the same as the first: in his second choice C chooses the door where the prize is. Note that in each play (x, y, z,t), x and z are choices made by Monty Hall, whereas y and t are choices made by C. To specify the information of the players, notice that

C1. Any histories (x) and (x') are equivalent (indistinguishable) for player C.

C2. Any histories (x, y, z) and (x', y', z') such that y = y' and z = z' are equivalent for player C.

(C1) tells us that C does not know the door where the prize is, when making his first choice. (C2) expresses the fact that C does not know the door where the prize is, when he makes his second choice.

Next we specify the strategies of the players. We shall take the strategies of C to consist of pairs (fy, ft) of functions: fy will give her a choice for y and ft a choice for t. Given the requirement (C1), fy will have to be a constant function, i.e. fy(x) = fy(x') for any doors x, x' where Monty Hall hides the prize. This amounts to fy being an individual i (a door). Similar comments apply to ft: given the requirement (C2), we can assume that ft is a function h of two arguments, y and z. All in all we shall take C's strategies to consist of pairs (i, hi), where i stands for a door and hi for a function of two arguments (y, z).

A strategy (i, hi) is winning if C wins every play where she follows it. The notion of 'following a strategy' is standard in game theory and we shall not give a formal definition.

We focus on two kinds of strategies for player C (all the others are weakly dominated by them).

• The 'stay' strategy, s_stay: choose a door, then stick to the initial choice no matter what Monty Hall does.

It is encoded by three strategy pairs, i.e.,

S_stay = {(i, hi) : i = 1, 2, 3},

where hi(y, z) = i, for every y and z.

Each such strategy (i,hi) is followed in every play

(x,i,z,hi(i,z))

for any x and z. As mentioned earlier, it is winning whenever C's initial guess is correct, i.e., i = x, and losing otherwise. Obviously none of the 'stay' strategies is winning simpliciter.

• The 'switch' strategy, s_switch: choose a door, and then after Monty Hall opens a door, switch doors.

This strategy is encoded by three strategy pairs

S_switch = {(1, f1), (2, f2), (3, f3)}

where

f1(1,2) = 3, f1(1,3) = 2, f2(2,3) = 1, f2(2,1) = 3, f3(3,2) = 1, f3(3,1) = 2.

Each of the three strategies wins in two cases, namely when the initial choice is incorrect, i ≠ x; and it loses in one case, when the initial choice is correct.

Monty Hall's strategies consist of pairs (j,g): j is a value for x; and the function g associates to each argument (x, y) a value for z.

The only strategy available for Monty Hall (given the rules of the game) is: 'hide the prize behind a door, and after C chooses a door, open any other door'. Thus her set of strategies, SMH, contains the following strategy pairs:

(1, g1): g1(1,1) = 2, g1(1,2) = 3, g1(1,3) = 2

(1, g1'): g1'(1,1) = 3, g1'(1,2) = 3, g1'(1,3) = 2

(2, g2): g2(2,1) = 3, g2(2,2) = 1, g2(2,3) = 1

(2, g2'): g2'(2,1) = 3, g2'(2,2) = 3, g2'(2,3) = 1

(3, g3): g3(3,1) = 2, g3(3,2) = 1, g3(3,3) = 1

(3, g3'): g3'(3,1) = 2, g3'(3,2) = 1, g3'(3,3) = 2

Each of the strategy pairs (j, gj) is followed in every play of the form (j, y, gj(j, y), t), for any y and t. It is winning whenever j ≠ t and losing otherwise. None of these strategies is winning simpliciter.

Monty Hall formulated as an extensive game of imperfect information is thus indeterminate: neither Monty Hall nor Eloise (the Contestant) has a winning strategy.
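This indeterminacy claim can be verified by brute force. The Python sketch below is an illustration only: the strategy representations are simplified so that only the values actually used in a play are enumerated. It runs through all strategy pairs and checks that no strategy of either player wins every play.

    from itertools import product

    DOORS = (1, 2, 3)

    # Monty Hall: hide behind j and, for each guess y, open a door
    # distinct from j and from y (two options when y = j, one otherwise).
    def mh_strategies():
        for j in DOORS:
            options = [[z for z in DOORS if z != j and z != y] for y in DOORS]
            for opening in product(*options):
                yield j, dict(zip(DOORS, opening))

    # Contestant: a first guess i and a second guess as a function of the
    # opened door z (only the values of ft at (i, z) matter in a play).
    def c_strategies():
        for i in DOORS:
            for second in product(DOORS, repeat=3):
                yield i, dict(zip(DOORS, second))

    def c_wins(c_strat, mh_strat):
        i, second = c_strat
        j, opening = mh_strat
        z = opening[i]               # the door Monty Hall opens
        return second[z] == j        # C wins iff the final choice hits the prize

    C, MH = list(c_strategies()), list(mh_strategies())
    print(any(all(c_wins(c, m) for m in MH) for c in C))        # False
    print(any(all(not c_wins(c, m) for c in C) for m in MH))    # False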

To overcome indeterminacy we move to mixed strategies. Before that, we need a few definitions and results from classical game theory.

5.2.1 Strategic games: equilibria in pure strategies

A finite two-player strategic game has the form Γ = (SI, SII, uI, uII) where:

1. SI is the set of strategies of the first player

2. SII is the set of strategies of the second player

3. uI and uII are the payoff functions of the players. That is, for every σ ∈ SI and τ ∈ SII, uI(σ, τ) gives player I a payoff, which is a real number; and the same for uII.

Fix a two-player strategic game Γ = (SI, SII, uI, uII). When σ* ∈ SI and τ* ∈ SII, the pair (σ*, τ*) is an equilibrium in Γ iff the following two conditions are jointly satisfied:

(i) uI(σ*, τ*) ≥ uI(σ, τ*) for every strategy σ in SI. In other words

uI(σ*, τ*) = max_σ uI(σ, τ*)

(ii) uII(σ*, τ*) ≥ uII(σ*, τ) for every strategy τ in SII. In other words

uII(σ*, τ*) = max_τ uII(σ*, τ)

When SI and SII are finite, there is a simple algorithm for identifying the equilibria:

• In each column, circle the maximum payoffs of player I (if the maximum payoff occurs more than once, circle every occurrence)

• In each row, circle the maximum payoffs of player II

• A pair of strategies (σ*, τ*) is an equilibrium in Γ iff both uI(σ*, τ*) and uII(σ*, τ*) are circled.

It is straightforward to transform the extensive Monty Hall game into a finite 2 player win-lose strategic game.

We shall take the two players to be Monty Hall and C.

We have already specified the set of strategies of Monty Hall, SMH, and the set of strategies of the Contestant, SC. Notice that whenever Monty Hall follows one of her strategies in SMH, and C follows one of his strategies in SC, a play of the extensive game is generated which is a win for exactly one of the players. For instance, when Monty Hall follows (3, g3) and C follows (1, h1), the result is the play (3,1,2,1), which is a win for Monty Hall. This will fix the payoff functions uMH and uC. Here is the matrix representation of the strategic Monty Hall game:

         (1,g1)  (1,g1')  (2,g2)  (2,g2')  (3,g3)  (3,g3')
(1,h1)   (1,0)   (1,0)    (0,1)   (0,1)    (0,1)   (0,1)
(2,h2)   (0,1)   (0,1)    (1,0)   (1,0)    (0,1)   (0,1)
(3,h3)   (0,1)   (0,1)    (0,1)   (0,1)    (1,0)   (1,0)
(1,f1)   (0,1)   (0,1)    (1,0)   (1,0)    (1,0)   (1,0)
(2,f2)   (1,0)   (1,0)    (0,1)   (0,1)    (1,0)   (1,0)
(3,f3)   (1,0)   (1,0)    (1,0)   (1,0)    (0,1)   (0,1)

The rows represent the strategies of the Contestant and the columns those of Monty Hall. The reader may convince himself, by applying the algorithm described above, that there is no equilibrium in the game. This is, obviously, nothing else than the counterpart of the indeterminacy of the extensive game of imperfect information.
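The 'circling' algorithm described above can be run on this matrix to confirm the claim. In the Python sketch below (an illustration only; the payoffs are simply copied from the table), a cell is an equilibrium iff C's payoff is maximal in its column and Monty Hall's payoff is maximal in its row.

    # Rows: C's strategies; columns: Monty Hall's strategies (same order as above).
    u_C = [[1, 1, 0, 0, 0, 0],   # (1,h1)
           [0, 0, 1, 1, 0, 0],   # (2,h2)
           [0, 0, 0, 0, 1, 1],   # (3,h3)
           [0, 0, 1, 1, 1, 1],   # (1,f1)
           [1, 1, 0, 0, 1, 1],   # (2,f2)
           [1, 1, 1, 1, 0, 0]]   # (3,f3)
    u_MH = [[1 - x for x in row] for row in u_C]   # win-lose game

    equilibria = [(r, c)
                  for r in range(6) for c in range(6)
                  if u_C[r][c] == max(u_C[i][c] for i in range(6))
                  and u_MH[r][c] == max(u_MH[r][j] for j in range(6))]
    print(equilibria)   # []  -- no equilibrium in pure strategies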

5.2.2 Strategic games: mixed strategy equilibria

Let Γ = (SI, SII, uI, uII) be a two-player finite strategic game.

• A mixed strategy ν for player p is a probability distribution over Sp, that is, a function ν : Sp → [0,1] such that

Σ{τ ∈ Sp} ν(τ) = 1

• ν is uniform over S'p ⊆ Sp if it assigns equal probability to all the strategies in S'p and zero probability to all the strategies in Sp − S'p.

Let Δ(Sp) be the set of mixed strategies over Sp. If μ ∈ Δ(SI) and ν ∈ Δ(SII), the expected utility for player p is given by:

Up(μ, ν) = Σ{σ ∈ SI} Σ{τ ∈ SII} μ(σ) ν(τ) up(σ, τ).

We can identify a pure strategy σ ∈ SI with a 'degenerate' mixed strategy which assigns probability 1 to σ and 0 to all the other strategies in SI. That is, when σ ∈ SI and ν ∈ Δ(SII), we let

Up(σ, ν) = Σ{τ ∈ SII} ν(τ) up(σ, τ).

Similarly, when τ ∈ SII and μ ∈ Δ(SI), we let

Up(μ, τ) = Σ{σ ∈ SI} μ(σ) up(σ, τ).

Let Γ = (SI, SII, uI, uII) be a two-player finite strategic game which is also a win-lose game (the only payoffs are 0 and 1). For μ* ∈ Δ(SI) and ν* ∈ Δ(SII), the definition of (μ*, ν*) being a mixed strategy equilibrium in Γ is completely analogous to the earlier one. The following results are well known.

Theorem 1 (von Neumann's Minimax Theorem). Every finite, two-person, win-lose game has an equilibrium in mixed strategies.

Corollary 1. Let (μ, ν) and (μ', ν') be two mixed strategy equilibria in a win-lose game. Then Up(μ, ν) = Up(μ', ν').

The above results tell us that for two-player finite win-lose games an equilibrium always exists (von Neumann's theorem), and, in addition, any two mixed strategy equilibria deliver the same expected utility. We shall take the value of the game to be the expected utility delivered by any mixed strategy equilibrium of the game.

We give a simple algorithm for identifying mixed strategy equilibria:

Proposition 1. In a finite, two-player strategic game, the pair (μ*, ν*) is an equilibrium if and only if the following conditions hold:

1. UI(μ*, ν*) = UI(σ, ν*) for every σ ∈ SI in the support of μ*

2. UII(μ*, ν*) = UII(μ*, τ) for every τ ∈ SII in the support of ν*

3. UI(μ*, ν*) ≥ UI(σ, ν*) for every σ ∈ SI outside the support of μ*

4. UII(μ*, ν*) ≥ UII(μ*, τ) for every τ ∈ SII outside the support of ν*.

Here are a few results from classical game theory which help us to reduce a game to a smaller one, after which we can apply the Proposition above.

Definition 1. Let Γ = (SI, SII, uI, uII) be a finite two-player, win-lose strategic game. For σ, σ' ∈ SI, we say that σ' weakly dominates σ if the following two conditions hold:

(i) for every τ ∈ SII:

uI(σ', τ) ≥ uI(σ, τ)

(ii) for some τ ∈ SII:

uI(σ', τ) > uI(σ, τ).

A similar notion is defined for Abelard (the second player).

The following result enables us to eliminate weakly dominated strategies.

Proposition 2. Let Γ = (SI, SII, uI, uII) be a finite two-player, win-lose strategic game. Then Γ has an equilibrium in mixed strategies (μI, μII) such that for each player p none of the strategies in the support of μp is weakly dominated in Γ.

A proof of this fact may be found in Mann et al (Proposition 7.22).

Definition 2. Let Γ = (SI, SII, uI, uII) be a finite two-player, win-lose strategic game. For σ, σ' ∈ SI, we say that σ' is payoff equivalent to σ if for every τ ∈ SII: uI(σ', τ) = uI(σ, τ).

A similar notion is defined for Abelard. The next Proposition allows us to reduce the game to a smaller one by eliminating all the payoff equivalent strategies, except one.

Proposition 3. Let Γ = (SI, SII, uI, uII) be a finite two-player, win-lose strategic game. Then Γ has an equilibrium in mixed strategies (μI, μII) such that for each player p no two strategies in the support of μp are payoff equivalent.

A proof of this fact may be found in Mann et al (Proposition 7.23).

We now return to the strategic Monty Hall game. We notice that each strategy (i, hi) is weakly dominated by some strategy (j, fj). For instance, (1, h1) is weakly dominated by (2, f2). Hence by Proposition 2 we know that the game has the same value as the game

         (1,g1)  (1,g1')  (2,g2)  (2,g2')  (3,g3)  (3,g3')
(1,f1)   (0,1)   (0,1)    (1,0)   (1,0)    (1,0)   (1,0)
(2,f2)   (1,0)   (1,0)    (0,1)   (0,1)    (1,0)   (1,0)
(3,f3)   (1,0)   (1,0)    (1,0)   (1,0)    (0,1)   (0,1)

The next observation is that the strategies (i, gi) and (i, gi') are payoff equivalent for Abelard. Hence by the last proposition we know that the value of the game is the same as that of the game:

         (1,g1)  (2,g2)  (3,g3)
(1,f1)   (0,1)   (1,0)   (1,0)
(2,f2)   (1,0)   (0,1)   (1,0)
(3,f3)   (1,0)   (1,0)   (0,1)

Let μ be the uniform probability distribution, i.e. μ(i, fi) = 1/3, and ν the uniform probability distribution ν(j, gj) = 1/3. It is straightforward to check, using Proposition 1, that (μ, ν) is an equilibrium. The expected utility of player C (i.e., the value of the game) for this equilibrium is 2/3.

Notice that C's strategy μ assigns an equal probability to each of the pure strategies which implement the 'switch' strategy. The important thing is not that it returns to player C an expected utility of 2/3 but rather that it weakly dominates the 'stay' strategy. If we want to compute the expected utility returned to C by the latter strategy, we should return to the bigger game where both the 'switch' and the 'stay' strategies are listed. We know that the value of the game described there is the same as that delivered by the equilibrium pair (μ, ν). In that game, let ν* be the same as ν, and let μ* be the probability distribution such that μ*(i, fi) = 1/3 and μ*(i, hi) = 0. The pair (μ*, ν*) is an equilibrium in this larger game. We compute UC((i, hi), ν*):

UC((i, hi), ν*) = Σ{τ ∈ SMH} ν*(τ) uC((i, hi), τ) = 1/3 × 1 = 1/3.

In other words, the 'stay' strategy returns an expected utility of 1/3. We have obtained the same result as in Isaac's approach.
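The equilibrium claims of this subsection can also be checked numerically. The Python sketch below is an illustration only; the payoff matrix is the one displayed earlier, μ* is uniform over the three 'switch' strategies, and ν* is uniform over (1,g1), (2,g2), (3,g3). It computes the expected utilities that appear in Proposition 1.

    u_C = [[1, 1, 0, 0, 0, 0],   # (1,h1)
           [0, 0, 1, 1, 0, 0],   # (2,h2)
           [0, 0, 0, 0, 1, 1],   # (3,h3)
           [0, 0, 1, 1, 1, 1],   # (1,f1)
           [1, 1, 0, 0, 1, 1],   # (2,f2)
           [1, 1, 1, 1, 0, 0]]   # (3,f3)

    mu = [0, 0, 0, 1/3, 1/3, 1/3]        # mu*: uniform over the 'switch' strategies
    nu = [1/3, 0, 1/3, 0, 1/3, 0]        # nu*: uniform over (1,g1), (2,g2), (3,g3)

    def U_C_row(r):                      # C's expected payoff for a pure row against nu
        return sum(nu[c] * u_C[r][c] for c in range(6))

    def U_MH_col(c):                     # MH's expected payoff for a pure column against mu
        return sum(mu[r] * (1 - u_C[r][c]) for r in range(6))

    print([U_C_row(r) for r in range(6)])    # [1/3, 1/3, 1/3, 2/3, 2/3, 2/3] (as floats)
    print([U_MH_col(c) for c in range(6)])   # [1/3, 1/3, 1/3, 1/3, 1/3, 1/3]
    print(sum(mu[r] * U_C_row(r) for r in range(6)))   # 2/3, the value of the game

The first three entries of the row list are exactly the 1/3 obtained above for the 'stay' strategies, and the equality/inequality pattern is the one required by Proposition 1.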

6 IF logic

IF logic (Independence-Friendly logic) is an extension of first-order logic which contains quantifiers and connectives of the form

(∃x/W), (∀x/W), (∨/W), (∧/W)

where the interpretation of e.g. (∃x/W) is: 'the choice of x is independent of the values of the variables in W'. When W = ∅, we recover the standard quantifiers. For illustration, the sentence

• For every x and x', there exists a y depending only on x and a y' depending only on x' such that Q(x, x', y, y') is true

is rendered in the new symbolism by

∀x∀x'(∃y/{x'})(∃y'/{x, y})Q(x, x', y, y').

IF sentences are interpreted by semantical games of imperfect information (Mann et al). However, we prefer to give an interpretation in terms of Skolem functions and Kreisel counterexamples.

The skolemized form or skolemization of φ, with free variables in U, SkU(φ), is given by the following clauses:

1. SkU(ψ) = ψ, for ψ a literal

2. SkU(ψ ◦ θ) = SkU(ψ) ◦ SkU(θ), for ◦ ∈ {∨, ∧}

3. SkU((∀x/W)ψ) = ∀x SkU∪{x}(ψ)

4. SkU((∃x/W)ψ) = Sub(SkU∪{x}(ψ), x, f(y1, ..., yn))

where y1, ..., yn are all the variables in U − W and f is a new function symbol of appropriate arity. We abbreviate Sk∅(φ) by Sk(φ).

Skolemizing makes explicit the dependencies of variables. We obtain an alternative definition of truth. For every IF formula φ, model M, and assignment s which includes the free variables of φ we let: M, s ⊨+Sk φ if and only if there exist functions g1, ..., gn of appropriate arity in M to be the interpretations of the new function symbols in SkU(φ) such that

M, g1, ..., gn, s ⊨ SkU(φ)

where U is the domain of s. The functions g1, ..., gn are called Skolem functions.

We now define the dual procedure of Skolemization. The Kreisel form KrU(φ) of the IF formula φ in negation normal form with free variables in U is defined by:

1. KrU(ψ) = ¬ψ, for ψ a literal

2. KrU(ψ ∨ θ) = KrU(ψ) ∧ KrU(θ)

3. KrU(ψ ∧ θ) = KrU(ψ) ∨ KrU(θ)

4. KrU((∃x/W)ψ) = ∀x KrU∪{x}(ψ)

5. KrU((∀x/W)ψ) = Sub(KrU∪{x}(ψ), x, g(y1, ..., ym))

where y1, ..., ym are all the variables in U − W.

We now obtain an alternative definition of falsity. For every IF formula φ, model M, and assignment s which includes the free variables of φ we let: M, s ⊨−Sk φ if and only if there exist functions h1, ..., hm in M to be the interpretations of the new function symbols in KrU(φ) such that

M, h1, ..., hm, s ⊨ KrU(φ)

where U is the domain of s. We call h1, ..., hm Kreisel counterexamples.

The Monty Hall game is expressed in IF logic by the sentence

∀x(∃y/{x})∀z[x ≠ z ∧ y ≠ z → (∃t/{x}) x = t]

or equivalently by the sentence φMH

∀x(∃y/{x})∀z[x = z ∨ y = z ∨ (∃t/{x}) x = t].

We can think of the Contestant, C, as the existential quantifiers and disjunctions, and of Monty Hall as the universal quantifiers. We do not want to push the formalization too far. The intuitive reading of our sentence should be clear: for every door x where the prize is hidden by Monty Hall, for every door y guessed by C, and for every door z opened by Monty Hall, if z is distinct from x and from y, then C has one more choice to identify the door where the prize is. The Skolemized form of φMH is

∀x∀z[x = z ∨ c = z ∨ x = f(c, z)]

and its Kreisel form is

∀y[d ≠ g(d, y) ∧ y ≠ g(d, y) ∧ ∀t(d ≠ t)]

where c, d, f and g are new function symbols. The reader should convince herself that on the model M = {1, 2, 3} (corresponding to the three doors) the possible values of (c, f) correspond to the set of strategies of the Contestant, and the possible values of (d, g) correspond to the set of strategies of Monty Hall.

Then we can identify the value of φMH in the model M = {1, 2, 3} to be 2/3.
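As a final illustration (my own, in the spirit of the equilibrium semantics of Sevenster and Sandu [16], not part of the paper's formal development), one can compute this value directly from the Skolem and Kreisel forms: Eloise's strategies are possible values of (c, f), Abelard's are possible values of (d, g), and Eloise wins a play iff the quantifier-free matrix x = z ∨ y = z ∨ x = t comes out true. The Python sketch below evaluates the uniform 'switch' / uniform legal-opening profile from Section 5 and recovers the value 2/3.

    from itertools import product

    M = (1, 2, 3)

    def eloise_wins(c, f, d, g):
        z = g[(d, c)]                  # the door Abelard (Monty Hall) opens
        t = f[(c, z)]                  # Eloise's (C's) final choice
        return d == z or c == z or d == t

    # Eloise's 'switch' strategies: c = i and f(i, z) = the door distinct from i, z.
    switch = []
    for i in M:
        f = {(y, z): (next(w for w in M if w not in (y, z)) if y != z else y)
             for y in M for z in M}
        switch.append((i, f))

    # Abelard's 'legal' strategies: hide behind d, open a door distinct from d and y.
    legal = []
    for d in M:
        options = [[z for z in M if z not in (d, y)] for y in M]
        for opening in product(*options):
            legal.append((d, {(d, y): opening[y - 1] for y in M}))

    pairs = [(s, a) for s in switch for a in legal]
    value = sum(eloise_wins(c, f, d, g) for (c, f), (d, g) in pairs) / len(pairs)
    print(len(legal), value)           # 6 legal strategies for Abelard, value 2/3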

7 Conclusion

The update account gave the same solution to the Monty Hall problem as the classical account based on conditional probabilities. Both approaches conditionalize, the former on actions, the latter on propositions, and both yield two posterior probabilities: P(D1/B) = 1/3 and P(D2/B) = 2/3 in the latter; and PC,(v1,a3)(v1,a3) and PC,(v1,a3)(v2,a3) in the former. I take both approaches to provide a solution to a particular, local, decision-theoretical problem, that of explaining why a particular action is more rational than another in certain particular circumstances.

Yet there are important differences between them. Van Benthem points out that the conditional probabilities account describes what would be the probability that the car is behind door 1 if B were to happen (alternatively, if action a3 were to be performed). On the other hand, he takes P(w1,a3)(p) (reference to the agent C has been erased) to describe rather the probability of p in the state (w1, a3) reached now, after action a3 has been performed. 'The [latter] takes place once arrived at one's vacation destination, the [former] is like reading a travel folder and musing about tropical islands. The two points are related, but not identical'.

References

[1] Baltag, A., Moss, L., and S. Solecki, The Logic of Public Announcements, Common Knowledge, and Private Suspicions, Proceedings of TARK, pp. 43-56, 1998.

[2] Barbero, F., and G. Sandu, Signaling in Independence-friendly logic, forthcoming.

[3] Van Benthem, J., Gerbrandy, J., and B. Kooi, Dynamic Update with Probabilities, Studia Logica 93(1):67-96, 2009.

[4] Van Benthem, J., Conditional Probability Meets Update Logic, Journal of Logic, Language and Information 12(4):409-421, 2003.

[5] Galliani, P., and G. Sandu, Games in Natural Language, in R. Verbrugge, S. Ghosh, and J. van Benthem (eds.), Handbook of Strategic Reasoning, Springer, forthcoming.

[6] Grinstead, Ch. M., and L. Snell, Introduction to Probabilities, American Mathematical Society, 1998 (second edition).

[7] Hintikka, J., and G. Sandu, Informational Independence as a Semantic Phenomenon, in J. E. Fenstad et al. (eds.), Logic, Methodology and Philosophy of Science, vol. 8, Amsterdam: Elsevier, 1989, pp. 571-589.

[8] Hintikka, J., and G. Sandu, Game-Theoretical Semantics, in: J. van Benthem & A. ter Meulen (eds.), Handbook of Logic and Language, Elsevier, Amsterdam, 1997, pp. 361-410.

[9] Hintikka, J., Principles of Mathematics Revisited, Cambridge, UK: Cambridge University Press, 1996.

[10] Isaac, R., The Pleasures of Probability, Springer-Verlag, New York, 1995.

[11] Mann, A. I., Sandu, G., and M. Sevenster, Independence-Friendly Logic: A Game-Theoretic Approach, Cambridge, UK: Cambridge University Press, 2011.

[12] Van Rooij, R., Signalling Games Select Horn Strategies, in G. Katz, S. Reinhard, and Ph. Reuter (eds.), Sinn und Bedeutung VI, Proceedings of the Sixth Annual Meeting of the Gesellschaft für Semantik, University of Osnabrück, 2002, pp. 289-310.

[13] Sandu, G., Game-Theoretical Semantics, in L. Horsten and R. Pettigrew (eds.), Continuum Companions to Philosophical Logic, 2011, pp. 251-300.

[14] Sandu, G., Independence-Friendly Logic: Dependence and Independence of Quantifiers in Logic, Philosophy Compass 7(10):691-711, 2012.

[15] Sevenster, M., Branches of Imperfect Information: Logic, Games, and Computation, PhD Thesis, Amsterdam: University of Amsterdam, 2006.

[16] Sevenster, M., and G. Sandu, Equilibrium Semantics of Languages of Imperfect Information, Annals of Pure and Applied Logic 161(5):618-631, 2010.

[17] Tao, T., Printer-friendly CSS, and nonfirstorderisability, 2007, in Terence Tao, What's new: Updates on my research and expository papers, discussion of open problems, and other maths related topics.
