Game Theory | World of Mathematics
Many games involve rolling dice, shuffling cards or spinning wheels, and we can use probability to determine how likely certain outcomes are. This chapter, on the other hand, is about games where there is no ‘luck’ involved: these games are called Combinatorial Games.
One example of a combinatorial game is chess, but it is so complex, with so many different moves and positions, that it is almost impossible to analyse chess using the methods we will develop throughout this chapter. Here is an example of a much simpler game:
There are two boxes with chocolates, and two players eat them alternatingly. At each turn, a player has to eat one or more chocolates, but only from one box at a time. For example, a player could eat three chocolates from box A, but not one from box A and one from box B.
Both players continue, alternatingly eating chocolates, until both boxes are empty. Whoever gets the last chocolate wins.
Here you can try playing this game against the computer. Start by clicking on all the chocolates you want to eat, then click the button to end your turn.
After some time you may notice that you always lose. In fact, it it clear from the beginning that the computer always wins unless it makes a mistake. (And computers never make mistakes…) The following sections will explore different methods to analyse combinatorial games, to find winning strategies and to determine whether it is better to go first or to go second.
If you have already noticed a pattern and worked out a winning strategy, the following sections may seem rather complicated for solving a simple game. However the same methods can them be applied to much more complex games.
One method to think about combinatorial games is to make a list of all possible outcomes. This is best done in a tree diagram, where every fork shows all possible choices a player could make. Here is the tree diagram for a slightly simpler version of the game above, with only three chocolates per box.
Whoever empties the first box loses, because the opponent can empty the other box, thereby taking the last chocolate. Therefore Player 1 only has two sensible choices: taking one or taking two chocolates from one box.
Now it’s Player 2’s turn. Player 2 also doesn’t want to empty any box. This means there are two possibilities in one case, and three in the other.
It’s Player 1’s turn. With fewer chocolates left there are much fewer possibilities that don’t involve emptying the first box, which would lead to certain loss.
We continue: if there is one chocolate in both boxes, one player has to empty the first box. The opponent can then take the last chocolate from the second box and wins.
Let us highlight the final positions in which Player 1 and Player 2 win.
So far the game seems quite fair: there are three winning positions for each player. Now let us think about the positions second from last.
Once we have arrived at any of these positions we already know who is going to win. If you can only go to a winning position for Player 1, this also has to be a winning position for Player 1. And the same is true for Player 2. We can colour the positions second from last according to which winning position they lead to.
And we can do the same again: whenever we can only go to winning positions of one kind, we colour the previous position in the same colour.
Unfortunately we will get stuck at some point…
When at these two positions there is a choice: we could either go to a winning position for Player 1 or to a winning position for Player 2. Here we have to make an assumption that both players are intelligent and will play in their best interest. If it is Player 1’s turn he/she will of course go to his/her own winning position and the same for Player 2.
We have coloured the first case blue because Player 2 has a choice to go to his/her own winning position. The second case is coloured green because Player 1 has a choice to go to his/her own winning position. These are the positions where players have to be careful not to make a mistake.
The same is happens here: Player 2 has the option to go to blue and green winning positions, but – if he plays intelligently – he/she will of course choose blue.
It seems that whatever Player 1 does on the first turn, he/she will always end up on a winning position for Player 2.
Player 2 is destined to win from the beginning – unless he makes a mistake.
This method is useful for simple games, but impractical if we have boxes with many more chocolates. If there are five chocolates per box, we would have to consider more than 10,000 possibilities!
P and N-Positions
In the tree diagram above we had many copies of the same state in different branches of the tree. Instead let us draw a diagram of all the different states, and connect two states with an arrow if a player could move from one to the other. (Remember that you can’t put chocolates back or take chocolates from more than one box.)
We will again highlight various states with different colours, but the colours will have a different meaning than above.
Here you can see all possible states of the game we were playing.
These arrows show the possible moving directions. On your turn you can move either down or to the right, however far you want.
We know that once we have reached the bottom right corner, the previous player will have taken the last chocolate and won. We call it a P-position and colour it blue.
If we are in any state that leads to the P-position, the next player will win. We call these N-positions and colour them red.
From here you can only move to N-position. Once you have done that, you opponent is next and can win. Therefore this has to be a P-position.
Any positions that leads to the new P-position have to be an N-position – The next player will win. Can you see a pattern?
Here is another P-position from where you can only move to N-positions.
These two are N-positions because the next player can move to a P-position.
Finally the starting position is a P-position. The “previous” player wins, and whoever makes the first move is destined to lose.
The pattern is quite obvious: all positions along the diagonal, where there are the same number of chocolates in each box, are P-positions. All the other positions are N-position. And this extends to bigger box sizes, including nine chocolates per box like in the game at the beginning of this article. You always made the first move, so you had no chance of winning unless the computer had made a mistake.
A P-position is a position in which the previous player will win (who moved to that position) and a N-position is a position where the next player will win (who moves away from that position). When playing, you want to make sure that you always end your turn on a P-position.
We also observed that from a P-position you can only move to N-positions, and from a N-position you can move to at least one P-position.
|Starting on a P-position, the next player will lose. Therefore the next player must only be able to move to N-positions.||Starting on an N-position, the next player will win. Therefore there must be at least one P-position where the next player can move to. (The game will change if you make a mistake.)|
To analyse a game, we have to start from the end when we know who would have won. Then we can work backwards using the two rules above to classify all positions in the game.
In any game that can be analysed using this method, the outcome is determined from the beginning. If you are unlucky and you are the player destined to lose, there is nothing you can do except hope that your opponent makes a mistake…
The Game of Nim
The game we have been thinking about is a variant of Nim. The winning strategy for only two boxes of chocolates is easy to find, but things get more interesting when we have three or more boxes. Instead of boxes of chocolates we will simply use piles of counters: you are allowed to take as many counters as you want, but only from one pile at a time. We can denote the various states of the game using numbers: for example, (2,5,4) means there are three piles with 2, 5 and 4 counters respectively.
Notice that it doesn’t matter which order the piles are in, or whether there are additional piles with zero counters. For example, (2,5,4) = (5,0,4,2). We have already shown that (1,1), (2,2), (3,3), … are all P-positions, and there is a simple method for determining whether positions with three or more piles are P or N. This method may seem quite unexpected and unrelated to game theory. It arises when you analyse P and N-positions mathematically.
A Nim state (a, b, c, …) is a P-position if the binary digital sum or Nim sum of a, b, c, … is 0. Otherwise it is a N-position. The Nim sum is often written as a ⊕ b ⊕ c ⊕ … and can be calculates as shown in the following example.
To find the Nim sum 3 ⊕ 6 ⊕ 7 we proceed as follows:
4 2 1 3 0 1 1 6 1 1 0 7 1 1 1 2 0 1 0
Above you can see the three numbers 3, 6 and 7 in different rows and powers of 2 (1, 2, 4, 8, …) in different columns. We first need to write 3, 6 and 7 in binary, which means writing them as a sum of powers of 2.
Note that 3 = 1 + 2. We need to add the powers 1 and 2, but we don’t need 4. Therefore 3 = 011 in binary.
Note that 6 = 2 + 4. We need the powers 2 and 4, but we don’t need 1. Therefore 6 = 110 in binary.
Note that 7 = 1 + 2 + 4. Here we need all three powers of two.
Now we have to add the columns we just created, but without carrying digits.
In this column, we have an even number of 1s so the answer is 0.
In this column we have an odd number of 1s so the answer is 1.
In this column we have an even number of 1s so the answer is 0.
The binary digital sum of 3, 6 and 7 is 010, and when we convert this from binary we get 2. Since the Nim sum is not 0, the Nim state (3,6,7) is a N-position. Removing two counters from any pile will make the Nim sum 0, so this would represent moving to a P-position.
Nim has several important properties:
- Exactly two opponents move alternately.
- The moves and all options are clearly specified by rules, and there are no chance moves.
- There are only finitely many different positions and the game will always come to an end when one player is unable to move. This means that there are no draws and no cycles, which could repeat forever.
- The players have perfect information. Card games often don’t have perfect information because one player doesn’t know the opponents’ cards.
- From any one position of the game both players have the same choice of moves. This is not true for chess, because from any particular position, one player can only move white figures and the opponent can only move black ones.
Games with all these properties are called Impartial Games. Mathematicians discovered that any impartial game is equivalent to a game of Nim with certain box sizes. This means that the P and N-positions match up, and that there are always the same number of possible moves. A winning strategy for any impartial game can be found by converting it into Nim and then using the Nim sum.Article on Combinatorial Game Theory
Impartial games are interesting to analyse from a mathematical point of view, but once you have found a winning strategy they are not particularly exciting to play – you know right from the beginning who is going to win.
There are many other combinatorial games. Some, like chess, are so complex that we can’t use methods like the ones above. Chess computers don’t try millions of different possibilities – they play very much like a human being would: analysing the current position and following certain strategies.
Another branch of Game Theory is about situations where people have to make decisions. The outcome depends on your own decision but also on everybody else’s decision – which we don’t know in advance. One example where this happens is economics: companies have to make business decisions and “play” against each other in various markets.
Here are a some interesting situations which can arise in game theory:
Imagine two prisoners are locked in two separate cells of a prison. They are accused of committing a crime together and are questioned individually. Both prisoners are promised to get away if they betray their accomplice, who will gets the full sentence of 10 years.
If both prisoners stay silent, there is not enough evidence so both get a shorter sentence of 1 year. If both betray each other, each is sentenced to 5 years in prison.
The following table shows the four possible outcomes, depending on the actions of Prisoner A and Prisoner B:
|Prisoner A betrays||Prisoner A stays silent|
|Prisoner B betrays||A: 5 years
B: 5 years
|A: 10 years
B goes free
|Prisoner B stays silent||A goes free
B: 10 years
|A: 1 year
B: 1 year
Let us suppose we were Prisoner A and thinking about which action to take: betraying of staying silent.
- If we knew Prisoner B would betray us (first row), betraying would get us 5 years while staying silent would get us 10 years. Thus we should also betray.
- If we knew Prisoner B would stay silent (second row), betraying would get us 0 years while staying silent would get us 1 year. Thus we should betray.
It seems that – no matter what Prisoner B does – betraying will give us a shorter time in jail and thus is the best thing to do.
Prisoner B will be thinking exactly the same and will also betray. Both prisoners will betray each other and will both be sentenced to 5 years in jail.
Notice, however, that if they had cooperated and both stayed silent, they would have managed to force an outcome that would have been better for both of them: just one year each.
In the Prisoners dilemma, the position (A betrays, B betrays) is called a Nash equilibrium: no individual player can improve his/her outcome by changing their strategy.
In 1951, the mathematician John Forbes Nash (* 1928) proved that all “games” of this kind have Nash equilibria (but there can be more than one). These are not necessarily the best outcome for all players (see the Prisoners dilemma), but they are the choices which players will end up making.
John Nash was jointly awarded the 1994 Nobel Prize in Economics for his work, and his biography is depicted in the Academy Award winning movie A Beautiful Mind.
Nash equilibria are of fundamental importance when analysing economic behaviour like that of big companies, as well as wars, arms races or even soccer games. In all these cases, we have to make a decision, taking into account the decisions our opponent(s) could make.
Here is another famous problem in game theory: decision making in marriage.
A couple decides to go out after work. They either go to the Opera (which the wife prefers) or to a football match (which the husband prefers). Their phones are broken and they cannot contact each other where to go.
The following table shows their respective “gains”. If both go to different places, their gain is 0. If they go to the same place, their gain is either 2 or 3, depending on whether they went to their preferred location or not. (The gain is often called the utility function.)
|Wife goes to Opera||Wife goes to Stadium|
|Husband goes to Opera||W: 3
|Husband goes to Stadium||W: 0
In this case there are two Nash equilibria: both going to the opera or both going to the stadium. This means that it is much harder to make a decision, often involving psychology and behavioural science (not just mathematics).
If we repeat this “experiment” many times we would observe that husband and wife don’t have a fixed strategy of going to a particular place, but that they employ a mixed strategy: both go to each location with a certain probability.
In this example you can calculate that, optimally, the husband goes to the stadium with probability 3/5 and to the opera with probability 2/5. For the wife, these numbers are swapped.