Cards, dice, roulette and game shows: probability is one of the most fun areas of mathematics, full of surprises and real-life applications.


Probabilities and likelihoods are everywhere around us, from weather forecasting to games, insurance or election polls. However, in the history of mathematics, probability is actually a very recent idea. While geometry and algebra were studied by ancient Greek mathematicians more than 2500 years ago, the concepts of probability only emerged in the 17th and 18th centuries.

According to legend, two of the greatest mathematicians, Blaise Pascal and Pierre de Fermat, would regularly meet up in a small cafe in Paris.

To distract from the difficult mathematical theories they were discussing, they often played a simple game: they repeatedly tossed a coin – every heads was a point for Pascal and every tails was a point for Fermat. Whoever had fewer points after three coin tosses had to pay the bill.

One day, however, they were interrupted after the first coin toss and Fermat had to leave urgently. Later, they wondered who should pay the bill, or if there was a fair way to split it. The first coin had landed heads (a point for Pascal), so maybe Fermat should pay everything. However, there was a small chance that Fermat could still have won if the next two tosses had been tails.

Pascal and Fermat decided to write down all possible ways the game could have continued:


Heads, Heads, Heads: Pascal wins
Heads, Heads, Tails: Pascal wins
Heads, Tails, Heads: Pascal wins
Heads, Tails, Tails: Fermat wins

All four possible outcomes are equally likely, and Pascal wins in three of them. Thus they decided that Fermat should pay 3/4 of the bill and Pascal should pay 1/4.

Pascal and Fermat had discovered the first important equation of probability: if an experiment has multiple possible outcomes which are all equally likely, then

Probability of an event = (Number of ways the event could happen) / (Total number of possible outcomes).

In our example, the probability of Pascal winning the game is 3/4 = 0.75, and the probability of Fermat winning the game is 1/4 = 0.25.
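The counting argument above can be checked mechanically. The following is a minimal Python sketch, assuming the first toss was heads and two tosses remain; the variable names (`outcomes`, `pascal_wins`, and so on) are illustrative and not part of the original story.

```python
from fractions import Fraction
from itertools import product

# The first toss already landed heads (a point for Pascal).
first = "H"

# Enumerate every way the remaining two tosses could land.
outcomes = [first + "".join(rest) for rest in product("HT", repeat=2)]

# Pascal wins a game if heads came up more often than tails.
pascal_wins = [game for game in outcomes if game.count("H") > game.count("T")]

# Probability = number of favourable outcomes / total number of outcomes.
p_pascal = Fraction(len(pascal_wins), len(outcomes))
p_fermat = 1 - p_pascal

print(outcomes)
print(p_pascal, p_fermat)
```

Using `Fraction` keeps the result exact, so the output can be compared directly with the 3/4 and 1/4 found by hand.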

What are Probabilities?

A probability is a number between 0 and 1 which describes the likelihood of a certain event. A probability of 0 means that something is impossible; a probability of 1 means that something is certain.

For example, it is impossible that you will meet a real-life dragon, and it is certain that the sun will rise tomorrow. The probability of a coin landing heads is exactly 0.5.

The probability of rolling a 6 on a die, or picking a particular suit from a deck of cards, is less than 0.5 – which means unlikely. The probability of a good football team winning a match, or of a train arriving on time, is greater than 0.5 – which means likely.


Here are some more events. Try to put them into the correct order, from likely to unlikely:

You throw a die and it lands on 6.
Penguins live on the North Pole.
It’s going to rain in November.
A baby will be born in China today.
You buy a lottery ticket and win the jackpot.
A newborn baby will be a girl.

We often use probabilities and likelihoods in everyday life, usually without thinking about it. What is the chance of rain tomorrow? How likely is it that I will miss the bus? What is the probability I will win this game?

Tossing a (fair) coin has two possible outcomes, heads and tails, which are both equally likely. According to the equation above, the probability of a coin landing heads must be 1/2 = 0.5, or 50%.

Note that this probability is in between 0 and 1, even though only one of the outcomes can actually happen. But probabilities have very little to do with actual results: if we toss a coin many times we know that approximately half of the results will be heads – but we have no way of predicting exactly which tosses will land heads.

Even events with tiny probabilities (like winning the lottery) can still happen – and they do happen all the time (but to a very small proportion of the people who participate).

Probabilities also depend on how much each of us knows about the event. For example, you might estimate that the chance of rain today is about 70%, while a meteorologist with detailed weather data might say the chance of rain is 64.2%.

Or suppose that I toss a coin and cover it up with my hands – the probability of tails is 50%. Now I peek at the result, but don’t tell you. I know for certain what has happened, but for you the probability is still 50%.

There are many different ways to think about probabilities, but in practice they often give the same results:

classical probability

The classical probability of landing heads is the proportion of possible outcomes that are heads.

frequentist probability

The frequentist probability is the proportion of heads we get if we toss the coin many times.

subjectivist probability

The subjectivist probability tells us how strongly we believe that the coin will land heads.
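To see how the classical and frequentist views can agree in practice, here is a small Python sketch. It assumes a simulated fair coin from Python's `random` module; the seed and the number of tosses are arbitrary choices made only so the run is repeatable.

```python
import random
from fractions import Fraction

# Classical view: count the favourable outcomes among equally likely ones.
outcomes = ["heads", "tails"]
classical = Fraction(outcomes.count("heads"), len(outcomes))

# Frequentist view: toss a simulated fair coin many times and
# measure the proportion of heads.
rng = random.Random(2024)  # fixed seed (arbitrary) so the run is repeatable
tosses = 100_000
heads = sum(rng.choice(outcomes) == "heads" for _ in range(tosses))
frequentist = heads / tosses

print(classical)     # 1/2
print(frequentist)   # close to 0.5, but rarely exactly 0.5
```

The subjectivist view has no counterpart in this sketch, because it describes a person's degree of belief rather than something that can be counted or simulated.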

Remember that while probabilities are great for estimating and forecasting, we can never tell what actually will happen.

Now let’s have a look at some fun applications of probability.

Analysing Roulette

Soon after their initial discovery, mathematicians started applying the laws of probability to many different parts of life – including casino games.

One of these mathematicians was Karl Pearson, who analysed the results of roulette games published in the French newspaper Le Monaco.

Roulette consists of a wheel with the numbers from 1 to 36 coloured in red and black, as well as a green 0. A ball rolls around the outside and randomly lands on one of the numbers. Gamblers can bet on a single number, a set of multiple numbers, or just a colour. Their potential winnings depend on the likelihood of each of these outcomes.

Here is one of the many hundreds of newspaper extracts which Pearson collected and analysed. At first sight, it looks pretty random:

Roulette results on 19 August 1823, Table 5:


A roulette wheel has the same number of red and black numbers. If we ignore the green 0 (which means the casino wins) we would expect the number of red and black results to be approximately the same. Let’s check that this is indeed the case for the set of results above.


This looks pretty evenly distributed – there is a small difference between the number of red and black results, but that is always to be expected in probability.

However, Pearson didn’t stop here. He realised that if the results were completely random, then each of the four possible pairs of two consecutive colours should also be equally likely. Again we can count the number of occurrences in our example:


For some reason, it seems that RR and BB happen much less often than RB and BR, even though they should all have the same probability. Of course, we might have just been unlucky in this particular sequence of results – but Pearson tested many thousands of results and always found the same.

It gets even worse if we look at triples of results. Each of the 8 possible triples of colours should be equally likely, but that is clearly not the case here:


It seems that in this particular casino, the colours alternate much more often than one would expect. There are hardly any long sequences of the same colour (RRR or BBB).
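Pearson's counting of single colours, pairs and triples can be sketched in a few lines of Python. The sequence below is hypothetical, made up purely for illustration; it is not Pearson's data, and the helper name `count_runs` is an assumption of this sketch.

```python
from collections import Counter

def count_runs(colours: str, length: int) -> Counter:
    """Count every window of `length` consecutive colours in the sequence."""
    return Counter(
        colours[i:i + length] for i in range(len(colours) - length + 1)
    )

# Hypothetical sequence, made up for illustration (NOT Pearson's data).
sample = "RBRBBRBRRBRBRBBRBRBR"

singles = count_runs(sample, 1)
pairs = count_runs(sample, 2)
triples = count_runs(sample, 3)

print(singles)
print(pairs)
print(triples)

# For a truly random sequence, each of the 4 pairs should appear in
# roughly 1/4 of the windows, and each of the 8 triples in roughly 1/8.
expected_pairs = (len(sample) - 1) / 4
expected_triples = (len(sample) - 2) / 8
print(expected_pairs, expected_triples)
```

A sequence this short proves nothing by itself; the comparison only becomes meaningful with thousands of results, as in Pearson's analysis.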

Pearson calculated that the probability of seeing results which were this skewed was less than 1 in 100,000,000! He assumed that the roulette wheels were rigged to create higher profits for the casino – and wrote many angry letters to expose this scam.

When he finally travelled to Monte Carlo, he discovered that the reason for the skewed results was of a very different nature: the journalists who were supposed to be recording the results were instead just sitting in the bar of the casino, drinking, and making up random colours…

This story shows that we humans tend to be quite bad at coming up with random-looking data: we often underestimate unlikely events (long sequences of the same colour) and overestimate likely ones (alternating colours). This can be used effectively to detect fraud in banking and insurance.

You can try for yourself whether you are better than the journalists: write down a sequence of Rs and Bs, then count how often each pair and triple of consecutive colours appears.


Beat the Dealer

While Pearson only analysed previous Roulette results, others tried to use mathematics to increase their chances of winning in casinos. One of these was Edward Thorp, who invented card counting – a technique that allowed him to beat casinos at Blackjack.

He later turned his focus to roulette, believing that if you knew the exact position and speed of the ball in a roulette wheel, you should be able to use physics to approximately predict the outcome. After the dealer sets the roulette wheel spinning, there are just a few seconds when you are still allowed to place new bets. Unfortunately this time is much too short for humans to calculate the outcome in their head.

At the Massachusetts Institute of Technology, Thorp discussed his ideas with Claude Shannon, another mathematician and the father of information theory. Together they decided to build the first ever wearable computer, decades before the likes of Google Glass or Apple Watch.

The computer was roughly the size of a pack of cigarettes and strapped around their waist. A set of wires ran down to their shoe, which they tapped whenever the ball crossed a certain marker on the roulette wheel. That allowed the computer to calculate its speed, and predict where it would end up. Another set of wires led from the computer to an earpiece, which produced different tones based on different outcomes.

During the summer of 1961, Thorp and Shannon successfully tried their computer in Las Vegas. But while they made some money, the computer – which even contained parts of model airplanes – was not robust enough to be used at a larger scale.

Thorp wrote about their results in a scientific paper, and of course, computers were later forbidden in casinos. Thorp even got banned from all casinos in Las Vegas, but by then he had already moved on to yet more profitable ventures: using mathematics and computers on the stock market.

After this short trip through history, let’s get back to some actual mathematics…

Predicting the Future

If we roll two dice at once and add up their scores we could get results from 2 up to 12. However, not all outcomes are equally likely. Some results can only happen one way (to get 12 you have to roll 6 + 6) while others can happen in multiple different ways (to get 5 you could roll 1 + 4 or 2 + 3).

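This table shows the sum for every possible combination of the two dice (first die across, second die down):

    +   1   2   3   4   5   6
    1   2   3   4   5   6   7
    2   3   4   5   6   7   8
    3   4   5   6   7   8   9
    4   5   6   7   8   9  10
    5   6   7   8   9  10  11
    6   7   8   9  10  11  12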


The most likely result when rolling two dice is 7. There are 6 outcomes where the sum is 7, and 36 outcomes in total, so the probability of getting a 7 is 6/36 ≈ 0.1667.

The least likely outcomes are 2 and 12, each with a probability of 1/36 ≈ 0.0278.
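The probabilities for every sum can be worked out by listing all combinations and counting. Here is a minimal Python sketch of that enumeration, assuming two fair six-sided dice; the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 36 equally likely (first die, second die) outcomes.
rolls = list(product(range(1, 7), repeat=2))

# How many outcomes produce each sum?
ways = Counter(a + b for a, b in rolls)

# Probability = number of ways / total number of outcomes.
probability = {total: Fraction(n, len(rolls)) for total, n in ways.items()}

# Columns: sum, number of ways, exact probability, decimal value.
for total in sorted(probability):
    print(total, ways[total], probability[total], float(probability[total]))
```

Note that `Fraction` reduces automatically, so 6/36 is reported as 1/6.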

It is impossible to forecast the outcome of a single coin toss or die roll. However, using probability we can very accurately predict the outcome of many dice.

If we throw a die 30 times, we know that we would get around 1/6 × 30 = 5 sixes. If we roll it 300 times, there will be around 1/6 × 300 = 50 sixes. These predictions get more and more accurate as we roll the die more and more often.

In this animation you can roll many “virtual” dice at once and see how the results compare to the predicted probabilities:


We roll several dice at once and record the sum of their scores. The green lines represent the probabilities of every possible outcome predicted by probability theory and the blue bars show how often each outcome happened in this computer-generated experiment.

Notice how, as we roll more and more dice, the observed frequencies become closer and closer to the frequencies we predicted using probability theory. This principle applies to all probability experiments and is called the Law of large numbers.
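The Law of large numbers can be illustrated with a short simulation. The Python sketch below assumes a simulated fair die from the `random` module; the seed and the sample sizes are arbitrary choices, and the helper name `proportion_of_sixes` is made up for this example.

```python
import random

rng = random.Random(1)  # fixed seed (arbitrary) so the run is repeatable

def proportion_of_sixes(rolls: int) -> float:
    """Roll a simulated fair die `rolls` times; return the share of sixes."""
    sixes = sum(rng.randint(1, 6) == 6 for _ in range(rolls))
    return sixes / rolls

expected = 1 / 6
results = {n: proportion_of_sixes(n) for n in (30, 300, 30_000, 300_000)}

# Columns: number of rolls, observed share of sixes, distance from 1/6.
for n, observed in results.items():
    print(n, round(observed, 4), round(abs(observed - expected), 4))
```

The distance from 1/6 tends to shrink as the number of rolls grows, although in any single run it does not have to decrease at every step.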

Similarly, as we increase the number of dice rolled at once, you can also see that the probabilities change from a straight line (one die) to a triangle (two dice) and then to a “bell-shaped” curve. This is known as the Central Limit Theorem, and the bell-shaped curve is called the Normal Distribution.
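The change of shape from a straight line to a triangle to a bell can be seen in the exact probabilities. The following Python sketch computes them by adding one die at a time; the function name `sum_distribution` and the crude text plot are assumptions of this example, not something taken from the original animation.

```python
from fractions import Fraction

def sum_distribution(dice: int) -> dict:
    """Exact probabilities for the sum of `dice` fair six-sided dice."""
    dist = {0: Fraction(1)}  # with no dice, the sum is certainly 0
    for _ in range(dice):
        new_dist = {}
        for total, p in dist.items():
            # Each face of the next die is added with probability 1/6.
            for face in range(1, 7):
                new_dist[total + face] = (
                    new_dist.get(total + face, Fraction(0)) + p * Fraction(1, 6)
                )
        dist = new_dist
    return dist

for dice in (1, 2, 3):
    dist = sum_distribution(dice)
    print(f"{dice} dice")
    # Crude text plot: one '#' per 1/216 of probability.
    for total in sorted(dist):
        print(f"{total:2d} " + "#" * int(dist[total] * 216))
```

With one die every bar has the same length, with two dice the bars form a triangle, and with three dice the outline already starts to round off towards a bell shape.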

Monty Hall

Welcome to the most spectacular game show on the planet! You now have a once-in-a-lifetime chance of winning a fantastic sports car which is hidden behind one of these three doors. Unfortunately, there are only goats behind the other two doors. Select one to make your choice!

Are you sure about that? You can still change your mind and select a different door…

A great choice, but let me make life a little easier for you. I’ll open one of the other doors with a goat, so that there are only two doors left for you to pick from. Do you want to stick with your choice, or do you want to swap?

Ok – let’s see how you did…

Looks like you made the right choice. Congratulations, you just won a beautiful new sports car!

If you play this game many times, you’ll notice that you’re more likely to win if you swap after the first door is opened, rather than sticking with your initial choice.

But how can this be – surely the car is equally likely to be behind each of the two remaining doors?

The explanation is very subtle. When you pick the initial door, the probability of being correct is 1/3 and the probability of being wrong is 2/3.


After the game master opens one of the other doors, the probability of being wrong is still 2/3, except now all this probability is on just one door. This means that swapping doors doubles your chance of winning.


Even if this doesn’t seem very intuitive, we can prove that it is correct – simply by listing all different possibilities:

Out of the 9 possibilities, 6 need you to switch doors to win. This gives a chance of 6/9 = 2/3, like before.
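The 9 possibilities can be listed explicitly. This is a minimal Python sketch, assuming the car is equally likely to be behind each door, your first pick is independent of it, and the host always opens a goat door that you did not pick; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

doors = (1, 2, 3)

# The 9 equally likely combinations of (door hiding the car, door you pick).
cases = list(product(doors, repeat=2))

# Sticking wins only if the first pick was already the car.
stick_wins = [(car, pick) for car, pick in cases if car == pick]

# If the first pick was a goat, the host must open the other goat door,
# so the one remaining closed door hides the car: swapping wins.
swap_wins = [(car, pick) for car, pick in cases if car != pick]

p_stick = Fraction(len(stick_wins), len(cases))
p_swap = Fraction(len(swap_wins), len(cases))

for car, pick in cases:
    print(f"car behind {car}, you pick {pick}:",
          "stick wins" if car == pick else "swap wins")
print(p_stick, p_swap)
```

The enumeration does not need to track which door the host opens: once your first pick is known to be a goat, the host's choice is forced and swapping always leads to the car.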

True Randomness

Most of this chapter relied on the fact that things like coins, or dice, or roulette wheels are completely random. However, that is not really true – we already learned that Edward Thorp managed to predict the outcome of roulette.

Suppose we toss a coin: the chance of it landing heads is 0.5. If we knew which way the coin was facing just before it left the hand, we might be able to make a slightly better prediction, such as 0.58 or 0.41. If we also knew the weight and size of the coin, and the angle, position and speed as it left the hand, we could use the laws of physics – gravity, friction and air resistance – to model the motion of the coin and to predict the outcome. Finally, if we knew the exact position of every atom in the coin and of all the air molecules surrounding it, we could create a computer simulation to accurately predict what will happen.

One could argue that tossing a coin really isn’t random at all – it is chaotic. That means that the underlying physical principles are so complex that even tiny changes to the starting conditions (speed, angle) can have a dramatic effect on the final outcome. We can use coins in games and gambling not because they are random, but because it is so incredibly difficult (and for practical purposes impossible) to predict the result.

The same principle applies to many other “random” events in life, including dice and roulette wheels. They are not really random; we simply don’t have the tools to do the mathematical calculations accurately enough to predict the outcome.

But true randomness does exist – at the very foundations of matter. A block of radioactive material consists of millions of atoms which decay over time: they fall apart into smaller atoms while emitting dangerous radiation.

Physicists know the probability that a particular atom will decay in a certain period of time. In fact, for a large block of radioactive material, the overall rate of decay is so predictable that it is used to date ancient objects (radiocarbon dating). But even knowing the exact properties of every atom, it is impossible to work out which one will decay next – this is completely random.

Radioactive decay of atoms is caused by forces which act at much smaller scales within atoms, and which can be explained using quantum mechanics. During the last century, physicists like Max Planck and Paul Dirac discovered that fundamental particles have a mind-blowing property: they can be in multiple different places at the same time. They don’t have a fixed position, but instead a probability distribution (or wave function) which tells us how likely it is we are going to find them at a particular position.

This incredible property is used by quantum computers. Conventional computers work through their computations step by step. Quantum computers can use the properties of subatomic particles to explore many possibilities at the same time – and that can make them significantly faster for certain kinds of problems.

We can’t really understand or explain quantum mechanics – we just have to accept that it is what is predicted by mathematical theory and confirmed by physical observations. The curious quantum effects have only ever been observed on tiny scales of a few atoms, and it is not clear how they affect us in everyday life. But it is the only known effect in nature that produces true randomness.