## Introduction

Probabilities and likelihoods are everywhere around us, from weather forecasting to games, insurance or election polls. However, in the history of mathematics, probability is actually a very recent idea. While geometry and algebra were studied by ancient Greek mathematicians more than 2500 years ago, the concepts of probability only emerged in the 17th and 18th centuries.

According to legend, two of the greatest mathematicians, Blaise Pascal and Pierre de Fermat, would regularly meet to discuss mathematics.

To distract from the difficult mathematical theories they were discussing, they often played a simple game: they repeatedly tossed a coin – every *heads* was a point for Pascal and every *tails* was a point for Fermat. Whoever had fewer points after three coin tosses had to pay the bill.

One day, however, they get interrupted after the first coin toss and Fermat has to leave urgently. Later, they wonder who should pay the bill, or if there is a fair way to split it. The first coin landed *heads* (a point for Pascal), so maybe Fermat should pay everything. However, there is a small chance that Fermat could still have won if the next two tosses had both landed *tails*.

Pascal and Fermat decided to write down all possible ways the game could have continued:

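All four possible outcomes of the remaining two tosses (HH, HT, TH and TT) are equally likely, and Pascal wins in three of them. Fermat only wins if both tosses land *tails*.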

Pascal and Fermat had discovered the first important equation of probability: if an experiment has multiple possible outcomes which are all equally likely, then

$$\text{Probability of an event} = \frac{\text{Number of ways the event could happen}}{\text{Total number of possible outcomes}}.$$

In our example, the probability of Pascal winning the game is $\frac{3}{4} = 0.75$, and the probability of Fermat winning the game is $\frac{1}{4} = 0.25$.
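The counting argument above can be sketched in a few lines of Python. This is only an illustration: the function name `win_probabilities` and its parameters are made up for this example, and it assumes a fair coin and a game that ends after three tosses in total.

```python
from fractions import Fraction
from itertools import product

def win_probabilities(pascal_points=1, fermat_points=0, tosses_left=2):
    """Enumerate every equally likely continuation of the interrupted game."""
    outcomes = list(product("HT", repeat=tosses_left))  # HH, HT, TH, TT
    pascal_wins = 0
    for outcome in outcomes:
        pascal = pascal_points + outcome.count("H")  # heads: point for Pascal
        fermat = fermat_points + outcome.count("T")  # tails: point for Fermat
        if pascal > fermat:
            pascal_wins += 1
    p_pascal = Fraction(pascal_wins, len(outcomes))
    return p_pascal, 1 - p_pascal

p_pascal, p_fermat = win_probabilities()
print(p_pascal, p_fermat)  # 3/4 1/4
```

Using `Fraction` keeps the result exact, so it can be compared directly with the values in the text.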

## What are Probabilities

A **probability** is a number between 0 and 1 which describes the likelihood of a certain **event**. A probability of 0 means that something is *impossible*; a probability of 1 means that something is *certain*.

For example, the probability of a fair coin landing *heads* is exactly 0.5.

The probability of rolling a 6 on a die, or picking a particular suit from a deck of cards, is less than 0.5, which means these events are unlikely.

We often use probabilities and likelihoods in everyday life, usually without thinking about it. What is the chance of rain tomorrow? How likely is it that I will miss the bus? What is the probability I will win this game?

Tossing a (fair) coin has two possible outcomes, *heads* and *tails*, which are both equally likely. According to the equation above, the probability of a coin landing *heads* must be $\frac{1}{2} = 0.5$, or 50%.

Note that this probability is *in between* 0 and 1, even though only one of the outcomes can actually happen. But probabilities have very little to do with actual results: if we toss a coin many times, we know that roughly half of the tosses will land heads, but we have no way of predicting *exactly which* tosses will land heads.

Even events with tiny probabilities (like winning the lottery) *can still happen* – and they *do happen* all the time (but to a very small proportion of the people who participate).

Probabilities also depend on how much each of us knows about the event. For example, you might estimate that the chance of rain today is about 70%, while a meteorologist with detailed weather data might say the chance of rain is 64.2%.

Or suppose that I toss a coin and cover it up with my hands – the probability of tails is 50%. Now I peek at the result, but don't tell you. I know for certain what has happened, but for you the probability is still 50%.

There are many different ways to think about probabilities, but in practice they often give the same results:

The **classical** probability of landing heads is the proportion of *possible outcomes* that are heads.

The **frequentist** probability is the proportion of heads we get if we toss the coin *many times*.

The **subjectivist** probability tells us how strongly we *believe* that the coin will land heads.

Remember that while probabilities are great for *estimating and forecasting*, we can never tell what *actually* will happen.

Now let’s have a look at some fun applications of probability.

## Analysing Roulette

Soon after their initial discovery, mathematicians started applying the laws of probability to many different parts of life – including casino games.

One of these mathematicians was Karl Pearson, who analysed roulette results published in the newspaper *Le Monaco*.

Roulette consists of a wheel with the numbers from 1 to 36 coloured in **red** and **black**, as well as a green 0. A ball rolls around the outside and randomly lands on one of the numbers. Gamblers can bet on a single number, a set of multiple numbers, or just a colour. Their potential winnings depend on the likelihood of each of these outcomes.

Here is one of the many hundreds of newspaper extracts which Pearson collected and analysed. At first sight, it looks pretty random:

Roulette results on 19 August 1823, Table 5:

A roulette wheel has the same number of red and black numbers. If we ignore the green 0 (which means the casino wins) we would expect the number of red and black results to be approximately the same.

This looks pretty evenly distributed – there is a small difference between the number of red and black results, but that is always to be expected in probability.

However, Pearson didn’t stop here. He realised that if the results were completely random, then each of the four possible pairs of two consecutive colours should also be equally likely. Again we can count the number of occurrences in our example:

For some reason, it seems that **RR** and **BB** happen much less often than **RB** and **BR**, even though they should all have the same probability. Of course, we might have just been *unlucky* in this particular sequence of results – but Pearson tested many thousands of results and always found the same.

It gets even worse if we look at triples of results. Each of the 8 possible triples of colours should be equally likely, but that is clearly not the case here:

It seems that in this particular casino, the colours alternate much more often than one would expect. There are hardly any long sequences of the same colour (**RRR** or **BBB**).

Pearson calculated that the probability of seeing results which were this skewed was less than 1 in 100,000,000! He assumed that the roulette wheels were rigged to create higher profits for the casino – and wrote many angry letters to expose this scam.

When he finally travelled to Monte Carlo, he discovered that the reason for the skewed results was of a very different nature: the journalists who were supposed to be recording the results were instead just sitting in the bar of the casino, drinking, and making up random colours…

This story shows that we humans tend to be quite bad at coming up with random-looking data: we often underestimate unlikely events (long sequences of the same colour) and overestimate likely ones (alternating colours). This can be used effectively to detect fraud in banking and insurance.
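Pearson's check of consecutive pairs and triples can be sketched as a simple sliding-window count. The helper `count_runs` and the sequence `made_up` below are hypothetical illustrations, not Pearson's actual data or method: the sequence was invented to alternate more often than a truly random one typically would.

```python
from collections import Counter

def count_runs(sequence, length):
    """Count every window of `length` consecutive colours in the sequence."""
    windows = (sequence[i:i + length] for i in range(len(sequence) - length + 1))
    return Counter(windows)

# Made-up sequence of reds (R) and blacks (B), NOT real roulette results.
made_up = "RBRBBRBRRBRBRBBRBRBR"

pairs = count_runs(made_up, 2)
same = pairs["RR"] + pairs["BB"]          # colour repeats
alternating = pairs["RB"] + pairs["BR"]   # colour changes
print(same, alternating)                  # 3 16

triples = count_runs(made_up, 3)
print(triples["RRR"] + triples["BBB"])    # 0
```

For a fair wheel (ignoring the green 0), repeats and changes should occur about equally often, so a split as lopsided as 3 to 16 is a hint that the sequence was not produced by chance.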

## Beat the Dealer

While Pearson only analysed previous roulette results, others tried to use mathematics to increase their chances of winning in casinos. One of these was Edward Thorp, who developed *card counting* – a technique that allowed him to beat casinos at blackjack.

He later turned his focus to roulette, believing that if you knew the exact position and speed of the ball in a roulette wheel, you should be able to use physics to approximately predict the outcome. After the dealer sets the roulette wheel spinning, there are just a few seconds when you are still allowed to place new bets. Unfortunately, this time is much too short for humans to calculate the outcome in their heads.

At the Massachusetts Institute of Technology, Thorp discussed his ideas with Claude Shannon. Together they built the first *wearable computer*, decades before the likes of Google Glass or Apple Watch.

The computer was roughly the size of a pack of cigarettes and strapped around their waist. A set of wires ran down to their shoe, which they tapped whenever the ball crossed a certain marker on the roulette wheel. That allowed the computer to calculate its speed, and predict where it would end up. Another set of wires led from the computer to an earpiece, which produced different tones based on different outcomes.

During the summer of 1961, Thorp and Shannon successfully tried their computer in Las Vegas. But while they made some money, the computer – which even contained parts of model airplanes – was not robust enough to be used at a larger scale.

Thorp wrote about their results in a scientific paper, and of course, computers were later forbidden in casinos. Thorp even got banned from all casinos in Las Vegas, but by then he had already moved on to yet more profitable ventures: using mathematics and computers on the stock market.

After this short trip through history, let’s get back to some actual mathematics…

## Probability Trees

In real life, coins never have exactly a probability of 0.5. It might be 0.4932 or 0.500012, depending on their exact shape or physical properties. In mathematics we don’t have to worry about these tiny inaccuracies: we can simply assume that our “mathematical model” of a coin has exactly a 0.5 probability of landing heads and is truly random. With this simplification, we can start answering much more interesting questions.

More coming soon…

## Venn Diagrams

More coming soon…

## Predicting the Future

If we roll two dice at once and add up their scores, we could get results from 2 up to 12. However, not all of these results are equally likely.

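This table shows, for every possible sum, how many of the 36 equally likely outcomes produce it:

| Sum | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Number of ways | 1 | 2 | 3 | 4 | 5 | 6 | 5 | 4 | 3 | 2 | 1 |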

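The most likely result when rolling two dice is 7. There are 6 outcomes where the sum is 7, and 36 outcomes in total, so the probability of getting a 7 is $\frac{6}{36} = \frac{1}{6} \approx 0.1667$.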

The least likely outcomes are 2 and 12, each with a probability of $\frac{1}{36} \approx 0.0278$.
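These probabilities can be checked by listing all 36 pairs of scores. The following is a minimal sketch, assuming two fair six-sided dice; the variable names `ways` and `probabilities` are chosen just for this example.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Count how many of the 36 equally likely (die 1, die 2) pairs give each sum.
ways = Counter(a + b for a, b in product(range(1, 7), repeat=2))

# Classical probability: favourable outcomes divided by all outcomes.
probabilities = {total: Fraction(count, 36) for total, count in sorted(ways.items())}

for total, p in probabilities.items():
    print(total, p, round(float(p), 4))
```

The line printed for a sum of 7 shows `1/6`, and the lines for 2 and 12 show `1/36`, matching the values above.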

It is impossible to forecast the outcome of a single coin toss or die roll. However, using probability we can very accurately predict the outcome of *many* dice.

If we throw a die 30 times, we know that we would get around $\frac{1}{6} \times 30 = 5$ sixes. If we roll it 300 times, there will be around $\frac{1}{6} \times 300 = 50$ sixes. These predictions get more and more accurate as we repeat the experiment more and more often.

In this animation you can roll many “virtual” dice at once and see how the results compare to the predicted probabilities:

### Rolling Dice

We roll several dice at once and record the sum of their scores. The **green lines** represent the probabilities of every possible outcome predicted by probability theory, and the **blue bars** show how often each outcome happened in this computer-generated experiment.

Notice how, as we roll more and more dice, the observed frequencies become closer and closer to the frequencies we predicted using probability theory. This principle applies to all probability experiments and is called the **Law of large numbers**.
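A small simulation can illustrate this effect. The sketch below uses Python's pseudo-random number generator as a stand-in for a fair die. The function name `fraction_of_sixes` and the fixed seed are assumptions made only so that the example is reproducible.

```python
import random

def fraction_of_sixes(rolls, seed=0):
    """Roll a simulated fair die `rolls` times and return the share of sixes."""
    rng = random.Random(seed)  # fixed seed: the same results on every run
    sixes = sum(1 for _ in range(rolls) if rng.randint(1, 6) == 6)
    return sixes / rolls

# The predicted probability is 1/6, or roughly 0.1667.
for rolls in (30, 300, 3000, 30000, 300000):
    print(rolls, round(fraction_of_sixes(rolls), 4))
```

With only 30 rolls, the observed share of sixes can be quite far from 1/6. With hundreds of thousands of rolls it is usually very close.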

## Monty Hall

Welcome to the most spectacular game show on the planet! You now have a once-in-a-lifetime chance of winning a fantastic sports car which is hidden behind one of these three doors. Unfortunately, there are only goats behind the other two doors. Simply tap on one to make your choice!

Are you sure about that? You can still change your mind by tapping a different door…

A great choice, but let me make life a little easier for you. I'll open one of the other doors with a goat, so that there are only two doors left for you to pick from. Do you want to stick with your choice, or do you want to swap?

Ok – let's see how you did…

Looks like swapping doors was a good choice. Congratulations, you just won a beautiful new sports car!

Sorry – it seems like this time you only won a goat. But don't worry, you can play again!

If you play this game many times, you’ll notice that you’re more likely to win if you swap doors than if you stick with your initial choice.

But how can this be – surely the car is equally likely to be behind each of the two remaining doors?

The explanation is very subtle. When you pick the initial door, the probability of being correct is $\frac{1}{3}$ and the probability of being wrong is $\frac{2}{3}$.

After the game master opens one of the other doors, the probability of being wrong is *still* $\frac{2}{3}$, except now all this probability is on just one door. This means that swapping doors doubles your chance of winning, from $\frac{1}{3}$ to $\frac{2}{3}$.

Even if this doesn't seem very intuitive, we can prove that it is correct – simply by listing all different possibilities:

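Out of the 9 possibilities (three positions for the car, combined with three initial choices), swapping doors wins in 6. This again gives a probability of $\frac{6}{9} = \frac{2}{3}$.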
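The same list of possibilities can be generated in code. This is a minimal sketch which assumes that the car is equally likely to be behind each door and that the host always opens a goat door you did not pick. The variable names are chosen just for this example.

```python
from fractions import Fraction
from itertools import product

doors = (1, 2, 3)
stick_wins = 0
swap_wins = 0

# 9 equally likely combinations of (door hiding the car, door picked first).
for car, first_pick in product(doors, repeat=2):
    if first_pick == car:
        # The host opens either goat door, so swapping moves you to a goat.
        stick_wins += 1
    else:
        # The host must open the only other goat door, so swapping finds the car.
        swap_wins += 1

p_stick = Fraction(stick_wins, 9)
p_swap = Fraction(swap_wins, 9)
print(p_stick, p_swap)  # 1/3 2/3
```

Sticking only wins in the 3 cases where the first pick was already correct, and swapping wins in the other 6.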

## True Randomness

Most of this chapter relied on the fact that things like coins, or dice, or roulette wheels are completely random. However, that is not really true – we already learned that Edward Thorp managed to predict the outcome of roulette.

Suppose we toss a coin: the chance of it landing heads is 0.5. If we knew which way the coin was facing just before it left the hand, we might be able to make a slightly better prediction, such as 0.58 or 0.41. If we also knew the weight and size of the coin, and the angle, position and speed as it left the hand, we could use the laws of physics – gravity, friction and air resistance – to model the motion of the coin and to predict the outcome. Finally, if we knew the exact position of every atom in the coin and of all the air molecules surrounding it, we could create a computer simulation to accurately predict what will happen.

One could argue that tossing a coin really isn’t random at all – it is *chaotic*. That means that the underlying physical principles are so complex that even tiny changes to the starting conditions (speed, angle) can have a dramatic effect on the final outcome. We can use coins in games and gambling not because they are random, but because it is so incredibly difficult (and for practical purposes impossible) to predict the result.

The same principle applies to many other “random” events in life, including dice and roulette wheels. They are not really *random*: we simply don’t have the tools to do the mathematical calculations accurately enough to predict the outcome.

But *true randomness* does exist – at the very foundations of matter. A block of radioactive material consists of millions of atoms which decay over time: they fall apart into smaller atoms while emitting dangerous radiation.

Physicists know the probability that a particular atom will decay in a certain period of time. In fact, for a large block of radioactive material, the overall rate of decay is so predictable that it is used to date rocks and fossils (radiometric dating). But even knowing the exact properties of every atom, it is impossible to work out *which one* will decay next – this is completely random.

According to quantum mechanics, subatomic particles can be in multiple places *at the same time*. They don't have a fixed position, but instead a probability distribution (or wave function) which tells us how likely it is we are going to find them at a particular position.

This incredible property is used by quantum computers. Conventional computers can only ever do one computation at a time. Quantum computers can use the properties of subatomic particles to do many calculations at the same time – and for certain problems that can make them significantly faster.

We can’t really *understand* or *explain* quantum mechanics – we just have to accept that it is what is predicted by mathematical theory and confirmed by physical observations. The curious quantum effects have only ever been observed on tiny scales of a few atoms, and it is not clear how they affect us in everyday life. But it is the only known effect in nature that produces *true randomness*.