I was listening to a podcast recently about “Thinking Clearly” when the presenter gave a very rapid description of a common mistake that occurs when human intuition meets statistical information.

This particular bit of statistical analysis can be hard to grasp when you meet its effects in real life, so I thought I’d draw a picture that might help people understand the mechanism of putting a “threat percentage” into its correct context.

Imagine there is a *“nasty medical condition X”* that affects one in 20 of all men over the age of 55. Now imagine that there is a diagnostic test for the condition that is 90% accurate (meaning it will return the correct result 90 times out of a hundred and the wrong result 10 times out of 100).

You are male, just past your 55th birthday, and your doctor tells you that you’ve tested positive. How terrified should you be or, to put it another way, which is more likely (and by how much): you’ve got X, or you haven’t got X?

The human, intuitive, response is simple: you’ve been told the test is 90% accurate and you’ve tested positive; so your digestive system is probably telling you that it’s almost certain you’ve got X.

The statistician’s approach (after the initial reflexive shock, perhaps) is to apply **Bayesian thinking**, which can be turned into pictures as follows:

- Step 1: What does *“one in 20”* look like? (left-hand square)

- Step 2: What does *“90% accurate”* look like? (right-hand square)

- Step 3: What does the picture look like when we superimpose the individual (independent) probabilities:

The big white block toward the top left tells us about the group of people who are correctly informed that they are X-free; the tiny red block in the bottom right tells us about the really unlucky ones who are told they’re X-free when they actually have X *(false negatives)*.

Now compare the two pink blocks: the vertical rectangle to the right is the group who have X and test positive; the horizontal rectangle along the bottom is the group who **don’t** have X but test positive anyway *(false positives)*.

The visual impression from this image is that if you’ve been told you tested positive, it’s nearly twice as likely that you are X-free as that you have X: but let’s put the numbers into the picture to get a better handle on this. I’ll use a population of 10,000 (which, conveniently, I can represent as a square measuring 100 by 100):

### In a population of 10,000

- X-free = 95 * (90 + 10) = 9,500 (95%)
- Got X = 5 * (90 + 10) = 500 (5%)

- Correct result given = 90 * (95 + 5) = 9,000 (90%)
- Wrong result given = 10 * (95 + 5) = 1,000 (10%)

- X-free and received right result = 8,550 … (95 * 90, top left)
- Got X and received wrong result = 50 … (5 * 10, bottom right)

- Got X and received right result = 450 … (5 * 90, top right)
- X-free and received wrong result = 950 … (95 * 10, bottom left)

Given the underlying population data (**“priors”**) for this example, we see that a positive result from a test that’s 90% accurate means there’s a probability of 450 / (950 + 450) = 0.32 (32%) that you’ve got X.
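The counting argument above can be checked with a few lines of code. This is just a sketch of the worked example: the population size, prevalence, and accuracy figures come straight from the article, and the variable names are my own.

```python
# Hypothetical population from the article: 10,000 men over 55,
# a 1-in-20 prevalence of X, and a test that is 90% accurate.
population = 10_000
prevalence = 1 / 20      # P(has X)
accuracy = 0.90          # P(test gives the correct result)

got_x = population * prevalence             # 500 have X
x_free = population - got_x                 # 9,500 are X-free

true_positives = got_x * accuracy           # 450  (got X, right result)
false_negatives = got_x * (1 - accuracy)    # 50   (got X, wrong result)
true_negatives = x_free * accuracy          # 8,550 (X-free, right result)
false_positives = x_free * (1 - accuracy)   # 950  (X-free, wrong result)

# P(X | positive test) = true positives / all positives
p_x_given_positive = true_positives / (true_positives + false_positives)
print(round(p_x_given_positive, 2))  # 0.32
```

The key line is the last one: the denominator counts *everyone* who tests positive, and the X-free false positives (950) outnumber the true positives (450) because the X-free group is so much larger to begin with.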

### Footnote

The result of this very simple hypothetical case is not intuitively obvious to most people; but if you thought it was easy to get to the right answer you might want to look at the **Monty Hall** problem, which also leads to the *Three Prisoners* and *Bertrand’s Boxes* problems.