Bayes' Rule for Ducks

Sunday February 23, 2014

You look at a thing.

Is it a duck?

Re-phrase: What is the probability that it's a duck, if it looks like that?

Bayes' rule says that the probability of it being a duck, if it looks like that, is the same as the probability of any old thing being a duck, times the probability of a duck looking like that, divided by the probability of a thing looking like that.

\[ Pr(duck | looks) = \frac{Pr(duck) \cdot Pr(looks | duck)}{Pr(looks)} \]

This makes sense:

If ducks are mythical beasts, then \( Pr(duck) \) (our "prior" on ducks) is very low, and the thing would have to be very duck-like before we'd believe it's a duck. On the other hand, if we're at some sort of duck farm, then \( Pr(duck) \) is high and anything that looks even a little like a duck is probably a duck.
If it's very likely that a duck would look like that (\( Pr(looks|duck) \) is high) then we're more likely to think it's a duck. This is the "likelihood" of a duck looking like that thing. In practice it's based on how the ducks we've seen before have looked.
The denominator \( Pr(looks) \) normalizes things. After all, we're in some sense portioning out the probabilities of this thing being whatever it could be. If 1% of things look like this, and 1% of things look like this and are ducks, then 100% of things that look like this are ducks. So \( Pr(looks) \) is what we're working with; it's the denominator.

Here's an example of a strange world to test this in:

There are ten things. Six of them are ducks. Five of them look like ducks. Four of them both look like ducks and are ducks. One thing looks like a duck but is not a duck. Maybe it's a fake duck? Two ducks do not look like ducks. Ducks in camouflage. Test the equality of the two sides of Bayes' rule:

\[ Pr(duck | looks) = \frac{Pr(duck) \cdot Pr(looks | duck)}{Pr(looks)} \]

\[ \frac{4}{5} = \frac{\frac{6}{10} \cdot \frac{4}{6}}{\frac{5}{10}} \]

It's true here, and it's not hard to show that it must be true, using two ways of expressing the probability of being a duck and looking like a duck. We have both of these:

\[ Pr(duck \cap looks) = Pr(duck|looks) \cdot Pr(looks) \]

\[ \displaystyle Pr(duck \cap looks) = Pr(looks|duck) \cdot Pr(duck) \]

Check those with the example as well, if you like. Using the equality, we get:

\[ Pr(duck|looks) \cdot Pr(looks) = Pr(looks|duck) \cdot Pr(duck) \]

Then dividing by \( Pr(looks) \) we have Bayes' rule, as above.

\[ Pr(duck | looks) = \frac{Pr(duck) \cdot Pr(looks | duck)}{Pr(looks)} \]

This is not a difficult proof at all, but for many people the result feels very unintuitive. I've tried to explain it once before in the context of statistical claims. Of course there's a wikipedia page and many other resources. I wanted to try to do it with a unifying simple example that makes the equations easy to parse, and this is what I've come up with.

This post was originally hosted elsewhere.