
Bayes theorem, the geometry of changing beliefs

Below is a short summary and detailed review of this video written by FutureFactual:

Bayes Theorem Demystified: How to Update Beliefs with Evidence

Overview

This video distills one of the central formulas of probability theory, Bayes Theorem, illustrating how new evidence should update, not replace, prior beliefs. Through a relatable librarian-versus-farmer example and the famous Linda problem, it shows how rational updating works and why representativeness matters when applying probability to real life.

Key insights

  • Bayes Theorem ties together prior beliefs, the likelihood of evidence under each hypothesis, and the updated posterior probability.
  • A simple representative sample helps translate abstract probabilities into intuition about how evidence shifts beliefs.
  • Geometric intuition using areas instead of counts offers a flexible, on-the-fly way to sketch Bayes reasoning.
  • Cognitive biases can distort probability judgments; explicit ratios and careful sampling reduce error.

Overview and Purpose

The video introduces Bayes Theorem as a fundamental instrument for scientific reasoning and machine learning. It frames probability not as a mystical sense of uncertainty, but as a precise language for updating beliefs in light of new evidence. The central message is that new data should update beliefs rather than replace them outright, and that Bayes Theorem provides a disciplined way to do that.

The Steve Example and a Preview of Bayes Tools

The presenter begins with a narrative device to ground the mathematics. Steve is described with personality traits that are vaguely reminiscent of a librarian. Viewers are asked to weigh whether Steve is more likely to be a librarian or a farmer, given a description that seems to align with librarian stereotypes. This setup is designed to lead to a Bayesian calculation where prior information matters. The ratio of farmers to librarians in the population is introduced to set the prior probability of the librarian hypothesis. The narrator cautions that the exact ratio is not the point; rather, the key idea is that the prior matters and must be adjusted by evidence.

The speaker then walks through the mechanism of updating a belief with Bayes Theorem using a specific numerical illustration. A representative sample, such as 200 farmers and 10 librarians, is used to illustrate the learning process. The numbers yield a posterior probability after observing the description of Steve. The numeric illustration shows how, even when a librarian is four times as likely as a farmer to match the description, the larger pool of farmers can dominate the update. This is the heart of Bayes: new evidence restricts the space of possibilities, and the update is governed by the relative likelihoods and the prior ratio.
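The count-based update can be sketched in a few lines. The fit rates below (40% of librarians and 10% of farmers matching the description) are the illustrative figures the video uses, not measured data:

```python
# Representative sample from the Steve example.
librarians = 10
farmers = 200

# Assumed fit rates: 40% of librarians and 10% of farmers match the description.
librarians_fitting = 0.4 * librarians   # 4 librarians
farmers_fitting = 0.1 * farmers         # 20 farmers

# Posterior: among everyone fitting the description, the fraction who are librarians.
posterior = librarians_fitting / (librarians_fitting + farmers_fitting)
print(round(posterior, 3))  # 4 / 24 ≈ 0.167
```

Even with the description favoring librarians four to one, the posterior stays low because farmers outnumber librarians twenty to one in the sample.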

The Formalization: Prior, Likelihood and Posterior

The video then generalizes from the concrete example to the formal rule. It defines the prior as the probability of the hypothesis before examining any new data. The likelihood is the probability of observing the evidence given that the hypothesis is true. The denominator of Bayes Theorem also needs the probability of the evidence under the alternative, P(E|not H), weighted by the prior probability of not H. The posterior is the resulting probability that the hypothesis holds after observing the evidence. The full rule is P(H|E) = P(H)P(E|H) / (P(H)P(E|H) + P(not H)P(E|not H)): the numerator is proportional to the prior times the likelihood, and the denominator sums over both the true and false cases of the hypothesis. The final posterior is the fraction of all evidence cases that support the hypothesis.
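The formal rule translates directly into a small function. This is a generic sketch, with the Steve numbers (prior 10/210, likelihoods 0.4 and 0.1) plugged in to check that it reproduces the count-based result:

```python
def bayes_posterior(prior, likelihood, likelihood_alt):
    """P(H|E) from the prior P(H), the likelihood P(E|H),
    and the likelihood under the alternative P(E|not H)."""
    numerator = prior * likelihood
    denominator = numerator + (1 - prior) * likelihood_alt
    return numerator / denominator

# Steve example: prior = 10/210, P(E|H) = 0.4, P(E|not H) = 0.1
print(round(bayes_posterior(10 / 210, 0.4, 0.1), 4))  # ≈ 0.1667
```

The same 1-in-6 posterior falls out, showing that the formula is just the sample count argument written in general terms.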

Geometric and Proportional Thinking

Beyond the algebra, the video emphasizes geometric intuition. A square of all possibilities is used as a visual metaphor where each event occupies a region. The area of the hypothesis region that also contains the evidence yields the numerator of Bayes Theorem, while the total area restricted to the evidence yields the denominator. The geometric interpretation is presented as a helpful way to understand how the numerator grows as the hypothesis becomes more consistent with the evidence, while the denominator grows or shrinks depending on how informative the evidence is relative to the alternative.
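The area picture can be turned into arithmetic: the prior sets the widths of the two regions of the square, the likelihoods set the heights of the evidence strips, and the posterior is a ratio of areas. A minimal sketch using the same Steve figures as above:

```python
# Widths: the prior splits the unit square between H and not-H.
width_h = 10 / 210        # P(H): proportion of librarians
width_not_h = 200 / 210   # P(not H): proportion of farmers

# Heights: how much of each region the evidence strip covers.
height_h = 0.4            # P(E|H)
height_not_h = 0.1        # P(E|not H)

area_h = width_h * height_h                        # numerator of Bayes Theorem
area_total = area_h + width_not_h * height_not_h   # total evidence area

posterior = area_h / area_total
print(round(posterior, 4))  # ≈ 0.1667
```

Nothing new is computed here; the point is that "area of the evidence strip over H, divided by total evidence area" is the same quantity as the formula, which is why the diagram can be sketched on the fly in place of the algebra.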

Representative Sampling and Cognitive Biases

The narrative then revisits the broader context. Kahneman and Tversky are cited to illustrate how people often neglect base rates when confronted with descriptive details. The famous Linda problem is used to show that people intuitively overestimate the likelihood of a conjunction because of the representativeness heuristic. The speaker distinguishes the debate over human irrationality from the practical takeaway: the correct update still depends on the ratio of base rates, so you should think about both the description and the prevalence. The example of 100 people with a Steve-like description is used to show how a rough, concrete framing helps align intuition with statistical principles. A key point is that context matters for the priors and even the reliability of the likelihoods, which is why a diagram or a carefully chosen sample helps anchor reasoning in reality.

Practical Guidance and Takeaways

The video closes by encouraging a process-oriented approach to probability. It suggests not memorizing the formula by rote but instead drawing the diagram as needed to guide thinking. A probabilistic view of thought itself is proposed, where belief revision is a function of both evidence quality and prior knowledge. This perspective reframes how scientists, engineers, and researchers think about updating models in the face of new data, a core practice in Bayesian statistics, machine learning, and empirical inquiry.

To find out more about the video and 3Blue1Brown go to: Bayes theorem, the geometry of changing beliefs.