Probability and Statistics, Topic 13 of 36

Bayes' Rule

Bayes' Rule is one of the most important results in probability theory and statistics. It tells you how to update the probability of an event in light of new evidence. The rule is named after the Reverend Thomas Bayes, an 18th-century English minister and statistician whose essay on the problem was published posthumously in 1763.

Bayes' Rule lets us calculate posterior probabilities. In simple terms, it combines our initial belief about an event (the prior probability) with how well that event explains new data or evidence (the likelihood) to give a revised belief (the posterior probability).

Mathematical Formulation

Bayes' Rule provides a way to compute the conditional probability $P(A \mid B)$ of event $A$ given that event $B$ has occurred. It is written as:

$$P(A \mid B) = \frac{P(B \mid A) P(A)}{P(B)}$$

Where:

- $P(A \mid B)$ is the posterior probability: the probability of event $A$ occurring given that $B$ has occurred.
- $P(B \mid A)$ is the likelihood: the probability of observing event $B$ given that event $A$ has occurred.
- $P(A)$ is the prior probability: the initial probability of event $A$ occurring before any new evidence is observed.
- $P(B)$ is the marginal likelihood or evidence: the total probability of event $B$ occurring, which can be computed as $P(B) = P(A) \cdot P(B \mid A) + P(\neg A) \cdot P(B \mid \neg A)$, where $\neg A$ denotes the complement of event $A$ (i.e., the event where $A$ does not happen).
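The formula follows directly from the product rule for conditional probability. As a short sketch of that step, assuming $P(B) > 0$, writing the joint probability $P(A \cap B)$ in two ways and equating them gives:

$$
P(A \cap B) = P(A \mid B)\,P(B) = P(B \mid A)\,P(A)
\quad\Longrightarrow\quad
P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}
$$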

Interpretation

Bayes' Rule provides a way to update your probability estimate for event $A$ based on new evidence $B$. Here's a breakdown of each component:

- Prior probability ($P(A)$) represents your initial belief or knowledge about event $A$ before any new information is taken into account.
- Likelihood ($P(B \mid A)$) tells you how likely the evidence $B$ is, assuming that $A$ is true.
- Posterior probability ($P(A \mid B)$) is your updated belief about the probability of event $A$, given the evidence $B$.
- Marginal likelihood ($P(B)$) is the normalizing constant: dividing by it ensures that $P(A \mid B)$ and $P(\neg A \mid B)$ sum to 1. It represents the total probability of observing $B$, taking into account all possible scenarios.
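The four components above map naturally onto a small function. The following is a minimal Python sketch, not a reference implementation; the function name `bayes_posterior` and its parameter names are illustrative choices, and the numbers in the usage line are made up for demonstration.

```python
def bayes_posterior(prior, likelihood, likelihood_given_not_a):
    """Return P(A | B) for a binary event A and evidence B.

    prior                  -- P(A)
    likelihood             -- P(B | A)
    likelihood_given_not_a -- P(B | not A)
    """
    inputs = {
        "prior": prior,
        "likelihood": likelihood,
        "likelihood_given_not_a": likelihood_given_not_a,
    }
    for name, p in inputs.items():
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {p}")

    # Marginal likelihood via the law of total probability:
    # P(B) = P(B | A) P(A) + P(B | not A) P(not A)
    evidence = likelihood * prior + likelihood_given_not_a * (1.0 - prior)
    if evidence == 0.0:
        raise ValueError("P(B) is zero, so P(A | B) is undefined")

    return likelihood * prior / evidence


# Illustrative numbers only: P(A) = 0.5, P(B | A) = 0.8, P(B | not A) = 0.2
print(round(bayes_posterior(0.5, 0.8, 0.2), 3))
```

With these illustrative inputs the evidence is $0.8 \times 0.5 + 0.2 \times 0.5 = 0.5$, so the posterior is $0.4 / 0.5 = 0.8$.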

Example: Medical Diagnosis

One of the most common applications of Bayes' Rule is in medical testing or diagnostic problems, where we want to determine the probability of a patient having a certain disease given the results of a diagnostic test.

Let's assume we have the following information:

- Event $A$: A person has the disease.
- Event $B$: The test result is positive (i.e., the test indicates the person has the disease).

We want to calculate the probability that the person actually has the disease given a positive test result, i.e., $P(A \mid B)$.

We are given:

- Prior probability ($P(A)$): The probability that the person has the disease before the test result is known. For example, let's say the disease affects 1% of the population: $P(A) = 0.01$
- Likelihood ($P(B \mid A)$): The probability of a positive test result given that the person actually has the disease, also called the test's sensitivity. For example, the test detects 95% of true cases: $P(B \mid A) = 0.95$
- False positive rate ($P(B \mid \neg A)$): The probability of a positive test result given that the person does not have the disease. For example, the test incorrectly flags 5% of healthy people as diseased: $P(B \mid \neg A) = 0.05$
- Prior probability of not having the disease ($P(\neg A)$): Since 1% of the population has the disease, 99% do not: $P(\neg A) = 0.99$

To calculate $P(A \mid B)$, the probability of having the disease given a positive test result, we use Bayes' Rule:

$$P(A \mid B) = \frac{P(B \mid A) P(A)}{P(B)}$$

First, we need to calculate $P(B)$, the total probability of a positive test result. This is done using the law of total probability:

$$P(B) = P(B \mid A) P(A) + P(B \mid \neg A) P(\neg A)$$

Substituting the known values:

$$P(B) = (0.95 \times 0.01) + (0.05 \times 0.99) = 0.0095 + 0.0495 = 0.059$$

Now, applying Bayes' Rule:

$$P(A \mid B) = \frac{0.95 \times 0.01}{0.059} = \frac{0.0095}{0.059} \approx 0.161$$

Thus, the probability that the person actually has the disease, given that they tested positive, is approximately 16.1%.
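One way to sanity-check this figure is to count people instead of multiplying probabilities. The sketch below assumes a hypothetical population of 10,000 (the population size is an illustrative choice, not part of the problem) and uses the same prevalence, sensitivity, and false positive rate as above.

```python
# Hypothetical population size, chosen only to make the counts easy to read.
population = 10_000

prevalence = 0.01           # P(A)
sensitivity = 0.95          # P(B | A)
false_positive_rate = 0.05  # P(B | not A)

diseased = population * prevalence               # people who have the disease
healthy = population - diseased                  # people who do not

true_positives = diseased * sensitivity          # diseased and test positive
false_positives = healthy * false_positive_rate  # healthy but test positive

# Among everyone who tests positive, what fraction is actually diseased?
posterior = true_positives / (true_positives + false_positives)
print(f"{posterior:.3f}")  # prints 0.161
```

Of the roughly 590 people who test positive, only about 95 have the disease, which matches the 16.1% obtained from the formula.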

Why is This Result Surprising?

At first glance, one might expect a positive test result to mean the person almost certainly has the disease. However, despite the test's 95% sensitivity, the posterior probability is low because the prior is low (only 1% of the population has the disease) while the 5% false positive rate applies to the much larger healthy group. In the calculation above, false positives contribute 0.0495 to $P(B)$ and true positives only 0.0095, so a positive result is about five times more likely to come from a healthy person than from a diseased one.
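To see how strongly the base rate drives the result, the sketch below holds the test characteristics fixed and varies only the prior. The list of priors is an illustrative choice; only the 1% case comes from the example.

```python
sensitivity = 0.95          # P(B | A), as in the example
false_positive_rate = 0.05  # P(B | not A), as in the example


def posterior_given_positive(prior):
    """P(A | B) for the fixed test above and a given prior P(A)."""
    evidence = sensitivity * prior + false_positive_rate * (1.0 - prior)
    return sensitivity * prior / evidence


# Illustrative priors: from a very rare disease up to a 50/50 prior.
for prior in (0.001, 0.01, 0.1, 0.5):
    print(f"prior = {prior:<5}  posterior = {posterior_given_positive(prior):.3f}")
```

The posteriors come out at roughly 0.019, 0.161, 0.679, and 0.950: the same test is weak evidence when the disease is rare and strong evidence when it is common.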

Generalizing Bayes’ Rule: Multiple Hypotheses

Bayes' Rule can also be extended to more than two competing hypotheses. Suppose you have multiple hypotheses $H_1, H_2, \dots, H_n$, and you want to calculate the probability of each hypothesis $H_i$ given the evidence $B$. Bayes' Rule in this case becomes:

$$P(H_i \mid B) = \frac{P(B \mid H_i) P(H_i)}{\sum_{j=1}^{n} P(B \mid H_j) P(H_j)}$$

Where:

- $P(H_i \mid B)$ is the posterior probability of hypothesis $H_i$ given the evidence $B$.
- $P(B \mid H_i)$ is the likelihood of observing $B$ given hypothesis $H_i$.
- $P(H_i)$ is the prior probability of hypothesis $H_i$.
- The denominator is $P(B)$, computed by the law of total probability over all hypotheses $H_1, H_2, \dots, H_n$. It normalizes the posteriors so that they sum to 1, which requires the hypotheses to be mutually exclusive and exhaustive.
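The multi-hypothesis form can be sketched in a few lines of Python. The function name `posteriors` and the three-machine scenario are hypothetical and used only for illustration: assume three machines produce 50%, 30%, and 20% of a factory's items, with defect rates of 1%, 2%, and 3%, and that an item is found to be defective.

```python
def posteriors(priors, likelihoods):
    """Return [P(H_i | B)] given [P(H_i)] and [P(B | H_i)].

    Assumes the hypotheses are mutually exclusive and exhaustive,
    i.e. the priors sum to 1.
    """
    if len(priors) != len(likelihoods):
        raise ValueError("priors and likelihoods must have the same length")

    # Numerators of Bayes' Rule: P(B | H_i) * P(H_i)
    joint = [p * lik for p, lik in zip(priors, likelihoods)]

    # Denominator: P(B) as the sum over all hypotheses
    evidence = sum(joint)
    if evidence == 0.0:
        raise ValueError("P(B) is zero, so the posteriors are undefined")

    return [j / evidence for j in joint]


# Hypothetical scenario: share of production and defect rate per machine.
machine_priors = [0.5, 0.3, 0.2]
defect_rates = [0.01, 0.02, 0.03]

result = posteriors(machine_priors, defect_rates)
print([round(p, 3) for p in result])
```

In this made-up scenario the first machine has the largest prior, yet after observing a defect it is the least likely source (about 0.294 versus about 0.353 for each of the other two), because its defect rate is the lowest.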

Summary of Key Concepts

- Bayes' Rule allows us to update our belief about an event based on new evidence; the updated belief is the posterior probability.
- The formula is $P(A \mid B) = \frac{P(B \mid A) P(A)}{P(B)}$, where:
  - $P(A \mid B)$ is the posterior probability.
  - $P(B \mid A)$ is the likelihood.
  - $P(A)$ is the prior probability.
  - $P(B)$ is the marginal likelihood.
- Medical diagnostics example: Bayes' Rule can be used to calculate the probability of a person having a disease given a positive test result.
- Law of total probability: used to calculate $P(B)$, the total probability of evidence $B$, by considering all possible causes.
- Counterintuitive results: even with high test accuracy, the probability of having a disease given a positive result may remain low if the prior probability is low.

Bayes' Rule is widely used in fields such as statistics, machine learning, data science, and medical diagnostics. It provides a powerful framework for making decisions and updating beliefs based on new data.
