A sampling distribution is the probability distribution of a given statistic (such as the sample mean, sample variance, or sample proportion) that is calculated from a random sample. Instead of focusing on the values of a single sample, a sampling distribution describes the behavior of a statistic across all possible samples of a particular size drawn from a population.
Sampling distributions play a critical role in inferential statistics, as they allow us to draw conclusions about a population based on sample data. They are fundamental to understanding how sample statistics estimate population parameters, and they provide the foundation for hypothesis testing and confidence intervals.
A sampling distribution describes how a statistic (like the sample mean) behaves across many samples drawn from the same population.
One of the most commonly studied sampling distributions is the sampling distribution of the sample mean. It describes the distribution of the means of all possible random samples of a given size taken from a population.
Mean of the Sampling Distribution of the Sample Mean: The mean of the sample means is equal to the population mean:
$$\mu_{\bar{X}} = \mu$$
This property is known as unbiasedness: the sample mean is an unbiased estimator of the population mean.
Standard Deviation of the Sampling Distribution (Standard Error): The standard deviation of the sample mean is called the standard error (SE). It is smaller than the population standard deviation because averages tend to be less variable than individual data points:
$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$$
Where:
$\sigma$ is the population standard deviation
$n$ is the sample size
As the sample size increases, the standard error decreases, meaning that sample means are more tightly clustered around the population mean.
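The two properties above can be checked with a short simulation. This is a minimal sketch, not a definitive procedure: the normal population, the values mu = 50, sigma = 10, n = 25, and the helper name simulate_sample_means are all illustrative assumptions.

```python
import math
import random
import statistics


def simulate_sample_means(population_mean, population_sd, n, num_samples, seed=0):
    """Draw num_samples samples of size n from a normal population
    (an assumed population shape) and return the mean of each sample."""
    rng = random.Random(seed)
    return [
        statistics.fmean([rng.gauss(population_mean, population_sd) for _ in range(n)])
        for _ in range(num_samples)
    ]


# Hypothetical population parameters and sample size.
mu, sigma, n = 50.0, 10.0, 25
means = simulate_sample_means(mu, sigma, n, num_samples=5000)

empirical_mean = statistics.fmean(means)   # should be close to mu
empirical_se = statistics.stdev(means)     # should be close to sigma / sqrt(n)
theoretical_se = sigma / math.sqrt(n)      # 10 / 5 = 2.0

print(f"mean of sample means: {empirical_mean:.3f} (population mean {mu})")
print(f"SD of sample means:   {empirical_se:.3f} (theoretical SE {theoretical_se})")
```

Rerunning with a larger n (for example 100) should shrink the spread of the simulated means, in line with the formula for the standard error.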
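Shape of the Sampling Distribution of the Sample Mean: If the population is normally distributed, the sampling distribution of the sample mean is normal for any sample size. If the population is not normal, the sampling distribution of the sample mean is still approximately normal when the sample size is sufficiently large.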
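The Central Limit Theorem (CLT) is a key result in probability theory that explains why sampling distributions are often approximately normal. The CLT states that if random samples of size $n$ are drawn from a population with mean $\mu$ and finite standard deviation $\sigma$, then as $n$ becomes large the sampling distribution of the sample mean approaches a normal distribution with mean $\mu$ and standard deviation $\sigma / \sqrt{n}$, regardless of the shape of the population distribution.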
This is a powerful result because it allows statisticians to apply techniques that assume normality, even when the population distribution is not normal, as long as the sample size is sufficiently large (usually $n \geq 30$ is considered large enough).
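The effect of the CLT can be seen in a small simulation. The sketch below assumes a strongly right-skewed population (an exponential distribution with rate 1, chosen only for illustration) and uses sample skewness as a rough measure of non-normality. That diagnostic is an assumption of this example rather than part of the theorem, and the helper names are hypothetical.

```python
import random
import statistics


def sample_means(n, num_samples, rng):
    """Means of num_samples samples of size n from an exponential(1) population."""
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(num_samples)
    ]


def skewness(values):
    """Sample skewness: the average cubed z-score (0 for a symmetric distribution)."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean([((v - m) / s) ** 3 for v in values])


rng = random.Random(42)

# Skewness of the simulated sampling distribution for increasing sample sizes.
skew_by_n = {n: skewness(sample_means(n, 5000, rng)) for n in (2, 10, 50)}

for n, skew in skew_by_n.items():
    print(f"n = {n:2d}: skewness of sample means = {skew:.2f}")
```

The skewness should move toward 0 as n grows, which is what a distribution approaching the symmetric normal shape would show.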
When the statistic of interest is a proportion (such as the proportion of people who favor a particular policy), the sampling distribution of the sample proportion is used.
The sampling distribution of the sample proportion $\hat{p}$ has the following properties:
Mean of the Sampling Distribution of the Sample Proportion:
$$\mu_{\hat{p}} = p$$
Where $p$ is the population proportion. The sample proportion is an unbiased estimator of the population proportion.
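Standard Deviation of the Sampling Distribution of the Sample Proportion (Standard Error): The standard deviation of the sample proportion is called the standard error of the proportion:
$$\sigma_{\hat{p}} = \sqrt{\frac{p(1 - p)}{n}}$$
Where:
$p$ is the population proportion
$n$ is the sample size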
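Shape of the Sampling Distribution of the Sample Proportion: The sampling distribution of the sample proportion $\hat{p}$ will be approximately normal if:
$$np \geq 10 \quad \text{and} \quad n(1 - p) \geq 10$$
Here $n$ is the sample size and $p$ is the population proportion. Some textbooks use 5 rather than 10 as the threshold.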
These conditions ensure that both the number of successes and failures in the sample are sufficiently large for the sampling distribution to be approximated by a normal distribution.
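The properties of the sample proportion can be sketched in the same way. The example below is illustrative only: the values p = 0.3 and n = 200, the helper names, and the threshold of 10 in the normality check (a common rule of thumb; some texts use 5) are all assumptions.

```python
import math
import random
import statistics


def normal_approx_ok(n, p, threshold=10):
    """Rule-of-thumb check: expected successes and failures both reach the threshold."""
    return n * p >= threshold and n * (1 - p) >= threshold


def simulate_proportions(n, p, num_samples, seed=1):
    """Sample proportions from num_samples samples of n yes/no outcomes."""
    rng = random.Random(seed)
    return [
        sum(rng.random() < p for _ in range(n)) / n
        for _ in range(num_samples)
    ]


# Hypothetical population proportion and sample size.
p, n = 0.3, 200
proportions = simulate_proportions(n, p, num_samples=4000)

empirical_mean = statistics.fmean(proportions)   # should be close to p
empirical_se = statistics.stdev(proportions)     # should be close to the formula
theoretical_se = math.sqrt(p * (1 - p) / n)      # about 0.0324

print(f"normal approximation reasonable: {normal_approx_ok(n, p)}")
print(f"mean of sample proportions: {empirical_mean:.4f} (p = {p})")
print(f"SD of sample proportions:   {empirical_se:.4f} (formula {theoretical_se:.4f})")
```

With a small sample and a rare outcome, such as n = 20 and p = 0.1, the check fails because only 2 successes are expected.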
In addition to the sample mean and sample proportion, we can examine the sampling distributions of other statistics, such as the sample variance or sample median. For each statistic, the sampling distribution will have its own set of properties, including its mean and standard deviation.
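For example, when the population is normally distributed with variance $\sigma^2$, the sample variance $S^2$ from a sample of size $n$ is an unbiased estimator of $\sigma^2$, and the scaled quantity $(n - 1)S^2 / \sigma^2$ follows a chi-square distribution with $n - 1$ degrees of freedom.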
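To make this concrete, here is a minimal simulation sketch for the sample variance. It assumes a normal population with sigma = 2 and samples of size n = 10 (both hypothetical), and checks two standard facts: the sample variance averages out to sigma squared, and (n - 1) * S^2 / sigma^2 has mean n - 1 and variance 2(n - 1), as a chi-square distribution with n - 1 degrees of freedom would.

```python
import random
import statistics

rng = random.Random(7)

# Hypothetical population SD, sample size, and number of simulated samples.
sigma, n, num_samples = 2.0, 10, 5000

# Sample variance (with the n - 1 denominator) of each simulated sample.
variances = [
    statistics.variance([rng.gauss(0.0, sigma) for _ in range(n)])
    for _ in range(num_samples)
]

# Scaled sample variances, which are chi-square(n - 1) for a normal population.
scaled = [(n - 1) * v / sigma**2 for v in variances]

mean_variance = statistics.fmean(variances)   # should be close to sigma**2 = 4
mean_scaled = statistics.fmean(scaled)        # should be close to n - 1 = 9
var_scaled = statistics.variance(scaled)      # should be close to 2 * (n - 1) = 18

print(f"mean of sample variances: {mean_variance:.3f}")
print(f"mean of scaled variances: {mean_scaled:.3f}")
print(f"variance of scaled variances: {var_scaled:.3f}")
```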
Sampling distributions are central to the practice of statistical inference, which is the process of drawing conclusions about a population based on sample data. They allow us to:
Make Predictions: By understanding the sampling distribution of a statistic, we can estimate the probability of observing certain values of the statistic and make predictions about future observations.
Construct Confidence Intervals: Sampling distributions help determine how much variability to expect in a sample statistic, which is essential for constructing confidence intervals around a population parameter.
Test Hypotheses: Sampling distributions provide the foundation for hypothesis tests. By comparing sample statistics to their expected values under a null hypothesis, we can assess the likelihood of observing the sample data if the null hypothesis is true.
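The last two uses can be illustrated with a short worked example based on the sampling distribution of the sample mean. This is a sketch under stated assumptions: the population standard deviation is treated as known (sigma = 10), the sampling distribution is taken to be normal, and the numbers and function names are hypothetical.

```python
import math
from statistics import NormalDist


def mean_confidence_interval(sample_mean, sigma, n, confidence=0.95):
    """Confidence interval for the population mean, assuming sigma is known."""
    se = sigma / math.sqrt(n)
    z = NormalDist().inv_cdf(0.5 + confidence / 2)   # about 1.96 for 95%
    return sample_mean - z * se, sample_mean + z * se


def z_test_two_sided(sample_mean, mu0, sigma, n):
    """Two-sided z-test of the null hypothesis that the population mean is mu0."""
    se = sigma / math.sqrt(n)
    z = (sample_mean - mu0) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical data: a sample of 100 with mean 52, tested against mu0 = 50.
# SE = 10 / sqrt(100) = 1.0
low, high = mean_confidence_interval(sample_mean=52.0, sigma=10.0, n=100)
z, p_value = z_test_two_sided(sample_mean=52.0, mu0=50.0, sigma=10.0, n=100)

print(f"95% CI: ({low:.2f}, {high:.2f})")     # about (50.04, 53.96)
print(f"z = {z:.2f}, p-value = {p_value:.4f}")  # z = 2.00, p about 0.0455
```

Both results rest on the same standard error: the interval extends about 1.96 standard errors either side of the sample mean, and the test statistic measures how many standard errors the sample mean lies from the hypothesized value.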
In practice, sampling distributions help us understand the behavior of sample statistics and make informed decisions based on sample data.