The sampling distribution of the sample variance $S^2$ is central to making inferences about the population variance $\sigma^2$ from sample data. It describes how the sample variance varies when repeated samples are taken from the same population.
1. Sample Variance $S^2$ and its Calculation
The sample variance $S^2$ measures the spread (dispersion) of the sample data. It is calculated as:
$$S^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2$$
Where:
$x_i$ are the individual sample data points,
$\bar{x}$ is the sample mean,
$n$ is the sample size.
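The calculation above can be sketched in a few lines of Python. The data values and the helper name `sample_variance` are made up for illustration; the result is compared with the standard library's `statistics.variance`, which uses the same $n-1$ divisor.

```python
import statistics


def sample_variance(data):
    """Return S^2 = sum((x_i - x_bar)^2) / (n - 1)."""
    n = len(data)
    if n < 2:
        raise ValueError("need at least two observations")
    x_bar = sum(data) / n  # sample mean
    return sum((x - x_bar) ** 2 for x in data) / (n - 1)


# Hypothetical data, chosen only to keep the arithmetic easy to follow.
sample = [4.0, 7.0, 13.0, 16.0]

s2 = sample_variance(sample)
print("S^2 by the formula:       ", s2)
print("S^2 by statistics.variance:", statistics.variance(sample))
```

Here the sample mean is 10, the squared deviations are 36, 9, 9 and 36, and dividing their sum (90) by $n-1 = 3$ gives $S^2 = 30$.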
2. The Sampling Distribution of $S^2$
When we repeatedly take samples of size $n$ from a population and compute the sample variance $S^2$ for each sample, the resulting values form the sampling distribution of $S^2$. When the population is normally distributed, this distribution has a known form.
Key Properties:
Shape of the Sampling Distribution:
When the population from which the sample is drawn is normally distributed, the sampling distribution of $S^2$ is a scaled chi-square distribution: $S^2$ itself is not chi-square distributed, but the ratio $(n-1)S^2/\sigma^2$ is.
This distribution depends on the sample size $n$ and the population variance $\sigma^2$. It is right-skewed, and the skew diminishes as $n$ grows.
Specifically, if $X_1, X_2, \ldots, X_n$ are independent and identically distributed (i.i.d.) random variables drawn from a normal population with variance $\sigma^2$, then $(n-1)S^2/\sigma^2$ follows a chi-square distribution.
Degrees of Freedom:
The scaled sample variance $(n-1)S^2/\sigma^2$ follows a chi-square distribution with $n-1$ degrees of freedom.
One degree of freedom is lost because the sample mean $\bar{x}$ is used in place of the unknown population mean when calculating $S^2$: the $n$ deviations $x_i - \bar{x}$ must sum to zero, so only $n-1$ of them are free to vary.
Mean of the Sampling Distribution:
The mean of the sampling distribution of $S^2$ is equal to the population variance $\sigma^2$:
$$E[S^2] = \sigma^2$$
This means that the sample variance $S^2$ is an unbiased estimator of the population variance $\sigma^2$.
Variance of the Sampling Distribution:
For a normally distributed population, the variance of the sampling distribution of $S^2$ is given by:
$$\mathrm{Var}(S^2) = \frac{2\sigma^4}{n-1}$$
This shows that the variability of the sample variance decreases as the sample size $n$ increases.
Standard Deviation of the Sampling Distribution (Standard Error of $S^2$):
The standard deviation of the sampling distribution of $S^2$ is called the standard error of the sample variance. It is the square root of $\mathrm{Var}(S^2)$:
$$SE_{S^2} = \sqrt{\frac{2\sigma^4}{n-1}} = \sigma^2 \sqrt{\frac{2}{n-1}}$$
This quantity measures how far a sample variance typically falls from $\sigma^2$.
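The mean, variance, and standard error stated in the key properties above can be checked with a small simulation. The sketch below uses only the Python standard library. The values $n = 10$ and $\sigma^2 = 25$, the population mean of 0, the seed, and the number of repetitions are all illustrative assumptions. The simulated figures should land close to the theoretical ones but will not match them exactly.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

n = 10               # sample size (illustrative)
sigma2 = 25.0        # population variance (illustrative)
sigma = sigma2 ** 0.5
num_samples = 20000  # number of repeated samples


def sample_variance(data):
    """S^2 with the n - 1 divisor."""
    m = sum(data) / len(data)
    return sum((x - m) ** 2 for x in data) / (len(data) - 1)


# Draw many samples from a normal population and record S^2 for each one.
variances = []
for _ in range(num_samples):
    sample = [random.gauss(0.0, sigma) for _ in range(n)]
    variances.append(sample_variance(sample))

# Empirical mean and variance of the simulated S^2 values.
mean_s2 = sum(variances) / num_samples
var_s2 = sum((v - mean_s2) ** 2 for v in variances) / (num_samples - 1)

# Theoretical values for a normal population.
theory_var = 2 * sigma2 ** 2 / (n - 1)
theory_se = theory_var ** 0.5

print("mean of S^2:", round(mean_s2, 2), " theory:", sigma2)
print("var of S^2: ", round(var_s2, 2), " theory:", round(theory_var, 2))
print("SE of S^2:  ", round(var_s2 ** 0.5, 2), " theory:", round(theory_se, 2))
```

With these settings the theoretical variance is $2 \cdot 625 / 9 \approx 138.9$ and the standard error is about 11.8, which is large relative to $\sigma^2 = 25$: with only 10 observations, a single sample variance is a noisy estimate.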
3. Chi-Square Distribution and the Sample Variance
As mentioned earlier, when the underlying population is normally distributed, a scaled version of the sample variance $S^2$ follows a chi-square distribution with $n-1$ degrees of freedom.
The relationship between the sample variance and the chi-square distribution is:
$$\frac{(n-1)S^2}{\sigma^2} \sim \chi^2_{n-1}$$
Where:
$\chi^2_{n-1}$ denotes a chi-square distribution with $n-1$ degrees of freedom,
$\sigma^2$ is the population variance,
$S^2$ is the sample variance,
$n-1$ is the degrees of freedom.
This formula states that the scaled sample variance $(n-1)S^2/\sigma^2$ follows a chi-square distribution. Multiplying $S^2$ by $(n-1)/\sigma^2$ yields a quantity whose distribution depends only on the degrees of freedom and not on the unknown $\sigma^2$, which is what allows us to make inferences about the population variance.
Example:
If we have a sample of size 10 (i.e., $n = 10$) from a normally distributed population with population variance $\sigma^2 = 25$, and we calculate the sample variance $S^2$, we can say that:
$$\frac{(10-1)S^2}{25} = \frac{9S^2}{25} \sim \chi^2_9$$
This means that the scaled sample variance $9S^2/25$ follows a chi-square distribution with 9 degrees of freedom.
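This example can be checked numerically. A chi-square distribution with $k$ degrees of freedom has mean $k$ and variance $2k$, so the scaled sample variance here should have mean close to 9 and variance close to 18. The sketch below simulates it with the standard library. The population mean of 50, the seed, and the number of repetitions are arbitrary assumptions, and 19.023 is the standard table value that a chi-square variable with 9 degrees of freedom falls below with probability 0.975.

```python
import random

random.seed(1)  # fixed seed so the run is repeatable

n = 10
sigma2 = 25.0
sigma = sigma2 ** 0.5
num_samples = 20000


def sample_variance(data):
    """S^2 with the n - 1 divisor."""
    m = sum(data) / len(data)
    return sum((x - m) ** 2 for x in data) / (len(data) - 1)


# Scaled sample variance (n - 1) * S^2 / sigma^2 for each simulated sample.
scaled = []
for _ in range(num_samples):
    # The population mean (50.0) is arbitrary; it does not affect S^2.
    sample = [random.gauss(50.0, sigma) for _ in range(n)]
    scaled.append((n - 1) * sample_variance(sample) / sigma2)

mean_q = sum(scaled) / num_samples
var_q = sum((q - mean_q) ** 2 for q in scaled) / (num_samples - 1)
frac_below = sum(q <= 19.023 for q in scaled) / num_samples

print("mean of scaled variance:", round(mean_q, 2), "(chi-square, 9 df: 9)")
print("variance of scaled variance:", round(var_q, 2), "(chi-square, 9 df: 18)")
print("fraction <= 19.023:", round(frac_below, 3), "(chi-square, 9 df: 0.975)")
```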
4. Practical Use of the Sampling Distribution of $S^2$
The sampling distribution of $S^2$ has several important applications in statistics:
Confidence Intervals for Population Variance:
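We can use the chi-square distribution to construct confidence intervals for the population variance $\sigma^2$ based on the sample variance $S^2$.
For a confidence level of $1-\alpha$ (say, 95%, so $\alpha = 0.05$), the confidence interval for $\sigma^2$ is:
$$\left( \frac{(n-1)S^2}{\chi^2_{\alpha/2,\,n-1}},\ \frac{(n-1)S^2}{\chi^2_{1-\alpha/2,\,n-1}} \right)$$
Where:
$S^2$ is the sample variance,
$n$ is the sample size,
$\chi^2_{\alpha/2,\,n-1}$ and $\chi^2_{1-\alpha/2,\,n-1}$ are the chi-square critical values with $n-1$ degrees of freedom, indexed by upper-tail area: $\chi^2_{p,\,n-1}$ is the value exceeded with probability $p$. The larger critical value, $\chi^2_{\alpha/2,\,n-1}$, therefore gives the lower limit of the interval.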
Hypothesis Testing:
The sampling distribution of $S^2$ is used in hypothesis testing for the population variance $\sigma^2$. For example, to test the null hypothesis that the population variance equals a specific value $\sigma_0^2$, we can use the following test statistic:
$$\chi^2 = \frac{(n-1)S^2}{\sigma_0^2}$$
If the null hypothesis is true and the population is normally distributed, this test statistic follows a chi-square distribution with $n-1$ degrees of freedom.
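The test can be sketched end to end as follows. The numbers ($n = 10$, $S^2 = 30$, $\sigma_0^2 = 25$) are hypothetical. Because the standard library has no chi-square distribution, the helper `chi2_cdf` below is an illustrative implementation based on the series expansion of the regularized lower incomplete gamma function; it is intended for moderate values of the statistic, not as a general-purpose routine. The two-sided p-value is taken as twice the smaller tail area, which is one common convention.

```python
import math


def chi2_cdf(x, df):
    """P(X <= x) for a chi-square variable with df degrees of freedom.

    Uses the power series for the regularized lower incomplete gamma
    function P(a, t) with a = df / 2 and t = x / 2.
    """
    if x <= 0:
        return 0.0
    a = df / 2.0
    t = x / 2.0
    term = 1.0 / a
    total = term
    k = 1
    while term > 1e-16 * total:
        term *= t / (a + k)
        total += term
        k += 1
    return total * math.exp(-t + a * math.log(t) - math.lgamma(a))


n = 10
s2 = 30.0          # hypothetical observed sample variance
sigma0_sq = 25.0   # hypothesized population variance

# Test statistic: (n - 1) * S^2 / sigma_0^2
stat = (n - 1) * s2 / sigma0_sq

# Two-sided p-value: twice the smaller of the two tail probabilities.
lower_tail = chi2_cdf(stat, n - 1)
p_value = min(1.0, 2.0 * min(lower_tail, 1.0 - lower_tail))

print("test statistic:", stat)
print("two-sided p-value:", round(p_value, 3))
print("reject H0 at the 5% level:", p_value < 0.05)
```

For these values the statistic is 10.8, well inside the central part of the chi-square distribution with 9 degrees of freedom, so the null hypothesis is not rejected at the 5% level.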
Summary of Key Points
The sampling distribution of $S^2$ (sample variance) is the distribution of the sample variances calculated from repeated samples of the same size taken from the population.
If the population is normally distributed, the scaled sample variance $(n-1)S^2/\sigma^2$ follows a chi-square distribution with $n-1$ degrees of freedom.
The mean of the sampling distribution of $S^2$ is equal to the population variance $\sigma^2$, making $S^2$ an unbiased estimator of $\sigma^2$.
For a normal population, the variance of the sampling distribution of $S^2$ is $2\sigma^4/(n-1)$.
The chi-square distribution plays a central role in constructing confidence intervals for the population variance and in hypothesis testing for variance.
Understanding the sampling distribution of $S^2$ is crucial when making inferences about population variances and performing statistical tests related to variance.