Chapter 10: Confidence Intervals
Learning Objectives
- Construct and interpret a one-proportion $z$-interval for a population proportion $p$
- Determine the minimum sample size needed to achieve a desired margin of error
- Construct and interpret a one-sample $t$-interval for a population mean $\mu$
- Verify the conditions (Random, Large Counts / Normal, 10%) before constructing a CI
- Distinguish between paired and two-sample procedures
- Identify how confidence level, sample size, and variability affect CI width
10.1 Estimating a Population Proportion
When we want to know a population proportion $p$ (the true fraction of individuals with some characteristic), we use a sample proportion $\hat{p}$ as our best single estimate. But a single number ignores sampling variability — a confidence interval captures that uncertainty by providing a range of plausible values.
Definition: Point Estimate and Confidence Interval
A point estimate is a single value used to estimate a parameter. For a population proportion, the point estimate is the sample proportion $\hat{p} = \frac{\text{count of successes}}{n}$.
A confidence interval is an interval computed from sample data that, under repeated sampling, captures the true parameter a specified percentage of the time (the confidence level).
Conditions for a One-Proportion z-Interval
Before constructing any confidence interval, you must verify three conditions:
- Random: The sample was obtained by random sampling (SRS, stratified random sample, etc.).
- Large Counts: $n\hat{p} \geq 10$ and $n(1-\hat{p}) \geq 10$ — ensures the sampling distribution of $\hat{p}$ is approximately Normal.
- 10% Condition: $n \leq 0.10N$ — the sample is at most 10% of the population, so observations are approximately independent.
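The two numerical conditions above could be checked with a short helper like the one below. This is only a sketch: the function name `check_proportion_conditions` and the population size of 50,000 are illustrative assumptions, and the Random condition cannot be verified from numbers alone.

```python
def check_proportion_conditions(n, p_hat, population_size=None):
    """Return booleans for the Large Counts and 10% conditions.

    The Random condition depends on how the data were collected,
    so it is deliberately not part of this check.
    """
    successes = n * p_hat
    failures = n * (1 - p_hat)
    large_counts = successes >= 10 and failures >= 10
    # Without a known population size the 10% condition cannot be
    # checked numerically, so report None instead of guessing.
    ten_percent = None if population_size is None else n <= 0.10 * population_size
    return {"large_counts": large_counts, "ten_percent": ten_percent}


# Hypothetical population of 50,000 (an assumption for illustration).
result = check_proportion_conditions(n=200, p_hat=0.62, population_size=50_000)
print(result)
```

Returning `None` for an unknown population size keeps the "assume the population is large" step visible instead of silently passing it.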
One-Proportion z-Interval
When conditions are met, a $C$% confidence interval for the population proportion $p$ is:
$$\hat{p} \pm z^* \cdot \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$
- $\hat{p}$ = sample proportion (center / point estimate)
- $z^* \cdot \sqrt{\hat{p}(1-\hat{p})/n}$ = margin of error (ME)
- Common critical values: $z^* = 1.645$ (90%), $z^* = 1.960$ (95%), $z^* = 2.576$ (99%)
Example 10.1 — Constructing a 95% Confidence Interval for $p$
In a random sample of $n = 200$ voters, $\hat{p} = 0.62$ said they support a ballot measure. Construct a 95% confidence interval for the true proportion $p$.
Step 1 — Conditions:
- Random: stated to be a random sample. ✓
- Large Counts: $200(0.62) = 124 \geq 10$ ✓ and $200(0.38) = 76 \geq 10$ ✓
- 10%: we assume the population of voters is much larger than $10(200) = 2000$. ✓
Step 2 — Calculate: For 95%, $z^* = 1.960$.
$$\text{ME} = 1.960 \cdot \sqrt{\frac{0.62 \cdot 0.38}{200}} = 1.960 \cdot 0.0343 = 0.0672$$
$$\text{CI}: \quad 0.62 \pm 0.0672 \implies (0.553,\ 0.687)$$
Interpretation: We are 95% confident that the true proportion of voters who support the ballot measure is between 55.3% and 68.7%.
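The calculation in Example 10.1 can be reproduced with a few lines of Python. This is a minimal sketch using only the standard library; the function name `one_proportion_z_interval` is an illustrative choice, and $z^*$ is computed from the standard Normal distribution rather than read from a table.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z_interval(p_hat, n, confidence=0.95):
    """Return (lower, upper, margin_of_error) for a one-proportion z-interval."""
    # z* leaves (1 - confidence) / 2 in each tail of the standard Normal.
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    standard_error = sqrt(p_hat * (1 - p_hat) / n)
    margin = z_star * standard_error
    return p_hat - margin, p_hat + margin, margin


lower, upper, margin = one_proportion_z_interval(p_hat=0.62, n=200, confidence=0.95)
print(f"CI = ({lower:.3f}, {upper:.3f})")
```

Because the code does not round intermediate values, the last digit of the margin of error may differ slightly from a hand calculation.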
Example 10.2 — Minimum Sample Size
How large a sample is needed so that the margin of error is at most 0.04 at 95% confidence? Use the conservative estimate $\hat{p} = 0.5$.
Solve $z^* \cdot \sqrt{\hat{p}(1-\hat{p})/n} \leq \text{ME}$ for $n$:
$$n \geq \left(\frac{z^*}{\text{ME}}\right)^2 \hat{p}(1-\hat{p}) = \left(\frac{1.96}{0.04}\right)^2 \cdot 0.25 = 2401 \cdot 0.25 = 600.25$$
Always round up: $n = 601$. Using $\hat{p} = 0.5$ gives the most conservative (largest) required sample size.
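The sample-size formula translates directly into code. The sketch below assumes a helper named `min_sample_size` (a hypothetical name) and uses `math.ceil` to enforce the "always round up" rule.

```python
from math import ceil
from statistics import NormalDist


def min_sample_size(margin_of_error, confidence=0.95, p_guess=0.5):
    """Smallest n whose margin of error is at most `margin_of_error`."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n_exact = (z_star / margin_of_error) ** 2 * p_guess * (1 - p_guess)
    # Rounding down would give a margin of error slightly above the target.
    return ceil(n_exact)


print(min_sample_size(0.04, confidence=0.95))  # 601
```

Passing a different `p_guess` (for example, a value from a pilot study) can only lower the required $n$, since $\hat{p}(1-\hat{p})$ is largest at 0.5.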
A random sample of $n = 150$ adults found $\hat{p} = 0.45$ use a streaming service. Construct a 90% confidence interval for $p$. ($z^* = 1.645$)
Show Answer
ME $= 1.645 \cdot \sqrt{(0.45)(0.55)/150} = 1.645 \cdot 0.04062 = 0.0668$.
CI: $0.45 \pm 0.0668 \implies (0.383,\ 0.517)$.
We are 90% confident the true proportion is between 38.3% and 51.7%.
AP Exam Tip: Always state and verify all three conditions before constructing a CI — you will lose points for skipping this step. When interpreting, say "We are $C$% confident the true proportion $p$ is between __ and __." Never say "the probability that $p$ is in this interval is 95%" — $p$ is fixed; the interval is random.
Simulation of 10 confidence intervals — most capture the true proportion $p = 0.50$ (dashed line), but some miss. Green = contains $p$; red = misses $p$.
Figure 10.1 — Simulated Confidence Intervals for $p$
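The "repeated sampling" meaning of the confidence level can be explored by simulation. The sketch below is an illustration only: the sample size $n = 100$, the 2000 trials, and the seed are assumptions that do not come from the figure.

```python
import random
from math import sqrt
from statistics import NormalDist


def simulate_coverage(p_true=0.50, n=100, confidence=0.95, trials=2000, seed=1):
    """Fraction of simulated one-proportion z-intervals that capture p_true."""
    rng = random.Random(seed)
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    hits = 0
    for _ in range(trials):
        # Draw one random sample and count the successes.
        successes = sum(rng.random() < p_true for _ in range(n))
        p_hat = successes / n
        margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
        if p_hat - margin <= p_true <= p_hat + margin:
            hits += 1
    return hits / trials


coverage = simulate_coverage()
print(f"Fraction of intervals capturing p: {coverage:.3f}")
```

The printed fraction should land near the stated confidence level, though not exactly on it: each individual interval either contains $p$ or misses it.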
10.2 Estimating a Population Mean
When estimating a population mean $\mu$ and the population standard deviation $\sigma$ is unknown (almost always in practice), we cannot use the $z$-procedures. Instead, we substitute the sample standard deviation $s$ and use the $t$-distribution, which has heavier tails than the Normal to account for the extra uncertainty.
Definition: $t$-Distribution
The $t$-distribution is symmetric and bell-shaped like the Normal, but has heavier tails. It is indexed by degrees of freedom $df = n - 1$. As $df \to \infty$, the $t$-distribution approaches the standard Normal $N(0,1)$.
When the conditions below are met, the standardized sample mean $t = \dfrac{\bar{x} - \mu}{s/\sqrt{n}}$ follows a $t$-distribution with $df = n - 1$.
Conditions for a One-Sample t-Interval
- Random: Data come from a random sample.
- Normal/Large Sample: Population is Normal, OR $n \geq 30$ (by CLT), OR the data show no strong skewness/outliers when $n$ is small.
- 10% Condition: $n \leq 0.10N$.
One-Sample $t$-Interval for $\mu$
When conditions are met, a $C$% confidence interval for the population mean $\mu$ is:
$$\bar{x} \pm t^* \cdot \frac{s}{\sqrt{n}}$$
- $t^*$ = critical value from the $t$-distribution with $df = n-1$ at confidence level $C$
- $s/\sqrt{n}$ = standard error of $\bar{x}$
- Use a $t$-table or calculator to find $t^*$ for given $df$ and $C$
Example 10.3 — One-Sample $t$-Interval
A random sample of $n = 20$ students has mean test score $\bar{x} = 84.5$ and standard deviation $s = 6.2$. Construct a 95% CI for the true mean $\mu$.
Conditions: Random ✓; $n = 20 < 30$ — must assume approximately Normal population ✓; 10% ✓.
$df = 19$, $t^* = 2.093$ (from $t$-table at 95%, $df=19$).
$$\text{ME} = 2.093 \cdot \frac{6.2}{\sqrt{20}} = 2.093 \cdot 1.386 = 2.902$$
$$\text{CI}: \quad 84.5 \pm 2.902 \implies (81.6,\ 87.4)$$
Interpretation: We are 95% confident that the true mean test score is between 81.6 and 87.4 points.
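Example 10.3 can be checked with a small function. This sketch takes $t^*$ as an input (looked up from a table, as in the text) because the Python standard library has no $t$-distribution; the name `one_sample_t_interval` is an illustrative assumption.

```python
from math import sqrt


def one_sample_t_interval(x_bar, s, n, t_star):
    """Return (lower, upper, margin_of_error) for a one-sample t-interval.

    t_star must be the critical value for df = n - 1 at the chosen
    confidence level, taken from a t-table or calculator.
    """
    standard_error = s / sqrt(n)
    margin = t_star * standard_error
    return x_bar - margin, x_bar + margin, margin


lower, upper, margin = one_sample_t_interval(x_bar=84.5, s=6.2, n=20, t_star=2.093)
print(f"CI = ({lower:.1f}, {upper:.1f})")
```

If SciPy is available, `scipy.stats.t.ppf(0.975, df=19)` is one way to obtain the 95% critical value without a table.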
Example 10.4 — Effect of Sample Size on CI Width
The width of a $t$-interval is $2 \cdot t^* \cdot s/\sqrt{n}$. As $n$ increases:
- $\sqrt{n}$ increases → $s/\sqrt{n}$ (standard error) decreases → ME shrinks → narrower interval.
- Additionally, $df = n-1$ increases → $t^*$ decreases (approaches $z^*$) → even smaller ME.
Quadrupling the sample size (e.g., from 25 to 100) approximately halves the margin of error.
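A quick numeric check makes the "approximately halves" claim concrete. In this sketch the standard deviation $s = 10$ is a hypothetical value, and the critical values 2.064 ($df = 24$) and 1.984 ($df = 99$) are 95% $t$-table values.

```python
from math import sqrt


def margin_of_error(s, n, t_star):
    """Margin of error of a one-sample t-interval."""
    return t_star * s / sqrt(n)


s = 10.0  # hypothetical sample standard deviation, same for both sizes
me_25 = margin_of_error(s, n=25, t_star=2.064)    # df = 24
me_100 = margin_of_error(s, n=100, t_star=1.984)  # df = 99

# The standard error is cut exactly in half; the smaller t* adds a bit more.
ratio = me_100 / me_25
print(f"ME(n=25) = {me_25:.3f}, ME(n=100) = {me_100:.3f}, ratio = {ratio:.3f}")
```

The ratio comes out slightly below 0.5 because both effects from the list above act together: the standard error halves and $t^*$ shrinks.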
A random sample of $n = 40$ has $\bar{x} = 120$ and $s = 15$. Construct a 95% CI for $\mu$. ($df = 39$, $t^* \approx 2.023$)
Show Answer
CI: $120 \pm 4.80 \implies (115.2,\ 124.8)$.
We are 95% confident the true mean is between 115.2 and 124.8.
AP Exam Tip: Three factors affect CI width: confidence level (higher level → larger critical value $t^*$ → wider), sample size $n$ (larger $n$ → narrower), and variability $s$ (larger $s$ → wider). The critical value is not a separate lever; it is how the confidence level (and, for $t$, the degrees of freedom) enters the formula. Know which direction each factor pulls the interval width.
Comparison of $t$-distribution (blue, $df=5$) vs. standard Normal (gray). The $t$-distribution has heavier tails, producing wider confidence intervals for small samples.
Figure 10.2 — $t$-Distribution vs. Standard Normal
10.3 Paired vs. Two-Sample Procedures
When comparing two groups, the choice of procedure depends on how the data were collected. Using the wrong procedure is a common AP exam error.
Paired vs. Two-Sample: Which to Use?
- Paired (Matched Pairs): Two measurements on the same subject (before/after), or on naturally matched pairs. Compute differences $d_i = x_{1i} - x_{2i}$ and use a one-sample $t$-interval on the differences.
- Two Independent Samples: Subjects in the two groups are different, unrelated individuals. Use a two-sample $t$-interval.
Paired procedures are more powerful (reduce variability) when a natural pairing exists.
Paired t-Interval
Let $\bar{d}$ = mean of differences and $s_d$ = standard deviation of differences. The paired $t$-interval is:
$$\bar{d} \pm t^* \cdot \frac{s_d}{\sqrt{n}} \qquad df = n-1$$
Two-Proportion z-Interval
For the difference of two population proportions $p_1 - p_2$:
$$(\hat{p}_1 - \hat{p}_2) \pm z^* \cdot \sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}$$
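The two-proportion formula can be sketched in the same style. The function name `two_proportion_z_interval` and the sample values in the usage line are hypothetical, chosen only to illustrate the calculation.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_interval(p1_hat, n1, p2_hat, n2, confidence=0.95):
    """Return (lower, upper) for a z-interval for p1 - p2."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # The variances of the two independent sample proportions add.
    standard_error = sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    diff = p1_hat - p2_hat
    margin = z_star * standard_error
    return diff - margin, diff + margin


# Hypothetical data: 60% of 150 in group 1, 50% of 150 in group 2.
lower, upper = two_proportion_z_interval(0.60, 150, 0.50, 150)
print(f"CI for p1 - p2 = ({lower:.3f}, {upper:.3f})")
```

An interval that contains 0 leaves "no difference" among the plausible values for $p_1 - p_2$.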
Example 10.5 — Paired $t$-Interval
Ten students took a pre-test and post-test after a tutoring program. Differences (post $-$ pre): 3, 5, $-$1, 4, 6, 2, 7, 3, 4, 5.
$\bar{d} = 3.8$, $s_d = 2.25$, $n = 10$. For 95% CI: $df = 9$, $t^* = 2.262$.
$$\text{ME} = 2.262 \cdot \frac{2.25}{\sqrt{10}} = 2.262 \cdot 0.712 = 1.61$$
$$\text{CI}: \quad 3.8 \pm 1.61 \implies (2.19,\ 5.41)$$
Interpretation: We are 95% confident the true mean improvement is between 2.19 and 5.41 points. Since the entire interval is positive, there is convincing evidence of improvement.
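A paired interval can also be computed straight from the raw differences, which avoids carrying rounded summary statistics through the calculation. The sketch below assumes a helper named `paired_t_interval` (hypothetical) and takes $t^*$ from the table value quoted in the example.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t_interval(differences, t_star):
    """Return (d_bar, s_d, (lower, upper)) for a paired t-interval.

    t_star must be the critical value for df = len(differences) - 1.
    """
    n = len(differences)
    d_bar = mean(differences)
    s_d = stdev(differences)  # sample standard deviation (divisor n - 1)
    margin = t_star * s_d / sqrt(n)
    return d_bar, s_d, (d_bar - margin, d_bar + margin)


diffs = [3, 5, -1, 4, 6, 2, 7, 3, 4, 5]  # post - pre, from Example 10.5
d_bar, s_d, (lower, upper) = paired_t_interval(diffs, t_star=2.262)
print(f"d_bar = {d_bar:.2f}, s_d = {s_d:.3f}, CI = ({lower:.2f}, {upper:.2f})")
```

Note that `statistics.stdev` uses the $n - 1$ divisor, which matches the sample standard deviation $s_d$ in the formula.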
When should you use a paired $t$-interval instead of a two-sample $t$-interval? Give a real-world example.
Show Answer
Use a paired $t$-interval when the two measurements come from the same subject or from naturally matched pairs, so the observations within each pair are dependent. Example: pre-test and post-test scores for the same ten students, as in Example 10.5. Use a two-sample $t$-interval when the two groups consist of different, unrelated individuals.
Comparison of three confidence intervals with the same center ($\hat{p} = 0.50$): wider intervals for lower $n$ or higher confidence level. Narrowest = 95%, $n=100$; middle = 95%, $n=50$; widest = 99%, $n=50$.
Figure 10.3 — Confidence Interval Width Comparison
10.4 Choosing Confidence Level and Sample Size
Researchers must balance two competing goals: high confidence (wide intervals) and precision (narrow intervals). The key levers are:
- Decrease ME: increase $n$, lower confidence level, reduce variability
- Increase ME: decrease $n$, raise confidence level, more variability
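The trade-off can be seen numerically by computing the interval width for the three settings shown in Figure 10.3. This is a sketch under the figure's assumption of $\hat{p} = 0.50$; the function name `interval_width` is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def interval_width(p_hat, n, confidence):
    """Full width (2 x margin of error) of a one-proportion z-interval."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return 2 * z_star * sqrt(p_hat * (1 - p_hat) / n)


# (sample size, confidence level) pairs from Figure 10.3
scenarios = [(100, 0.95), (50, 0.95), (50, 0.99)]
widths = [interval_width(0.50, n, conf) for n, conf in scenarios]

for (n, conf), width in zip(scenarios, widths):
    print(f"n = {n:3d}, {conf:.0%} confidence: width = {width:.3f}")
```

The widths increase down the list: halving $n$ widens the interval, and raising the confidence level at the same $n$ widens it further.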
Summary of Confidence Interval Types
| Procedure | Parameter | Critical Value | Conditions |
|---|---|---|---|
| 1-proportion $z$-interval | $p$ | $z^*$ | Random, Large Counts, 10% |
| 1-sample $t$-interval | $\mu$ | $t^*$ ($df=n-1$) | Random, Normal/Large, 10% |
| 2-proportion $z$-interval | $p_1 - p_2$ | $z^*$ | Random (both), Large Counts (both), 10% |
| 2-sample $t$-interval | $\mu_1 - \mu_2$ | $t^*$ (conservative $df$) | Random (both), Normal/Large (both), 10% |
Example 10.6 — Sample Size at 99% Confidence
How large a sample is needed so the margin of error is at most 0.03 at 99% confidence? Use the conservative $\hat{p} = 0.5$.
$$n \geq \left(\frac{z^*}{\text{ME}}\right)^2 \cdot \hat{p}(1-\hat{p}) = \left(\frac{2.576}{0.03}\right)^2 \cdot 0.25 \approx (85.87)^2 \cdot 0.25 \approx 7373.1 \cdot 0.25 \approx 1843.3$$
Round up: $n = 1844$.
Note: raising the confidence level requires a much larger sample for the same margin of error. At 95% confidence, a margin of error of 0.03 would need only $n = 1068$.
Practice Problems
A random sample of $n = 500$ adults finds $\hat{p} = 0.68$ support stricter environmental laws. Construct a 99% CI for $p$ and interpret it. ($z^* = 2.576$)
Show Solution
ME $= 2.576\cdot\sqrt{(0.68)(0.32)/500} = 2.576\cdot0.02086 = 0.0537$.
CI: $(0.626,\ 0.734)$. We are 99% confident the true proportion is between 62.6% and 73.4%.
A 95% CI for $\mu$ is $(42.1, 51.9)$. What is the point estimate $\bar{x}$ and the margin of error?
Show Solution
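Point estimate: $\bar{x} = (42.1 + 51.9)/2 = 47.0$ (the midpoint of the interval).
Margin of error: $(51.9 - 42.1)/2 = 4.9$ (half the width).
The interval is centered at 47.0 with ME of 4.9.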
A student constructs a CI and reports: "There is a 95% probability that the true proportion is in this interval." Is this correct? Explain.
Show Solution
No. The parameter $p$ is a fixed (unknown) value, so it is either in this particular interval or it is not; the randomness lies in the interval, not in $p$. The correct statement is "We are 95% confident that the interval captures the true proportion," meaning that about 95% of intervals built by this method would capture $p$.
$n = 15$, $\bar{x} = 32.4$, $s = 5.1$. Construct a 90% CI for $\mu$. ($df = 14$, $t^* = 1.761$)
Show Solution
CI: $32.4 \pm 2.319 \implies (30.1,\ 34.7)$.
We are 90% confident the true mean is between 30.1 and 34.7.
How does changing from 95% to 99% confidence affect the width of a CI, assuming the same data? Is this a desirable change?
Show Solution
The interval becomes wider, because the critical value increases (for a $z$-interval, from 1.960 to 2.576). Whether this is desirable depends on the goal: confidence rises, but precision falls.
A researcher wants ME $\leq 0.05$ at 95% confidence with no prior estimate of $p$. What sample size is required?
Show Solution
$n \geq (1.96/0.05)^2 \cdot 0.25 = (39.2)^2 \cdot 0.25 = 1536.64 \cdot 0.25 = 384.16$.
Round up: $n = 385$.
Matched pairs experiment: 8 subjects exercise before/after a diet. Differences: 2, 4, 3, 5, 1, 4, 3, 4. Construct a 95% CI for the true mean difference. ($\bar{d}=3.25$, $s_d=1.282$, $t^*=2.365$, $df=7$)
Show Solution
CI: $3.25 \pm 1.072 \implies (2.18,\ 4.32)$.
We are 95% confident the true mean reduction is between 2.18 and 4.32 units. The interval is entirely positive, providing evidence of a real reduction.
AP FRQ: Group A: $n_1=80$, $\hat{p}_1=0.55$. Group B: $n_2=100$, $\hat{p}_2=0.44$. Construct a 95% CI for $p_1 - p_2$ and determine if there is evidence of a difference.
Show Solution
SE $= \sqrt{(0.55)(0.45)/80 + (0.44)(0.56)/100} = \sqrt{0.003094 + 0.002464} = \sqrt{0.005558} = 0.07455$.
ME $= 1.96 \cdot 0.07455 = 0.1461$.
CI: $(0.55-0.44) \pm 0.1461 = 0.11 \pm 0.1461 = (-0.036,\ 0.256)$.
Since the interval contains 0, we do not have convincing evidence of a difference between $p_1$ and $p_2$ at the 95% confidence level.
📋 Chapter Summary
Confidence Interval Structure
$\text{statistic} \pm \text{critical value} \times \text{standard error}$ — or equivalently $\text{estimate} \pm \text{margin of error}$.
A 95% CI means: if we repeat this procedure many times, 95% of intervals constructed this way will capture the true parameter.
One-sample $t$-interval for $\mu$: $\bar{x} \pm t^* \cdot \dfrac{s}{\sqrt{n}}$. Use when $\sigma$ is unknown. Degrees of freedom: $df = n - 1$.
One-proportion $z$-interval for $p$: $\hat{p} \pm z^* \sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}}$. Use $z^* = 1.645$ (90%), 1.960 (95%), 2.576 (99%).
Conditions (PLAN acronym)
- Random: data come from a random sample or randomized experiment
- Normal: $n \geq 30$ (CLT) or population is Normal (for means); $n\hat{p} \geq 10$ and $n(1-\hat{p}) \geq 10$ (for proportions)
- Independent: $n \leq 10\%$ of population (10% condition)