Chapter 2: Numerical Summaries of Distributions

AP Statistics · MathHub · 2026

Learning Objectives

2.1 Measures of Center

A measure of center describes a typical or representative value in a distribution. The two most common are the mean and the median.

Mean and Median

Mean ($\bar{x}$): The arithmetic average. Add all values and divide by $n$.

$$\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} = \frac{\sum x_i}{n}$$

Median ($M$): The middle value when data are ordered. If $n$ is even, average the two middle values.

Which to Use: Mean or Median?

The choice depends on the shape of the distribution and the presence of outliers:

Roughly symmetric with no outliers: use the mean.
Skewed, or outliers present: use the median.

AP Exam Tip: Outliers and skewness pull the mean toward the tail but leave the median relatively unchanged. In right-skewed distributions, mean > median. In left-skewed distributions, mean < median.

Example 2.1 — Mean vs. Median with an Outlier

Seven students' test scores: 72, 75, 78, 80, 82, 85, 43.

Step 1 — Order the data: 43, 72, 75, 78, 80, 82, 85

Step 2 — Median: Middle value (4th of 7) = 78

Step 3 — Mean: $\bar{x} = \frac{43+72+75+78+80+82+85}{7} = \frac{515}{7} \approx \mathbf{73.6}$

The outlier 43 dragged the mean down to 73.6, below five of the seven scores. The median 78 better represents a typical score.
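The effect of the low score can be checked numerically. The sketch below uses Python's standard `statistics` module on the scores from Example 2.1; the variable names are illustrative only.

```python
from statistics import mean, median

scores = [72, 75, 78, 80, 82, 85, 43]

mean_all = mean(scores)      # 515 / 7
median_all = median(scores)  # sorts internally, then takes the middle value

# Drop the low score of 43 to see how far each statistic moves.
without_low = [s for s in scores if s != 43]
mean_trimmed = mean(without_low)
median_trimmed = median(without_low)

print(f"with 43:    mean = {mean_all:.1f}, median = {median_all}")
print(f"without 43: mean = {mean_trimmed:.1f}, median = {median_trimmed}")
```

Removing the 43 moves the mean by about 5 points but the median by only 1, which is what "resistant" means in practice.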

TRY IT

Daily high temperatures (°F) for a week: 68, 71, 70, 73, 69, 95, 72. Find the mean and median. Which better describes a typical day?

Answer:
Ordered: 68, 69, 70, 71, 72, 73, 95
Median: 71°F (4th value)
Mean: (68+69+70+71+72+73+95)/7 = 518/7 ≈ 74.0°F
The outlier 95°F inflates the mean. The median 71°F better represents a typical day.

2.2 Measures of Spread

A measure of spread describes how variable the data are. More spread means more variability. We study three: range, IQR, and standard deviation.

Range and Interquartile Range (IQR)

Range and IQR

Range = Maximum − Minimum. Simple but not resistant to outliers.

Quartiles: $Q_1$ = median of lower half; $Q_3$ = median of upper half.

IQR = $Q_3 - Q_1$. The spread of the middle 50% of data. Resistant to outliers.

Example 2.2 — Finding Q₁, Q₃, and IQR

Exam scores: 55, 62, 70, 74, 78, 82, 85, 88, 91, 96

n = 10, so the median = average of 5th and 6th values = (78+82)/2 = 80

Lower half: 55, 62, 70, 74, 78 → $Q_1$ = 70

Upper half: 82, 85, 88, 91, 96 → $Q_3$ = 88

IQR = 88 − 70 = 18

Outlier check: Any value below $Q_1 - 1.5 \cdot \text{IQR} = 70 - 27 = 43$ or above $Q_3 + 1.5 \cdot \text{IQR} = 88 + 27 = 115$ is an outlier. The lowest score, 55, is above 43 and the highest score, 96, is below 115, so this data set has no outliers.
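The quartile and fence calculations in Example 2.2 can be sketched in code. The helper names `quartiles` and `outlier_fences` are hypothetical, and the sketch assumes the median-of-halves rule described above, with the overall median left out of both halves when $n$ is odd. Software packages often use other quartile conventions, so their results can differ slightly.

```python
from statistics import median

def quartiles(data):
    """Return (Q1, median, Q3) using the median-of-halves rule."""
    ordered = sorted(data)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # For odd n, skip the middle value so it belongs to neither half.
    upper = ordered[half + 1:] if n % 2 else ordered[half:]
    return median(lower), median(ordered), median(upper)

def outlier_fences(data):
    """Return the (lower, upper) fences from the 1.5 x IQR rule."""
    q1, _, q3 = quartiles(data)
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

scores = [55, 62, 70, 74, 78, 82, 85, 88, 91, 96]
q1, med, q3 = quartiles(scores)
low, high = outlier_fences(scores)
outliers = [x for x in scores if x < low or x > high]

print(f"Q1 = {q1}, median = {med}, Q3 = {q3}, IQR = {q3 - q1}")
print(f"fences = ({low}, {high}), outliers = {outliers}")
```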

Standard Deviation

Standard Deviation and Variance

The sample standard deviation $s_x$ measures the typical distance from each data value to the mean.

$$s_x = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n-1}}$$

The variance is $s_x^2$. Divide by $n-1$ (not $n$) for an unbiased estimate of the population variance.

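Key properties of standard deviation:

$s_x \geq 0$, and $s_x = 0$ only when every data value is the same.
$s_x$ has the same units as the original data.
$s_x$ is not resistant: outliers and strong skewness inflate it.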

Example 2.3 — Computing Standard Deviation

Data: 3, 7, 7, 6, 5. Find $s_x$.

Step 1 — Mean: $\bar{x} = (3+7+7+6+5)/5 = 28/5 = 5.6$

Step 2 — Deviations squared:

| $x_i$ | $x_i - \bar{x}$ | $(x_i - \bar{x})^2$ |
|---|---|---|
| 3 | −2.6 | 6.76 |
| 7 | 1.4 | 1.96 |
| 7 | 1.4 | 1.96 |
| 6 | 0.4 | 0.16 |
| 5 | −0.6 | 0.36 |
| Sum | | 11.20 |

Step 3: $s_x = \sqrt{11.20/(5-1)} = \sqrt{2.80} \approx \mathbf{1.67}$
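The three steps of Example 2.3 can be reproduced directly and compared with the standard library's `statistics.stdev`, which also divides by $n-1$. This is a minimal sketch; the variable names are illustrative.

```python
from math import sqrt
from statistics import stdev

data = [3, 7, 7, 6, 5]
n = len(data)

# Step 1: mean
x_bar = sum(data) / n

# Step 2: squared deviations from the mean
squared_devs = [(x - x_bar) ** 2 for x in data]

# Step 3: divide by n - 1, then take the square root
s_manual = sqrt(sum(squared_devs) / (n - 1))

# Library version (sample standard deviation, n - 1 divisor)
s_library = stdev(data)

print(f"manual: {s_manual:.2f}, library: {s_library:.2f}")
```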

TRY IT

Data: 2, 4, 4, 4, 5, 5, 7, 9. Find the IQR and determine whether any outliers exist.

Answer:
n=8: Median = (4+5)/2 = 4.5
Lower half: 2, 4, 4, 4 → Q₁ = (4+4)/2 = 4
Upper half: 5, 5, 7, 9 → Q₃ = (5+7)/2 = 6
IQR = 6 − 4 = 2
Outlier fences: lower = 4 − 3 = 1; upper = 6 + 3 = 9
No values fall outside [1, 9]. The maximum, 9, sits exactly on the upper fence but not beyond it, so there are no outliers.

2.3 Choosing Summary Statistics

| Situation | Center | Spread |
|---|---|---|
| Symmetric, no outliers | Mean ($\bar{x}$) | Standard deviation ($s_x$) |
| Skewed or outliers present | Median ($M$) | IQR |

This pairing matters: always report mean with $s_x$ and median with IQR. Mixing them (e.g., median with $s_x$) is penalized on the AP exam.

AP Exam Tip: When comparing distributions, always address shape, center, spread, and outliers (SOCS) and use comparative language in context: "The IQR for Group A (18 pts) is larger than for Group B (12 pts), indicating Group A has more variability in scores."

2.4 Z-Scores (Standardized Values)

Z-Score

A z-score measures how many standard deviations a value $x$ is from the mean:

$$z = \frac{x - \bar{x}}{s_x}$$

A positive z-score means the value is above the mean; negative means below. Z-scores have no units.

Z-scores allow comparison across different scales. A student scoring 82 on Test A ($\bar{x}=75, s=8$) and 76 on Test B ($\bar{x}=70, s=4$) has $z_A = (82-75)/8 = 0.875$ and $z_B = (76-70)/4 = 1.5$. Relative to the class, the student performed better on Test B, despite the lower raw score.
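The Test A and Test B comparison can be checked with a small helper. This is a minimal sketch; `z_score` is a hypothetical function name, not part of any library.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd

z_a = z_score(82, mean=75, sd=8)  # Test A
z_b = z_score(76, mean=70, sd=4)  # Test B

better = "Test B" if z_b > z_a else "Test A"
print(f"z_A = {z_a}, z_B = {z_b}, relatively better on {better}")
```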

Example 2.4 — Interpreting Z-Scores

AP Statistics scores: $\bar{x} = 3.1$, $s_x = 0.9$. A student scored 5. Find and interpret the z-score.

$$z = \frac{5 - 3.1}{0.9} = \frac{1.9}{0.9} \approx \mathbf{2.11}$$

Interpretation: The student's score of 5 is about 2.11 standard deviations above the mean AP Statistics score.

Figure 2.1 — Normal Distribution with Adjustable Mean and Standard Deviation

2.5 The Empirical Rule (68-95-99.7 Rule)

For data that follow a Normal distribution, the Empirical Rule gives the percentage of values within 1, 2, and 3 standard deviations of the mean:

Empirical Rule (for Normal Distributions Only)

$\approx \mathbf{68\%}$ of values fall within $\bar{x} \pm 1s_x$
$\approx \mathbf{95\%}$ of values fall within $\bar{x} \pm 2s_x$
$\approx \mathbf{99.7\%}$ of values fall within $\bar{x} \pm 3s_x$
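The three percentages are rounded values of areas under the Normal curve. As a quick check, the sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later) to compute the area within 1, 2, and 3 standard deviations of the mean.

```python
from statistics import NormalDist

# Standard Normal: mean 0, standard deviation 1.
standard = NormalDist(mu=0, sigma=1)

# Area between -k and +k standard deviations, for k = 1, 2, 3.
within = {k: standard.cdf(k) - standard.cdf(-k) for k in (1, 2, 3)}

for k, area in within.items():
    print(f"within {k} SD: {area:.4f}")
```

The results, about 0.6827, 0.9545, and 0.9973, round to the 68%, 95%, and 99.7% of the rule.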

Example 2.5 — Applying the Empirical Rule

Adult male heights are approximately Normal with $\bar{x} = 70$ in and $s_x = 3$ in.

(a) What percent of men are between 67 and 73 inches?

67 = 70 − 3 = $\bar{x} - s_x$ and 73 = 70 + 3 = $\bar{x} + s_x$, so approximately 68% of men fall in this range.

(b) What percent are taller than 76 inches?

76 = 70 + 2(3) = $\bar{x} + 2s_x$. 5% are outside $\bar{x} \pm 2s_x$; half (2.5%) are above 76 inches. → approximately 2.5%

(c) Find the z-score for a man who is 74 inches tall.

$$z = \frac{74-70}{3} = \frac{4}{3} \approx 1.33$$
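For comparison, the answers in Example 2.5 can be computed from the Normal model itself rather than from the rule. The sketch below assumes heights follow a Normal distribution with mean 70 and standard deviation 3 exactly, and uses `statistics.NormalDist` from the Python standard library.

```python
from statistics import NormalDist

# Height model from Example 2.5: mean 70 in, standard deviation 3 in.
heights = NormalDist(mu=70, sigma=3)

between_67_73 = heights.cdf(73) - heights.cdf(67)  # within 1 SD of the mean
above_76 = 1 - heights.cdf(76)                     # more than 2 SDs above the mean
z_74 = (74 - 70) / 3

print(f"P(67 < X < 73) = {between_67_73:.4f}")
print(f"P(X > 76)      = {above_76:.4f}")
print(f"z for 74 in    = {z_74:.2f}")
```

The tail area above 76 inches comes out near 2.3%, slightly less than the 2.5% from the Empirical Rule, because the rule rounds the area within two standard deviations up to 95%.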

Figure 2.2 — The Empirical Rule for Normal Distributions

TRY IT

SAT Math scores are approximately Normal with $\mu = 530$ and $\sigma = 115$. (a) What percent of students score between 415 and 645? (b) What is the z-score for a student who scored 760?

Answer:
(a) 415 = 530 − 115 = μ − σ; 645 = 530 + 115 = μ + σ. By the Empirical Rule: ≈68%
(b) z = (760 − 530)/115 = 230/115 = 2.0. This score is 2 standard deviations above the mean.

Figure 2.3 — Z-Score Position on the Normal Curve

Practice Problems

1. Find the mean and median of: 4, 8, 6, 5, 3, 2, 8, 9, 3, 7. Which is larger? What does this suggest about the distribution's shape?

Solution:
Ordered: 2, 3, 3, 4, 5, 6, 7, 8, 8, 9
Median = (5+6)/2 = 5.5
Mean = 55/10 = 5.5
Mean = Median → approximately symmetric distribution.

2. Home prices in a neighborhood: $280K, $305K, $290K, $315K, $1,200K. Which measure of center better represents a typical home price? Explain.

Solution:
Ordered: 280, 290, 305, 315, 1200 (thousands)
Median = $305K
Mean = 2390/5 = $478K
The $1,200K outlier inflates the mean far above four of five prices. The median ($305K) better represents a typical home price.

3. Data: 12, 15, 18, 20, 22, 25, 28. Find (a) Q₁, (b) Q₃, (c) IQR, (d) any outliers using the 1.5×IQR rule.

Solution:
Median = 20 (4th of 7)
Lower half: 12, 15, 18 → Q₁ = 15
Upper half: 22, 25, 28 → Q₃ = 25
IQR = 10
Fences: 15−15=0, 25+15=40. All values in [0,40]. No outliers.

4. A distribution has $\bar{x}=50$ and $s_x=10$. Find the z-scores for values 35, 50, and 68. Interpret each.

Solution:
z(35) = (35−50)/10 = −1.5 (1.5 SDs below mean)
z(50) = (50−50)/10 = 0 (at the mean)
z(68) = (68−50)/10 = 1.8 (1.8 SDs above mean)

5. IQ scores are Normal with μ=100 and σ=15. Using the Empirical Rule: (a) What percent score between 70 and 130? (b) What percent score above 145?

Solution:
(a) 70=100−2(15) and 130=100+2(15) = μ±2σ → ≈95%
(b) 145=100+3(15)=μ+3σ. Only 0.3% outside μ±3σ; half (0.15%) above 145 → ≈0.15%

6. Class A scores: mean=78, s=12. Class B scores: mean=78, s=4. Both classes have the same mean. What does the difference in standard deviation tell you?

Solution:
Class A has much more variability (s=12 vs s=4). Scores in Class A are more spread out around the mean of 78 — some students scored very high or very low. Class B scores are tightly clustered near 78, indicating more consistency.

7. (AP Free Response Style) Two running groups recorded their weekly mileage. Group A: mean=42, median=38, s=18. Group B: mean=35, median=36, s=6. Compare the distributions of weekly mileage for the two groups.

Solution:
Shape: Group A's mean (42) > median (38) suggests right skew, likely from a few high-mileage runners. Group B's mean ≈ median suggests approximate symmetry.
Center: Group A has higher typical mileage (median 38 mi) than Group B (median 36 mi).
Spread: Group A has far more variability (s=18) than Group B (s=6), meaning mileage is much more consistent in Group B.
Outliers: Group A's right skew suggests possible high-mileage outliers pulling the mean up.

8. A student scored 88 on a test where $\bar{x}=80$ and $s=6$. On a second test, she scored 74 where $\bar{x}=65$ and $s=9$. On which test did she perform better relative to her class?

Solution:
Test 1: z = (88−80)/6 = 8/6 ≈ 1.33
Test 2: z = (74−65)/9 = 9/9 = 1.00
She performed better on Test 1 (z=1.33 vs z=1.00), scoring relatively higher above her class mean.

📋 Chapter Summary

Measures of Center

Mean

$\bar{x} = \dfrac{\sum x_i}{n}$ — the arithmetic average. Sensitive to outliers and skewness. Use when distribution is roughly symmetric.

Median

The middle value when the data are sorted. Resistant to outliers. Use when the distribution is skewed or has outliers.

Measures of Spread

Standard Deviation

$s = \sqrt{\dfrac{\sum(x_i - \bar{x})^2}{n-1}}$ — typical distance from the mean. Use with mean for symmetric distributions.

IQR (Interquartile Range)

$\text{IQR} = Q_3 - Q_1$ — spread of the middle 50% of data. Resistant to outliers. Use with median.

Range

$\text{Range} = \max - \min$ — simplest spread measure. Not resistant to outliers.

Outlier Rule (1.5×IQR)

A value is an outlier if it falls below $Q_1 - 1.5\times\text{IQR}$ or above $Q_3 + 1.5\times\text{IQR}$.

Choosing Summary Statistics

  1. Symmetric, no outliers → Mean and standard deviation
  2. Skewed or outliers present → Median and IQR
  3. Five-number summary → Min, Q1, Median, Q3, Max (always useful)
  4. Context matters — explain what spread/center means for the situation
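The five-number summary in item 3 can be computed with a short function. This is a minimal sketch: `five_number_summary` is a hypothetical name, and the quartiles assume the median-of-halves rule used in this chapter (for odd $n$, the median is left out of both halves). The data are from Practice Problem 3.

```python
from statistics import median

def five_number_summary(data):
    """Return (min, Q1, median, Q3, max) using median-of-halves quartiles."""
    ordered = sorted(data)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # For odd n, skip the middle value so it belongs to neither half.
    upper = ordered[half + 1:] if n % 2 else ordered[half:]
    return ordered[0], median(lower), median(ordered), median(upper), ordered[-1]

summary = five_number_summary([12, 15, 18, 20, 22, 25, 28])
print(summary)
```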

📘 Key Terms

Mean ($\bar{x}$): Average of all values. Not resistant: pulled toward outliers and skewness.
Median: Middle value in sorted data. Resistant to outliers. Preferred for skewed distributions.
Standard Deviation: Typical deviation from the mean; measures spread for symmetric distributions.
IQR: $Q_3 - Q_1$, the spread of the middle half of the data. Resistant to outliers.
Outlier: A value more than 1.5×IQR below $Q_1$ or above $Q_3$.
Five-Number Summary: Min, Q1, Median, Q3, Max. Used to construct boxplots and describe distributions.