A-Level Mathematics – Chapter 11: Statistics
Statistics underpins evidence-based reasoning in science, medicine, economics, and everyday life. This chapter builds from the foundations of data collection and summary statistics through probability laws, discrete distributions, and — at A2 level — the powerful normal distribution and formal hypothesis testing.
Specification Note
Content labelled Year 2 is A2-level only. Sections 11.1–11.5 form the AS Statistics content. The Large Data Set (LDS) is assessed in Edexcel; familiarity with context-based questions is important for all boards.
11.1 Data Collection
Key Vocabulary
- Population — the entire group of individuals or items being studied.
- Sample — a subset of the population selected for investigation.
- Census — data collected from every member of the population.
- Sampling frame — a list of all members of the population from which the sample is drawn.
- Bias — a systematic tendency for a sample to over- or under-represent certain members of the population.
Sampling Methods
| Method | Description | Advantage | Disadvantage |
|---|---|---|---|
| Simple random | Every member has an equal chance of selection (e.g. using a random number generator) | Free from bias | Requires a complete sampling frame; impractical for large populations |
| Systematic | Select every $k$th member after a random start | Simple to implement | Can introduce periodic bias if the population has a repeating pattern |
| Stratified | Divide the population into strata; sample each stratum proportionally | Represents sub-groups accurately | Strata must be clearly defined and non-overlapping |
| Quota | Interviewer selects individuals until quotas for each category are filled | Cheap and quick | Not random; prone to interviewer bias |
| Opportunity (convenience) | Select whoever is available at the time | Very easy to carry out | Highly likely to be biased |
Example 11.1.1 — Choosing a sampling method
A school has 600 students: 200 in Year 12 and 400 in Year 13. A researcher wants a sample of 60 students representative of both year groups.
Stratified sampling: Year 12 proportion $= \frac{200}{600} = \frac{1}{3}$, so select $\frac{1}{3} \times 60 = 20$ from Year 12. Select $40$ from Year 13. Within each year group, use simple random sampling to choose the individuals.
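The allocation above can be sketched in Python. This is a minimal illustration, assuming a hypothetical sampling frame in which the students in each year group are simply numbered from 1 upwards; the function name `proportional_allocation` is chosen for this sketch only.

```python
import random

def proportional_allocation(strata_sizes, sample_size):
    """Number to sample from each stratum, in proportion to stratum size."""
    total = sum(strata_sizes.values())
    return {name: round(size * sample_size / total)
            for name, size in strata_sizes.items()}

strata_sizes = {"Year 12": 200, "Year 13": 400}
allocation = proportional_allocation(strata_sizes, 60)

# Hypothetical sampling frame: students are numbered 1..N within each year.
rng = random.Random(0)  # fixed seed so the sketch is repeatable
sample = {name: rng.sample(range(1, strata_sizes[name] + 1), k)
          for name, k in allocation.items()}

print(allocation)  # {'Year 12': 20, 'Year 13': 40}
```

When the proportions do not give whole numbers, the rounded allocations may not add up to the required sample size, so the total should be checked and adjusted by hand.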
Example 11.1.2 — Identifying bias
A survey of shopping habits is conducted outside a supermarket on a Tuesday morning. Identify a source of bias.
People shopping on a Tuesday morning are likely to be retired or not in full-time employment, so the sample over-represents those groups. Workers and students are under-represented. This is an example of opportunity sampling introducing bias.
11.2 Measures of Location and Spread
Measures of Location (Averages)
- Mean: $\bar{x} = \dfrac{\sum x_i}{n}$ (or $\dfrac{\sum f_i x_i}{\sum f_i}$ for grouped or frequency data)
- Median: the middle value when data are ordered; for $n$ values, the median is at position $\frac{n+1}{2}$
- Mode: the most frequently occurring value (or modal class for grouped data)
Measures of Spread
- Range: largest value $-$ smallest value
- Interquartile range (IQR): $Q_3 - Q_1$, where $Q_1$ is the lower quartile and $Q_3$ is the upper quartile
- Variance: $\sigma^2 = \dfrac{\sum (x_i - \bar{x})^2}{n} = \dfrac{\sum x_i^2}{n} - \bar{x}^2$
- Standard deviation: $\sigma = \sqrt{\sigma^2}$
Example 11.2.1 — Calculate mean and standard deviation for the data set: 4, 7, 3, 9, 2, 8, 6.
$n = 7$, $\displaystyle\sum x_i = 4 + 7 + 3 + 9 + 2 + 8 + 6 = 39$
$\bar{x} = \dfrac{39}{7} \approx 5.571$
$\displaystyle\sum x_i^2 = 16 + 49 + 9 + 81 + 4 + 64 + 36 = 259$
$\sigma^2 = \dfrac{259}{7} - \left(\dfrac{39}{7}\right)^2 = 37 - 31.0408\ldots \approx 5.959$
$\sigma \approx 2.44$ (to 3 s.f.)
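A calculation like this can be checked with Python's standard `statistics` module. The sketch below uses the population forms (dividing by $n$), which match the formulas in this section, and confirms that the shortcut formula $\frac{\sum x_i^2}{n} - \bar{x}^2$ agrees with the library's result.

```python
import statistics

data = [4, 7, 3, 9, 2, 8, 6]
n = len(data)

mean = statistics.mean(data)

# Shortcut form: mean of the squares minus the square of the mean.
var_shortcut = sum(x * x for x in data) / n - mean ** 2

# Library population variance and standard deviation (divide by n, not n - 1).
var_library = statistics.pvariance(data)
sigma = statistics.pstdev(data)

print(round(mean, 3), round(var_shortcut, 3), round(sigma, 2))
```

Note that `statistics.variance` and `statistics.stdev` divide by $n - 1$ instead, so they give slightly larger values.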
Example 11.2.2 — Identify outliers using the IQR fence rule.
An outlier is a value that lies more than $1.5 \times \text{IQR}$ below $Q_1$ or above $Q_3$. For the ordered data set 2, 5, 6, 7, 8, 9, 10, 11, 25:
$Q_1 = 5.5$, $Q_3 = 10.5$, $\text{IQR} = 5.0$.
Lower fence $= 5.5 - 1.5 \times 5 = -2.0$. Upper fence $= 10.5 + 1.5 \times 5 = 18.0$.
The value 25 exceeds 18.0, so it is identified as an outlier.
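The fence rule is easy to automate. The sketch below assumes the quartile convention used in this example (medians of the lower and upper halves, leaving out the overall median when $n$ is odd); other conventions can give slightly different quartiles. The helper name `quartiles_by_halves` is illustrative.

```python
import statistics

def quartiles_by_halves(ordered):
    """Q1 and Q3 as the medians of the lower and upper halves.

    For an odd number of values the overall median is left out of both halves.
    """
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    upper = ordered[half + 1:] if n % 2 else ordered[half:]
    return statistics.median(lower), statistics.median(upper)

data = sorted([2, 5, 6, 7, 8, 9, 10, 11, 25])
q1, q3 = quartiles_by_halves(data)
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [x for x in data if x < lower_fence or x > upper_fence]

print(q1, q3, lower_fence, upper_fence, outliers)  # 5.5 10.5 -2.0 18.0 [25]
```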
Example 11.2.3 — Mean from a frequency table.
Find the mean from the following frequency distribution:
| $x$ | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|
| Frequency $f$ | 3 | 7 | 10 | 6 | 4 |
$\sum f = 30$, $\sum fx = 3(1) + 7(2) + 10(3) + 6(4) + 4(5) = 3 + 14 + 30 + 24 + 20 = 91$
$\bar{x} = \dfrac{91}{30} \approx 3.03$
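The same $\sum fx / \sum f$ calculation, as a short Python sketch using the table above:

```python
values = [1, 2, 3, 4, 5]
freqs = [3, 7, 10, 6, 4]

total_f = sum(freqs)                                   # sum of frequencies
total_fx = sum(f * x for x, f in zip(values, freqs))   # sum of f * x
mean = total_fx / total_f

print(total_f, total_fx, round(mean, 2))  # 30 91 3.03
```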
Example 11.2.4 — Comparing the mean and median.
Salaries (£000s) at a small firm: 22, 24, 25, 26, 28, 30, 75. Find the mean and median and comment.
Mean $= \dfrac{230}{7} \approx £32.9$k. Median $= £26$k (the 4th value in order).
The mean is pulled upward by the outlier of £75k, so the median is a better measure of a "typical" salary here.
11.3 Probability
Basic Probability Rules
- For any event $A$: $0 \le P(A) \le 1$
- $P(A') = 1 - P(A)$ (complement rule)
- Addition law: $P(A \cup B) = P(A) + P(B) - P(A \cap B)$
- Mutually exclusive: $P(A \cap B) = 0$, so $P(A \cup B) = P(A) + P(B)$
- Multiplication law: $P(A \cap B) = P(A) \times P(B \mid A)$
- Independent: $P(A \cap B) = P(A) \times P(B)$
- Conditional probability: $P(A \mid B) = \dfrac{P(A \cap B)}{P(B)}$
Figure 11.1 — A probability tree illustrating two draws from a bag containing 3 red and 2 blue balls without replacement. Each path probability is the product of its branch probabilities.
Example 11.3.1 — Addition law
In a class of 30 students, 18 study French, 12 study Spanish, and 6 study both. Find the probability that a randomly chosen student studies French or Spanish.
$P(F \cup S) = P(F) + P(S) - P(F \cap S) = \dfrac{18}{30} + \dfrac{12}{30} - \dfrac{6}{30} = \dfrac{24}{30} = \dfrac{4}{5}$
Example 11.3.2 — Conditional probability
A bag contains 4 red and 6 blue balls. Two are drawn without replacement. Find the probability that the second ball is red, given the first was red.
After drawing one red ball, 3 red and 6 blue remain (9 total).
$P(\text{2nd red} \mid \text{1st red}) = \dfrac{3}{9} = \dfrac{1}{3}$
Example 11.3.3 — Tree diagram: two draws without replacement
From a bag of 3 red (R) and 2 blue (B) balls, two are drawn without replacement. Find $P(\text{one of each colour})$.
$P(RB) = \dfrac{3}{5} \times \dfrac{2}{4} = \dfrac{6}{20}$ and $P(BR) = \dfrac{2}{5} \times \dfrac{3}{4} = \dfrac{6}{20}$
$P(\text{one of each}) = \dfrac{6}{20} + \dfrac{6}{20} = \dfrac{12}{20} = \dfrac{3}{5}$
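A tree-diagram answer can be checked by listing every equally likely ordered draw. The sketch below treats the five balls as distinguishable (an assumption made purely for counting) and uses exact fractions.

```python
from fractions import Fraction
from itertools import permutations

bag = ["R", "R", "R", "B", "B"]

# Every ordered pair of distinct balls: 5 * 4 = 20 equally likely draws.
draws = list(permutations(range(len(bag)), 2))

# Keep the draws where the two balls have different colours.
favourable = [d for d in draws if bag[d[0]] != bag[d[1]]]

prob = Fraction(len(favourable), len(draws))
print(prob)  # 3/5
```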
Example 11.3.4 — Independent events
Events $A$ and $B$ satisfy $P(A) = 0.4$, $P(B) = 0.3$ and are independent. Find $P(A \cup B)$.
$P(A \cap B) = P(A) \times P(B) = 0.4 \times 0.3 = 0.12$
$P(A \cup B) = 0.4 + 0.3 - 0.12 = 0.58$
11.4 Discrete Random Variables
Definition — Discrete Random Variable
A discrete random variable (DRV) $X$ takes a countable set of values $x_1, x_2, \ldots$, each with an associated probability. The probability distribution specifies $P(X = x_i)$ for each $x_i$. The probabilities must satisfy: $P(X = x_i) \ge 0$ and $\sum_i P(X = x_i) = 1$.
Expectation and Variance
Expected value (mean): $E(X) = \mu = \displaystyle\sum_i x_i \, P(X = x_i)$
Variance: $\text{Var}(X) = E(X^2) - [E(X)]^2$ where $E(X^2) = \displaystyle\sum_i x_i^2 \, P(X = x_i)$
Linear transformation: $E(aX + b) = aE(X) + b$ and $\text{Var}(aX + b) = a^2\,\text{Var}(X)$
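The variance rule follows from the definition $\text{Var}(X) = E(X^2) - [E(X)]^2$ together with the expectation rule. A short derivation, assuming $a$ and $b$ are constants:

$$\begin{aligned}
\text{Var}(aX + b) &= E\big[(aX + b)^2\big] - [E(aX + b)]^2 \\
&= \big(a^2 E(X^2) + 2ab\,E(X) + b^2\big) - \big(a^2 [E(X)]^2 + 2ab\,E(X) + b^2\big) \\
&= a^2\big(E(X^2) - [E(X)]^2\big) = a^2\,\text{Var}(X)
\end{aligned}$$

Adding $b$ shifts every value by the same amount, so it changes the mean but not the spread.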
Example 11.4.1 — Find $E(X)$ and $\text{Var}(X)$.
| $x$ | 1 | 2 | 3 | 4 |
|---|---|---|---|---|
| $P(X=x)$ | 0.1 | 0.3 | 0.4 | 0.2 |
$E(X) = 1(0.1) + 2(0.3) + 3(0.4) + 4(0.2) = 0.1 + 0.6 + 1.2 + 0.8 = 2.7$
$E(X^2) = 1(0.1) + 4(0.3) + 9(0.4) + 16(0.2) = 0.1 + 1.2 + 3.6 + 3.2 = 8.1$
$\text{Var}(X) = 8.1 - (2.7)^2 = 8.1 - 7.29 = 0.81$
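The same sums can be written as a short Python sketch, storing the distribution as a dictionary that maps each value to its probability (a representation chosen for this illustration):

```python
import math

dist = {1: 0.1, 2: 0.3, 3: 0.4, 4: 0.2}

# A valid distribution must have probabilities summing to 1.
assert math.isclose(sum(dist.values()), 1.0)

e_x = sum(x * p for x, p in dist.items())        # E(X)
e_x2 = sum(x ** 2 * p for x, p in dist.items())  # E(X^2)
var_x = e_x2 - e_x ** 2                          # Var(X) = E(X^2) - [E(X)]^2

print(round(e_x, 4), round(e_x2, 4), round(var_x, 4))  # 2.7 8.1 0.81
```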
Example 11.4.2 — Finding a missing probability.
Given $P(X=1) = 0.20$, $P(X=2) = k$, $P(X=3) = 0.35$, $P(X=4) = 0.15$, find $k$.
Since all probabilities sum to 1: $0.20 + k + 0.35 + 0.15 = 1 \Rightarrow k = 0.30$.
Example 11.4.3 — Linear transformation.
Given $E(X) = 3$ and $\text{Var}(X) = 2$, find $E(4X - 1)$ and $\text{Var}(4X - 1)$.
$E(4X - 1) = 4E(X) - 1 = 4(3) - 1 = 11$
$\text{Var}(4X - 1) = 4^2 \,\text{Var}(X) = 16 \times 2 = 32$
11.5 Binomial Distribution
Definition — Binomial Distribution $B(n, p)$
$X \sim B(n, p)$ if all of the following hold:
- There is a fixed number $n$ of trials.
- The trials are independent of one another.
- Each trial results in either success (probability $p$) or failure (probability $1-p$).
- The probability $p$ is constant across all trials.
The probability of exactly $r$ successes is: $$P(X = r) = \binom{n}{r} p^r (1-p)^{n-r}, \quad r = 0, 1, \ldots, n$$
Mean: $E(X) = np$
Variance: $\text{Var}(X) = np(1-p)$
Figure 11.2 — Probability distribution of $X \sim B(10,\, 0.4)$. The bars show $P(X = r)$ for $r = 0, 1, \ldots, 10$. The distribution is slightly right-skewed since $p < 0.5$.
Example 11.5.1 — A biased coin has $P(\text{heads}) = 0.6$. Tossed 8 times, find $P(X = 5)$.
$X \sim B(8,\, 0.6)$
$P(X = 5) = \dbinom{8}{5}(0.6)^5(0.4)^3 = 56 \times 0.07776 \times 0.064 \approx 0.2787$
Example 11.5.2 — Find $P(X \le 2)$ for $X \sim B(5,\, 0.3)$.
$P(X = 0) = (0.7)^5 = 0.16807$
$P(X = 1) = 5(0.3)(0.7)^4 = 0.36015$
$P(X = 2) = 10(0.3)^2(0.7)^3 = 0.30870$
$P(X \le 2) = 0.16807 + 0.36015 + 0.30870 = 0.83692 \approx 0.837$
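The binomial formula translates directly into code. The sketch below uses only the standard library; the names `binom_pmf` and `binom_cdf` are illustrative and play the same role as a calculator's binomial probability functions.

```python
from math import comb

def binom_pmf(r, n, p):
    """P(X = r) for X ~ B(n, p)."""
    return comb(n, r) * p ** r * (1 - p) ** (n - r)

def binom_cdf(r, n, p):
    """P(X <= r) for X ~ B(n, p), summing the individual probabilities."""
    return sum(binom_pmf(k, n, p) for k in range(r + 1))

print(round(binom_pmf(5, 8, 0.6), 4))  # 0.2787
print(round(binom_cdf(2, 5, 0.3), 4))  # 0.8369
```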
Example 11.5.3 — Mean and standard deviation of a binomial variable.
For $X \sim B(20,\, 0.35)$, find the mean and standard deviation.
$E(X) = np = 20 \times 0.35 = 7.0$
$\text{Var}(X) = np(1-p) = 20 \times 0.35 \times 0.65 = 4.55$
$\text{SD}(X) = \sqrt{4.55} \approx 2.13$
Example 11.5.4 — Finding $n$ and $p$ from the mean and variance.
$X \sim B(n, p)$ has $E(X) = 6$ and $\text{Var}(X) = 4.2$. Find $n$ and $p$.
$np = 6$ and $np(1-p) = 4.2$. Dividing: $1 - p = \dfrac{4.2}{6} = 0.7$, so $p = 0.3$. Then $n = \dfrac{6}{0.3} = 20$.
Exam Tip
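On the exam, use your calculator's binomial distribution functions (often labelled binomPDF and binomCDF, though the names vary by model) to compute probabilities efficiently. Always state the distribution clearly, e.g. "$X \sim B(n, p)$", before calculating, and remember to check the four conditions for a binomial model: a fixed number of trials, two possible outcomes per trial, constant $p$, and independent trials.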
11.6 Normal Distribution (Year 2)
Definition — Normal Distribution $N(\mu,\, \sigma^2)$
A continuous random variable $X$ has a normal distribution $X \sim N(\mu, \sigma^2)$ if its probability density function is the bell-shaped curve: $$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$$ Key properties: symmetric about $\mu$; mean $=$ median $=$ mode $= \mu$; approximately 68% of values lie within $\mu \pm \sigma$, 95% within $\mu \pm 2\sigma$, 99.7% within $\mu \pm 3\sigma$.
Figure 11.3 — The standard normal curve $N(0,\,1)$ with the region $-1 \le z \le 1$ shaded in blue. This region contains approximately 68.3% of the total probability.
Standardising to the Standard Normal
If $X \sim N(\mu, \sigma^2)$, then $Z = \dfrac{X - \mu}{\sigma} \sim N(0, 1)$ (the standard normal distribution). You can then read probabilities from $\Phi(z)$ tables or use your calculator's normal distribution function.
Example 11.6.1 — Find $P(X < 72)$ where $X \sim N(70,\, 25)$.
Standardise: $Z = \dfrac{72 - 70}{\sqrt{25}} = \dfrac{2}{5} = 0.4$
$P(X < 72) = P(Z < 0.4) = \Phi(0.4) \approx 0.6554$
Example 11.6.2 — Find $P(60 < X < 80)$ where $X \sim N(70,\, 100)$.
$Z_1 = \dfrac{60-70}{10} = -1$, $\quad Z_2 = \dfrac{80-70}{10} = 1$
$P(-1 < Z < 1) = \Phi(1) - \Phi(-1) = 0.8413 - 0.1587 = 0.6826$
Example 11.6.3 — Inverse normal: find $a$ such that $P(X < a) = 0.9$, where $X \sim N(50,\, 16)$.
From tables, $\Phi^{-1}(0.9) \approx 1.2816$.
$\dfrac{a - 50}{4} = 1.2816 \Rightarrow a = 50 + 4(1.2816) = 55.13$
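Examples 11.6.1 to 11.6.3 can be reproduced with `statistics.NormalDist` from Python's standard library. One point to watch in this sketch: `NormalDist` takes the standard deviation $\sigma$ as its second argument, whereas the notation $N(\mu, \sigma^2)$ quotes the variance.

```python
from statistics import NormalDist

# Example 11.6.1: X ~ N(70, 25), so sigma = 5.
p1 = NormalDist(70, 25 ** 0.5).cdf(72)

# Example 11.6.2: X ~ N(70, 100), so sigma = 10.
x2 = NormalDist(70, 10)
p2 = x2.cdf(80) - x2.cdf(60)

# Example 11.6.3: X ~ N(50, 16), so sigma = 4; inverse normal for 0.9.
a = NormalDist(50, 4).inv_cdf(0.9)

print(round(p1, 4), round(p2, 4), round(a, 2))  # 0.6554 0.6827 55.13
```

The second value differs from the table-based answer in the last decimal place because four-figure tables round $\Phi(1)$ and $\Phi(-1)$ before subtracting.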
Normal Approximation to the Binomial
When $n$ is large and $p$ is not too close to 0 or 1, $B(n, p)$ can be approximated by $N(np,\; np(1-p))$. A continuity correction must be applied: replace a discrete value $k$ with the continuous interval $(k - 0.5,\; k + 0.5)$.
Example 11.6.4 — Approximate $P(X \ge 60)$ for $X \sim B(100,\, 0.55)$ using the normal approximation.
$\mu = 55$, $\sigma^2 = 100 \times 0.55 \times 0.45 = 24.75$, $\sigma \approx 4.975$.
With continuity correction: $P(X \ge 60) \approx P(Y \ge 59.5)$ where $Y \sim N(55,\, 24.75)$.
$Z = \dfrac{59.5 - 55}{4.975} \approx 0.904$; $\quad P(Z \ge 0.904) = 1 - \Phi(0.904) \approx 1 - 0.8169 = 0.1831$
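To see how good the approximation is, the sketch below computes both the continuity-corrected normal estimate and the exact binomial tail for this example, using only the standard library.

```python
from math import comb
from statistics import NormalDist

n, p = 100, 0.55
mu = n * p
sigma = (n * p * (1 - p)) ** 0.5

# Normal approximation with continuity correction: P(X >= 60) ~ P(Y >= 59.5).
approx = 1 - NormalDist(mu, sigma).cdf(59.5)

# Exact binomial tail: sum P(X = k) for k = 60, ..., 100.
exact = sum(comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(60, n + 1))

print(round(approx, 4), round(exact, 4))
```

Comparing the two printed values shows the size of the approximation error; dropping the continuity correction (using 60 instead of 59.5) makes the agreement noticeably worse.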
11.7 Statistical Hypothesis Testing (Year 2)
Key Concepts in Hypothesis Testing
- Null hypothesis $H_0$: the assumption being tested (a specific value of the parameter).
- Alternative hypothesis $H_1$: what we believe if $H_0$ is rejected. One-tailed ($p > p_0$ or $p < p_0$) or two-tailed ($p \ne p_0$).
- Significance level $\alpha$: the probability threshold below which $H_0$ is rejected (commonly 5% or 1%).
- $p$-value: the probability of observing a result at least as extreme as the one obtained, assuming $H_0$ is true. Reject $H_0$ if $p$-value $\le \alpha$.
- Critical region: the set of values for the test statistic that lead to rejection of $H_0$.
Example 11.7.1 — One-tailed binomial test (upper tail)
A manufacturer claims a machine produces defective items with probability 0.1. An inspector tests 20 items and finds 5 defective. Test at the 5% significance level whether there is evidence that the defect rate has increased.
$H_0: p = 0.1$, $H_1: p > 0.1$ (one-tailed, upper)
Under $H_0$: $X \sim B(20,\, 0.1)$.
$P(X \ge 5) = 1 - P(X \le 4) \approx 1 - 0.9568 = 0.0432$
Since $0.0432 < 0.05$, we reject $H_0$. There is sufficient evidence at the 5% level to conclude the defect rate has increased.
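The $p$-value for an upper-tail test like this one can be computed directly. The sketch below is a minimal illustration using the standard library; the helper names are chosen for this example.

```python
from math import comb

def binom_pmf(r, n, p):
    """P(X = r) for X ~ B(n, p)."""
    return comb(n, r) * p ** r * (1 - p) ** (n - r)

def upper_tail(x, n, p):
    """P(X >= x) for X ~ B(n, p)."""
    return sum(binom_pmf(k, n, p) for k in range(x, n + 1))

n, p0, observed, alpha = 20, 0.1, 5, 0.05

p_value = upper_tail(observed, n, p0)  # probability of 5 or more defectives under H0
reject = p_value <= alpha

print(round(p_value, 4), reject)  # 0.0432 True
```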
Example 11.7.2 — Two-tailed binomial test
It is claimed a coin is fair ($p = 0.5$). In 10 tosses, 2 heads are obtained. Test at the 5% level.
$H_0: p = 0.5$, $H_1: p \ne 0.5$ (two-tailed)
Under $H_0$: $X \sim B(10,\, 0.5)$. For a two-tailed test the 5% significance level is split equally, so each tail is allocated $2.5\%$.
$P(X \le 2) = P(0) + P(1) + P(2) \approx 0.0010 + 0.0098 + 0.0439 = 0.0547$
Since $0.0547 > 0.025$, $X = 2$ does not fall in the critical region. We do not reject $H_0$. There is insufficient evidence to conclude the coin is biased.
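For a fair coin the tail probability can be found exactly by counting, since all $2^{10}$ head/tail sequences are equally likely under $H_0$. A short sketch using exact fractions:

```python
from fractions import Fraction
from math import comb

n, observed = 10, 2

# Number of sequences with 0, 1 or 2 heads, out of 2**n equally likely sequences.
lower_tail = Fraction(sum(comb(n, k) for k in range(observed + 1)), 2 ** n)

alpha = Fraction(5, 100)
reject = lower_tail <= alpha / 2  # two-tailed: compare with half of alpha

print(float(lower_tail), reject)  # 0.0546875 False
```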
Example 11.7.3 — Finding the critical region
For $H_0: p = 0.4$, $H_1: p < 0.4$ using $X \sim B(15,\, 0.4)$ at the 5% significance level, find the critical region.
Look for the largest $c$ such that $P(X \le c) \le 0.05$.
$P(X \le 2) \approx 0.0271 \le 0.05$ ✓, but $P(X \le 3) \approx 0.0905 > 0.05$ ✗
Critical region: $X \le 2$. Actual significance level: $2.71\%$.
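The search for the critical value can be sketched as a simple loop over cumulative probabilities. The function name `lower_critical_value` is illustrative, and the sketch covers only the lower-tail case shown here.

```python
from math import comb

def binom_cdf(c, n, p):
    """P(X <= c) for X ~ B(n, p)."""
    return sum(comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(c + 1))

def lower_critical_value(n, p0, alpha):
    """Largest c with P(X <= c) <= alpha, or None if no such c exists."""
    c = None
    for k in range(n + 1):
        if binom_cdf(k, n, p0) <= alpha:
            c = k
        else:
            break  # the cdf only increases, so no larger k can qualify
    return c

c = lower_critical_value(15, 0.4, 0.05)
print(c, round(binom_cdf(c, 15, 0.4), 4))  # 2 0.0271
```

The second printed value is the actual significance level of the test, which is usually smaller than the nominal level because the distribution is discrete.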
Exam Tip
Always write a conclusion in context. Do not merely write "reject $H_0$" — state what this means in terms of the original claim, e.g. "There is sufficient evidence at the 5% significance level to suggest that the probability of defects has increased." Hypothesis testing provides evidence, not proof.
Practice Problems
Problem 1
A company has 400 employees: 160 in production, 120 in sales, and 120 in administration. A stratified sample of 50 is required. How many should be selected from each department?
Solution
Production: $\dfrac{160}{400} \times 50 = 20$; Sales: $\dfrac{120}{400} \times 50 = 15$; Administration: $\dfrac{120}{400} \times 50 = 15$. Total $= 20 + 15 + 15 = 50$. ✓
Problem 2
Calculate the mean and standard deviation of: 12, 15, 11, 18, 14, 16, 13, 17.
Solution
$n = 8$, $\sum x = 116$, $\bar{x} = 14.5$
$\sum x^2 = 144 + 225 + 121 + 324 + 196 + 256 + 169 + 289 = 1724$
$\sigma^2 = \dfrac{1724}{8} - (14.5)^2 = 215.5 - 210.25 = 5.25$; $\sigma = \sqrt{5.25} \approx 2.29$
Problem 3
Events $A$ and $B$ satisfy $P(A) = 0.5$, $P(B) = 0.4$, and $P(A \cap B) = 0.2$. (a) Find $P(A \cup B)$. (b) Are $A$ and $B$ independent?
Solution
(a) $P(A \cup B) = 0.5 + 0.4 - 0.2 = 0.7$
(b) Check: $P(A) \times P(B) = 0.5 \times 0.4 = 0.2 = P(A \cap B)$. Yes, $A$ and $B$ are independent.
Problem 4
The discrete random variable $X$ has $P(X = x) = kx$ for $x = 1, 2, 3, 4$. Find $k$, $E(X)$, and $\text{Var}(X)$.
Solution
$10k = 1 \Rightarrow k = 0.1$
$E(X) = 1(0.1) + 2(0.2) + 3(0.3) + 4(0.4) = 3.0$
$E(X^2) = 1(0.1) + 4(0.2) + 9(0.3) + 16(0.4) = 10.0$
$\text{Var}(X) = 10 - 9 = 1$
Problem 5
$X \sim B(12,\, 0.25)$. Find (a) $P(X = 3)$, (b) $P(X \le 2)$, (c) $P(X \ge 4)$.
Solution
(a) $P(X=3) = \dbinom{12}{3}(0.25)^3(0.75)^9 = 220 \times 0.015625 \times 0.07508 \approx 0.2581$
(b) $P(X=0) \approx 0.03168$; $P(X=1) \approx 0.12671$; $P(X=2) \approx 0.23228$. $P(X \le 2) \approx 0.391$
(c) $P(X \le 3) = P(X \le 2) + P(X = 3) \approx 0.391 + 0.258 = 0.649$, so $P(X \ge 4) = 1 - P(X \le 3) \approx 1 - 0.649 = 0.351$
Problem 6
$X \sim N(45,\, 36)$. Find (a) $P(X < 51)$, (b) $P(X > 42)$, (c) $P(39 < X < 51)$.
Solution
(a) $Z = \dfrac{51-45}{6} = 1.0$; $P(Z < 1.0) \approx 0.8413$
(b) $Z = \dfrac{42-45}{6} = -0.5$; $P(Z > -0.5) = \Phi(0.5) \approx 0.6915$
(c) $Z_1 = -1$, $Z_2 = 1$; $P(-1 < Z < 1) \approx 0.6826$
Problem 7
A die is suspected of being biased towards 6. In 20 rolls, a 6 appears 7 times. Test at the 5% significance level whether there is evidence of bias.
Solution
$H_0: p = \frac{1}{6}$, $H_1: p > \frac{1}{6}$ (one-tailed). Under $H_0$: $X \sim B(20, \frac{1}{6})$.
$P(X \ge 7) = 1 - P(X \le 6) \approx 1 - 0.9629 = 0.0371$.
Since $0.0371 < 0.05$, we reject $H_0$. There is sufficient evidence at the 5% level to suggest the die is biased towards 6.
Problem 8
Find the value of $a$ such that $P(X < a) = 0.025$ where $X \sim N(100,\, 64)$.
Solution
$\Phi^{-1}(0.025) = -1.960$. So $\dfrac{a - 100}{8} = -1.960 \Rightarrow a = 100 - 15.68 = 84.32$.
Problem 9
$X \sim B(50,\, 0.45)$. Using a normal approximation with continuity correction, estimate $P(X \le 20)$.
Solution
$\mu = 22.5$, $\sigma^2 = 50 \times 0.45 \times 0.55 = 12.375$, $\sigma \approx 3.518$.
$P(X \le 20) \approx P(Y \le 20.5)$ where $Y \sim N(22.5, 12.375)$.
$Z = \dfrac{20.5 - 22.5}{3.518} \approx -0.569$; $P(Z \le -0.569) \approx 0.2847$.
Problem 10
A researcher believes a new treatment increases recovery rate above the standard 30%. In a trial of 25 patients, 12 recovered. Carry out a hypothesis test at the 5% significance level.
Solution
$H_0: p = 0.3$, $H_1: p > 0.3$ (one-tailed upper). Under $H_0$: $X \sim B(25, 0.3)$.
$P(X \ge 12) = 1 - P(X \le 11) \approx 1 - 0.9558 = 0.0442$.
Since $0.0442 < 0.05$, we reject $H_0$. There is sufficient evidence at the 5% level to suggest that the new treatment increases the recovery rate above 30%.