Distributions, Regression & Hypothesis Testing
Descriptive statistics summarise the key features of a data set.
If $X$ counts the number of successes in a fixed number $n$ of independent trials, each with the same probability $p$ of success, then $X$ follows a binomial distribution: $X \sim B(n, p)$.
$$P(X = r) = \binom{n}{r} p^r (1-p)^{n-r}$$
Expected value: $E(X) = np$
Variance: $\text{Var}(X) = np(1-p)$
Standard deviation: $\sigma = \sqrt{np(1-p)}$
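The formulas above can be checked numerically. The following is a minimal Python sketch, not part of the syllabus: the helper name `binomial_pmf` and the values $n = 5$, $p = 0.4$ are illustrative assumptions. It builds the full distribution from the probability formula and compares the mean and variance computed directly from it against $np$ and $np(1-p)$.

```python
from math import comb, sqrt


def binomial_pmf(r: int, n: int, p: float) -> float:
    """Return P(X = r) for X ~ B(n, p)."""
    return comb(n, r) * p**r * (1 - p) ** (n - r)


# Illustrative parameters (assumed, not taken from the notes)
n, p = 5, 0.4

# Probabilities P(X = 0), ..., P(X = n)
pmf = [binomial_pmf(r, n, p) for r in range(n + 1)]

# Mean and variance computed directly from the distribution
mean = sum(r * pr for r, pr in enumerate(pmf))
variance = sum((r - mean) ** 2 * pr for r, pr in enumerate(pmf))
std_dev = sqrt(variance)

print("sum of probabilities:", sum(pmf))
print("mean from pmf:", mean, "| np =", n * p)
print("variance from pmf:", variance, "| np(1-p) =", n * p * (1 - p))
print("standard deviation:", std_dev)
```

Up to floating-point rounding, the two ways of computing each quantity agree, which is a quick sanity check on the formulas.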
The normal distribution $X \sim N(\mu, \sigma^2)$ is a symmetric, bell-shaped continuous distribution.
In the IB exam, all normal distribution probabilities are found using the GDC (graphic display calculator), not tables.
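Pearson's product-moment correlation coefficient (PMCC) $r$ measures the strength and direction of a linear relationship:
$$r = \frac{S_{xy}}{\sqrt{S_{xx} S_{yy}}}, \quad -1 \leq r \leq 1$$
The least-squares regression line of $y$ on $x$ is:
$$\hat{y} = a + bx, \quad \text{where } b = \frac{S_{xy}}{S_{xx}}, \quad a = \bar{y} - b\bar{x}$$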
The regression line always passes through $(\bar{x}, \bar{y})$.
Use the regression line of $y$ on $x$ to predict $y$ from a given $x$ value.
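As a rough illustration of how the regression coefficients are computed, here is a small Python sketch. The data set is hypothetical (invented purely for this example), and the variable names are assumptions. It uses $S_{xy} = \sum (x - \bar{x})(y - \bar{y})$, $S_{xx} = \sum (x - \bar{x})^2$ and the standard definition $r = S_{xy} / \sqrt{S_{xx} S_{yy}}$; in an exam these values come from the GDC.

```python
from math import sqrt

# Hypothetical paired data, invented for illustration only
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Sums of squares and products about the means
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_yy = sum((yi - y_bar) ** 2 for yi in y)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

b = s_xy / s_xx            # gradient of the y-on-x regression line
a = y_bar - b * x_bar      # intercept, so the line passes through the mean point
r = s_xy / sqrt(s_xx * s_yy)  # PMCC (standard definition)


def predict(x_value: float) -> float:
    """Predict y from x using the regression line of y on x."""
    return a + b * x_value


print(f"y-hat = {a:.3f} + {b:.3f} x,  r = {r:.4f}")
print("prediction at the mean of x:", predict(x_bar), "| mean of y:", y_bar)
```

Evaluating `predict(x_bar)` returns $\bar{y}$ (up to rounding), which matches the statement that the line passes through $(\bar{x}, \bar{y})$.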
The general procedure:
Common tests in IB AA HL:
A fair six-sided die is rolled 8 times. Let $X$ be the number of times a 6 is rolled. Find $P(X \geq 2)$.
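One way to approach this is through the complement: here $X \sim B(8, \tfrac{1}{6})$, so $P(X \geq 2) = 1 - P(X = 0) - P(X = 1)$. The Python sketch below (the helper name `pmf` is an assumption for illustration) evaluates this with the binomial formula; on a GDC the same value would come from the binomial cumulative distribution function.

```python
from math import comb

n, p = 8, 1 / 6  # 8 rolls, probability 1/6 of a six on each roll


def pmf(r: int) -> float:
    """P(X = r) for X ~ B(8, 1/6)."""
    return comb(n, r) * p**r * (1 - p) ** (n - r)


# Complement rule: "at least 2" is everything except 0 or 1 sixes
p_at_least_2 = 1 - pmf(0) - pmf(1)

print(f"P(X >= 2) = {p_at_least_2:.3f}")
```

To three significant figures this gives $P(X \geq 2) \approx 0.395$.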
The mass of apples is normally distributed with mean $\mu = 180$ g and standard deviation $\sigma = 15$ g. (a) Find $P(X > 200)$. (b) Find the value of $m$ such that $P(X > m) = 0.1$.
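A possible check of both parts, sketched in Python with the standard library's `statistics.NormalDist` standing in for the GDC (the variable names are illustrative assumptions). Part (a) is an upper-tail probability; part (b) is an inverse normal calculation, since $P(X > m) = 0.1$ is equivalent to $P(X < m) = 0.9$.

```python
from statistics import NormalDist

# X ~ N(180, 15^2): mass of an apple in grams
apples = NormalDist(mu=180, sigma=15)

# (a) Upper-tail probability P(X > 200) = 1 - P(X <= 200)
p_over_200 = 1 - apples.cdf(200)

# (b) P(X > m) = 0.1 means P(X < m) = 0.9, so m is the 90th percentile
m = apples.inv_cdf(0.9)

print(f"(a) P(X > 200) = {p_over_200:.4f}")
print(f"(b) m = {m:.1f} g")
```

This gives roughly $P(X > 200) \approx 0.0912$ and $m \approx 199$ g.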
Box A contains 3 red and 2 blue balls. Box B contains 1 red and 4 blue balls. A box is chosen at random, then a ball is drawn. Given that the ball is red, find the probability it came from Box A.
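This is a conditional probability (Bayes' theorem) question. The sketch below works it through with exact fractions in Python; it assumes "chosen at random" means each box is picked with probability $\tfrac{1}{2}$, and the variable names are illustrative.

```python
from fractions import Fraction

# Assumption: each box is equally likely to be chosen
p_box_a = Fraction(1, 2)
p_box_b = Fraction(1, 2)

# Probability of drawing red from each box
p_red_given_a = Fraction(3, 5)  # 3 red out of 5 balls in Box A
p_red_given_b = Fraction(1, 5)  # 1 red out of 5 balls in Box B

# Total probability of drawing a red ball
p_red = p_box_a * p_red_given_a + p_box_b * p_red_given_b

# Bayes' theorem: P(A | red) = P(A and red) / P(red)
p_a_given_red = (p_box_a * p_red_given_a) / p_red

print("P(red) =", p_red)
print("P(Box A | red) =", p_a_given_red)
```

So $P(\text{red}) = \tfrac{2}{5}$ and $P(A \mid \text{red}) = \dfrac{3/10}{2/5} = \tfrac{3}{4}$.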
Q1. A dataset has values 3, 7, 8, 12, 15, 17, 21, 25. Find the median, lower quartile $Q_1$, upper quartile $Q_3$, and IQR.
Ordered data (8 values): 3, 7, 8, 12, 15, 17, 21, 25.
Median $= \frac{12+15}{2} = 13.5$.
Lower half: 3, 7, 8, 12 → $Q_1 = \frac{7+8}{2} = 7.5$.
Upper half: 15, 17, 21, 25 → $Q_3 = \frac{17+21}{2} = 19$.
IQR $= Q_3 - Q_1 = 19 - 7.5 = \mathbf{11.5}$.
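The working above uses the "median of each half" convention for quartiles. A short Python sketch of that same convention follows (variable names are illustrative; note that other software may use a different quartile definition and give slightly different values).

```python
from statistics import median

data = sorted([3, 7, 8, 12, 15, 17, 21, 25])

half = len(data) // 2
lower = data[:half]
# For an odd number of values, the middle value is left out of both halves
upper = data[half:] if len(data) % 2 == 0 else data[half + 1:]

q1 = median(lower)
q2 = median(data)
q3 = median(upper)
iqr = q3 - q1

print("Q1 =", q1, "| median =", q2, "| Q3 =", q3, "| IQR =", iqr)
```

For this data set the sketch reproduces $Q_1 = 7.5$, median $= 13.5$, $Q_3 = 19$ and IQR $= 11.5$.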
Q2. A player makes a free throw with probability 0.7. Find the probability of making exactly 6 out of 10 free throws. Also find the expected number and standard deviation.
$X \sim B(10, 0.7)$.
$P(X=6) = \binom{10}{6}(0.7)^6(0.3)^4 = 210 \times 0.117649 \times 0.0081 \approx \mathbf{0.200}$
$E(X) = np = 10 \times 0.7 = \mathbf{7}$
$\sigma = \sqrt{np(1-p)} = \sqrt{10 \times 0.7 \times 0.3} = \sqrt{2.1} \approx \mathbf{1.45}$
Q3. Test scores are normally distributed with $\mu = 65$ and $\sigma = 8$. What percentage of students score between 50 and 80?
Standardise: $z_1 = \frac{50-65}{8} = -1.875$ and $z_2 = \frac{80-65}{8} = 1.875$.
$P(50 < X < 80) = P(-1.875 < Z < 1.875) \approx 0.9696 - 0.0304 \approx \mathbf{93.9\%}$
(Using GDC: normalcdf$(50, 80, 65, 8) \approx 0.939$.)
Q4. Two events $A$ and $B$ satisfy $P(A) = 0.4$, $P(B) = 0.5$, $P(A \cap B) = 0.2$. Are $A$ and $B$ independent? Find $P(A \mid B)$.
Check independence: $P(A) \times P(B) = 0.4 \times 0.5 = 0.2 = P(A \cap B)$. Since $P(A \cap B) = P(A) \times P(B)$, $A$ and $B$ are independent.
$P(A \mid B) = \dfrac{P(A \cap B)}{P(B)} = \dfrac{0.2}{0.5} = \mathbf{0.4}$ (equal to $P(A)$, confirming independence).