Appendix C — Statistics
C.1 Marginal distribution
Imagine you have a box of objects of different colors and shapes:
- 3 red balls
- 6 green balls
- 1 red square
- 4 green squares
Let:
- \(X\) be the color of an object (red or green),
- \(Y\) be the shape of an object (ball or square).
If you pick an object at random, the joint distribution \(\mathbb{P}(X, Y)\) describes the probability of every (color, shape) combination:
- \(\mathbb{P}(X = \text{red}, Y = \text{ball}) = \frac{3}{14}\)
- \(\mathbb{P}(X = \text{green}, Y = \text{ball}) = \frac{6}{14}\)
- \(\mathbb{P}(X = \text{red}, Y = \text{square}) = \frac{1}{14}\)
- \(\mathbb{P}(X = \text{green}, Y = \text{square}) = \frac{4}{14}\)
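As a minimal sketch, the joint distribution can be tabulated directly from the counts in the box, assuming every object is equally likely to be picked. The names `counts`, `total`, and `joint` are illustrative choices, not part of the original text; exact fractions are used to avoid rounding.

```python
from fractions import Fraction

# Counts of each (color, shape) pair, taken from the box described above.
counts = {
    ("red", "ball"): 3,
    ("green", "ball"): 6,
    ("red", "square"): 1,
    ("green", "square"): 4,
}

# Total number of objects in the box.
total = sum(counts.values())

# Joint distribution: each count divided by the total (uniform random pick assumed).
joint = {pair: Fraction(n, total) for pair, n in counts.items()}

for (color, shape), p in joint.items():
    print(f"P(X={color}, Y={shape}) = {p}")
```

Note that `Fraction` prints values in lowest terms, so some probabilities appear reduced. The four values sum to 1, as any probability distribution must.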
If you only care about color, the marginal distribution of color, \(\mathbb{P}(X)\), is found by summing the joint probabilities over all shapes:
\[\begin{align} \mathbb{P}(X = \text{red}) &= \mathbb{P}(X = \text{red}, Y = \text{ball}) + \mathbb{P}(X = \text{red}, Y = \text{square}) \\ &= \sum_{y} \mathbb{P}(X = \text{red}, Y = y) \\ &= \frac{3}{14} + \frac{1}{14} = \frac{2}{7} \end{align}\]
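The same summation can be sketched in code. The snippet below is illustrative only: the function name `marginal_of_color` and the dictionary layout are assumptions made for this example, and the joint probabilities are derived from the counts in the box (14 objects, each equally likely).

```python
from collections import defaultdict
from fractions import Fraction

# Joint distribution P(X = color, Y = shape) for the box of 14 objects.
joint = {
    ("red", "ball"): Fraction(3, 14),
    ("green", "ball"): Fraction(6, 14),
    ("red", "square"): Fraction(1, 14),
    ("green", "square"): Fraction(4, 14),
}


def marginal_of_color(joint):
    """Sum the joint probabilities over all shapes for each color."""
    marginal = defaultdict(Fraction)  # Fraction() defaults to 0
    for (color, _shape), p in joint.items():
        marginal[color] += p
    return dict(marginal)


p_color = marginal_of_color(joint)
print(p_color["red"])    # 2/7
print(p_color["green"])  # 5/7
```

Marginalizing over color instead of shape would follow the same pattern, grouping by the second element of each key.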