7 MCMC
7.1 Monte Carlo simulation
Monte Carlo simulation is a method that uses repeated random sampling to obtain numerical results.
We will use Monte Carlo simulation to create a circle and estimate \(\pi\).
7.1.1 Create a circle
To create a circle of radius \(r = 1\) centered at \((0, 0)\), we first define a square with side length 2, extending from \([-1, 1]\) in both \(x\) and \(y\) directions. We generate random points uniformly within this square by drawing \(x\) and \(y\) values from the interval \([-1, 1]\).
set.seed(7)
runs <- 10000
xs <- runif(runs, min = -1, max = 1)
ys <- runif(runs, min = -1, max = 1)
The distance from each point to the center \((0, 0)\) is \(\sqrt{x^2 + y^2}\). A point lies within the circle if \(\sqrt{x^2 + y^2} \leq 1\).
in_circle <- xs ^ 2 + ys ^ 2 <= 1 ^ 2
df <- data.frame(iter = 1:runs, xs, ys, in_circle)
7.1.2 Estimate \(\pi\)
The area of a circle is \(\pi r^2\).
The area of the bounding square is \((2r)^2 = 4r^2\).
Dividing the area of the circle by the area of the square, we get a ratio:
\[\frac{\text{area of circle}}{\text{area of square}} = \frac{\pi r^2}{4r^2} = \frac{\pi}{4}\]
Therefore,
\[\pi = 4 \times \frac{\text{area of circle}}{\text{area of square}}\]
If we approximate the area of the circle by the number of simulated points that fall within it, and the area of the square by the total number of simulated points, we have:
\[\pi \approx 4 \times \frac{\text{simulated points within circle}}{\text{total number of simulated points}}\]
df$mc_pi <- 4 * cumsum(df$in_circle) / df$iter
Let's explore how the approximation of \(\pi\) improves as more points are sampled.
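One way to visualise this, as a minimal sketch using base R graphics and the `df` built in the chunks above, is to plot the running estimate against the number of points on a log scale:

```r
# Running estimate of pi after each new simulated point
plot(df$iter, df$mc_pi, type = "l", log = "x",
     xlab = "Number of simulated points", ylab = "Estimate of pi")
abline(h = pi, col = "red", lty = 2)  # true value of pi for reference
```

The estimate is noisy for small samples and settles near \(\pi\) as the number of points grows.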
7.1.3 MCMC in infectious disease modelling
Suppose that we have an iid (see Definition A.8) dataset of \(n\) incubation periods \(y = (y_1, y_2, \cdots, y_n)\). We assume these data are independent draws from a Gamma distribution with shape \(\alpha\) and rate \(\beta\).
A random variable \(x\) that is gamma-distributed with shape \(\alpha\) and rate \(\beta\) has this pdf:
\[f(x | \alpha, \beta) = \frac{\beta^{\alpha}}{\Gamma(\alpha)} x^{\alpha - 1} e^{-\beta x}\]
\(f(x | \alpha, \beta)\), sometimes written as \(f(x; \alpha, \beta)\) (Wikipedia), reads as: the probability density of the random variable \(x\) given that we know the parameters \(\alpha\) and \(\beta\).
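As a quick sanity check, the formula above matches R's built-in `dgamma()`, which is parameterised by shape and rate. The numeric values here are arbitrary illustrations, not from the text:

```r
# Arbitrary illustrative values
x <- 2
alpha <- 3
beta <- 0.5

# Hand-coded Gamma pdf from the formula above
f_manual <- beta ^ alpha / gamma(alpha) * x ^ (alpha - 1) * exp(-beta * x)

# Compare with R's built-in density
all.equal(f_manual, dgamma(x, shape = alpha, rate = beta))  # TRUE
```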
Step 1. Write the likelihood function of the dataset.
The likelihood of a single data point is \(f(y_1 | \alpha, \beta)\).
The likelihood of the whole dataset is:
\[\begin{align} f(y| \alpha, \beta) &= f(y_1| \alpha, \beta) \times f(y_2| \alpha, \beta) \times \cdots \times f(y_n| \alpha, \beta) \\ &= \prod_{i = 1}^{n} \frac{\beta^{\alpha}}{\Gamma(\alpha)} y_i^{\alpha - 1} e^{-\beta y_i} \\ &= \frac{\beta^{n\alpha}}{\Gamma(\alpha)^n} \prod_{i = 1}^{n} y_i^{\alpha - 1} e^{-\beta y_i} \end{align}\]
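In practice we compute the log-likelihood (the sum of log densities) to avoid numerical underflow from multiplying many small numbers. A minimal sketch, where `loglik` and the simulated `y` are illustrative names and values, not from the text:

```r
# Log-likelihood of iid Gamma data: sum of log densities
loglik <- function(alpha, beta, y) {
  sum(dgamma(y, shape = alpha, rate = beta, log = TRUE))
}

# Fake data standing in for incubation periods
set.seed(7)
y <- rgamma(50, shape = 2, rate = 0.5)
loglik(2, 0.5, y)
```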
Step 2. Assign prior distributions to the parameters of interest.
Both the shape \(\alpha\) and the rate \(\beta\) are positive, so a natural distribution to represent our prior beliefs about them is a Gamma distribution. We will also assume a priori (i.e. before we see the data) that \(\alpha\) and \(\beta\) are independent.
\[\alpha \sim \text{Gamma}(\lambda_\alpha, \nu_\alpha)\]
The pdf of \(\alpha\) is:
\[f(\alpha| \lambda_\alpha, \nu_\alpha) = \frac{\nu_\alpha^{\lambda_\alpha}}{\Gamma(\lambda_\alpha)} \alpha^{\lambda_\alpha - 1} e^{-\nu_\alpha \alpha}\]
Because \(\frac{\nu_\alpha^{\lambda_\alpha}}{\Gamma(\lambda_\alpha)}\) does not depend on \(\alpha\), we can drop it and keep only the terms involving \(\alpha\):
\[\begin{align} f(\alpha| \lambda_\alpha, \nu_\alpha) &= \frac{\nu_\alpha^{\lambda_\alpha}}{\Gamma(\lambda_\alpha)} \alpha^{\lambda_\alpha - 1} e^{-\nu_\alpha \alpha} \\ &\propto \alpha^{\lambda_\alpha - 1} e^{-\nu_\alpha \alpha} \end{align}\]
Similarly:
\[\beta \sim \text{Gamma}(\lambda_\beta, \nu_\beta)\]
\[\begin{align} f(\beta| \lambda_\beta, \nu_\beta) &= \frac{\nu_\beta^{\lambda_\beta}}{\Gamma(\lambda_\beta)} \beta^{\lambda_\beta - 1} e^{-\nu_\beta \beta} \\ &\propto \beta^{\lambda_\beta - 1} e^{-\nu_\beta \beta} \end{align}\]
Step 3. Write the joint posterior distribution.
\[\begin{align} f(\alpha, \beta | y) &\propto f(y| \alpha, \beta) \times f(\alpha) \times f(\beta) \\ &= f(y| \alpha, \beta) \times f(\alpha| \lambda_\alpha, \nu_\alpha) \times f(\beta| \lambda_\beta, \nu_\beta) \\ &= \frac{\beta^{n\alpha}}{\Gamma(\alpha)^n} \prod_{i = 1}^{n} y_i^{\alpha - 1} e^{-\beta y_i} \times \alpha^{\lambda_\alpha - 1} e^{-\nu_\alpha \alpha} \times \beta^{\lambda_\beta - 1} e^{-\nu_\beta \beta} \end{align}\]
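On the log scale, this posterior (up to an additive constant) is just the log-likelihood plus the two log-priors. A hedged sketch, where `log_posterior` and the hyperparameter defaults \(\lambda_\alpha = \nu_\alpha = \lambda_\beta = \nu_\beta = 1\) are illustrative choices, not from the text:

```r
# Unnormalised log posterior: log-likelihood + log-priors
log_posterior <- function(alpha, beta, y,
                          lambda_a = 1, nu_a = 1,
                          lambda_b = 1, nu_b = 1) {
  # Parameters must be positive; outside the support the density is 0
  if (alpha <= 0 || beta <= 0) return(-Inf)
  sum(dgamma(y, shape = alpha, rate = beta, log = TRUE)) +
    dgamma(alpha, shape = lambda_a, rate = nu_a, log = TRUE) +
    dgamma(beta, shape = lambda_b, rate = nu_b, log = TRUE)
}

# Evaluate at illustrative values on fake data
set.seed(7)
y <- rgamma(50, shape = 2, rate = 0.5)
log_posterior(2, 0.5, y)
```

A function like this is all an MCMC sampler needs: it evaluates the target density up to a constant, which is enough for the acceptance ratios used later.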
Step 4. Derive the full conditionals.
The full conditional