7  MCMC

7.1 Monte Carlo simulation

Monte Carlo simulation is a method that uses repeated random sampling to obtain numerical results.

We will use Monte Carlo simulation to create a circle and estimate \(\pi\).

7.1.1 Create a circle

To create a circle of radius \(r = 1\) centered at \((0, 0)\), we first define a square with side length 2, spanning \([-1, 1]\) in both the \(x\) and \(y\) directions. We then generate random points uniformly within this square by drawing \(x\) and \(y\) values from the interval \([-1, 1]\).

set.seed(7)

runs <- 10000                         # number of simulated points
xs <- runif(runs, min = -1, max = 1)  # x coordinates, uniform on [-1, 1]
ys <- runif(runs, min = -1, max = 1)  # y coordinates, uniform on [-1, 1]

The distance from each point to the center \((0, 0)\) is \(\sqrt{x^2 + y^2}\). A point lies within the circle if \(\sqrt{x^2 + y^2} \leq 1\).

in_circle <- xs ^ 2 + ys ^ 2 <= 1 ^ 2  # TRUE if the point lies within the unit circle
df <- data.frame(iter = 1:runs, xs, ys, in_circle)

7.1.2 Estimate \(\pi\)

The area of a circle is \(\pi r^2\).

The area of the bounding square is \((2r)^2 = 4r^2\).

Dividing the area of the circle by the area of the square, we get a ratio:

\[\frac{\text{area of circle}}{\text{area of square}} = \frac{\pi r^2}{4r^2} = \frac{\pi}{4}\]

Therefore,

\[\pi = 4 \times \frac{\text{area of circle}}{\text{area of square}}\]

If we approximate the area of the circle by the number of simulated points that fall within it, and the area of the square by the total number of simulated points, we have:

\[\pi \approx 4 \times \frac{\text{simulated points within circle}}{\text{total number of simulated points}}\]

# Running estimate of pi after each additional point
df$mc_pi <- 4 * cumsum(df$in_circle) / df$iter
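
As a quick check, the final estimate after all runs points should be close to \(\pi \approx 3.1416\); it equals 4 times the overall fraction of points inside the circle:

df$mc_pi[runs]          # final approximation of pi
4 * mean(df$in_circle)  # the same value, computed directly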

Let's explore how the approximation of \(\pi\) improves as more points are sampled.
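
The code below is a minimal plotting sketch (it assumes the ggplot2 package is available; the exact styling of Figure 7.1 may differ):

library(ggplot2)

# Plot the running estimate against the number of simulated points
ggplot(df, aes(x = iter, y = mc_pi)) +
  geom_line() +
  geom_hline(yintercept = pi, linetype = "dashed") +  # true value of pi
  labs(x = "Number of simulated points", y = "Approximation of pi")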

Figure 7.1: Approximation of pi

7.1.3 MCMC in infectious disease modelling

Suppose that we have an iid (see Definition A.8) dataset of \(n\) incubation periods \(y = (y_1, y_2, \cdots, y_n)\). We assume these data are independent draws from a Gamma distribution with shape \(\alpha\) and rate \(\beta\).

A random variable \(x\) that is gamma-distributed with shape \(\alpha\) and rate \(\beta\) has this pdf:

\[f(x | \alpha, \beta) = \frac{\beta^{\alpha}}{\Gamma(\alpha)} x^{\alpha - 1} e^{-\beta x}\]

\(f(x | \alpha, \beta)\), sometimes written as \(f(x; \alpha, \beta)\) (Wikipedia), reads as: the probability density of the random variable \(x\) given that we know the parameters \(\alpha\) and \(\beta\).
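
As a sanity check, this formula matches R's built-in dgamma() density (the values of x, alpha, and beta below are arbitrary, chosen only for illustration):

alpha <- 2
beta <- 0.5
x <- 3

beta ^ alpha / gamma(alpha) * x ^ (alpha - 1) * exp(-beta * x)  # density from the formula
dgamma(x, shape = alpha, rate = beta)                           # density from R's gamma pdf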

Step 1. Write the likelihood function of the dataset.

The likelihood of a single data point is \(f(y_1| \alpha, \beta)\).

Because the observations are independent, the likelihood of the whole dataset is the product of the individual likelihoods:

\[\begin{align} f(y| \alpha, \beta) &= f(y_1| \alpha, \beta) \times f(y_2| \alpha, \beta) \times \cdots \times f(y_n| \alpha, \beta) \\ &= \prod_{i = 1}^{n} \frac{\beta^{\alpha}}{\Gamma(\alpha)} y_i^{\alpha - 1} e^{-\beta y_i} \\ &= \frac{\beta^{n\alpha}}{\Gamma(\alpha)^n} \prod_{i = 1}^{n} y_i^{\alpha - 1} e^{-\beta y_i} \end{align}\]
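
In practice the likelihood is usually computed on the log scale to avoid numerical underflow. A minimal sketch, assuming y is a numeric vector of observed incubation periods:

# Log-likelihood of the dataset: the log of the product above
log_likelihood <- function(y, alpha, beta) {
  sum(dgamma(y, shape = alpha, rate = beta, log = TRUE))
}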

Step 2. Assign prior distributions to the parameters of interest.

Both the shape \(\alpha\) and the rate \(\beta\) are positive, so a natural distribution to represent our prior beliefs about them is a Gamma distribution. We will also assume a priori (i.e. before we see the data) that \(\alpha\) and \(\beta\) are independent.

\[\alpha \sim \text{Gamma}(\lambda_\alpha, \nu_\alpha)\]

The pdf of \(\alpha\) is:

\[f(\alpha| \lambda_\alpha, \nu_\alpha) = \frac{\nu_\alpha^{\lambda_\alpha}}{\Gamma(\lambda_\alpha)} \alpha^{\lambda_\alpha - 1} e^{-\nu_\alpha \alpha}\]

Because \(\frac{\nu_\alpha^{\lambda_\alpha}}{\Gamma(\lambda_\alpha)}\) does not depend on \(\alpha\), we can drop it and keep only the terms involving \(\alpha\):

\[\begin{align} f(\alpha| \lambda_\alpha, \nu_\alpha) &= \frac{\nu_\alpha^{\lambda_\alpha}}{\Gamma(\lambda_\alpha)} \alpha^{\lambda_\alpha - 1} e^{-\nu_\alpha \alpha} \\ &\propto \alpha^{\lambda_\alpha - 1} e^{-\nu_\alpha \alpha} \end{align}\]

Similarly:

\[\beta \sim \text{Gamma}(\lambda_\beta, \nu_\beta)\]

\[\begin{align} f(\beta| \lambda_\beta, \nu_\beta) &= \frac{\nu_\beta^{\lambda_\beta}}{\Gamma(\lambda_\beta)} \beta^{\lambda_\beta - 1} e^{-\nu_\beta \beta} \\ &\propto \beta^{\lambda_\beta - 1} e^{-\nu_\beta \beta} \end{align}\]
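
A corresponding sketch of the joint log-prior; the hyperparameters lambda_alpha, nu_alpha, lambda_beta, and nu_beta are left as arguments because the text does not fix their values:

# Joint log-prior: independent Gamma priors on alpha and beta
log_prior <- function(alpha, beta,
                      lambda_alpha, nu_alpha,
                      lambda_beta, nu_beta) {
  dgamma(alpha, shape = lambda_alpha, rate = nu_alpha, log = TRUE) +
    dgamma(beta, shape = lambda_beta, rate = nu_beta, log = TRUE)
}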

Step 3. Write the joint posterior distribution.

\[\begin{align} f(\alpha, \beta | y) &\propto f(y| \alpha, \beta) \times f(\alpha) \times f(\beta) \\ &= f(y| \alpha, \beta) \times f(\alpha| \lambda_\alpha, \nu_\alpha) \times f(\beta| \lambda_\beta, \nu_\beta) \\ &= \frac{\beta^{n\alpha}}{\Gamma(\alpha)^n} \prod_{i = 1}^{n} y_i^{\alpha - 1} e^{-\beta y_i} \times \alpha^{\lambda_\alpha - 1} e^{-\nu_\alpha \alpha} \times \beta^{\lambda_\beta - 1} e^{-\nu_\beta \beta} \end{align}\]
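
On the log scale, the unnormalised log-posterior is simply the log-likelihood plus the log-priors. A sketch using the illustrative helper functions defined above:

# Unnormalised joint log-posterior of (alpha, beta) given the data y
log_posterior <- function(alpha, beta, y,
                          lambda_alpha, nu_alpha,
                          lambda_beta, nu_beta) {
  log_likelihood(y, alpha, beta) +
    log_prior(alpha, beta, lambda_alpha, nu_alpha, lambda_beta, nu_beta)
}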

Step 4. Derive the full conditionals.

The full conditional