1. Simple SIR model with gamma distributed lengths of stay
The SIR model uses 3 compartments: S (susceptible), I (infected), R (recovered) to describe clinical status of individuals. We use the most simple form of SIR model to demonstrate how to define the distribution of the lengths of stay distribution.
The model equations are:
\[S_{t+1} - S_{t} = -\lambda S_{t} = -\frac{\beta I_{t}}{N}S_{t}\] \[I_{t+1} - I_{t} = \frac{\beta I_{t}}{N}S_{t} - \gamma I_{t}\] \[R_{t+1} - R_{t} = \gamma I_{t}\]
- \(N\): total population size, \(N = S + I + R\)
- \(\beta\): the product of contact rates and transmission probability; usually we define \(\lambda =\frac{\beta I_{t}}{N}\) as the force of infection
- \(\gamma\): recovery rate
Usually to solve the model easier we make an assumption that the recovery rate \(\gamma\) is constant, this will leads to an exponentially distributed length of stay i.e most individuals recover after 1 day being infected.
A more realistic length of stay distribution can look like this, of which most patients recovered after 4 days. We defined this using a gamma distribution with shape = 3 and rate = 1/2.
The model now look like this:
Model specification
Model transition
We have two transitions S -> I
and
I -> R
in this case. The transitions are specified in a
list follow this format "transition" = equation"
, of which
equation is defined with one of our functions for waiting time
distribution.
Another option to define the transitions is by using denim’s DSL. Refer to the Model definition in denim article for more information.
Initial state
Use a vector to define the compartments with their assigned names and
initial values in the format
compartment_name = initial_value
:
initialValues <- c(
S = 999,
I = 1,
R = 0
)
Model parameters
If we use a math expression, any symbols except the compartment names
are parameters, and would be defined by constant values. There are two
constant parameters in our example: beta
and
N
:
parameters <- c(
beta = 1.2,
N = 1000
)
Model application
Time step specification
We run the model for 30 days and give output at 0.01 daily intervals. The default interval (time step) is 1 if not declared explicitly.
simulationDuration <- 30
timeStep <- 0.01
mod <- sim(transitions = transitions,
initialValues = initialValues,
parameters = parameters,
simulationDuration = simulationDuration,
timeStep = timeStep)
head(mod)
#> Time S I R
#> 1 0.00 999.0000 1.000000 0.000000e+00
#> 2 0.01 998.9880 1.011988 2.599289e-09
#> 3 0.02 998.9759 1.024120 2.078652e-08
#> 4 0.03 998.9636 1.036396 7.019852e-08
#> 5 0.04 998.9512 1.048820 1.665435e-07
#> 6 0.05 998.9386 1.061393 3.256027e-07
plot(mod)
2. How the algorithm work?
In the SIR model, all infected individuals are presented by a single compartment I and have the same recovery rate \(\gamma\).
We want the recovery rate of individuals who had been infected for 1 day differ from the recovery rate of 2-day infected patients. So rather than using one compartment for infected (I), we define multiple infected sub-compartments. The number of sub-compartments depends on the maximum day we expect all infected individuals would be recovered.
For example, if we expect a disease with a maximum 4 days of infection, we will end up with 4 sub-compartments. Each sub-compartment has its own recovery rate \(\gamma_{1}\), \(\gamma_{2}\), \(\gamma_{3}\), \(\gamma_{4}\). At day 4 it is certain that the patient will recover (because we assume that this disease has a maximum 4 days of infection), \(\gamma_{4} = 1\).
Let \(R_1 + R_2 + R_3 + R_4 = \Sigma R\). We have \(\frac{R_1}{\Sigma R} = p_1\), \(\frac{R_2}{\Sigma R} = p_2\), \(\frac{R_3}{\Sigma R} = p_3\), \(\frac{R_4}{\Sigma R} = p_4\). Our mission is to estimate \(\gamma_{1}\), \(\gamma_{2}\), \(\gamma_{3}\) to obtain \(p_1\), \(p_2\), \(p_3\), \(p_4\) that fit a pre-defined distribution at the equilibrium state. This can be obtained by setting:
\[\gamma_{i} = \frac{p_i}{1 - \sum_{j=1}^{i-1}p_j}\]
For a given length of stay distribution, we identify the maximum
length of stay using its cumulative distribution function. Because
cumulative distribution function is asymptotic to 1 and never equal to
1, we need to set a value that is acceptable to be rounded to 1. If we
want a cumulative probability of 0.999 to be rounded as 1, we set the
error tolerance threshold as 1 - 0.999 = 0.001
(specified
by the argument errorTolerance = 0.001
). The time when
cumulative probability = 0.999 will be set as the maximum length of stay
of the compartment. Default errorTolerance
of
denim
is set at 0.001
.
Initialize population in sub-compartments
By default, the initial population is always assigned to the first
sub-compartment (for example, if the initial value for I compartment is
I = 1
, denim will initialize I1 = 1
while
I2 = I3 = I4 = 0
).
User can also choose to distribute initial population across
sub-compartments based on the specified distribution, i.e. with initial
population I = n
, then denim will initialize
I1 = n*p1
, I2 = n*p2
, I3 = n*p3
,
I4 = n*p4
. To set up the initial population in this way,
simply specify the parameter dist_init = TRUE
(only
applicable for distribution transitions).
3. Waiting time distribution
denim
offers 2 main ways to define a transition: either
by a waiting time distribution, or a mathematical expression.
Current available distributions in this package including:
d_exponential(rate)
: Discrete exponential distribution with parameterrate
d_gamma(rate, shape)
: Discrete gamma distribution with parametersrate
andshape
d_weibull(scale, shape)
: Discrete Weibull distribution with parametersscale
andshape
d_lognormal(mu, sigma)
: Discrete log-normal distribution with parametersmu
andsigma
nonparametric(waitingTimes)
: A vector of values, could be numbers, percentages, density of the length of stay based on real data,denim
will convert it into a distribution
Mathematical expression: Transition can also be described using a
math expression such as beta * S * I / N
. You will need to
define parameters that are not compartment names in the
parameters
argument
4. Multiple transitions from a compartment
In denim
, transitions between one compartment to
multiple compartments are modeled as either (i) multinomial transition
or (ii) competing risks.
Consider this example:
There are two scenarios in this example:
Susceptible individuals can be infected or vaccinated. The assumption here is they will be infected first (
S -> I
), and then the rest of them who were not infected will get vaccinated (S -> V
).Infected individuals can recover or die. If the mortality probability is known, we can implement it into the model, for example by defining
0.9 * I -> R
(90% individuals will recover) and then0.1 * I -> D
(10% of them die). By doing so, we ensure that the mortality probability is 10%, while also define the length of stay of individuals at the infected state before recover or die follows gamma or log-normal distribution, respectively.
We can define the model for this example as follows:
transitions <- denim_dsl({
S -> I = beta * S * I / N
S -> V = 5
0.9 * I -> R = d_gamma(1/3, 2)
0.1 * I -> D = d_lognormal(2, 0.5)
})
initialValues <- c(
S = 999,
I = 1,
R = 0,
V = 0,
D = 0
)
parameters <- c(
beta = 1.2,
N = 1000
)
simulationDuration <- 20
timeStep <- 0.01
mod <- sim(transitions = transitions,
initialValues = initialValues,
parameters = parameters,
simulationDuration = simulationDuration,
timeStep = timeStep)
head(mod)
#> Time S I V R D
#> 1 0.00 999.0000 1.000000 0.00 0.000000e+00 0.000000e+00
#> 2 0.01 998.9380 1.011983 0.05 4.988903e-06 0.000000e+00
#> 3 0.02 998.8759 1.024099 0.10 1.997114e-05 1.466074e-33
#> 4 0.03 998.8136 1.036349 0.15 4.500034e-05 1.651885e-29
#> 5 0.04 998.7512 1.048736 0.20 8.013109e-05 8.335817e-27
#> 6 0.05 998.6886 1.061259 0.25 1.254190e-04 8.306459e-25
plot(mod, ylim = c(0, 1000))
For a more detailed explanation for transitions to multiple states, refer to this article
5. Another example
transitions <- denim_dsl({
S -> I = beta * S * (I + IV) / N
S -> V = 2
0.1 * I -> D = d_lognormal(mu = d_mu, sigma = d_sigma)
0.9 * I -> R = d_gamma(rate = r_rate, shape = r_shape)
V -> IV = 0.1 * beta * V * (I + IV) / N
IV -> R = d_exponential(iv_r_rate)
})
initialValues <- c(
S = 999,
I = 1,
R = 0,
V = 0,
IV = 0,
D = 0
)
parameters <- c(
beta = 1.2,
N = 1000,
d_mu = 2,
d_sigma = 1/2,
r_rate = 1/3,
r_shape = 2,
iv_r_rate = 2
)
simulationDuration <- 20
timeStep <- 0.01
mod <- sim(transitions = transitions,
initialValues = initialValues,
parameters = parameters,
simulationDuration = simulationDuration,
timeStep = timeStep)
plot(mod)
head(mod)
#> Time S I V IV D R
#> 1 0.00 999.0000 1.000000 0.00000000 0.000000e+00 0.000000e+00 0.000000e+00
#> 2 0.01 998.9680 1.011983 0.02000000 0.000000e+00 0.000000e+00 4.988903e-06
#> 3 0.02 998.9359 1.024099 0.03999998 2.428759e-08 1.466074e-33 1.997114e-05
#> 4 0.03 998.9036 1.036350 0.05999993 7.296340e-08 1.651885e-29 4.500082e-05
#> 5 0.04 998.8712 1.048738 0.07999985 1.461358e-07 8.335817e-27 8.013303e-05
#> 6 0.05 998.8386 1.061263 0.09999975 2.439207e-07 8.306459e-25 1.254238e-04