Appendix A — Dictionary
Poisson process:
Trajectory or sample path: is the path that a system’s state follows over time within its state-space. Different initial states result in different trajectories. Each path in Figure 5.1 is a trajectory.
Definition A.1 Domain-specific language (DSL): is a computer language designed for a specific purpose or “domain”. DSLs act as a bridge, allowing users to write models in a familiar syntax while benefiting from the speed of compiled languages (FitzJohn et al., 2021).
Bifurcation: a small change in a system’s parameters causes the system to split into two distinct behaviors or paths.
Definition A.2 State-space models are models that use state variables to describe a system by a set of first-order differential or difference equations, rather than by one or more nth-order differential or difference equations (source). “State” refers to a set of variables that describe the current condition of the system, while “space” represents the collection of all possible values these state variables can assume.
State-space model: is a Hidden Markov Process where the system has unobservable (hidden) “state variables” which change over time, each state depends only on the current state (Markov property). Hidden states are not directly measured, but we have observation data which may contain (potentially limited) information on the state variables. In infectious disease modelling, SIR and its stochastic extensions follow this framework. The true state of the system is often unknown, but we have observed data with measurement errors (e.g., underreporting or diagnostic errors) and a plausible understanding of the measurement process that translate the hidden states into the observed data (Endo et al., 2019).
A sufficient statistic \(g(x)\) for a parameter \(\theta\) is a function of the data that captures all the information in the dataset about that parameter. If you have a sufficient statistic, you no longer need the entire dataset to do inference on \(\theta\).
Definition A.3 Function: A relationship that assigns exactly one output \(Y_i\) to each input \(X_i\).
Definition A.4 Functional form: The explicit mathematical expression that describes this relationship of a function. Example:
\[f(x) = 2x + 3\]
In probability distributions, functional form refers to the probability density function (PDF) or probability mass function (PMF).
Definition A.5 Functional form “up to a constant”: we know the shape or structure of the distribution but not the exact scaling factor that makes it a valid probability distribution (i.e., sums or integrates to 1).
Definition A.6 White noise process: a sequence of serially uncorrelated (no correlation between its values at different times) random variables with zero mean and finite variance.
Ad hoc: (literally Latin for “for this”) means built for a particular purpose, usually quickly and temporarily, generally signifies a solution designed for a specific problem or task, non-generalizable, and not intended to be able to be adapted to other purposes, source.
Definition A.7 Case fatality rate (CFR): the number of deaths divided by the number of confirmed cases (Meyerowitz-Katz & Merone, 2020).
Definition A.8 Infection fatality rate (IFR): the number of deaths divided by the number of infected individuals (Meyerowitz-Katz & Merone, 2020).
Definition A.9 Forecast: predictions about what will happen, helpful for situation awareness.
Definition A.10 Projection: estimates about what could happen, considering multiple possible outcomes, helpful for intervention planning when faced with uncertainty.
Definition A.11 Epidemic is “the occurrence of more cases of disease than expected in a given area or among a specific group of people over a particular period of time” (US-CDC) (Texier et al., 2016). Epidemic is derived from the Greek epi and demos meaning “that which is upon the people”, to describe the burden of some phenomenon on “the people” (Nelson & Williams, 2014).
Definition A.12 Tractable: can be solved (source)
Definition A.13 Microsimulation: an analysis in which individual instantiations of a system - such as a patient’s lifetime or the course of an epidemic - are generated using a random process to ‘draw’ from probability distributions a large number of times, in order to examine the central tendency and possibly the distribution of outcomes (Kim & Goldie, 2008).
Definition A.14 iid (independent and identically distributed random variables): each random variable has the same probability distribution as the others and all are mutually independent.
Definition A.15 Closed-form solution (or closed form expression) is any formula that can be evaluated in a finite number of standard operations (source).
infection fatality ratio (IFR) and case fatality ratio (CFR) and serology data.
Definition A.16 Residual: is the difference between the observed value and the estimated value.
Definition A.17 Error: is the deviation of the observed value from the true value.
Matrix and linear transformation:
- Linear transformation is a function (transformation = function) that take in a vector and spit out a vector.
- Lines remain lines, and origin remains fixed.
Eigenvector (\(\vec{v}\)):
Eigenvalue (\(\lambda\)):
Definition A.18 Inverse matrix:
\[\begin{bmatrix} a & b \\ c & d \end{bmatrix}^{-1} = \frac{1}{ad - bc} \begin{bmatrix} d & -b \\ -c & a \end{bmatrix}\]
Definition A.19 Monte Carlo methods or Monte Carlo experiments: using repeated random sampling simulation to obtain numerical results.
Transpose: flips a matrix/vector such that row become column and column become row. Notation: \(A^\intercal\)
\[\begin{align} A &= \begin{bmatrix} a & b & c \\ d & e & f \end{bmatrix} \\ A^\intercal &= \begin{bmatrix} a & d \\ b & e \\ c & f \end{bmatrix} \end{align}\]
\[\begin{align} A &= \begin{bmatrix} a & b & c \end{bmatrix} \\ A^\intercal &= \begin{bmatrix} a \\ b \\ c \end{bmatrix} \end{align}\]
Tensor: an object with > 1 dimension
Matrix: an object with 2 dimensions
A probability is between 0 and 1 and is the chance (or risk) that an event (death, infection, etc…) happens. In general this probability is defined over a period of time and will necessarily increase as the duration of this period of time increases.
Definition A.20 A rate is the number of new events that occur during specified time period (Selvin, 2004). It is thus always positive (because both the number of events and time period are positive) and always expressed per unit of time (like speed km/h or m/s). Rate can be > 1. Mathematically, it is the limit of the above-mentioned probability when the duration of the period of time tends towards zero (i.e. very small, i.e. instantaneous measure).
A proportion is the ratio of a numerator and a denominator and by definition is between 0 and 1.
What is the difference between proportion and probability? Consider this example:
A bag is filled with 100 balls, of which 30 are blue and 70 are green. Without looking, you pull out a ball, write down the color, then put it back to the bag. You repeat the action 10,000 times. The number of times you pulled out a blue ball is 3011.
- The proportion of blue balls in the bag is 30/100 = 0.3
- A Frequentist say: The probability of pulling out a blue ball is 3011/10,000 = 0.3011 because you did experiment in long-run (10,000 times).
- A Bayesian say: The probability of pulling out a blue ball is 0.3 because you know that you have a proportion of 0.3 blue balls in the bag.
Definition A.21 Hazard: is the probability an event occurs at time \(t\), conditionally on being at risk.
\[h(t) = \mathbb{P}(T = t|Survive = 1)\]
In infectious disease modelling, hazard is the force of infection (Hay et al., 2024).
Definition A.22 Cumulative incidence or cumulative distribution function: is the probability an event occurs at anytime before time \(t\).
\[I(t) = \mathbb{P}(T \leq t)\]