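# Weekly data: event counts and the amount of observation time in each week
week <- 1:7
amount_time <- c(0.85, 1.1, 0.85, 1.2, 0.95, 1.1, 0.9)
count <- c(90, 76, 37, 27, 19, 13, 9)
# Observed rate: events per unit of observation time
rate <- count / amount_time
df <- data.frame(week, amount_time, count, rate)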
21 Generalized linear models
21.1 Model
\[g(E(Y)) = \beta_0 + \beta_1 x_1 + \dots + \beta_k x_k\]
There are three components to a GLM:
Random component: the probability distribution of the outcome \(Y\), which can be any distribution in the exponential family.
Systematic component: the linear combination of predictors \(\beta_0 + \beta_1 x_1 + \dots + \beta_k x_k\).
Link function \(g\): the function that connects the mean of the random component, \(E(Y)\), to the systematic component.
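The three components can be inspected on a fitted model. The following is a minimal sketch using made-up data (the vectors `x` and `y` are illustrative and do not come from this chapter); it checks that applying the inverse link to the linear predictor reproduces the fitted means.

```r
# Made-up illustrative data (not from the text)
x <- c(1, 2, 3, 4, 5, 6)
y <- c(2, 3, 6, 7, 8, 14)

fit <- glm(y ~ x, family = poisson(link = "log"))

fam <- family(fit)                            # random component and link
eta <- drop(model.matrix(fit) %*% coef(fit))  # systematic component
mu  <- fam$linkinv(eta)                       # inverse link maps eta to the mean

fam$family                                    # name of the distribution
fam$link                                      # name of the link function
all.equal(unname(mu), unname(fitted(fit)))    # inverse link gives fitted means
```

Here `family()`, `model.matrix()`, and `fitted()` are base R functions; only the data are invented.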
21.2 Common link functions
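As a quick reference, the default link of several families in base R can be read off the family objects. This is only a sketch of the defaults; other links can be requested through each family's `link` argument.

```r
# Default link functions of some families in the stats package
links <- c(
  gaussian = gaussian()$link,
  binomial = binomial()$link,
  poisson  = poisson()$link,
  Gamma    = Gamma()$link
)
links
```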
21.3 Offset
rate = count / unit time
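An offset lets a model for counts be read as a model for rates, by accounting for the amount of time over which each count was observed.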
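A small numeric illustration of the relationship between counts and rates, using hypothetical values (`events` and `hours` are invented for this sketch):

```r
# Hypothetical counts and observation times (illustrative values only)
events <- c(12, 30, 9)
hours  <- c(2, 10, 3)

rate <- events / hours                 # events per hour
rate

# On the log scale, dividing by time becomes subtracting log(time)
log_rate <- log(events) - log(hours)
all.equal(log_rate, log(rate))
```

The subtraction on the log scale is what allows the observation time to be moved to the other side of a log-linear model.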
21.4 Poisson regression
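Poisson regression models are generalized linear models with a Poisson-distributed outcome and the logarithm as the link function. In the equations below, \(\text{count}\) stands for the expected count.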
\[\log(\text{count}) = \beta_0 + \beta \cdot \text{week}\]
If we are fitting \(\text{rate}\) instead of \(\text{count}\):
\[\begin{aligned} \log(\text{rate}) = \log\left(\frac{\text{count}}{\text{amount time}}\right) &= \beta_0 + \beta \cdot \text{week} \\ \Leftrightarrow \log(\text{count}) - \log(\text{amount time}) &= \beta_0 + \beta \cdot \text{week} \\ \Leftrightarrow \log(\text{count}) &= \beta_0 + \beta \cdot \text{week} + \log(\text{amount time}) \end{aligned}\]
The term \(\log(\text{amount time})\) is the offset: it enters the model with its coefficient fixed at 1, so no parameter is estimated for it.
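The derivation can be checked numerically. The sketch below assumes the data of this section, reading the integer vector as the counts and the values near 1 as the observation times. It fits the model twice, once with `offset()` in the formula and once through the `offset` argument of `glm()`, and then confirms that the fitted counts equal the observation time multiplied by the fitted rate. The object names `m_formula`, `m_arg`, and `fitted_rate` are chosen for this example only.

```r
# Data as read from this section (assumption: integers are the counts)
week        <- 1:7
count       <- c(90, 76, 37, 27, 19, 13, 9)
amount_time <- c(0.85, 1.1, 0.85, 1.2, 0.95, 1.1, 0.9)

# Two equivalent ways to specify the offset
m_formula <- glm(count ~ week + offset(log(amount_time)), family = poisson)
m_arg     <- glm(count ~ week, offset = log(amount_time), family = poisson)
all.equal(coef(m_formula), coef(m_arg))

# Fitted rate from the coefficients alone (offset excluded)
b <- coef(m_formula)
fitted_rate <- exp(b[1] + b[2] * week)

# Fitted counts = observation time * fitted rate
all.equal(unname(fitted(m_formula)), unname(amount_time * fitted_rate))
```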
library(ggplot2)
# Poisson regression of count on week, with log observation time as offset
mod <- glm(count ~ week + offset(log(amount_time)), family = poisson)
mod
Call: glm(formula = count ~ week + offset(log(amount_time)), family = poisson)
Coefficients:
(Intercept)         week
      5.081       -0.435
Degrees of Freedom: 6 Total (i.e. Null); 5 Residual
Null Deviance: 164.3
Residual Deviance: 2.313 AIC: 42.68
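Because the link is the logarithm, exponentiating a coefficient gives a multiplicative effect on the rate. A short sketch using the `week` coefficient copied from the printed output above (the variable names are chosen for this example):

```r
# Coefficient for week, copied from the printed model output
b_week <- -0.435

rate_ratio <- exp(b_week)              # multiplicative change in rate per week
pct_change <- 100 * (rate_ratio - 1)   # the same change as a percentage

round(rate_ratio, 3)   # 0.647
round(pct_change, 1)   # -35.3
```

Under this model, the estimated rate falls by roughly 35% with each additional week.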
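# Predicted counts on the response scale (the offset is included)
pred <- predict(mod, type = "response")
# Observed counts as points, fitted counts as a line
ggplot(df, aes(x = week, y = count)) +
  geom_point() +
  geom_line(aes(y = pred)) +
  theme_classic()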
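To predict for new weeks, the new data must also supply the offset variable. The sketch below assumes the data of this section (integers read as counts) and a hypothetical prediction grid with one unit of observation time per row, so that the predicted count is the predicted rate. The names `grid` and `grid$rate` are introduced for this example.

```r
week        <- 1:7
count       <- c(90, 76, 37, 27, 19, 13, 9)
amount_time <- c(0.85, 1.1, 0.85, 1.2, 0.95, 1.1, 0.9)
df  <- data.frame(week, amount_time, count)
mod <- glm(count ~ week + offset(log(amount_time)),
           family = poisson, data = df)

# Hypothetical grid: finer weeks, one unit of observation time each
grid <- data.frame(week = seq(1, 7, by = 0.5), amount_time = 1)

# With amount_time = 1 the offset is log(1) = 0, so this is the fitted rate
grid$rate <- predict(mod, newdata = grid, type = "response")

head(grid)
```

Setting `amount_time` to another value in `grid` would instead give the expected count for that amount of observation time.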