Count models with offset

When count data record the number of events happening within a time interval, one must ensure that all intervals have the same duration. If this is not the case, one option is to derive the probability of observing k events from the underlying hazard function, as explained on the TTE documentation page. The second option is to use an offset, as shown here.

Let’s assume that our data records the number of events during the follow-up period. The data is considered as count data (i.e. integer values) and we would like to model this data using a negative binomial distribution:

$$Y_i \sim \text{NegativeBinomial}(\lambda, o)$$

with $\lambda$ the mean number of events and $o$ (omicron) the parameter that controls the overdispersion compared to the Poisson distribution. This model can be selected from the library of Count models.

However, the number of events we observe tends to be larger for individuals who have a long follow-up period. To take this into account, we can let $\lambda$ depend on the duration of the follow-up observation, called DUR. We expect the number of events to be proportional to the follow-up observation duration (often called offset), i.e. doubling the observation period leads to twice as many events on average. This would correspond to:

$$\lambda_i = \lambda_{pop} \times DUR_i$$

This equation is equivalent to:

$$\log(\lambda_i) = \log(\lambda_{pop}) + \beta \times \log(DUR_i) + \eta_i$$

with $\beta = 1$ and $\eta_i = 0$.

We recognize in the equation above the typical formula for a lognormally distributed parameter.
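As a quick numerical check, the sketch below evaluates both forms of the relationship (proportional, and log-linear with the coefficient of log(DUR) fixed to 1 and no random effect) and confirms that they give the same individual means. The value of lambda_pop and the durations are hypothetical and chosen only for illustration.

```r
# Hypothetical values, chosen only for illustration
lambda_pop <- 0.8        # typical mean number of events per unit of follow-up time
DUR <- c(1, 2, 4, 8)     # follow-up durations, in the same time unit

# Proportional form: lambda_i = lambda_pop * DUR_i
lambda_prop <- lambda_pop * DUR

# Log-linear form with beta fixed to 1 and no random effect (eta_i = 0)
beta <- 1
eta <- 0
lambda_loglin <- exp(log(lambda_pop) + beta * log(DUR) + eta)

# Both columns of individual means should be identical
print(cbind(DUR, lambda_prop, lambda_loglin))
```

Doubling DUR doubles the mean in both columns, which is the proportionality assumed for the offset.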

Thus we set in Monolix:

lambda with a lognormal distribution and no random effects (unless there are several observations per individual, which would make it possible to estimate inter-individual variability on lambda)
new covariate tlogDUR = log(DUR) added on lambda
beta_tlogDUR fixed to 1
omicron with a lognormal distribution and no random effects (same overdispersion for all individuals)

Equivalence with glm.nb in R

In R, one could fit a negative binomial model using the glm.nb() function from the MASS package. Considering an offset for the duration of the interval DUR and, in addition, the effect of the covariate WT on the mean number of events, one could write:

library(MASS)  # provides glm.nb()
glm.nb(Count ~ log(WT/70) + offset(log(DUR)), data = count_data, link = log)

In Monolix, the negative binomial model for count data can be selected from the count library, using the parametrization with lambda (mean number of events) and omicron (overdispersion).

The link=log argument indicates that the formula applies to the log of the mean number of events. The offset indicates that log(DUR) enters the formula with a coefficient fixed to 1. In contrast, the covariate log(WT/70) enters the formula with a coefficient that is estimated.
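To illustrate the role of the offset, here is a minimal sketch on simulated data. The dataset, the true parameter values and the object names (fit_offset, fit_free) are assumptions made for this example only. The first fit uses the offset, so only the intercept and the WT effect are estimated. The second fit treats log(DUR) as an ordinary covariate, and its estimated coefficient should be close to 1 if the proportionality assumption holds.

```r
library(MASS)  # provides glm.nb()

set.seed(1)
n <- 2000
count_data <- data.frame(
  WT  = runif(n, 40, 120),  # simulated body weights
  DUR = runif(n, 1, 10)     # simulated follow-up durations
)

# True values used for the simulation (assumptions for this sketch)
lambda_pop <- 0.5
beta_WT    <- 0.75
theta      <- 2   # glm.nb dispersion: Var = mu + mu^2 / theta

# Mean proportional to DUR, with a power effect of WT
mu <- lambda_pop * (count_data$WT / 70)^beta_WT * count_data$DUR
count_data$Count <- rnbinom(n, size = theta, mu = mu)

# Offset: the coefficient of log(DUR) is fixed to 1 and not estimated
fit_offset <- glm.nb(Count ~ log(WT/70) + offset(log(DUR)),
                     data = count_data, link = log)
print(coef(fit_offset))   # intercept and WT effect only

# For comparison: log(DUR) as a regular covariate, coefficient estimated
fit_free <- glm.nb(Count ~ log(WT/70) + log(DUR),
                   data = count_data, link = log)
print(coef(fit_free))
```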

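The equation for the mean number of events is thus:

$$\log(\lambda_i) = \log(\lambda_{pop}) + \beta_{tlogWT} \times \log(WT_i/70) + 1 \times \log(DUR_i)$$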

In Monolix, this corresponds to:

lambda with a lognormal distribution and no random effects
new covariate tlogWT = log(WT/70) added on lambda
new covariate tlogDUR = log(DUR) added on lambda
beta_tlogDUR fixed to 1
omicron with a lognormal distribution and no random effects
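The correspondence between the two tools can be sketched as follows. The numerical values below are hypothetical glm.nb() estimates, not results from real data. The relation between omicron and the glm.nb dispersion theta is an assumption: it holds only if omicron is defined such that the variance equals lambda × (1 + omicron × lambda), which should be checked against the model file of the count library.

```r
# Hypothetical glm.nb() estimates, for illustration only (not from real data)
intercept <- -0.69   # coefficient "(Intercept)"
slope_wt  <- 0.75    # coefficient of log(WT/70)
theta     <- 2       # glm.nb dispersion: Var = mu + mu^2 / theta

# Corresponding Monolix population parameters
lambda_pop   <- exp(intercept)  # typical mean count for WT = 70 and DUR = 1
beta_tlogWT  <- slope_wt        # estimated covariate effect
beta_tlogDUR <- 1               # offset: fixed, not estimated

# ASSUMPTION: omicron defined such that Var = lambda * (1 + omicron * lambda)
omicron_pop <- 1 / theta

# Mean number of events for an individual with WT = 90 and DUR = 4
WT <- 90
DUR <- 4
lambda_i <- exp(log(lambda_pop) + beta_tlogWT * log(WT / 70) +
                beta_tlogDUR * log(DUR))

print(c(lambda_pop = lambda_pop, beta_tlogWT = beta_tlogWT,
        omicron_pop = omicron_pop, lambda_i = lambda_i))
```

The intercept of the R model is on the log scale, which is why it is exponentiated to obtain the typical value of lambda.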