Count observation model

Related resources on modeling count data in Monolix:

Columns used to define observations: formatting of count data in the MonolixSuite.
Count data model library : detailed description of the library of count models integrated within Monolix.
Count data models: examples of count data models from the Monolix demos.

On the current page, we explain the different models for count data that can be used in Monolix and their syntax.

Count observation model

Use of count data

Longitudinal count data is a special type of longitudinal data that can take only nonnegative integer values {0, 1, 2, …} that come from counting something, e.g., the number of seizures, hemorrhages or lesions in each given time period. In this context, data from individual i is the sequence where is the number of events observed in the jth time interval .
Count data models can also be used for modeling other types of data such as the number of trials required for completing a given task or the number of successes (or failures) during some exercise. Here, is either the number of trials or successes (or failures) for subject i at time . For any of these data types we will then model as a sequence of random variables that take their values in {0, 1, 2, …}. If we assume that they are independent, then the model is completely defined by the probability mass functions for and . Here, we will consider only parametric distributions for count data.

Observation model syntax

Considering the observations as a sequence of conditionally independent random variables, the model is completely defined by the probability mass functions . An observation variable for count data, with name Y for instance, is defined using the following syntax:

CODE

DEFINITION:
Y = {type=count, P(Y=k) = ...}

type=count: indicates the data type
P(Y=k): probability of a given count value k, for the observation named Y. The observation name is free but must be the same at the beginning of the line and for the probability definition. k is a reserved keyword and represents a positive integer. k supersedes in this scope any variable k defined previously. The probability must be in [0,1].

A transformed probability can also be provided. The transformation can be log, logit, or probit. For instance with a log-transformation:

CODE

DEFINITION:
Y = {type=count, log(P(Y=k)) = ...}

As k is only recognized within the probability definition, it is not possible to define the probability using k in an EQUATION block above. However, it is possible to use if/else statements within the probability definition:

CODE

DEFINITION:
Y = {type=count, 
if k==0
Pk = ...
else
Pk = ...
end
P(Y=k) = Pk}

Common mathematical functions to define count distributions are factorial(a), factln(a) (logarithm of factorial) and gammaln(a) (logarithm of gamma function). They can be used with a any positive numerical value (not only integers). Note that factorials grow very rapidly and can be considered as “+infinity” in a computer, even when the probability is defined as a ratio of two factorials which stays with reasonable values on paper. It is thus convenient to works with logarithms of factorials, which grow much slower (see examples).

Examples

Example 1: Poisson distribution with time evolution

In this example, the Poisson distribution is used for defining the distribution of :

where the Poisson intensity is function of time . This model is implemented as follows

CODE

[LONGITUDINAL]
input = {a,b}

EQUATION:
lambda = a+b*t

DEFINITION:
y = {type=count, P(y=k) = exp(-lambda)*(lambda^k)/factorial(k)}

Example 2: Binomial distribution

We consider n Bernouilli trials, each having a probability of success p. The probability of having k successes is:

To avoid that be so large that it will be considered as NaN by a computer, it is good practice to define the log of the probability to convert the ratios of large number into a sum of smaller numbers:

The corresponding Mlxtran model is:

CODE

[LONGITUDINAL]
input = {n, p}

DEFINITION:
CountNumber = {type=count, log(P(CountNumber=k)) = gammaln(n+1) - factln(k) - gammaln(n-k+1) + k*log(p) + (n-k)*log(1-p)}

OUTPUT:
output = Y

Example 3: Poisson distribution with zero inflation

Zero-inflations can be encoded using if/else statements:

CODE

[LONGITUDINAL]
input = {lambda, f}

DEFINITION:
CountNumber = {type=count, 
if k==0
Pk = exp(-lambda)*(1-f) + f
else
Pk = exp(k*log(lambda) - lambda - factln(k))*(1-f)
end
P(CountNumber=k) = Pk}

OUTPUT:
output = CountNumber

Library of count models

The MonolixSuite library of models includes many pre-written count data models: Count library.