# Count observation model

Related resources on modeling count data in Monolix:

Columns used to define observations: formatting of count data in the MonolixSuite.

Count data model library: detailed description of the library of count models integrated within Monolix.

Count data models: examples of count data models from the Monolix demos.

On the current page, we explain the different models for count data that can be used in Monolix and their syntax.


## Use of count data

Longitudinal count data are a special type of longitudinal data that take only nonnegative integer values {0, 1, 2, …} and arise from counting something, e.g., the number of seizures, hemorrhages or lesions in each given time period. In this context, the data from individual *i* is the sequence $y_i = (y_{ij}, 1 \le j \le n_i)$, where $y_{ij}$ is the number of events observed in the *j*th time interval $I_{ij}$.

Count data models can also be used for modeling other types of data, such as the number of trials required to complete a given task or the number of successes (or failures) during some exercise. Here, $y_{ij}$ is either the number of trials or the number of successes (or failures) for subject *i* at time $t_{ij}$. For any of these data types, we model $y_i = (y_{ij}, 1 \le j \le n_i)$ as a sequence of random variables that take their values in {0, 1, 2, …}. If we assume that they are independent, then the model is completely defined by the *probability mass functions* $\mathbb{P}(y_{ij} = k)$ for $k \ge 0$ and $1 \le j \le n_i$. Here, we will consider only parametric distributions for count data.

## Observation model syntax

Considering the observations $(y_{ij})$ as a sequence of conditionally independent random variables, the model is completely defined by the probability mass functions $\mathbb{P}(y_{ij} = k)$. An observation variable for count data, named `Y` for instance, is defined using the following syntax:

```
DEFINITION:
Y = {type=count, P(Y=k) = ...}
```

`type=count`: indicates the data type.

`P(Y=k)`: probability of a given count value `k` for the observation named `Y`. The observation name is free but must be the same at the beginning of the line and in the probability definition. `k` is a reserved keyword and represents a nonnegative integer; within this scope it supersedes any variable `k` defined previously. The probability must be in [0,1].

A transformed probability can also be provided. The transformation can be log, logit, or probit. For instance with a log-transformation:

```
DEFINITION:
Y = {type=count, log(P(Y=k)) = ...}
```
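To make the three transformations concrete, the following Python sketch (not Mlxtran) maps a probability to each scale and back. It assumes the standard definitions: log(p), logit(p) = log(p/(1-p)), and probit(p) as the inverse of the standard normal cumulative distribution function. The function names are illustrative and are not part of Monolix.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)

def logit(p):
    """Log-odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))

def inv_logit(x):
    """Inverse of logit: maps any real x back to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def probit(p):
    """Standard normal quantile of a probability p in (0, 1)."""
    return STD_NORMAL.inv_cdf(p)

def inv_probit(x):
    """Inverse of probit: standard normal CDF."""
    return STD_NORMAL.cdf(x)

p = 0.2  # hypothetical value of P(Y=k)
transforms = [
    ("log", math.log, math.exp),
    ("logit", logit, inv_logit),
    ("probit", probit, inv_probit),
]
for name, forward, backward in transforms:
    x = forward(p)
    print(f"{name}: transformed = {x:.4f}, back-transformed = {backward(x):.4f}")
```

Whatever scale is used, the back-transformed value must be a valid probability in [0,1].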

As `k` is only recognized within the probability definition, it is not possible to define the probability using `k` in an EQUATION block above. However, it is possible to use if/else statements within the probability definition:

```
DEFINITION:
Y = {type=count,
if k==0
Pk = ...
else
Pk = ...
end
P(Y=k) = Pk}
```

Common mathematical functions used to define count distributions are `factorial(a)`, `factln(a)` (logarithm of the factorial) and `gammaln(a)` (logarithm of the gamma function). They can be used with `a` any positive numerical value (not only integers). Note that factorials grow very rapidly and can overflow to “+infinity” on a computer, even when the probability is defined as a ratio of two factorials whose value stays reasonable on paper. It is thus convenient to work with logarithms of factorials, which grow much more slowly (see examples).
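The overflow issue can be reproduced outside Monolix. The Python sketch below (an illustration in double precision, not Mlxtran) computes the ratio n!/(n-k)! in two ways: directly, and through log-gamma, using the identity a! = Γ(a+1). The helper names and the values of `n` and `k` are hypothetical.

```python
import math

def factorial_float(a):
    """a! in double precision; returns inf once the value no longer fits."""
    try:
        return math.gamma(a + 1.0)
    except OverflowError:
        return math.inf

def log_factorial(a):
    """log(a!) computed as log Gamma(a + 1); stays small even for large a."""
    return math.lgamma(a + 1.0)

n, k = 500, 3  # hypothetical values

# Direct ratio: both factorials overflow, and inf / inf evaluates to nan.
naive_ratio = factorial_float(n) / factorial_float(n - k)

# Log-scale ratio: subtract the logs, then exponentiate the small difference.
stable_ratio = math.exp(log_factorial(n) - log_factorial(n - k))

print("naive ratio: ", naive_ratio)
print("stable ratio:", stable_ratio)
print("exact value: ", n * (n - 1) * (n - 2))
```

The same idea motivates writing the model with `factln` or `gammaln` and a log-transformed probability rather than with `factorial`.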

## Examples

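*Example 1:* Poisson distribution with time evolution

In this example, the Poisson distribution is used to define the distribution of $y_j$:

$$y_j \sim \text{Poisson}(\lambda_j), \qquad \mathbb{P}(y_j = k) = \frac{e^{-\lambda_j}\, \lambda_j^{k}}{k!}$$

where the Poisson intensity $\lambda_j$ is a function of time: $\lambda_j = a + b\, t_j$. This model is implemented as follows: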

```
[LONGITUDINAL]
input = {a,b}
EQUATION:
lambda = a+b*t
DEFINITION:
y = {type=count, P(y=k) = exp(-lambda)*(lambda^k)/factorial(k)}
```
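As a numerical check of this model, the Python sketch below (not Mlxtran) evaluates the same probability mass function for a time-varying intensity and verifies that it sums to 1 and has mean `a + b*t`. The parameter values and function names are hypothetical, and the intensity is assumed to stay strictly positive over the time range considered.

```python
import math

def intensity(t, a, b):
    """Poisson intensity lambda = a + b*t (must be strictly positive)."""
    return a + b * t

def poisson_pmf(k, lam):
    """P(y = k) for a Poisson distribution, evaluated on the log scale."""
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))

def summary(t, a, b, k_max=200):
    """Total probability and mean of the truncated distribution at time t."""
    lam = intensity(t, a, b)
    probs = [poisson_pmf(k, lam) for k in range(k_max)]
    total = sum(probs)
    mean = sum(k * pk for k, pk in enumerate(probs))
    return lam, total, mean

a, b = 2.0, 0.5  # hypothetical parameter values
for t in (0.0, 4.0, 10.0):
    lam, total, mean = summary(t, a, b)
    print(f"t = {t:4.1f}  lambda = {lam:.2f}  sum of P = {total:.6f}  mean = {mean:.4f}")
```

With a negative `b`, `a + b*t` eventually becomes negative, at which point the expression no longer defines a valid probability.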

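*Example 2:* Binomial distribution

We consider `n` Bernoulli trials, each having a probability of success `p`. The probability of having $k$ successes is:

$$\mathbb{P}(y = k) = \frac{n!}{k!\,(n-k)!}\; p^{k} (1-p)^{n-k}$$

To avoid $n!$ becoming so large that it overflows and the ratio is evaluated as NaN by a computer, it is good practice to define the log of the probability, which converts the ratio of large numbers into a sum of smaller numbers:

$$\log \mathbb{P}(y = k) = \log(n!) - \log(k!) - \log\big((n-k)!\big) + k \log(p) + (n-k) \log(1-p)$$

The corresponding Mlxtran model is: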

```
[LONGITUDINAL]
input = {n, p}
DEFINITION:
CountNumber = {type=count, log(P(CountNumber=k)) = gammaln(n+1) - factln(k) - gammaln(n-k+1) + k*log(p) + (n-k)*log(1-p)}
OUTPUT:
output = CountNumber
```
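The log-probability expression used in this model can be checked numerically. The Python sketch below (not Mlxtran) mirrors it term by term, with `math.lgamma(a + 1)` standing in for both `gammaln(a+1)` and `factln(a)`, and compares it with the direct formula for a small `n`. The function name and the parameter values are hypothetical.

```python
import math

def binomial_log_pmf(k, n, p):
    """log P(y = k) for n Bernoulli trials with success probability p."""
    return (math.lgamma(n + 1)          # log(n!)
            - math.lgamma(k + 1)        # log(k!)
            - math.lgamma(n - k + 1)    # log((n-k)!)
            + k * math.log(p)
            + (n - k) * math.log(1.0 - p))

# Small case: the direct formula is still safe and should agree.
direct = math.comb(10, 3) * 0.3**3 * 0.7**7
via_log = math.exp(binomial_log_pmf(3, 10, 0.3))
print("direct: ", direct)
print("via log:", via_log)

# Large case: n! overflows in double precision, the log form does not.
n, p = 2000, 0.3  # hypothetical values
total = sum(math.exp(binomial_log_pmf(k, n, p)) for k in range(n + 1))
print("sum of probabilities for n = 2000:", total)
```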

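*Example 3:* Poisson distribution with zero inflation

Zero inflation can be encoded using if/else statements. In the model below, `f` is the probability of an excess zero: a zero is observed with probability `f + (1-f)*exp(-lambda)`, and any count `k > 0` with the Poisson probability multiplied by `(1-f)`: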

```
[LONGITUDINAL]
input = {lambda, f}
DEFINITION:
CountNumber = {type=count,
if k==0
Pk = exp(-lambda)*(1-f) + f
else
Pk = exp(k*log(lambda) - lambda - factln(k))*(1-f)
end
P(CountNumber=k) = Pk}
OUTPUT:
output = CountNumber
```
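The Python sketch below (not Mlxtran) reproduces the same two-branch definition to check that the zero-inflated probabilities still sum to 1 and that the mean is reduced to `(1-f)*lambda`. The function name and the parameter values are hypothetical.

```python
import math

def zip_pmf(k, lam, f):
    """Zero-inflated Poisson: extra mass f at zero, Poisson(lam) scaled by 1-f."""
    poisson = math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))
    if k == 0:
        return f + (1.0 - f) * poisson
    return (1.0 - f) * poisson

lam, f = 3.0, 0.25  # hypothetical values
probs = [zip_pmf(k, lam, f) for k in range(100)]
total = sum(probs)
mean = sum(k * pk for k, pk in enumerate(probs))

print("P(0)            :", probs[0])
print("sum of P        :", total)
print("mean            :", mean)
print("(1 - f) * lambda:", (1.0 - f) * lam)
```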

## Library of count models

The MonolixSuite library of models includes many pre-written count data models: Count library.