Observation (error) model
Objectives: learn how to use the predefined residual error models.
Demos: warfarinPK_project, bandModel_project, autocorrelation_project, errorGroup_project
https://www.youtube.com/watch?v=NQvp7X68HnY
Introduction
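For continuous data, we are going to consider scalar outcomes ($y_{ij} \in \mathbb{R}$) and assume the following general model:
$$y_{ij} = f(t_{ij}, \psi_i) + g(t_{ij}, \psi_i, \xi)\,\epsilon_{ij}$$
for i from 1 to N, and j from 1 to $n_i$, where $\psi_i$ is the parameter vector of the structural model f for individual i. The residual error model is defined by the function g, which depends on an additional vector of parameters ξ. The residual errors $(\epsilon_{ij})$ are standardized Gaussian random variables (mean 0 and standard deviation 1). In this case, $f(t_{ij}, \psi_i)$ and $g(t_{ij}, \psi_i, \xi)$ are the conditional mean and standard deviation of $y_{ij}$, i.e.,
$$E(y_{ij} \mid \psi_i) = f(t_{ij}, \psi_i), \qquad \mathrm{sd}(y_{ij} \mid \psi_i) = g(t_{ij}, \psi_i, \xi)$$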
Available error models
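In Monolix, we only consider the function g to be a function of the structural model f, i.e. $g(t_{ij}, \psi_i, \xi) = g(f(t_{ij}, \psi_i), \xi)$, leading to an observation model of the form
$$y_{ij} = f(t_{ij}, \psi_i) + g(f(t_{ij}, \psi_i), \xi)\,\epsilon_{ij}$$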
The following error models are available:
constant: $y = f + a\,\epsilon$. The function g is constant, and the additional parameter is $\xi = a$.
proportional: $y = f + b f^c\,\epsilon$. The function g is proportional to the structural model f, and the additional parameters are $\xi = (b, c)$. By default, the parameter c is fixed at 1 and the additional parameter is $\xi = b$.
combined1: $y = f + (a + b f^c)\,\epsilon$. The function g is a linear combination of a constant term and a term proportional to the structural model f, and the additional parameters are $\xi = (a, b, c)$ (by default, the parameter c is fixed at 1).
combined2: $y = f + \sqrt{a^2 + b^2 (f^c)^2}\,\epsilon$. The function g combines a constant term a and a term $b f^c$ proportional to the structural model f as the square root of the sum of their squares, and the additional parameters are $\xi = (a, b, c)$ (by default, the parameter c is fixed at 1).
Notice that the parameter c is fixed to 1 by default. However, it can be unfixed and estimated.
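To make the four forms of g concrete, here is a minimal Python sketch (not Monolix code) that evaluates the residual standard deviation at a given prediction f. The function names and the numerical values of a and b are hypothetical, and the combined2 form is written under the assumption that it is the square root of the sum of the squared constant and proportional terms.

```python
import math

# Each function returns g, the standard deviation of the residual error,
# for a prediction f. c defaults to 1, mirroring the default fixed value.

def g_constant(f, a):
    # constant: g = a (does not depend on f)
    return a

def g_proportional(f, b, c=1.0):
    # proportional: g = b * f^c
    return b * f**c

def g_combined1(f, a, b, c=1.0):
    # combined1: g = a + b * f^c
    return a + b * f**c

def g_combined2(f, a, b, c=1.0):
    # combined2 (assumed form): g = sqrt(a^2 + (b * f^c)^2)
    return math.sqrt(a**2 + (b * f**c) ** 2)

# Hypothetical values: prediction f = 10, a = 0.5, b = 0.1
f = 10.0
sd = {
    "constant": g_constant(f, a=0.5),
    "proportional": g_proportional(f, b=0.1),
    "combined1": g_combined1(f, a=0.5, b=0.1),
    "combined2": g_combined2(f, a=0.5, b=0.1),
}
for name, value in sd.items():
    print(f"{name:12s} g = {value:.4f}")
```

For the same a and b, combined2 gives a smaller standard deviation than combined1 at any positive prediction, because the two terms are added in quadrature rather than linearly.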
The assumption that the distribution of any observation is symmetrical around its predicted value is a very strong one. If this assumption does not hold, we may want to transform the data to make it more symmetric around its (transformed) predicted value. In other cases, constraints on the values that observations can take may also lead us to transform the data.
Available transformations
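The model can be extended to include a transformation of the data:
$$u(y_{ij}) = u(f(t_{ij}, \psi_i)) + g(u(f(t_{ij}, \psi_i)), \xi)\,\epsilon_{ij}$$
As we can see, both the data $y_{ij}$ and the structural model f are transformed by the function u, so that $f(t_{ij}, \psi_i)$ remains the prediction of $y_{ij}$. The following classical distributions are available as transformations: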
normal: u(y) = y. This is equivalent to no transformation.
lognormal: u(y) = log(y). Thus, for a combined error model for example, the corresponding observation model is $\log(y) = \log(f) + (a + b \log(f))\,\epsilon$. It assumes that all observations are strictly positive; otherwise, an error message is thrown. For censored data with a limit, the limit has to be strictly positive too. [Note: if the observations are already log-transformed in your dataset, you should not use this distribution, because y in the formula would then correspond to log-transformed measurements. Use the normal distribution instead.]
logitnormal: u(y) = log(y/(1-y)). Thus, for a combined error model for example, the corresponding observation model is $\log(y/(1-y)) = \log(f/(1-f)) + (a + b \log(f/(1-f)))\,\epsilon$. It assumes that all observations are strictly between 0 and 1. These bounds can be changed rather than imposed as 0 and 1, i.e. the logit function can be defined between a minimum and a maximum: the function u becomes u(y) = log((y-y_min)/(y_max-y)). Again, for censored data with a limit, the limits must also lie strictly inside the defined interval.
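The effect of a transformation can be illustrated by simulating observations on the transformed scale and mapping them back. The sketch below is plain Python, not Monolix code: it assumes a constant error model, u(y) = u(f) + a·ε, and all function names and numerical values are hypothetical.

```python
import math
import random

def u_lognormal(y):
    # lognormal: u(y) = log(y), defined for y > 0
    return math.log(y)

def u_logitnormal(y, y_min=0.0, y_max=1.0):
    # logitnormal: u(y) = log((y - y_min) / (y_max - y))
    return math.log((y - y_min) / (y_max - y))

def inv_logitnormal(z, y_min=0.0, y_max=1.0):
    # Inverse of u_logitnormal: maps any real z back into (y_min, y_max)
    return y_min + (y_max - y_min) / (1.0 + math.exp(-z))

def simulate(f, a, u, u_inv, rng):
    # One observation from u(y) = u(f) + a * eps, with eps ~ N(0, 1)
    return u_inv(u(f) + a * rng.gauss(0.0, 1.0))

rng = random.Random(0)  # fixed seed so the run is reproducible

# Lognormal: prediction 0.2 with a large error, yet y stays positive
y_lognormal = [simulate(0.2, 1.0, u_lognormal, math.exp, rng)
               for _ in range(1000)]

# Logitnormal on (0, 1): prediction 0.9, y stays strictly inside the bounds
y_logit = [simulate(0.9, 0.5, u_logitnormal, inv_logitnormal, rng)
           for _ in range(1000)]

print(min(y_lognormal) > 0.0)
print(0.0 < min(y_logit) and max(y_logit) < 1.0)
```

With the normal distribution and the same prediction 0.2 and a = 1, a large share of simulated values would be negative, which is exactly the situation the transformation is meant to avoid.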
To check the formula behind your observation model, use the button in the interface shown in the figure below: it displays the observation model linking the observation (named CONC in this case) and the prediction (named Cc in this case). Note that ϵ is written e there.
Defining the residual error model from the Monolix GUI
A menu in the frame Statistical model & Tasks of the main GUI allows one to select both the error model and the distribution, as shown in the following figure (in blue and green respectively).
A summary of the statistical model which includes the residual error model can be displayed by clicking on the button .
Some basic residual error models
warfarinPK_project (data = ‘warfarin_data.txt’, model = ‘lib:oral1_1cpt_TlagkaVCl.txt’)
The residual error model used with this project for fitting the PK of warfarin is a combined error model, i.e. $y_{ij} = f(t_{ij}, \psi_i) + (a + b f(t_{ij}, \psi_i))\,\epsilon_{ij}$.
Several diagnostic plots can then be used to evaluate the error model. The observation versus prediction figure below does not show any obvious misfit.
Remarks:
Figures showing the shape of the prediction interval for each observation model available in Monolix are displayed here.
When the residual error model is defined in the GUI, a block DEFINITION: is automatically added to the project file, in the section [LONGITUDINAL] of <MODEL>, when the project is saved:
DEFINITION:
y1 = {distribution=normal, prediction=Cc, errorModel=combined1(a,b)}
Residual error models for bounded data
bandModel_project (data = ‘bandModel_data.txt’, model = ‘lib:immed_Emax_null.txt’)
In this example, data are known to take their values between 0 and 100. To take this constraint into account, we can use a constant error model and a logitnormal distribution with bounds (0,100) for the transformation.
In the Observation versus prediction plot, one can see that the error is smaller when the observations are close to 0 and 100, which is expected for bounded data. To assess the relevance of the predictions, one can look at the 90% prediction interval. With a logitnormal distribution, this prediction interval has a very different shape that accounts for the bounds.
VPCs obtained with this error model do not show any misspecification:
This residual error model is implemented in Mlxtran as follows:
DEFINITION:
effect = {distribution=logitnormal, min=0, max=100, prediction=E, errorModel=constant(a)}
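The narrowing of the prediction interval near the bounds can be reproduced with a short calculation. The sketch below is plain Python, not Monolix output: it assumes a constant error on the logit scale with bounds (0,100), and the value a = 0.5 as well as the function names are hypothetical.

```python
import math
from statistics import NormalDist

Y_MIN, Y_MAX = 0.0, 100.0  # bounds of the logitnormal distribution

def logit(y):
    # u(y) = log((y - y_min) / (y_max - y))
    return math.log((y - Y_MIN) / (Y_MAX - y))

def inv_logit(z):
    # Inverse transformation, back to the (0, 100) scale
    return Y_MIN + (Y_MAX - Y_MIN) / (1.0 + math.exp(-z))

def prediction_interval(pred, a, level=0.90):
    # The interval is symmetric on the logit scale: u(pred) +/- q * a,
    # then each end is mapped back to the original scale.
    q = NormalDist().inv_cdf(0.5 + level / 2.0)
    center = logit(pred)
    return inv_logit(center - q * a), inv_logit(center + q * a)

a = 0.5  # hypothetical residual standard deviation on the logit scale
widths = {}
for pred in (2.0, 50.0, 98.0):
    low, high = prediction_interval(pred, a)
    widths[pred] = high - low
    print(f"prediction {pred:5.1f}: 90% interval width = {widths[pred]:.2f}")
```

On the original scale, the interval is widest around the middle of the range and shrinks towards 0 and 100, and it never leaves the bounds.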
Using different error models per group/study
https://www.youtube.com/watch?v=vuoOTB9Ezac
errorGroup_project (data = ‘errorGroup_data.txt’, model = ‘errorGroup_model.txt’)
In this example, the data come from 3 different studies. We want to use the same structural model but different error models for the 3 studies. One solution is to tag the column STUDY with the reserved keyword OBSERVATION ID. It is then possible to define one error model per outcome:
Here, we use the same PK model for the 3 studies:
[LONGITUDINAL]
input = {V, k}
PK:
Cc1 = pkmodel(V, k)
Cc2 = Cc1
Cc3 = Cc1
OUTPUT:
output = {Cc1, Cc2, Cc3}
Since 3 outputs are defined in the structural model, one can now define 3 error models in the GUI:
Different residual error parameters are estimated for the 3 studies. Note that, even though proportional error models are used for both of the first 2 studies, different parameters b1 and b2 are estimated.
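The idea of one shared prediction with one error model per study can be sketched outside Monolix as follows. This is an illustration only: the one-compartment bolus formula used for the prediction, the constant error model chosen for the third study, and all parameter values (b1, b2, a3, dose, V, k) are assumptions, not values from the project.

```python
import math

def cc(t, dose, V, k):
    # Shared structural model (assumed form): one compartment, bolus dose,
    # linear elimination. The same prediction is used for all 3 studies.
    return dose / V * math.exp(-k * t)

# One residual error model per value of the STUDY column (hypothetical values)
ERROR_MODEL_BY_STUDY = {
    1: lambda f: 0.1 * f,  # study 1: proportional, b1 = 0.1
    2: lambda f: 0.3 * f,  # study 2: proportional, b2 = 0.3
    3: lambda f: 0.5,      # study 3: constant, a3 = 0.5 (assumed)
}

def observe(study, t, eps, dose=100.0, V=10.0, k=0.2):
    # Observation = shared prediction + study-specific g(f) * eps
    f = cc(t, dose, V, k)
    return f + ERROR_MODEL_BY_STUDY[study](f) * eps

# Same time point and same standardized residual (eps = 1) in each study
pred = cc(0.0, 100.0, 10.0, 0.2)
obs = {study: observe(study, 0.0, eps=1.0) for study in (1, 2, 3)}
for study, y in obs.items():
    print(f"study {study}: prediction = {pred:.1f}, observation = {y:.1f}")
```

The prediction is identical across studies; only the size of the deviation for a given ε changes, which is what estimating separate error parameters per study captures.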