Longitudinal model-based meta-analysis (MBMA) with MonolixSuite
Download data set only | Download all Monolix&Simulx project files V2021R2 | Download all Monolix&Simulx project files V2023R1 | Download all Monolix&Simulx project files V2024R1
This case study demonstrates how to implement and analyze a longitudinal MBMA model using Monolix and Simulx.
Introduction
Due to the high costs of drug development, it is crucial to determine as soon as possible during the drug development process if the new drug candidate has a reasonable chance of showing clinical benefit compared to the other compounds already on the market. To help the decision makers in go/no-go decisions, model-based meta-analysis (MBMA) has been proposed. MBMA consists in using published aggregate data from many studies to develop a model and support the decision process. Note that individual data are usually not available as only mean responses over entire treatment arms are reported in publications.
Because in an MBMA approach one considers studies instead of individuals, the formulation of the problem as a mixed effect model differs slightly from the typical PK/PD formulation. This case study presents the implementation, analysis, and use of MBMA models for decision support in Monolix and Simulx.
As an example, the focus is on longitudinal data concerning the clinical efficacy of drugs for rheumatoid arthritis (RA). The goal is to evaluate the efficacy of a drug in development (phase II) in comparison to two drugs already on the market. This case study is strongly inspired by the following publication:
Demin, I., Hamrén, B., Luttringer, O., Pillai, G., & Jung, T. (2012). Longitudinal model-based meta-analysis in rheumatoid arthritis: an application toward model-based drug development. Clinical Pharmacology and Therapeutics, 92(3), 352–9.
Background
Drug candidate overview
Rheumatoid arthritis (RA) is a complex auto-immune disease, characterized by swollen and painful joints. It affects around 1% of the population and evolves slowly (the symptoms come up over months). Patients are usually first treated with traditional disease-modifying anti-rheumatic drugs, the most common being methotrexate (MTX). If the improvement is too limited, patients can be treated with biologic drugs in addition to the MTX treatment. Although biologic drugs have an improved efficacy, remission is not complete in all patients. To address this unmet clinical need, a new drug candidate, Canakinumab (abbreviated Canaki), has been developed at Novartis. After a successful proof-of-concept study, a 12-week, multicenter, randomized, double-blind, placebo-controlled, parallel-group, dose-finding phase IIB study with Canaki+MTX was conducted in patients with active RA despite stable treatment with MTX. In the study, Canaki+MTX was shown to be superior to MTX alone.
The goal is to compare the expected efficacy of Canaki with two other biologics already available on the market, Adalimumab (abbreviated Adali) and Abatacept (abbreviated Abata). Several studies involving Adali and Abata have been published in the past (see list below). A common endpoint in these studies is the ACR20, which represents the percentage of patients achieving a 20% improvement. Specifically, the focus is on longitudinal ACR20 data, measured at multiple time points after treatment initiation, with the aim of using a mixed-effect model to describe this data.
Overview of MBMA concepts
The formulation of longitudinal mixed effect models for study-level data differs from longitudinal mixed effect models for individual-level data. As detailed in
Ahn, J. E., & French, J. L. (2010). Longitudinal aggregate data model-based meta-analysis with NONMEM: Approaches to handling within treatment arm correlation. Journal of Pharmacokinetics and Pharmacodynamics, 37(2), 179–201.
to obtain the study-level model equations, one can write a non-linear mixed effect model for each individual (including inter-individual variability) and derive the study-level model by taking the mean response in each treatment group, i.e. the average of the individual responses. In the simplistic case of linear models, the study-level model can be rewritten such that a between-study variability (BSV), a between-treatment arm variability (BTAV) and a residual error appear. Importantly, in this formulation, the residual error and the between-treatment arm variability are weighted by \(1/\sqrt{n_{ik}}\), with \(n_{ik}\) the number of individuals in treatment arm \(k\) of study \(i\). If the model is not linear, a study-level model cannot be easily derived from the individual-level model. Nevertheless, one usually makes the approximation that the study-level model can also be described using between-study variability, between-treatment arm variability and a residual error term.
This result can be understood in the following way: when considering studies, which represent aggregated data, the between-study variability represents the fact that the inclusion criteria vary from study to study, thus leading to patient groups with slightly different mean characteristics from study to study. This aspect is independent of the number of individuals recruited. In contrast, the between-treatment arm variability represents the fact that when the recruited patients are split into two or more arms, the characteristics of the arms will not be exactly the same, because there is a finite number of individuals in each arm. The larger the arms, the smaller the difference between the mean characteristics of the arms, because one averages over a larger number of individuals. This is why the between-treatment arm variability is weighted by \(1/\sqrt{n_{ik}}\). For the residual error, the reasoning is the same: as the individual-level residual error is averaged over a larger number of individuals, the study-level residual error decreases.
The study-level model can be written similarly to the individual-level model, with between-study variability and between-treatment arm variability considered instead of inter-individual variability and inter-occasion variability. Additionally, the residual error and between-treatment arm variability are weighted by \(1/\sqrt{n_{ik}}\).
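The \(1/\sqrt{n}\) scaling can be checked with a short simulation, sketched here in Python (illustrative only, not part of the Monolix workflow): the standard deviation of an arm-level mean response shrinks with the square root of the arm size.

```python
import numpy as np

rng = np.random.default_rng(0)

def arm_mean_sd(n_arm, n_sims=20000, sigma_iiv=1.0):
    # Simulate n_sims arms of n_arm individuals, each individual having
    # standard deviation sigma_iiv, and return the SD of the arm means.
    individuals = rng.normal(0.0, sigma_iiv, size=(n_sims, n_arm))
    return individuals.mean(axis=1).std()

for n in (25, 100, 400):
    print(f"n_arm={n:3d}: sd of arm mean = {arm_mean_sd(n):.4f}, "
          f"1/sqrt(n) = {1/np.sqrt(n):.4f}")
```

Quadrupling the arm size halves the variability of the arm mean, which is exactly why larger arms carry less between-treatment-arm variability and a smaller residual error at the study level.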
Before proposing a model for rheumatoid arthritis, a closer look at the data is required.
Data selection, extraction, and visualization
Data selection
To compare Canaki to Adali and Abata, we have searched for published studies involving these two drugs. Reusing the literature search presented in Demin et al., we have selected only studies with inclusion criteria similar to the Canaki study, in particular: approved dosing regimens, and patients having an inadequate response to MTX. Compared to the Demin et al. paper, we have included two additional studies published later on. In all selected studies, the compared treatments are MTX+placebo and MTX+biologic drug.
The full list of used publications is:
Data extraction
Using an online data digitizer application, the longitudinal ACR20 data from nine selected publications was extracted. The information is stored in a data file, along with additional study details. The in-house individual Canaki data from the phase II trial is averaged over treatment arms to compare with the aggregate literature data. A screenshot and column descriptions are provided.
STUDY: index of the study/publication. (Equivalent to the ID column)
ARM: 0 for the placebo arm (i.e. MTX+placebo treatment), and 1 for the biologic drug arm (MTX+biologic drug)
DRUG: name of the drug administered in the arm (placebo, adali, abata or canaki)
TIME: time since beginning of the treatment, in weeks
Y: ACR20, percentage of patients in the arm achieving at least a 20% improvement of their symptoms
Publication: first author of the publication from which the data has been extracted
Narm: number of individuals in the arm
TRT: treatment studied in the study
DiseaseDuration: mean disease duration (in years) in the arm at the beginning of the study
CReactiveProtein: mean C-reactive protein level (in mg/dL)
SwollenJointCount: mean number of swollen joints
Data visualization
Monolix includes a built-in data visualization module for an overview after data import. Upon opening Monolix, select "New project" and browse for the data file. The columns are assigned, and the data visualization is displayed for analysis.
The STUDY column is tagged as ID, and the ARM column is tagged as OCCASION, in order to use the MonolixSuite's inter-individual and inter-occasion variability features to define between-study and between-treatment-arm variability. All columns that may be used to stratify/color the plots must be tagged as either continuous or categorical covariates, even if these columns will not be used in the model.
After clicking "ACCEPT" and "NEXT," the data is displayed in the "Plots" tab. To better distinguish the treatments and arms, the DRUG and TRT covariates are assigned in the "STRATIFY" tab. The layout is arranged in a single line, with treatments split by TRT and colored by DRUG. In the "SETTINGS" tab, custom x-axis limits are enabled, automatically setting the range to 1 and 52.14. Similarly, setting custom Y-axis limits (5.38 to 75.66) showcases the variation in ACR20 increase over time. Biologic treatments are colored (blue for Abata, red for Adali, green for Canaki), while placebo arms are displayed in black, demonstrating the efficacy of MTX+biologic drug compared to MTX+placebo. Note that even in the “placebo” arms the ACR20 improves over time, due to the MTX treatment. For Canaki, which is the in-house drug in development, only one study is available, the phase II trial.
Given this data, the question arises whether Canaki could be more effective than Adali and Abata. To answer this, a longitudinal mixed-effect model will be developed.
Description of the longitudinal mixed-effect model
Based on the observed trajectory of ACR20 over time, an Emax structural model is suggested to capture the data's behavior.
The ACR20 score represents the fraction of responders, so it must remain between 0 and 1 (or 0 and 100 as a percentage). To respect this constraint, a logit transformation is applied to the observations \(y_{ijk}\):

\[ \mathrm{logit}(y_{ijk}) = \mathrm{logit}\big(f(t_{ijk})\big) + e_{ijk} \]

with \(i\) the index over studies, \(j\) the index over time points, and \(k\) the index over the treatment arms. \(e_{ijk}\) is the residual error, distributed according to \(e_{ijk} \sim \mathcal{N}(0,\, \sigma^2 / n_{ik})\).
Given \(n_{ik}\), the number of individuals in treatment arm \(k\) of study \(i\), the analysis focuses primarily on \(E_{max}\) (efficacy) due to the limited data: between-study variability (BSV) and between-treatment arm variability (BTAV) are considered only for this parameter. A logit distribution is selected to ensure Emax values remain between 0 and 1:

\[ \mathrm{logit}(E_{max,ik}) = \mathrm{logit}(E_{max,\mathrm{pop}}) + \eta^{(0)}_i + \eta^{(1)}_{ik} \]

with \(\eta^{(0)}_i\) representing the BSV and \(\eta^{(1)}_{ik}\) representing the BTAV.
For \(T_{50}\), a log-normal distribution without random effects is selected:

\[ \log(T_{50,ik}) = \log(T_{50,\mathrm{pop}}) \]

with \(E_{max,\mathrm{pop}}\) and \(T_{50,\mathrm{pop}}\) the typical population parameters, which depend on the administered drug \(d\). The BSV (index (0)) and BTAV (index (1)) terms for Emax have the following distributions:

\[ \eta^{(0)}_i \sim \mathcal{N}(0,\, \omega^2), \qquad \eta^{(1)}_{ik} \sim \mathcal{N}(0,\, \gamma^2 / n_{ik}) \]
Between-treatment arm variability and residual error are weighted by the number of individuals per arm, unlike between-study variability, as explained in the section “MBMA concepts” above.
The next section covers implementing this model in Monolix.
Model setup in Monolix
In Monolix, the error model must be selected from a pre-defined list, which does not include a residual error model weighted by the square root of the number of individuals. Therefore, both the ACR20 observations of the data set and the model predictions are transformed so that a constant error model can be used. Multiplying both sides of the observation model by \(\sqrt{n_{ik}}\), the equation can be rewritten as follows:

\[ \sqrt{n_{ik}}\,\mathrm{logit}(y_{ijk}) = \sqrt{n_{ik}}\,\mathrm{logit}\big(f(t_{ijk})\big) + e'_{ijk} \]

with \(e'_{ijk} \sim \mathcal{N}(0, \sigma^2)\) a constant (unweighted) residual error term, \(\sqrt{n_{ik}}\,\mathrm{logit}(y_{ijk})\) the transformed data, and \(\sqrt{n_{ik}}\,\mathrm{logit}(f(t_{ijk}))\) the transformed model prediction.
This data transformation is added to the data set, in the column transY.
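As a sketch of this preprocessing step (assuming the column names described above, with Y expressed as a percentage, and using invented toy values), the transformed column could be computed as follows; the data file provided with the case study already contains transY:

```python
import numpy as np
import pandas as pd

# Toy excerpt mimicking the data set columns described above (values invented)
df = pd.DataFrame({
    "STUDY": [1, 1, 2],
    "ARM":   [0, 1, 1],
    "TIME":  [12, 12, 24],
    "Y":     [25.0, 55.0, 60.0],   # ACR20 in percent
    "Narm":  [200, 210, 150],      # individuals per arm
})

def logit(p):
    return np.log(p / (1.0 - p))

# transY = sqrt(Narm) * logit(Y/100): observations weighted like the error model
df["transY"] = np.sqrt(df["Narm"]) * logit(df["Y"] / 100.0)
print(df[["STUDY", "ARM", "TIME", "Y", "transY"]])
```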
At this stage, the new dataset with the transY column can be loaded into Monolix, and the columns can be assigned as before, with the exception that the transY column will now be used as OBSERVATION.
STUDY is tagged as ID, ARM as OCCASION, DRUG as CATEGORICAL COVARIATE, TIME as TIME, transY as OBSERVATION, Narm as REGRESSOR, TRT as CATEGORICAL COVARIATE and DiseaseDuration, CReactiveProtein and SwollenJointCount as CONTINUOUS COVARIATE.
To match the observation transformation in the data set, the Mlxtran model file will include the following:
ACR20 = ...
pred = logit(ACR20)*sqrt(Narm)
OUTPUT:
output = pred
In order to be available in the model, the data set column Narm must be tagged as regressor, and passed as argument in the input list:
[LONGITUDINAL]
input = {..., Narm}
Narm = {use=regressor}
In this example, the number of individuals per arm (Narm) remains constant over time. In cases of dropouts or non-uniform trials, Narm may vary over time. This will be accommodated in Monolix, as regressors can change over time by definition.
Parameter Distribution Considerations
Similar to the error model, weighting the BTAV directly via the Monolix GUI is not possible. As an alternative, the fixed effect (EmaxFE), the BSV random effect (etaBSVEmax), and the BTAV random effect (etaBTAVEmax) can be defined separately in the GUI, with the parameter value reconstructed within the model file. Since Narm has been defined as a regressor, it can be used to weight the BTAV terms. Care must also be taken when adding the random effects to the normally distributed transformed parameters. The syntax for the structural model is as follows:
[LONGITUDINAL]
input = {EmaxFE, T50, etaBSVEmax, etaBTAVEmax, Narm}
Narm = {use=regressor}
EQUATION:
; transform the Emax fixed effect (EmaxFE) to tEmax (normally distributed)
tEmax = logit(EmaxFE)
; adding the random effects (RE) due to between-study variability and
; between arm variability, on the transformed parameters
tEmaxRE = tEmax + etaBSVEmax + etaBTAVEmax/sqrt(Narm)
; transforming back to have EmaxRE with logit distribution (values between 0 and 1)
Emax = exp(tEmaxRE)/(1+exp(tEmaxRE))
; defining the effect
ACR20 = Emax * (t / (T50 + t))
; adding a saturation to avoid taking logit(0) (undefined) when t=0
ACR20sat = min(max(ACR20,0.01),0.99)
; transforming the effect ACR20 in the same way as the data
pred = logit(ACR20sat)*sqrt(Narm)
OUTPUT:
output = pred
In addition to the previously explained steps, a saturation ACR20sat = min(max(ACR20,0.01),0.99) is added to prevent ACR20 from being 0 when t=0, which would result in pred = logit(0) (undefined).
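The effect of this saturation can be illustrated with a small numeric check (Python is used here purely for illustration; the model itself is written in Mlxtran):

```python
import numpy as np

def logit(p):
    return np.log(p / (1.0 - p))

def saturated_logit(p, lo=0.01, hi=0.99):
    # Clamp p into [lo, hi] before the logit, mirroring ACR20sat in the model
    return logit(np.clip(p, lo, hi))

Emax, T50, t = 0.6, 4.0, 0.0
acr20 = Emax * t / (T50 + t)           # equals 0 at t = 0
print(np.isneginf(logit(acr20)))       # logit(0) diverges to -inf
print(saturated_logit(acr20))          # finite: logit(0.01)
```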
In the structural model tab, selecting 'New model,' copying and pasting the code above, and saving the model will allow it to appear below:
In the next tab, Statistical Model & Tasks, the statistical part of the model will be defined. For the observation model, a constant error model shall be selected from the list, in accordance with the data transformation performed.
For the parameter distributions, a logit distribution shall be chosen for EmaxFE, with distribution limits set between 0 and 1 (note that the default initial value for EmaxFE is 1, leading to default logit limits of (0,1). So the initial value for EmaxFE needs to be changed first, before logit(0,1) can be set), lognormal for T50, and a normal distribution for the random effects etaBSVEmax, and etaBTAVEmax.
Since this model formulation allows EmaxFE to represent only the fixed effects, the random effects for this parameter and T50 must be disabled. As etaBTAVEmax corresponds to inter-occasion variability, the random effects will be disabled at the STUDY level (indicated as “ID” in the GUI) but enabled at the ARM level. The random effects at both levels are thus set in the following way:
For etaBSVEmax, its distribution is \(\mathcal{N}(0,\, \omega^2)\), so the mean must be enforced at zero. This can be achieved in the initial estimates tab by using the cog-wheel button to fix etaBSVEmax_pop at 0. The same action should be taken for etaBTAVEmax_pop.
Covariate Considerations
As a final step, DRUG will be added as a covariate for EmaxFE and T50 to estimate one EmaxFE and one T50 value for each treatment (placebo, adali, abata, and canaki). Before adding the DRUG column as a covariate, the 'add covariate / discrete' button should be clicked to create a new covariate called tDRUG, with placebo as the reference category.
The newly created tDRUG covariate is added on EmaxFE and T50. The other possible covariates will be investigated later on.
Summary view
Model development
Parameter estimation
Parameter initialization
Before starting the parameter estimation, reasonable initial values should be identified. In the Initial Estimates tab, within the Check initial estimates window, initial values for the population parameters can either be set manually or the auto-init function can be run to determine the initial values, as shown in the image below. The red curve represents the prediction with the estimated initial values, while the blue dots correspond to the transY data.
Configuring Settings for Parameter Estimation
The parameter estimation task is then launched. After approximately 160 iterations, the SAEM exploratory phase transitions to the second phase due to the automatic stopping criteria.
To assess convergence, the parameter estimation can be run using different initial values and a different random number sequence. This can be accomplished by clicking on “assessment” right next to the run button. This module runs estimation tasks multiple times with varying seeds and initial values.
Convergence trajectories, final estimates, and repeated log-likelihood calculations can be examined to assess the robustness of the estimates.
The colored oscillating curves in the upper plot show the parameter estimate trajectories, while the colored horizontal lines in the lower plot show the standard deviation of the estimated population parameter values for each run. The trajectories of many parameters show significant variation between runs. Parameters without variability, such as EmaxFE, are generally more difficult to estimate, because parameters with and without variability are estimated using different methods: the SAEM algorithm requires drawing parameter values from their marginal distribution, which exists only for parameters with variability.
Several methods can be used to estimate the parameters without variability. By default, these parameters are optimized using the Nelder-Mead simplex algorithm. As the performance of this method seems insufficient in our case, we will try another method. In the SAEM settings (accessed via the wrench icon next to the 'Population parameters' task in the Statistical Model & Tasks tab), 3 options are available:
No variability (default): optimization via Nelder-Mead simplex algorithm
Variability in the first stage: during the first phase, an artificial variability is added and progressively forced to zero. In the second phase, the Nelder-Mead simplex algorithm is used.
Add decreasing variability: an artificial variability is added for these parameters, allowing estimation via SAEM. The variability is progressively forced to zero, such that at the end of the estimation process the parameter has no variability.
The 'Variability in the first stage' method shall be selected, the project shall be saved under a new name (MBMA_r02_variability1stage.mlxtran), and the convergence assessment relaunched. This time, the convergence is satisfactory:
Results
The estimated values can be reviewed directly in the “Assessment” folders generated during the convergence assessment, or the parameter estimation and Fisher information matrix estimation (e.g., via linearization) can be redone to obtain the standard errors.
The point estimates in the Wald test, along with the associated p-values in the test section of the results tab, represent the difference between the placebo and the other drug categories. A significant p-value indicates that the drug effect is statistically different from 0, meaning the covariate effect should be retained in the model.
As indicated by the p-value, the Emax values for Abata and Adali are significantly different from the Emax for placebo. In contrast, given the data, the Emax value for Canaki is not significantly different from placebo. Note that when analyzing the richer individual data for Canaki, the efficacy of Canaki had been found significantly better than that of placebo (Alten, R. et al. (2011). Efficacy and safety of the human anti-IL-1β monoclonal antibody canakinumab in rheumatoid arthritis: results of a 12-week, Phase II, dose-finding study. BMC Musculoskeletal Disorders, 12(1), 153.).
The T50 values are similar for all compounds except Adali. The test shows that the effect of Abata vs. Placebo and Canaki vs. Placebo is not significantly different from 0. Therefore, these drug covariate modalities can be pooled for the next model run (MBMA_r03_variability1stage_groupeddrug.mlxtran), creating a new covariate.
This new covariate will be included in the individual model as a covariate effect on T50, replacing the covariate tDRUG.
After launching the estimation tasks, the Wald test suggests that the new covariate should remain included in the individual model on T50, as the difference between Adali and Canaki, Abata, and Placebo is significant.
The between-study variability of Emax is around 25% (C.V.(%) of omega_etaBSVEmax = 0.2533). The between-treatment arm variability is gamma_etaBTAVEmax = 2.4361. In the model, the BTAV is divided by the square root of the number of individuals per arm. With a typical arm size of 200 individuals, the effective between-treatment arm variability is approximately 17% (\(2.4361/\sqrt{200} \approx 0.17\)).
Because the means etaBSVEmax_pop and etaBTAVEmax_pop are fixed at 0, the C.V.(%) reported for omega_etaBSVEmax and gamma_etaBTAVEmax is infinite: the coefficient of variation is defined as the ratio of the standard deviation to the mean, which here results in a division by 0.
Model diagnosis
The next step is to estimate the study parameters (i.e., individual parameters via the mode task) and generate the graphics. No model mis-specification is detected in the individual fits, observations versus predictions, residuals, or other graphics:
Plots shown: individual fits and observations vs. predictions.
Note that all graphics use the transformed observations and transformed predictions.
Alternative models
A more complex model, including BSV and BTAV on T50, could also be explored, but this investigation is left as a hands-on exercise for the reader. The results show a small difference in AIC and BIC between the two models, and some standard errors in the more complex model are unidentifiable. Therefore, the simpler model is preferred.
Covariate analysis
The between study variability originates from the different inclusion criteria from study to study. To explain part of this variability, the studied population characteristics can be tested as covariates. The available population characteristics are the mean number of swollen joints, the mean C-reactive protein level and the mean disease duration.
As these covariates differ between arms, they appear in the IOV window. To center the distribution of covariates around a typical value, the covariates are first transformed using a reference value:
The covariates can be included one by one in the model on the Emax parameter as an example of a forward covariate model development approach. Significance can be assessed using either a Wald test (with p-values calculated from the Fisher Information Matrix Task in the result file) or a likelihood ratio test. The advantage of the Wald test is that it requires estimating only one model (the one with the covariate), while the likelihood ratio test necessitates estimating both models (with and without the covariate).
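For the likelihood ratio test with a single added covariate coefficient (1 degree of freedom), the p-value can be computed directly from the two maximized log-likelihoods. A minimal sketch (the log-likelihood values below are invented placeholders, not results from this case study):

```python
import math

def lrt_pvalue_1df(ll_with, ll_without):
    # LRT statistic: 2 * (logL of model with covariate - logL without it)
    stat = max(2.0 * (ll_with - ll_without), 0.0)
    # Survival function of a chi-square with 1 df via the complementary
    # error function: P(chi2_1 > x) = erfc(sqrt(x/2))
    return math.erfc(math.sqrt(stat / 2.0))

# Adding the covariate improves the log-likelihood by only 0.6 points
print(round(lrt_pvalue_1df(-1500.0, -1500.6), 3))  # not significant
```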
In this case study, a backward approach has been employed, where all the covariates are included in the model on Emax simultaneously, followed by an evaluation to determine whether any covariates should be excluded (run04.mlxtran). The p-values from the Wald test assessing whether the beta coefficients are zero are as follows:
Disease duration: p-value=0.28 (not significant)
C-reactive protein level: p-value=0.75 (not significant)
Swollen joint count: p-value=0.14 (not significant)
The likelihood ratio test also indicates that none of these covariates is significant; therefore, they will not be included in the model.
Final model
The final model has an Emax structural model, with between study and between treatment arm variability on Emax, and no variability on T50. No covariates are included on these parameters.
The fixed effect EmaxFE incorporates the drug effect tDrug, reflecting all four drug modalities (Adali, Abata, Canaki, and Placebo), with Placebo set as the reference drug.
For the parameter T50, a drug covariate effect is also included; however, the drug covariate has been regrouped to consist of only two modalities (Adali and Abata/Canaki/Placebo).
The final run is available in the materials (see the top of the page) and is named MBMA_r03_variability1stage_groupeddrug.mlxtran.
Simulations for decision support
Efficacy comparison with existing market drugs
The developed model will be used to compare the efficacy of Canaki with that of Abata and Adali. The question arises: does Canaki have the potential to be more effective? To address this, simulations will be conducted using the simulation application Simulx.
To compare the true efficacy (over an infinitely large population) of Canaki with Abata and Adali, the Emax values could be directly compared. However, since Emax estimates carry uncertainty, this uncertainty must be taken into account. A large number of simulations will be performed for the three treatments, drawing the Emax population value from its uncertainty distribution. The BSV, BTAV, and residual error will be ignored (i.e., set to 0) to focus on the “true efficacy” within an infinitely large population.
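This sampling scheme can be sketched as follows (all numeric values are invented placeholders; the real estimates and standard errors come from the final Monolix run, and Simulx handles the sampling automatically):

```python
import numpy as np

rng = np.random.default_rng(1)

def inv_logit(x):
    return 1.0 / (1.0 + np.exp(-x))

# Placeholder estimates on the transformed scales (illustration only)
logit_emax_hat, se_logit_emax = 0.4, 0.25   # logit(Emax_pop) and its SE
t50_hat = 4.0                                # weeks

# Draw Emax_pop from its (asymptotic normal) uncertainty distribution;
# BSV, BTAV and residual error are set to 0 ("true efficacy")
n_draws = 1000
emax = inv_logit(rng.normal(logit_emax_hat, se_logit_emax, n_draws))

t = 52.0
acr20 = emax * t / (t50_hat + t)
lo, hi = np.percentile(acr20, [2.5, 97.5])
print(f"ACR20 at week 52: 95% interval [{lo:.3f}, {hi:.3f}]")
```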
Before running the simulations, the final project must be exported from Monolix to Simulx.
All the information contained in the Monolix project is transferred to Simulx: the model (structural and statistical parts), the estimated population parameter values with their uncertainty distribution, and the original design of the data set used in Monolix (number of individuals, treatments, observation times). Elements containing this information are generated automatically in the “Definition” tab of the created Simulx project. It is possible to reuse these elements to set up a simulation in the “Simulation” tab, or to create new elements to define a new simulation scenario.
Among the elements created automatically, there is the occasion element imported from the Monolix data set. The first step is to remove the occasion element, as it is not needed for simulations. Navigate to the tab Definition > Occasions and click the 'delete' button.
To compare the efficacy of Canaki with Abata, Adali, and placebo, ACR20 will be simulated over 52 weeks in four different scenarios (simulation groups): Canaki, Abata, Adali, and placebo.
Next, the output needs to be defined. In Monolix, the transformed prediction of ACR20 was used due to constraints on the error model. Since the error model is being neglected here, and the focus is on model predictions, the untransformed ACR20 can be directly output at each week.
The output variable used to assess the efficacy of the treatments is ACR20, measured on a regular grid from time 0 to week 52.
The applied treatments, which appear in the model as a categorical covariate, must then be defined. This can be done directly in the covariates tab. Four distinct elements need to be created, each representing one of the four treatments to be simulated.
The definition of the other treatments not shown in the screenshot follows the same approach.
Lastly, a regressor representing the number of subjects per study is defined. Although this information needs to be specified, it will not influence the simulations.
Initially, the ACR20 for each treatment is simulated using the estimated Emax and T50 values, while ignoring uncertainty. Four individuals (i.e., studies) are defined, one for each treatment.
After defining the output, regressor, and treatment variables, the simulation tab is used.
A group size of 1 is set to simulate only one study per treatment. The parameter element 'mlx_typical' (available starting from version 2023) is selected, which contains the population parameters from the last Monolix run, with the random effects' standard deviations set to 0. This simulates a typical study, excluding BSV and BTAV. The defined untransformed ACR20 vector is used as the output variable. By clicking on 'New group,' the defined covariates are applied as treatments across the four simulation groups. The Reg_Narm element is set as the regressor. In theory, the number of subjects in the treatment arms determines the variability between the treatment arms: as the number of subjects approaches infinity (\(n_{ik} \to \infty\)), the between-treatment arm variability converges to 0, with \(\gamma/\sqrt{n_{ik}} \to 0\).
Since an ideal scenario is being modeled, a very high number of subjects can be set. However, since 'mlx_typical' is selected as the simulation parameter, the random effects (etaBSV and etaBTAV) and their standard deviations are already set to 0, meaning Reg_Narm has no practical effect on the simulation.
After running the simulation, the response curves for the four treatment groups appear as subplots in the 'Output distribution' tab within the Plot section. To display all curves in a single plot, the 'merged splits' option can be activated in the 'Stratify' panel. Additionally, by enabling the legend toggle in the plot settings and optionally adjusting the treatment arm colors in the stratification tab, the following plot is generated:
The generated plot already gives a first impression of the efficacy of Canaki (orange curve) compared to the other treatments.
By defining outcomes and endpoints, this impression can be quantified.
As the outcome, ACR20 is defined at the final time point (t = 52). The endpoint can be the arithmetic mean, geometric mean, or median within each simulation group. Since only one study is simulated per group, this choice has no impact, and the endpoint will be equal to the outcome. A direct comparison of the ACR20 values is chosen across the groups, with Canaki as the reference. Given that each treatment group has only one simulated study, no statistical comparison is possible. The relation is set as unilateral "< 0" to assess whether Canaki is more effective compared to existing treatments. If the result is negative, Canaki is considered more effective; if it equals 0, Canaki is deemed equivalent at the final time point; if positive, it is considered less effective.
In formula terms, the following relationship is expected as a successful experimental outcome:

\[ \overline{ACR20}_{\mathrm{Adali}}(t{=}52) - \overline{ACR20}_{\mathrm{Canaki}}(t{=}52) < 0 \]
The formula is analogous for the treatments Abata and Placebo versus Canaki.
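A numeric sketch of the outcome/endpoint comparison (with invented typical-study parameter values, purely to illustrate how the "< 0" success criterion is evaluated):

```python
import numpy as np

weeks = np.arange(0, 53)

def emax_curve(emax, t50):
    # Typical-study ACR20 trajectory under the Emax model
    return emax * weeks / (t50 + weeks)

# Invented parameter values for illustration only
curves = {
    "canaki":  emax_curve(0.55, 4.0),
    "adali":   emax_curve(0.65, 2.0),
    "abata":   emax_curve(0.62, 6.0),
    "placebo": emax_curve(0.35, 4.0),
}

ref = curves["canaki"][-1]               # endpoint: ACR20 at week 52
for drug in ("adali", "abata", "placebo"):
    diff = curves[drug][-1] - ref
    print(f"{drug} - canaki: {diff:+.3f}  success(<0): {diff < 0}")
```

With these illustrative values, only the placebo comparison would be flagged as a success, mirroring the structure of the group-comparison table in Simulx.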
After defining the outcomes and endpoints, save the project under a new name. Only the Outcome & Endpoint task needs to be executed; the simulations do not need to be rerun.
In the simulation tab of the results section, summary tables for each treatment arm are provided, along with a list of the individual parameters. The endpoint tab of the results shows the arithmetic mean of ACR20 at the final time point, t=52. The "Standard Deviation" column in the Endpoints[Table] or Endpoints[Summary] tab displays NaN, as the sample size n, representing the number of simulated studies, is set to 1, leading to division by zero. In the group comparison tab, a table presents the calculated mean differences between the treatment arms and the reference, Canaki. The "Success" column indicates whether the comparison is successful: a green check mark appears if the difference is < 0, while a black cross indicates an unsuccessful comparison.
The mean difference between Abata and Canaki, as well as Adali and Canaki, is positive, indicating that ACR20 for Abata and Adali was higher than for Canaki at week 52. This suggests that, in the ideal simulated scenario (infinitely large population), Canaki is less effective than these two treatments already available on the market. However, the comparison between Placebo and Canaki shows a negative difference, aligning with the effect curve plot.
The next step involves calculating a prediction interval for the true effect, considering the uncertainty of the population parameters estimated by Monolix. This can be done by saving the Simulx project under a new name and adjusting the settings as follows:
create 100 replicates of the scenario and
change the population parameter from mlx_Typical to mlx_TypicalUncertain (available from version 2023). This option includes the estimated variance-covariance matrix of the population parameters from the final Monolix run, where the standard deviations of the random effects are set to 0. This allows simulation of a typical subject (i.e., a study) with varying population parameter values across replicates, ensuring that the uncertainty in population parameters is reflected in the predictions.
The defined outcomes and endpoints remain unchanged. The goal is to assess the impact on the success criterion when the scenario is repeated 100 times using different sets of population parameters.
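The mlx_TypicalUncertain option amounts to drawing one set of population parameters per replicate from the estimated variance-covariance matrix. A minimal Python sketch of this idea, with hypothetical point estimates and a hypothetical covariance matrix standing in for the actual Monolix output:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical point estimates for two population parameters
# (e.g. Emax_pop on the logit scale and log(T50_pop)); in practice these
# and the variance-covariance matrix come from the final Monolix run.
theta_hat = np.array([0.8, 3.2])
cov_hat = np.array([[0.010, 0.002],
                    [0.002, 0.050]])

# One parameter draw per replicate, mimicking mlx_TypicalUncertain:
# population parameters vary across replicates, random effects stay at 0.
n_replicates = 100
draws = rng.multivariate_normal(theta_hat, cov_hat, size=n_replicates)
# draws has shape (100, 2): one parameter vector per replicate
```

Each row of `draws` would then drive one simulated scenario, so the spread of the resulting endpoints reflects the parameter uncertainty.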
The output distribution plot again shows the response curve, but now with the percentiles of the simulated data for all calculated replicates. The “number of bands” in Display can be set to 2 and the subplots merged to ease the comparison.
In the “Endpoint distribution” tab of the resulting plots, the distribution of ACR20 at week 52 across all replicates for each of the four treatment groups is shown. It is clear that treatments with active compounds are more effective than the placebo group. However, after 100 replicates, the mean value of Canaki remains below that of Abata and Adali.
Note: The thickness of the bands in the “Output distribution” plot, as well as the wide box and whiskers in the “Endpoint distribution” plot for Canaki, reflect the smaller amount of data available for Canaki compared to other treatments. This results in greater uncertainty regarding Canaki's true effect.
Returning to the results section, tables similar to those from the previous simulation are found, but this time displaying results for each replicate.
Based on the summary table at week 52, considering the estimated population parameters:
In 100 repetitions, the relation was met only 9 times, indicating a 9% chance that Canaki outperforms Abata.
In 100 repetitions, the relation was met 43 times, indicating a 43% chance that Canaki outperforms Adali.
In 100 repetitions, the relation was met all 100 times, indicating a 100% certainty that Canaki outperforms the placebo.
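These success probabilities are simply the fraction of replicates in which the relation "< 0" holds. A minimal Python sketch with hypothetical per-replicate endpoint differences (treatment minus Canaki), standing in for the Simulx group-comparison table:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical endpoint differences at week 52 for 100 replicates:
# treatment ACR20 minus Canaki ACR20 (values are illustrative only).
diff_abata = rng.normal(0.05, 0.04, size=100)     # mostly positive
diff_placebo = rng.normal(-0.20, 0.03, size=100)  # clearly negative

# Probability of success = fraction of replicates meeting the relation "< 0"
p_success_abata = np.mean(diff_abata < 0)
p_success_placebo = np.mean(diff_placebo < 0)
```

With the actual simulated differences, this calculation reproduces the 9%, 43%, and 100% figures quoted above.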
This finding supported the decision not to continue with clinical development of canakinumab in RA, according to:
Demin, I., Hamrén, B., Luttringer, O., Pillai, G., & Jung, T. (2012). Longitudinal model-based meta-analysis in rheumatoid arthritis: an application toward model-based drug development. Clinical Pharmacology and Therapeutics, 92(3), 352–9.
Conclusion
Model-informed drug development
Model-based meta-analysis supports critical decisions during the drug development process. In the present case, the model has shown that the chances are low that Canaki performs better than two drugs already on the market. As a consequence, the development of this drug was stopped, thus saving the high costs of the phase III clinical trials.
MBMA with Monolix and Simulx
This case study also aims at showing how Monolix and Simulx can be used to perform longitudinal model-based meta-analysis. Contrary to the usual parameter definition in Monolix, the weighting of the between-treatment-arm variability requires splitting the parameters into their fixed-effect, BSV, and BTAV terms, and reassembling the parameter in the model file. Once the logic of this model formulation is understood, the model set-up becomes straightforward. The efficient SAEM algorithm for parameter estimation, as well as the built-in graphics, then allows the model to be efficiently analyzed, diagnosed, and improved.
The interoperability between Monolix and Simulx makes it easy to carry the Monolix project over to Simulx to define and run simulations.
Natural extensions
In this case study, continuous data were used. Yet, the same procedure can be applied to other types of data such as count, categorical or time-to-event data.
Guidelines for implementing MBMA models
Data set preparation
The data set should usually contain the following columns:
STUDY: index of the study/publication. It is equivalent to the ID column for individual-level data.
ARM: index of the arm (preferably continuous integers starting at 0). There can be an arbitrary number of arms (for instance several treated arms with different doses) and the order doesn’t matter. This column will be used as equivalent to the occasion OCC column.
TIME: time of the observations
Y: observations at the study level (percentage, mean response, etc.)
transY: transformed observations, for which the residual error model is constant
Narm: number of individuals in the arm. Narm can vary over time
Covariates: continuous or categorical covariates that are characteristics of the study or of the arm. Covariates must be constant over arms (appear in IOV window) or over studies (appear in main window).
Common inquiries about the data set
Can dropouts (i.e. a time-varying number of individuals per arm) be taken into account? Yes. This can be accounted for without difficulty, as the number of subjects per arm (Narm) is passed as a regressor to the model file, a type that by definition allows variations over time.
Can there be several treated (or placebo) arms per study? Yes. The different arms of a study are distinguished by an ARM index. For a given study, one can have observations for ARM=0, then observations for ARM=1, then observations for ARM=2. The index number has no special meaning (for instance ARM=0 can be the treated arm for study 1, and ARM=0 the placebo arm for study 2).
Why should I transform the observations? The observations must be transformed such that an error model from the list (constant, proportional, combined, exponential, etc) can be used. In the given example, two transformations were applied. First, a logit transformation was used to ensure that the prediction of the fraction of individuals achieving ACR20 remains within the 0-1 interval, even if a large residual error term is encountered. For other cases, such as when predicting a mean concentration of glucose, a log-transformation could be applied to guarantee positive values. Second, each value was multiplied by the square root of the number of individuals per arm. This ensures that the standard deviation of the error term is consistent across all studies and treatment arms, regardless of the number of individuals per arm.
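The two-step transformation described above can be sketched in a few lines of Python, with hypothetical arm-level values standing in for the data set columns:

```python
import numpy as np

def logit(p):
    return np.log(p / (1 - p))

# Hypothetical arm-level data: fraction of patients achieving ACR20 (Y)
# and number of individuals per arm (Narm), as in the data set columns.
Y = np.array([0.25, 0.40, 0.55])
Narm = np.array([50, 80, 120])

# Step 1: logit keeps back-transformed predictions of a fraction in ]0,1[.
# Step 2: scaling by sqrt(Narm) makes the residual error standard deviation
# comparable across arms, since the SE of an arm mean scales as 1/sqrt(Narm).
transY = logit(Y) * np.sqrt(Narm)
```

The model prediction must be transformed the same way (see "Why and how do I transform the prediction?" below), so that a constant error model can be used on the transformed scale.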
Model set-up
The set up of the project can be done in four steps:
Split the parameters into several terms
Make the appropriate transformation for the parameters
Set the parameter definition in the Monolix interface
Add the transformation of the prediction
Common inquiries about the model set-up
Why should I split a parameter into several terms? Because the standard deviation of the BTAV term must usually be weighted by the square root of the number of individuals per arm, this term must be handled as a separate parameter and added to the fixed effect within the model file. Moreover, the fixed effect and its BSV random effect might be split depending on the covariate definition.
How do I know into how many terms I should split my parameter?
If all covariates of interest are constant over studies: In this case, the parameters can be split into a (fixed effect + BSV) term and a (BTAV) term. The model file would for instance be (for a logit distributed parameter):
[LONGITUDINAL]
input = {paramFE_BSV, etaBTAVparam, Narm}
Narm = {use=regressor}
EQUATION:
tparam = logit(paramFE_BSV)
tparamRE = tparam + etaBTAVparam/sqrt(Narm)
paramRE = 1 / (1+ exp(-tparamRE))
Impact on the interface: In the GUI, paramFE_BSV would have both a fixed effect and a BSV random effect. Covariates can be added on this parameter in the main window. The etaBTAVparam fixed effect should be enforced to zero, the BSV disabled and the BTAV in the IOV part enabled.
If one covariate of interest is not constant over studies: This was the case in this case study, where the DRUG covariate was constant across arms but not across studies. In this scenario, the parameters must be divided into a fixed-effect term, a BSV term, and a BTAV term. For example, the model file could be structured as follows (for a logit-distributed parameter):
[LONGITUDINAL]
input = {paramFE, etaBSVparam, etaBTAVparam, Narm}
Narm = {use=regressor}
EQUATION:
tparam = logit(paramFE)
tparamRE = tparam + etaBSVparam + etaBTAVparam/sqrt(Narm)
paramRE = 1 / (1+ exp(-tparamRE))
Impact on the interface: In the GUI, paramFE has only a fixed effect (BSV random effect must be disabled in the variance-covariance matrix). Covariates can be added on this parameter in the IOV window. The fixed effect value of etaBSVparam must be enforced to zero. The etaBTAVparam fixed effect should be enforced to zero too, the BSV disabled and the BTAV in the IOV part enabled.
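The arithmetic performed by this Mlxtran model file can be checked with a short Python sketch; the parameter and random-effect values below are hypothetical, chosen only to illustrate the reconstruction:

```python
import numpy as np

def logit(p):
    return np.log(p / (1 - p))

def inv_logit(x):
    return 1 / (1 + np.exp(-x))

# Hypothetical values: fixed effect of a logit-distributed parameter,
# BSV and BTAV random-effect samples, and the arm size (Narm regressor).
paramFE = 0.6    # typical value on the natural scale, in ]0,1[
etaBSV = 0.2     # between-study random effect
etaBTAV = -0.1   # between-treatment-arm random effect
Narm = 100

# Same steps as the model file: transform to the normal scale, add the
# random effects (BTAV weighted by 1/sqrt(Narm)), then back-transform.
tparam = logit(paramFE)
tparamRE = tparam + etaBSV + etaBTAV / np.sqrt(Narm)
paramRE = inv_logit(tparamRE)
```

Because the back-transformation is the inverse logit, `paramRE` stays in ]0,1[ whatever the sampled random effects, which is exactly the point of working on the transformed scale.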
Why shouldn’t I add covariates on the etaBTAVparam in the IOV window? Covariates can be added on etaBTAVparam in the IOV window. Yet, because the value of etaBTAVparam will be divided by the square root of the number of individuals in the model file, the covariate values in the data set must be transformed to counter-compensate. This may not always be feasible:
If the covariate is categorical (as DRUG, for example), it is not possible to compensate, and thus the user should not add the covariate on etaBTAVparam in the GUI. Splitting into 3 terms is needed.
If the covariate is continuous (weight, for example), it is possible to counter-compensate. Thus, if you add the covariate on etaBTAVparam in the GUI, the covariate column in the data set must be multiplied by sqrt(Narm) beforehand. This is a drawback; however, this modeling has fewer parameters without variability and may thus converge more easily.
How should I transform the parameter depending on its distribution? Depending on the original parameter distribution, the parameter must first be transformed as in the following cases:
For a parameter with a normal distribution, called paramN for example, we would have done the following:
; no transformation on paramN as it is normally distributed
tparamN = paramN
; adding the random effects (RE) due to between-study and between arm variability, on the transformed parameter
tparamNRE = tparamN + etaBSVparamN + etaBTAVparamN/sqrt(Narm)
; no back transformation as tparamNRE already has a normal distribution
paramNRE = tparamNRE
Note that the two equalities are not necessary but are kept for clarity of the code.
For a parameter with a log-normal distribution, called paramLog for example (or T50 in the test case), we would have done the following:
; transformation from paramLog (log-normally distributed) to tparamLog (normally distributed)
tparamLog = log(paramLog)
; adding the random effects (RE) due to between-study and between arm variability, on the transformed parameter
tparamLogRE = tparamLog + etaBSVparamLog + etaBTAVparamLog/sqrt(Narm)
; transformation back to have paramLogRE with a log-normal distribution
paramLogRE = exp(tparamLogRE)
For a parameter with a logit-normal distribution in ]0;1[, called paramLogit for example (or Emax in the test case), we would have done the following:
; transformation from paramLogit (logit-normally distributed) to tparamLogit (normally distributed)
tparamLogit = logit(paramLogit)
; adding the random effects (RE) due to between-study and between arm variability, on the transformed parameter
tparamLogitRE = tparamLogit + etaBSVparamLogit + etaBTAVparamLogit/sqrt(Narm)
; transformation back to have paramLogitRE with a logit-normal distribution
paramLogitRE = 1 / (1+ exp(-tparamLogitRE))
For a parameter with a generalized logit-normal distribution in ]a;b[, called paramLogit for example, we would have done the following:
; transformation from paramLogit (logit-normally distributed in ]a,b[) to tparamLogit (normally distributed)
tparamLogit = log((paramLogit-a)/(b-paramLogit))
; adding the random effects (RE) due to between-study and between arm variability, on the transformed parameter
tparamLogitRE = tparamLogit + etaBSVparamLogit + etaBTAVparamLogit/sqrt(Narm)
; transformation back to have paramLogitRE with a logit-normal distribution in ]a,b[
paramLogitRE =(b * exp(tparamLogitRE) + a)/ (1+ exp(tparamLogitRE))
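A quick numerical check of the generalized logit case, mirroring the Mlxtran equations above (the bounds a and b, the parameter value, and the random-effect contribution are all hypothetical):

```python
import numpy as np

# Generalized logit transform in ]a,b[ and its inverse, as in the
# Mlxtran equations; all numerical values are illustrative only.
a, b = 10.0, 50.0
paramLogit = 30.0

# Transform to the normal scale
tparamLogit = np.log((paramLogit - a) / (b - paramLogit))

# Add a combined random-effect contribution, then back-transform
eta = 0.3
tparamLogitRE = tparamLogit + eta
paramLogitRE = (b * np.exp(tparamLogitRE) + a) / (1 + np.exp(tparamLogitRE))

# With eta = 0, back-transforming recovers the original value exactly,
# and the perturbed value always stays inside ]a,b[.
recovered = (b * np.exp(tparamLogit) + a) / (1 + np.exp(tparamLogit))
```

The round trip confirms that the transform and back-transform are consistent, and that the random effects can never push the parameter outside its ]a,b[ bounds.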
How do I set up the GUI? Depending on which option has been chosen, the fixed effects and/or the BSV random effects and/or the BTAV random effects of the model parameters (representing one or several terms of the full parameter) must be enforced to 0. Random effects can be disabled by clicking on the RANDOM EFFECTS column, and fixed effects by choosing “fixed” after a click on the initial estimates.
Why and how do I transform the prediction? The model prediction must be transformed in the same way as the observations of the data set, using the column Narm passed as regressor.
Parameter estimation
The splitting of the parameters into several terms may introduce model parameters without variability. These parameters cannot be estimated using the main SAEM routine, but instead are optimized using alternative routines. Several of them are available in Monolix, under the settings of SAEM. If the default method leads to poor convergence, testing other methods is advisable.
Model evaluation
All graphics are displayed using the transformed observations and predictions. These are the right quantities to diagnose the model using the “Observations versus predictions”, “Scatter plot of the residuals”, and “Distribution of the residuals” graphics. For the individual fits and VPC, it may also be interesting to look at the non-transformed observations and predictions. The whole analysis, as well as the generation of the plots, can be redone in R using the simulx function of the RsSimulx package or the lixoft connectors. The other diagnostic plots (“Individual parameters vs covariates”, “Distribution of the individual parameters”, “Distribution of the random effects”, and “Correlation of the random effects”) are not influenced by the transformation of the data.