Clinical Pharmacokinetics 2012

Clin Pharmacokinet 2012; doi: 10.2165/11634200-000000000-00000

REVIEW ARTICLE

Adis © 2012 Springer International Publishing AG. All rights reserved.

Fundamentals of Population Pharmacokinetic Modelling: Validation Methods

Catherine M.T. Sherwin,1 Tony K.L. Kiang,2 Michael G. Spigarelli1 and Mary H.H. Ensom2,3

1 Division of Clinical Pharmacology & Clinical Trials Office, Department of Pediatrics, University of Utah School of Medicine, Salt Lake City, UT, USA
2 Faculty of Pharmaceutical Sciences, The University of British Columbia, Vancouver, BC, Canada
3 Department of Pharmacy, Children's and Women's Health Centre of British Columbia, Vancouver, BC, Canada

Contents
Abstract
1. Introduction
2. What is Validation?
   2.1 Validation versus Verification versus Evaluation versus Qualification: Does it Matter?
   2.2 Do We Need to Validate/Evaluate Population Pharmacokinetic Models?
3. Validation/Evaluation Methods
   3.1 Model Diagnostics
   3.2 Summary of Known Validation/Evaluation Processes
   3.3 What is the Ideal Validation Method?
4. Future Approaches/Suggestions
5. Conclusions

Abstract

Population pharmacokinetic modelling is widely used within the field of clinical pharmacology: it helps to define the sources and correlates of pharmacokinetic variability in target patient populations, characterize their impact upon drug disposition, and provide estimates of drug pharmacokinetic parameters. Participants in population pharmacokinetic studies are intended to be representative of the target population, as opposed to the healthy volunteers or highly selected patients of traditional pharmacokinetic studies. This review focuses on the fundamentals of population pharmacokinetic modelling and how the results are evaluated and validated. It defines the common aspects of population pharmacokinetic modelling through a discussion of the literature describing the techniques and places them in the appropriate context. The concept of validation, as applied to population pharmacokinetic models, is explored with a focus on the lack of consensus regarding both terminology and the concept of validation itself. Population pharmacokinetic modelling is a powerful approach through which pharmacokinetic variability can be identified in a target patient population receiving a pharmacological agent. Given the lack of consensus on the best approaches to model building and validation, sound fundamentals are required to ensure the selected methodology is suitable for the particular data type and/or patient population. There is a need to further standardize and establish the best approaches in modelling, so that any model created can be systematically evaluated and its results relied upon.


1. Introduction

In an attempt to provide insight into the field, this review focuses on two distinct aspects of population pharmacokinetic validation: (i) what validation is; and (ii) how results from population models are validated or evaluated. We provide an overview of commonly applied techniques that perform these functions.

The question of what constitutes validation has been repeatedly posed in the literature and in online discussion groups such as NMusers (Online UsersNet: http://www.cognigencorp.com/index.php/cognigen/resources_nonmem), but there is no consensus answer. There is debate as to which term should be used, with some favouring evaluation, qualification or validation; other terms found in the literature include accreditation and credible model.[1] In general, the approach has depended upon the model, how its results will be applied and the investigators.

The second question, what validation provides for those who use the model, is equally important. It can be answered from the perspective of the model itself, in which validation is an attempt to quantify how accurate and reproducible the model is and perhaps under what circumstances it is applicable. For the researcher, model validation provides insight into the modelled system and its inherent limitations, which will require new approaches or additional data to overcome. For the clinician who takes the time to understand the model, it provides a sense of how comfortable they should be with the predictions made by the model before applying its results to their patients.
Two main guidelines help define model validation/evaluation for population pharmacokinetic modelling: the US FDA Guidance for Industry: Population Pharmacokinetics[2] and the European Medicines Agency (EMA) Guideline on Reporting the Results of Population Pharmacokinetic Analyses.[3] The FDA guideline uses the term 'model validation' and describes the objective of examining whether the model is a good description of the validation dataset, with regard to the behaviour of the model and the application proposed.[2] Similarly, the EMA guideline uses the term 'model evaluation' and outlines that this should demonstrate that the final model is robust and a good description of the data, so that the objective of the analysis can be met.[3] A guide for reporting results of population pharmacokinetic analyses by Wade et al.[4] was used as the basis for the EMA guideline.[3]

Various methods have been proposed to validate or evaluate the performance of a model (table I). These methods include bootstrapping, data splitting, cross-validation and other measures of goodness-of-fit. To compound the lack of consensus

regarding which types of validation methods should be used, methods are constantly changing and some very innovative approaches to validation have been suggested.

The FDA objective of model validation[2] is to evaluate the predictability of the model. To that end, the model and model estimates are created using a learning (index) dataset and subsequently applied to a different validation dataset, not used in the model building and estimation of parameters, to evaluate model predictions. This approach suits ideal studies with large datasets comprising numerous subjects. In special populations, such as neonates and children or those with rare diseases, this approach would be difficult to take, as these are generally small datasets comprising few subjects. Other groups[18] believe that model validation defines how accurately the population model describes the data or validation dataset, based on empirical judgment rather than specific statistical test(s).

The International Programme on Chemical Safety (IPCS) Harmonization Project[19] describes models as an interpretation of a physiological system that includes errors varying in magnitude and importance, along with the concepts of structural and parameter uncertainty/error. The IPCS describes validation as the process by which the reliability and relevance of the model are determined for a defined purpose. For physiologically based pharmacokinetic (PBPK) and simulation models in risk assessment, it is proposed that there should be purpose-specific evaluation rather than generic validation.[19,20] Model evaluation is viewed as essential to establishing confidence in the model; this is based on sound scientific principles, the quality of input parameters and the ability of the model to reproduce independent empirical data. The IPCS suggests that all available data should be used to assess uncertainty and variability in estimated parameters and model predictions.
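The index/validation dataset approach described above can be sketched in a few lines. This is an illustrative toy example, not drawn from any cited analysis: the mono-exponential model, the dataset and all numeric values are assumptions made for demonstration.

```python
import numpy as np

rng = np.random.default_rng(42)

# Simulate a mono-exponential IV-bolus dataset: 30 subjects, 100 mg dose,
# with log-normal between-subject variability (hypothetical values).
n, dose = 30, 100.0
times = np.array([0.5, 1, 2, 4, 8, 12])
V = 20.0 * np.exp(rng.normal(0, 0.2, n))        # volume of distribution (L)
k = 0.15 * np.exp(rng.normal(0, 0.2, n))        # elimination rate constant (1/h)
conc = (dose / V)[:, None] * np.exp(-np.outer(k, times))
conc *= np.exp(rng.normal(0, 0.1, conc.shape))  # proportional residual error

# Data splitting: randomly divide subjects into an index (model-building)
# group and a validation group, mirroring the FDA learning/validation idea.
idx = rng.permutation(n)
index_grp, valid_grp = idx[:20], idx[20:]

# "Model building" on the index group: pooled log-linear regression gives
# population estimates of k and V.
logc = np.log(conc[index_grp]).ravel()
t = np.tile(times, len(index_grp))
slope, intercept = np.polyfit(t, logc, 1)
k_pop, V_pop = -slope, dose / np.exp(intercept)

# Evaluate predictions on the validation group not used for fitting.
pred = (dose / V_pop) * np.exp(-k_pop * times)
obs = conc[valid_grp]
mpe = np.mean(obs - pred)                       # bias (mean prediction error)
rmse = np.sqrt(np.mean((obs - pred) ** 2))      # imprecision
print(f"k_pop={k_pop:.3f} 1/h, V_pop={V_pop:.1f} L, MPE={mpe:.3f}, RMSE={rmse:.3f}")
```

In a real analysis the "model building" step would be a nonlinear mixed-effects fit rather than pooled regression, but the logic is the same: parameters are estimated only from the index subjects, and bias/imprecision are judged on subjects the model has never seen.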
This use of all available data contrasts with the typical method of arbitrarily excluding selected datasets in order to use them for validation purposes.[19,20]

2. What is Validation?

2.1 Validation versus Verification versus Evaluation versus Qualification: Does it Matter?

The terms validation, verification and qualification often have field-specific meanings. For example, in computational engineering, verification is concerned with building the model correctly and is used to compare the conceptual model with its computer representation; validation is concerned with building the right model and determines whether the model accurately represents the real system.[21] In population modelling, the term validation is used along with checking, evaluation, appropriateness and stability testing of the model. These terms have similar connotations and are meant to indicate that 'good practices' have been applied to evaluate the population model.[2,3,22]

Table I. Summary of methods and techniques used in model evaluation/validation and diagnostics

Basic internal methods
- Goodness-of-fit/diagnostic plots: assessment of goodness-of-fit statistics/plots, with plots including at least predicted vs observed data (PRED vs DV), PRED vs WRES or CWRES, and time vs WRES or CWRES. Example: Hennig et al.[5] (figure 2 and figure 3); the choice between alternative models was based on goodness-of-fit as observed in diagnostic plots.

Advanced internal methods (internal validation)
- Data splitting: data are randomly divided into an index population and a test population. Example: Falck et al.[6] (table II); data splitting was used to show that the model was robust.
- Log-likelihood profiling: mapping of the objective function; used as an alternative method for finding a parameter CI. Example: Perez-Ruixo et al.[7]; likelihood profiling was used to improve the fit of the model by evaluating models based on the change in the objective function and determining the empirical 95% CI.
- Resampling techniques (model reliability: assessment of the uncertainty of parameters and random effects, and of the plausibility of parameter estimates and their precision; model stability: assesses how resistant the model is to change):
  - Bootstrapping: estimates 95% CIs and SEs of the estimates; the bootstrap generates other plausible data and assesses model structure. Example: Musuamba et al.[8] and Lahu et al.[9] (figure 4); 200 bootstraps were generated using the PsN toolkit, a CI was built around the median of each parameter, and estimates for each parameter from the final model were compared with the bootstrap CI.
  - Jack-knife techniques: estimate SEs of the estimates. Example: Kerbusch et al.[10]; cross-validation was used to evaluate the accuracy of the model using a jack-knife technique, and to evaluate the validity of the PK model.
  - Cross-validation. Example: Djebli et al.[11] (table III); to check the robustness of the final model and its ability to predict the data, a cross-validation method was used, with a full dataset randomly divided into an index group and a test group.
- Case-deletion diagnostics: identify and characterize effects of influential outliers. Example: Sam et al.[12]; the final model was evaluated using case-deletion diagnostics to detect influential individuals and to explore robustness; cross-validation was performed by refitting the model with patients excluded one at a time.
- Simulation-based diagnostics:
  - VPC: plot comparing the 95% prediction interval with the observed data. Example: Lehr et al.[13] (figure 5) and Nielsen et al.[14] (figure 6); VPCs were used to evaluate model performance and the accuracy of the model in describing the data (trend and variability).
  - NPC: used to test model appropriateness. Example: Pilla Reddy et al.[15]; NPCs were performed to evaluate the simulation properties of the final model.
  - PPC: assesses the predictive performance of the model. Example: Lahu et al.[9] (table IV and figure 7); the PPC assessed whether the model adequately described the PK parameters and the covariate disposition.

External model evaluation (external validation)
- External validation: the developed model is applied to a new dataset (validation dataset) from another study. Example: Han et al.[17]; external validation was used to evaluate the established final model.
- NPDE. Example: Krekels et al.[16] (figure 8); NPDE was used to generate 1000 model-predicted concentrations for each observation in the external datasets, and the observed concentrations were compared with the 1000 predicted concentrations.

CWRES = conditional weighted residuals; DV = dependent variable; NPC = numerical predictive check; NPDE = normalized prediction distribution error; PK = pharmacokinetic; PPC = posterior predictive check; PRED = population predictions; VPC = visual predictive check; WRES = weighted residuals.

The FDA guideline[2] suggests that the focus should be on model-predictive performance and that not all population pharmacokinetic models need validation. When pharmacokinetic modelling results will be incorporated into a drug label, validation is strongly encouraged. When the population pharmacokinetic model is used to explain variability and no dosage adjustments are proposed, it may be appropriate to test the model for stability only, by assessing the impact of other plausible or probable data on the model.[1] Evaluation of models can include verification of the model, to examine whether the code used produces the expected model. In Bayesian statistics, there is a relatively large volume of literature discussing model evaluation, validation, adequacy, assessment, checking, appropriateness, performance, etc.[23] The specific procedures undertaken depend on the goal of the analysis. Generally, the goal of model evaluation is to describe the data and evaluate the influence of potential covariates. However, if the model is to be used for performing simulations, a more rigorous set of model evaluation procedures needs to be undertaken.[4]

Ette and Williams[1,24] discuss the basis for the lack of consensus on terms such as qualification, verification, accreditation and credibility when applied to population models. A qualified model refers to compliance with objective standards and precedent conditions that must be met, and implies that those would be the same for all models; by definition, no alternative standards could be used to qualify the model. Within the field of population modelling there are no specific precedent conditions, nor is it possible to outline a set of specific objective conditions that could apply to all models.
Accreditation requires implementation of documentation that includes verification, assessment of conceptual models and verification of the computer model, which is also impractical in population modelling. There have been instances where the term 'model accreditation' has been substituted for 'appropriateness', leading to confusion between disciplines.[1] The concepts of credible models and model appropriateness further the confusion. Credible models are created when there is an absence of data and are based solely upon expert opinion. Relative to population models, there are few circumstances where a credible model is needed; the best example occurs before in vivo experiments are undertaken, as there can be no actual data. Ette and Ludden[25] support use of the term model appropriateness to mean suitability or aptness of the model, such that the major influences or determinants on the system are described by the model.


Despite current semantic limitations, the overall intent remains unchanged: pharmacology requires the ability to assess the appropriateness and predictive ability of population models.[19,20] Predictive ability means that the input parameters used in the model can be adequately simulated to describe the behaviour of the drug for which the model was developed.[19] PBPK modelling is performed to fill in unknowns and to simulate or predict information where data are unfeasible or unethical to obtain. In some circumstances, results obtained or predicted by a model cannot be confirmed, which begs the question of how to establish confidence in a model and its estimations.

In summary, the concept of model validation or evaluation suffers from a lack of agreement on terminology. The ultimate goal of model evaluation is to assess the model for consistency with the data. This is frequently called validation, although many are uncomfortable with use of the term, but not the concept, as an attempt to determine the absolute truth of a model. The fear is that validation can be misinterpreted to imply that a model is appropriate for use under any condition; thus, some are more comfortable with the term evaluation rather than validation.

2.2 Do We Need to Validate/Evaluate Population Pharmacokinetic Models?

Population pharmacokinetic models are used to determine optimal dosing strategies, dose scaling and/or the incorporation of individual characteristics, including renal function and/or genetic information. They can be used to determine first-time doses in humans, can include allometric scaling of pharmacokinetics from animals to humans, and can determine the linkage between pharmacokinetics and pharmacodynamics. Determining the impact and influence of covariates on the pharmacokinetics of a particular drug, and therefore on dosing, also uses a modelling approach. Therefore, the model that is developed needs to be considered not only appropriate but also believable in terms of its estimates.

All models are essentially 'opinions' of reality based upon the data collected and, as such, what defines a 'good' model is subjective and based upon the purpose for which the model was intended. This is summed up by the statement "Essentially, all models are wrong but some are useful".[26] Ultimately, it is the choice of model, interpretation of results and application of those results that define model validity. Model deficiencies need to be determined, as well as their potential impact on results and the decisions that follow.[2] Results from many clinical trials are reported following some model-based analyses, commonly using nonlinear mixed-effects population models. Karlsson and


Savic[27] suggest that published models should include appropriate diagnostics that describe the performance of the final model. There is an assumption that published models are accurate, and rarely are limitations or weaknesses described.[28] There has been discussion that part of the validation strategy is to accept that NONMEM itself is not to be literally validated.[29] As a development environment, it is used to optimize complex analysis and modelling for a specialized field. NONMEM is a qualified development platform, and each specific analysis is validated individually. In cases where commercial and custom tools such as NMQual (© Metrum Institute, 2008; http://code.google.com/p/nmqual) pull metadata from NONMEM, the assumption is that the environment is able to reproduce outcomes. Depending upon which computer hardware or interfaced platform is used, there can be numeric differences in the outcomes produced by models run with the same dataset. As a result of these inconsistencies, a validation strategy may be needed to confirm that the model is accurate in relation to how it is being applied.

3. Validation/Evaluation Methods

Validation methods (figure 1, table I) can be described in increasing order of quality:
(i) Basic internal methods
   a. Goodness-of-fit plots/diagnostic plots (figure 2 and figure 3)
   b. Uncertainty in parameter estimates
   c. Model sensitivity to outliers
(ii) Advanced internal methods
   a. Data splitting (table II)
   b. Bootstrap (figure 4)
   c. Cross-validation (resampling techniques) [table III]


   d. Simulations, such as visual predictive checks (figure 5 and figure 6) or posterior predictive checks (PPCs) [table IV, figure 7]
(iii) External model evaluation (validation dataset observations compared with model predictions).[30]

Fig. 1. Online search of the Clinical Pharmacokinetics journal (http://adisonline.com/pharmacokinetics) for use of validation/evaluation methods, covering all available articles containing the key words up to 1 October 2011; varying common search terms related to each type of method/technique were searched, and searches for 'population pharmacokinetic modelling' and 'model evaluation' each returned approximately 300 articles. Counts (n; % of ~300 articles): goodness of fit (n = 76)/diagnostic plots (n = 36) [n = 112; 37.3%]; visual predictive check (n = 39)/visual predictive performance (n = 54) [n = 93; 31.0%]; log likelihood (n = 64)/log-likelihood profiling (n = 6) [n = 70; 23.3%]; bootstrap (n = 37)/bootstrapping (n = 13) [n = 50; 16.7%]; cross-validation [n = 43; 14.3%]; predictive check (n = 19)/posterior predictive check (n = 6) [n = 25; 8.3%]; case deletion (n = 16)/case-deletion diagnostics (n = 2) [n = 18; 6.0%]; jack-knife (n = 9)/jack-knife technique (n = 9) [n = 18; 6.0%]; data splitting [n = 13; 4.3%]; normalised prediction distribution errors [n = 8; 2.7%]; numerical predictive check [n = 6; 2.0%]. Other terms: validation (n = 127), internal validation (n = 127), external validation (n = 124), model validation (n = 123).

Fig. 2. Goodness-of-fit/diagnostic plots. Individual predicted plasma concentration (a) and population predicted plasma concentrations (b) vs observed plasma concentrations. Reproduced from Hennig et al.,[5] with permission from Springer International Publishing AG (© Adis Data Information BV 2006. All rights reserved.).

Internal validation uses approaches such as data splitting and resampling techniques (e.g. cross-validation, including bootstrapping) in order to generate estimates of confidence limits for parameter estimates.[2] The decision to split the available data into calibration and validation datasets (internal validation) depends on the amount of available data and whether the data can support such an approach. In general, statistical approaches such as measures of goodness-of-fit or cross-validation determine whether the model adequately describes the data, while taking into account that the parameter estimates are obtained from the data. Cross-validation techniques use repeated data splitting to estimate generalization error by resampling.[1]

Bootstrapping techniques (figure 4) are commonly used[5,8-11,17,31-33] to estimate 95% confidence intervals and standard errors (SEs) of the estimates, and provide insight into the statistical parameters of a distribution (e.g. mean, SE) when the true distribution is unknown and only observations are available. This technique utilizes resampling with replacement from the entire dataset, so it is possible for some original observations to appear more than once in a bootstrap sample.[28,34]

Data splitting (table II) examples are available from recent population pharmacokinetic manuscripts.[6,35-37] Papers published in Clinical Pharmacokinetics[10,11] demonstrate cross-validation techniques (table III). Kerbusch et al.[10] performed a cross-validation using jack-knife techniques to estimate the SEs of the estimates and the bias of a statistic. Numerous published papers demonstrate this technique.[6,10,25,38-42] It is important to note that the jack-knife is not considered a model validation technique by some, as it may only be used to correct for bias in parameter estimates and does not assess model performance.

Case-deletion diagnostics are also a form of cross-validation that assesses the influence of individuals within a model. This is accomplished by sequentially removing different sets of cases (in various numbers, depending upon the specific technique) from the dataset and evaluating how the model parameters are affected. Examples include those described by Musuamba et al.[8] and Sam et al.[12] Collectively, these methods (case-deletion diagnostics and jack-knife analysis) can be used to assess model sensitivity to outliers in the data.[30] Depending on which method or technique is

undertaken, different approaches are used for assessing the influence of outliers and for dealing with them. The impact of outliers can be assessed by removing the outlier from the model and re-estimating the parameters. In some cases, the identified outlier can be removed; however, if this is done, there needs to be adequate justification.[43] There are statistical methods that can be applied to assess the influence of outliers. Generally, removing outliers is considered acceptable if they can be shown to be data errors by independent means; this includes instances where a sample has been mixed up or mislabelled, the times recorded incorrectly, the assay not run correctly or the samples diluted incorrectly.[2,3,30]

In general, external validation involves application of the developed model to a validation dataset obtained from a different study, whereas internal validation uses values from the original study. External validation is considered one of the most stringent approaches to model testing. An example of external validation can be found in the manuscript by Han et al.,[17] and, when done, it provides evidence of the model's transportability.[25] The standard validation method utilizes a separate dataset, whether it arises from a random split of the original data or from subsequently collected datasets. Other approaches to validation include predictive distributions and a newer method, evaluation of pseudo-residuals (or prediction discrepancies).[44] If there is intent to validate a model, specific criteria are required to evaluate differences between observed values from a validation dataset and model predictions. These specific criteria require a process for decision making and acceptance of a validated model.
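The prediction-discrepancy (pseudo-residual) idea mentioned above can be sketched simply: simulate many replicates of each observation under the model and locate the observation within its simulated distribution. This is a simplified illustration; the model, error structure and all numbers are assumptions, not taken from any cited analysis.

```python
import numpy as np

rng = np.random.default_rng(7)

# Observed concentrations from a hypothetical validation dataset.
obs = np.array([3.1, 2.4, 1.7, 1.1, 0.6])

# Model predictions and an assumed 15% proportional residual error.
pred = np.array([3.0, 2.3, 1.6, 1.0, 0.55])
sd = 0.15 * pred

# Simulate 1000 replicates of each observation under the model.
sims = rng.normal(pred, sd, size=(1000, len(obs)))

# Prediction discrepancy: fraction of simulated values below the observation.
pd_ = (sims < obs).mean(axis=0)
print(np.round(pd_, 2))
```

Under an adequate model these discrepancies scatter roughly uniformly between 0 and 1; values piling up near 0 or 1 flag systematic misfit. Full implementations (e.g. NPDE) additionally decorrelate repeated observations within a subject before this comparison.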
The criteria for accepting a validated model include the following: (i) does the validation show good model performance?; (ii) are there errors or missing components in the model?; (iii) do the predictions match the observations?; and (iv) does the model account for all the random variability? There are issues associated with investigating whether a given null model is compatible with the data when the assumed model has unknown parameters.[44] These issues include making sure that the model can perform for the purpose for which it was developed and that the unknown parameters are appropriate for inclusion in the model development. Validation provides the ability to prospectively predict data. As models progress from simple to more complex, they tend to fit the data better and better, up to a point at which the model fits not only the data but also the noise within the data; this condition is known as overfitting.

3.1 Model Diagnostics

The term model diagnostics describes a set of systematic methods of evaluation that include numerical, graphical or


other approaches. Numerical approaches can include the objective function value (OFV), the bootstrap method, goodness-of-fit statistics, parameter estimates and imprecision estimates. The change in OFV provides the likelihood ratio test for nested models and is a valuable indicator of model goodness-of-fit.[45] In nonlinear mixed-effects modelling, the OFV corresponds to minus twice the log-likelihood of the data. Diagnostics can assess outliers or influential subjects, which may have influence on parameter estimates or model selection. Parameter estimates should make sense, be stable when the model is run repeatedly, and not lie at the boundary of the estimation predictions. Imprecision estimates assess SEs and correlations and

provide a measure of information about a parameter. Ideally, SE-based confidence intervals should always be symmetrical; sometimes they may provide impossible or inaccurate values. The most useful estimates are SE and covariance. Log likelihood profiling can be used to compare goodness-of-fit and to map the OFV to assess for model fit. It can also be used as an alternative method for determining confidence intervals for pharmacokinetic parameters.[7] This method is very time consuming when there are many parameters and is not commonly reported. Other numerical evaluators described in model diagnostics include calculation of the condition number. This is the ratio of

Weighted residuals

a

b

10

10

5

5

0

0

−5

−5

−10

−10 0

10

20

30

40

50

60

0

10

20

Individual

30

40

50

60

Time (h)

c

d

10

200

5

Count

0

100

−5

−10

0 0

1

2

3

4

Population predicted plasma concentration (mg/L)

−10

−5

0

5

10

Weighted residuals histogram

Fig. 3. Goodness-of-fit/diagnostic plots. Weighted residuals vs individuals (a), vs time after the last dose (b), vs population predicted plasma concentrations (c), and as histogram (d). Reproduced from Hennig et al.,[5] with permission from Springer International Publishing AG (ª Adis Data Information BV 2006. All rights reserved.). Adis ª 2012 Springer International Publishing AG. All rights reserved.

Clin Pharmacokinet 2012

Sherwin et al.

8

Table II. Data splitting: final pharmacokinetic parameters and validation results using data splitting. Reproduced from Falck et al.,[6] with permission from Springer International Publishing AG (© Adis Data Information BV 2009. All rights reserved.)

| Parameter | Final value | Jack-knife SD | 95% CI | Data splitting: average | Data splitting: SD | Data splitting: 95% CI |
|---|---|---|---|---|---|---|
| OFV | 9868.4 | 139 | 9796, 9874 | 9033 | 198 | 8910, 9156 |
| CL/F (L/h) | 26.9 | 0.24 | 26.9, 27.0 | 27.0 | 0.35 | 26.8, 27.2 |
| V1/F (L) | 24.4 | 5.5 | 25.2, 26.8 | 25.3 | 3.3 | 23.2, 27.3 |
| Q/F (L/h) | 19.6 | 1.7 | 19.6, 20.7 | 20.4 | 1.6 | 19.4, 21.4 |
| V2/F (L) | 1119 | 732 | 806, 1218 | 1140 | 851 | 613, 1668 |
| ka (h-1) | 0.544 | 0.10 | 0.514, 0.670 | 0.570 | 0.065 | 0.530, 0.610 |
| tlag (h) | 0.460 | 0.026 | 0.453, 0.467 | 0.463 | 0.020 | 0.450, 0.471 |
| Age on ka | 0.00109 | 0.00021 | 0.00103, 0.00116 | 0.00111 | 0.00030 | 0.0010, 0.0012 |
| Age on CL/F | 0.192 | 0.025 | 0.187, 0.201 | 0.190 | 0.021 | 0.184, 0.196 |
| WT on ka | 0.00480 | 0.0016 | 0.00405, 0.00495 | 0.0043 | 0.0012 | 0.0030, 0.0048 |
| TXT on ka | 0.246 | 0.046 | 0.237, 0.263 | 0.278 | 0.032 | 0.269, 0.287 |
| WT on V1/F | 0.351 | 0.21 | 0.296, 0.414 | 0.302 | 0.12 | 0.268, 0.335 |

CL/F = oral clearance; ka = absorption rate constant; OFV = objective function value; Q/F = intercompartment clearance after oral administration; SD = standard deviation; tlag = absorption lag time; TXT = time after transplantation; V1/F = central volume of distribution after oral administration; V2/F = peripheral volume of distribution after oral administration; WT = body weight.
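The jack-knife SDs in Table II come from re-estimating the model with each subject deleted in turn. A minimal sketch of the delete-one jack-knife follows, with hypothetical individual clearance values standing in for full model re-fits:

```python
import numpy as np

def jackknife_sd(values, estimator=np.mean):
    """Delete-one jack-knife: re-estimate the parameter with each subject
    left out in turn, then apply the standard jack-knife variance formula.
    In a real analysis each leave-one-out step is a full model re-fit."""
    n = len(values)
    loo = np.array([estimator(np.delete(values, i)) for i in range(n)])
    var = (n - 1) / n * np.sum((loo - loo.mean()) ** 2)
    return float(np.sqrt(var))

# Hypothetical individual clearance estimates (L/h)
cl = np.array([25.1, 27.3, 26.0, 28.4, 26.8, 25.9, 27.7, 26.5])
sd = jackknife_sd(cl)
```

For the sample mean, the jack-knife SD reduces exactly to the usual standard error of the mean, which provides a useful sanity check on the implementation.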

the largest to the smallest eigenvalue of the covariance matrix (eigenvalues are obtained by solving the characteristic equation of the matrix). This technique checks for model over-parameterization and involves calculating the eigenvalues from the covariance matrix.[46] A condition number >10^n, where n is the number of parameters, is considered an indication of severe ill-conditioning or model over-parameterization. For example, a calculated condition number >1000 when there are ≤3 parameters may indicate highly correlated parameters, with values >0.95 in the correlation matrix.[47] In addition to model diagnostics, model selection criteria are used in model selection/evaluation; for example, the Akaike information criterion compares models with different numbers of parameters.[48] The Schwarz Bayesian criterion assumes equal prior probability for each model and for each parameter within the model to provide a significant effect.[49] Other goodness-of-fit statistics include measures of model predictive performance such as bias, root mean square error and imprecision.[50] Numerical diagnostics can provide important comparisons between models; the most typically used is the OFV. These numerical diagnostic methods can provide information about model robustness and identify poor model fit (e.g. large SEs of parameters) but cannot be used to establish that the model fully describes the observed data.

Graphical diagnostics include prediction-, parameter-, simulation- and residual-based estimates such as weighted residuals (WRES) or individual weighted residuals (IWRES). Population prediction (PRED)-based diagnostics provide graphical comparison of observations, such as the dependent variable versus PRED, which can assess the fit of the data along the line of identity and can be used to visually identify outliers. This technique follows trends of individuals, or lines between points, and can indicate bias with the use of a regression line. PRED-based diagnostics may be useful for identifying variability not explained by the structural model and its covariates. Unexplainable patterns may be related to the specific model, study design or parameter values, and additional graphical diagnostics may be needed to better understand specific causes.[4,25,27] Goodness-of-fit plots can include any or all of the following: PRED versus observations versus time, PRED versus observations or time, population residuals versus PRED, population WRES versus PRED or time, individual predictions (IPRED) versus observations or time, and IWRES versus IPRED or time (figures 2 and 3).[5] These plots are used to detect potential bias or problems with the structural model or random effects.[30] The most common graphical diagnostics used as part of model validation/evaluation are goodness-of-fit plots.

Karlsson and Savic[27] outline the use of model diagnostics, including a description of typical IPRED-based diagnostics, which plot the observed versus predicted values to indicate how well the two agree and the potential variability in the data. These plots generally include a line of identity and a regression line. Despite being a fundamentally required plot, Karlsson and Savic[27] note that this diagnostic has a flaw in that there is no expected pattern in the goodness-of-fit plot. Additional goodness-of-fit plots include diagnostics based on individual parameter estimates. These are similar to observed versus PRED plots; however, the predictions are based on IPRED rather than the population, so unexplained parameter variability does not confound the interpretation. Alternative diagnostics include those based on empirical Bayes (or post hoc) estimates, in which the prior distribution of the parameters across the population and the actual data are used to obtain the posterior probability of individual parameter estimates, individual predicted data based on individual empirical Bayes parameter estimates (IPRED), and absolute IWRES. Diagnostics based on empirical Bayes estimates increase resolution by separating variability components.[27] Some diagnostics are not suitable for some problems; thus, it is important to know what a diagnostic does before using it and to fit the diagnostic to the purpose or use of the model. WRES, commonly used for evaluating model misspecification, are calculated using the first-order (FO) approximation even when

Fig. 4. Bootstrapping techniques. Bootstrap distributions for the intercepts of the parent model (n = 316): (a) lagtime; (b) absorption rate constant; (c) apparent total clearance; (d) central volume of distribution; (e) intercompartmental clearance; and (f) peripheral volume of distribution. Comparison of the posterior distributions of the intercept parameters: the normal approximation (final model) is shown in orange (with median, solid line) and the bootstrap distribution in black (with median, dotted line). The solid bar at the bottom of each plot represents the bias-corrected 95% bootstrap confidence interval. Reproduced from Lahu et al.,[9] with permission from Springer International Publishing AG (© Adis Data Information BV 2010. All rights reserved.). CL = apparent total clearance; ka = absorption rate constant; Q = intercompartmental clearance; V1 = central volume of distribution; V2 = peripheral volume of distribution.
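The nonparametric bootstrap behind distributions like those in figure 4 can be sketched as follows. The clearance values are simulated and the estimator is a simple mean; in a real analysis, each bootstrap replicate would be a full re-fit of the population model to a resampled dataset:

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical individual clearance estimates (L/h), one per subject
cl = rng.lognormal(mean=np.log(10.0), sigma=0.3, size=50)

def bootstrap_ci(values, n_boot=1000, alpha=0.05):
    """Nonparametric bootstrap: resample subjects with replacement,
    re-estimate the parameter on each replicate and take the percentile
    confidence interval of the replicate estimates. Here the 'estimator'
    is a simple mean standing in for a model re-fit."""
    estimates = np.array([
        rng.choice(values, size=len(values), replace=True).mean()
        for _ in range(n_boot)
    ])
    lower, upper = np.percentile(estimates, [100 * alpha / 2,
                                             100 * (1 - alpha / 2)])
    return float(np.median(estimates)), float(lower), float(upper)

boot_median, ci_lower, ci_upper = bootstrap_ci(cl)
```

Comparing the bootstrap median and percentile interval with the original estimate and its SE-based interval (as in figure 4) reveals asymmetry or bias that the symmetrical SE-based interval cannot show.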


Table III. Cross-validation: pharmacokinetic parameter estimates in the full dataset and in the four different subsets (cross-validation). Reproduced from Djebli et al.,[11] with permission from Springer International Publishing AG (© Adis Data Information BV 2006. All rights reserved.)

| Parameter | Full dataset (± 2 SE) | Subset 1 (A) | Subset 2 (B) | Subset 3 (C) | Subset 4 (D) |
|---|---|---|---|---|---|
| ktr (h-1) | 5.25 (4.75–5.75) | 5.2 | 5.3 | 5.0 | 5.2 |
| Q/F (L/h) | 38.7 (27.3–50.1) | 45.8 | 38.4 | 44.7 | 34.1 |
| V1/F (L) | 218 (187–249) | 216 | 219 | 223 | 215 |
| V2/F (L) | 292 (233–351) | 284 | 332 | 233 | 268 |
| θ1 (L/h) | 14.1 (12.0–16.2) | 14.6 | 14.2 | 14.7 | 12.8 |
| θ2 (L/h) | 14.2 (6.1–22.3) | 14.4 | 20.5 | 12.3 | 11.5 |
| CL/F (= θ1 + θ2 · CYP3A5) | | | | | |

θx = typical parameter value; CL/F = apparent oral clearance; CYP = cytochrome P450; ktr = transfer rate constant (absorption parameter); Q/F = intercompartmental clearance after oral administration; SE = standard error; V1/F = apparent volume of distribution of the central compartment after oral administration; V2/F = apparent volume of distribution of the peripheral compartment after oral administration.
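The data-splitting scheme in Table III can be sketched as follows. The subject-level values are hypothetical and the per-fold "estimate" is a simple mean; in a real cross-validation each fold would be a full model re-fit on the index (model-building) set, with predictive performance (bias and root mean square error, in the sense of Sheiner and Beal[50]) assessed on the held-out subset:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical per-subject clearance values (L/h) for 40 subjects
cl = rng.lognormal(np.log(14.0), 0.25, size=40)

def cross_validate(values, n_folds=4):
    """Data splitting / cross-validation: estimate the parameter on each
    index ('model building') set, then record the estimate together with
    the bias (mean prediction error) and root mean square error on the
    held-out subset. A simple mean stands in for a model re-fit."""
    idx = rng.permutation(len(values))
    folds = np.array_split(idx, n_folds)
    results = []
    for k in range(n_folds):
        held_out = values[folds[k]]
        building = values[np.concatenate(
            [folds[j] for j in range(n_folds) if j != k])]
        estimate = building.mean()             # stands in for a model re-fit
        bias = float(np.mean(held_out - estimate))
        rmse = float(np.sqrt(np.mean((held_out - estimate) ** 2)))
        results.append((float(estimate), bias, rmse))
    return results

fold_results = cross_validate(cl)
```

Stable estimates across folds, as in Table III, indicate that no single subset drives the final parameter values.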

running FO conditional estimation (FOCE). The possibility of misguided model development has been suggested if the WRES are wrong. Alternatively, conditional WRES (CWRES) can be calculated based on the FOCE approximation.[51] When WRES and CWRES are compared for estimates obtained with FOCE under the true model, the difference between estimated and true parameters is small.

Table IV. Posterior predictive check (PPC) results for covariate subgroups

| Subgroup | PPC finding | Assessment |
|---|---|---|
| Age >45 years and ≤65 years | The lower and upper 95% points of AUC parent are outside the range [-1, +1], whereas the 50% point is within the range. For AUC metabolite and tPDE4i, the 50% point is slightly negative | Minor |
| Age >65 years | The upper 95% point of AUC parent is shifted to the right, and the 50% point is away from 0 (positive value). For AUC metabolite and tPDE4i, only the upper 95% point is shifted to the right. The parent model does not predict the elderly population satisfactorily; however, since approximately 90% of the tPDE4i is achieved by roflumilast N-oxide, with which only minor issues were found, the overall finding was not considered severe | Moderate |
| High-fat meal | Small sample size (n = 12) | No issue |
| Black | The lower 95% point of AUC parent is outside the range [-1, +1], as 1/34 subjects had a very small AUC value | No issue |
| Hispanic | Small sample size (n = 15) | No issue |
| Body weight ≤60 kg | The lower and upper 95% points of AUC parent are outside the range [-1, +1]. For AUC metabolite and tPDE4i, the upper 95% point is outside the range | Moderate |

AUC = area under the plasma concentration-time curve; PPC = posterior predictive check; tPDE4i = total phosphodiesterase 4 inhibitory activity.


Fig. 7. Posterior predictive check (PPC) for race in healthy and patient data. Reproduced from Lahu et al.,[9] with permission from Springer International Publishing AG (© Adis Data Information BV 2010. All rights reserved.). AUC = area under the plasma concentration-time curve; pred. = predicted; tPDE4i = total PDE4 inhibitory activity.
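The core of a predictive check like the one in figure 7 is to simulate a summary statistic repeatedly under the fitted model and locate the observed value within that predictive distribution. A minimal sketch follows; the lognormal AUC model, its parameters and the observed value are all hypothetical:

```python
import numpy as np

rng = np.random.default_rng(7)

def ppc_percentile(observed_stat, simulate_stat, n_rep=1000):
    """Predictive check machinery: simulate the summary statistic under
    the fitted model many times, then locate the observed statistic within
    the predictive distribution. A location near 0 or 1 flags a model-data
    discrepancy; a mid-range location is reassuring."""
    sims = np.array([simulate_stat() for _ in range(n_rep)])
    return float(np.mean(sims < observed_stat))

# Hypothetical fitted model: individual AUCs are lognormal, and the
# statistic of interest is the mean AUC of an n = 34 subgroup
def simulate_mean_auc(n=34, mu=np.log(5.0), sigma=0.4):
    return rng.lognormal(mu, sigma, size=n).mean()

observed_mean_auc = 5.4   # hypothetical observed subgroup mean
location = ppc_percentile(observed_mean_auc, simulate_mean_auc)
```

In a full posterior predictive check the simulations would also draw parameter vectors from their posterior (or bootstrap) distribution rather than fixing them at the point estimates.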

Because such models are predictive in nature, more assumptions are made about the relationship between the population, the data and the predicted results. The model describes behaviour outside the range over which empirical evidence is available; thus, results from the model are applied to patients from whom data were not actually collected. Therefore, this type of model should be validated to be consistent with its purpose and should not have deficiencies that would make it inapplicable for its intended use.[27] Validation methods/techniques depend on the specifics of the model and how it was built, and results rely on the quality of the data used. There is a strong link between study design and appropriate validation/evaluation techniques. There are differing opinions about which statistical approaches should be applied to a population pharmacokinetic model; ultimately, the issue of validation or evaluation is left to the modeller. There are limitations to validating a predictive model without gathering prospective data, including the risk of drawing unrealistic, impractical and improbable conclusions that are of no clinical benefit.

3.3 What is the Ideal Validation Method?

The ideal validation method remains the central unanswered question: those who undertake population pharmacokinetic modelling cannot agree on which term to use in the literature, and the choice of method depends in large part on the software used by various groups (table V). There is no consensus on the 'preferred' method of validation, and there is even less agreement with respect to validation of models for special populations such as children, pregnant women and the elderly. Model evaluation/validation should typically include some, if not all, of the methods and techniques described in table I and should assess model performance. Challenges in model validation arise whenever models are used to make decisions, particularly when these affect drug choice and dosing regimens. The FDA and other regulatory bodies use population pharmacokinetic analyses as a basis for regulatory decisions; results are used in labelling and to provide guidance on exposure response. Although there is a lack of consensus regarding the type


of validation appropriate for population pharmacokinetic models, or whether it is possible to fully validate a pharmacokinetic model, there is no argument about the need for software packages used in modelling to be reliable and for results to be reproducible. This is important whether validation is required by the FDA, industry or academia.[2,3,22]

4. Future Approaches/Suggestions

This review of validation/evaluation methods used in population pharmacokinetic modelling has identified a number of limitations, which lead to the following suggestions for future approaches:

[Figure 8 panel statistics: (a) morphine, NPDE mean 0.2215*, variance 0.6818*; (b) morphine-3-glucuronide, mean 0.03461, variance 0.8481; (c) morphine-6-glucuronide, mean 0.5153*, variance 1.501*]

Fig. 8. Normalized prediction distribution errors. Results of external validation using the normalized prediction distribution error (NPDE) method with external datasets of postoperative and ventilated patients. The histograms show the NPDE frequency distribution in the merged external datasets for (a) morphine; (b) morphine-3-glucuronide; and (c) morphine-6-glucuronide. The solid lines indicate a normal distribution. The dotted lines represent the mean ± 2 SD of a normal distribution. Reproduced from Krekels et al.,[16] with permission from Springer International Publishing AG (© Adis Data Information BV 2011. All rights reserved.). NPDE = normalized prediction distribution error; * indicates a significant difference from a mean of 0 and a variance of 1 at the p < 0.05 level, as determined by the Wilcoxon signed rank test and the Fisher test of variance.
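The NPDE checks summarized in figure 8 test whether the computed NPDE are standard normal. A minimal sketch on simulated NPDE vectors follows; the article's Wilcoxon signed rank and Fisher variance tests are replaced here by a simple z-statistic for the mean, as a stand-in:

```python
import numpy as np

def npde_checks(npde):
    """Under a correct model the NPDE follow a standard normal
    distribution. Return the sample mean, the sample variance and a
    z-statistic testing mean = 0 (a Wilcoxon signed rank test, as used
    in the article, would be the more robust choice)."""
    n = len(npde)
    mean = float(np.mean(npde))
    var = float(np.var(npde, ddof=1))
    z_mean = mean / np.sqrt(var / n)
    return mean, var, z_mean

rng = np.random.default_rng(3)
consistent = rng.standard_normal(500)        # model describes the data
biased = rng.standard_normal(500) + 0.5      # systematic misprediction

_, var_ok, z_ok = npde_checks(consistent)
_, var_bad, z_bad = npde_checks(biased)
```

A |z| beyond roughly 1.96, or a variance far from 1, corresponds to the asterisked panels in figure 8, where the model under- or over-predicts the external data.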


Table V. Variety of software tools that exist to aid in running various population pharmacokinetic modelling software packages and for undertaking model validation/evaluation and diagnostics

| Software program | Description | URL |
|---|---|---|
| PsN (Perl speaks NONMEM) | Script program; extracts parameters from NONMEM; numerical diagnostics, numerical predictive check, log-likelihood profiling, nonparametric bootstrap; statistical methods: cross-validation, jack-knife and log-likelihood profiling. Available free | http://psn.sourceforge.net/download.php |
| R | Statistical and graphics interface; used for exploratory analysis of the data. Available free | http://www.r-project.org/ |
| Xpose | Diagnostic software; plugs into R, produces automatic graphs of residuals, DV vs PRED, etc.; numerical evaluations; visual diagnostics: wide range of diagnostic graphs, visual and posterior predictive checks; data exploration such as step-wise GAM and tree-based modelling. Available free | http://xpose.sourceforge.net/ |
| PLT Tools | Diagnostic software; requires NONMEM to be installed to run; statistical and graphics interface: CWRES, visual predictive checks, bootstrap/jack-knife and likelihood profile; model diagnostics using Xpose. A free version is available, or a license can be bought | http://www.pltsoft.com |
| Phoenix NLME | PK analysis program with built-in graphical functions; numerical diagnostics: numerical predictive check, log-likelihood profiling and nonparametric bootstrap; visual diagnostics: wide range of diagnostic graphs, visual and posterior predictive checks. Requires an annual license fee | http://www.pharsight.com/products/prod_phoenix_nlme_home.php |
| MONOLIX | Similar to NONMEM; PK modelling program with built-in graphical functions; numerical diagnostics: numerical predictive check, log-likelihood profiling, nonparametric bootstrap; visual diagnostics: wide range of diagnostic graphs, visual and posterior predictive checks. A free version is available | http://software.monolix.org/sdoms/software/ |
| PDx-Pop | Requires NONMEM to be installed to run; NONMEM graphical user interface program (uses R or S-PLUS and Xpose); performs AIC, SBC, OFV profiling, bootstrap, leverage analysis (used to evaluate model robustness) and predictive checks. Requires an annual license fee | http://www.iconplc.com/technology/products/pdx-pop/ |
| Census | Data handling and summary of NONMEM (interfaced with Xpose 4, R and PsN); provides graphical output. Available free | http://census.sourceforge.net/ |
| Wings for NONMEM (WFN) | A simple DOS presentation to run NONMEM; requires NONMEM to be installed to run; can be used to perform nonparametric and parametric bootstrap analyses and likelihood ratio testing. Available free | http://wfn.sourceforge.net/ |
| NMQual | An operating environment for NONMEM. Available free | http://code.google.com/p/nmqual/ |
| Pirana | Data handling and summary of NONMEM; graphical user interface for PsN, R and Xpose. Available free | http://www.pirana-software.com/ |
| SAS | Used for exploratory analysis of data. Requires a license fee | http://www.sas.com/ |
| Berkeley Madonna | Used to explore model structures. A free version is available | http://www.berkeleymadonna.com/ |
| Mango Solutions Navigator | Data analysis; outputs NONMEM in tabular or graphical formats. Statistical support and analysis can be provided for a fee (hourly, daily or annually) | http://www.mango-solutions.com/navigator |

AIC = Akaike information criterion; CWRES = conditional weighted residuals; DV = dependent variable; GAM = generalized additive modelling; NONMEM = nonlinear mixed-effect modelling; OFV = objective function value; PK = pharmacokinetic; PRED = population predictions; SBC = Schwarz Bayesian criterion.


1. Similar to model building, there is a lack of consensus on the appropriate terminology used to describe or define the concept of validation. This will require not only clarification but consensus when approaching conversations regarding validation and methods, to bridge the gap between differing camps within the population pharmacokinetic modelling world.

2. As there is no definitive consensus on terminology, validation of consistent, robust and reliable methods has yet to be formally and independently performed, so that model limits and variability assessment with current models can be understood. Systematic investigations comparing different validation methods will be needed to determine the best approach for any given dataset/study population.

3. There remains debate regarding the need for model validation. This debate serves only to undermine the applicability of models, particularly within the clinical realm. The field needs to move beyond the concept of modelling as an art and utilize rigorous methodology to define accuracy, consistency and reliability, to deliver on the promise of population pharmacokinetic modelling and allow the application of predictive, appropriate models in both the clinical and research worlds.

5. Conclusions

Given the lack of consensus on the best approaches in model building and validation, sound fundamentals are required to ensure the selected methodology is suitable for the particular data type and/or patient population. There is a great need to further standardize and establish the best approaches in modelling so that any model created can be systematically evaluated and its results relied upon. Defining what constitutes a validated model is critically important to developing and expanding the field of population pharmacokinetic modelling. The inability to externally and accurately describe a model greatly decreases the trust required to rely upon it, hindering clinical and research progress.
Acknowledgements

No funding was provided to assist in the preparation of this review. The authors have no potential conflicts of interest that are directly relevant to the content of this review to declare.

References

1. Ette EI, Williams PJ. Pharmacometrics: the science of quantitative pharmacology. Hoboken (NJ): John Wiley, 2007
2. FDA. Guidance for industry: population pharmacokinetics. US Department of Health and Human Services; Food and Drug Administration; Centre for Drug Evaluation and Research & Centre for Biologics Evaluation and Research, 1999 Feb [online]. Available from URL: http://www.fda.gov/downloads/Drugs/GuidanceComplianceRegulatoryInformation/Guidances/UCM072137.pdf [Accessed 2011 Sep 1]
3. EMA. Guideline on reporting the results of population pharmacokinetic analyses. European Medicines Agency, 2007 Jun 21 (Doc. Ref. CHMP/EWP/185990/06) [online]. Available from URL: http://www.ema.europa.eu/docs/en_GB/document_library/Scientific_guideline/2009/09/WC500003067.pdf [Accessed 2011 Sep 2]
4. Wade JR, Edholm M, Salmonson T. A guide for reporting the results of population pharmacokinetic analyses: a Swedish perspective. AAPS J 2005; 7: 45
5. Hennig S, Wainwright CE, Bell SC, et al. Population pharmacokinetics of itraconazole and its active metabolite hydroxy-itraconazole in paediatric cystic fibrosis and bone marrow transplant patients. Clin Pharmacokinet 2006; 45: 1099-114
6. Falck P, Midtvedt K, Vân Lê TT, et al. A population pharmacokinetic model of ciclosporin applicable for assisting dose management of kidney transplant recipients. Clin Pharmacokinet 2009; 48: 615-23
7. Perez-Ruixo JJ, Zannikos P, Hirankarn S, et al. Population pharmacokinetic meta-analysis of trabectedin (ET-743, Yondelis) in cancer patients. Clin Pharmacokinet 2007; 46: 867-84
8. Musuamba FT, Rousseau A, Bosmans J-L, et al. Limited sampling models and Bayesian estimation for mycophenolic acid area under the curve prediction in stable renal transplant patients co-medicated with ciclosporin or sirolimus. Clin Pharmacokinet 2009; 48: 745-58
9. Lahu G, Hünnemeyer A, Diletti E, et al. Population pharmacokinetic modelling of roflumilast and roflumilast N-oxide by total phosphodiesterase-4 inhibitory activity and development of a population pharmacodynamic-adverse event model. Clin Pharmacokinet 2010; 49: 589-606
10. Kerbusch T, de Kraker J, Mathôt RAA, et al. Population pharmacokinetics of ifosfamide and its dechloroethylated and hydroxylated metabolites in children with malignant disease: a sparse sampling approach. Clin Pharmacokinet 2001; 40: 615-25
11. Djebli N, Rousseau A, Hoizey G, et al. Sirolimus population pharmacokinetic/pharmacogenetic analysis and Bayesian modelling in kidney transplant recipients. Clin Pharmacokinet 2006; 45: 1135-48
12. Sam WJ, Tham LS, Holmes MJ, et al. Population pharmacokinetics of tacrolimus in whole blood and plasma in Asian liver transplant patients. Clin Pharmacokinet 2006; 45: 59-75
13. Lehr T, Staab A, Tillmann C, et al. A quantitative enterohepatic circulation model: development and evaluation with tesofensine and meloxicam. Clin Pharmacokinet 2009; 48: 529-42
14. Nielsen EI, Sandström M, Honoré PH, et al. Developmental pharmacokinetics of gentamicin in preterm and term neonates: population modelling of a prospective study. Clin Pharmacokinet 2009; 48: 253-63
15. Pilla Reddy V, Kozielska M, Johnson M, et al. Structural models describing placebo treatment effects in schizophrenia and other neuropsychiatric disorders. Clin Pharmacokinet 2011; 50: 429-50
16. Krekels EH, DeJongh J, van Lingen RA, et al. Predictive performance of a recently developed population pharmacokinetic model for morphine and its metabolites in new datasets of (preterm) neonates, infants and children. Clin Pharmacokinet 2011; 50: 51-63
17. Han K, Bies R, Johnson H, et al. Population pharmacokinetic evaluation with external validation and Bayesian estimator of voriconazole in liver transplant recipients. Clin Pharmacokinet 2011; 50: 201-14
18. Beal SL. Validation of a population model. NONMEM users group (NMusers), 1994 Feb 1 [online]. Available from URL: http://www.cognigencorp.com/nonmem/nmo/topic006.html [Accessed 2011 Sep 2]
19. International Programme on Chemical Safety (IPCS). Characterization and application of physiologically based pharmacokinetic models in risk assessment. Geneva: WHO, 2010 [online]. Available from URL: http://www.inchem.org/documents/harmproj/harmproj/harmproj9.pdf [Accessed 2012 May 28]
20. Barton HA, Chiu WA, Woodrow Setzer R, et al. Characterizing uncertainty and variability in physiologically based pharmacokinetic models: state of the science and needs for research and implementation. Toxicol Sci 2007; 99: 395-402
21. Babuska I. Verification and validation in computational engineering and science: basic concepts. Comput Methods Appl Mech Eng 2004; 193: 4057-66
22. Sun H, Fadiran EO, Jones CD, et al. Population pharmacokinetics: a regulatory perspective. Clin Pharmacokinet 1999; 37: 41-58
23. Mesnil F, Mentre F, Dubruc C, et al. Population pharmacokinetic analysis of mizolastine and validation from sparse data on patients using the nonparametric maximum likelihood method. J Pharmacokinet Biopharm 1998; 26: 133-61
24. Williams PJ, Ette EI. The role of population pharmacokinetics in drug development in light of the Food and Drug Administration's 'Guidance for Industry: population pharmacokinetics'. Clin Pharmacokinet 2000; 39: 385-95
25. Ette EI, Ludden TM. Population pharmacokinetic modeling: the importance of informative graphics. Pharm Res 1995; 12: 1845-55
26. Box GEP, Draper NR. Empirical model-building and response surfaces. New York: John Wiley & Sons, 1987
27. Karlsson MO, Savic RM. Diagnosing model diagnostics. Clin Pharmacol Ther 2007; 82: 17-20
28. Ette EI. Stability and performance of a population pharmacokinetic model. J Clin Pharmacol 1997; 37: 486-95
29. Vilicich M. Validation strategy for NONMEM. NONMEM users group (NMusers), 2008 Oct 17 [online]. Available from URL: http://www.cognigencorp.com/nonmem/current/2008-October/1214.html [Accessed 2011 Sep 2]
30. Brendel K, Dartois C, Comets E, et al. Are population pharmacokinetic and/or pharmacodynamic models adequately evaluated? A survey of the literature from 2002 to 2004. Clin Pharmacokinet 2007; 46: 221-34
31. Frame B, Koup J, Miller R, et al. Population pharmacokinetics of clinafloxacin in healthy volunteers and patients with infections: experience with heterogeneous pharmacokinetic data. Clin Pharmacokinet 2001; 40: 307-15
32. Feillet F, Clarke L, Meli C, et al. Pharmacokinetics of sapropterin in patients with phenylketonuria. Clin Pharmacokinet 2008; 47: 817-25
33. Sugiyama E, Kaniwa N, Kim S-R, et al. Population pharmacokinetics of gemcitabine and its metabolite in Japanese cancer patients: impact of genetic polymorphisms. Clin Pharmacokinet 2010; 49: 549-58
34. Parke J, Holford NH, Charles BG. A procedure for generating bootstrap samples for the validation of nonlinear mixed-effects population models. Comput Methods Programs Biomed 1999; 59: 19-29
35. Benkali K, Prémaud A, Picard N, et al. Tacrolimus population pharmacokinetic-pharmacogenetic analysis and Bayesian estimation in renal transplant recipients. Clin Pharmacokinet 2009; 48: 805-16
36. Moltó J, Barbanoj MJ, Miranda C, et al. Simultaneous population pharmacokinetic model for lopinavir and ritonavir in HIV-infected adults. Clin Pharmacokinet 2008; 47: 681-92
37. Saint-Marcoux F, Royer B, Debord J, et al. Pharmacokinetic modelling and development of Bayesian estimators for therapeutic drug monitoring of mycophenolate mofetil in reduced-intensity haematopoietic stem cell transplantation. Clin Pharmacokinet 2009; 48: 667-75
38. Efron B. The jackknife, the bootstrap, and other resampling plans. Philadelphia (PA): Society for Industrial and Applied Mathematics, 1982
39. Zahr N, Amoura Z, Debord J, et al. Pharmacokinetic study of mycophenolate mofetil in patients with systemic lupus erythematosus and design of Bayesian estimator using limited sampling strategies. Clin Pharmacokinet 2008; 47: 277-84
40. Coulter CV, Isbister GK, Duffull SB. The pharmacokinetics of methanol in the presence of ethanol: a case study. Clin Pharmacokinet 2011; 50: 245-51
41. Stockis A, Toublanc N, Sargentini-Maier ML, et al. Retrospective population pharmacokinetic analysis of levetiracetam in children and adolescents with epilepsy: dosing recommendations. Clin Pharmacokinet 2008; 47: 333-41
42. Wilde S, Jetter A, Rietbrock S, et al. Population pharmacokinetics of the BEACOPP polychemotherapy regimen in Hodgkin's lymphoma and its effect on myelotoxicity. Clin Pharmacokinet 2007; 46: 319-33
43. Ette EI, Williams PJ, Kim YH, et al. Model appropriateness and population pharmacokinetic modeling. J Clin Pharmacol 2003; 43: 610-23
44. Mentre F, Ebelin M. Validation of population pharmacokinetic/pharmacodynamic analyses: review of proposed approaches. In: Balant L, Aarons L, editors. The population approach: measuring and managing variability in response, concentration and dose. Brussels: Commission of the European Communities, 1997: 147-60
45. Wahlby U, Jonsson EN, Karlsson MO. Assessment of actual significance levels for covariate effects in NONMEM. J Pharmacokinet Pharmacodyn 2001; 28: 231-52
46. Montgomery D, Peck E. Introduction to linear regression analysis. New York: Wiley, 1982
47. Bachman W. Model diagnostics. NMusers, 2003 [online]. Available from URL: http://www.cognigencorp.com/nonmem/nm/99may012003.html [Accessed 2006 Jan 12]
48. Yamaoka K, Nakagawa T, Uno T. Application of Akaike's information criterion (AIC) in the evaluation of linear pharmacokinetic equations. J Pharmacokinet Biopharm 1978; 6: 165-75
49. Ludden TM, Beal SL, Sheiner LB. Comparison of the Akaike information criterion, the Schwarz criterion and the F test as guides to model selection. J Pharmacokinet Biopharm 1994; 22: 431-45
50. Sheiner LB, Beal SL. Some suggestions for measuring predictive performance. J Pharmacokinet Biopharm 1981; 9: 503-12
51. Hooker AC, Staatz CE, Karlsson MO. Conditional weighted residuals (CWRES): a model diagnostic for the FOCE method. Pharm Res 2007; 24: 2187-97
52. Pérez-Ruixo JJ, Krzyzanski W, Bouman-Thio E, et al. Pharmacokinetics and pharmacodynamics of the erythropoietin Mimetibody construct CNTO 528 in healthy subjects. Clin Pharmacokinet 2009; 48: 601-13
53. Yano Y, Beal SL, Sheiner LB. Evaluating pharmacokinetic/pharmacodynamic models using the posterior predictive check. J Pharmacokinet Pharmacodyn 2001; 28: 171-92
54. Mentre F, Escolano S. Prediction discrepancies for the evaluation of nonlinear mixed-effects models. J Pharmacokinet Pharmacodyn 2006; 33: 345-67
55. Wiczling P, Lowe P, Pigeolet E, et al. Population pharmacokinetic modelling of filgrastim in healthy adults following intravenous and subcutaneous administrations. Clin Pharmacokinet 2009; 48: 817-26
56. Mueck W, Frey R. Population pharmacokinetics and pharmacodynamics of cinaciguat, a soluble guanylate cyclase activator, in patients with acute decompensated heart failure. Clin Pharmacokinet 2010; 49: 119-29
57. Lindauer A, Siepmann T, Oertel R, et al. Pharmacokinetic/pharmacodynamic modelling of venlafaxine: pupillary light reflex as a test system for noradrenergic effects. Clin Pharmacokinet 2008; 47: 721-31
58. Retlich S, Duval V, Ring A, et al. Pharmacokinetics and pharmacodynamics of single rising intravenous doses (0.5 mg-10 mg) and determination of absolute bioavailability of the dipeptidyl peptidase-4 inhibitor linagliptin (BI 1356) in healthy male subjects. Clin Pharmacokinet 2010; 49: 829-40
59. Samtani MN, Vermeulen A, Stuyckens K. Population pharmacokinetics of intramuscular paliperidone palmitate in patients with schizophrenia: a novel once-monthly, long-acting formulation of an atypical antipsychotic. Clin Pharmacokinet 2009; 48: 585-600
60. Mukonzo JK, Nanzigu S, Rekic D, et al. HIV/AIDS patients display lower relative bioavailability of efavirenz than healthy subjects. Clin Pharmacokinet 2011; 50: 531-40
61. Jauregizar N, de la Fuente L, Lucero ML, et al. Pharmacokinetic-pharmacodynamic modelling of the antihistaminic (H1) effect of bilastine. Clin Pharmacokinet 2009; 48: 543-54
62. Mueck W, Lensing AWA, Agnelli G, et al. Rivaroxaban: population pharmacokinetic analyses in patients treated for acute deep-vein thrombosis and exposure simulations in patients with atrial fibrillation treated for stroke prevention. Clin Pharmacokinet 2011; 50: 675-86
63. Wang DD, Zhang S. Standardized visual predictive check – how and when to use it in model evaluation [abstract no. 1501]. PAGE. Abstracts of the 18th Annual Meeting of the Population Approach Group in Europe; 2009 Jun 23-26; St Petersburg [online]. Available from URL: http://www.page-meeting.org/?abstract=1501 [Accessed 2012 May 28]
64. Karlsson MO, Holford NH. A tutorial on visual predictive checks [abstract no. 1434]. PAGE. Abstracts of the 17th Annual Meeting of the Population Approach Group in Europe; 2008 Jun 18-20; Marseille [online]. Available from URL: http://www.page-meeting.org/?abstract=1434 [Accessed 2012 May 28]
65. Bergstrand M, Hooker AC, Wallin JE, et al. Prediction-corrected visual predictive checks for diagnosing nonlinear mixed-effects models. AAPS J 2011; 13: 143-51
66. Brendel K, Comets E, Laffont C, et al. Metrics for external model evaluation with an application to the population pharmacokinetics of gliclazide. Pharm Res 2006; 23: 2036-49

Adis ª 2012 Springer International Publishing AG. All rights reserved.

67. Brendel K, Comets E, Laffont C, et al. Evaluation of different tests based on observations for external model evaluation of population analyses. J Pharmacokinet Pharmacodyn 2010; 37: 49-65 68. Mandema JW, Verotta D, Sheiner LB. Building population pharmacokinetic– pharmacodynamic models: I. Models for covariate effects. J Pharmacokinet Biopharm 1992; 20: 511-28 69. Sheiner LB. Analysis of pharmacokinetic data using parametric models: III. Hypothesis tests and confidence intervals. J Pharmacokinet Biopharm 1986; 14: 539-55

Correspondence: Dr Mary H.H. Ensom, Children’s and Women’s Health Centre of British Columbia, Pharmacy Department (0B7), 4500 Oak Street, Vancouver, BC V6H 3N1, Canada. E-mail: [email protected]