Bayesian optimal designs for a quantal dose response study with potentially missing observations
InYoung Baek, Wei Zhu, Xiangfeng Wu
Department of Applied Mathematics and Statistics, State University of New York at Stony Brook, Stony Brook, New York 11794

Weng Kee Wong
Department of Biostatistics, University of California, Los Angeles, Los Angeles, California 90095

Abstract

In a dose response study, there are frequently multiple goals and not all planned observations are realized at the end of the study. Subjects drop out and the initial design can be quite different from the final design. Consequently, the final design can be inefficient. Single and multiple-objective Bayesian optimal designs that account for potentially missing observations in quantal response models were recently proposed in Baek (2005). In this work, we investigate the efficiencies of the conventional optimal designs that do not incorporate potential missing information relative to our proposed designs. Furthermore, we examine the impact of a restricted dose range on the resulting optimal designs. As an application, we use missing data information from a study by Yocum et al. (2003) to design a study for estimating dose levels of tacrolimus that will result in a certain percentage of rheumatoid arthritis patients having an ACR20 response at 6 months.
Key Words: Compound Bayesian c-optimal criterion, Logit model, Multiple-objective optimal design, Non-missing probability function, Restricted design interval.
1 Introduction
Optimal experimental designs in the statistical literature usually assume that all responses are observed in the study. In practice, missing observations occur when subjects drop out of the study due to adverse events, lack of efficacy or other personal or clinical reasons. Consequently, the final design can be quite different from the original design and may be much less efficient than what was expected. For example, a balanced two-arm study will become inefficient if a very large percentage of subjects drops out from one of the arms. There are statistical methods continuously developed for analyzing missing data but design strategies for addressing potential missing observations are quite lacking. Baek (2005) proposed multiple-objective optimal designs that account for potentially missing observations in a quantal dose response study using a Bayesian paradigm. Throughout we assume that we know a priori that there are going to be missing observations and they are likely to occur in a specific pattern. Our goals are to construct optimal designs in a dose response study that incorporate the anticipated pattern of missingness and evaluate the loss in efficiencies if potential missing information were ignored at the design stage. Specifically we construct Bayesian optimal designs using independent uniform priors for the logit model parameters and for illustrative purposes, assume our goal is to estimate a single or a few percentiles in the logit distribution. In the next section, we describe the statistical setup and introduce the non-missing probability function that models the probability of missingness of any observation as a function of the dose level. Section 3 presents optimal designs for estimating a percentile or a set of percentiles for a few non-missing probability functions and efficiencies of optimal designs that do not incorporate missing information at the design stage. We call the latter designs conventional or usual optimal designs. 
As an application, we use the missing data pattern in a rheumatoid arthritis clinical trial conducted by Yocum et al. (2003) to construct Bayesian optimal designs for estimating dose levels of tacrolimus that would result in certain percentages of rheumatoid arthritis patients having an ACR20 response at 6 months. Conclusions and limitations of our approach are noted in Section 4.
2 Background
The simple logit model is a popular model in quantal dose response studies and takes the form

\[ \log \frac{\pi(x)}{1 - \pi(x)} = \beta(x - \alpha). \]

Here $\pi(x) = 1/\{1 + \exp[-\beta(x - \alpha)]\}$ is the probability of a response at dose $x \in \chi$ and $\chi$ is the dose range of interest (Abdelbasit and Plackett, 1983; Carr and Portier, 1993; Zhu et al., 1998). The parameter $\beta$ is the slope on the logit scale and the parameter $\alpha$ is the value of $x$ at which the response probability is 0.5. The parameter $\alpha$ is the median on the logit scale and is called the “median effective dose”. More generally, let $ED_{100\pi}$ be the dose level $x_0$ corresponding to a response probability $\pi$. It follows directly that $\pi(x_0) = 1/\{1 + \exp[-\beta(x_0 - \alpha)]\}$, and $ED_{100\pi}$ is a function of the model parameters $\theta^T = (\alpha, \beta)$, i.e.

\[ ED_{100\pi} = x_0 = \alpha + \frac{\gamma}{\beta}, \]

where $\gamma = \log[\pi/(1 - \pi)]$. We assume here and throughout that the total number of observations $n$ for the experiment is determined in advance, usually by cost or time constraints. Any design $\xi$ can be represented in the form

\[ \xi = \begin{pmatrix} x_1 & x_2 & \cdots & x_k \\ p_1 & p_2 & \cdots & p_k \end{pmatrix}, \]
where $k$ is the number of dose levels and $p_i$ is the proportion of subjects assigned to the dose level $x_i$. Each $x_i$ is selected from a pre-determined interval $\chi$ representing the dose range of interest. In practice, the design $\xi$ assigns roughly $n_i = np_i$ subjects to the dose level $x_i$, $i = 1, 2, ..., k$, subject to the condition that the $n_i$'s sum to the total number of patients $n$ in the trial. Such designs are called continuous designs and were first proposed by Kiefer in a series of papers beginning in the late 1950s. The main motivation for such designs is that for many problems, optimal designs can be found from simple algorithms and, in a relatively simple setup, analytical formulae for the designs are also available. These designs can then be used as a guide to practitioners in search of an efficient design. At the minimum, practitioners can use the optimal design as a benchmark and assess how far their design can stray from the optimal design without losing too much efficiency.

To incorporate the potential missing observation pattern at the design stage, we use a non-missing probability function. The value of this function at the point $x$ reflects the probability of realizing a response at the dose level $x$ in the study (Imhof, Song and Wong, 2002). Earlier on, Herzberg and Andrews (1976) used a somewhat similar idea to model the response at the point $x$ and assumed that ‘losses’ at different points are independent. Imhof, Song and Wong (2002) employed two types of non-missing probability functions. One of them is a ‘symmetric’ function and the other is a monotonic function. We put quotes around the word symmetric because this function is symmetric about some dose level only if the dose interval is appropriately chosen.

In this paper, we consider three types of non-missing probability functions that we feel are likely to be useful in practice: the logistic function, the exponential function and a class of ‘symmetric’ functions (Herzberg and Andrews, 1976; Imhof, Song and Wong, 2004). The logistic non-missing probability function has the form $\tau_{\text{logistic}}(x_i) = 1/\{1 + \exp[a(x_i - b)]\}$, where $a$ and $b$ are positive constants chosen to represent the probability of observing a response across the dose interval. The exponential non-missing probability function has the form $\tau_{\text{expo}}(x_i) = c - d\exp(ax_i - b)$, where $a$, $b$, $c$ and $d$ are constants chosen to model the probability of observing a response at $x$.
An example of a ‘symmetric’ non-missing probability function is $\tau_{\text{sym}}(x_i) = a(b - x_i)^2 + c$, where $a$, $b$ and $c$ are positive constants. The logistic and exponential non-missing probability functions are monotonic, so they can be used to model missingness in situations where increasingly large dosage levels result in either more dropouts or fewer dropouts. The former case is likely to occur when larger dose levels lead to more toxic effects and patients drop out. If the drug is effective at larger doses, fewer subjects are likely to drop out at higher dosage levels. The ‘symmetric’ function takes on a minimum value at $b$ and is symmetric about $b$ if the dose interval is symmetrically centered at $b$. Otherwise, the ‘symmetric’ function can be used to model a curvilinear missing pattern. In practice, this function is suitable in situations where the treatment target is well defined and its range is bounded. For instance, in stabilizing blood pressure, we expect the treatment to produce a clinical reading within a certain range and any reading outside the range is taken to mean the drug is not efficacious.

Following convention, we measure the worth of a design by its Fisher information matrix. Because of the potential missing data information that we wish to account for at the onset of the study, we modify the information matrix as follows. Suppose that $N_i$ observations are realized out of $n_i$ trials at the dose level $x_i$, $i = 1, ..., k$. The set of observations at $x_i$ is $(x_i, Y_{ij})$, $j = 1, ..., N_i$, where $Y_{ij} = 1$ if the response is a “success” and 0 otherwise. Under the logit model, the likelihood function is

\[ L(\theta \mid X, Y) = \prod_{i=1}^{k} \prod_{j=1}^{N_i} [\pi(x_i)]^{Y_{ij}} [1 - \pi(x_i)]^{1 - Y_{ij}}. \]
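As a rough illustration of the three non-missing probability functions and of how a continuous design translates into realized observations, the following Python sketch may help. The function names, the largest-remainder rounding rule and the constants in the usage lines are assumptions made for illustration only; none of them comes from the paper.

```python
import math
from typing import Callable, List, Sequence


def tau_logistic(x: float, a: float, b: float) -> float:
    """Logistic non-missing probability 1 / {1 + exp[a (x - b)]}."""
    return 1.0 / (1.0 + math.exp(a * (x - b)))


def tau_exponential(x: float, a: float, b: float, c: float, d: float) -> float:
    """Exponential non-missing probability c - d exp(a x - b)."""
    return c - d * math.exp(a * x - b)


def tau_symmetric(x: float, a: float, b: float, c: float) -> float:
    """'Symmetric' non-missing probability a (b - x)^2 + c."""
    return a * (b - x) ** 2 + c


def allocate(n: int, weights: Sequence[float]) -> List[int]:
    """Round n * p_i to integers summing to n (largest-remainder rule, an assumed choice)."""
    raw = [n * p for p in weights]
    counts = [math.floor(r) for r in raw]
    leftover = n - sum(counts)
    # Give the remaining subjects to the doses with the largest fractional parts.
    order = sorted(range(len(raw)), key=lambda i: raw[i] - counts[i], reverse=True)
    for i in order[:leftover]:
        counts[i] += 1
    return counts


def expected_realized(n: int, doses: Sequence[float], weights: Sequence[float],
                      tau: Callable[[float], float]) -> List[float]:
    """Expected number of observed responses n_i * tau(x_i) at each dose."""
    return [n_i * tau(x) for n_i, x in zip(allocate(n, weights), doses)]


# Hypothetical two-point design; a > 0 makes the non-missing probability fall with dose.
doses, weights = [0.0, 3.0], [0.41, 0.59]


def tau(x: float) -> float:
    return tau_logistic(x, a=0.5, b=2.0)


print(allocate(100, weights))
print(expected_realized(100, doses, weights, tau))
```

With a positive `a` the logistic function decreases in the dose, which corresponds to the toxicity-driven dropout case; a negative `a` would give the opposite pattern.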
If the non-missing probability function at the dose level $x_i$ is $\tau(x_i)$, or simply $\tau_i$, a direct calculation shows that the expected normalized Fisher information matrix is

\[ M'(\theta, \xi) = \begin{pmatrix} \beta^2 t' & -\beta t'(\bar{x}' - \alpha) \\ -\beta t'(\bar{x}' - \alpha) & s' + t'(\bar{x}' - \alpha)^2 \end{pmatrix}, \]

where $t' = \sum_{i=1}^{k} p_i w_i \tau_i$, $w_i = \pi(x_i)(1 - \pi(x_i))$, $\bar{x}' = (t')^{-1} \sum_{i=1}^{k} p_i w_i \tau_i x_i$ and $s' = \sum_{i=1}^{k} p_i w_i \tau_i (x_i - \bar{x}')^2$. We have deliberately used primes to distinguish these quantities from their counterparts in earlier work (Zhu and Wong, 2001, for example), which constructed multiple-objective Bayesian designs under the assumption that all responses are observed.

To fix ideas, we assume our interest in the study is to estimate a single percentile or multiple percentiles in a quantal dose response study. These are common objectives in a quantal dose response study; for example, the 50th percentile is $\alpha$ and corresponds to the dose where 50% of the subjects are expected to have a response (Kalish, 1990). See also Rosenberger and Grill (1997), who were interested in estimating the quantal dose response surface in a psychophysical experiment by estimating the 25th, 50th and 75th percentiles in a logit model. The dose in that experiment was the level of visual stimulus applied to the subject and the outcome was whether the subject reacted to the stimulus or not.

To capture the anticipated missing responses in the study, we modify the Bayesian optimality design criterion for estimating a single percentile in the logit model to

\[ \Phi'_{\gamma}(\xi) = E_{\theta}\{\beta^{-2}[(t')^{-1} + (\gamma - \beta(\bar{x}' - \alpha))^2 \beta^{-2} (s')^{-1}]\}. \]

Here the percentile is expressed in terms of $\gamma$ as discussed earlier. This criterion is convex as a function of $\xi$, so convex theory can be used to generate the optimal design using a standard computer algorithm. The optimality of the design can also be easily checked using an equivalence theorem. In the optimal design terminology, the resulting optimal design is sometimes also referred to as a Bayesian c-optimal design. Details on the different types of algorithms, convergence issues and equivalence theorems are discussed in design monographs such as Silvey (1980). If there are several objectives that are also convex in $\xi$, they can be combined into a single design criterion. For instance, if we wish to estimate several percentiles $ED_{100\pi_i}$, the compound optimality criterion is

\[ \Phi'(\xi \mid \gamma) = \sum_{i=1}^{m} \lambda_i \Phi'_{\gamma_i}(\xi). \qquad (1) \]
Here $\gamma_i = \log[\pi_i/(1 - \pi_i)]$, $i = 1, ..., m$, and $\lambda^T = (\lambda_1, ..., \lambda_m)$ is a vector of nonnegative weights with each $\lambda_i$ representing the relative interest in estimating $ED_{100\pi_i}$. Thus if estimating $ED_{100\pi_i}$ is a more important objective than estimating $ED_{100\pi_j}$, $\lambda_i$ should be larger than $\lambda_j$. Cook and Wong (1994) and Clyde and Chaloner (1996) showed that the choice of these weights can be meaningfully related to the efficiency requirements in a constrained optimal design problem. Once the value of $\lambda$ is selected, we find the desired multiple-objective optimal design by choosing values of $k$, $x_1, ..., x_k$, $p_1, ..., p_k$ that minimize the compound optimality criterion. Sometimes these designs are also called compound optimal designs. Because the compound optimality criterion is a convex combination of convex functions, it is also convex, and hence a standard computer algorithm for generating a single-objective optimal design can be applied to generate the multiple-objective optimal design. We omit details because they are similar to those given in Zhu (1996) for finding compound Bayesian optimal designs when all responses in the study are observed.
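The quantities defined in this section can be sketched numerically as follows. This is a minimal Python illustration, not the algorithm used in the paper: the function names, the example design and the midpoint-grid approximation of the prior expectation are all assumptions. The final lines check that the closed-form integrand agrees with $c^T M'^{-1} c$ for $c^T = (1, -\gamma/\beta^2)$, the gradient of $\alpha + \gamma/\beta$.

```python
import math
from typing import Callable, List, Sequence, Tuple

Design = List[Tuple[float, float]]  # (dose x_i, weight p_i) pairs
Tau = Callable[[float], float]


def logit_prob(x: float, alpha: float, beta: float) -> float:
    return 1.0 / (1.0 + math.exp(-beta * (x - alpha)))


def moments(design: Design, alpha: float, beta: float, tau: Tau):
    """Return (t', xbar', s') for the design under non-missing function tau."""
    terms = []
    for x, p in design:
        pi = logit_prob(x, alpha, beta)
        terms.append((x, p * pi * (1.0 - pi) * tau(x)))  # p_i w_i tau_i
    t = sum(v for _, v in terms)
    xbar = sum(x * v for x, v in terms) / t
    s = sum(v * (x - xbar) ** 2 for x, v in terms)
    return t, xbar, s


def info_matrix(design: Design, alpha: float, beta: float, tau: Tau):
    t, xbar, s = moments(design, alpha, beta, tau)
    d = xbar - alpha
    return [[beta ** 2 * t, -beta * t * d], [-beta * t * d, s + t * d ** 2]]


def local_criterion(design: Design, alpha: float, beta: float, gamma: float, tau: Tau) -> float:
    """Integrand of Phi'_gamma for one fixed theta = (alpha, beta)."""
    t, xbar, s = moments(design, alpha, beta, tau)
    return (1.0 / t + (gamma - beta * (xbar - alpha)) ** 2 / (beta ** 2 * s)) / beta ** 2


def bayes_criterion(design: Design, gamma: float, tau: Tau,
                    alpha_range: Tuple[float, float], beta_range: Tuple[float, float],
                    grid: int = 15) -> float:
    """Approximate E_theta with a midpoint grid over independent uniform priors (assumed method)."""
    (a_lo, a_hi), (b_lo, b_hi) = alpha_range, beta_range
    total = 0.0
    for i in range(grid):
        alpha = a_lo + (i + 0.5) * (a_hi - a_lo) / grid
        for j in range(grid):
            beta = b_lo + (j + 0.5) * (b_hi - b_lo) / grid
            total += local_criterion(design, alpha, beta, gamma, tau)
    return total / grid ** 2


def compound_criterion(design: Design, percentiles: Sequence[float], lambdas: Sequence[float],
                       tau: Tau, alpha_range, beta_range) -> float:
    """Weighted sum of single-percentile criteria, as in equation (1)."""
    return sum(lam * bayes_criterion(design, math.log(pi / (1.0 - pi)), tau, alpha_range, beta_range)
               for lam, pi in zip(lambdas, percentiles))


def c_variance(design: Design, alpha: float, beta: float, gamma: float, tau: Tau) -> float:
    """c' M'^{-1} c computed directly from the 2 x 2 information matrix."""
    (m11, m12), (_, m22) = info_matrix(design, alpha, beta, tau)
    c1, c2 = 1.0, -gamma / beta ** 2
    return (c1 * c1 * m22 - 2.0 * c1 * c2 * m12 + c2 * c2 * m11) / (m11 * m22 - m12 ** 2)


# Hypothetical example: equal weights at doses 0 and 3, logistic non-missing function.
def tau_example(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-0.33 * x + 0.94))


design = [(0.0, 0.5), (3.0, 0.5)]
gamma30 = math.log(0.3 / 0.7)
print(local_criterion(design, 2.8, 0.2, gamma30, tau_example))
print(c_variance(design, 2.8, 0.2, gamma30, tau_example))
```

The sketch needs at least two distinct support points with positive non-missing probability, since $s' = 0$ for a one-point design.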
3 Bayesian optimal designs for a quantal dose response study that incorporate potential missing data information
In this section, we generate Bayesian optimal designs for the quantal dose response logit model and compare Bayesian optimal designs with and without missing data information incorporated at the onset. The latter type of optimal design is sometimes referred to as the conventional or usual optimal design. For a given design problem, suppose $\xi'$ is the multiple-objective optimal design that incorporates missing data information and $\xi$ is the corresponding optimal design that does not incorporate potential missing information at the onset. The efficiency of the design $\xi$ is defined by $e'(\xi) = \Phi'(\xi')/\Phi'(\xi)$. If this ratio is close to unity, the usual optimal design can be used without much loss in efficiency when there are missing observations and they occur in the expected pattern.

Another useful way to compare the merits of incorporating potential missing information at the onset is to consider the relative efficiency of the multiple-objective design for each individual objective. For instance, if the goal is to estimate several percentiles, we first consider the design efficiency for estimating $ED_{100\pi_i}$. This efficiency is given by $e'_i(\xi) = \Phi'_{\gamma_i}(\xi')/\Phi'_{\gamma_i}(\xi)$, where $\gamma_i = \log[\pi_i/(1 - \pi_i)]$. The design $\xi$ is efficient if each $e'_i(\xi)$ is relatively high for $i = 1, ..., m$. One drawback of this method is that these efficiencies do not reflect the weights attached to each of the objectives.

As an example, consider finding a Bayesian optimal design for estimating ED30 in the logit model on an unrestricted dosage interval for a variety of non-missing probability functions. We assume independent uniform priors for $\alpha$ and $\beta$, and explain later how these priors are actually chosen. We consider an unrestricted dose interval first because the standard algorithm used to generate the optimal design does not allow constraints on the design space. Depending on the setup and the types of non-missing probability functions assumed, the optimal design generated from the standard algorithm may or may not be admissible. Nevertheless the algorithm is simple to implement and usually serves as a first resort for finding an optimal design. When the standard algorithm was run, we obtained a 2-point conventional optimal design supported at −7.80 and 0.48 with mass at $x = −7.80$ equal to 0.41. This means that the usual optimal design for estimating ED30 assigns about 41% of the patients to dose −7.80 and the rest to the dose $x = 0.48$. Clearly this design is not admissible unless we replace the negative dose level by 0. The resulting design then requires 41% of the patients to be assigned to the placebo group and the rest to the dose $x = 0.48$.

When potential missing observation information was included in the design criterion, the algorithm was rerun and we obtained Bayesian optimal designs for the different types of non-missing probability functions listed in Table 2(a). Figure 1 displays the non-missing probability functions that were used to generate the designs in the table. For example, the Bayesian optimal design for estimating ED30 is supported at 0.74 and 21.90 with mass at 0.74 equal to 0.84 for the exponential non-missing probability function. The extreme right column of the table shows the relative efficiencies of using the usual optimal design (with the negative dose level replaced by 0) for estimating ED30. The conclusion is that the resulting usual optimal designs are extremely inefficient because all the efficiencies are near 0. Interestingly, the results for estimating ED50 under a similar setup are reversed. The usual Bayesian optimal design in this case tells us to assign an equal number of subjects to the two doses 2.03 and 3.58.
The Bayesian optimal designs for estimating ED50 under the various non-missing probability functions are all nearly single-dose designs, assigning practically all subjects to a dose near the average dose level of the conventional optimal design.

Yocum et al. (2003) studied the efficacy and safety of tacrolimus in patients with rheumatoid arthritis. A total of 464 patients were randomized in equal allocation to receive a single daily oral dose of placebo, tacrolimus 2 mg, or tacrolimus 3 mg for 6 months. The maximum dose of 3 mg was imposed because of safety concerns (Furst et al., 2002). Accordingly, we limit the dose interval to between 0 and 3, i.e. $\chi = [0, 3]$. In this clinical trial, efficacy of the treatment is assessed monthly using the American College of Rheumatology (ACR) definitions of improvement (ACR20, ACR50 and ACR70 response). These are composite binary criteria that require an increasing burden of evidence to measure improvement in patients. For example, an ACR20 response requires a patient to have a 20% reduction in the number of swollen and tender joints, and a reduction of 20% in three of the following five parameters: physician global assessment of disease, patient global assessment of disease, patient assessment of pain, C-reactive protein or erythrocyte sedimentation rate, and health assessment questionnaire score. The corresponding ACR50 criterion is the same as ACR20 except that the 20% improvement is replaced by 50%, and so on.

For definiteness, we use ACR20 at 6 months as our binary response in the simple logit model with $x$ representing the dosage level. The number of patients with an ACR20 response at the month 6 visit was 16, 29 and 41 for the placebo, tacrolimus 2 mg and tacrolimus 3 mg groups, respectively. Of the 157 patients who received placebo, 45 completed the study and 112 withdrew because of adverse events, lack of efficacy or administrative reasons. Of the 154 patients who received tacrolimus 2 mg, 64 completed the study and 90 withdrew, and of the 153 patients who received tacrolimus 3 mg, 80 completed the study and 73 withdrew, for the same reasons as in the placebo group. Table 1 summarizes the dose response (ACR20) and the missing data information from Yocum et al. (2003).

[Table 1 about here.]

We can design more efficient future studies by incorporating the expected missingness pattern using the information in Table 1. It is clear that we should use a non-missing probability function that is monotonically increasing in the dosage level. There are clearly many possibilities on the interval [0, 3]. We focus on two simple monotonic non-missing probability functions, the logistic and the exponential functions. We fit the pattern of missing data in Table 1 from Yocum et al. (2003) using the SAS procedure PROC LOGISTIC and the MATLAB function FIT and obtain

Logistic model: $\tau(x_i) = 1.0/[1.0 + \exp(-.33x_i + .94)]$
Exponential model: $\tau(x_i) = 1 - .6151\exp(-.12468x_i + .1568)$

In addition, we consider the following non-missing probability functions to briefly investigate the sensitivity of the optimal design to misspecification of the non-missing probability function.

Logistic model 1 with a slope .66: $\tau(x_i) = 1.0/\{1.0 + \exp[2(-.33x_i + .94)]\}$.
Logistic model 2 with a slope .99: $\tau(x_i) = 1.0/\{1.0 + \exp[3(-.33x_i + .94)]\}$.
Symmetric model: $\tau(x_i) = -.006791(7 - x_i)^2 + .6121$

The motivation for the first two is that we want to investigate the change in the design efficiency of the conventional optimal design relative to our proposed design when the slope coefficient in the non-missing probability function is mis-specified. These two logistic non-missing probability functions have larger slopes of .66 and .99 than the fitted logistic model, which has a slope coefficient of .33. The larger slopes imply that non-missing probabilities are smaller at the low dosage levels and become larger as the dosage levels increase. We include the ‘symmetric’ non-missing function to compare optimal designs obtained from very different non-missing probability functions.

A simple way to obtain prior distributions for the two model parameters $\alpha$ and $\beta$ in the logit model is to use independent uniform priors centered at their maximum likelihood estimates (MLEs) $\hat{\alpha}$ and $\hat{\beta}$. Using the ACR20 response data in Table 1, we obtain $\hat{\alpha} = 2.8049$ and $\hat{\beta} = .2148$. Additionally, we felt it reasonable to use a range of 0.2 for both uniform prior distributions. The resulting two prior distributions are $\alpha \sim U[2.7049, 2.9049]$ and $\beta \sim U[.1148, .3148]$.

[Figure 1 about here.]

Using these prior distributions, we now find conventional optimal designs and the proposed optimal designs using the above non-missing probability functions for estimating ED30 or ED50. The added complexity here is that the optimal design is constrained to a user-specified dose interval of [0, 3] for safety reasons. Patients in the placebo group correspond to $x = 0$, and the two treatment groups correspond to $x = 2$ and $x = 3$. This realistic constraint has non-trivial technical implications and we omit details for space considerations. Details, including analytical construction of the optimal designs on a restricted interval, are discussed in Biedermann et al. (2006).
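The fitted quantities above can be checked approximately with a short Python sketch. Two assumptions are made here that the paper does not state explicitly: the logit model for ACR20 is fitted with the completed patients as binomial denominators, and the fit uses a plain Newton-Raphson iteration rather than SAS or MATLAB. Under those assumptions the estimates come out close to the reported $\hat{\alpha}$ and $\hat{\beta}$, and the fitted non-missing functions track the observed completion proportions.

```python
import math

# Table 1 (Yocum et al., 2003): dose in mg, patients assigned, completed, ACR20 responders.
doses = [0.0, 2.0, 3.0]
assigned = [157, 154, 153]
completed = [45, 64, 80]
acr20 = [16, 29, 41]


def tau_logistic(x: float) -> float:
    """Fitted logistic non-missing probability function from the text."""
    return 1.0 / (1.0 + math.exp(-0.33 * x + 0.94))


def tau_exponential(x: float) -> float:
    """Fitted exponential non-missing probability function from the text."""
    return 1.0 - 0.6151 * math.exp(-0.12468 * x + 0.1568)


def fit_logit(x, n, y, iterations=25):
    """Newton-Raphson MLE of logit pi(x) = b0 + b1 * x for grouped binomial data."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, ni, yi in zip(x, n, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = ni * p * (1.0 - p)
            g0 += yi - ni * p          # score for the intercept
            g1 += xi * (yi - ni * p)   # score for the slope
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Assumption: completers are the denominators for the ACR20 response.
b0, b1 = fit_logit(doses, completed, acr20)
beta_hat = b1
alpha_hat = -b0 / b1  # because b0 + b1 * x = beta * (x - alpha)

# Uniform priors of total width 0.2 centred at the estimates.
alpha_prior = (alpha_hat - 0.1, alpha_hat + 0.1)
beta_prior = (beta_hat - 0.1, beta_hat + 0.1)

print("alpha_hat, beta_hat:", round(alpha_hat, 4), round(beta_hat, 4))
for x, n_i, c_i in zip(doses, assigned, completed):
    print(x, round(c_i / n_i, 3), round(tau_logistic(x), 3), round(tau_exponential(x), 3))
```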
All the resulting Bayesian optimal designs in the setup of Yocum et al. (2003) have two points regardless of which non-missing probability function is assumed. These designs are shown in Table 2(b) in a format similar to that of Table 2(a). It is interesting to note that when we estimate ED50, the conventional optimal design is very efficient relative to our proposed designs regardless of the type of non-missing probability function considered here. This implies that the conventional optimal designs can be used without much loss in efficiency when missing observations occur in the expected pattern. Results not presented here, however, show that when we estimate a lower percentile such as ED10, the conventional optimal designs become noticeably less efficient.

[Table 2(a) about here.]

[Table 2(b) about here.]

We can gain some insight into the sensitivity of the conventional optimal design to misspecification of $\tau(x)$. Consider the case when the expected missing pattern follows the logistic non-missing probability function and our goal is to estimate ED50 over an unrestricted interval. Suppose the slope coefficient is 100% larger than its true value of 0.33, that is, assume the slope coefficient is .66 in the logistic non-missing probability model. Our calculation shows that the efficiency of the conventional optimal design is about 93%. If we mis-specify the slope coefficient further, making it 200% larger so that the slope coefficient is .99 in the logistic model, the efficiency drops to about 83%. This shows that misspecification of the non-missing probability function can affect the efficiency of the conventional optimal design. The corresponding results on the restricted dose interval [0, 3] are reported in Table 2(b) for comparison. We observe that when we limit the dose level to between 0 and 3, the efficiency of the conventional optimal design becomes .71 for the logistic non-missing probability function with a slope coefficient equal to .99. This low efficiency is likely caused by the difference in the non-missing probabilities at the two support points in the two designs.

We next consider finding Bayesian multiple-objective optimal designs for estimating several percentiles simultaneously. We assume the two percentiles of interest are ED20 and ED50.
We may have different levels of interest in these two percentiles, and this is reflected in the choice of the weight vector $\lambda$. For illustrative purposes, we consider $\lambda^T = (.2, .8)$, $(.5, .5)$ and $(.8, .2)$. The resulting compound optimal designs are 2-point designs when we use the priors $\alpha \sim U[2.7049, 2.9049]$ and $\beta \sim U[.1148, .3148]$, whether missing information is incorporated at the onset or not. From Table 3(a), we observe that the change in the design points and the weights is similar to that seen for the optimal designs for estimating a single percentile in Table 2(a). When we have more interest in estimating ED20 than in estimating ED50, as in the case when $\lambda^T = (.8, .2)$, we observe generally lower efficiencies than when $\lambda^T = (.5, .5)$ or $(.2, .8)$.

[Table 3(a) about here.]

Table 3(b) reports optimal designs on the restricted dose interval [0, 3]. It is noticeable that the 2-point compound optimal designs are at the boundary of the interval in most cases. This result confirms again that the proposed optimal designs have generally lower efficiencies when we have the logistic non-missing probability function with the slope .99. It is about 78% efficient when we estimate the percentiles ED20 and ED50 with the interest $\lambda^T = (.2, .8)$ and about 81% efficient when we estimate ED20 and ED50 with equal interest.

[Table 3(b) about here.]

Table 4 compares the merits of the design in Yocum's study relative to our proposed designs on the restricted dose interval [0, 3]. Overall, Yocum's equal allocation design is a bit more than 50% efficient for estimating ED30 or ED50. This means that if estimating either of these percentiles is the study's goal, Yocum's design would require about twice as many patients to estimate these percentiles with the same precision available from our optimal designs. Of course this is not a very fair comparison because Yocum's design was not constructed to estimate the two percentiles in the first place. Nevertheless, Yocum's design furnished us with valuable dropout rates as the dose levels vary. The upshot is that we are able to use the missing data information and information on the ACR20 response rates to construct prior distributions for $\alpha$ and $\beta$ and design an improved study for estimating percentiles.

[Table 4 about here.]
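Two numerical remarks in this section can be illustrated with a few lines of Python. The sketch first evaluates the logistic non-missing probability, with the slope multiplied by 1, 2 or 3, at the support points of the conventional ED50 design and at the first support point of each proposed ED50 design from Table 2(a). It then converts an efficiency into the sample size a less efficient design would need, using the usual reading that a variance-type criterion scales as $1/n$. The function names are hypothetical and the reference sample size of 100 is an arbitrary illustration.

```python
import math


def tau_logistic(x: float, slope_multiplier: float = 1.0) -> float:
    """Logistic non-missing function 1 / {1 + exp[k (-.33 x + .94)]} for k = 1, 2, 3."""
    return 1.0 / (1.0 + math.exp(slope_multiplier * (-0.33 * x + 0.94)))


def required_n(n_optimal: int, efficiency: float) -> int:
    """Subjects needed by a design with the given efficiency to match an optimal design with n_optimal."""
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must lie in (0, 1]")
    return math.ceil(n_optimal / efficiency)


# Support points taken from Table 2(a), ED50, unrestricted interval.
conventional_points = [2.03, 3.58]
proposed_first_point = {1.0: 2.80, 2.0: 2.88, 3.0: 3.23}

for k, x1 in proposed_first_point.items():
    at_conventional = [round(tau_logistic(x, k), 2) for x in conventional_points]
    print("slope multiplier", k, "conventional:", at_conventional,
          "proposed x1:", round(tau_logistic(x1, k), 2))

# An efficiency of about one half roughly doubles the number of patients required.
print(required_n(100, 0.5))
```

As the slope grows, the non-missing probability at the lower conventional support point 2.03 falls well below that at the proposed support point, which is consistent with the drop in efficiency described above.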
4 Conclusions
We propose Bayesian c-optimal designs that incorporate the expected missing data pattern in the study. We also investigate the efficiencies of conventional optimal designs relative to the proposed optimal designs. In general, these efficiencies depend on the severity of the missingness, the type of the missing pattern, the objectives of the study and the prior distributions. Because studies are increasingly expensive, it would be prudent to evaluate the robustness of any optimal design to various model misspecifications before we implement the design.

We focus on estimating percentiles but it should be emphasized that the design techniques are flexible and are applicable to many other types of design problems. In particular, the response may be multinomial or continuous and the goals can be to estimate parameters (D-optimality), or to extrapolate or estimate the response surface, to name just a few. A particularly attractive feature of working with optimal designs is that they can be quickly found and, in some problems, analytical formulae for the optimal designs are also available. Closed-form formulae for the optimal designs would greatly facilitate the study of robustness properties of the design to model misspecification.

There are some limitations to our proposed approach. On ethical grounds, many dose response designs used in practice are adaptive in the sense that previous patients' responses are considered before experimenting with another dose. Our proposed designs are not sequential or adaptive and so are best used when the treatment involved is slow-acting and responses are observed months later. This situation is quite common in rheumatoid arthritis clinical trials, where adaptive trials would be costly in terms of time and effort. Another potential drawback of continuous designs is that we assume a range of dose levels is available for experimentation. In practice, pharmaceutical companies have drugs made in a limited number of dose strengths and are unwilling to make drugs at user-specified strength levels.
Our proposed designs should therefore, like all types of optimal designs, be used as a rough guide as to where the optimal doses are and how many patients to assign to each dose to ascertain the treatment effects accurately and with minimal cost.

Another limitation of our approach is the assumption of a known non-missing probability function. As our results show, our proposed optimal designs can be sensitive to the specification of the non-missing probability function, which is not known precisely in most situations. Previous research that assumes the response is a continuous variable provides some evidence that the optimal design is not too sensitive to the non-missing probability function specification, as long as its form is correctly specified (Imhof, Song and Wong, 2002, 2004). For example, if $\tau(x) = \exp(\theta x)$, the cost of working with a wrong value of $\theta$ is less consequential than misspecifying the form of $\tau$. We are currently studying the impact of misspecification of $\tau$ on the proposed optimal design in a more comprehensive setup.
References

[1] Abdelbasit, K.M., Plackett, R.L. (1983). Experimental design for binary data. J. Amer. Stat. Assoc. 78:90-98.
[2] Baek, I. (2005). Optimal designs accounting for potential missing trials in quantal dose responses. Ph.D. Dissertation, Department of Applied Mathematics and Statistics, State University of New York at Stony Brook.
[3] Biedermann, S., Dette, H., Zhu, W. (2006). Optimal designs for dose-response models with restricted design spaces. J. Amer. Stat. Assoc. In press.
[4] Carr, G.J., Portier, C.J. (1993). An evaluation of some methods for fitting dose-response models to quantal-response developmental toxicology data. Biometrics 49:779-791.
[5] Clyde, M., Chaloner, K. (1996). The equivalence of constrained and weighted design in multiple objective design problems. J. Amer. Stat. Assoc. 91:1236-1244.
[6] Cook, R.D., Wong, W.K. (1994). On the equivalence of constrained and compound optimal design. J. Amer. Stat. Assoc. 89:687-692.
[7] Furst, D.E., Saag, K., Fleischmann, M.R., Sherrer, Y., Block, J.A., Schnitzer, T., Rutstein, J., Baldassare, A., Kaine, J. (2002). Efficacy of tacrolimus in rheumatoid arthritis patients who have been treated unsuccessfully with methotrexate. Arthritis and Rheumatism 46:2020-2028.
[8] Herzberg, A.M., Andrews, D.F. (1976). Some considerations in the optimal design of experiments in non-optimal situations. J. R. Stat. Soc. B 38:284-289.
[9] Imhof, L.A., Song, D., Wong, W.K. (2002). Optimal design for experiments with possibly failing trials. Stat. Sinica 12:1145-1155.
[10] Imhof, L.A., Song, D., Wong, W.K. (2004). Optimal design of experiments with anticipated pattern of missing observations. J. Theor. Biol. 228:251-260.
[11] Kalish, L.A. (1990). Efficient design for estimation of median lethal dose and quantal dose-response curves. Biometrics 46:737-748.
[12] Rosenberger, W.F., Grill, S.E. (1997). A sequential design for psychophysical experiments: an application to estimating timing of sensory events. Stat. Med. 16:2245-2260.
[13] Silvey, S.D. (1980). Optimum design. Chapman and Hall, London.
[14] Yocum, D.E., Furst, D.E., Kaine, J.L., Baldassare, A.R., Stevenson, J.T., Borton, M., Mengle-Gaw, L.J., Schwartz, B.D., Wisemandle, W., Mekki, Q.A. (2003). Efficacy and safety of tacrolimus in patients with rheumatoid arthritis. Arthritis and Rheumatism 48:3328-3337.
[15] Zhu, W. (1996). On the optimal design of multiple-objective clinical trials and quantal dose-response experiments. Ph.D. Dissertation, Department of Biostatistics, School of Public Health, UCLA.
[16] Zhu, W., Ahn, H., Wong, W.K. (1998). Multiple-objective optimal designs for the logit model. Comm. Stat. Theory Methods 27:1581-1592.
[17] Zhu, W., Wong, W.K. (2001). Bayesian optimal designs for estimating a set of symmetrical quantiles. Stat. Med. 20:123-137.
[Figure 1 plot. Vertical axis: Probability (0 to 1). Horizontal axis: Dose Levels (−2 to 16). Curves: dose response model (dose response probability); logistic non-missing model, logistic non-missing model 1, logistic non-missing model 2, exponential non-missing model and symmetric non-missing model (non-missing probability).]
Figure 1: Dose response model and various non-missing probability functions: the logistic model $\tau(x_i) = 1.0/[1.0 + \exp(-.33x_i + .94)]$, the logistic model 1 $\tau(x_i) = 1.0/\{1.0 + \exp[2(-.33x_i + .94)]\}$, the logistic model 2 $\tau(x_i) = 1.0/\{1.0 + \exp[3(-.33x_i + .94)]\}$, the exponential model $\tau(x_i) = 1 - .6151\exp(-.12468x_i + .1568)$ and the symmetric model $\tau(x_i) = -.006791(7 - x_i)^2 + .6121$.
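The five non-missing probability functions in the Figure 1 caption can be evaluated directly. The following is a minimal Python sketch of those formulas; the function names and the `scale` argument are illustrative choices made here, not notation from the paper.

```python
import math


def tau_logistic(x, scale=1.0):
    """Logistic non-missing probability from the Figure 1 caption.

    scale=1 gives the logistic model; scale=2 and scale=3 give
    logistic models 1 and 2 (the `scale` name is an assumption).
    """
    return 1.0 / (1.0 + math.exp(scale * (-0.33 * x + 0.94)))


def tau_exponential(x):
    """Exponential non-missing probability from the Figure 1 caption."""
    return 1.0 - 0.6151 * math.exp(-0.12468 * x + 0.1568)


def tau_symmetric(x):
    """Symmetric (quadratic) non-missing probability, peaking at dose 7."""
    return -0.006791 * (7.0 - x) ** 2 + 0.6121


if __name__ == "__main__":
    # Evaluate every function at the end points of the restricted interval [0, 3].
    for dose in (0.0, 3.0):
        values = {
            "exponential": tau_exponential(dose),
            "symmetric": tau_symmetric(dose),
            "logistic": tau_logistic(dose),
            "logistic 1": tau_logistic(dose, scale=2),
            "logistic 2": tau_logistic(dose, scale=3),
        }
        print(dose, {name: round(v, 3) for name, v in values.items()})
```

At doses 0 and 3 these functions return values close to the τ1 and τ2 entries tabulated for the designs supported on the restricted interval [0, 3].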
Table 1: Dose response and missing data information of patients with rheumatoid arthritis in Yocum et al. (2003)

Dose level         Assigned patients   Completed patients   ACR20 success
placebo            157                 45                   16
2 mg tacrolimus    154                 64                   29
3 mg tacrolimus    153                 80                   41
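As a small arithmetic illustration of Table 1, the sketch below computes the observed proportion of assigned patients who completed the study at each dose, and the proportion of ACR20 successes among completers. Taking completers as the denominator for the response proportion is an assumption made here for illustration; the dictionary layout and function names are hypothetical.

```python
# Counts transcribed from Table 1 (Yocum et al., 2003).
TABLE_1 = {
    "placebo": {"assigned": 157, "completed": 45, "acr20": 16},
    "2 mg tacrolimus": {"assigned": 154, "completed": 64, "acr20": 29},
    "3 mg tacrolimus": {"assigned": 153, "completed": 80, "acr20": 41},
}


def completion_proportion(dose_label):
    """Observed share of assigned patients who completed the study."""
    row = TABLE_1[dose_label]
    return row["completed"] / row["assigned"]


def response_among_completers(dose_label):
    """ACR20 successes divided by completers (denominator is an assumption)."""
    row = TABLE_1[dose_label]
    return row["acr20"] / row["completed"]


if __name__ == "__main__":
    for label in TABLE_1:
        print(
            label,
            round(completion_proportion(label), 3),
            round(response_among_completers(label), 3),
        )
```

The completion proportions increase with dose, which is the kind of pattern a dose-dependent non-missing probability function is meant to describe.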
Table 2(a): Bayesian c-optimal designs that incorporated various non-missing probability functions for estimating a single percentile ED30 or ED50. The prior distributions for the model parameters are α ∼ U[2.7049, 2.9049] and β ∼ U[.1148, .3148]. Efficiencies of conventional Bayesian c-optimal designs (ξ) relative to each proposed optimal design are given in the extreme right column.

Conventional optimal designs (dose points with weights beneath):

ED30   †ξ   dose     −7.80    .48
            weight     .41    .59
       ‡ξ   dose         0    .48
            weight     .41    .59
ED50    ξ   dose      2.03   3.58
            weight     .50    .50

ED30
Design   x1     x2      p1    τ1    τ2    e0(‡ξ)
1        0.74   21.90   .84   .34   .95   .01
2        0.29   12.70   .83   .31   .39   .01
3        0.64   22.00   .85   .33   .99   .01
4        2.72   19.90   .72   .48   .99   .01
5        3.27   19.60   .68   .60   .99   .00

ED50
Design   x1     x2      p1    τ1    τ2    e0(ξ)
1        2.79   13.20   .99   .49   .86   .99
2        2.76    8.71   .99   .49   .59   .99
3        2.80   14.00   .99   .50   .98   .99
4        2.88   17.20   .99   .51   .99   .93
5        3.23   17.70   .95   .59   .99   .83

1 optimal design using the exponential non-missing probability function
2 optimal design using the symmetric non-missing probability function
3 optimal design using the logistic non-missing probability function
4 optimal design using the logistic non-missing probability function with a slope of .66
5 optimal design using the logistic non-missing probability function with a slope of .99
† conventional Bayesian c-optimal design
‡ design substituted 0 for the negative dose point in the conventional Bayesian c-optimal design
Table 2(b): Bayesian c-optimal designs that incorporated various non-missing probability functions for estimating a single percentile ED30 or ED50 in the restricted dose interval [0, 3]. The prior distributions for the model parameters are α ∼ U[2.7049, 2.9049] and β ∼ U[.1148, .3148]. Efficiencies of conventional Bayesian c-optimal designs (ξ) relative to each proposed optimal design are given in the extreme right column.

Conventional optimal designs (dose points with weights beneath):

ED30   †ξ   dose       0      3
            weight   .69    .31
ED50    ξ   dose       0      3
            weight   .07    .93

ED30
Design   x1     x2   p1    τ1    τ2    e0(ξ)
1        0      3    .74   .28   .51   .97
2        0      3    .75   .28   .50   .92
3        0      3    .75   .28   .51   .92
4        0      3    .81   .13   .52   .89
5        .21    3    .85   .07   .54   .81

ED50
Design   x1     x2   p1    τ1    τ2    e0(ξ)
1        1.46   3    .15   .40   .50   .97
2        1.51   3    .15   .41   .50   .97
3        1.34   3    .14   .38   .51   .97
4        1.97   3    .23   .36   .53   .89
5        2.24   3    .30   .35   .54   .71

1 optimal design using the exponential non-missing probability function
2 optimal design using the symmetric non-missing probability function
3 optimal design using the logistic non-missing probability function
4 optimal design using the logistic non-missing probability function with a slope of .66
5 optimal design using the logistic non-missing probability function with a slope of .99
† conventional Bayesian c-optimal design
Table 3(a): Bayesian compound c-optimal designs that incorporated various non-missing probability functions for estimating ED20 and ED50 with weight λT. The prior distributions for the model parameters are α ∼ U[2.7049, 2.9049] and β ∼ U[.1148, .3148]. Efficiencies of conventional Bayesian compound c-optimal designs relative to each proposed optimal design are given in the extreme right columns.

Conventional optimal designs (dose points with weights beneath):

λT = (.2, .8)   †ξ   dose       0   7.68
                     weight   .57    .43
λT = (.5, .5)    ξ   dose       0   8.39
                     weight   .67    .33
λT = (.8, .2)    ξ   dose       0   7.47
                     weight   .76    .24

λT = (.2, .8)
Design   x1, x2        p1    τ1, τ2     e01(ξ)   e02(ξ)   e0
1        1.84, 18.00   .78   .43, .92   .59      .89      .71
2        1.08, 11.50   .74   .37, .47   .82      .92      .86
3        2.06, 18.00   .79   .44, .99   .61      .90      .73
4        3.15, 18.70   .74   .55, .99   .34      .45      .38
5        3.49, 18.70   .71   .65, .99   .14      .19      .16

λT = (.5, .5)
Design   x1, x2        p1    τ1, τ2     e01(ξ)   e02(ξ)   e0
1        1.27, 19.23   .75   .39, .94   .54      1.03     .61
2        0.57, 12.20   .73   .33, .43   .77      1.07     .81
3        1.40, 19.10   .75   .38, .99   .56      1.05     .63
4        2.94, 19.30   .68   .52, .99   .32      .50      .34
5        3.38, 19.20   .64   .63, .99   .14      .21      .15

λT = (.8, .2)
Design   x1, x2        p1    τ1, τ2     e01(ξ)   e02(ξ)   e0
1        .91, 20.20    .74   .36, .94   .49      1.21     .52
2        .33, 12.50    .73   .31, .41   .71      1.21     .73
3        .90, 20.00    .75   .34, .99   .51      1.28     .54
4        2.80, 19.70   .65   .49, .99   .31      .63      .32
5        3.32, 19.50   .61   .61, .99   .14      .27      .14

1 optimal design using the exponential non-missing probability function
2 optimal design using the symmetric non-missing probability function
3 optimal design using the logistic non-missing probability function
4 optimal design using the logistic non-missing probability function with a slope of .66
5 optimal design using the logistic non-missing probability function with a slope of .99
† conventional Bayesian compound c-optimal design with 0 substituted for the negative dose point
Table 3(b): Bayesian compound c-optimal designs that incorporated various non-missing probability functions for estimating ED20 and ED50 with weight λT in the restricted dose interval [0, 3]. The prior distributions for the model parameters are α ∼ U[2.7049, 2.9049] and β ∼ U[.1148, .3148]. Efficiencies of conventional Bayesian compound c-optimal designs (ξ) relative to each proposed optimal design are given in the extreme right columns.

Conventional optimal designs (dose points with weights beneath):

λT = (.2, .8)   †ξ   dose       0      3
                     weight   .52    .48
λT = (.5, .5)    ξ   dose       0      3
                     weight   .58    .42
λT = (.8, .2)    ξ   dose       0      3
                     weight   .60    .40

λT = (.2, .8)
Design   x1    x2   p1    τ1, τ2     e01(ξ)   e02(ξ)   e0
1        0     3    .65   .28, .50   .94      1.20     .98
2        0     3    .65   .28, .50   .94      1.20     .98
3        0     3    .65   .28, .51   .93      1.20     .98
4        0     3    .73   .13, .53   .84      1.53     .90
5        .32   3    .79   .08, .54   .73      1.83     .78

λT = (.5, .5)
Design   x1    x2   p1    τ1, τ2     e01(ξ)   e02(ξ)   e0
1        0     3    .65   .28, .50   .97      1.20     .98
2        0     3    .65   .28, .50   .97      1.20     .98
3        0     3    .65   .28, .51   .97      1.20     .98
4        0     3    .73   .13, .53   .89      1.54     .91
5        .24   3    .79   .07, .54   .80      1.93     .81

λT = (.8, .2)
Design   x1    x2   p1    τ1, τ2     e01(ξ)   e02(ξ)   e0
1        0     3    .67   .28, .50   .98      1.21     .98
2        0     3    .67   .28, .50   .98      1.21     .98
3        0     3    .67   .28, .51   .98      1.21     .98
4        0     3    .75   .13, .53   .91      1.59     .92
5        .21   3    .80   .07, .54   .82      1.93     .82

1 optimal design using the exponential non-missing probability function
2 optimal design using the symmetric non-missing probability function
3 optimal design using the logistic non-missing probability function
4 optimal design using the logistic non-missing probability function with a slope of .66
5 optimal design using the logistic non-missing probability function with a slope of .99
† conventional Bayesian compound c-optimal design
Table 4: Efficiency of Yocum's design (ξo) relative to the proposed optimal designs for estimating single percentiles using the Bayesian c-optimal criterion that incorporates non-missing probabilities. The prior distributions for the model parameters are α ∼ U[2.7049, 2.9049] and β ∼ U[.1148, .3148].

Yocum's design, used for both ED30 and ED50 (dose points with weights beneath):

ξo   dose       0     2     3
     weight   1/3   1/3   1/3

ED30
Design   x1     x2   p1    τ1    τ2    e0(ξo)
†        0      3    .69   1     1     .59
1        0      3    .74   .28   .51   .55
2        0      3    .75   .28   .50   .52
3        0      3    .75   .28   .51   .52
4        0      3    .81   .13   .52   .51
5        .21    3    .85   .07   .54   .55

ED50
Design   x1     x2   p1    τ1    τ2    e0(ξo)
†        0      3    .07   1     1     .55
1        1.46   3    .15   .40   .50   .54
2        1.51   3    .15   .41   .50   .54
3        1.34   3    .14   .38   .51   .54
4        1.97   3    .23   .36   .53   .54
5        2.24   3    .30   .35   .54   .54

1 optimal design using the exponential non-missing probability function
2 optimal design using the symmetric non-missing probability function
3 optimal design using the logistic non-missing probability function
4 optimal design using the logistic non-missing probability function with a slope of .66
5 optimal design using the logistic non-missing probability function with a slope of .99
† conventional Bayesian c-optimal design