A Principal Components Approach to Assign Confidence Intervals in Steady-State Simulation Seyed Taghi Akhavan Niaki, Ph.D. Associate Professor of Industrial Engineering, Sharif University of Technology P.O. Box 11365-9414 Azadi Ave., Tehran Iran Phone: (+98-21) 6165740, Fax: (+98-21) 600-5116, E-mail:
[email protected]
Wafik H. Iskander, Ph.D. Senior Member of IIE Professor of Industrial & Management Systems Engineering College of Engineering and Mineral Resources West Virginia University P.O. Box 6070, Morgantown, WV 26506-6070 Phone: (304) 293-4607 ext.3710, Fax: (304) 293-4970 E-mail:
[email protected]
Abstract This paper presents a new approach for assigning a confidence interval to the mean of a stream of autocorrelated output data from a single simulation run. Based on the principal components analysis method, the approach is to derive a linear transformation of the data that yields approximate independence of the transformed data. To aid convergence to normality, on which the confidence interval is based, and to keep the dimension of the transformation reasonable, the original output data are batched prior to estimating the transformation. The approach is fairly simple to understand and to implement, and the experimental results indicate that it performs better than the non-overlapping batch-means method in terms of coverage for any number of batches. Key Words: Simulation Output Analysis, Confidence Intervals, Principal Components, Batch Means, Linear Transformation
1-Introduction and Statement of the Problem Analysis of the output data from a discrete-event simulation model is one of the important components of simulation studies. Managers and systems analysts are usually interested in making inferences about properties of the process being simulated. In this regard, building confidence intervals (C.I.'s) on the mean value of key output variables of interest, usually defined in terms of some system performance measure, is the most fundamental and widely applied task. Examples of key output variables are: the expected queue waiting time, the expected queue length, or the expected system throughput. In making inferences about the system's performance, we should address two issues. First, how do we obtain good estimators of these performance measures? Second, how do we measure the variance of these estimators and provide precise estimators of the variances? The sample mean usually acts as a prototype estimator of the population mean in simulation experiments. The reliability of the sample mean as an estimator of the population mean depends on its variance. Moreover, an estimate of the variance of the sample mean is also crucial in making C.I.'s for the population mean.
In most cases the output data from a simulation experiment are autocorrelated and come from some systems that are not in steady state. This makes the C.I. procedure quite complicated, because it is difficult to estimate the variability of the sample mean in this case. Furthermore, direct application of classical statistical methods is precluded. Solution methods currently available for this problem are either too complicated or may require a considerable amount of data and/or assumptions. This paper develops a new method for constructing C.I.’s for the mean of output data from a single steady-state simulation run. Since in some cases collecting output data from a simulation model might be quite expensive, the method should not require a relatively large number of observations. Also, compared to other C.I. methods, it should be easy to understand and to implement, and should produce reliable C.I.’s with adequate coverage. Section 2 contains a brief review of the related literature along with a short introduction to principal components analysis. In section 3, we provide the idea underlying the new confidence interval procedure. Section 4 provides the reader with a numerical example in order to better understand the method. We introduce the variance-covariance matrix estimation method in section 5. Section 6 contains some possible pitfalls associated with the new C.I. method. In section 7 we cover implementation, testing and comparisons of the new procedure with the non-overlapping batch-means method. Finally, we conclude the paper in section 8 and provide the reader with some recommendations for further research in section 9.
2-Literature Survey Various types of estimators have been proposed to estimate the variability of the sample mean calculated from steady-state autocorrelated stochastic processes that typically arise in simulation output analysis. For example, there are estimators based on the method of independent replications (Welch 1983, Law 1977, Kelton and Law 1983, 1984, and Whitt 1991), the non-overlapping and overlapping batch-means methods (Law and Carson 1979, Schmeiser 1982, Song 1996, Sherman 1998, Bischak et al. 1993, Song and Schmeiser 1995, Chien et al. 1997, Fishman 1978a, and
Meketon and Schmeiser 1984), the method of spectral analysis (Welch 1983, Duket and Pritsker 1978, Law and Kelton 1984, and Heidelberger and Welch 1981), the autoregressive method (Fishman 1971, 1978b, Law and Kelton 1984, and Andrews and Schriber 1984), the regenerative method (Law 1983, and Crane and Iglehart 1974a, 1974b), and standardized time series (Bischak et al. 1993, Schruben 1983, Goldsman and Schruben 1990, Tokol et al. 1998, and Glynn and Iglehart 1990). Each of the above-mentioned approaches involves some basic assumptions on the process being simulated, which may not be realized in real-world systems. In addition, all of them require a relatively large number of observations. Some of these methods, such as the method of independent replications, are usually wasteful in terms of the information obtained from the data. However, there are trade-offs to be considered, and under certain circumstances these methods may not be wasteful relative to one long run (Whitt 1991). Other methods, such as spectral analysis, autoregressive representation, and standardized time series, place a heavy burden on the user, since they require familiarity with sophisticated methods of time series analysis. The regenerative method is simple and easy to understand and implement, but its applicability to real-world systems is very limited and it suffers from long cycle lengths. The remaining approach suggested in the literature, the batch-means method, is easy to understand and apply, and is based on the assumption that the batch means are approximately i.i.d. normal. Nevertheless, one key element in the batch-means method is the determination of the number of observations per batch, which is highly model dependent. We will address this problem in section 2.2. Since the proposed procedure is based on the principal components method and the non-overlapping batch-means (NBM) procedure, a very brief description of these approaches is given below.
2.1 Principal Components Method Principal components analysis is a statistical technique concerned with explaining the variance-covariance structure of a set of random variables through a few linear combinations of these
variables. Its general objectives are 1) data reduction and 2) interpretation (Johnson & Wichern, 1998). To explain the principal components approach mathematically, let Σ be the variance-covariance matrix of the p-dimensional random vector X^T = [X_1, X_2, ..., X_p], with eigenvalue-eigenvector pairs (η_1, e_1), (η_2, e_2), ..., (η_p, e_p), where η_1 ≥ η_2 ≥ ... ≥ η_p ≥ 0. Then the i-th principal component is defined as

Y_i = e_i^T X = e_{i1} X_1 + e_{i2} X_2 + ... + e_{ip} X_p,    i = 1, 2, ..., p.

It can be shown that

Var(Y_i) = e_i^T Σ e_i = η_i,    i = 1, 2, ..., p    (1)

and Cov(Y_i, Y_k) = e_i^T Σ e_k = 0 for i ≠ k. Algebraically, principal components are particular linear combinations of the p random variables X_1, X_2, ..., X_p; geometrically, these linear combinations represent the selection of a new coordinate system obtained by rotating the original system with X_1, X_2, ..., X_p as the coordinate axes. The new axes represent the directions with maximum variability and provide a simpler and more parsimonious description of the covariance structure (Johnson & Wichern, 1998).
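These properties can be checked numerically. The following Python fragment (an illustrative sketch, not part of the original study; the covariance matrix shown is arbitrary) computes the eigendecomposition of a covariance matrix and verifies that the principal components Y = E^T X have variances equal to the eigenvalues and zero covariances:

```python
import numpy as np

# An arbitrary symmetric, positive-definite covariance matrix Sigma
Sigma = np.array([[4.0, 1.2, 0.6],
                  [1.2, 3.0, 0.8],
                  [0.6, 0.8, 2.0]])

# Eigenvalues eta_i and orthonormal eigenvectors e_i (columns of E)
eta, E = np.linalg.eigh(Sigma)

# The principal components Y = E^T X have covariance matrix E^T Sigma E,
# which is diagonal: Var(Y_i) = eta_i and Cov(Y_i, Y_k) = 0 for i != k
cov_Y = E.T @ Sigma @ E
assert np.allclose(cov_Y, np.diag(eta))

# Total variance is preserved: the eigenvalues sum to trace(Sigma)
assert np.isclose(eta.sum(), np.trace(Sigma))
```

Note that `np.linalg.eigh` returns the eigenvalues in ascending rather than descending order; the ordering does not affect the diagonalization itself.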
2.2 The Non-Overlapping Batch-Means Procedure
In this procedure we assume a realization {X_i : 1 ≤ i ≤ n} as the steady-state output of a single run of a simulation on the key variable of interest X from a covariance-stationary process with mean µ and lag-s covariance Q_s = Cov(X_i, X_{i+s}). This run is divided into k "batches" of m consecutive observations each (assume k divides n, so that n = km). Let X(i, j) (i = 1, 2, ..., m and j = 1, 2, ..., k) be the i-th observation from the j-th batch, and define the batch means, X̄_j(m), j = 1, 2, ..., k, to be the sample means of the m observations in the j-th batch, i.e., X̄_j(m) = Σ_{i=1}^m X(i, j)/m. For sufficiently large m, Law and Carson (1979) showed that the batch means are approximately i.i.d. normal random variables with mean µ. Hence, using classical statistical methods, a 100(1 − α)% C.I. for µ would be X̄(k, m) ± t(k − 1, 1 − α/2) σ̂(k, m), where X̄(k, m) = Σ_{j=1}^k X̄_j(m)/k is the overall sample mean and σ̂²(k, m) = Σ_{j=1}^k [X̄_j(m) − X̄(k, m)]²/[k(k − 1)]; these are the point estimators of µ and Var[X̄(k, m)], respectively.
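The NBM interval just described can be sketched in a few lines of Python (our own code, not from the original study; the function name is ours, and the caller supplies the Student-t quantile):

```python
import numpy as np

def nbm_ci(x, k, t_crit):
    """Non-overlapping batch-means C.I. for the mean of a steady-state output x.

    x      : 1-D array of observations (initial transient already removed)
    k      : number of batches (any remainder observations are dropped)
    t_crit : Student-t quantile t(k - 1, 1 - alpha/2), supplied by the caller
    """
    n = len(x) - len(x) % k                                # make k divide n
    bm = np.asarray(x[:n], dtype=float).reshape(k, -1).mean(axis=1)  # batch means X_j(m)
    grand = bm.mean()                                      # overall sample mean X(k, m)
    var_hat = ((bm - grand) ** 2).sum() / (k * (k - 1))    # estimate of Var[X(k, m)]
    half = t_crit * np.sqrt(var_hat)
    return grand - half, grand + half
```

For example, with k = 10 batches and α = 0.05 one would pass t_crit = t(9, 0.975) ≈ 2.262.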
The method of batch means has three potential sources of error: 1) The process may not be covariance stationary. 2) If m is not large enough, the batch means may not be approximately normally distributed. 3) If m is not large enough, the batch means may be highly correlated and the estimate of the variance of the overall sample mean will be biased. On the one hand, a small batch size results in an NBM variance estimator having large bias and small variance, and on the other hand, a large batch size results in small bias and large variance. The problem is to find an appropriate batch size that provides a good tradeoff between bias and variance. In this regard, several studies have explored the effects of using batch means in C.I. construction. If the original observations are positively correlated (the case usually encountered in practice), then the batch means will also be positively correlated. Law (1977) mentioned that the principal source of error in the batch-means method is the underestimation of Var[X(k, m)] when
m is too small. Empirical results obtained by Law and Kelton (1984) indicate that at least for simple systems like an M / M / 1 queue, the correlation between the batch means is the most serious source of error, and that it is better to make a few large batches rather than many small batches. Adam and Klein (1989) concluded that the C.I. coverage for the process mean decreases significantly even at moderate levels of autocorrelation. Law (1983) and Fox et al. (1991) investigated the possibility of leaving some observations between batches, or more generally the possible assignment of different weights to the observations within a batch, in an effort to reduce
the correlation between batch means. Also, Sargent et al. (1992) investigated the effects of estimator bias on C.I. performance. One approach to selecting the batch size is to minimize the mean squared error (MSE) of the variance estimator, which is the sum of the squared bias and the variance. Note that in this case the bias and variance are performance measures for sample-mean variance estimators, rather than for the sample mean itself. Goldsman and Meketon (1986) derived some asymptotic results concerning the optimal batch size in terms of MSE for large sample sizes. Bischak et al. (1993) proposed a weighting scheme for the observations within batches that minimizes the variance of the weighted point estimator of the mean. The results of their experiments with M/M/1, AR(1), AR(2) and MA(2) processes indicate that their procedure exhibits better C.I. coverage and expected half-width than the NBM method. Song and Schmeiser (1995) obtained an asymptotic approximation, which involves a process-dependent quantity, of the optimal batch size based on a minimal-MSE criterion. In addition, Song (1996) proposed a procedure to obtain the optimal batch size that minimizes the mean squared error of estimators of the variance of the sample mean. Chien et al. (1997) noted that the batch-means variance estimator is asymptotically unbiased and convergent in mean square, and derived an asymptotic expression for the variance of the batch-means variance estimator. Sherman's (1998) method, which is process-independent (model-free), is an algorithm that makes use of the previous asymptotic results giving the order of the batch size as a function of the simulation run length, and gives an empirical estimate of the batch size for shorter simulation run lengths. Based on experimental results for autoregressive and moving-average output, he claims that his method performs well for relatively large simulation lengths with moderate dependence, in most cases.
In summary, a review of the literature shows that a practical procedure for determining an appropriate batch size has yet to be proposed.
3. The New Confidence Interval Procedure
As mentioned earlier, the most serious potential source of error in C.I. estimation is due to the correlation of the observations. In this regard, two kinds of principal components applications could aid simulation output analysts: we could apply principal components to the batch-means vector obtained from the original output data, or apply it to the original output data itself. In this research, to aid convergence to normality, on which the C.I. is based, and to keep the dimension of the transformation reasonable, the original output data are batched prior to performing the transformation. Then, we derive a linear transformation of the batch means such that they become uncorrelated, approximately independent, and approximately normally distributed. Hence, applying the classical C.I. method to the transformed batch means is justified. The approach is to estimate the variance-covariance matrix of the batch-means vector, and to use this estimate to derive the linear transformation by means of the principal components method. To derive the new C.I. procedure, let us define X = [X̄_1(m), X̄_2(m), ..., X̄_k(m)]^T as the random vector of the batch means, and define µ_X and Σ_X as the mean vector and variance-covariance matrix associated with the random vector X, respectively. Note that since the batch means are correlated, the off-diagonal entries of Σ_X are nonzero. The main idea is to transform the correlated batch-means vector into an uncorrelated random vector Y such that the variance-covariance matrix of Y becomes diagonal. Since the variance-covariance matrix Σ_X is real and symmetric, it has k mutually orthogonal unit (orthonormal) eigenvectors, which are also linearly independent (Stark and Woods 1986). Let
λ_1, λ_2, ..., λ_k ≥ 0 be the eigenvalues of Σ_X. Define E to be the matrix whose columns are the k linearly independent orthonormal eigenvectors of Σ_X. It follows that E^{-1} Σ_X E is a diagonal matrix Λ with the eigenvalues of Σ_X along its main diagonal (Strang 1980). Also, since E is an orthogonal matrix (a square matrix with orthonormal columns), E^T = E^{-1} and we have E^T Σ_X E = Λ.
For a linear transformation on the correlated batch-means vector of the form Y = W X such that the transformed vector becomes uncorrelated, the variance-covariance matrix associated with the random vector Y must be diagonal, i.e., Σ_Y = Λ, where W is the weight matrix. The mean vector (µ_Y) and the variance-covariance matrix (Σ_Y) associated with the random vector Y can be derived as follows: µ_Y = E(Y) = E(W X) = W E(X) = W µ_X and Σ_Y = E[(Y − µ_Y)(Y − µ_Y)^T] = E{[W(X − µ_X)][W(X − µ_X)]^T} = W Σ_X W^T.
For the transformed vector Y to be uncorrelated, we require W Σ_X W^T = E^T Σ_X E = Λ. Hence W = E^T, i.e., Y = E^T X. If we assume that the X̄_i(m)'s (i = 1, 2, ..., k) are normally distributed but correlated random variables (an assumption which, according to the central limit theorem, is not far from reality when batches contain a large number of observations), then Y will also be a k-dimensional normal random vector (Strang 1980). Thus, the components of Y will be uncorrelated and normally distributed, and therefore independent, but they are not i.i.d. normally distributed random variables because their variances may differ. Let us define the weights, W_i (i = 1, 2, ..., k), as the sum of the i-th column of the W matrix, and let B = Σ_{i=1}^k Y_i = Σ_{i=1}^k W_i X̄_i(m). Assuming that the weights are constants, we see that B is also a normally distributed random variable with a mean of

E[B] = Σ_{i=1}^k W_i E[X̄_i(m)] = µ (Σ_{i=1}^k W_i)    (2)

and a variance of (using equation (1)):

Var(B) = Σ_{i=1}^k Var(Y_i) = Σ_{i=1}^k λ_i    (3)
Note that this assumption renders the procedure a bit heuristic. Assuming the sum of the weights to be positive, the new C.I. can now be defined using

P( [Σ_{i=1}^k W_i X̄_i(m) − √(Σ_{i=1}^k λ_i) Z_{α/2}] / Σ_{i=1}^k W_i  ≤  µ  ≤  [Σ_{i=1}^k W_i X̄_i(m) + √(Σ_{i=1}^k λ_i) Z_{α/2}] / Σ_{i=1}^k W_i ) = 1 − α    (4)

where Z_{α/2} denotes the 100(1 − α/2)th percentile of the standard normal distribution. However, when the sum of the weights is negative, we simply flip the inequalities in (4). As we mentioned previously, since the weights are random variables, the new confidence interval method is a bit heuristic. Note that since the eigenvalues are all non-negative, their sum will also be non-negative, and there will be no difficulty computing the square root in the above formula. However, the term in the denominators of equation (4) may be equal or very close to zero. We could not obtain a mathematical proof showing that the sum of the weights cannot be zero. Nevertheless, as shown in section 7, we applied the new procedure numerically to almost 1,000,000 matrices with data obtained from different processes, with the number of batches ranging between 2 and 64, without producing a single case in which the sum of the weights equaled zero. In the special case where the batch means are i.i.d. random variables, the variance-covariance matrix is simply the identity matrix multiplied by the common variance, and any unit-length vector can be a unit eigenvector. In this situation no specific weights can be defined; however, when the batch means are already i.i.d. there is no need to find weights or apply the new procedure, and so the NBM approach can be safely applied. It is also worth noting that in the nearly 1,000,000 problems investigated, we did not encounter a single case which indicated that the batch means were i.i.d.
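Given the batch means and (an estimate of) their variance-covariance matrix, the procedure can be sketched as follows (our own Python code, not the authors' implementation; the eigenvector sign convention, which selects one member of the class of intervals discussed later, is our choice):

```python
import numpy as np

def pc_ci(batch_means, sigma_x, z=1.96):
    """C.I. of equation (4) from the batch means and their covariance matrix."""
    lam, E = np.linalg.eigh(sigma_x)        # eigenvalues lambda_i, eigenvectors (columns of E)
    # Fix eigenvector directions (first component non-negative); this picks one
    # member of the class of up to 2^(k-1) possible intervals.
    E = E * np.where(E[0, :] < 0, -1.0, 1.0)
    W = E.T                                 # weight matrix
    weights = W.sum(axis=0)                 # W_i = sum of the i-th column of W
    s = weights.sum()
    if abs(s) < 1e-12:
        raise ValueError("sum of weights is (nearly) zero")
    center = weights @ np.asarray(batch_means, dtype=float) / s
    half = z * np.sqrt(lam.sum()) / abs(s)  # using |s| covers the negative-sum case
    return center - half, center + half
```

With the data of the numerical example in section 4 (batch means 5 and 3, Σ_X = [[2, 0.5], [0.5, 1]]), this returns approximately (1.98, 10.85), matching the interval reported there up to the paper's rounding of intermediate values.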
4. Numerical Example
Let X_1, X_2, ..., X_1000 be the sample observations on the system performance (response variable) of a single simulation run in steady state, and suppose we desire point and interval estimates of the average response (µ) based on the sample observations. Assume that the sample mean X̄(1000) = 4 is an unbiased point estimator of µ. Suppose we divide the sample data into k = 2 batches of size m = 500 each, assume that X̄_1(500) = 5 and X̄_2(500) = 3 are the two batch means, and that the variance-covariance matrix associated with the batch-means vector is

Σ_X = [ 2    0.5
        0.5  1   ].

Then the non-negative eigenvalues λ_1 and λ_2, and their sum, can be obtained as follows: det(Σ_X − λI) = 0 yields λ_1 = (3 + √2)/2 ≈ 2.207 and λ_2 = (3 − √2)/2 ≈ 0.793, so that Σ_{i=1}^2 λ_i = 3.

The unit eigenvectors E_1 and E_2 can be obtained as follows:

(Σ_X − λ_1 I) E_1 = 0  ⇒  E_1 = [ 1/√(4 − 2√2) , (√2 − 1)/√(4 − 2√2) ]^T,
(Σ_X − λ_2 I) E_2 = 0  ⇒  E_2 = [ 1/√(4 + 2√2) , (−1 − √2)/√(4 + 2√2) ]^T.

Then the E matrix whose columns are the unit eigenvectors is

E = [ 1/√(4 − 2√2)          1/√(4 + 2√2)
      (√2 − 1)/√(4 − 2√2)   (−1 − √2)/√(4 + 2√2) ].

Now the weight matrix W is simply the transpose of the E matrix, and the weights are:

W_1 = 1/√(4 − 2√2) + 1/√(4 + 2√2) ≈ 1.307,
W_2 = (√2 − 1)/√(4 − 2√2) + (−1 − √2)/√(4 + 2√2) ≈ −0.541.

Also we have Σ_{i=1}^2 W_i ≈ 0.766 and Σ_{i=1}^2 W_i X̄_i(500) ≈ 4.912. Since the sum of the weights is positive, and knowing that Z_{α/2} = 1.96 for α = 0.05, using equation (4) we find that the 95% C.I. for µ is

µ ∈ [1.981, 10.844].

We can easily show that the variance-covariance matrix associated with the transformed random vector Y = W X is

Σ_Y = W Σ_X W^T = [ 2.207  0
                    0      0.793 ].

In other words, the random vector X with correlated components was converted through the above linear transformation to a random vector Y with uncorrelated, but not necessarily identically distributed, components.

It should be mentioned that any scalar multiple of an eigenvector is also an eigenvector. However, since we need unit eigenvectors (E^{-1} = E^T when the columns of E are orthonormal), this excludes any scalar multiplication except by (−1), which only changes the direction of the unit eigenvectors. Indeed, in the last example there are four possible combinations of weights for the batch means, and all of them have the same properties. Nonetheless, two of them are simply the negatives of the other two, which would yield the same results but with opposite signs. Changing the sign of the weights does not affect the final C.I., since the change appears in both the numerator and the denominator of the derived C.I. Hence, depending on the directions taken for the eigenvectors, only two sets of weights will be different. In essence, both sets of weights will transform the X variables into independent random variables, but they may yield different C.I.'s. In general, for a k × k variance-covariance matrix of the batch-means vector, it can be shown that there are up to 2^{k−1} different weight matrices that can transform the batch means into uncorrelated random variables (Strang 1980). In other words, we can have up to 2^{k−1} different estimates of the C.I. on µ. Therefore, the new C.I. method can be regarded as one yielding a new class of C.I.'s.
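The sign-choice behavior described above can be verified numerically for this example (our own sketch, not part of the original study). Every choice of eigenvector directions diagonalizes Σ_X, but the four combinations collapse to 2^{k−1} = 2 distinct intervals:

```python
import numpy as np
from itertools import product

Sigma_X = np.array([[2.0, 0.5], [0.5, 1.0]])   # Sigma_X from the example
x_bar = np.array([5.0, 3.0])                    # the two batch means
z = 1.96                                        # Z_{alpha/2} for alpha = 0.05
lam, E0 = np.linalg.eigh(Sigma_X)               # lam.sum() == 3

intervals = set()
for signs in product([1.0, -1.0], repeat=2):    # 2^k eigenvector direction choices
    E = E0 * np.array(signs)
    W = E.T
    # every sign choice still diagonalizes Sigma_X
    assert np.allclose(W @ Sigma_X @ W.T, np.diag(lam))
    w = W.sum(axis=0)                           # weights W_i (column sums of W)
    s = w.sum()
    center = (w @ x_bar) / s
    half = z * np.sqrt(lam.sum()) / abs(s)
    intervals.add((round(center - half, 3), round(center + half, 3)))

# Four sign combinations, but pairs that differ only by an overall sign
# coincide, leaving 2^(k-1) = 2 distinct C.I.'s
print(sorted(intervals))
```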
5. Variance-Covariance Matrix Estimation
A problem with the new C.I. method is that the variance-covariance matrix of the batch-means vector is not known a priori. If we have several independent replications of the random vector of batch means, then, using classical statistics, we can obtain an unbiased and consistent estimator of the variance-covariance matrix. However, since we are dealing with output from a single steady-state simulation run, we need a variance-covariance estimator based on a single realization of the stochastic process.
Assuming that {X̄_i(m); i = 1, 2, ..., k} form a covariance-stationary process, the lag-j covariance between the i-th and the (i + j)-th batch means of size m each, C_j(k, m) = Cov[X̄_i(m), X̄_{i+j}(m)], j = 0, 1, ..., k − 1 and i = 1, 2, ..., k − j, may be estimated using

Ĉ_j(k, m) = Σ_{i=1}^{k−j} [X̄_i(m) − X̄(k, m)][X̄_{i+j}(m) − X̄(k, m)] / (k − j),

where X̄(k, m) is the grand average of the k batch means (Law and Kelton 2000). Other estimates of C_j(k, m) are also used in the literature; for example, one could replace the k − j in the denominator by k (see Fishman 1978a). Some problems associated with the above estimate are that it is biased, it has a large variance unless k is very large, and the covariance estimates are correlated with each other; that is, for j ≠ l, Cov[Ĉ_j(k, m), Ĉ_l(k, m)] ≠ 0. In particular, Ĉ_{k−1}(k, m) will be a poor estimate of C_{k−1}(k, m) since it is based on the single product [X̄_1(m) − X̄(k, m)][X̄_k(m) − X̄(k, m)]. Thus, in general, accurate estimates of the C_j(k, m)'s will be difficult to obtain unless n is very large and j is small relative to n. To overcome this problem, Fishman (1973) suggests the use of spectral procedures, which implicitly estimate the C_j(k, m)'s, to improve the estimate of the variance of the sample mean. However, since our objective is to develop a procedure that is simple to use, we did not consider Fishman's methodology. Another estimator for the variance-covariance matrix is the jackknife estimator, which in general is less biased (Miller 1974), and which will be used as the estimator for the variance-covariance matrix of the batch-means vector.

In the jackknife estimation method, we let Ĉ_j^(1)(k/2, m) and Ĉ_j^(2)(k/2, m) denote the usual lag-j estimators based on the first k/2 and the last k/2 batches, respectively (j = 0, 1, ..., k/2 − 1). Then the jackknife estimator of the lag-j covariance between the i-th and the (i + j)-th batch means of size m each is given by:

Ĉ_j(m) = 2 Ĉ_j(k, m) − [Ĉ_j^(1)(k/2, m) + Ĉ_j^(2)(k/2, m)] / 2.    (5)

Note that the jackknife estimator, Ĉ_j(m), can be used for up to a lag of (k/2) − 1 (assuming k to be even). For lags greater than (k/2) − 1 the regular estimator Ĉ_j(k, m) will have to be used.
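Equation (5) can be sketched as follows (our own Python code; the function names are ours, and `bm` holds the k batch means):

```python
import numpy as np

def lag_cov(bm, j):
    """Usual lag-j covariance estimator of the batch means in `bm`
    (divisor k - j, deviations taken from the grand mean of `bm`)."""
    k = len(bm)
    g = bm.mean()
    return float(((bm[:k - j] - g) * (bm[j:] - g)).sum() / (k - j))

def jackknife_lag_cov(bm, j):
    """Jackknife lag-j estimator of equation (5); valid for j <= k/2 - 1,
    with k (the number of batches) assumed even."""
    k = len(bm)
    assert k % 2 == 0 and j <= k // 2 - 1
    c1 = lag_cov(bm[:k // 2], j)    # estimator from the first k/2 batches
    c2 = lag_cov(bm[k // 2:], j)    # estimator from the last k/2 batches
    return 2.0 * lag_cov(bm, j) - (c1 + c2) / 2.0
```

The estimated variance-covariance matrix of the batch-means vector would then be assembled by placing the lag-|i − j| estimates on the corresponding diagonals, using the regular estimator for lags beyond (k/2) − 1.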
When we replace the variance-covariance matrix with its estimate, we substitute the variance of the sum of the independent, transformed random variables (the Y_i's) with the sum of the eigenvalues of the estimated variance-covariance matrix associated with the batch-means vector. In this case, we replace the standard normal quantile, Z_{α/2}, in the C.I. equations with that of the Student-t distribution with (k − 1) degrees of freedom.
6. Possible Pitfalls
When applying the new method, one has to watch for the following pitfalls:
a) Make sure that the initial transient effects have been removed before collecting data. Also, in order to estimate the variance-covariance matrix, we may have to assume that the output is covariance stationary. This assumption is usually not far from reality after reaching steady-state conditions.
b) The batch size (m) should be large enough; otherwise, the sum of the weighted batch means may not be approximately normally distributed.
c) As mentioned earlier, after replacing the variance-covariance matrix with its estimate, we exchanged the standard normal distribution for a Student t-distribution with k − 1 degrees of freedom. In fact, when using the estimated variance-covariance matrix, the transformed variables may not be independent.
d) The variance-covariance estimator of the batch-means vector may not be an unbiased and consistent estimator.
e) When the true variance-covariance matrix of the batch-means vector is not known a priori and is estimated from the sample observations, the sum of the weights is not a constant but a random variable. Hence, the distribution of the quantities in equation (4), each of which is divided by this sum, is not known. More research needs to be done on this subject.
7. Implementation
In this section, we give a brief discussion of the results obtained when implementing the new C.I. method on random outputs obtained from three different stochastic processes: (1) M/M/1 queuing systems with different arrival and service rates, (2) an M/M/2 queuing system, and (3) an autoregressive process of order one [AR(1)]. Then we compare the results with those obtained from the NBM approach.

The first M/M/1 queuing example has an arrival rate of λ = 2.56 and a service rate of µ = 3.2 (i.e., a traffic intensity of ρ = λ/µ = 0.8). The second and third M/M/1 systems have the same arrival rate of λ = 1.00, but different service rates of µ = 1.25 and 1.111 (i.e., traffic intensities of ρ = 0.8 and 0.9). In these examples the expected waiting time in the queue, W_q, is known to be equal to W_q = ρ/[µ(1 − ρ)]. The M/M/2 queuing example has an arrival rate of λ = 10 and a service rate of µ = 5.38 (i.e., a traffic intensity of ρ = 0.8). The expected waiting time in the M/M/2 queuing system is known to be (see for example Hillier & Lieberman 1990):

W_q = ρ³ P_0 / [λ(2 − ρ)²], where ρ = λ/(2µ), and P_0 = [ Σ_{n=0}^{1} ρⁿ/n! + ρ²/(2!(1 − ρ/2)) ]^{−1}.
The autoregressive model has an autocorrelation parameter φ = 0.7, mean µ = 0, and standard normal random terms ε_i: X_i = 0.7 X_{i−1} + ε_i, with the initial value X_0 = 0. To evaluate the performance of the new method, and to compare it with the NBM approach, we report on C.I. coverage and expected half-width.
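The AR(1) test process can be generated as follows (our own sketch, not the authors' code; the seed and sample size are arbitrary). Its steady-state mean is 0, so coverage of C.I.'s for µ = 0 can be checked directly by repeated runs:

```python
import numpy as np

def ar1(n, phi=0.7, x0=0.0, seed=0):
    """Generate X_i = phi * X_{i-1} + eps_i with eps_i ~ N(0, 1) and X_0 = x0."""
    rng = np.random.default_rng(seed)
    eps = rng.standard_normal(n)
    x = np.empty(n)
    prev = x0
    for i in range(n):
        prev = phi * prev + eps[i]
        x[i] = prev
    return x
```

For φ = 0.7 the lag-1 autocorrelation of the generated series should be close to 0.7, which gives a quick sanity check on the generator.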
7.1 Empirical Results with M/M/1 Queuing Systems
In our first application of the M/M/1 queue delay process, we studied small-sample properties of the new C.I. procedure. The method was tested on waiting-time output from an M/M/1 queuing system with µ = 3.2 and ρ = 0.8. Table I gives results on the values of the point estimator for the process mean,

B_n = Σ_{i=1}^k W_i X̄_i(m) / Σ_{i=1}^k W_i,

the achieved coverage of the nominal 95% C.I.'s, and their estimated expected half-width, using 10000 independent replications for N = 100 and N = 200 and k = 5, 10, and 20, using appropriate Student-t values for the C.I. quantiles.

  N     k    B_n      Coverage   Half-Width
 100    5    1.2135   0.7147     1.7372
 100   10    1.1815   0.8109     3.0287
 100   20    1.1487   0.7999     2.9959
 200    5    1.2833   0.7842     1.6972
 200   10    1.3167   0.8684     4.0192
 200   20    1.1746   0.8721     3.7808

Table I: The point estimator of the process mean, the estimated coverage, and the estimated expected half-width in 10000 replications of the M/M/1 queuing system with µ = 3.2 and ρ = 0.8
Table I shows that the new method performs well in terms of the point estimator and the estimated C.I. coverage and half-width. In order to give some support for the new procedure, consider that the half-width of the C.I. in equation (4) is a multiple of the z-value divided by the sum of the weights. Let us define the "pivot ratio" as

[ Σ_{i=1}^k W_i X̄_i(m) / Σ_{i=1}^k W_i − µ ] [ √(Σ_{i=1}^k λ_i) / Σ_{i=1}^k W_i ]^{−1}.

If the confidence statement in equation (4) is
approximately true, then the pivot ratio should be approximately a normal random variable. Using the data from the above example, we obtained a frequency histogram of the pivot ratio, which indeed appeared normally distributed. In the second empirical study, the new C.I. method as well as the NBM method were implemented on the M / M / 1 queuing system with an arrival rate of λ = 1 and service rate of 1.25 ( ρ = 0.8 ). In order to check the sensitivities of the new approach and the NBM method to the batch size and total number of observations in the output sequence, a set of 10000 different output sequences was generated for this process, each output containing 2560 observations. Then, both methods were applied with the total number of observations N = 160, 320, 640, 1280, and 2560 and the following numbers of batches for each case: k = 2, 4, 5, 8, 10, 16, 20, 32, 40, and 64 (with the exception that the k = 64 case was not used for N = 160). The results obtained for the two extreme values of N (160 and 2560) are presented in Tables II and III. These results show the 95% confidence limits on the coverage (upper part of each entry), and halfwidth (lower part) of the 95% C.I. calculated for the mean waiting time. In each table, the abbreviation NEW refers to the results obtained via the new method. From tables II and III we can see that NBM is very sensitive to batch size and to the total number of observations in the output process. The new method gives significantly better coverage than NBM. NBM does not take the correlation between batch means into account, and therefore underestimates the variance of the sample mean, producing misleading, narrower C.I.’s. Furthermore, for a fixed total number of observations, as the batch size decreases, the coverage of NBM deteriorates, but it does not fluctuate very much in the new method. 
For example, for a total number of observations of 160, as the number of batches increases from 2 to 40, the coverage in NBM decreases dramatically from 17
k     NEW coverage     NBM coverage     NEW half-width    NBM half-width
2     0.8943±0.0060    0.8938±0.0060    13.3308±0.2863    13.3308±0.2863
4     0.8339±0.0073    0.7434±0.0086     3.1829±0.0485     5.2065±0.0814
5     0.7629±0.0083    0.7075±0.0089     2.6293±0.0373     4.3318±0.1158
8     0.8584±0.0068    0.6258±0.0042     1.9404±0.0243     9.6261±3.1466
10    0.8542±0.0069    0.5824±0.0097     1.7176±0.0204     7.7765±0.9281
16    0.8502±0.0070    0.4948±0.0098     1.3483±0.0147    10.8232±3.8475
20    0.8585±0.0068    0.4526±0.0098     1.2064±0.0128     9.0706±1.6604
32    0.8663±0.0067    0.3728±0.0095     0.9577±0.0096    10.8364±1.1418
40    0.8730±0.0065    0.3365±0.0093     0.8585±0.0085    17.7769±5.3903

Table II: Estimated coverage and expected half-width of 95% C.I.’s for the mean waiting time of M / M / 1 ( ρ = 0.8 ), N = 160
k     NEW coverage     NBM coverage     NEW half-width    NBM half-width
2     0.9409±0.0046    0.9410±0.0046    6.3363±0.1136      6.3363±0.1136
4     0.9523±0.0042    0.9051±0.0057    1.7498±0.0229      2.8663±0.0404
5     0.9171±0.0054    0.8978±0.0059    1.5351±0.0190      4.9292±3.0864
8     0.9661±0.0035    0.8750±0.0065    1.2916±0.0145      5.7992±0.6727
10    0.9627±0.0037    0.8690±0.0066    1.2235±0.0131      6.5416±0.6739
16    0.9581±0.0039    0.8535±0.0069    1.1021±0.0105     10.9908±2.9870
20    0.9586±0.0039    0.8433±0.0071    1.0502±0.0095     13.4089±3.5990
32    0.9586±0.0039    0.8130±0.0076    0.9356±0.0074     14.7351±2.2839
40    0.9522±0.0042    0.7906±0.0080    0.8771±0.0065     15.8794±2.1657
64    0.9493±0.0043    0.7354±0.0086    0.7548±0.0050     16.0580±4.2793

Table III: Estimated coverage and expected half-width of 95% C.I.’s for the mean waiting time of M / M / 1 ( ρ = 0.8 ), N = 2560
0.8938 to 0.3365, while these figures are 0.8943 and 0.8730 for the new method, respectively. When a total of N = 2560 observations is available, these numbers become (0.9410, 0.7354)
and (0.9409, 0.9493) for NBM and the new method, respectively. Note that with the number of batches fixed, an increase in the total number of observations makes the batches bigger, and hence the batch means become less correlated. Accordingly, the NBM method will have better coverage, but the coverage of the new method is still better in all cases. In addition, the estimated half-width of the C.I.’s obtained with NBM becomes smaller as the batch size decreases. This is expected, because as the number of observations per batch decreases, higher correlations exist between the batch means, and the NBM method, which ignores these correlations, underestimates the variance of the batch means and produces smaller intervals than it should. Also, for a large number of batches, the t-distribution quantile is relatively small compared to the small degrees-of-freedom case. In all the examples we studied, the new method proved to be robust with respect to the total number of observations, and consistently produced better C.I. coverage than the NBM approach. However, this excellent coverage comes at the cost of relatively wide C.I.’s. This is not necessarily bad, since the C.I. width must adjust to achieve the proper coverage. The advantage of the new method seems to become more significant with short simulations in which the collection of observations is expensive. Accordingly, in the remaining discussion only a total of 160 observations will be considered.

In the third case of the M / M / 1 queuing system, we increased the traffic intensity to ρ = 0.9 with λ = 1 and µ = 1.111. Table IV shows the estimated coverage and the estimated expected half-width of 95% C.I.’s for the mean waiting time of this M / M / 1 queuing system.

7.2 Empirical Results with M / M / 2 Queuing Systems

In this empirical study, we implemented the new C.I. method on M / M / 2 queuing systems
with an arrival rate of λ = 10 and a service rate of µ = 5.55 per server. The 95% confidence limits for Wq in these queuing systems are given in Table V. We obtained the results from 10000 independent
replications, each containing 160 observations on the waiting time in the queue. From the table we can reach the same conclusions as in the previous study.
k     NEW coverage     NBM coverage     NEW half-width     NBM half-width
2     0.9427±0.0045    0.9434±0.0045    73.0356±1.6630     73.0356±1.6630
4     0.9602±0.0038    0.9167±0.0054    17.3798±0.2902     28.4294±0.4881
5     0.8926±0.0061    0.8932±0.0060    14.3959±0.2239     24.3840±1.2418
8     0.9565±0.0040    0.8046±0.0078    10.7167±0.1471     42.1117±5.0857
10    0.9450±0.0045    0.7476±0.0085     9.5123±0.1246     42.4380±5.7366
16    0.9331±0.0049    0.6249±0.0095     7.5077±0.0907     43.8064±5.2441
20    0.9308±0.0050    0.5682±0.0097     6.7304±0.0789     54.0509±14.5516
32    0.9304±0.0050    0.4620±0.0098     5.3575±0.0597     69.6584±13.0872
40    0.9303±0.0050    0.4138±0.0096     4.8073±0.0525    105.4600±36.5264

Table IV: Estimated coverage and expected half-width of 95% C.I.’s for the mean waiting time of M / M / 1 ( ρ = 0.9 ), N = 160
k     NEW coverage     NBM coverage     NEW half-width     NBM half-width
2     0.8086±0.0077    0.8095±0.0077    145.8370±3.0767    145.8370±3.0767
4     0.6700±0.0092    0.5372±0.0098     32.4629±0.4957     52.9746±0.8265
5     0.5914±0.0096    0.4828±0.0098     26.4111±0.3771     44.4744±2.6368
8     0.6841±0.0091    0.3808±0.0095     18.8601±0.2455     73.6348±6.6778
10    0.6685±0.0092    0.3405±0.0093     16.4835±0.2071     68.0030±9.1599
16    0.6481±0.0094    0.2629±0.0086     12.6768±0.1508     76.8544±32.4433
20    0.6471±0.0094    0.2381±0.0083     11.2613±0.1315    114.3020±70.5349
32    0.6793±0.0091    0.1895±0.0077      8.8307±0.0999    136.1660±80.9520
40    0.6929±0.0090    0.1700±0.0074      7.8815±0.0883    147.4520±55.9382

Table V: Estimated coverage and expected half-width of 95% C.I.’s for the mean waiting time of M / M / 2 ( ρ = 0.8 ), N = 160
7.3 Empirical Results with Autoregressive Processes

The third empirical study involved the autoregressive process presented earlier. This process has a weaker autocorrelation structure than the M / M / 1 systems. Table VI shows the estimated coverage and expected half-width of the 95% C.I.’s for µ = 0, calculated from 10000 independent replications of the process, each containing 160 observations.
k     NEW coverage     NBM coverage     NEW half-width    NBM half-width
2     0.9494±0.0043    0.9496±0.0043    2.6197±0.0384     2.6197±0.0384
4     0.9771±0.0029    0.9501±0.0043    0.7436±0.0061     1.2154±0.0110
5     0.9440±0.0045    0.9469±0.0044    0.6566±0.0046     1.1900±0.0716
8     0.9819±0.0026    0.9371±0.0048    0.5525±0.0029     6.5063±6.7135
10    0.9784±0.0029    0.9318±0.0049    0.5224±0.0025     3.6259±1.0500
16    0.9712±0.0033    0.9143±0.0055    0.4677±0.0017     4.5848±1.1295
20    0.9674±0.0035    0.9012±0.0058    0.4424±0.0015     6.1747±1.9356
32    0.9625±0.0037    0.8583±0.0068    0.3875±0.0011     9.9976±6.1378
40    0.9608±0.0038    0.8333±0.0073    0.3610±0.0009     7.7416±2.1382

Table VI: Estimated coverage and expected half-width of 95% C.I.’s for µ of AR(1), N = 160
The results of Table VI show that for larger batch sizes, there is no significant difference between the actual and the nominal coverage obtained by the two methods, but NBM yields better C.I.’s because of its narrower half-widths. However, for smaller batch sizes, while the coverage in the new method remains almost constant, the coverage in NBM decreases. Once more, these results show that for small batch sizes the new C.I. method provides more reliable C.I.’s.
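For reference, the ± limits attached to each coverage entry in Tables II through VI are consistent with a normal-approximation binomial confidence limit on the estimated coverage from 10000 replications. A minimal sketch (our own illustration; the function name is ours):

```python
import math

def coverage_limits(hits, reps=10000, z=1.96):
    """Estimated coverage p = hits/reps with the 95% normal-approximation
    limit z * sqrt(p * (1 - p) / reps) on the estimate."""
    p = hits / reps
    return p, z * math.sqrt(p * (1.0 - p) / reps)

# E.g. if 8943 of the 10000 intervals covered the true mean:
p, halfwidth = coverage_limits(8943)
print(round(p, 4), round(halfwidth, 4))
```

This reproduces entries of the form 0.8943±0.0060 appearing in the tables.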
8. Conclusions

This paper develops a new procedure for determining C.I.’s for the mean response in a single steady-state simulation run. Using a principal components approach, the procedure transforms the correlated batch-means vector into an approximately independent random vector so that classical statistical
methods can be applied. Thus, the greatest potential source of error involved in the batch-means approach is eliminated, and the actual coverage yielded by the new method is better than that of the NBM procedure. In general, the off-diagonal entries of the variance-covariance matrix of the correlated batch-means vector are non-zero. Nonetheless, if we pre-multiply the batch-means vector by the transpose of the matrix containing the eigenvectors of the estimated variance-covariance matrix, the resulting components of the random vector will be approximately uncorrelated, and in the case of a normal random vector, they will also be approximately independent. Based on this transformation and the assumption that the output data are obtained from the steady-state simulation, we obtained a valid C.I. on the mean.

The new C.I. method was implemented on output sequences from M / M / 1 queuing systems with traffic intensities of 0.8 and 0.9, M / M / 2 queuing systems with a traffic intensity of 0.8, and an autoregressive model of order one [AR(1)]. The results obtained were compared with those of the NBM approach for different combinations of run lengths and batch sizes. The new method consistently produced better coverage than the NBM approach, especially with small batch sizes. On the whole, the new method is not particularly sensitive to the batch size and performs well even when the total number of observations is not large. Note that, as observed previously, the excellent coverage of the new method is purchased at the cost of a larger (but justifiable) half-width. Although the new method is not as simple as the NBM approach, it is still easy to understand and to apply.
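The transformation described above can be illustrated with a small numerical sketch (an illustration under assumed AR(1) data, not the authors' implementation): estimate the variance-covariance matrix of the batch-means vector, take the matrix P of its eigenvectors, and pre-multiply each batch-means vector by the transpose of P; the transformed components are then uncorrelated in the sample.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data: 2000 replications of an AR(1)-style sequence of
# length 160, batched into k = 8 batch means per replication.
reps, n, k = 2000, 160, 8
x = np.empty((reps, n))
x[:, 0] = rng.normal(size=reps)
for t in range(1, n):
    x[:, t] = 0.7 * x[:, t - 1] + rng.normal(size=reps)
b = x.reshape(reps, k, n // k).mean(axis=2)  # k batch means per replication

# Estimate the k x k variance-covariance matrix of the batch-means vector
# and pre-multiply by the transpose of its eigenvector matrix.
s = np.cov(b, rowvar=False)
eigvals, p = np.linalg.eigh(s)  # columns of p are eigenvectors of s
z = b @ p                       # each row z[i] equals p.T @ b[i]

# The off-diagonal sample correlations of the transformed components
# vanish (up to floating-point error), since p.T @ s @ p is diagonal.
cz = np.corrcoef(z, rowvar=False)
off = np.abs(cz - np.diag(np.diag(cz))).max()
print(off)
```

Because P diagonalizes the sample covariance estimated from the same data, the decorrelation here is exact for the sample; the paper's approximation enters when the estimated matrix is used in place of the true one.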
9. Recommendations for Future Research

In order to overcome some of the pitfalls associated with the new C.I. procedure, to better evaluate the performance of the new methodology, and to compare it with other relevant methods, some additional research needs to be done.
1) The new procedure must be thoroughly analyzed and compared to existing methods. This includes:
(a) An asymptotic (large-sample) analysis of the procedure that establishes its validity and efficiency. Such an analysis provides a guarantee that the procedure is valid, at least as the sample size gets large, and allows its performance to be compared to the large-sample properties of other procedures.
(b) A small-sample analysis for other interesting systems such as ARMA processes. In this type of analysis the true coverage probability, expected half-width, and variance of the half-width are derived or estimated.
(c) A thorough empirical study, controlled for a wider range of output processes with characteristics that might affect the performance of the procedure. Estimators other than the NBM method ought to be included in this study.
2) Assuming normality of the weighted batch means, the interval endpoints in the confidence statement given in equation (4) are divided by the sum of the weights. The correctness of this statement is questionable because in practice the weights will be random variables, not constants. More research needs to be done to study the distribution of the random variable defined as this ratio.
3) The weights and the batch means are both estimated from the same data set and may be correlated. In fact, if they are correlated, then E[B] in equation (2) would be difficult to derive.
4) As mentioned in the paper, the new C.I. method can be regarded as one yielding a new class of C.I.’s. More research is needed to evaluate the performance of the C.I.’s in this class.
5) The reliability of the estimators for the elements of the variance-covariance matrix is under question. More research can be done to illustrate the extent of, and solutions to, this problem.

Acknowledgments

The authors would like to thank the referees for their valuable comments and for their numerous
suggestions that improved the readability of this paper. In addition, we thank Mr. Payam
Poursaeid, an Industrial Engineering Student at Sharif University of Technology, for his endless efforts in the implementation of the methods.