A Methodology for Initialisation Bias Reduction in Computer Simulation Output*

Paul T. Jackway†

Basil M. deSilva‡

Abstract

We present a new methodology for detecting when the steady state behaviour of a discrete-time stochastic process has been approached after starting from a non-typical initial state. The main application and motivation for this method is the determination of a suitable truncation point for reduction of initialisation bias in computer simulation output. An implementation of this methodology is given which is most powerful for an exponential initial transient.

Keywords: computer simulation, initialization bias

* This paper is based on the first author's Masters thesis at the Royal Melbourne Institute of Technology, Melbourne, Australia.
† Centre for Signal Processing Research, Queensland University of Technology, Brisbane, Australia. Email: [email protected]
‡ Royal Melbourne Institute of Technology, Melbourne, Australia.

1 Introduction

In computer simulation studies an estimate is often required for the steady-state mean of some quantity of interest. For each computer run the model must be initialised to some known state. In queueing systems, for example, the "empty-and-idle" state is commonly used because of its convenience. If this initial state is not chosen at random from the steady-state distribution of the system under simulation (which in general is unknown) then the inclusion of early observations in an estimator can lead to a systematic bias in that estimator known as initialisation bias.

One common approach is to allow the simulation to "warm up", that is, to wait for the transient due to initialisation to decay to insignificance before retaining data for analysis. This procedure is known as output truncation, and various methods for the determination of a suitable warm-up period or truncation point have been proposed.

Schruben, Singh, and Tierney[1] have presented an optimal test for detecting initialisation bias in a batch of data. This test is asymptotically optimal in a classical statistical (likelihood ratio) sense for a pre-specified form of transient. The optimal test is based on earlier work by Schruben on the asymptotic convergence of partial sums of deviations to a limiting stochastic process called the "Brownian bridge", which gives a more general and less powerful bias test[2].

Roth and Rutan[3], in a non data-dependent approach, provide a heuristic for the truncation point based on an approximation to the "relaxation time" of the initial transient. For an M/M/1 queueing system they show that the initial transient can be closely approximated by an exponential decay. This particular method is only valid for an M/M/1 queue starting from rest; however, earlier work[4] has shown that more general single-server queueing systems also possess an approximately exponential transient.

Wilson and Pritsker[5] give a survey of some early proposals. Due to the poor performance of the early methods there is a trend towards more complicated and sophisticated methods. Kelton and Law[6] have published a comprehensive methodology which attempts to select both truncation point and run length by fitting regression lines to sections of the data and testing for zero slope within an iterative procedure. Heidelberger and Welch[7] have embedded the Schruben Brownian bridge test[1] in an iterative procedure to control simulation run length. This procedure gives perhaps the best results to date, although it did not satisfy the authors' goal of a fully automated procedure suitable for inclusion in computer packages for novice users. Heidelberger and Welch[7] found that their method had difficulty detecting a transient when the checkpoint was "in the transient" (a fact already noted by Schruben[2, p.588]) and had less power against weak transients than had been anticipated. Many more references are given in the recent review by Pawlikowski[8], who also gives an adaptation of the Heidelberger-Welch method.

The method to be presented here has overall similarities to the Heidelberger-Welch approach but is based on a new concept we call "time scale invariance", which is introduced in the following section. The idea is to allow the method to adjust to the "time scale" of the transient without the need for the user to determine parameters. A formal methodology is discussed in section 3 with the corresponding algorithm in the appendix. Section 4 presents a modified bias test which is optimum for an exponential transient. The results of a limited evaluation with other current methods are given in section 5.

2 Time Scale Invariance

The "time scale invariance" methodology, as outlined more formally in the next section, does not itself depend on a monotonic transient. However, its implementation requires a bias test, and this test may itself impose some additional constraints on the form of the transient. For the purposes of specific explanation, and for the implementation presented in this paper, we will optimise the method for an exponential transient. Oscillatory transients, see for example [2, fig. 1-b], require a more general (and less powerful) type of bias test such as that presented in Schruben[2].

The restriction to exponential-like transients is still quite general. Work on the transient behaviour of queueing systems has led to the observation, "for many queueing systems, the rate at which a queue converges to its steady-state characteristics, independently of the system's initial state, eventually becomes (for large values of time t) dominated by an exponential term of the form $\exp(-t/\tau_r)$ where $\tau_r$ is a characteristic of the queueing system."[4] The review by Pawlikowski[8] points out the considerable theoretical work done on the relaxation times $\tau_r$ of various queueing systems.

In a wider context, whenever in nature we have a system $x$ starting from state $x_0$ such that its rate of approach to steady-state $\mu$ is proportional to its present distance from that steady-state, that is, a system described by the first-order differential equation

\[ \frac{dx}{dt} = k(\mu - x), \]

we find the ubiquitous exponential convergence

\[ x = \mu - (\mu - x_0)\exp(-t/\tau_r), \qquad \tau_r = 1/k. \]

The characteristic time constant $\tau_r$ is called the "relaxation time". An engineering "rule-of-thumb" is that by time $4\tau_r$ the system is close (within 2%) to its steady-state value. This somewhat arbitrary $4\tau_r$ value is also used by Roth and Rutan[3], and cited by Pawlikowski[8]. We consider that if a bias detection test must be a priori "optimised" to a given form of transient then the exponential convergence is probably the most useful single choice.

Consider a simulation output series $\{X_i\}$ of steady state mean $\mu$, containing an exponential initial transient of relaxation time $\tau_r$, and initialised to $X_0 = 0$. By scaling in time by $\tau_r$ we can consider this data as being a realisation of a process with a standardised exponentially decaying bias function, that is

\[ E(X_i) = \mu - B(i/\tau_r) = \mu - B(t_i), \tag{1} \]

with $t_i = i/\tau_r$ and the bias function

\[ B(t) = \mu \exp(-t). \tag{2} \]

The quantity $\tau_r$ is now seen as a "(time) scale parameter" which relates the sampling rate to the relaxation time. Given sufficient data and knowing $\tau_r$ we could partition the data into batches of length, say, $b = 4\tau_r$. By this procedure we have effectively scaled the initial transient to fit inside the first batch $\{X_i : i = 0, 1, 2, \ldots, b-1\}$, irrespective of the scale of the transient itself. To see this, consider the form of the bias in batch $j$,

\[ B_j(t_i) = e^{-4j}\,\mu \exp(-t_i) = e^{-4j} B_0(t_i), \qquad i = 0, 1, 2, \ldots, b-1, \quad j = 0, 1, 2, \ldots \tag{3} \]

Thus, the bias in the second batch is approximately 2 percent of that in the first batch, and in the following batches smaller still. A data series batched in this way has the useful property that the form and relaxation time of the bias function in the first batch are known. This enables a bias test of optimum power to be used on this batch, with the resultant conditions that bias will be detected in the first batch, but not in the second or succeeding batches.

The procedure can now be inverted: without explicitly trying to find $\tau_r$, the batch size can be increased from a low value to the point where the above conditions are satisfied. The end of the first batch becomes in effect the "point past which bias cannot be detected" and thus a suitable truncation point. We call a truncation point determination method employing this technique a scale invariant truncation-point (SIT) method.

The idea of time scaling can be seen as an extension of the "standardised time series" approach of Schruben[9], who suggested time scaling the data to the length of the available dataset or batch, which is arbitrary in relation to the initialisation bias. Within the bias testing framework described here it makes more sense to attempt to scale to the characteristic relaxation time of the bias, thus effectively standardising that bias and facilitating its detection. This method is more formally specified in the methodology and algorithm which follow.
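As a small numerical check of equation (3), the sketch below evaluates the exponential bias at the start of successive batches of length $b = 4\tau_r$. The function name `bias_at` and the values chosen for the steady-state mean and the relaxation time are illustrative assumptions, not part of the method.

```python
import math

def bias_at(mu, tau_r, j, i):
    """Bias B_j(t_i) at point i of batch j, for batches of length b = 4 * tau_r."""
    b = 4 * tau_r
    t = (j * b + i) / tau_r          # time measured in units of the relaxation time
    return mu * math.exp(-t)

# Illustrative values (assumptions): steady-state mean 10, two very different time scales.
ratios = []
for tau_r in (5, 500):
    ratio = bias_at(10.0, tau_r, 1, 0) / bias_at(10.0, tau_r, 0, 0)
    ratios.append(ratio)
    print(f"tau_r={tau_r}: bias ratio, second batch / first batch = {ratio:.4f}")
```

Both ratios equal $e^{-4}$ (a little under 2 percent) whatever the value of $\tau_r$, which is the scale invariance the batching is meant to exploit.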

3 Methodology

Consider $X_1, X_2, \ldots, X_n$, a set of simulation output data. This data is suspected of containing an initialisation transient. Divide the data up into $m$ ($m > 2$) equal batches each of $b$ points. Let $M(i)$, $i = 1, 2, \ldots, m$, be some non-negative measure of the true bias in batch $i$, and $\epsilon$ a negligibly small number. Since we are dealing with an "initialisation transient" we can consider the bias to be non-increasing, that is:

\[ M(j) \le M(i), \qquad j > i. \tag{4} \]

We shall also require that the data set is of sufficient length so that there is negligible bias in the latter half of the data:

\[ M(i) < \epsilon, \qquad i > m/2. \tag{5} \]

We explicitly make this assumption, which we don't consider restrictive, where some other authors leave an equivalent requirement implicit. If we have absolutely no knowledge of the time-scale of the transient then no possible bias detection method could work, since the total data at hand may conceivably be only an infinitesimal section of the transient and therefore give no information on a suitable truncation point.

To illustrate the principle we assume the availability of an "ideal bias detector" $D\{M(i)\}$ with sensitivity $\delta > \epsilon$ and with the characteristic that

\[ D\{M(i)\} = \begin{cases} \text{true}, & M(i) > \delta, \\ \text{false}, & M(i) \le \delta. \end{cases} \tag{6} \]

Then, either there exists at least one $b_0 < n/2$ such that the terminating condition

\[ D\{M(i)\} = \begin{cases} \text{true}, & i = 1, \\ \text{false}, & i = 2, 3, \ldots, m, \end{cases} \tag{7} \]

is met, or we can infer that no detectable bias exists in the series. That is, if detectable bias exists then a batch size $b_0$ can be found such that the initial batch contains all the detectable bias and subsequent batches are free from detectable bias.

In the practical implementation of this methodology the "ideal" bias detector of equation 6 is replaced by some practical method of batch bias detection. Therefore the "ideal" decision test $M(i) > \delta$ is replaced by a practical alternative such as $T(i) > \delta$, where $T(i)$ is some statistic stochastically dependent upon the true bias. We have used the "Schruben Test" given in [1], which we have adjusted to be optimal for an exponentially decaying bias function as shown in the next section.

The remaining task is to describe a method for searching for $b_0$. We have chosen a simple technique of starting with a small $b = b_{init}$ and increasing $b$ by a multiplicative factor $\Delta b$ until a test for bias shows {true, false, false} when applied to the first three batches. The algorithm which implements this method is given in the appendix. More complicated and intelligent search strategies may give improved results.
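The role of the ideal detector in conditions (6) and (7) can be sketched on a noise-free exponential bias. Everything named here is hypothetical: `batch_bias_measure` takes the bias at the start of a batch as the measure M(i), the growth rule for b and the numerical values are arbitrary choices, and no statistical test is involved.

```python
import math

def batch_bias_measure(mu, tau_r, b, i):
    """M(i): bias at the start of batch i (1-based) for an exponential transient.

    The bias is non-increasing, so this is the largest bias inside the batch.
    """
    return mu * math.exp(-((i - 1) * b) / tau_r)

def ideal_detector(m_value, delta):
    """Equation (6): true when the true bias measure exceeds the sensitivity delta."""
    return m_value > delta

def find_b0(mu, tau_r, n, delta, b_init=1, growth=1.5):
    """First batch size on the search grid that satisfies condition (7), or None."""
    b = b_init
    while b < n / 2:
        m = n // b
        flags = [ideal_detector(batch_bias_measure(mu, tau_r, b, i), delta)
                 for i in range(1, m + 1)]
        if flags[0] and not any(flags[1:]):
            return b
        b = max(b + 1, int(b * growth))   # always move to a strictly larger batch size
    return None

# Illustrative values (assumptions): sensitivity set to 2 percent of the mean.
b0 = find_b0(mu=10.0, tau_r=50.0, n=2000, delta=0.2)
print("b0 =", b0)
```

With the sensitivity set to 2 percent of the steady-state mean, the batch size returned is the first grid value at or beyond roughly four relaxation times; if the sensitivity exceeds the initial bias, no batch size satisfies (7) and the search reports that no detectable bias exists.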

4 Modifications to the Schruben Test

In 1983, Schruben, Singh and Tierney presented a family of optimal tests for detecting initialisation bias in a batch of simulation output data[1]. Their implementation was optimised for the somewhat arbitrary choice of a quadratic transient, which is easy to calculate and somewhat representative of a monotonic transient. We decided to optimise instead for an exponential transient because of its greater connection with the relaxation time concept and its general theoretical applicability to queueing systems.

Consider the exponential transient of relaxation time $b/4$, starting from zero, and approaching a steady state mean of $\mu$,

\[ E(X_i) = \mu\,(1 - \exp(-4i/b)), \qquad i = 1, 2, \ldots, b, \tag{8} \]

where $b$ is the number of data points in the batch. Following the method in Schruben, Singh and Tierney[1] the required weights are

\[ c_k = \exp(-4k/b)\,(1 - \exp(-1/b)). \tag{9} \]

The test statistic is then

\[ T = \sum_{k=1}^{b} c_k\, k\, S_k \;\propto\; \sum_{k=1}^{b} k\, S_k \exp\!\left(-\frac{4k}{b}\right), \tag{10} \]

where $S_k$ is the cumulative sum process[1], $S_k = \bar{X}_n - \bar{X}_k$ with $\bar{X}_k = \frac{1}{k}\sum_{i=1}^{k} X_i$.

Under the null hypothesis of no bias in the mean, the test statistic is normally distributed with zero mean and a variance given by[1]

\[ \mathrm{Var}(T) = \sigma^2 \sum_{i=1}^{b} \sum_{j=1}^{b} b \left[ \min\!\left(\frac{i}{b}, \frac{j}{b}\right) - \frac{i}{b} \cdot \frac{j}{b} \right] c_i c_j, \tag{11} \]

where $\sigma^2$ is a constant related to the variance of the time series, $\sigma^2 = \sigma_x^2 + 2\sum_{i=1}^{\infty} \gamma_x(i)$, with $\sigma_x^2 = \mathrm{Var}(X)$ and $\gamma_x(i) = \mathrm{Cov}(X_{b-i}, X_b)$. When the data is serially correlated, $\sigma^2 \ne \sigma_x^2$ in general. Approximating, using the change of variables $i = bs$ and $j = bt$, giving $di\,dj = b^2\,ds\,dt$, we have

\[
\begin{aligned}
\mathrm{Var}(T) &\approx b^3\sigma^2\,(1 - \exp(-1/b))^2 \int_0^1\!\!\int_0^1 (\min(s,t) - st)\, e^{-4s} e^{-4t}\, ds\, dt \\
&= 2\,b^3\sigma^2\,(1 - \exp(-1/b))^2 \int_0^1 (1 - t)\, e^{-4t} \int_0^t s\, e^{-4s}\, ds\, dt \\
&= \frac{b^3\sigma^2}{256}\,\bigl(1 + 2e^{-4} - 3e^{-8}\bigr)\,(1 - \exp(-1/b))^2 \\
&\approx \frac{b^3\sigma^2}{247}\,(1 - \exp(-1/b))^2.
\end{aligned}
\tag{12}
\]

The factor $(1 - \exp(-1/b))^2$ is the square of the constant that multiplies every weight $c_k$ in (9); it disappears when $T$ is taken in the unnormalised form on the right of (10), for which $\mathrm{Var}(T) \approx b^3\sigma^2/247$.

So, under the null hypothesis, $T' = 15.7\,T\,(b^3\sigma^2)^{-1/2}$ has a standard normal distribution. Now if we have an independent estimate $\hat{\sigma}^2$ of $\sigma^2$, with associated degrees of freedom $\nu$, we can form the test statistic $\hat{T}' = 15.7\,T\,(b^3\hat{\sigma}^2)^{-1/2}$, which has a Student's t distribution with $\nu$ degrees of freedom. The values of $\nu$ for different estimators are given in table 32 of Fishman[10].¹ The hypothesis of no initialisation bias is rejected if $\hat{T}' > t(\nu, \alpha)$. Here $t(\nu, \alpha)$ is the upper $100\alpha$-quantile of the t distribution with $\nu$ degrees of freedom.

¹ NB: The last line of this table is slightly incorrect; the correct quantity is given in [11, expression (5.73)].

The estimate of the white noise variance of the process $\hat{\sigma}^2$ and the degrees of freedom $\nu$ were obtained by a Fishman estimation procedure[11, section 5.10] from the latter part of the data. This procedure fits an autoregressive (AR) model to the data and estimates $\hat{\sigma}^2$ from the fitted model. The fitting of an AR model (of finite order) to simulation data may not always be appropriate; however, it seems to work reasonably well in the situations tested. We used the Fishman procedure because it was used by Schruben, Singh and Tierney[1] and the FORTRAN code was readily available in the appendix to Fishman's book[11]. With a similar test Heidelberger and Welch[7] use a "spectral estimator" of $\sigma^2$. In this approach $\sigma^2$ is seen as the spectral density of the process at zero frequency; this estimator is discussed at length in [12]. Estimating models and parameters from time series data is a specialist subject area in itself, see for example Brockwell and Davis[13], and research is ongoing.

We are indebted to one of the referees for making the suggestion that it may be preferable to use the standardised time series method of Schruben[9] to effectively estimate $\sigma^2$. Schruben has given two statistics $G$ and $H$ based on the maximum and sum of the standardised time series respectively. Both these statistics are asymptotically distributed as $\sigma^2$ times a chi-square random variable with the degrees of freedom given in [9]. The idea is to compute either $G$ or $H$ from the latter (stationary) part of the data, then form the ratio $T'/\sqrt{G}$ or $T'/\sqrt{H}$, which has a Student's t distribution with the corresponding degrees of freedom. This procedure has not been tried here, although it will have computational advantages, may work better, and does seem to be more in keeping with the "style" of the bias test itself.

After testing it was found that the power of the above test was poor when the batch size was significantly smaller than the optimum of $4\tau_r$, that is, when the batch ended inside the transient. A remark to this effect had been made by Schruben[2, p.588], and this effect was noted by Heidelberger and Welch[7]. The solution adopted was to augment the Schruben test with a simple difference-between-means t-test comparing the batch mean with an estimate for the steady-state mean obtained from the end of the data, along with the Fishman estimate of the scale parameter. The augmented test consists of applying the difference-in-means test and then, if that test fails to detect bias, applying the Schruben test. The augmented test is more robust with respect to batch length, and has proved more suitable for the SIT procedure. It may also be desirable to supplement the test with the more general Schruben test[2], as Schruben, Singh and Tierney recommend that both tests be conducted serially[1].
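The unnormalised statistic on the right of (10) and the constant in the variance approximation can be checked numerically. The sketch below is an illustration under stated assumptions: the function names are hypothetical, the overall mean in $S_k$ is taken to be the mean of the batch being tested, and $\sigma^2$ is supplied by the caller rather than estimated by the Fishman procedure.

```python
import math

def schruben_exp_statistic(x):
    """Unnormalised T = sum_k k * S_k * exp(-4k/b), with S_k = mean(x) - mean(x[:k])."""
    b = len(x)
    total_mean = sum(x) / b          # assumption: overall mean taken over the batch
    running = 0.0
    t = 0.0
    for k in range(1, b + 1):
        running += x[k - 1]
        s_k = total_mean - running / k
        t += k * s_k * math.exp(-4.0 * k / b)
    return t

def variance_constant(b):
    """Var(T) / (b^3 sigma^2) from the double sum in (11), using weights exp(-4k/b)."""
    w = [math.exp(-4.0 * k / b) for k in range(1, b + 1)]
    total = 0.0
    for i in range(1, b + 1):
        for j in range(1, b + 1):
            total += b * (min(i, j) / b - (i / b) * (j / b)) * w[i - 1] * w[j - 1]
    return total / b ** 3

def standardised_statistic(x, sigma2):
    """T' = 15.7 * T * (b^3 * sigma2)^(-1/2); sigma2 must come from elsewhere."""
    b = len(x)
    return 15.7 * schruben_exp_statistic(x) / math.sqrt(b ** 3 * sigma2)

inverse_constant = 1.0 / variance_constant(400)
print(f"1 / variance constant for b = 400: {inverse_constant:.1f}")
```

The reciprocal of the constant comes out close to 247, and its square root is close to the factor 15.7 used to standardise T. A series rising towards its steady state from below gives a positive T, which is why the test is one-sided.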

5 Results

It is possible to evaluate truncation point methods in at least two ways differing in emphasis. Some authors concentrate on the quality of the final estimator after the truncation point method has been applied; such schemes often also include an attempt at automatic run-length control as well as truncation (for example see [6], [7]). Our purpose here is to examine the behaviour of the truncation point method per se; this approach is relevant where the simulation run length is not artificially constrained (by processing time, for example) and adequate run lengths following the truncation point are available for evaluation of the estimator. It is then desirable, in principle, to eliminate the most biased "bad data" at the beginning of each run, as with ample data remaining such truncation does not appreciably increase the variance of the estimator through reduced sample size. We present in tabular form some summary statistics to indicate the behaviour of the SIT method on data.

The SIT method has been evaluated on five data-sets obtained from simulation models. The models were written in the simulation language SIMAN[14]. More complete details of the models, their implementation, and the test results are available in [15]. A brief description of each model follows together with a reference to its description in the open literature.

Model 1  Waiting time in an M/M/1 queueing system with a traffic intensity of 0.9 and a service rate of 1/9 s^-1. 100 replications of 5000 points. Averaged across 10 replications to give 10 data sets of 5000 points. The steady-state mean is 81 seconds.[16]

Model 2  The number of customers in a network of three capacitated M/M/s queues with feedback. 100 replications of 2000 customers. Averaged across 10 replications to give 10 data sets of 2000 points. The steady-state mean for this system is unknown to the authors.[1]

Model 3  The response time of a time-share computer system. Parameters used were N = 35; 1 = 1/25; 2 = 5/4; q = 1;  = 0.015; with these values the steady-state mean response time is 8.246 seconds. 10 replications of 2000 task completions.[17]

Model 4  The response time of a central server computer system. Parameters used were M = 3; N = 8; 1 = 1; 2 = 0.5; p1 = 0; p2 = 0.5; p3 = 0.5; with these values the steady-state mean response time is 10 seconds. 10 replications of 2000 task completions.[17]

Model 5  An AR(2) time series with artificial "bias". An AR(2) model $X_i = 0.75X_{i-1} - 0.5X_{i-2} + \varepsilon_i + b_i$ with bias $b_i = (i - 30)/30$, $i \le 30$, where $\{\varepsilon_i\}$ is a sequence of uncorrelated standard normal random variables. 10 replications of 2000 points. The steady-state mean is zero.[1]

Judging by its widespread use, the infinite capacity M/M/1 queue (Model 1) has become the de facto standard model for evaluating initialisation bias detection schemes. With high loading this system causes trouble for all detection methods. With the parameters shown the Roth and Rutan relaxation time[3] is $\tau_r = 2441$ seconds, giving a truncation point of $4\tau_r \times \frac{1}{9} = 1085$ service completions. Taking this as a reference value we can compare the SIT method with the published values for several other current methods taken from [16].² The results of this comparison are shown in Table 1.

² The truncation point for the Roth & Rutan method is shown incorrectly in [15] and [16].

All the methods on average fell short of the reference value of 1085[3], which is given for an expected value of the data within 2 percent of steady-state. The average truncation point for the SIT method obtained after 10 runs is 686, and those of the other methods are lower still. A truncation point of 686 corresponds to a bias value below about 5 percent of the steady-state mean for this model. We can conclude on this limited data that the SIT method appears to be more sensitive at detecting bias than these other recent methods. This comparison is somewhat unfair to some of the other methods, which have been developed and tuned to minimise some other quantity such as the mean-square-error of an estimator. However, as far as sensitivity to bias is concerned, such a comparison gives an indication as to the performance of the SIT method.

It is important that a truncation point method returns "sensible" values almost all the time on all kinds of data. Because of statistical variation some simulation series show no sign of initialisation bias whereas others of the same model exhibit abnormally large "apparent" initialisation bias. Within the philosophy of evaluation expressed at the beginning of this section it is important for a method to select a suitable truncation point based on the data at hand, so a large variation in truncation points is to be expected for some models.

The final results for the SIT method are presented in Table 2, where we present a summary of the truncation point statistics (mean, maximum, minimum, standard deviation, and the number of runs where bias was detected) over 10 runs for each model. The minimum batch size in the algorithm was set to 25 points for statistical reasons; this caused the method to fail to detect any bias in some instances where the transient was very short.

A good indicator of the performance of the SIT method is given by the results for model 5. Model 5 is an AR(2) series with "artificial" linear bias. The correct truncation point for this data is 30. Bias was detected in all 10 runs; the truncation point was set at 25 six times, 31 once, 58 twice, and 112 once, for an average of 41.
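A series of the Model 5 kind can be generated with a few lines of code. This is a minimal sketch under assumptions that the text does not settle: the recursion is started from $X_0 = X_{-1} = 0$, the bias term enters the recursion exactly as the model equation is written, and the function names, seed and `noise_sd` parameter are hypothetical.

```python
import random

def bias_term(i):
    """Artificial bias b_i = (i - 30) / 30 for i <= 30, and zero afterwards."""
    return (i - 30) / 30 if i <= 30 else 0.0

def model5_series(n, seed=None, noise_sd=1.0):
    """Generate X_1..X_n from X_i = 0.75 X_{i-1} - 0.5 X_{i-2} + e_i + b_i.

    Assumptions: X_0 = X_{-1} = 0, and the bias enters the recursion as written.
    """
    rng = random.Random(seed)
    prev1 = prev2 = 0.0
    series = []
    for i in range(1, n + 1):
        x = 0.75 * prev1 - 0.5 * prev2 + rng.gauss(0.0, noise_sd) + bias_term(i)
        series.append(x)
        prev1, prev2 = x, prev1
    return series

series = model5_series(2000, seed=1)
early = sum(series[:30]) / 30
late = sum(series[1000:]) / 1000
print(f"mean of first 30 points: {early:.2f}, mean of last 1000 points: {late:.2f}")
```

Setting `noise_sd=0.0` leaves only the deterministic bias path, which is a convenient way to see that the bias has vanished by point 30.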

Table 1: Average truncation point over 10 replications on M/M/1 queueing system (ρ = 0.9).

Method       Average truncation point
Roth         1085‡
SIT           686
Kelton*       395
Schruben*     263
Kimbler*      166

* data from Kimbler & Knight[16].
‡ theoretical value.
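The reference figures for Model 1 can be reproduced with a short calculation. The mean waiting time uses the standard M/M/1 result. The relaxation time formula in `relaxation_time` is an assumption: it is not stated in the text, and is used here only because it reproduces the quoted value of 2441 seconds for these parameters.

```python
import math

def mm1_mean_wait_in_queue(service_rate, rho):
    """Steady-state mean waiting time in queue for M/M/1: rho / (mu - lambda)."""
    arrival_rate = rho * service_rate
    return rho / (service_rate - arrival_rate)

def relaxation_time(service_rate, rho, ca2=1.0, cs2=1.0):
    """ASSUMED form of the relaxation time approximation (not given in the text):
    tau_r = (ca2 + cs2) / (2.8 * mu * (1 - sqrt(rho))**2),
    where ca2 and cs2 are squared coefficients of variation (both 1 for M/M/1)."""
    return (ca2 + cs2) / (2.8 * service_rate * (1.0 - math.sqrt(rho)) ** 2)

service_rate, rho = 1.0 / 9.0, 0.9
wait = mm1_mean_wait_in_queue(service_rate, rho)
tau_r = relaxation_time(service_rate, rho)
truncation = 4.0 * tau_r * service_rate     # seconds converted to service completions
print(f"mean wait = {wait:.1f} s, tau_r = {tau_r:.0f} s, truncation = {truncation:.0f}")
```

Under these assumptions the calculation gives a mean wait of 81 seconds, a relaxation time of about 2441 seconds and a truncation point of about 1085 service completions, in agreement with the values quoted for Model 1.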

Table 2: Summary statistics for SIT method over 10 runs.

         Truncation point
Model    mean    sd     min    max      detection %
1        686     543    96-    1600+     90
2        294     205    93     660+     100
3         36      19    25-      87      40
4         32      14    25-      72      80
5         41      27    25      112     100

A (-) suffix indicates that bias was not detected: a default minimum value is used as the truncation point. A (+) suffix indicates a premature exit in the algorithm (data set too short): the current maximum value is used as the truncation point.

6 Summary and Conclusion

The SIT method is based on standardising the time series to the relaxation time of the initialisation transient. This leads to certain properties on the detectability of bias in data batched in multiples of this relaxation time. This procedure is inverted to obtain an estimate for the relaxation time and thus a suitable truncation point.

For unknown transient behaviour a general test such as Schruben 1982[2] should be used. For monotonic transients a test[1] optimised for an exponential transient is suggested. These tests may need to be augmented by a simple difference-between-means test to ensure power for long transients.

The bias tests require an estimate of the process noise variance $\sigma^2$. This estimate can be obtained from the stationary far-end of the data by several alternative methods. An AR model fitting procedure was used here but this is not necessarily being promoted.

In conclusion we remark that the SIT method as a whole offers an improvement in sensitivity over other contemporary methods. The results of limited testing indicate that the truncation points obtained on various models are reasonable. Very large truncation points seem rare, and bias is almost always detected where it is obviously present (models 1 & 5). The performance depends on the power and robustness of the batch test contained within the method. The SIT framework may remain suitable as newer, more powerful and more robust batch bias tests are developed. Suggestions for improvements are given in notes in the appendix.

References

[1] Lee Schruben, H. Singh, and L. Tierney. Optimal tests for initialization bias in simulation output. Operations Research, 31(6):1167-1178, 1983.

[2] Lee Schruben. Detecting initialization bias in simulation output. Operations Research, 30(3):569-590, 1982.

[3] Emily Roth and Alan H. Rutan. A relaxation time approach for reducing initialization bias in simulation. In The 18th Annual Simulation Symposium, pages 189-203, 1985.

[4] Amedeo R. Odoni and Emily Roth. An empirical investigation of the transient behaviour of stationary queueing systems. Operations Research, 31:432-455, 1983.

[5] James R. Wilson and A. Alan B. Pritsker. A survey of research on the simulation startup problem. Simulation, 31(2):55-58, 1978.

[6] W. David Kelton and Averill M. Law. A new approach for dealing with the startup problem in discrete event simulation. Naval Research Logistics Quarterly, 30:641-658, 1983.

[7] Philip Heidelberger and Peter D. Welch. Simulation run length control in the presence of an initial transient. Operations Research, 31:1109-1145, 1983.

[8] Krzysztof Pawlikowski. Steady-state simulation of queueing processes: a survey of problems and solutions. ACM Computing Surveys, 22(2):123-170, 1990.

[9] Lee Schruben. Confidence interval estimation using standardized time series. Operations Research, 31(6):1090-1108, 1983.

[10] George S. Fishman. Concepts and Methods in Discrete Event Simulation. John Wiley & Sons, New York, 1973.

[11] George S. Fishman. Principles of Discrete Event Simulation. Wiley Series on Systems Engineering and Analysis, John Wiley & Sons, New York, 1978.

[12] Philip Heidelberger and Peter D. Welch. A spectral method for confidence interval generation and run length control in simulations. Communications of the ACM, 24(4):233-245, 1981.

[13] Peter J. Brockwell and Richard A. Davis. Time Series: Theory and Methods. Springer Series in Statistics, Springer-Verlag, New York, 1987.

[14] C. Dennis Pegden. Introduction to SIMAN. Systems Modeling Corp., Pennsylvania, 1986.

[15] Paul Thomas Jackway. The Determination of Data Truncation Point for Initialization Bias Reduction in Discrete-Event Computer Simulation. Master's thesis, Royal Melbourne Institute of Technology, Melbourne, 1990.

[16] Delbert L. Kimbler and Barry D. Knight. A survey of current methods for the elimination of initialization bias in digital simulation. In The 20th Annual Simulation Symposium, pages 133-152, 1987.

[17] Averill M. Law and J.S. Carson. A sequential procedure for determining the length of a steady-state simulation. Operations Research, 27:1011-1025, 1979.

[18] A.V. Gafarian, C.J. Ancker, Jr., and T. Morisaku. Evaluation of commonly used rules for detecting "steady state" in computer simulation. Naval Research Logistics Quarterly, 25:511-529, 1978.

7 Appendix: The SIT Algorithm

The SIT algorithm is given using the following notation:

n       total data points.
b       batch size.
Δb      multiplicative batch size increment.
b_init  initial batch size.
tp      truncation point.
ec      error code indicating the type of terminating condition: Null: normal completion, LeftLimit: initial batch tested good, RightLimit: n too small.

Step 1  Initialise the batch size. b ← b_init

Step 2  Obtain data for batches 1, 2, and 3.

Step 3  Test for bias in batches 1, 2, and 3.

Step 4  Make a decision based on the test results.

  • If the test results are {true, false, false} then tp ← b, ec ← Null, EXIT (this is the normal exit point)

  • If the test results are {false, false, false} then tp ← b, ec ← LeftLimit, EXIT (this is an error exit indicating that no bias was found in the initial batch)

Step 5  Increase the batch size. b ← Δb · b.

Step 6  If b > n/3 THEN tp ← b, ec ← RightLimit, EXIT (this is an error exit indicating the series is too short for the algorithm to reach a normal termination) ELSE GOTO Step 2.

The algorithm is further expanded in the following notes:

• b_init can be obtained by a simple method known to underestimate the truncation point. The method we used is an adaptation of the first-crossing-of-the-mean method[18]. The mean of the entire data set is calculated when the data is first read into the program. Then the data is scanned forward from the beginning to find the first data-point which crosses this mean; this becomes b_init. b_init is constrained to exceed 25 points to ensure an adequate batch size for the Schruben test.

• The algorithm as presented is guaranteed to terminate since the batch size increases each iteration and must eventually exceed the maximum of n/3 points. This limit has been set to ensure at least three batches to test. At step 6, instead of exiting with an error, it may be possible to increase n, for example by generating more data.

• A more intelligent decision algorithm could no doubt look at more batches and take into account results from previous iterations at step 5, although this has not been attempted here.

• The Schruben test requires an unbiased estimate of the variance of the data. We obtained this estimate from a Fishman auto-regressive model fitting procedure[11] applied to the second half of the data. This is the reason for assuming that there is negligible bias in the latter part of the data.

• Computational efficiency on a finite machine dictates that a whole batch of data be held in an ARRAY with a finite size. However, the spirit of the scale invariant procedure suggests that we should not artificially limit the maximum size of a batch in the algorithm. The solution is found in a dynamic re-batching technique[12]. The idea is that instead of working with the data directly, we use pseudo-points, where a pseudo-point after the kth re-batching is defined as the average of $2^k$ consecutive data points. The number of pseudo-points can always be kept within the limits of the finite array. After re-batching the Fishman procedure is rerun, as it is also constrained by an array size limitation.
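Steps 1 to 6 and the first-crossing-of-the-mean start-up rule can be sketched as follows. This is an illustration under assumptions, not the authors' implementation: `mean_difference_test` is a crude stand-in for the augmented Schruben test, with an arbitrary fixed threshold in place of a t-test; the growth factor of 1.5 is an arbitrary choice for the batch size increment; re-batching into pseudo-points is omitted; and all function names are hypothetical.

```python
import math

def first_crossing_of_mean(data, minimum=25):
    """Index of the first point on the other side of the overall mean, floored at `minimum`."""
    mean = sum(data) / len(data)
    starts_below = data[0] < mean
    for i, x in enumerate(data):
        if (x < mean) != starts_below:
            return max(i, minimum)
    return minimum

def mean_difference_test(batch, data, threshold=0.2):
    """Stand-in bias test (an assumption, NOT the Schruben test): compare the batch
    mean with the mean of the second half of the data against a fixed threshold."""
    tail = data[len(data) // 2:]
    return abs(sum(batch) / len(batch) - sum(tail) / len(tail)) > threshold

def sit_truncation_point(data, bias_test, b_init, growth=1.5):
    """Steps 1-6 of the SIT algorithm; returns (tp, ec)."""
    n = len(data)
    b = b_init                                                        # Step 1
    if b > n / 3:                                                     # guard: need three batches
        return b, "RightLimit"
    while True:
        batches = [data[k * b:(k + 1) * b] for k in range(3)]         # Step 2
        results = tuple(bias_test(batch, data) for batch in batches)  # Step 3
        if results == (True, False, False):                           # Step 4, normal exit
            return b, "Null"
        if results == (False, False, False):                          # Step 4, no bias found
            return b, "LeftLimit"
        b = max(b + 1, int(growth * b))                               # Step 5
        if b > n / 3:                                                 # Step 6
            return b, "RightLimit"

# Noise-free exponential transient with relaxation time 50 (illustrative assumption).
data = [10.0 * (1.0 - math.exp(-i / 50.0)) for i in range(2000)]
tp, ec = sit_truncation_point(data, mean_difference_test, b_init=25)
print("truncation point:", tp, "exit code:", ec)
```

On this noise-free series the search stops at a batch size a little under four relaxation times, because the stand-in test has a fixed threshold. In practice `first_crossing_of_mean(data)` would supply `b_init`, and a statistical test such as the one in section 4 would replace the stand-in.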
