5th Annual IEEE Conference on Automation Science and Engineering Bangalore, India, August 22-25, 2009
Regular Flow Line Models for Semiconductor Cluster Tools: A Case of Lot Dependent Process Times James R. Morrison, Member, IEEE
Abstract—We develop a reduced complexity recursion for the wafer delay in each server in flow lines with wafer dependent deterministic or regular process times and demonstrate how it can serve to model lot production in semiconductor cluster tools with setups. Under certain assumptions on the process times, it is shown that the system behavior shares some similarities with the case of wafer independent process times, thereby enabling our results. Such models can be used to substantially increase the fidelity of existing fabricator simulation models, without the computational complexity of a complete step-by-step wafer, module and robot simulation. The models have been tested using data from a clustered photolithography tool in production and exhibited throughput and tool sojourn time values within 1% and 4% of the actual values, respectively.
I. INTRODUCTION
SIMULATION models of semiconductor wafer fabrication facilities (fabs) invariably employ tool models that allow for the specification of a time for the first wafer to exit the tool once it enters production, the so-called first wafer delay, and a time between subsequent wafer exits from the tool. Additional features such as batch sizes, a first batch delay and setup times are also common. Whereas these models have served for many years, recent questions have been raised about their applicability to key toolsets [1]. Such concerns have arisen due to the impending proliferation of small lot sizes brought about by the anticipated 450 mm wafer era and the increasing use of cluster tools. One key tool for which traditional simulation models may prove inadequate for future simulation needs is the clustered photolithography tool. These tools typically cost on the order of US$30 million and are thus intended to serve as the fab bottleneck. The term clustered refers to the fact that the photolithography scanner is generally physically attached to the pre-scan process modules, termed the coat track, and to the post-scan process modules, termed the develop track. Sizable internal buffers are employed at the interface of the pre-scan track and the scanner itself (and sometimes between the scanner output and the post-scan track) and setups may be required at the pre-scan track, the scanner and the post-scan track when changing from wafers of one type to another. Due to the setups and the internal buffer, the first wafer delay depends upon the past production history of the tool. Such dependence grows in import as lot sizes reduce and wafer
Manuscript received March 3, 2009. Revision received June 21, 2009. J. R. Morrison is with the Industrial and Systems Engineering Department and the KAIST Institute for the Design of Complex Systems, KAIST, 373-1 Guseong-dong, Yuseong-gu, Daejeon 305-701, Republic of Korea (e-mail:
[email protected]; homepage: http://xS3D.kaist.edu).
978-1-4244-4579-0/09/$25.00 ©2009 IEEE
diversity increases. This dependence on the past has implications for modeling both throughput and cycle time. To address the insufficiency of traditional models of cluster tools, one can employ a full scale model of all the internal action within the cluster. That is, one can model the movement of each wafer from process module to process module within the tool and include the robot wafer transport decisions and time. However, this approach is highly inappropriate for a fab level simulation due to the dramatic increase in computational complexity associated with a detailed simulation of wafer progress within the tool. To address the increase in computational complexity arising from the consideration of internal tool behavior, a first modeling simplification is to ignore the robot. This approach can work well when the tool is operating outside of the robot limited regime (as one certainly hopes is the case for the fabricator bottleneck) and, for non-reentrant process flows inside the tool, yields a flow line model. However, even flow line models can require too much computation to serve. Building on the classic work of [2] and [3] in an effort to gain general insight into the wafer evolution in flow lines and reduce the computational complexity, it was demonstrated in [4, 5] that a regular asynchronous flow line may be exactly decomposed into subsets. Such decomposition allows for a nearly two orders of magnitude reduction in computation for many quantities of interest. There, the flow line model was restricted to deterministic (regular) process times that are independent of the wafer. Here we study flow line models for clustered photolithography tools that allow wafer dependent process times in each process module. As a consequence, our models are much more amenable to application. One consequence of allowing wafer specific process times is that the models are not as efficient as those of [4, 5]. 
However, they still require about one order of magnitude less computation than a direct flow line model. Further, the models have been tested on data from a clustered photolithography tool in production serving diverse classes of wafers and provided throughput and cycle time values to within 1% and 4% of the actual values, respectively. The models are thus good candidates for full fabricator simulations that require improved model fidelity to capture the effect of reduced lot size and wafer diversity. The rest of the paper is organized as follows. In Section II, we review flow line models and introduce our assumptions on the process times. The decomposition of such a flow line into channels and development of the basic recursions is discussed in Section III. In Section IV, we extend the results to wafer lots and setups. Application issues such as computational
complexity and an example are provided in Section V. Section VI presents concluding remarks. Throughout, no proofs are given due to space limitations. Note, however, that the (max,+)-algebra (cf. [6]) and induction are useful tools for developing the proofs.
II. SYSTEM DESCRIPTION
A flow line consists of $M$ process modules, $m_1, m_2, \ldots, m_M$, from which customers receive service in sequential order. Hereafter, we refer to customers as wafers, since this is the unit of work in semiconductor wafer manufacturing. Thus a wafer, after receiving service from module $m_i$, next requires service from module $m_{i+1}$. After receiving service from module $m_M$, the wafer exits the system. There is a waiting room of infinite capacity at the head of the system and wafers may be considered to enter service (via module $m_1$) in a first-come first-served order. The arrival time of wafer $w$ is denoted as $a_w$. For wafers that arrive in batches (hereafter termed lots, as in our application area), all wafers have identical arrival times. A wafer that has received service from module $m_i$ proceeds to module $m_{i+1}$ immediately upon the vacancy of that subsequent module, unless $i = M$, in which case the wafer simply exits the system. This asynchronous advancement protocol is termed manufacturing blocking in [7]. To be precise, we require that all processes are right continuous with left limits.
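The advancement protocol described above can be sketched as a module-by-module recursion on start times. The following Python fragment is an illustration only: the function name `simulate_flow_line` and its argument layout are assumptions made here, and the per-wafer service times are supplied directly rather than through wafer classes.

```python
import math

def simulate_flow_line(arrivals, service):
    """Return starts[w][j], the time wafer w enters module j (0-indexed).

    arrivals[w]   : arrival time of wafer w (nondecreasing, first-come first-served)
    service[w][j] : deterministic service time of wafer w in module j
    """
    num_modules = len(service[0])
    # Times at which the previous wafer vacated each module (none yet).
    vacated = [-math.inf] * num_modules
    starts = []
    for a, svc in zip(arrivals, service):
        # Enter the first module once arrived and once it is vacant.
        start = [max(a, vacated[0])]
        for j in range(1, num_modules):
            # Advance as soon as own service is complete and the next module is vacant.
            start.append(max(start[j - 1] + svc[j - 1], vacated[j]))
        # A wafer vacates module j when it enters module j+1; it leaves
        # the last module as soon as its service there completes.
        vacated = start[1:] + [start[-1] + svc[-1]]
        starts.append(start)
    return starts

# Two wafers arriving together to a three-module line (illustrative values).
starts = simulate_flow_line([0, 0], [[4, 2, 10], [2, 1, 5]])
print(starts)  # [[0, 4, 6], [4, 6, 16]]
```

In this small instance the second wafer is held in the second module from time 7 until time 16, when the first wafer vacates the last module; this is the blocking behavior the delay recursions later in the paper aim to capture without stepping through every module.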
Fig. 1. A flow line with $M = 8$ modules and an infinite holding area for queued wafers. Module $m_4$ is labeled $m_B$ as it is the bottleneck.
Such a system is depicted in Figure 1. There, the squares represent the modules and an infinite capacity waiting room is available prior to module $m_1$. No buffers are depicted, nor is it necessary to distinguish between buffer modules and process modules. This is because, as shown in [1], a buffer may simply be modeled as a module with zero service time. When the service times are deterministic and wafer independent, denoted as $\tau_j$ for module $m_j$, the system described is an asynchronous flow line with regular service times and arbitrary input. This was the model studied in [2, 3, 4, 5]. It was shown in [5] that many of the results of [2, 3, 4] apply in the case when the bottleneck module service time is allowed to depend on the wafer but always remains strictly greater than the service time of all preceding modules. Throughout the sequel, we assume that there are $K$ classes of wafers, $c_1, \ldots, c_K$, and we use $c(w)$ to denote the class of wafer $w$. Without loss of generality, we may assume that all wafers in a lot are of the same class. Within class $c_k$, all wafers require the same deterministic service time from module $m_j$; these service times are denoted as $\tau_j^k$. The service times are termed regular or deterministic and wafer dependent; we call
the system a multiclass flow line. We may also say the service times are lot dependent or class dependent. To obtain our results, we restrict attention to service times with a certain structure as follows.

Assumption A1: Service times between wafer classes. The service times are ordered such that $\tau_j^{k+1} = \eta_k \tau_j^k$, $k = 1, \ldots, K-1$, $j = 1, \ldots, M$, where $0 < \eta_k < 1$. Let $\eta$ be the product of the $\eta_k$'s, that is, $\eta = \eta_1 \cdots \eta_{K-1}$.

Definition 1: Successive bottlenecks. For a flow line with deterministic service times, a server $m_j$ is termed a successive bottleneck for wafers of class $c$ if $\tau_j^c > \tau_i^c$, $\forall i < j$. That is, we distinguish servers that have service times strictly greater than all preceding servers within each class.

Under Assumption A1, each class shares the same successive bottlenecks. Let $\sigma$ denote their number and $\beta(\alpha)$ denote the server index for successive bottlenecks $\alpha = 1, \ldots, \sigma$. That is, if module $m_7$ is the second successive bottleneck, then $\beta(2) = 7$; also, $\beta(\sigma) = B$ (the module index of the bottleneck for each class). Regular flow lines with service times satisfying Assumption A1 were considered in [8]. While their results address a more general class of flow lines than we will consider, they do not have the character of the results of [2, 4, 5]. Employing the next assumption, which restricts the service times within each class, we can obtain results akin to those of [2, 4, 5] for a multiclass flow line.

Assumption A2: Service times within a class. For class $c_1$ wafers, $\tau_j^1 = \eta^{j-\beta(\alpha)} \tau_{\beta(\alpha)}^1$, for $\beta(\alpha) < j < \beta(\alpha+1)$, $\alpha = 1, \ldots, \sigma-1$. For $j > \beta(\sigma) = B$, $\tau_j^1 = \eta^{j-B} \tau_B^1$. That is, between adjacent successive bottlenecks and after the class bottleneck, the service times decay geometrically at rate $\eta$.

Example 1: Multiclass service times. Consider a multiclass flow line with four customer classes and $M = B = 10$. Let $\tau_B^1 = 90$, $\tau_B^2 = 81$, $\tau_B^3 = 60.75$ and $\tau_B^4 = 48.6$, so that $\eta_1 = 81/90 = 0.9$, $\eta_2 = 60.75/81 = 0.75$ and $\eta_3 = 48.6/60.75 = 0.8$. The product is $\eta = 0.54$. Class $c_1$ (and each class) has three successive bottlenecks (including $m_B$); they are $m_1$, $m_5$ and $m_{10}$ with service times $\tau_1^1 = 40$, $\tau_5^1 = 60$ and $\tau_{10}^1 = 90$. The service times of the intervening modules are $\tau_2^1 = \eta \tau_1^1 = 21.6$, $\tau_3^1 = \eta^2 \tau_1^1 = 11.664$, …, $\tau_6^1 = \eta \tau_5^1 = 32.4$, $\tau_7^1 = \eta^2 \tau_5^1 = 17.496$. Thus the service times of class $c_1$ obey Assumption A2. Other service times for classes $c_2$, $c_3$ and $c_4$ are obtained via application of Assumption A1 using $\eta_1$, $\eta_2$ and $\eta_3$. □

Definition 2: Channels. The modules between and including two adjacent successive bottlenecks, that is, modules $m_{\beta(\alpha)}, \ldots, m_{\beta(\alpha+1)}$, are termed a channel.

Let $x_j(w)$ denote the time that wafer $w$ enters service with module $m_j$. Let $\gamma_j(w) := x_j(w) + \tau_j^{c(w)}$ denote the time that wafer $w$ completes its service with module $m_j$ (not including any time spent waiting for the subsequent module). To account for the possibility that there are delays or setups at the bottleneck module, we write $\tau_B^{c(w)}(w) = \tau_B^{c(w)} + s_B(w)$, where $s_B(w) \ge 0$. We are now prepared to introduce our results.
III. RECURSIONS FOR DELAY
As is demonstrated in [3], there is a succinct recursion for the departure times from a regular flow line with arbitrary arrival process. Extending this concise recursion to the case of wafer dependent service times, even under Assumption A1, is not possible. However, by further restricting our attention to Assumption A2, we can ensure that there is no contention after the bottleneck module.
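As an illustration of Assumptions A1 and A2, the service times of Example 1 can be generated from the three successive bottleneck times of class $c_1$ and the ratios $\eta_1, \eta_2, \eta_3$. The sketch below is not from the paper; the helper names (`class1_times`, `all_class_times`, `successive_bottlenecks`) are hypothetical and chosen only for this illustration.

```python
import math

def class1_times(bottlenecks, num_modules, eta):
    """Class c1 service times under Assumption A2.

    bottlenecks maps the 1-indexed module index of each successive
    bottleneck to its class c1 service time; module 1 must be included.
    """
    times = []
    for j in range(1, num_modules + 1):
        # Nearest successive bottleneck at or before module m_j.
        b = max(i for i in bottlenecks if i <= j)
        # Geometric decay at rate eta after that bottleneck.
        times.append(bottlenecks[b] * eta ** (j - b))
    return times

def all_class_times(tau1, etas):
    """Apply Assumption A1: tau_j^{k+1} = eta_k * tau_j^k."""
    table = [list(tau1)]
    for eta_k in etas:
        table.append([eta_k * t for t in table[-1]])
    return table

def successive_bottlenecks(times):
    """1-indexed modules whose time strictly exceeds that of every preceding module."""
    found, largest = [], -math.inf
    for j, t in enumerate(times, start=1):
        if t > largest:
            found.append(j)
            largest = t
    return found

etas = [0.9, 0.75, 0.8]                 # eta_1, eta_2, eta_3 of Example 1
eta = math.prod(etas)                   # approximately 0.54
tau1 = class1_times({1: 40.0, 5: 60.0, 10: 90.0}, 10, eta)
table = all_class_times(tau1, etas)     # table[k-1][j-1]: class c_k in module m_j
print([successive_bottlenecks(row) for row in table])
```

Because every class is a scaled copy of class $c_1$, the search returns the same successive bottlenecks ($m_1$, $m_5$, $m_{10}$) for each row of the table, which is the property noted after Definition 1.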
Lemma 1: No contention after the bottleneck. Under Assumptions A1 and A2, for $M > B$ and an initially empty system, the start and completion times of wafers obey

$$x_{j+1}(w) = x_j(w) + \tau_j^{c(w)}(w), \quad j = B, \ldots, M-1,$$

$$x_j(w) \ge x_j(w-1) + \tau_B^{c(w-1)}(w-1) + \sum_{n=B}^{j-1} \left( \tau_n^{c(w)}(w) - \tau_n^{c(w-1)}(w-1) \right), \quad j = B, \ldots, M,$$

where $\tau_j^{c(w)}(w) = \tau_j^{c(w)}$, $j = B+1, \ldots, M$.

In many cluster tool models (such as clustered photolithography and generic multicluster tools [9, 10]), a setup cannot begin until all wafers have vacated the relevant section of the tool. In some clustered photolithography tools, for example, the initial pre-scan track may require that all modules be vacant of wafers prior to initiating a setup. A similar condition may exist for multicluster tools and we call these state dependent setups. To assess the internal state of the flow line, as is necessary to model state dependent setups, one must account for the advancement of wafers within the system. The decomposition of the flow line into channels (as is done in [4, 5]) allows for the characterization of wafer evolution within the flow line.

Under Assumptions A1 and A2, the delay faced by a wafer in each channel is well structured and a recursion exists to determine its magnitude. If a wafer experiences contention for a server (and hence the wafer is delayed), then there is a first module in which that delay occurs and, in all subsequent modules in that channel until the last, the wafer faces the maximum possible delay. We require a few definitions to state the next result. Let $d_j(w) := x_{j+1}(w) - \gamma_j(w) = x_{j+1}(w) - x_j(w) - \tau_j^{c(w)}$, $j < B$, denote the delay wafer $w$ experiences in module $m_j$. There is no delay in the bottleneck nor in any subsequent module. Let

$$Y(w) := \sum_{j=1}^{B-1} d_j(w), \quad \text{and} \quad S(w) := \sum_{j=1}^{B-1} \left( \tau_B^{c(w-B+j)}(w-B+j) - \tau_j^{c(w)} \right)$$

denote the total in-line delay wafer $w$ faces and the maximum possible in-line delay wafer $w$ could face, respectively. Here, we allow $\tau_B^{c(w)}(w) = \tau_B^{c(w)} + s_B(w)$, $s_B(w) \ge 0$, to be a function of the wafer.

Theorem 1: Delay evolution in a multiclass channel. Consider an initially empty multiclass flow line with $M = B$ modules consisting of a single channel. That is, $\tau_B^{c(w)} > \tau_1^{c(w)} > \tau_j^{c(w)}$, $j = 2, \ldots, B-1$. The process times $\tau_B^{c(w)}(w) = \tau_B^{c(w)} + s_B(w)$, $s_B(w) \ge 0$, are wafer dependent. Under Assumptions A1 and A2,

$$Y(w) = \min\left\{ S(w),\ \left\{ Y(w-1) + \tau_B^{c(w-1)}(w-1) - \max\left\{ \tau_1^{c(w-1)},\ a_w - x_1(w-1) \right\} + \sum_{j=1}^{B-1} \left( \tau_j^{c(w-1)} - \tau_j^{c(w)} \right) \right\}^+ \right\},$$

$$x_1(w) = \max\left\{ a_w,\ x_1(w-1) + \tau_1^{c(w-1)} + d_1(w-1) \right\},$$

where $Y(0) = 0$, $a_0 = -\infty$, $d_0(0) = 0$, $d_1(0) = 0$. Further, the delays in each module are given as

$$d_j(w) = \min\left\{ \tau_B^{c(w-B+j)}(w-B+j) - \tau_j^{c(w)},\ \left\{ Y(w) - \sum_{n=j+1}^{B-1} \left( \tau_B^{c(w-B+n)}(w-B+n) - \tau_n^{c(w)} \right) \right\}^+ \right\},$$

for $j = 1, \ldots, B-1$ (there is no delay in the bottleneck, though the process time may change). The delay to enter service is $d_0(w) := x_1(w) - a_w$. Here $\{\cdot\}^+ := \max\{0, \cdot\}$.

Theorem 1 gives us an alternative to calculating the advancement of each wafer through every module. The delay that each wafer faces within the flow line may be deduced by concatenating its constituent channels in the natural manner. We thereby obtain the following result, stated after a few definitions. Let

$$Y^\alpha(w) := \sum_{j=\beta(\alpha)}^{\beta(\alpha+1)-1} d_j(w), \quad \text{and} \quad S^\alpha(w) := \sum_{j=\beta(\alpha)}^{\beta(\alpha+1)-1} \left( \tau_{\beta(\alpha+1)}^{c(w-\beta(\alpha+1)+j)} + d_{\beta(\alpha+1)}(w-\beta(\alpha+1)+j) - \tau_j^{c(w)} \right)$$

denote the experienced and maximum possible delay wafer $w$ faces in channel $\alpha$ (not including delay in the last module of a channel).

Theorem 2: Delay evolution in a multiclass flow line. For each successive bottleneck $\alpha$, $1 \le \alpha < \sigma$,
$$Y^\alpha(w) = \min\left\{ S^\alpha(w),\ \left\{ Y^\alpha(w-1) + \tau_{\beta(\alpha+1)}^{c(w-1)} + d_{\beta(\alpha+1)}(w-1) - \max\left\{ \tau_{\beta(\alpha)}^{c(w-1)},\ a_w^\alpha - x_{\beta(\alpha)}(w-1) \right\} + \sum_{j=\beta(\alpha)}^{\beta(\alpha+1)-1} \left( \tau_j^{c(w-1)} - \tau_j^{c(w)} \right) \right\}^+ \right\}.$$

The start times at channel $\alpha$ are given as

$$x_1(w) = \max\left\{ a_w,\ x_1(w-1) + \tau_1^{c(w-1)} + d_1(w-1) \right\},$$

$$x_{\beta(\alpha)}(w) = a_w + d_0(w) + \sum_{j=1}^{\beta(\alpha)-1} \tau_j^{c(w)} + \sum_{k=1}^{\alpha-1} Y^k(w),$$

for $2 \le \alpha \le \sigma - 1$. Further, the delays in each module are given as

$$d_j(w) = \min\left\{ \tau_{\beta(\alpha+1)}^{c(w-\beta(\alpha+1)+j)} + d_{\beta(\alpha+1)}(w-\beta(\alpha+1)+j) - \tau_j^{c(w)},\ \left\{ Y^\alpha(w) - \sum_{n=j+1}^{\beta(\alpha+1)-1} \left( \tau_{\beta(\alpha+1)}^{c(w-\beta(\alpha+1)+n)} + d_{\beta(\alpha+1)}(w-\beta(\alpha+1)+n) - \tau_n^{c(w)} \right) \right\}^+ \right\},$$

for $j = \beta(\alpha), \ldots, \beta(\alpha+1)-1$. Also, $d_0(w) := x_1(w) - a_w$, and $a_w^1 := a_w$, $a_w^\alpha := x_{\beta(\alpha)}(w)$.

Here $\{\cdot\}^+ := \max\{0, \cdot\}$. The initial conditions are $Y^\alpha(0) = 0$, $a_0 = -\infty$, $d_0(0) = 0$, $d_1(0) = 0$ and $x_1(0) = -\infty$. Also, set $\tau_j^{c(w)} = \tau_j^{c(1)}$, for all $j$ and $w \le 0$. In brief, the theorem states that each channel in a multiclass flow line (under A1 and A2) behaves in a manner similar to the channel in isolation (with care taken to account for behavior at the intersection of the channels). Computationally, the approach of Theorem 2 is roughly equivalent to that of simulating the advancement of each wafer and module in the system, as we shall see later. The following simplification is immediate.

Lemma 2: Delay in successive bottlenecks. For $\alpha = 1, \ldots, \sigma$, [the displayed expression for $d_{\beta(\alpha)}(w)$ is not recoverable from the source text].

For brevity, we refer the reader to [4, 5] for examples that demonstrate the idea of Theorems 1 and 2 (though here we have multiclass terms, the essence is similar).
IV. EXTENSIONS FOR SETUPS AND LOTS
One value of Theorem 2, in addition to clarifying the manner in which wafers advance internal to the flow line, is to allow us to model state dependent setups. Also, computational benefits accrue by the consideration of wafer lots and restricting attention to a single channel.
A. State Dependent Setups
In many clustered photolithography and multicluster tools, a change in wafer class requires a setup prior to the start of new wafers. The commencement of such a setup may be deferred until all wafers of the previous class have vacated an initial portion of the tool. This behavior may be modeled as in [5], even for the case of a multiclass flow line.
B. Wafer Lots in a Single Channel System
We now turn our attention to batch arrivals, that is, wafer lots. Due to the wafer dependent process times, we cannot garner as much reduction in computational complexity as is possible for wafer independent process times (see [4], [5]). However, some simplifications are possible. We also restrict attention to a single channel flow line. While this restriction does limit the system structure, implementation studies have shown that single channel models developed for more complicated practical systems can perform quite well.

Use $g$ to denote the index of wafer lots and denote by $\omega_{g,w}$ the wafer index of the $w$th wafer of lot $g$. That is, for wafer lot 3, if lots 1 and 2 each consisted of 25 wafers, $\omega_{3,7} = 57$ (the 57th wafer to arrive to the system). Also, let $W(g)$ denote the number of wafers in wafer lot $g$. Abusing notation slightly, we also let $c(g)$ denote the class of the wafers in lot $g$. Thus, $c(\omega_{g,w})$ and $c(g)$ are identical; we allow $c(\cdot)$ to take both a wafer index and a lot index as arguments (it will be clear from context whether we are to interpret the argument as a wafer index or a lot index). The next corollary immediately follows since the arrival times of all wafers in a lot are identical.

Corollary 1: Delay for wafer lots in a single channel system. Consider a single channel multiclass flow line under A1 and A2 that is initially empty. Suppose that all wafers in a lot are of the same class, except possibly the first, which is allowed to have a greater service time at the bottleneck. That is, $\tau_j^{c(\omega_{g,w})} = \tau_j^{c(g)}$, $j \ne B$. At the bottleneck, we have $\tau_B^{c(\omega_{g,w})} = \tau_B^{c(g)}$, for $w \ne 1$, and $\tau_B^{c(\omega_{g,1})} = \tau_B^{c(g)} + s_B(\omega_{g,1})$, for $s_B(\omega_{g,1}) \ge 0$. For wafers $w = 2, \ldots, W(g)-1$ of lot $g$, the aggregate delay in the final channel obeys [the displayed recursion is not recoverable from the source text], where all wafers before the first lot are considered to be the same class as lot 1. The recursion holds for $w = 1$ by using $\tau_B^{c(g)} + s_B(\omega_{g,1})$ instead of $\tau_B^{c(g)}$. The following simplification is immediate.

Lemma 3: Entry delay. For wafers $2, \ldots, W(g)$ in wafer lot $g$,

$$d_0(w) = \tau_1^{c(w-1)} + d_0(w-1) + d_1(w-1).$$
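To make the single channel recursion of Theorem 1 concrete, the following Python sketch computes $x_1(w)$, $Y(w)$ and the module delays $d_j(w)$ wafer by wafer, without stepping through every module. It is a minimal illustration under stated assumptions and not the author's implementation: the function name `theorem1_delays` and its argument layout are hypothetical, indices are 0-based, and wafers before the first are assumed to take the class of the first wafer with no setup.

```python
import math

def theorem1_delays(times, classes, arrivals, setups):
    """Single channel delay recursion in the form of Theorem 1.

    times[k][j] : service time of class k in module j (last module is the bottleneck)
    classes[w]  : class of wafer w;  arrivals[w] : arrival time a_w
    setups[w]   : extra bottleneck time s_B(w) >= 0
    Returns one (x1, Y, d) tuple per wafer, d listing the delays before the bottleneck.
    """
    B = len(times[0])

    def tau_b(v):
        # Bottleneck time of wafer v. Assumption: wafers before the first
        # take the class of the first wafer and carry no setup.
        if v < 0:
            return times[classes[0]][B - 1]
        return times[classes[v]][B - 1] + setups[v]

    def d_max(w, j):
        # Largest delay of wafer w in module j: it waits for the wafer
        # that is B-1-j positions ahead to finish at the bottleneck.
        return tau_b(w - (B - 1 - j)) - times[classes[w]][j]

    results = []
    y_prev, x1_prev, d1_prev, c_prev = 0.0, -math.inf, 0.0, classes[0]
    for w, (c, a) in enumerate(zip(classes, arrivals)):
        s = sum(d_max(w, j) for j in range(B - 1))                  # S(w)
        shift = sum(times[c_prev][j] - times[c][j] for j in range(B - 1))
        gap = max(times[c_prev][0], a - x1_prev)
        y = min(s, max(0.0, y_prev + tau_b(w - 1) - gap + shift))   # Y(w)
        x1 = max(a, x1_prev + times[c_prev][0] + d1_prev)           # x_1(w)
        d, tail = [0.0] * (B - 1), 0.0
        for j in range(B - 2, -1, -1):
            # Delay fills the modules nearest the bottleneck first.
            d[j] = min(d_max(w, j), max(0.0, y - tail))
            tail += d_max(w, j)
        results.append((x1, y, d))
        y_prev, x1_prev, d1_prev, c_prev = y, x1, d[0], c
    return results

# Two classes and B = 3 modules; illustrative values satisfying A1 and A2 with eta = 0.5.
times = [[4.0, 2.0, 10.0], [2.0, 1.0, 5.0]]
out = theorem1_delays(times, classes=[0, 1, 0, 1], arrivals=[0.0] * 4, setups=[0.0] * 4)
for x1, y, d in out:
    print(x1, y, d)
```

For this four-wafer instance the aggregate delays are 0, 9, 9 and 12, which agrees with a hand computation of the module-by-module advancement. Since all four wafers arrive together, the entry delays $d_0(w) = x_1(w) - a_w$ are simply the returned start times, consistent with Lemma 3.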
Employing the simplifications for wafer lots in a single channel system, we can improve the complexity of the calculations. In so doing, the approach of this section yields recursions that are computationally more tractable than a complete module-by-module wafer evolution.
V. PRACTICAL CONCERNS
It remains to answer whether the theory suggested above leads to expressive models that both provide meaningful estimates of practical system performance and reduce complexity (beyond that of the basic flow line computations). First note that the theory proposed here has been employed to model clustered photolithography tools in production. The tools were running a variety of classes of wafer lot (different process times dependent upon the wafer lot). In addition, there were setups conducted on the lots (as well as production disturbances). The average time between lot exits from the tool (this number may be used to calculate throughput) and lot cycle time (total time in the tool) values as predicted by a single channel model were within 1% and 4% of the actual values, respectively. The single channel flow line model thus gave very high quality results and is a promising candidate for use in simulation studies of such tools.

We next discuss the computational complexity of the proposed approach. Let FS be shorthand for “full simulation”, that is, the computation required to conduct a full module-by-module simulation for each wafer (as would be required without the discussion presented here). Let TH2 denote the computation required to conduct the operations of Theorem 2. Finally, let C1 denote the computation required to conduct the operations of Theorem 1 incorporating the simplifications of Corollary 1 for wafer lots. Below, we consider additions and subtractions to be Add operations and maximum and minimum to be Max operations.

Theorem 4: Computational Complexity. The computations required to conduct the simulation of G lots, each consisting of W wafers, are given in Tables 1 and 2. The initialization computations and recursion computations are shown in Tables 1 and 2, respectively. Recall that there are K classes of wafers.
Method   # of Add                                 # of Mult
FS       0                                        0
TH2      K^2(2B-σ-1) + K(σ-B-1) + 2(B+σ) - 4      1
C1       K(K-1)(B-1)                              0
Table 1. Computations for initialization.

Method   # of Add           # of Max
FS       GWB-1              GWB-B
TH2      (15σ-12)(WG-1)     (5σ-2)(WG-1)
C1       27G+9G(W-2)        8G+2G(W-2)
Table 2. Computations for recursions.
We conclude this section with an example.

Example 2: Computational complexity. Consider a flow line with K = 20 classes of lots, W = 12 wafers per lot (for a small lot size, high mix scenario), B = 40 (including numerous buffer modules before the bottleneck), σ = 4 successive bottlenecks and β(σ-1) = 12 (since there are a large number of modules with zero process times, namely the buffer modules). We are interested in simulating the system for G lots. The computations required for each of the approaches are given in Tables 3 and 4. As can be seen from the tables, at the expense of some initial calculation (mostly due to the number of classes), the approach of C1 requires on the order of 4 times fewer additions and 17 times fewer maximizations than the FS approach, about one order of magnitude improvement. This is a significant reduction, even with multiple classes of lots (though it is not as great as is possible with a single class of lots). □

Method   # of Add   # of Mult
FS       0          0
TH2      29344      1
C1       10898      0
Table 3. Initialization computations for Example 2.

Method   # of Add    # of Max
FS       480G-1      480G-40
TH2      576G-48     216G-18
C1       117G        28G
Table 4. Recursion computations for Example 2.
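The recursion counts of Table 4 follow from the expressions of Table 2 by substituting the parameters of Example 2. The short Python check below is only an arithmetic illustration; the function names are hypothetical, and only the TH2 initialization expression of Table 1 is evaluated.

```python
def recursion_ops(G, W, B, sigma):
    """(additions, max/min operations) per method, using the expressions of Table 2."""
    return {
        "FS": (G * W * B - 1, G * W * B - B),
        "TH2": ((15 * sigma - 12) * (W * G - 1), (5 * sigma - 2) * (W * G - 1)),
        "C1": (27 * G + 9 * G * (W - 2), 8 * G + 2 * G * (W - 2)),
    }

def th2_init_adds(K, B, sigma):
    """Initialization additions for TH2, using the expression of Table 1."""
    return K * K * (2 * B - sigma - 1) + K * (sigma - B - 1) + 2 * (B + sigma) - 4

# Parameters of Example 2 with G = 100 lots (G is chosen here for illustration).
ops = recursion_ops(G=100, W=12, B=40, sigma=4)
add_ratio = ops["FS"][0] / ops["C1"][0]   # FS additions relative to C1
max_ratio = ops["FS"][1] / ops["C1"][1]   # FS max operations relative to C1
print(ops, th2_init_adds(20, 40, 4), round(add_ratio, 2), round(max_ratio, 2))
```

With W = 12, B = 40 and σ = 4 the per-lot coefficients reduce to 480, 576 and 117 additions and 480, 216 and 28 max operations, matching Table 4, and the FS to C1 ratios come out near 4 and 17 as stated in Example 2.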
VI. CONCLUDING REMARKS
With the intent of applying our results to the simulation of cluster tools in semiconductor wafer fabricators, we studied a class of flow lines with deterministic, yet wafer dependent, service times. Wafers arrive to the flow line as an arbitrary process and we demonstrated that, under certain assumptions on the service times within and between classes, recursions exist for the delay faced by wafers in the system. The recursions employed an exact decomposition of the flow line into channels. Within each channel the wafer delays could be expressed by a single aggregate delay. By concatenating the channels in a natural way and accounting for their interactions, recursions for the whole flow line were obtained.

The results extend those of previous work on flow lines to the case of wafer dependent service times and provide much better models for use in practice. In particular, by restricting attention to a single channel system it was demonstrated that substantially less computation is required than in a module-by-module wafer advancement simulation. Further, the models have been tested on a clustered photolithography tool in production and provided mean time between lot exits from the tool (this is the time used for throughput calculations) and mean cycle time (lot time in the tool) values within 1% and 4% of the actual measured values, respectively. The models are thus good candidates for use in the modeling of cluster tools and, since they have reasonable computational requirements, may serve in simulations of entire factories.

Though the models have demonstrated high fidelity in practical tests, a detailed study of the implications of using a single channel model under Assumptions A1 and A2 is warranted. Similarly, in what parameter regimes is it possible to ignore the wafer transport robots as is done here? How can one incorporate the possibility that the setup at the bottleneck can start before the next lot arrives?
REFERENCES
[1] D. Pillai, “The future of semiconductor manufacturing: Factory integration breakthrough opportunities,” IEEE Robotics & Automation Magazine, Vol. 13, No. 4, pp. 16-24, Dec. 2006.
[2] B. Avi-Itzhak, “A sequence of service stations with arbitrary input and regular service times,” Management Science, Vol. 11, No. 5, pp. 565-571, 1965.
[3] H. D. Friedman, “Reduction methods for tandem queuing systems,” Operations Research, Vol. 13, No. 1, pp. 121-131, 1965.
[4] J. R. Morrison, “Flow lines with regular service times: Evolution of delay, state dependent failures and semiconductor wafer fabrication,” Proceedings of the 4th IEEE Conference on Automation Science and Engineering, pp. 247-252, Aug. 2008.
[5] J. R. Morrison, “Deterministic flow lines with applications,” IEEE Transactions on Automation Science and Engineering, submitted for publication, Dec. 2008.
[6] F. Baccelli, G. Cohen, G. J. Olsder and J. P. Quadrat, Synchronization and Linearity, New York, NY: Wiley, 1992.
[7] Y. Dallery and S. B. Gershwin, “Manufacturing flow line systems: A review of models and analytical results,” Queueing Systems, Vol. 12, pp. 3-94, 1992.
[8] B. Avi-Itzhak and H. Levy, “Buffer requirements and server ordering in a tandem queue with correlated service times,” Mathematics of Operations Research, Vol. 26, No. 2, pp. 358-374, May 2001.
[9] S. Ding, J. Yi and M. T. Zhang, “Multicluster tools scheduling: An integrated event graph and network model approach,” IEEE Transactions on Semiconductor Manufacturing, Vol. 19, No. 3, pp. 339-351, 2006.
[10] W. K. Chan, J. Yi, S. Ding and D. Song, “Optimal scheduling of k-unit production of cluster tools with single-blade robots,” Proceedings of the 4th IEEE Conference on Automation Science and Engineering, pp. 335-340, Aug. 2008.