International Journal of Industrial Engineering, 22(5), 661-682, 2015

DISCRETE-EVENT SIMULATION FOR SEMICONDUCTOR WAFER FABRICATION FACILITIES: A TUTORIAL

John W. Fowler1, Lars Mönch2,*, Thomas Ponsignon3

1 Department of Supply Chain Management, Arizona State University, Tempe, AZ 85287-4706, USA

2 Chair of Enterprise-wide Software Systems, Department of Mathematics and Computer Science, University of Hagen, 58097 Hagen, Germany

3 Infineon Technologies AG, 85579 Neubiberg, Germany

* Corresponding author’s e-mail: [email protected]

Discrete-event simulation is a well-established and rather successful method in some semiconductor companies, while other companies do not use simulation at all. Simulation is used for performance assessment and decision-making. This paper focuses on the methodological and practical issues that have to be addressed to build, use, and maintain simulation models for a semiconductor wafer fabrication facility (wafer fab). We describe and discuss the main steps of a simulation study in this domain. We seek to highlight the main issues and present alternative ways to address them. Common pitfalls in using discrete-event simulation in semiconductor manufacturing and how they can be avoided are also discussed.

Keywords: discrete-event simulation; semiconductor manufacturing; modeling issues; tutorial

(Received on May 28, 2015; Accepted on October 16, 2015)

1. INTRODUCTION

Semiconductor manufacturing deals with producing integrated circuits on silicon wafers. Over the last 55 years it has progressed from scientific research to a mature industry. Today, 200,000 and 250,000 people are employed in the semiconductor industry of Europe and the U.S., respectively. This industry supports more than one million additional indirect jobs in Europe as well as in the U.S. European and U.S. semiconductor companies generated $33 and $146 billion in sales in 2012, respectively. Semiconductors make the global trillion-dollar electronics industry possible (cf. European Semiconductor Industry Association 2015 and Semiconductor Industry Association 2015). According to the Semiconductor Industry Association, $34 billion was invested in research and development in 2013 by the U.S. semiconductor industry. Recent wafer fabs are among the most complex manufacturing systems that exist today (cf. Mönch et al. 2013). The size and complexity of the related supply chains suggest that simple, intuitive, manual techniques are unlikely to perform well.

While there is a considerable analytical culture in the semiconductor industry due to its science and technology roots, in the beginning the industry was mainly driven by device design considerations and yield management. Manufacturing and supply chain management were not viewed as a source of competitive advantage at that time (cf. Chien et al. 2011). Because of the fierce competition, model-based decision making has become more and more important. Among the different methods from Industrial Engineering, Computer Science, and Operations Research, simulation is notably successful. The first scientific paper in this field we are aware of, namely (Dayhoff and Atherton 1987), deals with a simulation model of a wafer fab. Using simulation in semiconductor manufacturing was the object of intensive scientific discussions (cf. Fowler et al. 1998). At the same time, using discrete-event simulation in wafer fabs is not as straightforward as one might think at first glance (cf. Fowler and Rose 2004 and Fischbein and Yellig 2011 for discussion of difficulties in simulation modeling for manufacturing companies). Simulation as a technology has a number of inherent difficulties and limitations that have to be taken into account when considering its use. Therefore, the main goal of the present paper is to present some of our knowledge and experience in designing, developing, and deploying simulation models in a tutorial-type manner. This includes a discussion of lessons learned during the application of discrete-event simulation in wafer fabs.

ISSN 1943-670X

INTERNATIONAL JOURNAL OF INDUSTRIAL ENGINEERING

Fowler et al.

Discrete-Event Simulation for Wafer Fabs: A Tutorial

The rest of the paper is organized as follows. The semiconductor manufacturing process is briefly described in Section 2. For the sake of completeness, the basic principles of discrete-event simulation are summarized in Section 3. The application of simulation on different levels in semiconductor manufacturing is also described in this section. Building blocks of simulation models for wafer fabs are discussed in Section 4. The life cycle of a simulation model for a wafer fab is presented in Section 5. Pitfalls related to using discrete-event simulation are discussed in Section 6. Finally, we provide some concluding remarks in Section 7.

2. PROCESS DESCRIPTION OF SEMICONDUCTOR MANUFACTURING

Following the description in (Mönch et al. 2013), we provide an overview of the semiconductor manufacturing process. Semiconductor manufacturing deals with producing semiconductor chips. A chip is a highly miniaturized, integrated circuit (IC) that consists of thousands of components. A semiconductor manufacturing process starts from raw wafers, thin discs made of silicon or gallium arsenide. Typical diameters of wafers are 200 or 300 millimeters. Up to several thousand identical chips can be produced on a single wafer. The ICs are built up layer-by-layer in a wafer fab. Depending on the complexity of the device being produced, up to 40 layers are possible for advanced technologies. Next, the completed wafers are transferred to a sort or probe area where electrical tests identify the individual ICs that are not likely to be of appropriate quality when packaged. After the sort stage, the probed wafers are sent to an assembly facility. Here, the wafers are diced into pieces, where a single piece contains a copy of the IC. Each of these pieces is called a die. In this facility, dies are put into a package. The packaged dies are finally tested in a test facility to make sure that only high-quality products will be sent to customers.
A semiconductor enterprise consists of several wafer fabs and backend factories. A single wafer fab is composed of several work areas. These areas are used for wafer processing and sort in a clean room environment. A single work area typically contains dozens of different work centers. The work centers of a work area are closely related logically or due to their location. A single work center is a set of machines that provide similar processing capabilities. Work centers are also called machine groups or tool groups. Machines are called tools in semiconductor manufacturing. A single machine can have a buffer to store lots. These buffers typically have a finite capacity in 300-mm wafer fabs. Lots are the moving entities in wafer fabs. Each lot contains a fixed number of wafers. Lots are obtained from customer orders. On the one hand, very often several lots will be created to fulfill a customer order. On the other hand, because of the increased line width and more area per wafer in 300-mm wafer fabs, often fewer wafers are needed to fulfill a single customer order. It is not desirable to launch a large number of lots with a small number of wafers because then the material handling system will become overloaded. As a result, there is a need and incentive in 300-mm wafer fabs to group small-size orders from different customers into one lot. Some tools are able to process only a single wafer at a time, while other types of tools process lots in an overlapping manner. Besides single wafer tools and tools with overlapping processing, batch processing tools are typical in semiconductor manufacturing. A batch is defined as a set of lots that are processed at the same time on a single tool (cf. Mönch et al. 2011a). An interesting type of machines in wafer fabs is known as a cluster tool. Wafers with different types of process steps can circulate in a single cluster tool simultaneously, i.e., a cluster tool can be seen as a fully automated machine environment (cf. Lee 2008). 
A cluster tool is a single piece of equipment that combines several process steps as well as transportation and metrology. The underlying reasons for this integration are:

• reduced floor space required
• fewer human interventions
• shorter processing times
• increased yields.

Next we discuss the most important wafer fab operations. Note that we will not describe sort and backend operations in detail due to space limitations (cf. Mönch et al. 2013 for more details and related literature). The following process steps are performed in wafer fabs:

1. Oxidation/Diffusion: A layer of material is grown or deposited on the surface of a cleaned wafer. The goal of an oxidation step is to grow a dioxide layer on a single wafer, while diffusion aims at dispersing material on the wafer surface. Diffusion furnaces and rapid thermal equipment are used in the oxidation/diffusion work area. The furnaces are batch processing tools.

2. Film deposition: Films are deposited onto wafers when the deposition step is carried out. Dielectric or metal layers are deposited. A dozen or even more such deposition layers can be found in advanced integrated circuits. Deposition can be executed by processes such as physical vapor deposition (PVD) or chemical vapor deposition (CVD).

3. Photolithography: The main steps of the photolithography process are coating, exposure, developing, and process control. A wafer is coated with a thin film of a photosensitive polymer. The IC pattern is transferred via a reticle onto the photosensitive polymer, i.e., a photoresist strip. Three-dimensional patterns are the result on the surface of the wafer. Steppers are used to expose the wafers. This means that the pattern is transferred onto the wafer by projecting ultraviolet light through a reticle. The exposed wafer is then developed by removing polymerized sections of photoresist from the wafer surface.

4. Etch: A wafer is partially covered by a photoresist strip after photolithography. The etching step removes material from the areas of the wafer surface that are not covered.

5. Ion implantation: Doping material is deposited where parts of the wafer have been etched.

6. Planarization: The wafer surface is cleaned and leveled by the planarization step.

7. Cleaning/Inspection/Measurement: A cleaning step is performed before the wafers enter the oxidation/deposition/diffusion work area. Inspection and measurement steps, so-called metrology steps, are necessary to control the processes within and between work areas. Special inspection tools can be found in all work areas.

The main steps of the wafer fabrication process are summarized in Figure 1.

Figure 1. Wafer Fab Operations (adopted from Mönch et al. 2013)

Lots, wafers, or dies might be processed in such a way that they become damaged. Rework is sometimes a possible option to repair a damaged item. If rework is not possible, we obtain scrapped material. Yield is defined as the percentage of dies that meet their electrical specification.

Wafer fabs are different from conventional job shops with respect to certain facets. We start by discussing the main unusual features. In many wafer fabs, there are dozens of different process flows. Products that follow the same process flow are said to belong to the same technology. We differentiate between low- and high-mix wafer fabs. A single process flow typically contains between 300 and 800 process steps that are carried out on more than a hundred different tool types. A single tool might be very expensive. Therefore, the same tool is shared by several lots that ask for a certain processing capability. Note that the lots might be at different stages of the production cycle. This behavior leads to reentrant process flows, i.e., wafers at different stages in their production cycle have to compete with each other for the same scarce resources. The typical
reentrant flow is shown in a simplified form in Figure 1. The processing times of the individual process steps are rather different. Some steps require only a few minutes, while the duration of other steps is half a day. The longer processing times often belong to processes on batch tools. Often one-third of all operations in a wafer fab are related to batch processes. This leads to a nonlinear flow of products in a wafer fab. Because of the complicated tools in wafer fabs, long machine failures occur in a probabilistic manner. Preventive maintenance is commonly used to reduce the number and duration of machine failures. There is also a need to start prototype/engineering lots since the technological processes are difficult (cf. Crist and Uzsoy 2011). Very often, specific lots are more important than others. The important lots, also called hot lots, will be expedited to meet their due dates. Significant sequence-dependent setup times occur in some work areas. These setup times are caused for instance by changing temperature, gas pressure, and metal composition. Some operations require secondary resources, for instance, reticles, for their processing. Therefore, the tool as primary resource and the secondary resource have to be available at the same time. Operators are necessary to run wafer fabs. Even in highly automated wafer fabs, operators may load and unload wafers to tools, run the tools, and perform certain control and inspection steps. Furthermore, engineering staff may intervene to perform tool maintenance and requalification activities in order to prepare the equipment for production. Time constraints between consecutive process steps are another important restriction that can be found in many wafer fabs. The process engineering department installs time windows to prevent native oxidation and contamination effects on the wafer surface by respecting the time constraints. Note that the time constraints are often nested (cf. Klemmt and Mönch 2012). 
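The notion of a reentrant process flow can be made concrete with a small sketch. The route below is a hypothetical, heavily shortened process flow; the step names and tool group names are illustrative assumptions and are not taken from a real wafer fab. The sketch counts how often a lot visits each tool group and reports the tool groups that are visited more than once.

```python
from collections import Counter

# Hypothetical, heavily shortened process flow: (step name, tool group).
# Real flows contain several hundred steps; all names are illustrative only.
ROUTE = [
    ("clean_1", "wet_bench"),
    ("oxidation_1", "furnace"),
    ("litho_1", "stepper"),
    ("etch_1", "etcher"),
    ("implant_1", "implanter"),
    ("clean_2", "wet_bench"),
    ("deposition_1", "cvd"),
    ("litho_2", "stepper"),
    ("etch_2", "etcher"),
    ("planarization_1", "cmp"),
    ("litho_3", "stepper"),
]


def visits_per_tool_group(route):
    """Count how many steps of the route are processed on each tool group."""
    return Counter(tool_group for _, tool_group in route)


def reentrant_tool_groups(route):
    """Return the tool groups that a lot visits more than once."""
    counts = visits_per_tool_group(route)
    return sorted(group for group, n in counts.items() if n > 1)


if __name__ == "__main__":
    print(visits_per_tool_group(ROUTE))
    print(reentrant_tool_groups(ROUTE))
```

In this toy route, lots at their first, second, and third photolithography layer all queue in front of the same stepper group, which is the competition for scarce resources described above.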
Cluster tools cause specific restrictions. They can process two or three lots in parallel because a cluster tool generally has multiple load ports. The lots processed in parallel might have to undergo different sequences of process steps, i.e., their recipes are different. Resource conflicts are the result of such parallel processing since the wafers that are processed in parallel share the handling robots, the load locks, and the process chambers of the cluster tool. These conflicts lead to an increased processing time compared to a stand-alone processing time.

While in most 200-mm wafer fabs material handling is generally carried out manually by operators who use carts or personal guided vehicles, wafers and reticles are transported in a fully automated manner in modern 300-mm wafer fabs using an Automated Material Handling System (AMHS). Front opening unified pods (FOUPs) are used as carriers. Automated material handling is always a critical operation in wafer fabs (cf. Agrawal and Heragu 2006). However, due to space limitations, we will exclude a detailed discussion of AMHS operations in this introductory tutorial and will refer to related literature (cf., for instance, Foster and Pillai 2007).

3. DISCRETE-EVENT SIMULATION

In this section, we start by briefly discussing principles of discrete-event simulation in Subsection 3.1. We then survey applications of simulation on different levels of the production planning and control hierarchy in semiconductor manufacturing in Subsection 3.2.

3.1 Basic Principles of Discrete-event Simulation

In this subsection, we introduce the basic notions and mechanisms of discrete-event simulation following mainly Banks (1998), Law (2015), Banks et al. (2010), and Schriber et al. (2014). Simulation is a technique that is used to mimic the behavior of a real-world system. We start by introducing the notion of systems. A system is given by a set of components.
These components and their associations provide the structure of the system, while the interactions of the components determine the behavior of the system. In many situations, only certain aspects of a system are of interest. Therefore, we work with models, i.e., appropriate representations of the original system. Creating a model is driven by goals that have to be achieved. The level of modeling detail is influenced by these goals. Dynamic simulation is characterized by the fact that the model of the system of interest evolves in a time-dependent manner, in contrast to static simulation (also called Monte Carlo simulation) that considers a particular point of time.

The system is represented by a set of entities. We differentiate between dynamic entities, which move through the system, and static entities or resources, which provide services to other entities. Resources are usually limited in capacity so that entities must compete to be served (e.g., the static entity machine can process one dynamic entity lot at a time). When processing is granted to a dynamic entity, it captures the resource and releases it upon completion. If service is denied to the requesting entity, it joins a queue and waits for its turn, or takes some other action such as being diverted to another resource. The entities are characterized by local data values called attributes (e.g., the entity machine has an attribute machine status with the two possible values busy and idle). The state of the system is given by a collection of state variables whose values are derived from the entities’ attributes (e.g., the state variable capacity utilization is computed from the attribute machine status). The choice of the state variables depends on the purpose of the investigation. The way in which the attributes and state variables are updated distinguishes continuous from discrete simulation. Continuous simulation is the numerical treatment of
differential or difference equations where these equations describe the system changes over infinitesimally small or finite time steps, respectively. In a discrete-event simulation model, the variables remain constant over intervals of time, and they instantaneously change value at given discrete points. These separate points in time are the ones at which events occur. This is referred to as the next-event time-advance approach. Events happen at the beginning and ending of activities and delays of the entities. An activity is a duration of time whose length is known when it begins, while a delay starts with an unspecified length and ends when a given condition is fulfilled (e.g., the entity machine remains in the state idle until a lot comes in to be processed).

After having introduced the main ingredients of discrete-event simulation, we now describe the principle of model execution. A variable named simulation clock gives the current value of simulated time. The clock is advanced to the time of the next event after all possible actions at the current time have been taken. Note that the successive jumps of the simulation clock are generally unequal in size. At the core of the time-advance mechanism lies a series of lists that keep track of the events to occur. Even though the terminology and the implementation may differ among different simulation software packages, the following lists are common to all:

• the future event list (FEL) records the occurrence times in chronological order of all events that have been scheduled to happen at a later time
• the current event list (CEL) contains the events to occur at the current point in time
• the delay list (DL) refers to entities waiting for their respective condition to be fulfilled so that they can be moved to the CEL for further processing. In case the resolution of a delay cannot be related to a single event, a so-called pooled waiting is used where a routine regularly checks whether a combination of several conditions is met.
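The three lists can be sketched as a small data structure. The following is a minimal illustration under stated assumptions, not the implementation of any particular simulation package: the class name EventLists and its methods are hypothetical, the FEL is kept as a binary heap, and delayed entities are stored together with a callable that represents their condition.

```python
import heapq
import itertools


class EventLists:
    """Hypothetical container for the FEL, CEL, and DL of a simulation run."""

    def __init__(self):
        self.fel = []  # heap of (time, sequence number, event)
        self.cel = []  # events to occur at the current clock value
        self.dl = []   # (condition, entity) pairs waiting for a condition
        self._seq = itertools.count()  # insertion counter for FIFO tie-breaking

    def schedule(self, time, event):
        """Record an event that shall occur at a later time."""
        heapq.heappush(self.fel, (time, next(self._seq), event))

    def advance(self):
        """Move all events with the most imminent time to the CEL.

        Returns the new clock value, or None if the FEL is empty.
        """
        if not self.fel:
            return None
        now = self.fel[0][0]
        while self.fel and self.fel[0][0] == now:
            _, _, event = heapq.heappop(self.fel)
            self.cel.append(event)
        return now

    def delay(self, condition, entity):
        """Put an entity on the DL until condition() becomes true."""
        self.dl.append((condition, entity))

    def release_delayed(self):
        """Move entities whose condition now holds from the DL to the CEL."""
        still_waiting = []
        for condition, entity in self.dl:
            if condition():
                self.cel.append(entity)
            else:
                still_waiting.append((condition, entity))
        self.dl = still_waiting
```

The sequence number stored next to each occurrence time makes events with an identical time leave the FEL in the order in which they were scheduled, which is one possible tie resolution strategy.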

We now outline the main steps to carry out a run of a given simulation model. First, the clock is set to zero, the FEL, the CEL, and the DL are initialized, and the state variables take their default values. The event i at the top of the FEL is the most imminent one. The simulation clock is advanced to the time ti at which event i shall occur. The event i is removed from the FEL, added to the CEL, and executed. If more than one event is to be moved to the CEL (i.e., events with an identical occurrence time), a tie resolution strategy such as the First-In-First-Out (FIFO) rule is applied. The execution of an event (here, event i) includes the following steps:

After all events to occur at ti have been processed, a check is performed whether the simulation should now terminate. Typical stopping criteria are:

• the FEL is empty
• a predefined point in time is reached
• any model-specific condition is met.
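The run procedure and the first two stopping criteria can be illustrated with a minimal sketch of the next-event time-advance approach. The model is an assumption made for illustration only: a single machine with a constant processing time serves lots in FIFO order. The function name simulate and its parameters are hypothetical. For brevity, the sketch processes simultaneous events one at a time in scheduling order instead of maintaining an explicit CEL, and the waiting queue of the machine plays the role of the DL.

```python
import heapq
import itertools
from collections import deque


def simulate(arrivals, processing_time, horizon=float("inf")):
    """Run a single-machine model with the next-event time-advance approach.

    arrivals: list of (arrival time, lot name) pairs.
    processing_time: constant duration of the activity "processing".
    horizon: predefined point in time after which no event is executed.
    Returns (final clock value, {lot name: completion time}, busy time).
    """
    seq = itertools.count()  # FIFO tie-breaker for simultaneous events
    fel = [(t, next(seq), "arrival", lot) for t, lot in arrivals]
    heapq.heapify(fel)
    queue = deque()          # lots delayed until the machine becomes idle
    busy = False             # attribute "machine status" (busy or idle)
    clock, busy_time, completions = 0.0, 0.0, {}

    def start(lot):
        """Capture the machine and schedule the end of the activity."""
        nonlocal busy
        busy = True
        heapq.heappush(fel, (clock + processing_time, next(seq), "departure", lot))

    while fel:                    # stopping criterion: the FEL is empty
        if fel[0][0] > horizon:   # stopping criterion: time limit reached
            break
        clock, _, kind, lot = heapq.heappop(fel)  # advance the clock
        if kind == "arrival":
            if busy:
                queue.append(lot)  # service denied: join the queue
            else:
                start(lot)
        else:  # departure: release the machine
            completions[lot] = clock
            busy_time += processing_time
            busy = False
            if queue:
                start(queue.popleft())
    return clock, completions, busy_time


if __name__ == "__main__":
    lots = [(0.0, "lot_1"), (1.0, "lot_2"), (1.0, "lot_3")]
    clock, completions, busy_time = simulate(lots, processing_time=2.0)
    print(clock, completions, busy_time / clock)
```

The clock jumps from 0 to 1 to 2 and then in steps of 2, which shows that the increments of the simulation clock are generally unequal. The ratio of busy time to the final clock value is a simple example of a state variable derived from the attribute machine status.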

If the simulation ends, usually a report generator is invoked to compute the performance of the system based on the state snapshots. If it is not time for termination, the clock is advanced to the time tj of the new imminent event j with ti