
IEEE TRANSACTIONS ON SEMICONDUCTOR MANUFACTURING, VOL. 5, NO. 4, NOVEMBER 1992

Real-Time Statistical Process Control Using Tool Data

Costas J. Spanos, Member, IEEE, Hai-Fang Guo, Alan Miller, and Joanne Levine-Parrill

Abstract-During the last five years we have witnessed the widespread application of statistical process control in semiconductor manufacturing. As the requirements for process control grow, however, traditional statistical process control applications fall short of their goal. This happens because modern processes are more complex than they used to be. Further, because of the expanding use of the so-called “cluster” tools, modern technologies are also less observable than before. Because of these difficulties, we can no longer afford to wait until a malfunction can be detected on a traditional control chart. Fortunately, modern semiconductor manufacturing tools can communicate to the outside world a number of their internal parameters, such as throttle valve positions, chamber pressures, temperatures, etc. It is intuitively obvious that equipment malfunctions will manifest themselves first in the values of these internal parameters and much later in the wafer properties. In this paper we describe a process monitoring scheme that takes advantage of such real-time information in order to generate malfunction alarms. This is accomplished with the application of time-series filtering and multivariate statistical process control. This scheme is capable of generating alarms on a true real-time basis, while the wafer is still in the processing chamber. Several examples are presented with tool data collected from the SECSII port of single-wafer plasma etchers.

I. INTRODUCTION

AS INTEGRATED CIRCUITS (ICs) become more complex, the semiconductor manufacturing community is focusing its resources on achieving tight process control over the critical process steps. Many tools and techniques are being used toward this end. Statistical Process Control (SPC) is prominent among them, as it can help in the timely detection of costly process shifts.

Historically, SPC has been used with process measurements in order to uncover equipment and process problems. Such problems are manifested by significant degradation in equipment operation and product quality. To discover this degradation, critical process parameters are monitored using various types of control charts. The measurements consist mainly of in-line readings collected from wafers after the completion of the process step in question. Although this method is helpful in detecting process drifts, there is significant delay between the occurrence of a drift and the resulting control chart violation. As production volume increases, faster response to process drifts becomes necessary in order to assure high product quality and low cost. In addition, the proliferation of multi-chamber (cluster) tools makes it even more difficult to collect the necessary in-line measurements. Under these circumstances we must use other types of information for quality control purposes.

Modern semiconductor manufacturing equipment can communicate internal sensor readings over standard RS232 ports using the SECSII protocol. This capability has been recognized as crucial for the diagnosis of equipment failures, and for the improvement of the overall product quality [1]. Unfortunately, in a high-volume production facility the monitoring of multiple sensors results in an overload of information. Further, most of the popular SPC strategies cannot be applied to real-time readings, since these readings usually show non-stationary, auto-correlated and cross-correlated variation. A special type of SPC procedure is therefore needed to automate the processing of tool data.

This paper describes the development and the application of a novel SPC method that uses time-series filters [2] and multivariate statistics [3] to analyze internal machine parameters. These parameters are sampled several times per second, and the readings are filtered using a time-series model. The filtered readings are then combined into a single variable with well-defined statistical properties [4]. This single statistical variable is calculated every few seconds, and is plotted against formally defined control limits. Real-time misprocessing alarms generated in this manner allow a controller to interrupt faulty runs and prevent any adverse effects on the equipment or the product. These alarms can be used for scheduling preventive maintenance. In the future, these alarms might also be used in conjunction with automated diagnosis routines [5].

This method has been applied on a Lam Research Rainbow single-wafer plasma etcher, and on an Applied Materials Precision 5000 cluster tool. The results show that the filtered statistical parameter has successfully responded to several types of process faults, which were introduced in a controlled fashion. The faults included mismatched RF components, different loading factors, gas leaks, and miscalibrated equipment controls. It is noteworthy that none of these faults could have been easily detected by traditional wafer measurements.

Manuscript received January 21, 1992; revised March 26, 1992. C. Spanos is with the Department of EECS, University of California at Berkeley, Berkeley, CA 94720. H.-F. Guo was with the Department of EECS, University of California. She is presently with IBM Corporation, San Jose, CA. A. Miller is with Lam Research, Fremont, CA. J. Levine-Parrill is with IBM Corporation, East Fishkill, NY. IEEE Log Number 9202883.

0894-6507/92$03.00 © 1992 IEEE

SPANOS et al.: REAL-TIME STATISTICAL PROCESS CONTROL USING TOOL DATA

The rest of this paper is structured as follows: Section II presents a brief overview of traditional statistical process control. Section III describes the real-time, multivariate SPC approach, which includes the time-series model and the calculation of Hotelling’s $T^2$ statistic. Experimental results are presented in Section IV along with a brief description of the equipment and the data acquisition tools. Finally, Section V contains a summary and some suggestions for future extensions of this work.

II. TRADITIONAL STATISTICAL PROCESS CONTROL

The concept of statistical control of a production sequence was introduced in 1924 by Walter A. Shewhart of the Bell Telephone Laboratories [6]. Today, SPC is understood as a collection of methods whose objective is to improve the quality of a process by reducing the variability of its critical parameters. A process is said to be in statistical control when, “through the use of past experience, we can predict, at least within limits, how the process may be expected to vary in the future” [7]. When a process is in statistical control, there is only natural variation or “background noise” because of mechanisms known as chance causes. Sometimes, however, a process can change due to assignable causes, such as significant environmental changes, miscalibrations, variability of raw material, or human error. Assignable causes make a process unpredictable and cause it to lose the state of control as defined above. The main purpose of SPC is to detect the presence of an assignable cause so that it can be corrected.

From a statistical point of view, SPC casts the decision-making process as a formal hypothesis test. In this context, the null hypothesis ($H_0$) states that the process under consideration is under statistical control, while the alternative hypothesis ($H_1$) states that the process is out of statistical control. To test these hypotheses, a random sample $x$ is selected from the population of interest, and the suitable test statistic is calculated. Typically, we calculate the average of several readings of $x$, and the resulting statistical score is tested against the limits listed in (1). The range of values that leads to the rejection of a hypothesis is called the critical region or the rejection region. For the Shewhart $\bar{x}$ chart, the upper and lower control limits (UCL and LCL) used to validate $H_0$ are given next:

$$\mathrm{UCL} = \mu + z_{\alpha/2}\,\sigma_{\bar{x}}, \qquad \mathrm{LCL} = \mu - z_{\alpha/2}\,\sigma_{\bar{x}} \tag{1}$$

where $x$ is distributed according to the $N(\mu, \sigma^2)$ normal distribution, $\bar{x}$ is the arithmetic average calculated from $n$ samples of $x$, and $\sigma_{\bar{x}} = \sigma/\sqrt{n}$. Also, $z_{\alpha/2}$ is the standard normal score which excludes the $\alpha/2$ portion off the high tail of the standard normal distribution. According to this equation, the probability of rejecting $H_0$ by mistake, an occurrence known as a type I error, is equal to $\alpha$. Alternatively, accepting $H_0$ by mistake is known as a type II error. The distribution that illustrates the nature of the chart is shown in Fig. 1.
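As a minimal sketch of the test in (1), the following Python fragment computes the two limits and flags a sample average that falls in the rejection region. The function names, the default $\alpha = 0.0027$ (roughly the conventional three-sigma limits), and the numerical values are illustrative assumptions and are not taken from the experiments reported here.

```python
from math import sqrt
from statistics import NormalDist, mean


def xbar_limits(mu, sigma, n, alpha=0.0027):
    """Return (LCL, UCL) of an x-bar chart following equation (1)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)  # z_{alpha/2}
    sigma_xbar = sigma / sqrt(n)             # standard deviation of the average
    return mu - z * sigma_xbar, mu + z * sigma_xbar


def out_of_control(sample, mu, sigma, alpha=0.0027):
    """True when the sample average falls outside the control limits."""
    lcl, ucl = xbar_limits(mu, sigma, len(sample), alpha)
    return not (lcl <= mean(sample) <= ucl)


# Hypothetical in-control process: mu = 100, sigma = 2, groups of n = 4.
lcl, ucl = xbar_limits(mu=100.0, sigma=2.0, n=4)
```

With these assumed values the limits sit close to 97 and 103, so a group averaging about 104 would be flagged while a group averaging about 100 would not.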

Fig. 1. An $\bar{x}$ control chart and its hypothesis-testing nature.

A popular set of rules developed by Western Electric in the 1950s and known as the Western Electric Rules, provides additional ways to generate alarms [8]. At this point, it is important to emphasize that the operation of the X chart is based on the model described by (1). This equation implies that all the “good” data must come from the same population, which must follow a normal distribution around a fixed value. In other words, the data must be Identically, Independently and Normally Distributed. This is known as the IIND assumption and is summarized below:

$$x_t = \mu + a_t, \quad t = 1, 2, \ldots; \qquad a_t \sim N(0, \sigma^2) \tag{2}$$

The IIND assumption is essential for the simple control chart. Without it, the chart and its limits would not truly reflect the process. Unfortunately, real-time data often violate the IIND assumption. In the next section we focus on the statistical nature of such data.

III. REAL-TIME STATISTICAL PROCESS CONTROL

As the volume of production increases, instantaneous detection of process drifts becomes necessary. Most modern equipment has some automated data acquisition capabilities. Unfortunately, traditional statistical process control methods cannot be applied directly to tool data, because most tool-generated data violate the IIND assumption. Indeed, in most cases, real-time data are non-stationary, and in addition they are auto-correlated and cross-correlated, even when they originate from a process that is under control.

To accommodate this situation, a novel SPC scheme is developed and applied to several test processes. This scheme employs time-series models [2] and multivariate statistics [3]. First, time-series models are used to transform real-time sensor data into IIND signals; then a particular multivariate technique, known as Hotelling’s $T^2$ statistic, is used to combine the IIND signals into a single, well-behaved statistical variable. This scheme is capable of generating alarms on a true real-time basis, while the wafer is still in the processing chamber. In this way, we are able to detect misprocessing before it impacts the product. In this section we describe this real-time SPC scheme in some detail.

¹The expression $a_t \sim N(0, \sigma^2)$ means that the random variable $a_t$ is distributed according to a normal distribution with zero mean and variance $\sigma^2$.


A. Time-Series Modeling

Readings collected sequentially are rarely independent. It is this lack of independence, for example, that allows the forecasting of daily temperature lows and highs from recent readings and from historical records. Often, the statistical behavior of a time-varying parameter can be described by a time-series model. The purpose of a time-series model is to capture the dependencies among sequential readings of a variable. Time-series models are often used to forecast the value of a future reading from the values of several past observations [9], [11].

The statistical behavior of data collected from most modern semiconductor manufacturing equipment can be modelled with the help of a time-series model. The fact that process readings are statistically related to past values can be intuitively understood. Consider, for example, that modern equipment uses feedback control on critical parameters, such as temperature or pressure. The sensors in the control loop record the deviation of the parameter from its target value, and, in the next instant, the controller tends to compensate for the observed deviation. Thus, a reading higher than the target value is very likely to be followed by a low value and vice versa, leading to an apparent negative autocorrelation between consecutive readings. Conversely, at high sampling rates the monitored parameters are subject to “inertia,” leading to an apparent positive autocorrelation between consecutive readings. In general, dependencies among readings collected over time can be described by the following equation:

(3) [equation lost in extraction]

where $x$ is the signal and $a$ is the IIND prediction error of the time-series model. In this work, the main goal is to find suitable time-series models to filter real-time data used for statistical process control. The methods used to obtain the models are discussed next. Later we will see how the model can be applied within a practical real-time SPC technique. Next we give a very brief overview of time-series modeling. For in-depth coverage, the reader should consult the extensive literature on the subject [2], [4], [9], [10], [13], [14].
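The two mechanisms described above (a controller that over-compensates, and a parameter with inertia) can be illustrated with a toy simulation. This is only a sketch under stated assumptions: the first-order recursion, the coefficients −0.6 and 0.8, the noise level, and the function names are hypothetical and do not model any particular tool.

```python
import random


def simulate(phi, n=2000, seed=0):
    """Toy deviation-from-target series: d[t] = phi * d[t-1] + noise (assumed model)."""
    rng = random.Random(seed)
    d, out = 0.0, []
    for _ in range(n):
        d = phi * d + rng.gauss(0.0, 1.0)
        out.append(d)
    return out


def lag1_autocorr(x):
    """Sample autocorrelation between consecutive readings."""
    m = sum(x) / len(x)
    num = sum((x[i] - m) * (x[i + 1] - m) for i in range(len(x) - 1))
    den = sum((v - m) ** 2 for v in x)
    return num / den


overcorrecting = lag1_autocorr(simulate(phi=-0.6))  # controller overshoots each step
inertial = lag1_autocorr(simulate(phi=0.8))         # slowly varying parameter
```

In this sketch the over-compensating series shows a negative lag-1 autocorrelation and the inertial series a positive one, which is exactly the kind of dependence that violates the IIND assumption.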

B. Univariate Box-Jenkins Analysis

In this application we use the univariate Box-Jenkins time-series analysis [2]. The assumption behind the univariate analysis is that the time-series behavior of one parameter can be fully explained by using past observations of this parameter. A Box-Jenkins model is also called an ARIMA(p, d, q) model, and it consists of three linear components (or filters) as illustrated in Fig. 2. These components are the auto-regressive part of order p, the integration part of order d, and the moving-average part of order q [2].

Fig. 2. The three components of the ARIMA model

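The general form of the ARIMA(p, d, q) model is given below:

$$\phi(B)\,w_t = \theta(B)\,a_t$$

$$\phi(B) = 1 - \phi_1 B - \phi_2 B^2 - \cdots - \phi_p B^p$$

$$\theta(B) = 1 - \theta_1 B - \theta_2 B^2 - \cdots - \theta_q B^q$$

$$w_t = \nabla^d z_t, \quad \text{where } d \ge 0 \tag{4}$$

Difference operator:

$$\nabla z_t = z_t - z_{t-1}, \qquad \nabla^2 z_t = \nabla(\nabla z_t), \;\ldots$$

Backward shift operator:

$$B z_t = z_{t-1}, \qquad B^2 z_t = z_{t-2}, \;\ldots$$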

where $z_t$ is the original reading collected at time $t$, $w_t$ is the respective differenced signal, and $a_t$ is the IIND residual. Below we explain the function of each of the three components of the ARIMA model.

The first part of the ARIMA model is the integration component. This part is necessary because a condition for fitting the autoregressive and moving-average parts of the model is that the signal must be stationary. This means that the mean, variance and autocorrelation functions of the time series must be time invariant. The integration component of the ARIMA model is used to convert a non-stationary signal to a stationary one. Simple or higher-order differencing can be used to achieve a time-invariant mean.²

²Taking the log or the square root of the data might be necessary in order to produce a constant variance.

The second part of the ARIMA model is the autoregressive (AR) part, which is needed in order to describe the dependency of the current observation on previous observations. This is done through the autoregressive coefficients $\phi_i$.

The third part of the ARIMA model is the moving-average (MA) part, which describes the dependency of the current observation on previous forecasting errors (also known as random shocks), by means of the moving-average coefficients $\theta_i$.

Occasionally, the original data show seasonal periodic patterns. These patterns can be modeled by creating ARIMA models for the seasonal variation as well as for the individual samples. The composite model is known as a Seasonal ARIMA model or SARIMA(p, d, q) × (P, D, Q)$_s$, where p is the number of significant autocorrelations, d is the number of differencing operations, q is the number of significant moving-average terms within each season, and P, D, Q are the autocorrelations, differencing operations and moving-average terms taken across seasons of duration s [14]. The complete SARIMA(p, d, q) × (P, D, Q)$_s$ model is expressed by (5):

$$\phi(B)\,\Phi(B^s)\,w_t = \theta(B)\,\Theta(B^s)\,a_t, \qquad w_t = \nabla_s^D(\nabla^d z_t) \tag{5}$$

[Fig. 3 flowchart: (1) difference the signal and use the acf and pacf to select a candidate ARIMA model; (2) estimate the parameters of the model selected at step 1; (3) check the adequacy of the model.]

A model can be obtained from the collected data when the process is under control; in this way the model describes the “good” process. Once a model has been developed, it can be used to forecast (or predict) each new value. The difference between the forecast value and the actual value is the forecasting error, or residual. The residual is, by definition, an IIND variable:³

$$a_t = z_t - \hat{z}_t \sim N(0, \sigma^2) \tag{6}$$
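As a worked illustration of how (4) and (6) fit together, consider the simplest non-trivial case, an ARIMA(0, 1, 1) model. This particular order is chosen here only as an example; it is not claimed to be a model used for any specific signal. With $p = 0$, $d = 1$ and $q = 1$, equation (4) reduces to

$$w_t = \nabla z_t = z_t - z_{t-1}, \qquad w_t = (1 - \theta_1 B)\,a_t = a_t - \theta_1 a_{t-1},$$

so that the residual can be computed recursively as

$$a_t = z_t - z_{t-1} + \theta_1 a_{t-1}.$$

Comparing this with (6), $a_t = z_t - \hat{z}_t$, the one-step forecast implied by the model is

$$\hat{z}_t = z_{t-1} - \theta_1 a_{t-1}.$$

In other words, the filter predicts each new reading from the previous reading and the previous forecasting error, and passes on only the unpredictable part.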

³The hat ( ˆ ) signifies a value predicted by the model.

C. Creating Box-Jenkins Models

To obtain a useful ARIMA model, Box and Jenkins proposed a three-step procedure [9]. This procedure is illustrated in Fig. 3. Two devices are used to select the ARIMA models: the discrete autocorrelation function (acf) and the discrete partial autocorrelation function (pacf). The acf and the pacf are calculated from the properly differenced signal and are compared with the theoretical acf and pacf patterns from known model structures.

To further explain the acf we need to talk about the autocorrelation coefficient. The autocorrelation coefficient describes the statistical dependence between two readings collected at different times. It takes values in the range from −1 to +1. A zero value will be obtained when the observation of interest is independent from other observations, while a value of 1 indicates complete synergistic dependence. The value of −1 indicates complete antisynergistic dependence. The following equation defines the autocorrelation coefficient between all pairs of n readings that have been collected k observation time intervals apart from each other. The autocorrelation coefficient is calculated from n consecutive observations, by using the (n − k) pairs of observations separated by k observation intervals. Expressed as a function of the integer k, the estimated acf is given by (7):

$$r_k = \frac{\sum_{t=1}^{n-k} (z_t - \bar{z})(z_{t+k} - \bar{z})}{\sum_{t=1}^{n} (z_t - \bar{z})^2}, \qquad k = 1, 2, \ldots \tag{7}$$

The partial autocorrelation function (pacf) also gives a measure of dependence across pairs of readings, only now this dependence is given after the dependence of the intervening readings has been accounted for. This is accomplished by fitting the following regression equation:

$$z_{t+k} = \phi_{k1} z_{t+k-1} + \phi_{k2} z_{t+k-2} + \phi_{k3} z_{t+k-3} + \cdots + \phi_{kk} z_t + u_{t+k} \tag{8}$$

where this equation is fitted on the signal multiple times, with increasing value of k starting from k = 1. The pacf is the series $\phi_{11}, \phi_{22}, \ldots, \phi_{kk}$, which is usually displayed as a discrete function of k.

Both the acf and the pacf are needed in order to infer the structure of the best-fitting ARIMA model. The inference of the best model structure is usually done by trial and error, using the acf and pacf of the original signal and its residuals for guidance [9]. After the structure of the model is inferred and its coefficients extracted, the acf and the pacf of the residuals are used to check the adequacy of the selected model. The process terminates when a satisfactory model is obtained. This interactive sequence is illustrated in Fig. 3. Attempts to automate this procedure have also been reported in the literature [10].

Fig. 3. The 3-step procedure for ARIMA modeling.

D. Hotelling’s $T^2$ Statistic

A piece of equipment will, in general, be monitored through a number of sensor signals. Using the appropriate time-series model, each signal is filtered down to its IIND residual. Assuming that the time-series models have been properly built and that the machine is under control, each of these residuals will be an IIND random number. This means that one could use a simple Shewhart control chart to monitor each residual. However, since the signals originate from the same physical process, their residuals will probably be statistically correlated, and using them in separate control charts can be misleading. In fact, it can be shown that as the number of correlated variables increases, the probability of generating false alarms from a control procedure that uses a large number of separate charts grows significantly [6]. This is because treating correlated signals separately leads to the underestimation of the probability of generating false alarms and the probability of not detecting a malfunction. Further, the information content of multiple, concurrent real-time control charts will undoubtedly overwhelm the human operator.

The function of Hotelling’s $T^2$ statistic is to combine several cross-correlated variables into a single statistical

score. This number is simply the square of the maximum possible univariate Student-t score computed from any linear combination of the various outcome measures [3]. This score is calculated from the p correlated residuals as follows:⁴

$$T^2 = n\,(\bar{\mathbf{a}} - \mathbf{0})^T\,\mathbf{S}^{-1}\,(\bar{\mathbf{a}} - \mathbf{0}) \tag{9}$$

where $\bar{\mathbf{a}}^T = [\bar{a}_1 \cdots \bar{a}_p]$ is the group mean, $\mathbf{0}^T = [0 \cdots 0]$ is the nominal value of the residuals, and $\mathbf{S}$ is the variance-covariance matrix. In order to further ascertain that the entries in this formula are normally distributed, it is customary to use averages calculated over small, consecutive groups of size n for each residual.

⁴In this paper we employ bold-faced symbols to represent non-scalar quantities such as arrays and matrices. All arrays are columns, unless used with the superscript $T$, which symbolizes transposition.

Some discussion is necessary concerning the estimation of the variance-covariance matrix $\mathbf{S}$. First, the diagonal elements in $\mathbf{S}$ are calculated as the average $s^2$ value for each of the m groups of size n:

(10) [equation lost in extraction]

The off-diagonal terms are estimators of the covariances and are calculated as follows:

(11) [equation lost in extraction; stated for k = 1, 2, ..., m; j = 1, 2, ..., p; h = 1, 2, ..., p; j ≠ h]

Finally, the actual elements of the variance-covariance matrix $\mathbf{S}$ are calculated by averaging over the m groups the values found in (10) and (11):

(12) [equation lost in extraction]

The $T^2$ score is sensitive to a shift in the mean value of one or more of the variables. This score can be used in a one-sided control chart, whose limit is set according to the number of variables, the sample size and the acceptable false alarm rate. The control limit of the $T^2$ statistic is related to the cumulative F distribution at level $\alpha$:

(13) [equation lost in extraction]

which, assuming that the number of measurements is high, can be approximated by a simple chi-square distribution with p degrees of freedom:

(14) [equation lost in extraction]

Of course, the way the $T^2$ score has been defined here makes it the optimum statistic for controlling “unstructured” mean shifts, i.e., shifts that might happen in any direction within the p-dimensional space. This property is very useful in the context of our application, since shifts can indeed happen in any direction. When, however, particular, known directions are more susceptible to a shift, better statistics (such as the principal components [7] or the Z-scores [8]) might be utilized. In addition, although this property is not being investigated in this paper, the $T^2$ statistic can be extended to guard against a shift in the variance of the monitored parameter [7].

Another potential problem might arise from the fact that the $T^2$ statistic is not geared towards identifying a shift in the variance-covariance matrix and, in fact, will confound such a shift with a shift in the mean vector. Because of this, the $\mathbf{S}$ matrix has to be re-calculated every time a new time-series model is calculated.

Other multivariate control methods are, of course, available. Most, however, suffer from the significant disadvantage of requiring the monitoring of multiple control charts. Such methods might prove advantageous for analyzing an alarm for diagnostic purposes and will most probably be the subject of future work by the authors. For routine monitoring applications, however, the simplicity of having to maintain a single control chart makes the $T^2$ statistic a very attractive proposition.

E. Implementation of the Real-Time SPC Scheme

In summary, the real-time SPC scheme takes multiple sensor data that are auto-correlated and cross-correlated, and then feeds them into individual time-series filters that produce multiple, cross-correlated IIND residuals. Hotelling’s $T^2$ equations combine the cross-correlated residuals into a single real-time alarm signal. This sequence is illustrated in Fig. 4.

This alarm signal can be used either as a passive SPC alarm, or it can initiate a diagnostic procedure [5]. A software package has been developed to implement this real-time SPC scheme. It includes four modules: data manipulation, ARIMA filtering, Hotelling’s $T^2$ calculation, and alarm generation. These operations were initially implemented in the commercial statistical packages SAS™ [16] and RS/1™ [17]. Recently, we have completed independent implementations for Unix and DOS environments. Coupled with a SECSII server, either of these implementations is capable of actual real-time operation. The most recent implementation imports ARIMA models that are

generated interactively using the SAS/ETS™ (Econometric Time Series) module. The monitoring program then performs the real-time alarm generation function automatically. This sequence is illustrated in Fig. 5.

Fig. 4. Summary of the real-time SPC scheme.

Fig. 5. The implementation of the Real-Time SPC System. (Labels recoverable from the figure: set-up procedure, in which the user sets SARIMA models interactively; production run; plasma etcher; SARIMA models.)

IV. APPLICATION EXAMPLES

To date, the real-time SPC scheme has been successfully applied on a Lam Research Rainbow 4400 plasma etcher, and on an Applied Materials Precision 5000 cluster tool. These applications are described next.

A. The Lam Research Experiments

In these experiments, a number of 6" patterned polysilicon wafers were etched using a Cl2-based polysilicon etch recipe [18] on a Lam Research Rainbow 4400 single-wafer etcher. Through the SECSII protocol link, a remote host communicated directly with the Rainbow in order to acquire real-time analog data, using the Lam Station package provided by Brookside Software [19]. Using this package, up to 32 separate parameters can be sampled simultaneously at rates of up to 3 Hz. For this experiment, we monitored signals from the RF network because we found that they are very responsive to small process changes [20]. Two experiments are described next.

B. The First LAM Rainbow Experiment

The initial objective was to select the proper parameters from all of the available sensor readings, and also to find the proper time-series models for these parameters. To this end the machine was calibrated to a stable operating point by processing over 100 wafers. Four polysilicon wafers that were processed afterwards gave us the baseline data-set. The baseline data-set was used to select and characterize the appropriate time-series models, and to estimate the means and the variance-covariance matrix of the residuals for the $T^2$ calculation.

Although 25 parameters could be monitored from the LAM Rainbow etcher, using all of them in the scheme proved to be unnecessary, since only a few carried useful information. The criterion for selecting the relevant sensor readings was that the parameter must have some physical significance, in addition to being suitable for time-series modeling. This meant that after applying a reasonably simple SARIMA model to the parameter, the resulting residuals should be IIND. The five parameters that were finally selected are shown in Table I.

TABLE I
REAL-TIME SIGNALS COLLECTED FROM THE LAM RAINBOW 4400

Number  Name
1       Position of the RF tune vane
2       Position of the RF load coil
3       Amount of RF phase error
4       Plasma impedance between electrodes
5       Peak-to-peak voltage across the electrodes

The statistical behavior of these readings conveys a comprehensive picture of the etching conditions. Other readings of apparent significance, such as the RF power, the chamber pressure, or the gas flows, were not used. Since these parameters were actively controlled by the machine according to externally set targets, their readings were insensitive to internal machine changes.

The real-time data from the baseline wafers are plotted in Fig. 6. These plots show only the first 60 readings collected during the first minute of the RF cycle for each of the four baseline wafers. The frequency of data acquisition is about 1 Hz. It is obvious from these plots that the readings are not stationary and that they have strong “seasonal” patterns, where each new wafer constitutes a “season.”

The SARIMA(0, 1, 1) × (1, 1, 0)$_{60}$ model, listed in (15), was selected and fitted for all parameters⁵ with the help of the SAS/ETS™ statistical package:

$$a_t = (z_t - z_{t-1}) - (z_{t-60} - z_{t-61}) - \phi_1\,[(z_{t-60} - z_{t-61}) - (z_{t-120} - z_{t-121})] + \theta_1 a_{t-1} \tag{15}$$

⁵Naturally, the fact that the same SARIMA structure was applicable to all five signals in this example is just a coincidence. In general, different signals require different SARIMA structures.

This model was applied to all five monitored parameters, with different values fitted for the coefficients $\phi_1$ and $\theta_1$ for each parameter. The IIND residuals for the five observed parameters are plotted in Fig. 7. Note that (15) cannot be used to generate residuals for readings 1 to 121, since the data from the first two baseline wafers, as well as the first reading of the third wafer, are lost due to differencing. The plots in Fig. 7 show the residuals for the baseline data points 122 to 240.
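The filter in (15) and the adequacy check based on the acf in (7) can be sketched in a few lines of Python. This is an illustrative sketch rather than the authors' implementation: the function names, the zero initial value assumed for the first $a_{t-1}$, the coefficient values, and the synthetic test signal are all assumptions.

```python
def sarima_residuals(z, phi1, theta1, s=60):
    """Residuals a_t of equation (15); the first 2*s + 1 readings are lost to differencing."""
    a_prev = 0.0  # assumption: a_{t-1} = 0 before the first usable reading
    residuals = []
    for t in range(2 * s + 1, len(z)):
        w_t = (z[t] - z[t - 1]) - (z[t - s] - z[t - s - 1])
        w_lag = (z[t - s] - z[t - s - 1]) - (z[t - 2 * s] - z[t - 2 * s - 1])
        a_t = w_t - phi1 * w_lag + theta1 * a_prev
        residuals.append(a_t)
        a_prev = a_t
    return residuals


def acf(x, k):
    """Estimated autocorrelation coefficient r_k of equation (7)."""
    n = len(x)
    m = sum(x) / n
    num = sum((x[t] - m) * (x[t + k] - m) for t in range(n - k))
    den = sum((v - m) ** 2 for v in x)
    return num / den


# Synthetic signal (made up): a repeating 60-reading pattern plus a slow linear drift,
# covering four "wafers" of 60 readings each.
pattern = [((i * 7) % 13) / 13.0 for i in range(60)]
z = [pattern[t % 60] + 0.01 * t for t in range(240)]
res = sarima_residuals(z, phi1=0.3, theta1=0.5)
```

With 240 readings the function returns 119 residuals, matching the data points 122 to 240 mentioned above, and for this purely periodic signal with a linear drift the regular and seasonal differences cancel, so the residuals are numerically zero. On real residuals, values of `acf(res, k)` close to zero for all lags k would support the adequacy of the fitted model.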

Fig. 6. The original real-time data for the four baseline wafers. (Panel titles recoverable from the figure: Original RF Tune Vane, Original RF Load Coil, Original RF Plasma Impedance.)

Following the four polysilicon baseline wafers, a photoresist-covered wafer was introduced into the chamber, in order to simulate an inadvertent change in the loading factor. The $T^2$ statistic was obtained by taking the average of every ten consecutive readings. Since there are 60 consecutive readings per wafer collected at 1 Hz, we issue one $T^2$ value every ten seconds, for a total of 6 control points during the etch cycle of one wafer. The signals for wafers 1 and 2 are lost due to the seasonal differencing inherent in the SARIMA(0, 1, 1) × (1, 1, 0)$_{60}$ model. The $T^2$ chart is plotted in Fig. 8. Since the fifth wafer had poorly developed photoresist on top of the polysilicon, the loading factor is different from that of the baseline wafers; thus the plasma parameters are different. As expected, the $T^2$ score violates its control limit when wafer #5 starts processing.
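The grouping described above (one $T^2$ value for every ten consecutive readings) can be sketched as follows, using the definition in (9). The code is a minimal illustration under stated assumptions: the helper names, the identity matrix used for $\mathbf{S}$, and the residual values are hypothetical, and the control limit is left to the caller because its defining equations are not reproduced here.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    p = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for c in range(p):
        piv = max(range(c, p), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, p):
            f = M[r][c] / M[c][c]
            for k in range(c, p + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * p
    for r in reversed(range(p)):
        x[r] = (M[r][p] - sum(M[r][k] * x[k] for k in range(r + 1, p))) / M[r][r]
    return x


def t2_scores(residuals, S, n=10):
    """One T^2 score, as in equation (9), per group of n consecutive readings.

    residuals: list of rows, each row holding the p filtered residuals of one reading.
    S: p x p variance-covariance matrix estimated from baseline data.
    """
    p = len(S)
    scores = []
    for g in range(0, len(residuals) - n + 1, n):
        group = residuals[g:g + n]
        abar = [sum(row[j] for row in group) / n for j in range(p)]  # group mean
        x = solve(S, abar)                                            # S^{-1} abar
        scores.append(n * sum(abar[j] * x[j] for j in range(p)))
    return scores


# Made-up example: p = 2 residual signals, 60 readings (one wafer), groups of n = 10.
S = [[1.0, 0.0], [0.0, 1.0]]
residuals = [[0.1, -0.2] for _ in range(60)]
scores = t2_scores(residuals, S, n=10)  # six control points for the wafer
```

Each score would then be compared with a single upper control limit; a score above the limit raises the real-time alarm for that ten-second interval.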

Fig. 7. The IIND residuals for the four baseline wafers. (Panel titles recoverable from the figure: Filtered RF Tune Vane, Filtered RF Load Coil, Filtered RF Phase Error, Filtered Peak-to-Peak Voltage; horizontal axis ticks from 120 to 240.)

C. The Second LAM Rainbow Experiment

During the second LAM experiment, the objective was to test the method by introducing, one by one, a number of “faults” into the process. After carefully calibrating the machine, eight baseline wafers were etched before introducing the changes that are summarized in Table II. The T² control chart from this experiment is shown in Fig. 9. The faults associated with the alarms are marked on the top of the plot. (Since all the points associated with a known fault are also marked on the chart, we can be reasonably sure that the indicated alarms are real.) This scheme is able to pick up most of the process faults that we have introduced. The significance of this scheme is that it can detect very slight process changes which might affect the performance of the machine but cannot be seen on the traditional etch rate chart (Fig. 10).
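Counting alarms per wafer from a T² chart is simple bookkeeping, sketched below. The function name `alarms_per_wafer`, the chart values, and the limit are hypothetical; the only detail taken from the text is that the chart carries a fixed number of points per wafer.

```python
def alarms_per_wafer(t2_scores, ucl, samples_per_wafer, first_wafer=1):
    """Count the chart points above the control limit for each wafer."""
    counts = {}
    for i, score in enumerate(t2_scores):
        wafer = first_wafer + i // samples_per_wafer
        counts.setdefault(wafer, 0)
        if score > ucl:
            counts[wafer] += 1
    return counts


# Made-up chart values for three wafers, two points per wafer.
chart = [3.1, 4.0, 18.2, 25.7, 5.5, 19.0]
counts = alarms_per_wafer(chart, ucl=17.0, samples_per_wafer=2, first_wafer=9)
```

Summing the counts over the wafers that share a process condition gives a per-condition alarm tally of the kind reported in the tables.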


Fig. 8. T² Control Chart for the first LAM Rainbow Experiment (UCL = 9).

TABLE II
FAULTS AND ALARMS DURING THE SECOND LAM EXPERIMENT

Wafer No.    Process Condition
1 to 8       Reference Baseline
9, 10, 11    New Baseline Adjustment
12, 13       Replaced Minimatch Component
14, 15       New Machine Calibration
16, 17       Replaced DIP Card
18, 19       New Machine Calibration
20, 21       Replaced Minimatch Component for the 2nd time
22, 23       Change in Electrode Gap (1%)
24, 25       Pressure Miscalibration (5%)
26, 27       Mass Flow Controller Miscalibration (5%)
28, 29       Replaced Relay Component
30, 31       Replaced Relay Component for the 2nd time
32, 33       Introduced Contamination by skipping cleaning step
34, 35       Replaced Cable Set

Alarms Recorded (values in the order they survive in the source; their alignment with the rows above is not recoverable): none, 3, 2, 2, 2, 2, 2, 2, 2, 2, 1, none

Fig. 9. T² Control Chart for the second LAM Rainbow Experiment (UCL = 17, α = 0.05; horizontal axis: wafer number, 2 samples/wafer).

Fig. 10. Etch Rate Control Chart for the second LAM Rainbow Experiment (horizontal axis: wafer number).

D. The AME Precision 5000 Experiment

In this experiment we etched SiO2 grown on 8" wafers. The recipe used in the AME Precision 5000 Cluster Tool called for 150 mTorr of pressure, 350 W of RF power, 25 Gauss of magnetic field, 72 sccm of CHF3 flow, 5 sccm of O2 flow, and 12 sccm of Ar flow. The real-time signals listed in Table III were used for this analysis.

TABLE III
REAL-TIME SIGNALS COLLECTED FROM THE AME PRECISION 5000

Number    Name
1         DC Bias across the electrodes
2         Position of the RF Tune Blade
3         Position of the pressure-control Throttle

TABLE IV
FAULTS AND ALARMS DURING THE AME 5000 EXPERIMENT

Wafer No.    Process Condition                              Alarms Recorded
1 to 8       Reference Baseline                             none
9            20 sccm Ar (up from 12 sccm)                   missed
10, 11       Reference Baseline                             none
12           0 Gauss (down from 25 Gauss)                   4
13, 14       Reference Baseline                             none
15           375 Watts (up from 350 Watts)                  3
16, 17       Reference Baseline                             none
18           140 mTorr (down from 150 mTorr)                1
19, 20       Reference Baseline                             none
21           58 sccm CHF3 (down from 72 sccm)               missed
22, 23       Reference Baseline                             none
24           Bare Si wafer with normal SiO2 etch recipe     2
25, 26       Reference Baseline                             none

After carefully calibrating the machine, we introduced the changes summarized in Table IV. Most of these faults produced very visible real-time alarms, as seen in Fig. 11. Interestingly, except for a general downward trend, no alarms were seen on the traditional etch rate chart shown in Fig. 12. Some discussion is in order concerning the missed alarms during this experiment. The misses can be attributed to the fact that the only machine parameters monitored during this experiment were the chamber throttle position, the blade position, and the dc bias. Arguably, these parameters alone cannot give us enough information about the changing composition of the plasma, hence the missed alarms. The missing information could have been recovered by adding signals such as endpoint emission.

Fig. 11. T² Control Chart for the AME Precision 5000 Experiment (UCL = 15, α = 0.05; horizontal axis: wafer number, 3 samples/wafer).

Fig. 12. Etch Rate Control Chart for the AME Precision 5000 Experiment (horizontal axis: wafer number).

V. CONCLUSIONS AND FUTURE PLANS

We have presented a novel application of real-time statistical process control for monitoring state-of-the-art semiconductor manufacturing equipment. Through actual monitoring examples, we have shown that this technique can successfully flag internal machine variations long before their effects can be seen on the wafers. Since this method takes advantage of tool data in order to discover process drifts, it offers several advantages over traditional in-line monitoring schemes. Among these advantages are its demonstrated high sensitivity to process drifts and its very high immunity to the routine variation of the tool. Further, since the alarms are generated automatically and are available immediately after the wafer starts processing, faulty runs can be stopped before they impact the product. In addition, because of the ability of this technique to observe changes within the internal operation of the tool, we also expect it to be useful in driving an automated diagnostic package which, in turn, will be able to supply immediate diagnostic information to the operator. Adjustments, preventive maintenance, and further investigation might be planned by the operator (or by a computer-based expert system) based on this information.

An important element in this application, however, is the discovery of the appropriate time-series filter. In the examples that we have shown here, these filters were created by the interactive examination of the baseline data. In the future, these filters must be generated automatically. We are currently investigating several techniques for the automatic synthesis of the time-series filters, and this work will be the subject of a future publication.

ACKNOWLEDGMENT

The authors are grateful to IBM, Lam Research, National Semiconductor, Texas Instruments, the Semiconductor Research Corporation, the California MICRO program, and the National Science Foundation for supporting the work presented in this paper. The DOS and Unix software used during the real-time signal analysis was written by Eddie Wen. The SECS II server for the Lam Rainbow was donated by Brookside Software. The authors are also grateful to two anonymous reviewers, whose constructive criticism was instrumental in improving our paper.

REFERENCES

[1] W. E. Barkman, In-Process Quality Control for Manufacturing. New York and Basel: Marcel Dekker, 1989.
[2] G. E. P. Box and G. M. Jenkins, Time Series Analysis: Forecasting and Control, 2nd ed. San Francisco: Holden-Day, 1976.
[3] R. Harris, A Primer of Multivariate Statistics. New York: Academic Press, 1975.
[4] D. C. Montgomery and D. J. Friedman, “Statistical process control in a computer-integrated manufacturing environment,” in Statistical Process Control in Automated Manufacturing, J. Keats, Ed. New York and Basel: Marcel Dekker, 1989, pp. 67-87.
[5] G. S. May and C. J. Spanos, “Automated malfunction diagnosis of a plasma etcher,” in Proc. 1991 Int. Semiconductor Manufacturing Science Symp., May 1991.
[6] D. C. Montgomery, Introduction to Statistical Quality Control, 2nd ed. New York: Wiley, 1990.
[7] J. E. Jackson, “Multivariate quality control,” Communications in Statistics: Theory and Methods, vol. 14, no. 11, pp. 2657-2688, 1985.
[8] D. M. Hawkins, “Multivariate quality control based on regression-adjusted variables,” Technometrics, vol. 33, no. 1, pp. 61-76, Feb. 1991.
[9] W. A. Shewhart, Economic Control of Quality of Manufactured Product. New York: Van Nostrand. (Republished in 1981, with a dedication by W. Edwards Deming, by the American Society for Quality Control, Milwaukee, WI.)
[10] Western Electric Co., Statistical Quality Control Handbook, 1956.
[11] A. Pankratz, Forecasting With Univariate Box-Jenkins Models: Concepts and Cases. New York: Wiley, 1983.
[12] S. M. Kay and S. L. Marple Jr., “Spectrum analysis-A modern perspective,” Proc. IEEE, vol. 69, no. 11, Nov. 1981.
[13] C. J. Spanos, “On-line statistical process control of VLSI manufacturing equipment,” Research proposal to IBM, Nov. 1989.
[14] B. Sindahl, “Estimated time series and CIM,” private communication, Apr. 1990.
[15] W. Vandaele, Applied Time Series and Box-Jenkins Models. New York: Academic Press, 1983.
[16] C. Chatfield, The Analysis of Time Series: An Introduction, 4th ed. New York: Chapman and Hall, 1989.
[17] H. F. Guo, C. J. Spanos, and A. J. Miller, “Real-time statistical process control for plasma etching,” in Proc. 1991 Int. Semiconductor Manufacturing Science Symp., May 1991.
[18] SAS/ETS User’s Guide, Version 5 ed., SAS Institute Inc.


[19] RS/1 QCA User’s Guide: RS Series Quality Control Analysis, BBN Software Products Corporation.
[20] Lam Research, Rainbow: Excellence in Etch, 1991.
[21] P. Byrne and K. Heiman, “Equipment analysis with the SECS protocol and personal computers,” in ISMSS 1989.
[22] A. J. Miller, LamStation Performance and Modeling Evaluation. Lam Research, 1990.

Costas J. Spanos (S’77-M’85) was born in 1957 in Piraeus, Greece. He received the Electrical Engineering Diploma with honors from the National Technical University of Athens, Greece, in 1980 and the M.S. and Ph.D. degrees in electrical and computer engineering from Carnegie Mellon University in 1981 and 1985, respectively. From June 1985 to July 1988 he was with the advanced CAD development group of Digital Equipment Corporation in Hudson, MA, where he worked on the statistical characterization, simulation, and diagnosis of VLSI processes. In 1988 he joined the faculty of the Electrical Engineering and Computer Sciences department of the University of California at Berkeley, where he is now an Associate Professor. His research interests include the application of computer-aided manufacturing techniques in the production of integrated circuits. Dr. Spanos has served on the technical committees of the IEEE Symposium on VLSI Technology, the International Semiconductor Manufacturing Sciences Symposium, and the Advanced Semiconductor Manufacturing Symposium. He is the editor of the IEEE TRANSACTIONS ON SEMICONDUCTOR MANUFACTURING.

Hai-Fang Guo received the B.S. degree in electrical engineering and computer science from the University of California at Berkeley in 1989, and the M.S. degree from the same department in 1991. She is currently working on the development of an advanced control unit for Direct Access Storage Devices at the IBM San Jose site. Ms. Guo is a member of Eta Kappa Nu and SWE (Society of Women Engineers).

Alan J. Miller received his Bachelor’s Degree in Mechanical Engineering from the University of California, Berkeley, in May 1984. He has worked exclusively in the Semiconductor Equipment Industry since 1985. Mr. Miller worked as a Thin Films Process Engineer for E.T. Electrotech in Santa Clara, CA, specializing in sputtering and PECVD applications. He joined Lam Research in 1987 as a Plasma Etch Process Engineer. He is presently a Process Engineering Supervisor, involved in machine qualifications, process diagnostics, and improvement projects.

Joanne R. Levine-Parrill received the B.S. degree in materials science from Cornell University in 1984 and the Ph.D., also in materials science, from Northwestern University in 1990. The subject of her doctoral dissertation was the development of grazing incidence small angle x-ray scattering and the application of this technique to the kinetics of thin film island growth and migration. She was the recipient of the 1. 0. Jeffrey Prize for Excellence in Materials Design in 1984 and of the American Vacuum Society Russell and Sigurd Varian Fellowship in 1987. Dr. Levine-Parrill is currently employed at IBM, East Fishkill, working in a range of silicon processing development areas which include reactive ion etching, manufacturing process integration, advanced lithography, process diagnostics, and statistical process control.