Low-Open Area Endpoint Detection using a PCA based T2 Statistic and Q Statistic on Optical Emission Spectroscopy Measurements

David White1, Brian Goodlin1, Aaron Gower1, Duane Boning1, Han Chen1, Herb Sawin1, and Tim Dalton2,*

1 Microsystems Technology Laboratories, Massachusetts Institute of Technology, Cambridge, MA 02139

2 IBM Microelectronics, 1580 Route 52, Hopewell Junction, NY 12533, [email protected]

Abstract

This paper examines an approach for automatically identifying endpoint (the completion of etching of a thin film) during plasma etching of low open area wafers. Because many endpointing techniques rely on a few manually selected wavelengths or simply time the etch, the resulting endpoint determination may remain valid for only a small number of runs before process drift and noise render it ineffective. Only recently have researchers begun to examine methods to automatically select and weight spectral channels for estimation and diagnosis of process behavior. This paper explores the use of a principal component analysis (PCA) based T2 formulation to filter out noisy spectral channels and to characterize the spectral variation of optical emission spectroscopy (OES) measurements correlated with endpoint. This approach is applied and demonstrated for patterned contact and via etching using Digital Semiconductor’s CMOS6 (0.35 µm) production process.

I. Introduction

The ability to detect etch endpoint in low open area etches presents a challenge to even the most skilled process engineers. Most etches are timed based upon past experience with a particular reactor type and chemistry, and also based upon process setup verification with costly cross-sectional SEM analysis. As integrated circuit (IC) design moves toward higher density and smaller critical dimensions, the need to know when a particular layer has been removed without etching into the subsequent layer or laterally into features is crucial. Research into plasma etch endpoint detection and uniformity using optical emission spectroscopy (OES) [29,31,5,10,11], interferometry/ellipsometry [24,8,35,19], in-situ measurement of process conditions [2,23] and spatially resolved optical emission spectroscopy [21,4,8,6,7,32,7] has been very active over the last ten years. Advances in solid-state spectrometer technology now enable low-cost OES instruments that provide hundreds of spectral channels over a wide range of wavelengths. To accommodate the abundance of OES information, Le [17] has proposed the T2 formulation for determining endpoint. This paper examines the extension of Le’s approach to using principal component analysis (PCA) methods for selecting and weighting the appropriate spectral channels that are then used to compute the T2 statistic and subsequent endpoint during low open area patterned contact etch.

As described in Section II, endpoint detection involves determining, from multivariable spectral measurements consisting of hundreds of channels of data, the variance due to clearing from nominal etch behavior. The challenge as described in Section II is to determine what spectral channels are to be selected and how they should be weighted. Statistically, a method is required that measures the variance of a particular measurement within some data population. Section III describes the mathematical relationship between the use of PCA based decomposition to determine the principal spectral channels associated with etching and the use of T2 statistics which measure the generalized distance of a measurement from a nominal data population. From a linear systems point of view, we perform a congruence transformation on the spectral data to modal coordinates and retain only the most significant modes (eigenvector and eigenvalue pairs) in calculating T2. Since plasma etch represents a nonstationary process, any endpointing algorithm requires some method for calibrating the validity of the estimator with regard to drift in process mean and covariance. Section III also describes use of the Q-statistic to calibrate the PCA models used during endpoint detection.

The application of the PCA based T2 statistic for plasma etch endpoint monitoring is described in Section IV. This section describes the software development and real-time implementation requirements for in-situ endpoint detection. Section IV concludes with analysis regarding the motivation for using this approach for endpoint detection. The endpoint approach described in this paper was applied to an etch tool used for contact and via etch applications at Digital Semiconductor. Section V discusses the experimental set-up and equipment used for this application. The results from several experiments are presented in Section VI, including a comparison of endpoint detection using a 1% contact etch, 10% contact etch and a blanket polysilicon etch. Also presented are results from application of the Q-statistic to determine validity of the PCA model, in-situ results of the algorithm’s robustness after several weeks of processing, and scanning electron micrographs for various positions on the wafers. Section VII summarizes the main points resulting from this work and describes future directions for this research.

In the interest of avoiding confusion regarding terminology, care must be taken in the interpretation of the term "endpoint." Due to nonuniform film thicknesses (arising from deposition or CMP processing), to spatial variation in the etch rate across the wafer, or to etch nonuniformity with different patterned feature sizes, there is a variation in the time at which a film begins to clear or completes clearing in different locations across a wafer. Thus, there is no single unambiguous etch endpoint time, but rather a region or interval in time where some areas on the wafer are being over-etched while other areas are completing etch. In this paper, we will refer to endpoint as the time (or time interval typically spanning a few seconds) in which the smallest critical dimensions are completely etched across the wafer.

II. Motivation for Applying Spectral Characterization to Plasma Processes

There exists a broad range of publications [13,14,15,34,27] regarding the use of chemometrics based data manipulation for spectral characterization. Although there are numerous advantages of using chemometrics methods such as principal component analysis (PCA) and partial least squares (PLS), most applications of these methods primarily take advantage of their ability to extract a subset of useful relationships from tremendous amounts of data. For that reason, chemometric methods are commonly used in spectroscopy to characterize these relationships where a thousand variable measurement vector is commonplace. Some of the work presented here builds upon the application of PCA methods to in-situ spectral methods for plasma etch diagnostics and uniformity measurements [3,32]. The following figures outline the magnitude of this problem with regard to endpoint detection.

Fig. 1 plots a spectral sample of 1100 spectral channels (across wavelength) from 350 to 880 nm, as measured in real-time during etch. The dominant spectral channels associated with this particular gas chemistry are labeled as well. Fig. 2 (a, top) plots the time-series behavior of four of these 1100 channels across etch and endpoint, and illustrates that the spectral signature before endpoint (at 76 seconds) and after endpoint (at 100 seconds) changes appreciably, as indicated in Fig. 2 (b, bottom). From these figures, one can envision an 1100 variable measurement vector acquired every 0.3 seconds. For the 10% contact etch discussed in this section, that encompasses over 141,000 measurements per wafer. Obviously not every spectral channel is necessary, but a key difficulty is to decide which to keep and which to remove. Furthermore, a second challenge is to determine with what importance each channel/variable should be weighted -- a method is needed to determine this optimal weighting or relationship between these channels. Section III describes a commonly used chemometric approach to solving this problem.

[Figure 1: plot of intensity (a.u.) versus wavelength (nm) over 300 to 900 nm, with the dominant emission lines labeled by species (SiF, C2, CO, F and O).]

Fig. 1. Representative Optical Emission Spectra during main etch of oxide etching experiments. Dominant spectral lines are labeled with their corresponding species.

[Figure 2: (a) normalized intensity versus time (s), 0 to 180 s, for SiF (440 nm), C2 (516 nm), CO (483 nm) and O (777 nm), with the interval ∆ between "Before Endpoint" and "After Endpoint" marked; (b) change in intensity (after endpoint minus before endpoint) versus wavelength (nm), 300 to 900 nm, with the C2, CO, O and SiF lines labeled.]

Fig. 2. The normalized intensity of four spectral channels during 10% open oxide etch is shown in panel (a); panel (b) displays the difference in spectral intensities across the interval ∆ marked in panel (a). A characteristic change in emission intensity during endpoint is observed for the dominant spectral lines corresponding to the species SiF, CO, C2 and O. The line at 516 nm corresponding to C2 increases at endpoint since it is a reactant, whereas the lines corresponding to the product species CO, SiF, and O all decrease at endpoint.

III. Mathematical Description of the PCA-based T2 and Q Statistics

Principal component analysis (PCA) and related extensions have been widely presented [13,14,15,34,27]. Here we present only a summary description of our approach and the context under which a computationally efficient approximation of the T2 statistic is derived. This section focuses first on a brief review of Hotelling’s original formulation, and describes the transformation to a PCA based T2 value. The Q-statistic is then presented as a means of understanding the validity of a reduced-order PCA model.

3.1 Principal Component Analysis and T2

The Hotelling T2 formulation provides a weighted generalized distance measurement of a sample vector x within a data population (or event space) which uses estimates of the data mean vector $\bar{x}$ and covariance S:

$$\hat{T}^2 = (x - \bar{x})\, S^{-1}\, (x - \bar{x})^T \quad \text{where} \quad s_{jk} = \frac{1}{n-1} \sum_{i=1}^{n} (x_{ij} - \bar{x}_j)(x_{ik} - \bar{x}_k) \qquad (1)$$

Note that we have changed the covariance variable to S to denote the sample covariance, and are using $\hat{T}^2$ to denote our estimate of the T2 statistic and T to denote the matrix of scores. The elements of matrix S are expressed as $s_{jk}$. The columns represent variables and the rows represent experiments.

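We can rearrange this equation into matrix form as follows:

$$\hat{T}^2 = \begin{bmatrix} m_{i1} & \cdots & m_{in} \end{bmatrix} \begin{bmatrix} s_{11}^2 & \cdots & s_{1n}^2 \\ \vdots & \ddots & \vdots \\ s_{n1}^2 & \cdots & s_{nn}^2 \end{bmatrix}^{-1} \begin{bmatrix} m_{i1} & \cdots & m_{in} \end{bmatrix}^T \qquad (2)$$

where $m_{i1} = (x_{i1} - \bar{x}_1)$ and $m_{in} = (x_{in} - \bar{x}_n)$ are the mean-centered samples and $\bar{x}_n$ is the mean for channel n. The pseudo-inverse of the covariance matrix S can be expressed as:

$$S^{-1} = V \begin{bmatrix} \lambda_1 & & 0 \\ & \ddots & \\ 0 & & \lambda_n \end{bmatrix}^{-1} V^T \qquad (3)$$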

By substituting the pseudo-inverse into the expression above we acquire a form very similar to the SVD of the covariance matrix C in PCA,

$$C = U \Lambda V^{-1} = V \Lambda V^T, \qquad (4)$$

and the expression:

$$\mathrm{var}(t_k) = t_k^T t_k = v_k^T M^T M v_k = v_k^T C v_k = \lambda_k, \qquad (5)$$

where the variance of each score $t_k$ from PCA can be expressed as the eigenvalue $\lambda_k$ of the eigenvector $v_k$ associated with that score.

This is equivalent to performing a similarity transformation on the measured spectral data M during PCA and the following form is acquired:

$$\hat{T}^2 = M V \begin{bmatrix} \lambda_1 & & 0 \\ & \ddots & \\ 0 & & \lambda_n \end{bmatrix}^{-1} V^T M^T = T \begin{bmatrix} \lambda_1 & & 0 \\ & \ddots & \\ 0 & & \lambda_n \end{bmatrix}^{-1} T^T, \qquad (6)$$

where the columns of T are $t_j = M v_j$, and $v_j^T v_j = 1$. Equations (5) and (6) are equivalent because similarity transformations involve the multiplication of the unitary eigenvector matrices, V, whose inner product is the identity matrix. This form can be further simplified to:

$$\hat{T}^2 = T \begin{bmatrix} \lambda_1^{-1} & & 0 \\ & \ddots & \\ 0 & & \lambda_n^{-1} \end{bmatrix} T^T \qquad (7)$$

using the fact that the inverse of a diagonal matrix is simply the inverse of the elements along the diagonal.

Equation (7) provides a much more powerful form with regard to in-line use. The PCA transformation can be used to select only a small number of scores tk to reduce the dimension of the input space to a weighted combination of only the most important spectral channels. The resulting orthogonal basis can thus be used to quickly compute the T2 statistic in real-time during etch. If or when the covariance matrix needs to be updated, equation (7) allows for faster computation by calculating the inverse of each of the diagonal matrix elements rather than computing a full matrix inversion .
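As an illustration of equations (1) and (7), the following Python/NumPy sketch computes T2 for each sample both by direct inversion of the sample covariance and from the PCA scores, and then repeats the PCA calculation with a truncated model. The data are synthetic and the function names (`fit_pca`, `t2_pca`) are hypothetical, introduced only for this example; it is a minimal sketch under those assumptions, not the implementation used in this work.

```python
import numpy as np

def fit_pca(M):
    """Eigen-decompose the sample covariance of mean-centered data M (rows = samples)."""
    C = M.T @ M / (M.shape[0] - 1)
    eigvals, eigvecs = np.linalg.eigh(C)      # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]         # reorder by decreasing variance
    return eigvals[order], eigvecs[:, order]

def t2_pca(M, eigvals, eigvecs, n_components):
    """T2 per row from the leading scores: sum over k of t_k^2 / lambda_k."""
    T = M @ eigvecs[:, :n_components]         # scores, one row per sample
    return np.sum(T**2 / eigvals[:n_components], axis=1)

# Synthetic stand-in for spectral data: 200 samples of 6 correlated "channels".
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6)) @ rng.normal(size=(6, 6))
M = X - X.mean(axis=0)

# Equation (1): direct quadratic form with the inverse sample covariance.
S_inv = np.linalg.inv(np.cov(X, rowvar=False))
t2_direct = np.einsum("ij,jk,ik->i", M, S_inv, M)

# Equation (7): same statistic from PCA scores, with all and with two components.
eigvals, eigvecs = fit_pca(M)
t2_full = t2_pca(M, eigvals, eigvecs, n_components=6)
t2_reduced = t2_pca(M, eigvals, eigvecs, n_components=2)
```

With all components retained the two calculations agree to numerical precision; with fewer components each sample's T2 is a partial sum of non-negative terms, so it can only be smaller or equal.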


As with most open-loop estimation methods, some method of calibrating the PCA model is required to determine the validity of the estimates and whether a shift in spectral mean or covariance has occurred. The Q-statistic provides such a method.

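3.2 Q-Statistic for Calibrating Endpoint Estimation

As a general diagnostic method, the Q-statistic is another way to examine the variation of incoming samples or measurements with regard to the PCA model:

$$Q = m^T m - t^T t \qquad (8)$$

where m is the mean-centered measurement vector and t is its vector of scores in the reduced-order PCA model.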

In a statistical sense, the T2 statistic may be used to describe the distance of new measurements within the PCA model (eigenspace), and the Q-statistic may be used to indicate measurement variation outside the PCA model [12]. A change in the covariance of the spectra appears as a rotation of the eigenvectors of the measurement data toward eigenvectors excluded from the PCA model, whereas a shift in the spectral mean appears as a translation along the eigenvectors. A deviation in measurement covariance or mean could result from a change in process operating conditions, incoming wafer characteristics or window effects on the emitted light measured by the spectrometer.
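The Q-statistic of equation (8) can be illustrated with a small Python/NumPy sketch. The three-variable vectors and the function name `q_statistic` are hypothetical, chosen only to show that a sample lying inside the retained subspace gives Q = 0 while variation along excluded directions gives Q > 0; it assumes the retained loading vectors are orthonormal.

```python
import numpy as np

def q_statistic(m, V):
    """Q = m^T m - t^T t for a mean-centered sample m.

    V holds the retained loading vectors as orthonormal columns,
    so t = V^T m are the scores in the reduced-order model.
    """
    t = V.T @ m
    return float(m @ m - t @ t)

# Hypothetical three-variable model in which only the first direction is retained.
V = np.array([[1.0], [0.0], [0.0]])

in_model = np.array([2.0, 0.0, 0.0])       # lies entirely along the retained direction
out_of_model = np.array([2.0, 1.5, 0.5])   # has components along excluded directions

q_in = q_statistic(in_model, V)
q_out = q_statistic(out_of_model, V)       # equals 1.5**2 + 0.5**2
```

Both samples have the same score along the retained direction and therefore the same T2 contribution; only Q distinguishes them.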


[Figure 3: schematic for three measured variables (N = 3). PCA provides three natural coordinates or eigenvectors V1, V2 and V3. The actual measured variation in N-space is projected onto the retained principal component along the V1 direction; because only the V1 component is retained, the projections of the measurement onto the V2 and V3 axes are not kept.]

Fig. 3. Visualization of the Q-Statistic. Given three measured variables, PCA provides three principal components shown here as V1, V2, and V3. If only one component along the V1-direction is retained and components V2 and V3 are removed, then information associated with those directions is filtered out.

A visual example of this is shown in Fig. 3 where the PCA model consists of the V1-directional component only and the V2 and V3 directional components are excluded. Any measured vectors that reside in the V2 and V3 plane will have a null projection onto the V1 direction. If the variance in these directions is commonly associated with noise, the removal of vectors V2 and V3 increases the signal-to-noise ratio of the original measurement. However, if an event occurs which is not represented in the original data set and which has significant variance along the V2 and V3 axes, then that event will not be observed in the resulting scores or T2 calculation. If, over time, the process covariance drifts in such a way that the new principal direction vector (currently V1) has rotated toward V2 or V3, then continued projection along the original V1 vector will contain significant errors that may not be detectable. The Q-statistic, computed using equation (8), however, would increase sharply, indicating the rotation away from V1 and the need to recompute the PCA coordinates.

IV. Application of PCA and T2 to Plasma Etch Endpoint

As critical device dimensions decrease and the manufacturing costs increase, greater importance is placed upon higher yield and reduced variation during manufacture. To adequately address these concerns, run-by-run control, multivariate statistical process control methods and innovative sensor technologies have been introduced. Plasma processes in particular present a difficult problem in that any sensor must passively monitor, in real-time, the etch process from outside the chamber, such as through optical emission spectroscopy or electrical/power related parameters. Multivariate methods are necessary to address the large number of variables measured at the rate of one to a hundred times a second (1-100 Hz) during processing, such as the hundreds of spectral channels over a wide range of wavelengths commonly provided by optical emission spectrometers.

Principal component analysis is a powerful multivariate method to extract important correlations between these variables while reducing the dimensionality of the measurement vector. Also, by exploiting correlations between variables, the effective signal-to-noise ratio is often significantly increased (as evident in the described application here to low open area etch). Multiple components produced by PCA may represent multiple reactions, for example the consumption of reactants and the creation of products, which together may combine to describe the onset of etch, clearing and overetch. To more accurately track the variation of a process in response to these process conditions within the PCA model, the PCA based T2 statistic may be employed.

4.1 Methodology for endpoint detection implementation in real-time

Real-time endpoint detection (Fig. 4) for an industrial plasma etching system requires:

1. prior development of a PCA model using experimental training data,
2. a real-time data acquisition system for the collection of OES data,
3. software to calculate T2 in real-time, and
4. a threshold detection level to automatically stop the process, or a method to display the T2 value on a monitor so that an operator may stop the process.

[Figure 4: block diagram with the following elements: Applied Materials 5300 HDP oxide etcher; 350-880 nm spectrometer (photon energy converted to electrical signals); data acquisition and A-D conversion; PC (Windows/Visual C) performing the PCA-T2 computation and GUI endpoint detection; STOP signal.]

Fig. 4. Block diagram of the experimental set-up for real-time endpoint detection.

Fig. 5 provides a block diagram of the endpoint detection scheme for development (Fig. 5b) and in-situ endpoint detection (Fig. 5a). Once a target reactor/process for endpoint detection is chosen, a series of experiments are initiated, etching wafers at nominal operating conditions, over several runs to ensure that significant process drift is captured. The raw OES data taken in sequential time steps throughout an etch is formatted as an r by n matrix, where r is the total number of samples acquired within a run and n is the number of spectral measurements (or channels) collected on each run. The DEC patterned etch data used in this report has 1000 samples taken at a sampling period of 500 ms over 1100 spectral channels. Thus, in the resulting raw data matrix each row corresponds to one time sample of one particular experimental run/wafer, and each column represents the intensity at one of the sampled wavelengths over the total measured spectral range. The raw data matrix is then mean-centered with respect to each wavelength or channel to produce the data matrix that is decomposed using PCA.

[Figure 5: two flow diagrams. (a, left) In-situ sequence: new measurement data; mean center; projection on the loads to give the T scores; T2 computation giving the T2 values; "Endpoint Occurred?" decision (NO: continue; YES: signal endpoint); with a Q-statistic computation as an alternative calculation or redundancy check. (b, right) Off-line sequence: database of spectral data acquired and put in matrix form; data is mean centered; covariance of data matrix calculated; SVD of covariance matrix giving the eigenvalues and V; % variance captured by each component calculated; number of principal components (loads) chosen from eigenvalue analysis. The loads and eigenvalues are passed to the in-situ sequence.]

Fig. 5.a. (left) shows the sequence of in-situ steps that were implemented in real time to monitor clearing as well as the Q-statistic based calibration check. The real-time data acquisition system consists of an Ocean Optics S2000 Spectrometer that is connected through a data acquisition board to a PC running a Windows C++ application. Fig. 5.b. (right) shows the sequence of off-line steps involved in determining an adequate model for PCA based decomposition of experimental data.

This approach can also be extended to use multiple wafers, where the r by n data matrix becomes (b x r) by n, with b the number of wafers. If each run or wafer is treated as a “sample” in a run-by-run type approach, the inclusion of additional wafers is one way of incorporating etch variation with respect to time. The additional runs or examples also serve to increase the signal-to-noise of each channel. There are several more advanced approaches, often called multi-way PCA [33], to incorporate temporal information over multiple runs and these approaches are to be addressed in future work.

Once the mean-centered data matrix is acquired, PCA is applied, decomposing the data into a matrix of eigenvectors (V) and a diagonal matrix (Λ) composed of eigenvalues (λ). Since a smaller subset of components is often selected, the dimensionality of the eigenvector and eigenvalue matrices is reduced. The reduced matrices are then used during real-time operation to compute and detect a shift in the T2 value using equation (7). As this equation indicates, the computation is fast, requiring only matrix multiplications.
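The off-line model development described above might be sketched as follows in Python/NumPy. This is a minimal illustration that assumes synthetic data, a hypothetical function name (`build_pca_model`) and a hypothetical 99% variance target for choosing the number of components. It applies the SVD to the mean-centered data matrix rather than to the covariance matrix, which yields the same eigenvectors without forming an n by n covariance matrix.

```python
import numpy as np

def build_pca_model(runs, variance_target=0.99):
    """Build a reduced PCA model from a list of (r x n) training runs."""
    X = np.vstack(runs)                        # stack b runs into a (b*r) x n matrix
    mean = X.mean(axis=0)                      # per-channel mean
    M = X - mean                               # mean-centered data matrix
    _, s, Vt = np.linalg.svd(M, full_matrices=False)
    eigvals = s**2 / (M.shape[0] - 1)          # eigenvalues of the sample covariance
    captured = np.cumsum(eigvals) / eigvals.sum()   # cumulative fraction of variance
    n_keep = min(int(np.searchsorted(captured, variance_target)) + 1, len(eigvals))
    return mean, Vt[:n_keep].T, eigvals[:n_keep], captured

# Synthetic stand-in for two training wafers: 50 time samples x 20 channels each,
# built from two underlying spectral patterns plus a small amount of noise.
rng = np.random.default_rng(1)
time = np.linspace(0.0, 1.0, 50)
patterns = rng.normal(size=(2, 20))

def fake_run():
    signal = np.outer(10.0 * time, patterns[0]) + np.outer(np.sin(6.0 * time), patterns[1])
    return signal + 0.01 * rng.normal(size=(50, 20))

mean, V, eigvals, captured = build_pca_model([fake_run(), fake_run()])
```

The returned mean, loadings and eigenvalues are what a real-time T2 calculation would need; the cumulative variance curve is returned so that the choice of components can be inspected rather than taken on trust.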

During in-situ endpoint detection, the statistical mean of each spectral channel is determined in real time using a recursive estimator (e.g. EWMA or exponentially weighted moving average [9]) and current measurements are mean-centered. Keeping the same notation as the PCA derivation, we subtract the recursively estimated mean $\bar{x}_j(k-1)$ for channel j from the data for channel j prior to the T2 calculation. The estimated mean $\bar{x}_j(k)$ is calculated from the prior estimate at time sample k-1 and the current spectral measurement $x_j(k)$ acquired at k: their difference is multiplied by a gain factor α (less than one), and the current mean for channel j, $\bar{x}_j(k-1)$, is updated at time k as:

$$\bar{x}_j(k) = \bar{x}_j(k-1) + \alpha \left( x_j(k) - \bar{x}_j(k-1) \right) \qquad (9)$$
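The recursive mean estimate of equation (9) can be sketched in a few lines of Python/NumPy, assuming the usual EWMA form in which the mean moves a fraction α of the way toward the newest measurement. The function name `ewma_update`, the gain of 0.1 and the two-channel values are hypothetical and serve only to make the arithmetic visible.

```python
import numpy as np

def ewma_update(mean_prev, x_new, alpha):
    """One EWMA step: move the running mean a fraction alpha toward x_new."""
    return mean_prev + alpha * (x_new - mean_prev)

alpha = 0.1                              # hypothetical gain factor, less than one
mean = np.array([100.0, 50.0])           # hypothetical starting means for two channels

measurements = [np.array([110.0, 50.0]), np.array([110.0, 40.0])]
for x in measurements:
    centered = x - mean                  # mean-center with the estimate from time k-1
    mean = ewma_update(mean, x, alpha)   # then update the estimate for time k
```

Centering with the estimate from the previous time step, before the update, keeps the newest sample from partially cancelling itself in the T2 calculation.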

Another method that provides similar performance is the mean-centering of real-time spectral measurements from the current wafer using the spectral mean calculated from the prior wafer. There are obvious disadvantages to this approach in the case where significant machine downtime occurs between wafers. With either method, the scores are then weighted and summed to determine the T2 values, which are monitored by a shift detection scheme to signify clearing. This scheme could take the form of a probabilistic estimation [25], a threshold trigger or visual inspection by the operator.
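One possible form of a threshold trigger is sketched below in Python/NumPy. The rule itself (a threshold set several standard deviations above an initial baseline window, with a required number of consecutive exceedances) and all names and parameter values (`detect_shift`, `baseline_len`, `k_sigma`, `consecutive`) are assumptions made for illustration; the text does not specify the shift detection scheme in this detail.

```python
import numpy as np

def detect_shift(t2_values, baseline_len=20, k_sigma=5.0, consecutive=3):
    """Return the index where T2 first stays above threshold, or None.

    The threshold is the baseline mean plus k_sigma baseline standard
    deviations; `consecutive` samples in a row must exceed it, so that a
    single noisy spike does not trigger endpoint.
    """
    t2 = np.asarray(t2_values, dtype=float)
    base = t2[:baseline_len]
    threshold = base.mean() + k_sigma * base.std(ddof=1)
    run = 0
    for i in range(baseline_len, len(t2)):
        run = run + 1 if t2[i] > threshold else 0
        if run >= consecutive:
            return i - consecutive + 1   # first sample of the sustained excursion
    return None

# Hypothetical T2 trace: a quiet baseline, one isolated spike, then a sustained rise.
trace = [1.0, 1.2] * 10 + [1.1, 9.0, 1.0, 9.0, 9.5, 10.0, 11.0]
onset = detect_shift(trace)
```

Requiring several consecutive exceedances trades a short detection delay for robustness against isolated spikes, which matters when the stop signal terminates the etch.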

As described in Section 3.2, there is a need to continually monitor changes in variables that are not represented in the PCA model and thus require a new decomposition. As shown in Fig. 5a, the Q-statistic may be calculated in parallel to the T2 computation to indicate when a process has drifted away from the PCA model and to determine when a transition between layers has occurred. During this transitional period, spectral intensities that have been previously filtered out in the PCA transformation (the reduction to the truncated model t) will show up in m, so that the two terms in equation (8) are no longer equal and Q increases. Another application of these two statistical measurements is endpoint determination for etching through multiple layers. In this application, the T2 statistic would be used to determine endpoint based on PCA decomposition of data gathered during a particular phase of the etch (e.g. with a specified chemistry and set of process conditions). So for a multilayer etch, PCA models for each etch chemistry are used to compute a T2 statistic for endpoint determination and the Q-statistics are used to monitor transitions between layers.

4.2 Motivation for using Principal Component Analysis to Calculate a Modified T2 Statistic

Using principal component analysis (PCA) to calculate the T2 statistic has several benefits over the traditional approach to calculating T2. Before describing the advantages, however, it is important to note that the formulation of the T2 statistic, which was shown to be mathematically obtainable through a congruence transformation, gives a fundamentally different result when the number of components is truncated than if all components are retained. In the original formulation of T2, deviations from a main etch population are determined, so that endpoint or any other significant deviation results in an increasing T2 value. This conventional formulation is reproducible if PCA is applied only to the main etch data (not including data from the endpoint interval) and all of the scores are kept. Fig. 6 depicts an example of this where the same result is obtainable whether using the original T2 formulation or the PCA formulation where all components are retained in the model. The one drawback to this approach is that spectral lines that are evident only during endpoint may be weighted so small in the etch-only PCA model that they are never observed during endpoint. In Fig. 7, in-situ etch and endpoint data from a 10% contact etch are used to create two separate data sets. The differences in both spectral lines and weightings due to the truncation of PCA components can be observed (a 10% contact etch case is chosen because the effects of drift are negligible and the truncation effect is more easily discerned).

[Figure 6: plot of T2 versus time (s), 0 to 150 s, for 10% open area oxide etch; the T2 axis spans 0 to 2000.]

Fig. 6. This plot of T2 versus time for 10% open area oxide etch was identically created using the original T2 formulation and the PCA calculation of T2, where the PCA was applied only to the main etch data, prior to endpoint.

[Figure 7: two plots of principal component weighting versus spectral channel (0 to 1200): etch-only data (top) and endpoint-only data (bottom).]

Fig. 7. PCA was performed on separate ranges of data, from etch and then from endpoint, for a 10% contact etch. The top plot is the first principal component from the etch-only data and the bottom plot is the first component from the endpoint data. The 10% case is used so that the effects of drift are negligible and differences between the dominant spectral lines and weightings can be observed.

Our initial approach has been to include endpoint in our calculation of a “modified T2 value”; other approaches will be discussed in future papers. Since endpoint data is now included, the changes in spectral intensities due to endpoint will be included in one of the first few principal components, enabling detection of large changes along these components. An adequate endpoint detection scheme, however, could also be developed from the application of PCA to etch-only data. As outlined above, the only drawback is that endpoint may have significant variation in directions which are removed from the PCA model. Regardless of whether or not endpoint data is used to create the PCA model, one advantage of using PCA based T2 over the conventional method is that PCA can be considered a superset of the simple T2 approach: if PCA is applied correctly and all components are retained, the results are the same as a T2 based computation using all available data. So by using PCA, we have the opportunity to flexibly select components for inclusion or removal with subsequent endpoint detection improvements.

[Figure 8: three plots of contribution to the T2 calculation versus time (s), 0 to 180 s: principal component #1 (top), principal components #2, 3 and 4 (bottom left), and principal component #5 (bottom right).]

Fig. 8. The first principal component based T2 value (top) clearly shows that the event corresponding to endpoint has occurred between 90 and 100 seconds in the etch. The contribution to T2 for the 2nd, 3rd and 4th principal components (bottom left) still adds positively to the identification of endpoint, but by the 5th principal component (bottom right) the information is largely lost in the noise.

A second benefit, which takes advantage of the flexibility of PCA and T2 integration, is that the signal-to-noise ratio can be improved by removing unimportant principal components in the calculation of T2. As shown in Fig. 8, only the first three principal components appear to detect any shift due to endpoint, and in fact, only one principal component provides a reasonable indicator. One tool that is often used in PCA is the percent variance captured as a function of principal component; for the case shown, the first two principal components capture over 99% of the variance in the principal component model. If only one component is retained, the resulting T2 plot would simply be the top graph in Fig. 8, which shows an improved signal-to-noise ratio compared with the T2 plot in Fig. 6.
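The per-component contributions plotted in Fig. 8 correspond to the individual terms of the T2 sum. A minimal Python/NumPy sketch follows; the function name `t2_contributions` and the toy loadings, eigenvalues and sample are hypothetical values chosen so the arithmetic can be checked by hand.

```python
import numpy as np

def t2_contributions(m, V, eigvals):
    """Per-component terms t_k^2 / lambda_k for one mean-centered sample m."""
    t = m @ V                  # scores along each retained loading vector
    return t**2 / eigvals

# Hypothetical two-component model in a three-channel space.
V = np.eye(3)[:, :2]               # loadings: the first two coordinate axes
eigvals = np.array([4.0, 1.0])     # variances captured along those loadings
m = np.array([4.0, 1.0, 7.0])      # mean-centered sample; third channel is outside the model

contrib = t2_contributions(m, V, eigvals)
t2 = contrib.sum()                 # T2 is the sum of the retained contributions
```

Tracking the terms separately, rather than only their sum, is what allows a shift to be attributed to a particular eigenvector direction.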

A third advantage of using PCA is that the principal component mapping of the original data results in a set of eigenvectors and corresponding eigenvalues in order of decreasing variance. By observing the PC scores, or alternatively the contributions of each of the principal components to the T2 calculation, a shift may be more easily detected (see Fig. 8) and identified with a particular eigenvector direction. This characteristic is particularly useful for diagnostics where each eigenvector direction may be correlated with a certain process variable and a shift in mean or covariance from nominal operating conditions can be isolated to that variable. Along the first through third principal component directions, a shift is observed corresponding to endpoint. However, by the 4th and 5th principal components much of the information has become lost in the noise. Thus, a primary benefit of using principal component analysis in the PCA based calculation of T2 is that one may gain knowledge about the relative direction in which a shift is occurring. The one exception is shown in Fig. 3 where a shift occurs in an eigenvector direction that has been removed during PCA.

A fourth benefit of using PCA for the calculation of T2 is the computational efficiency of using T2 during the production runs. In the conventional calculation of T2 there are two major calculations corresponding to $2n^2 + 3n$ floating point operations each time data is collected, where n is the number of spectral channels collected. When the PCA based technique is employed there are three primary calculations totalling $n + 2nm + 3m$ floating point operations, where m is the number of retained principal components. For a typical case of interest, with n = 1100 spectral channels and m = 3 principal components, these counts are 2,423,300 and 7,709 respectively, so the PCA based technique has 1/314 as many floating point operations each time data is collected, enabling rapid sampling and real-time calculation of the T2 statistic during a run.

V. Description of Experimental Set-up

Oxide etch data were collected at Digital Semiconductor using an Applied Materials HDP Oxide etch system to etch contacts and vias in 200-mm wafers. The primary etchant used in the main etch step of the process was C2F6, with O2 added in the post-etch treatment (an in situ polymer and photoresist strip process). Optical emission measurements were made using an Ocean Optics S2000 spectrometer with a single fiber optic cable looking in at the side of the reactor. These measurements consisted of 1100 spectral channels over a wavelength range from 350 to 880 nm. The first set of experiments was conducted in early 1997 and included a 1% contact, 10% contact, thick oxide and a blanket resist etch (to check for false detection). The second set of experiments was conducted in January of 1998 to acquire OES data over the entire etch and slightly past the point believed to be the end of clearing. The third set of experiments used the endpointing system developed from the second experimental set (acquired one month earlier) to determine endpoint or clearing. During etch, the T2 values were calculated and the process terminated once the T2 value shifted significantly, indicating the completion of clearing. The resulting wafers were analyzed with cross-sectional SEM (scanning electron microscope) measurements.

In the polysilicon etching process, a Rainbow LAM TCP 4500 etcher at MIT was used to etch blanket 150-mm polysilicon wafers (100% open area) with 5000 Å of polysilicon over oxide. The etching chemistry was a 5:1 ratio of HBr:Cl2 with a flow rate of 180 sccm. The chamber pressure was 20 mTorr. Top power was applied at 300 W with a bottom bias power of 25 W. As with the oxide etch experiments, separate training and test wafers were used to develop and test the endpoint detection system.

VI. Results and Analysis

Our experiments were designed to determine whether PCA based T2 estimation from OES spectra would provide superior results to current methods where one or two dominant spectral lines, corresponding to an etchant or reactant, are monitored. Low open area etch endpointing (e.g. 1% oxide / 99% photoresist etch of contact or via) of 200-mm wafers using one or two spectral lines is extremely difficult and represents a challenging test for any endpointing algorithm. As such, these experimental results are focused upon a 1% contact etch but will also examine a 10% contact etch and a 100% open-blanket polysilicon etch for comparison. Direct comparisons between dominant spectral lines and the PCA based T2 estimates are presented in Section 6.1.

Any real-time algorithm must also be able to generalize to accommodate process shifts and changes which occur from run to run; this issue is discussed in Section 6.2. To determine the robustness of the PCA based T2 approach, the endpoint estimation algorithm was developed using a batch of 1% contact etch wafers and then tested in real-time on a separate group of wafers processed three weeks later. To verify these results, scanning electron microscope (SEM) analysis was conducted and the results are presented in Section 6.4.

6.1 Comparison of Endpoint Detection for Blanket, 10% Open Area, and 1% Open Area Etch

Fig. 9 plots the dominant spectral line for each of three etches, with 100%, 10%, and 1% open area, respectively. As mentioned in the experimental set-up, the etch consisted of a number of steps before and after the etch of the thin film of concern; to compare the results, the time axis for each etch is set to zero at the beginning of etch (relative time). For the 100% open-blanket polysilicon etch, endpoint (or the completion of clearing) can easily be observed around 100 seconds. For the 10% contact etch, endpoint is more difficult to determine; however, after examining several wafers, the characteristic increase around 90 seconds confirms near completion of clearing. For the 1% contact etch, the dominant drift component of the signal obscures the endpoint. There are likely several reasons for this, one of which is the low area of oxide being etched in comparison to wall and temperature effects in the oxide reactor. Additionally, oxide endpoint detection is more difficult in this etch system due to the presence of Si and SiO2 parts inside the etch chamber; these process kit parts also etch, contributing to the observed optical emission background. Since these effects show up across many of the same lines, selecting and weighting individual lines so as to observe endpoint while ignoring the drift is very difficult.

[Figure 9 plot: three panels of intensity (AU) versus time (s) for the 100% open blanket etch, the 10% contact etch, and the 1% contact etch.]

Fig. 9. Plot of dominant spectral channel for three etches is shown.

PCA was performed on the OES spectral measurements for selected wafers from each set of experiments: the 100% polysilicon, 10% open-area oxide and 1% open-area oxide etches. The three data sets consist of 1100 spectral channels and roughly 200 spectral time samples from etch to clearing. All the results presented here represent tests performed on production wafers that have been separated from those used during development of the estimators (i.e. training data). For the 100% poly and 10% oxide cases, one principal component was able to capture 99% of the variation, whereas for the 1% open-area etch two components were needed to capture 99% of the variation. The eigenvectors from each wafer type were incorporated into the T2 formulation presented in Section II to produce three separate estimators, which are compared in Fig. 10. Each estimator receives the 1100-channel spectral measurement vector in real time and produces the T2 value for that particular measurement.
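As an illustration of such an estimator, the following is a minimal sketch rather than the software used in these experiments. It assumes that T2 takes the usual form of a sum of squared scores divided by the corresponding eigenvalues, and that the mean spectrum, eigenvectors and eigenvalues come from an offline PCA of training spectra. The function name `t2_statistic` and all numerical values are hypothetical.

```python
def t2_statistic(x, mean, loadings, eigenvalues):
    """T2 of one spectrum x in the subspace spanned by the retained eigenvectors.

    loadings is a list of m eigenvectors (each of length n, assumed orthonormal)
    and eigenvalues holds the m corresponding variances from the training PCA.
    """
    centered = [xi - mi for xi, mi in zip(x, mean)]  # mean-center the spectrum
    t2 = 0.0
    for p, lam in zip(loadings, eigenvalues):
        score = sum(pi * ci for pi, ci in zip(p, centered))  # project onto one eigenvector
        t2 += score * score / lam  # scale by that component's variance
    return t2


# Toy case with n = 4 channels and m = 2 components (made-up numbers).
mean = [10.0, 20.0, 30.0, 40.0]
loadings = [[0.5, 0.5, 0.5, 0.5], [0.5, -0.5, 0.5, -0.5]]
eigenvalues = [4.0, 1.0]

print(t2_statistic(mean, mean, loadings, eigenvalues))                      # 0.0
print(t2_statistic([12.0, 22.0, 32.0, 42.0], mean, loadings, eigenvalues))  # 4.0
```

In the second call every channel is 2 units above the mean, so the shift lies entirely along the first eigenvector: the score is 4 and T2 = 16/4 = 4.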

Because both figures cover the same hundred-second span of samples from the onset of etch to endpoint (of which only one channel was plotted in Fig. 9), Figs. 9 and 10 can be directly compared. For the 10% contact etch, endpoint or the near completion of etch is more easily observed and the process and measurement noise observed in the single channel plot, Fig. 9, is significantly reduced. For the 1% contact etch, clearing was observed to begin with the abrupt change in slope around 112 to 125 seconds, but the effects of drift are visible as a bowl-shaped increase in the T2 values. While deriving an algorithm to detect this change in slope would be difficult given the influence of process drift in the signal, after some practice we were able to determine this change visually in real-time. Further examination, in the next section, of the two principal components for the 1% open area case offers some hope for removing the drift component from the signal and developing algorithms for automating endpoint.

[Figure 10 plot: three panels of T2 values versus time (s) for the 100% open blanket etch, the 10% contact etch, and the 1% contact etch.]

Fig. 10. Plot of T2 values for the same three etches is shown.

6.2 Effects of Principal Component Selection on Detection and Drift

In Fig. 11, the PCA derived T2 values using the first and second eigenvectors (components) alone and then the combined vectors are plotted for the 1% contact etch. The first component seems to align primarily with drift and the second component with the change in slope around 118 to 120 seconds that we believe is endpoint. The lighter dotted line in the second subplot is an averaging filter to smooth the noisy raw signal.
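The per-component view can be sketched as follows, assuming T2 is a sum of one term per retained eigenvector, so that the combined value is the sum of the individual contributions. The averaging filter is shown as a causal moving average; the window length and the function names `t2_contributions` and `moving_average` are illustrative assumptions, since the filter details are not specified here.

```python
def t2_contributions(x, mean, loadings, eigenvalues):
    """Return the T2 term contributed by each retained eigenvector separately."""
    centered = [xi - mi for xi, mi in zip(x, mean)]
    terms = []
    for p, lam in zip(loadings, eigenvalues):
        score = sum(pi * ci for pi, ci in zip(p, centered))
        terms.append(score * score / lam)
    return terms


def moving_average(values, window):
    """Causal moving average over the current and up to window - 1 previous samples."""
    smoothed = []
    running = 0.0
    for k, v in enumerate(values):
        running += v
        if k >= window:
            running -= values[k - window]  # drop the sample that left the window
        smoothed.append(running / min(k + 1, window))
    return smoothed


# Toy case with two channels and two components (made-up numbers).
terms = t2_contributions([2.0, 3.0], [0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]], [4.0, 1.0])
print(terms, sum(terms))                          # [1.0, 9.0] 10.0
print(moving_average([1.0, 2.0, 3.0, 4.0], 2))    # [1.0, 1.5, 2.5, 3.5]
```

Plotting each entry of `terms` against time gives the first two panels of a figure like Fig. 11, and their sum gives the combined trace.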

[Figure 11 plot: three panels versus time (s) showing T2 for eigenvector 1, T2 for eigenvector 2, and T2 for eigenvectors 1 and 2 combined.]

Fig. 11. Plot of 1% etch T2 values for the 1st, 2nd, and combined eigenvectors is shown.

The effects of mean-centering when a strong drift component is present must be dealt with carefully. For these results, a recursive mean update algorithm was used to center each incoming spectral measurement at the onset of etch. However, the drift component causes variance between each measurement and the spectral mean during the run. This linear increase appears as a quadratic effect when passed through the PCA based T2 algorithm and is observed in the first component. Since the EWMA weighting coefficient has to be small enough not to filter out significant variance due to endpoint, the primary hope is that drift and endpoint are correlated with one eigenvector or the other and the effects of drift can be removed.
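The quadratic effect can be made explicit with a short worked equation. This is a sketch under stated assumptions: the mean $\mu$ is held at its value from the onset of etch, the spectrum drifts by a fixed vector $d$ per sample so that $x_k = \mu + k\,d$, and T2 is the sum of squared scores divided by the eigenvalues $\lambda_i$ of the $m$ retained eigenvectors $p_i$. The symbols $d$, $k$, $p_i$ and $\lambda_i$ are introduced here for illustration only.

```latex
x_k - \mu = k\,d
\quad\Longrightarrow\quad
t_{i,k} = p_i^{\mathsf T}(x_k - \mu) = k\,\bigl(p_i^{\mathsf T} d\bigr),
\qquad
T^2_k = \sum_{i=1}^{m} \frac{t_{i,k}^2}{\lambda_i}
      = k^2 \sum_{i=1}^{m} \frac{\bigl(p_i^{\mathsf T} d\bigr)^2}{\lambda_i}.
```

Each score grows linearly with the sample index $k$, so T2 grows as $k^2$, and the growth is concentrated in whichever eigenvector is most closely aligned with the drift direction $d$.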

To further illustrate this fact, mean centering was performed near endpoint and the results are plotted in Fig. 12. Notice that the T2 value diminishes as the spectral measurement approaches the location where mean centering was performed and then increases again as clearing starts. However, the more interesting characteristic is that the T2 value using the second component does not seem to be affected by mean centering. The variation in the signal seems correlated to endpoint, and this result is consistent across all wafers from the Digital Semiconductor experiments. Although further study is needed, the ability of the second component to isolate endpoint variation from drift would improve the robustness of this algorithm significantly. The primary disadvantage of using only the second component is that the signal-to-noise ratio is nearly an order of magnitude lower than when the first component is kept and the effects of drift are filtered out.

[Figure 12 plot: three panels versus time (s) showing T2 for eigenvector 1, T2 for eigenvector 2, and T2 for eigenvectors 1 and 2 combined.]

Fig. 12. The effect of mean centering on drift is indicated in the first component which does not seem to affect the relationship of the second component with regard to endpoint.

6.3 Q-Statistic and T2 Results using PCA Models Developed on Etch-only Data

Fig. 13 shows the Q-statistic for the 10% contact etch case where the PCA model has been developed from etch-only data (no clearing data included); in this application endpoint detection and calibration are inseparable. Note the values decreasing toward the onset of etch (as the process approaches etch) and deviating away again at endpoint (the conclusion of etch). This figure can be directly compared to the T2 endpoint computations in Fig. 8. The advantage of using the Q-statistic for endpoint detection is the high signal-to-noise ratio, as indicated by the units on the y-axis. The disadvantage, as pointed out in Section IV, is that any deviation from the PCA model will resemble endpoint, including drift.
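For reference, a minimal sketch of a Q-statistic calculation is given below. It assumes the common definition of Q as the squared length of the part of the mean-centered spectrum that the retained eigenvectors do not explain, with orthonormal eigenvectors. The function name `q_statistic` and the numbers are hypothetical.

```python
def q_statistic(x, mean, loadings):
    """Squared residual of x after removing its projection onto the PCA model."""
    centered = [xi - mi for xi, mi in zip(x, mean)]
    residual = list(centered)
    for p in loadings:
        score = sum(pi * ci for pi, ci in zip(p, centered))
        # Subtract the part of the spectrum explained by this eigenvector.
        residual = [ri - score * pi for ri, pi in zip(residual, p)]
    return sum(r * r for r in residual)


# Toy etch-only model with three channels and one component (made-up numbers).
mean = [0.0, 0.0, 0.0]
loadings = [[1.0, 0.0, 0.0]]

print(q_statistic([5.0, 0.0, 0.0], mean, loadings))  # 0.0, variation inside the model
print(q_statistic([3.0, 4.0, 0.0], mean, loadings))  # 16.0, variation outside the model
```

The second spectrum moves in a direction the etch-only model does not contain, which is exactly the behavior that makes Q sensitive both to endpoint and to any other departure from the model.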

[Figure 13 plot: Q values (scale ×10^5) versus time (s).]

Fig. 13. The Q-statistic values for the 10% contact etch where the PCA model has been developed using only etch data (no endpoint).

[Figure 14 plot: T2 values versus time (s).]

Fig. 14. T2 values are shown for the 1% contact etch using etch-only data.

A PCA model is developed using data taken from approximately 70 to 90 seconds during 1% contact etch for a training wafer. This model is used within the endpoint detection software applied to four test wafers, and the T2 results are shown in Fig. 14. Although observable, the effects of mean and covariance based drift from samples 80 to 120 do not prevent fairly easy endpoint detection. One reason for this is that the update scheme, an EWMA-based recursive mean update, has been tailored to update the mean fast enough to accommodate drift but not so fast as to filter out the effects of endpoint. This simple mean update scheme was applied over a range of weighting coefficients to determine the optimum weighting for this trade-off. When the update coefficient is only slightly higher, the mean update severely reduces the endpoint signature, thus reducing the signal-to-noise ratio. When the mean update coefficient is slightly lower, the natural drift in the process dominates the T2 values and endpoint is difficult to see. As mentioned before, the use of etch-only data with T2 detection (as in Fig. 14) provides a result similar to the Q-statistic in Fig. 13 and has the serious disadvantage that any deviation from nominal etch may not be distinguishable from endpoint. Although the performance of this estimation approach appears to be quite good, the Q-statistic cannot be used simultaneously as a calibration metric for the PCA based endpoint detection scheme, and issues regarding robustness are of concern.
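The weighting trade-off in the recursive mean update can be illustrated with a small sketch. It assumes a standard EWMA recursion and assumes that each measurement is centered on the mean computed before that measurement arrived; neither detail is confirmed by the description above. The names `ewma_update`, `centered_residuals` and `alpha` are hypothetical, and the signal is synthetic.

```python
def ewma_update(mean, x, alpha):
    """Move each channel of the running mean a fraction alpha toward the new spectrum."""
    return [(1.0 - alpha) * m + alpha * xi for m, xi in zip(mean, x)]


def centered_residuals(samples, alpha):
    """Center each sample on the mean from before it arrived, then update the mean."""
    mean = list(samples[0])  # initialize the mean with the first spectrum
    residuals = []
    for x in samples:
        residuals.append([xi - mi for xi, mi in zip(x, mean)])
        mean = ewma_update(mean, x, alpha)
    return residuals


# Synthetic single-channel signal drifting by 1 unit per sample (made-up numbers).
drift_only = [[float(k)] for k in range(200)]
for alpha in (0.1, 0.5):
    lag = centered_residuals(drift_only, alpha)[-1][0]
    print(alpha, round(lag, 6))  # the residual settles near 1 / alpha
```

Under these assumptions a drift of d per sample leaves a steady residual of d / alpha, while a step of height h at endpoint decays as h(1 - alpha)^j over the following j samples. A larger coefficient therefore suppresses drift but also erases the endpoint signature more quickly, which is the trade-off described above.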

6.4 Real-time Demonstration of Robustness of PCA Based Endpoint Approach

To test the robustness of the PCA and T2 based endpoint algorithm, data were taken in January of 1998 to update the PCA models developed from earlier data sets. The resulting algorithm was demonstrated and tested in real-time approximately one month later. During the approximately three weeks between experiments, a few hundred wafers were processed in the etch chamber. Additionally, significant maintenance activity occurred, including the replacement of an RF matching network. These and other variables may cause significant drift in the system. When a sudden shift occurred in the T2 values (believed to correspond with the end of clearing), the process was stopped and the wafers were examined by SEM analysis. The PCA based T2 values plotted in real-time from the beginning to the conclusion of etch are shown in Fig. 15. To obtain a clear understanding of the correlation between the T2 estimates and the status of the wafer, the etch process was halted at different times associated with the beginning, middle and end of the endpoint estimate. Wafer 203 was stopped at the beginning of the rise believed to be the onset of endpoint. Wafer 204 represents the original timed etch used by Digital Semiconductor for the 1% contact etch (believed to be past the optimal endpoint). Processing of wafer 205 was stopped at the top of the rise believed to be associated with the near completion of clearing.

The scanning electron micrographs for wafers 203, 204 and 205 are displayed in Fig. 16. SEM photos of minimum-sized vias in an SRAM structure are shown at the wafer center and edge. Micrographs for wafers 205 and 206 are very similar and as such, only wafer 205 is shown. (Note the magnification is 50K for all micrographs, with the exception of the top right photo, which is 45K.) Wafer 203 shows that the smallest features on this pattern are not yet clear; there is approximately 51 nm of oxide remaining at the center and 25 nm remaining at the edge (out of a total of 842 nm). SEMs of larger features show that they have cleared by this point, as expected due to RIE lag [1,20]. For wafer 203, the measured via diameter is 0.472 microns at the center and 0.468 microns at the edge positions. The measured diameters are approximately 0.498 microns for the via at the edge of wafer 204 and wafer 205. The via depth for wafer 203 at the center is 0.790 microns (out of a total film stack of 0.842 microns). The via depths for wafers 204 and 205 at the center are 0.855 and 0.881 microns, respectively, which represent the total oxide thickness.

The fastest etch rate was observed at the wafer center for the largest vias of approximately 0.525 microns. The via diameters varied from 0.57 (for wafer 203) to 0.62 microns (for wafer 204), indicating slight overetch for the larger vias. The primary difference was in the contact etch depth. For the larger vias, wafer 203 shows an incomplete etch depth of 0.84 microns compared to wafers 204 and 205 at 0.89 and 0.88 microns, respectively.

[Figure 15 plot: T2 value traces labeled wafer 203, wafer 204, and wafer 205.]

Fig. 15. The T2 values for four of the wafers from the real-time endpoint experiments are plotted. Wafer 203 was stopped short of the predicted endpoint. Wafer 204 uses the current timed etch recipe for Alpha wafers. Wafer 205 was stopped at the top of the rise estimated to be near the end of clearing.

[Figure 16 images: micrographs at the Center and Edge positions for Wafer 203, Wafer 204, and Wafer 205.]

Fig. 16. Scanning electron micrographs for wafers 203, 204 and 205 are displayed for the center and edge positions of the smallest critical design units of an SRAM structure.

In general, the SEMs indicate that from the onset of estimated endpoint (wafer 203) to the slight overetch of wafer 204, there is an optimal stopping region after the sudden rise where trade-offs exist between overetching the larger dimensions and underetching the smaller dimensions. Across the entire set of 36 micrographs taken for vias at minimum size, 1.25X minimum, and 1.5X minimum at the center, middle and edge of the wafer, wafers 205 and 206 offer a potential improvement in throughput by minimizing the required overetch time. It would be very difficult to determine ‘the’ optimal endpoint, and a better metric for successful etch completion is likely to be a five to ten second region shortly after the sudden rise. These results indicate that a real-time implementation of this algorithm developed from process data one month earlier could be robust enough to provide reliable endpoint determination in a production process. It should also be noted that without an effective estimator to update the mean in the presence of drift these results are not possible. The use of PCA enhances the signal-to-noise ratio so that low open area etch can be observed; however, it is the use of simple recursive estimators that provides the equally important robustness. In the next section, future plans to address several obstacles to increasing the robustness and eventual automation are discussed.

VII. Conclusions and Future Work

Effective endpointing of low open area etch is extremely difficult using a timed etch or by monitoring a few hand-selected wavelengths. Alternative methods using PCA in conjunction with T2 detection and recursive mean updates have been presented for estimating endpoint and calibrating this estimate over long term use. These methods have been applied to production wafers at Digital Semiconductor and the results confirmed through SEM analysis. The key issues for success using this approach involve careful analysis of which principal components are aligned with drift and removal of those components if possible. Issues regarding mean centering to address run-to-run shifts in spectral mean are also addressed. The current and future focus of this work is to address the critical obstacles in developing a fully automated, robust system for commercial use.

To commercially implement this approach and address multi-layer etches where multiple PCA models are required, there are three sources of variation that need to be further addressed. The first is to determine when etch begins so that any transient signals between layers do not require data processing or trigger any alarms. The second is to identify and adapt to, or filter out, any drift or shifts in the spectral signature. The third, which is the primary focus of this paper, is the characterization of endpoint in the presence of noise and process related disturbances.

There are several approaches that may overcome any or all of these sources of variation. In the first two cases, the algorithm must adjust to the run-to-run variance due to process drift and shifts and in the third case, identify the variance due to endpoint. For all three steps, a priori information is used to determine when etch begins (first source of variation), to recalculate the new baseline spectral mean (second source of variation), and to determine when clearing is complete or endpoint (third source of variation).

The characteristics of the shift signifying the beginning of etch usually follow a step-like pattern. Gradient methods do not work well for these tasks because there are often noise-like fluctuations in the signal or transients before etch begins. To identify the beginning of etch, a step fitting function can be applied to clusters of incoming data to identify the characteristic step due to reactants being consumed and products formed and from this identification, the baseline is adjusted to the new mean [25]. One constraint is that the data window be sized appropriately with the bandwidth of the estimation loop. By splitting the data set into two different data populations and iteratively computing the mean for each population, a step is identified and fit. The split always begins at the most recent or current sample and works backward using prior calculations. The sum squared error for each fit over a group is compared and the process is repeated until the beginning of etch is determined by a large shift in the T2 calculation. Once determined, the resulting mean after the shift is used to mean center the incoming OES measurements. When coded efficiently this algorithm can be used with the PCA based T2 method in real-time.
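The step fitting idea can be sketched as follows. This is an illustrative reading of the description above, not the implementation used in this work: a window of a scalar signal (for example recent T2 values) is split into an earlier and a later population, each is fit by its own mean, and the split with the smallest sum squared error is kept. The split point walks backward from the most recent sample and reuses running sums from the previous split. The function name `fit_step` and the data are hypothetical.

```python
def fit_step(window):
    """Best two-level step fit to a window: (split, mean_before, mean_after, sse)."""
    n = len(window)
    if n < 2:
        raise ValueError("need at least two samples to fit a step")
    total = sum(window)
    total_sq = sum(v * v for v in window)
    after_sum = 0.0
    after_sq = 0.0
    best = None
    # Walk the split backward from the newest sample, updating running sums.
    for split in range(n - 1, 0, -1):
        v = window[split]
        after_sum += v
        after_sq += v * v
        n_after = n - split
        before_sum = total - after_sum
        before_sq = total_sq - after_sq
        # Sum squared error of fitting each population by its own mean.
        sse = (before_sq - before_sum ** 2 / split) + (after_sq - after_sum ** 2 / n_after)
        if best is None or sse < best[3]:
            best = (split, before_sum / split, after_sum / n_after, sse)
    return best


print(fit_step([1.0, 1.0, 1.0, 5.0, 5.0]))  # (3, 1.0, 5.0, 0.0)
```

In use, the difference between the two fitted means would be compared against a threshold chosen for the process (an assumption here), and the later mean would then serve as the new baseline for mean centering.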

A recursive least squares estimator or exponentially weighted moving average (EWMA) estimator [9] can also be used to update the spectral mean, the second form of variation. A drawback to these methods is selection of the proper window size or filtering coefficients. Selecting a large window size creates a delay in baselining the spectral measurements once etch begins. Selecting a small window size results in the mean being altered by noise or outliers. Each of these methods has been applied to the Digital Semiconductor data set with comparable results.

The most challenging problem is distinguishing the third source of variation, endpoint, from noise and drift. The PCA based T2 algorithm presented in this paper has demonstrated the ability to estimate endpoint. Similar results have been examined that use multiple PCA models, one for etch and one for endpoint, where the T2 value deviates away from the etch model and toward the endpoint model during clearing. A distance metric or explicit intersection using these two values signifies when clearing is nearly complete. Other work has examined the use of evolving factor analysis [16,18,26,28] as well. Preliminary analysis indicates that similar problems regarding signal-to-noise for low open area etch data need to be addressed; however, this approach shows much promise as well. Analysis and comparison of the PCA based T2 approach presented in this paper with multi-way PCA methods is an appropriate area for future research.
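One simple reading of the "explicit intersection" in the two-model approach is sketched below, purely as an assumption: T2 is computed for each sample under both the etch model and the endpoint model, and the first sample at which the etch-model value reaches the endpoint-model value is flagged. The function name `first_crossing` and the traces are hypothetical.

```python
def first_crossing(t2_etch, t2_endpoint):
    """Index of the first sample where the etch-model T2 reaches the endpoint-model T2."""
    for k, (etch_value, endpoint_value) in enumerate(zip(t2_etch, t2_endpoint)):
        if etch_value >= endpoint_value:
            return k
    return None  # the two traces never intersect


# Made-up traces: the spectrum moves away from the etch model and toward the endpoint model.
t2_etch = [0.5, 0.6, 1.0, 3.0, 6.0]
t2_endpoint = [9.0, 8.0, 5.0, 2.5, 0.4]
print(first_crossing(t2_etch, t2_endpoint))  # 3
```

A practical version would need smoothing or a persistence requirement so that noise near the intersection does not trigger an early detection.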

Acknowledgements

This work has been sponsored in part by the MIT Leaders for Manufacturing program and DARPA Computation Prototyping Program via a subcontract with Stanford University.

References

[1] A. Baily, M. VandeSanden, J. Gregus and R. Gottscho, “Scaling of Si and GaAs Trench Etch Rates with Aspect Ratio, Feature Width and Substrate Temperature,” Journal of Vacuum Science & Technology B, vol. 13, no. 1, pp. 92-104, 1995.

[2] M. Baker, C. Himmel, and G. May, “In-situ prediction of reactive ion etch endpoint using neural networks,” Components, Packaging, and Manufacturing Technology, Part A, IEEE Transactions on, Vol. 18, 3, Pages: 478 – 483, Sept. 1995.

[3] G. Barna, D. Buck, S. Butler and D. A. White, “Sensor Information from a Lam 9600 Metal Etch Process,” Proc. of Process Control, Diagnostics, and Modeling in Semiconductor Manufacturing, 187th Electrochemical Society Meeting, 1995.


[4] D. Beale, A. Wendt, and L. Mahoney, “Spatially resolved optical emission for characterization of a planar radio frequency inductively coupled discharge,” Journal of Vacuum Science & Technology A: Vacuum, Surfaces, and Films, vol. 12, issue 5, pp. 2775-2779, September, 1994 .

[5] P. Biolsi, D. Morvay, L. Drachnik, and S. Ellinger, “An Advanced Endpoint Detection Solution for < 1% Open Areas,” Solid State Technology, vol. 39, no. 12, December 1996.

[6] M. Buie, J. Pender, H. Spindler, J. Soniker, M. Brake, and M. Elta, “Correlating Plasma Emissivity And Etch Depth,” IEEE International Conference on Plasma Science, Conference Record, 1994.

[7] M. Buie, J. Pender, J. Holloway, and M. Brake, “In situ sensor of spatially resolved optical emission,” IEEE Transactions on Plasma Science, vol. 24, (1), pp. 111-112, Feb. 1996.

[8] D. Boning, J. Claman, K. Wong, T. Dalton, and H. Sawin, “Plasma Etch Endpoint Via Interferometric Imaging,” American Control Conference, Pages: 897 – 901, June 29-July 1, 1994.

[9] G. Box and G. Jenkins, Time Series Analysis; Forecasting and Control, Holden-Day, San Francisco, 1970.

[10] R. Chen, S. Rangan, and C. Spanos, “Spatially resolved endpoint detector for plasma etcher,” IEEE International Symposium on Semiconductor Manufacturing, Conference Proceedings, pp. B45 - B48, Oct. 1997.

[11] S. Dolins, A. Srivastava, and B. Flinchbaugh, “Monitoring and diagnosis of plasma etch processes,” IEEE Transactions on Semiconductor Manufacturing, Vol. 1, 1, pp. 23-27, Feb. 1988.

[12] N. Gallagher, B. Wise, S. Butler, D. White and G. Barna, “Development and Benchmarking of Multivariate Statistical Process Control Tools for a Semiconductor Etch Process: Improving Robustness Through Model Updating,” IFAC ADCHEM'97, pp. 78-83, June 1997.

[13] J. Jackson, "Principal Components and Factor Analysis: I. Principal Components," J. Qual. Tech, vol. 12, p. 201, 1981.

[14] J. Jackson, A User’s Guide to Principal Components, Wiley-Interscience, John Wiley & Sons, Inc., New York, 1991.

[15] I. Jolliffe, Principal Component Analysis, Springer Series in Statistics, Springer-Verlag, New York, 1986.

[16] H. Keller and D. Massart, “Evolving Factor Analysis,” Chemom. Intell. Lab. Syst., vol. 12, pp. 209-224, 1992.

[17] M. Le, “Variation Reduction in Plasma Etching via Run-to-Run Process Control and Endpoint Detection,” Master’s Thesis, MIT, 1997.

[18] E. Malinowski, “Automatic Window Factor Analysis – A More Efficient Method for Determining Concentration Profiles from Evolutionary Spectra,” Journal of Chemometrics, vol. 10, no. 4, pp. 273-279, July-August 1996.

[19] H. Maynard, N. Layadi, and J. Lee, “Plasma Etching of Submicron Devices: In-situ Monitoring and Control by Multi-wavelength Ellipsometry,” Thin Solid Films, vol. 313, pp. 398-405, February 1998.

[20] S. Naqvi, R. Krukar, J. McNeil, J. Franke, T. Niemczyk, D. Haaland, R. Gottscho and A. Kornblit, “Etch Depth Estimation of Large-Period Silicon Gratings with Multivariate Calibration of Rigorously Simulated Diffraction Profiles,” Journal of the Optical Society of America A - Optics Image Science and Vision, vol. 11, no. 9, pp. 2485-2493, September 1994.

[21] J. Pender, M. Buie, T. Vincent, J. Holloway, M. Elta, and M. Brake, “Radial optical emission profiles of radio frequency glow discharges,” Journal of Applied Physics, vol. 74, no. 5, pp. 3590-3592, September 1993.

[22] S. Qin and H. Yue, “Fault Detection and Classification for Plasma Etchers via Optical Spectroscopy Analysis,” AIChE Annual Meeting, Paper 202g, November 16-21, 1997.

[23] E. Rietman, J. Lee and N. Layadi, “Dynamic Images of Plasma Processes: Use of Fourier Blobs for Endpoint Detection During Plasma Etching of Patterned Wafers,” Journal of Vacuum Science & Technology, vol. 16, no. 3, pp. 1449-1453, Part 2, May – June 1998.

[24] J. Roland, P. Marcoux, G. Ray, and G. Rankin, “Endpoint detection in plasma etching”, Journal of Vacuum Science & Technology A: Vacuum, Surfaces, and Films, Volume 3, Issue 3, pp. 631-636, May 1985.

[25] E. Sachs, A. Hu and A. Ingolfsson, "Run by Run Process Control: Combining SPC and Feedback Control," IEEE Transactions on Semiconductor Manufacturing, vol. 8, no. 1, pp. 26-43, Feb 1995.


[26] F. Sanchez, B. Vandeginste, T. Hancewicz and D. Massart, “Resolution of Complex Liquid Chromatography Fourier Transform Infrared Spectroscopy Data,” Analytical Chemistry, vol. 69, no. 8, pp. 1477-1484, April 1997.

[27] M. Sharaf, D. Illman, and B. Kowalski, Chemometrics, Wiley, 1986.

[28] R. Tauler, A. Smilde, J. Henshaw, L. Burgess and B. Kowalski, “Multicomponent Determination of Chlorinated Hydrocarbons Using a Reaction-Based Chemical Sensor,” Analytical Chemistry, vol. 66, no. 20, pp. 3337-3344, October 1994.

[29] E. Thomas and D. Haaland, “Comparison of Multivariate Calibration Methods for Quantitative Spectral Analysis,” Analytical Chemistry, vol. 62, no. 10, pp. 1091-1099, May 1990.

[30] S. Thomas, H. Chen, C. Hanish, J. Grizzle, and S. Pang, “Minimized response time of optical emission and mass spectrometric signals for optimized endpoint detection,” Journal of Vacuum Science & Technology B: Microelectronics and Nanometer Structures, Vol. 14, 4, pp. 2531-2536, July 1996.

[31] P. Ward, “Plasma process control with optical emission spectroscopy,” Electronics Manufacturing Technology Symposium, Seventeenth IEEE/CPMT International, pp. 166-169, Oct. 1995.

[32] D. A. White, D. Boning, S. Butler and G. Barna, “Spatial Characterization of Wafer State using Principal Component Analysis of Optical Emission Spectra,” IEEE Trans. on Semiconductor Processing, February, 1997.

[33] S. Wold, K. Esbensen, P. Geladi, J. Ohman, “Multi-Way Principal Components and PLS Analysis,” Journal of Chemometrics, vol. 1, pp. 41-56, 1987.

[34] B. Wise, N. Ricker, D. Veltkamp, and B. Kowalski, “A Theoretical Basis for the Use of Principal Component Models for Monitoring Multivariate Processes,” Process Control and Quality, vol. 1, pp. 41-51, 1990.

[35] K. Wong, D. Boning, H. Sawin, S. Butler, and E. Sachs, “Endpoint prediction for polysilicon plasma etch via optical emission interferometry,” Journal of Vacuum Science & Technology A: Vacuum, Surfaces, and Films, Volume 15, Issue 3, pp. 1403-1408, May 1997.
