Model-Based Dummy Feature Placement for Oxide Chemical-Mechanical Polishing Manufacturability

Ruiqi Tian (1, 2), D. F. Wong (1), Robert Boone (2)

(1) Department of Computer Sciences, University of Texas at Austin, Austin, TX 78712
(2) Motorola Inc., 3501 Ed Bluestein Blvd., Austin, TX 78721

(ruiqi, wong)@cs.utexas.edu, robert_boone@motorola.com

Permission to make digital/hardcopy of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage, the copyright notice, the title of the publication and its date appear, and notice is given that copying is by permission of ACM, Inc. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. DAC 2000, Los Angeles, California. (c) 2000 ACM 1-58113-188-7/00/0006...$5.00

ABSTRACT

Chemical-mechanical polishing (CMP) is an enabling technique used in deep-submicron VLSI manufacturing to achieve uniformity in long range oxide planarization [1]. Post-CMP oxide topography is highly related to local spatial pattern density in layout. To change local pattern density, and thus ensure post-CMP planarization, dummy features are placed in layout. Based on models that accurately describe the relation between local pattern density and post-CMP planarization [7, 5, 9], a two-step procedure of global density assignment followed by local insertion is proposed to solve the dummy feature placement problem in the fixed-dissection regime with both single-layer and multiple-layer considerations. Two experiments, conducted with real design data, gave excellent results by reducing post-CMP topography variation from 767 Å to 152 Å in the single-layer formulation and by avoiding cumulative effect in the multiple-layer formulation. The result from the single-layer formulation compares very favorably both to the rule-based approach widely used in industry and to the algorithm in [3]. No previously published work addresses the multiple-layer formulation.

1. INTRODUCTION

Continued aggressive scaling down of VLSI feature size has constrained much of the manufacturing process window, so CMP for inter-level dielectric (ILD) planarization has become increasingly important for manufacturability [1]. Experimental data has confirmed that post-CMP ILD thickness is highly correlated with the pattern density distribution of features on a layer. Hence, dummy features (features that are electrically inactive and are not for the purpose of optical assistance) are inserted into the layout to change the pattern density distribution and ensure CMP manufacturability. This procedure is sometimes called "tiling", because the dummy features inserted are typically small polygons of similar shape to minimize coupling. These small polygons are thus called "tiles", and they are usually rectangles for simplicity. In the rest of this paper, the terms "tiling" and "dummy feature placement" (likewise "tile" and "dummy feature") are used interchangeably. Figure 1 illustrates an exaggerated wafer cross section of two layers, in which the ILD thickness varies and tiling helps.

Figure 1: Wafer cross section (a) without tiling and (b) tiled.

Methods for tiling can be classified into two categories: rule-based and model-based. Rule-based tiling is usually done with boolean operations that find open space on a layer and fill it with tiles of a single prescribed density. Model-based methods, based on analytical expressions of the relation between local density and ILD thickness, allow both local tile density and insertion location to vary. Figure 2 illustrates the difference between rule-based and model-based tiling. As the figure suggests, model-based methods are the more accurate and efficient of the two.

Figure 2: Rule-based and model-based tiling. (a) is a layout example with features lightly shaded and exclusion zones in dashed lines; (b) shows a 25% dense tiling template in the rule-based approach; (c) is the result of the rule-based tiling after boolean operations; and (d) is a possible model-based tiling result for the same layout.

In general, given a model for a CMP process, the tiling problem is to determine the amount and location of dummy features to be placed into the layout, so that certain constraints such as electrical and physical design rules are observed, and certain objectives such as minimum or ranged variation are satisfied by post-CMP topography. Based on recent models [7, 5, 9] that enable faster and more accurate prediction of ILD thickness by computing an effective initial local feature density, this paper proposes a two-step solution to the dummy feature placement problem, giving consideration to both single-layer and multiple-layer layout in the fixed-dissection regime, which divides a layout into a grid of small equal rectangles. The first step uses linear programming to compute the amount of dummy feature required in each small rectangle. The second step then places the calculated amount into each rectangle while optionally optimizing certain local properties.

2. MODELS FOR OXIDE CMP

Several models have been proposed for oxide CMP [4]. Among them, the model by B. Stine et al. is neither computationally expensive nor difficult to calibrate [7]. In that model, the ILD thickness $z$ at location $(x, y)$ is

$$
z =
\begin{cases}
z_0 - \dfrac{K_i t}{\rho_0(x, y)}, & t < \rho_0 z_1 / K_i \\[2ex]
z_0 - z_1 - K_i t + \rho_0(x, y)\, z_1, & t > \rho_0 z_1 / K_i
\end{cases}
\tag{1}
$$

where $K_i$ is the blanket polishing rate, $z_0$ the height of oxide deposition, $z_1$ the height of existing feature, $t$ the polish time, and $\rho_0(x, y)$ the initial pattern density. Figure 3 shows a schematic for some of the variables. Normally $t$ is larger than $\rho_0 z_1 / K_i$, so the final oxide thickness, by the second case of Eq. (1), is between 0 and $(z_0 - z_1)$. In addition, $K_i$, $z_0$, $z_1$, and $t$ are all constants for a specific CMP process. As a result, the final topography is determined only by the initial pattern density $\rho_0(x, y)$.

Figure 3: Some variables in CMP (from [5]). Schematic labels: oxide, pattern, bias $B$, heights $z_0$ and $z_1$, reference level $z = 0$.

The simplest model uses the local spatial pattern density for $\rho_0(x, y)$. An algorithm by Kahng et al. solves the tiling problem based on this model [3]. However, more accurate modeling by D. Ouma et al. considers the deformation of the polishing pad during polish [5]. The effective local density $\rho_0(x, y)$ is then no longer directly proportional to the local spatial pattern density, but is calculated as the sum of weighted spatial pattern density within a weighting region. The weighting function $f(x, y)$, derived from elastic material under a uniform load normal to the material surface, is an elliptical function. Further study by E. Travis [8] assumed an approximation to the elliptical $f$ of the form $f(x, y) \approx c_0 \exp[c_1 (x^2 + y^2)^{c_2}]$, where the constants $c_0$, $c_1$, and $c_2$ are calibrated for each specific process. In the fixed-dissection regime, in which the layout area is divided into a grid of small rectangles, the discretized effective local pattern density $\rho_0(i, j)$ is then

$$
\rho_0(i, j) = \mathrm{IFFT}\big[\mathrm{FFT}[d(i, j)] \cdot \mathrm{FFT}[f(i, j)]\big],
\tag{2}
$$

where $d(i, j)$ is the spatial pattern density for oxide at location $(i, j)$, and $f(i, j)$ is the weighting function discretized accordingly with respect to the grid. Therefore, the CMP process is modeled by Eq. (2) as a low-pass filter through which the local pattern density $d$ contributes not only to the ILD thickness at its own location but also, wherever $f$ is not too small, to the ILD thickness within a short range.

Furthermore, variation in ILD thickness is cumulative from layer to layer, as shown in Fig. 1, so each layer cannot assume a perfectly flat starting surface. The cumulative effect is modeled by T. Yu et al. [9] as topography variation of lower layers attenuating through subsequent CMP processes, each a low-pass filter based on Eqs. (1) and (2). Mathematically expressed,

$$
\widehat{\rho_0^{(k)}} =
\begin{cases}
\left[\hat{d}_k + \dfrac{z_{k-1}}{z_k}\, \widehat{\rho_0^{(k-1)}}\right] \cdot \hat{f}, & \text{if } k > 1 \\[2ex]
\hat{d}_1 \cdot \hat{f}, & \text{if } k = 1
\end{cases}
\tag{3}
$$

where "$\hat{\ }$" is the FFT operator, $\rho_0^{(k)}$ the effective local density, $z_k$ the step height, and $d_k$ the local density, all for layer $k$, and $f$ is the weighting function.

3. DUMMY FEATURE PLACEMENT

3.1 Assumption and notation

Equations (1) to (3) describe post-CMP topography as a function of process condition ($f$, $t$, $K_i$, $z_0$, $z_1$, $B$) and spatial pattern density ($d$). Process condition is assumed to be fixed for a particular CMP process. Final topographies are hence differentiated only by initial density distributions. Also, all polygons representing oxide pattern in the layout are assumed to be after bias $B$ (enlargement in all directions, see Fig. 3) has been applied.

In the fixed-dissection regime, suppose each of the $K$ layers of layout area is divided into $(M \times N)$ rectangles of equal dimensions and of area $A^0$, each called a cell, with the ordered tuple $(i, j, k)$, where $i \in [1, M]$, $j \in [1, N]$, and $k \in [1, K]$, being the coordinate of a cell. For the cell at $(i, j, k)$, let $P_0(i, j, k)$ denote all polygons inside the cell that are originally in the layout, and $P(i, j, k)$ denote all dummy features (polygons) inserted within. In addition, let the function $A(p)$ return the area of an arbitrary polygon $p$. Hence,

$$
x_{ijk} = \sum_{p \in P(i, j, k)} [A(p) / A^0]
$$

is the density of the dummy features inserted into the cell at $(i, j, k)$. Similarly, the density of features originally in the layout is $x^0_{ijk} = \sum_{p \in P_0(i, j, k)} [A(p) / A^0]$. Clearly, $d(i, j, k) = x_{ijk} + x^0_{ijk}$ in Eq. (2) for a given $k$.

To formulate constraints from electrical and physical design rules, cost zones (polygons $q$) are used to define regions of different costs ($C(q) \in [0, \infty)$), and $Q(i, j, k)$ denotes all cost zones in the cell at $(i, j, k)$. Cost is uniform within a cost zone, and a cost zone is called an exclusion zone if it has infinite cost. Exclusion zones are sufficient to implement physical design rules such as the non-overlapping rule and the minimum spacing (buffer) rule. Consequently, the total available density in each cell for inserting dummy features is

$$
x^a_{ijk} \equiv 1 - \sum_{q \in Q(i, j, k),\; C(q) = \infty} [A(q) / A^0].
$$

Furthermore, the average cost of inserting dummy features into the cell at $(i, j, k)$ is

$$
\sigma_{ijk} = \sum_{q \in Q(i, j, k),\; C(q) \neq \infty} \big[C(q) \cdot [A(q) / A^0]\big].
$$

Finally, if $K = 1$, all subscripts and variables $k$ are dropped from the notation to reflect a 2-dimensional layout.
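As a small illustration of Eq. (1) from Section 2, the two polishing regimes can be written as a short Python function. This is only a sketch: the names `ild_thickness` and `thickness_spread` and all numeric values are assumptions chosen for illustration, not calibrated process data.

```python
def ild_thickness(rho0, t, k_i, z0, z1):
    """ILD thickness from Eq. (1) for an effective density rho0 in (0, 1]."""
    if not 0.0 < rho0 <= 1.0:
        raise ValueError("rho0 must be in (0, 1]")
    t_cross = rho0 * z1 / k_i              # crossover time between the two cases
    if t < t_cross:
        return z0 - k_i * t / rho0         # first case of Eq. (1)
    return z0 - z1 - k_i * t + rho0 * z1   # second case of Eq. (1)


def thickness_spread(densities, t, k_i, z0, z1):
    """Difference between the thickest and thinnest location after polishing."""
    values = [ild_thickness(r, t, k_i, z0, z1) for r in densities]
    return max(values) - min(values)


# Hypothetical values (angstroms, angstroms per second, seconds).
Z0, Z1, K_I, T = 15000.0, 7000.0, 50.0, 150.0
densities = [0.2, 0.5, 0.9]

# T exceeds rho0 * Z1 / K_I for every density, so only the second case applies
# and the spread reduces to Z1 * (max(rho0) - min(rho0)).
spread = thickness_spread(densities, T, K_I, Z0, Z1)
```

Because every location is in the second case, the spread depends only on the step height and the density range, which is why the rest of the paper works with $\rho_0$ alone.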

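The effective density of Eqs. (2) and (3) can be sketched in plain Python as well. By the convolution theorem, the FFT product in Eq. (2) is a circular convolution, so the sketch below evaluates that sum directly instead of calling an FFT library. The function names, the default constants `c1` and `c2`, and the choice to scale the kernel so that it sums to 1 (which fixes $c_0$) are all assumptions made for illustration.

```python
import math


def weighting_kernel(m, n, c1=-0.5, c2=1.0):
    """Discretized f(i, j) ~ exp[c1 * (x^2 + y^2)^c2] on an m-by-n periodic grid.

    Offsets wrap around the grid, and the kernel is scaled to sum to 1.
    """
    f = [[0.0] * n for _ in range(m)]
    for i in range(m):
        for j in range(n):
            dx = min(i, m - i)   # wrapped distance along the first axis
            dy = min(j, n - j)   # wrapped distance along the second axis
            f[i][j] = math.exp(c1 * (dx * dx + dy * dy) ** c2)
    total = sum(map(sum, f))
    return [[v / total for v in row] for row in f]


def circular_convolve(d, f):
    """rho0(i, j) = sum over (a, b) of d(a, b) * f(a - i, b - j), indices wrapped."""
    m, n = len(d), len(d[0])
    rho = [[0.0] * n for _ in range(m)]
    for i in range(m):
        for j in range(n):
            rho[i][j] = sum(d[a][b] * f[(a - i) % m][(b - j) % n]
                            for a in range(m) for b in range(n))
    return rho


def cumulative_density(layers, steps, f):
    """Eq. (3): rho0(k) = [d_k + (z_{k-1} / z_k) * rho0(k-1)] convolved with f.

    layers[k] is the density map d of layer k and steps[k] its step height z.
    """
    rho_prev, result = None, []
    for k, d in enumerate(layers):
        if rho_prev is None:
            source = d
        else:
            ratio = steps[k - 1] / steps[k]
            source = [[d[i][j] + ratio * rho_prev[i][j]
                       for j in range(len(d[0]))] for i in range(len(d))]
        rho_prev = circular_convolve(source, f)
        result.append(rho_prev)
    return result


# A single dense cell is smeared over its neighbors by the low-pass filter.
kernel = weighting_kernel(4, 4)
spike = [[0.0] * 4 for _ in range(4)]
spike[1][2] = 1.0
rho_spike = circular_convolve(spike, kernel)
```

For realistic grid sizes the direct sum is far too slow and an FFT would be used, as Eq. (2) indicates; the direct form is shown only because it makes the wrapped indexing explicit.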
3.2 Global density assignment

3.2.1 Single-layer consideration

From digital signal processing theory, Eq. (2) is the circular convolution of $d$ and $f$ in the physical domain provided that $d$ is periodically repeated and $f$ is a linear shift-invariant filter, i.e., contributions from different locations are linearly additive [2]. For the CMP process, the area defined by the layout, called the reticle field, is repeated in both $x$ and $y$ directions on the wafer. Meanwhile, the CMP process has neither a preferred location nor a preferred direction, so filter $f$ satisfies the linear shift-invariant criterion. Assuming long range contribution outside the weighting region $[(-L, L) \times (-L, L)]$ is negligible, Eq. (2) can be rewritten as a circular convolution:

$$
\rho_0(i, j) = \sum_{i' = i - L}^{i + L} \; \sum_{j' = j - L}^{j + L} \big[(x_{i'j'} + x^0_{i'j'}) \cdot f(i' - i,\, j' - j)\big].
\tag{4}
$$

An implementation of Eq. (4) is shown in Fig. 4. By subjecting $\rho_0(i, j)$ to linear constraints, Eq. (4) becomes a system of linear equations with $(M \times N)$ variables $x_{ij}$. The tiling problem at the global stage, where all cells are considered, becomes a linear programming (LP) problem, as $x_{ij}$ in the LP solution is the amount of dummy feature needed for cell $(i, j)$.

Figure 4: Convolution implementation of filter $f$. The reticle field is the center region with $5 \times 4$ cells, and is repeated as dashed rectangles. $\rho_0(i, j)$ of the shaded cell sums over the square region marked with $2L \times 2L$ by revolving indices of cells in the center region.

The LP formulation for minimum topography variation (Min-Variation) is

$$
\begin{aligned}
\text{Minimize:} \quad & H - L \\
\text{Subject to:} \quad & 0 \le L \le \rho_0(i, j) \le H \le 1 \\
& 0 \le x_{ij} \le x^a_{ij},
\end{aligned}
\tag{5}
$$

where $L$ and $H$ are auxiliary variables, and constraint (5) ensures that exclusion zones are observed as well as that no existing feature is deleted. The Min-Variation formulation usually serves as a feasibility test for manufacturability. A more useful formulation is for a ranged variation (Ranged-Variation), in which a process budget $\varepsilon$ bounds the final topography, and a linear function utilizing the cost zones, or even the null function, is used for the objective function:

$$
\text{Minimize:} \quad \sum_{i, j} [\sigma_{ij} \cdot x_{ij}]
\tag{6}
$$

$$
\begin{aligned}
\text{Subject to:} \quad & 0 \le L \le \rho_0(i, j) \le H \le 1 \\
& H - L \le \varepsilon \\
& 0 \le x_{ij} \le x^a_{ij}.
\end{aligned}
\tag{7}
$$

In this formulation, constraints (7) ensure manufacturability, and the cost zones for objective (6) can be used to model different considerations. For example, if all non-exclusion zones have constant cost, i.e., $\sigma_{ij}$ is a constant, then the total amount of tile is minimized by (6) to maximize polish rate and thus throughput.

3.2.2 Multiple-layer consideration

By mathematical induction on layer number $k$ and linearity of Fourier transforms, Eq. (3) can be written as

$$
\widehat{\rho_0^{(k)}} = \sum_{l=1}^{k} \big[(z_l / z_k)\, \hat{f}^{(k - l + 1)} \cdot \hat{d}_l\big].
\tag{8}
$$

For the effective density at a location $(i, j)$ on layer $k$, each term in the summation of Eq. (8) results in a multiple circular convolution in the physical domain:

$$
\begin{aligned}
\big[\mathrm{IFFT}(\hat{f}^{(\alpha)} \cdot \hat{d}_l)\big](i, j)
&= \big[(\overbrace{f \otimes \cdots \otimes f}^{\alpha}) \otimes d_l\big](i, j) \\
&= \sum_{i_1} \sum_{j_1} \Big[ f(i_1 - i,\, j_1 - j) \cdots \sum_{i_\alpha} \sum_{j_\alpha} \big[ f(i_\alpha - i_{\alpha - 1},\, j_\alpha - j_{\alpha - 1}) \cdot (x_{i_\alpha j_\alpha l} + x^0_{i_\alpha j_\alpha l}) \big] \Big],
\end{aligned}
\tag{9}
$$

where $\otimes$ denotes circular convolution. Because the multiple convolution written as a series of summations in Eq. (9) is linear in terms of $x$, the Min-Variation formulation for a single layer is easily extended to multiple layers as Multiple-Min-Variation:

$$
\begin{aligned}
\text{Minimize:} \quad & \sum_{k} [H_k - L_k] \\
\text{Subject to:} \quad & 0 \le L_k \le \rho_0(i, j, k) \le H_k \le \sum_{l=1}^{k} [z_l / z_k] \\
& 0 \le x_{ijk} \le x^a_{ijk},
\end{aligned}
\tag{10}
$$

where $\sum_{l=1}^{k} [z_l / z_k]$ is the upper bound on the cumulative density for layer $k$. Similarly, the Ranged-Variation formulation translates to Multiple-Ranged-Variation:

$$
\begin{aligned}
\text{Minimize:} \quad & \sum_{i, j, k} [\sigma_{ijk} \cdot x_{ijk}] \\
\text{Subject to:} \quad & 0 \le L_k \le \rho_0(i, j, k) \le H_k \le \sum_{l=1}^{k} [z_l / z_k] \\
& H_k - L_k \le \varepsilon_k \\
& 0 \le x_{ijk} \le x^a_{ijk},
\end{aligned}
$$

where $\varepsilon_k$ and $\sigma_{ijk}$ can differ among layers to achieve a solution that emphasizes particular layers. Also, if process uniformity from layer to layer is desired to be within a positive amount $\beta$, circular constraints $|\Delta_k - \Delta_{k-1}| \le \beta$ for $k \in [2, K]$ and $|\Delta_1 - \Delta_K| \le \beta$, where $\Delta_k = H_k - L_k$, can be added.

3.3 Local dummy feature insertion

After $x_{ijk}$ is determined, the total area of dummy features needed for a single cell is $A_{ijk} = x_{ijk} \cdot A^0$, which is feasible by the LP constraints in (10). At the local level in the layout, tiles are then distributed within the available areas of a cell ($C(q) \neq \infty$) such that the total area inserted is $A_{ijk}$. Available areas, constrained by electrical and physical design rules, may be highly irregular in shape. To avoid design rule violations during insertion, the available area $x^a_{ijk}$ should be determined before the LP by boolean operations. Moreover, the actual location of tiles inside a cell does not change the original LP computation if the cell is small and insertion is random, because the CMP interaction length (millimeters) is orders of magnitude larger than anything at the local level.

A simple method for local insertion is to grid the cell and place identical tiles at grid intersections within the available area. Tiles are then selected at random until $A_{ijk}$ is reached. Although local insertion has little global impact, more useful approaches consider its local impact: increased interconnect capacitance due to coupling. Assuming tiles are small and tile-to-tile coupling is negligible, the coupling in tiling is inversely proportional to the (buffer) distance between tiles and original features [6]. Many algorithms and heuristics, such as greedy, iterative improvement, or simulated annealing, can be employed to minimize coupling by maximizing the local buffer distance under the constraint of available area.
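The simple grid-and-random-selection method described above might look like the following Python sketch. The square cell, the tile size, the grid pitch, the integer length units, and the representation of exclusion zones as axis-aligned rectangles are all hypothetical simplifications; in practice the available area comes from boolean operations on the layout.

```python
import random


def overlaps(x, y, size, rect):
    """True if the square tile with lower-left corner (x, y) overlaps rect.

    rect is (x_min, y_min, x_max, y_max); touching edges do not count.
    """
    x_min, y_min, x_max, y_max = rect
    return x < x_max and x + size > x_min and y < y_max and y + size > y_min


def insert_tiles(cell_size, tile, pitch, target_area, exclusion_zones, seed=0):
    """Select grid sites at random until the inserted area reaches target_area."""
    rng = random.Random(seed)
    steps = (cell_size - tile) // pitch + 1
    # Candidate sites: grid intersections whose tile avoids every exclusion zone.
    sites = [(c * pitch, r * pitch)
             for r in range(steps) for c in range(steps)
             if not any(overlaps(c * pitch, r * pitch, tile, zone)
                        for zone in exclusion_zones)]
    rng.shuffle(sites)
    chosen, area = [], 0
    for site in sites:
        if area >= target_area:
            break
        chosen.append(site)
        area += tile * tile
    return chosen, area


# Hypothetical cell: 100 x 100 units, 4 x 4 tiles on a pitch of 10,
# with one exclusion zone covering the lower-left quadrant.
CELL, TILE, PITCH = 100, 4, 10
x_ijk = 0.05                         # density assigned by the global step
target = x_ijk * CELL * CELL         # A_ijk = x_ijk * A0
tiles, inserted = insert_tiles(CELL, TILE, PITCH, target, [(0, 0, 50, 50)])
```

If the candidate sites run out before the target is met, this sketch simply returns the smaller area, so a caller would need to compare `inserted` against `target`. A coupling-aware variant could sort the sites by distance to the nearest original feature instead of shuffling them.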

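One possible sanity check after insertion, which is an addition of this sketch rather than a step of the procedure above, is to feed the achieved densities back through Eq. (4) and the Ranged-Variation constraints (7). The function names, the 3-by-3 kernel, and the 4-by-4 density maps below are hypothetical values picked to keep the arithmetic easy to follow.

```python
def effective_density(x, x0, f, half_width):
    """Eq. (4): windowed circular convolution of (x + x0) with f.

    f is a (2L+1)-by-(2L+1) table indexed by offset, with L = half_width.
    """
    m, n = len(x), len(x[0])
    rho = [[0.0] * n for _ in range(m)]
    for i in range(m):
        for j in range(n):
            total = 0.0
            for di in range(-half_width, half_width + 1):
                for dj in range(-half_width, half_width + 1):
                    a, b = (i + di) % m, (j + dj) % n   # reticle field repeats
                    total += (x[a][b] + x0[a][b]) * f[di + half_width][dj + half_width]
            rho[i][j] = total
    return rho


def check_ranged_variation(x, x0, xa, sigma, f, half_width, eps, tol=1e-9):
    """Return (feasible, H - L, cost) for constraints (7) and objective (6)."""
    m, n = len(x), len(x[0])
    rho = effective_density(x, x0, f, half_width)
    flat = [v for row in rho for v in row]
    low, high = min(flat), max(flat)          # tightest choice of L and H
    cells = [(i, j) for i in range(m) for j in range(n)]
    bounds_ok = all(-tol <= x[i][j] <= xa[i][j] + tol for i, j in cells)
    range_ok = low >= -tol and high <= 1.0 + tol and high - low <= eps + tol
    cost = sum(sigma[i][j] * x[i][j] for i, j in cells)
    return bounds_ok and range_ok, high - low, cost


# Hypothetical data: dense left half, empty right half, uniform cost.
KERNEL = [[0.1, 0.1, 0.1], [0.1, 0.2, 0.1], [0.1, 0.1, 0.1]]
X0 = [[0.6, 0.6, 0.0, 0.0] for _ in range(4)]
XA = [[0.4, 0.4, 0.8, 0.8] for _ in range(4)]
SIGMA = [[1.0] * 4 for _ in range(4)]
NO_FILL = [[0.0] * 4 for _ in range(4)]
FILLED = [[0.0, 0.0, 0.5, 0.5] for _ in range(4)]

before = check_ranged_variation(NO_FILL, X0, XA, SIGMA, KERNEL, 1, eps=0.05)
after = check_ranged_variation(FILLED, X0, XA, SIGMA, KERNEL, 1, eps=0.05)
```

With these numbers the density range drops from about 0.24 without fill to about 0.04 with fill, so only the filled layout meets the budget of 0.05. An LP solver would search for the cheapest such `x` rather than test a single candidate.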
4. EXPERIMENTAL RESULTS

4.1 Single layer

The mask in the first experiment, which includes six identical dice in the reticle field with approximately 2 million polygons in each die, is for shallow trench isolation of a digital signal processor (DSP) from Motorola. The layout is tiled with both rule-based and model-based tiling algorithms for comparison. The two-step process in model-based tiling was performed with the Ranged-Variation LP formulation, using commercial design rule checker (DRC) tools for spatial density acquisition, a commercial LP package for LP computation, and an internally developed geometry engine and software for polygon manipulation in local insertion. Figure 5 shows the pattern density maps for model-based tiling as well as the original untiled layout as measured from final masks.

Figure 5: Density $d$ of final masks for a DSP chip from Motorola. The original layout is in (a) and the layout with model-based tiling is in (b). In (a), the dark area represents the memory, gray the random logic, and white the open spaces. In (b), the model-based method puts tiles at different locations with very different densities. The units for $x$ and $y$ are 0.25 mm.

Figure 6 shows distributions of $\rho_0$ for the original input, the output from rule-based tiling, the output from model-based tiling, and one that is done with the Min-Variation formulation in [3]. The value $\mathrm{var}(z) = z_1 \cdot [\max(\rho_0) - \min(\rho_0)]$ is calculated for each with step height $z_1 = 7$ kÅ. The rule-based method is clearly not adequate: it lowers $\mathrm{var}(z)$ only from 767 Å to 702 Å. Also, in comparison to the approach presented in this paper, the LP formulation from [3] inserts more dummy feature, thus having higher overall $\rho_0$ values, and yet the final variation in topography $\mathrm{var}(z)$ is still very large, as shown by the numbers in the caption of Fig. 6.

Figure 6: Post-CMP topographies represented by $\rho_0$ (in $z$ direction). The units for $x$ and $y$ are both 1 mm. (a) shows the original input with $\mathrm{var}(z) = 767$ Å; (b) the rule-based approach with $\mathrm{var}(z) = 702$ Å; (c) previous work with $\mathrm{var}(z) = 358$ Å; and (d) current work with $\mathrm{var}(z) = 152$ Å.

4.2 Multiple layers

The experiment for tiling multiple layers used 3 metal layers of another real design from Motorola with six dice in the reticle field and various process control structures between the dice. The density maps for the layers are omitted here due to space. In the first part of the experiment, each of the three layers is tiled individually using the Ranged-Variation formulation from Section 3 with a target maximum variation of $\varepsilon = 300$ Å, and then the cumulative effect of the resulting layers is calculated. In the second part, the three layers are tiled using the Multiple-Ranged-Variation formulation from Section 3 with the target maximum variation for each layer being 300 Å. Figure 7 shows the results of the experiment.

Figure 7: Experiment for tiling multiple layers. (a) is the cumulative topography $\rho_0^{(k)}$ projected onto the $x$-$z$ plane when each layer is considered individually, and (b) is the result from Multiple-Ranged-Variation projected onto the $x$-$z$ plane. The unit for $x$ is 1 mm. All $\rho_0^{(k)}$ are normalized with the formula $\rho_0^{(k)} = \rho_0^{(k)} - \min(\rho_0^{(k)}) + 0.3(k - 1) + 0.2$.

Clearly, the cumulative effect due to multiple layers degrades the results of considering each layer individually. At the top layer in the experiment, the cumulative variation is about 540 Å, almost double the intended target. In contrast, the tiling result from considering the cumulative effect has every layer within the target bound of 300 Å.

5. REFERENCES

[1] I. Ali, S. Roy, and G. Shinn. Chemical-Mechanical Polishing of Interlayer Dielectric: A Review. Solid State Technology, 37(10):63-70, October 1994.

[2] C. S. Burrus and T. W. Parks. DFT/FFT and Convolution Algorithms: Theory and Implementation. John Wiley and Sons, New York, 1985.

[3] A. Kahng, G. Robins, A. Singh, and A. Zelikovsky. Filling Algorithms and Analyses for Layout Density Control. IEEE Trans. CAD, 18(4):445-462, April 1999.

[4] G. Nanz and L. Camilletti. Modeling of Chemical-Mechanical Polishing: A Review. IEEE Trans. Semicon. Manufacturing, 8(4):382-389, November 1995.

[5] D. Ouma, D. Boning, J. Chung, G. Shinn, L. Olsen, and J. Clark. An Integrated Characterization and Modeling Methodology for CMP Dielectric Planarization. In Proc. 1998 IEEE-IITC, pages 67-69, February 1998.

[6] B. Stine, et al. The Physical and Electrical Effects of Metal-Fill Patterning Practices for Oxide Chemical-Mechanical Polishing Processes. IEEE Trans. Electron Devices, 45(3):665-679, March 1998.

[7] B. Stine, et al. A Closed-Form Analytical Model for ILD Thickness Variation in CMP Processes. 1997 CMP-MIC Conference, Santa Clara, CA, February 1997.

[8] E. Travis. Private communication.

[9] T. Yu, S. Chheda, J. Ko, M. Roberton, A. Dengi, and E. Travis. A Two-Dimensional Low Pass Filter Model for Die-Level Topography Variation Resulting from Chemical Mechanical Polishing of ILD Films. 1999 IEDM, Washington, D.C., December 1999.