Large-scale double-patterning compliant layouts for DP engine and design rule development
Christopher Cork1, Kevin Lucas2, John Hapli3, Herve Raffard1, Levi Barnes4
1 Synopsys SARL, 12, Rue Lavoisier, 38330 Montbonnot, France
2 Synopsys Inc, 1301, South Mopac Expressway, Austin, TX 78746 USA
3 Synopsys Canada, One Antares Drive, Suite 300, Nepean, Ontario, Canada K2E 8C4
4 Synopsys Technology Park, 2025 NW Cornelius Pass Rd., Hillsboro, OR 97124
ABSTRACT
Double Patterning is seen as the prime technology to keep Moore’s law on path while EUV technology is still maturing into production worthiness. As previously seen for alternating-Phase Shift Mask technology [1], layout compliance of double patterning is not trivial [2,3], and blind shrinks of anything but the most simplistic existing layouts will not be directly suitable for double patterning. Evaluating a production-worthy double patterning engine with highly non-compliant layouts would put unrealistic expectations on that engine and provide metrics with poor applicability to eventual large designs. The true production use-case would be for designs that have at least some significant double patterning compliance already enforced at the design stage.

With this in mind, a set of double patterning compliant ASIC design blocks of different sizes and complexities was created. To achieve this, a set of standard cells was generated which, individually and in isolation, were double patterning compliant for multiple layers simultaneously. This was done using the automated standard cell creation tool Cadabra™ [4]. To create a full ASIC, however, additional constraints were added to make sure compliance would not be broken across the boundaries between standard cells when placed next to each other [5]. These standard cells were then used to create a variety of double patterning compliant ASICs, using iCCompiler™ to place the cells correctly. With a compliant layout in hand, checks were made to see whether the constraints made at the micro level really do ensure a fully compliant layout on the whole chip, and whether the coloring engine could cope with such large datasets.

A production-worthy double patterning engine is ideally distributable over multiple processors [6,7] so that fast turn-around time can be achieved on even the largest designs. We demonstrate the degree of linearity of scaling achievable with our double patterning engine. These results can be understood together with metrics such as the distribution of the sizes of networks requiring coloring resulting from these designs.
INTRODUCTION
Double Patterning adds new challenges to design compliance beyond those seen for alt-PSM. Blind shrinks of existing single patterning layouts, which are essentially arbitrary 2D layouts, will invariably require intervention to become double patterning compliant. A correct-by-construction methodology is essential to enable fast compliant designs. In this demonstration we show how the decomposition tool is used directly to create suitable standard cells and how, by combining these with suitable design rules governing the boundaries of these cells, large double patterning compliant ASICs have been generated. The methodology creates cells that are individually compliant, together with constraints on their interactions with any neighboring cells created using the same methodology and laid out on a prescribed tiled array; this ensures that the whole design block is also compliant.

The lack of existing layouts that are double patterning compliant complicates the testing of double patterning decomposition tools and run-sets, because poor scalability, coloring errors or excessive time spent in one of the stages of decomposition may be symptomatic of the inherent massive non-compliance of the input layout. The ability to create multiple, non-trivially repetitive double patterning compliant design blocks of any desired size therefore becomes an ideal vehicle with which to test and drive quality improvements of our double patterning software infrastructure.
Design for Manufacturability through Design-Process Integration III, edited by Vivek K. Singh, Michael L. Rieger, Proc. of SPIE Vol. 7275, 72751K · © 2009 SPIE · CCC code: 0277-786X/09/$18 · doi: 10.1117/12.815213
CREATING DPT COMPLIANT STANDARD CELLS
To create a double patterning compliant layout we took an existing 45nm standard cell set and did a blind shrink down to 32nm. Our standard cell creation tool, Cadabra™, creates standard cells by iteratively generating suitable layouts constrained by predefined rule sets such as DRC rules, as well as ones created on the fly such as lithographic hotspots, stress etc., while simultaneously using compaction techniques to minimize cell area (see Fig. 1). Adding DPT compliance was straightforward with this framework. Just as lithography simulation identifies hot spots, integrating the double patterning decomposition tool into the flow identifies coloring compliance issues.

Fig. 1 DPT engine integrated with Standard Cell generation

After decomposing and coloring each layout, non-compliant configurations were marked with DRC check polygons. These check polygons were then read by the standard cell tool and used to create a spacing constraint between that pair of edges for the next iteration of cell generation. This loop was continued until either a compliant cell was created or the loop timed out. An example of this for one particular standard cell is shown in Fig. 2, which required two such loops to generate a double patterning compliant cell.
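The generate, decompose, constrain loop described above can be illustrated with a deliberately tiny model. The sketch below is not the Cadabra™ flow: it assumes a hypothetical one-dimensional "cell" in which shapes are points on a line, a hypothetical compaction step that packs them as tightly as the constraints allow, and hypothetical spacing values `DRC_SPACE` and `DP_SPACE`. Conflicts are found by attempting a two-coloring of the spacing graph, and each conflicting pair receives an extra spacing constraint for the next iteration.

```python
from collections import deque

DRC_SPACE = 1.0  # hypothetical minimum spacing between neighboring shapes
DP_SPACE = 2.5   # hypothetical spacing below which two shapes need different masks


def compact(n, extra):
    """Place n shapes on a line as tightly as the spacing constraints allow."""
    xs = [0.0]
    for j in range(1, n):
        x = xs[j - 1] + DRC_SPACE
        for (a, b), d in extra.items():
            if b == j:
                x = max(x, xs[a] + d)
        xs.append(x)
    return xs


def find_conflicts(xs):
    """Two-color the spacing graph; return same-color neighbor pairs and the coloring."""
    n = len(xs)
    adj = {i: [j for j in range(n) if j != i and abs(xs[i] - xs[j]) < DP_SPACE]
           for i in range(n)}
    color, conflicts = {}, []
    for start in range(n):
        if start in color:
            continue
        color[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in color:
                    color[v] = 1 - color[u]
                    queue.append(v)
                elif color[v] == color[u] and u < v:
                    conflicts.append((u, v))  # plays the role of a check polygon
    return conflicts, color


def generate_compliant_cell(n, max_iterations=10):
    """Iterate until the toy cell is two-colorable or the loop 'times out'."""
    extra = {}
    for iteration in range(1, max_iterations + 1):
        xs = compact(n, extra)
        conflicts, color = find_conflicts(xs)
        if not conflicts:
            return xs, color, iteration
        for u, v in conflicts:
            extra[(u, v)] = DP_SPACE  # spacing constraint for the next iteration
    return None  # did not converge; would need manual redesign


result = generate_compliant_cell(3)
```

With three shapes the first layout contains a triangle of sub-`DP_SPACE` spacings (an odd cycle), so one extra constraint is generated and the second iteration is compliant, at the cost of a wider cell.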
Fig. 2 Standard cell showing two iterations were needed to make cell DPT compliant (POLY & METAL1 DPT analysis results).
Certain cells, however, never converged with this loop. An example is shown in Fig. 3, where the metal 1 connecting active areas is straight, while the one connecting poly landing pads is highly looped. The looped metal creates an odd cycle as indicated. Using Custom Designer™, more radical design changes could be made manually to ensure compliance. In this way, with a couple of days' work, a fully double patterning compliant set of standard cells was created.
Metal 1 connecting to active generally has less complex shapes:
• Fewer DPT conflicts
• Easier to redesign
• More space for contacts to move
Metal 1 connecting to poly has more complex shapes such as looping:
• More DPT conflicts
• Harder to redesign
• Less space to move
Fig. 3 Hand tuning for DPT compliance
This exercise validated the standard cell generation methodology, but only considered standard cells in isolation; the next stage was to ensure compliance when the cells are placed in an ASIC.
INTER CELL INTERACTIONS FOR DOUBLE PATTERNING
To make standard cells suitable for placement in an ASIC array, one needs to make sure that boundary conditions cannot create coloring violations. Fig. 4 shows that within an ASIC, standard cells are placed along rows, with wide Power/Ground rails separating rows vertically. The close proximity of a standard cell’s polygons to the cell boundary at the left and right risks the formation of non-compliant odd cycles when any pair of cells is butted up against each other horizontally. Additional constraints in the standard cell generation loop are required to prevent the creation of errors horizontally between cells, or the propagation of networks across the Power/Ground rail vertically, so that large scale compliant design blocks may be generated.

Fig. 4 Placement-aware constraints at Metal 1 for standard cells to create DPT compliant ASICs

The vertical constraint is shown in Fig. 5. Here the Power/Ground rail has minimum design space gaps above and below to polygons that have been assigned to different decomposed masks. This means that opposing colors are defined on the two sides of the Power/Ground rail, and the two masks must overlap in the middle of the rail. The overlay and process variations inherent in the lithography steps place a lower limit on the acceptable overlap between these two masks, such that they will not pull apart and the features defined in each mask are no smaller than the minimum definable width for the imaging technology. In this case (see Fig. 5) there is clearly plenty of overlap.

The horizontal constraint is defined such that the coloring of the polygons at the far left and right edges of the cell must not depend on the polygons of the neighboring cell having more than one color. The cell we fed into the standard cell generation loop therefore had two minimum width and space dummy lines added at a distance of ½ minimum design space from the edge of the cell boundary (see Fig. 6). This line pair prevents either line being split and therefore ensures the horizontal, placement-aware coloring constraint.
Fig.5 Vertical Constraint – crossing Power/Ground Rail
Fig.6 Horizontal constraint – ensuring single color at cell edges
CREATION OF LARGE DPT COMPLIANT ASICS
The standard cell set was recreated using the Cadabra standard cell generation loop. Once the manual modifications were complete, the cells were loaded into Synopsys’ place and route tool iCCompiler™. Using actual design databases, these cells were placed within predefined layout areas: 50μm square, 100μm square, 200μm square, up to 1mm square. The 100μm square design block contained 128 rows and roughly 4000 placements of standard cells and filler cells. The 1mm square design block contained 1280 rows and roughly 400,000 placements of standard cells and filler cells. All layouts were successfully decomposed (see Fig. 7), and subsequent DRC checks revealed no coloring violations, thereby validating the cell creation methodology.

Fig. 8 shows a zoom in on seven rows of cells for Metal 1 before and after decomposition, identifying cell boundaries. Fig. 9 shows layouts for Metal 1, Poly and Contact with links identifying inter-polygon spacing below the minimum single patterning spacing constraint, thereby indicating how far networks can extend. Contact networks are highly localized and, although they can extend in x, tend to be somewhat limited in size. Metal networks are similar but are linked to each other through a weak overlap link in y. Poly networks are linked in both x and y and can potentially cover the whole chip. Furthermore, the line-ends need special cell placement awareness (not implemented for this work) to prevent odd cycles forming.

Fig. 7 100x100um decomposed ASIC
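The notion of a network formed by inter-polygon links can be made concrete with a small sketch. The code below is an illustration only, not the decomposition engine: it assumes polygons are axis-aligned rectangles given as hypothetical `(x1, y1, x2, y2)` tuples, links any two rectangles whose spacing is below a hypothetical `min_space`, and groups linked rectangles with a union-find structure to report the distribution of network sizes. The all-pairs comparison is quadratic and only suitable for toy inputs.

```python
import math
from collections import Counter


def spacing(a, b):
    """Edge-to-edge distance between two axis-aligned rectangles (x1, y1, x2, y2)."""
    dx = max(0.0, max(a[0], b[0]) - min(a[2], b[2]))
    dy = max(0.0, max(a[1], b[1]) - min(a[3], b[3]))
    return math.hypot(dx, dy)


def network_sizes(rects, min_space):
    """Return network sizes (largest first) for links with spacing < min_space."""
    parent = list(range(len(rects)))

    def find(i):
        # Path halving keeps the union-find trees shallow.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(rects)):
        for j in range(i + 1, len(rects)):
            if spacing(rects[i], rects[j]) < min_space:
                parent[find(i)] = find(j)  # merge the two networks

    sizes = Counter(find(i) for i in range(len(rects)))
    return sorted(sizes.values(), reverse=True)


# Hypothetical row of contacts: three close together, one far away.
contacts = [(0, 0, 1, 1), (1.5, 0, 2.5, 1), (3, 0, 4, 1), (10, 0, 11, 1)]
sizes = network_sizes(contacts, min_space=1.0)
```

For this made-up row the three neighboring contacts form one network and the distant contact forms its own, which mirrors the localized contact networks described above.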
Fig. 8 DPT compliant ASIC at metal 1 created with placement-aware Double-Patterning compliant standard cell set.
Fig. 9 Decomposed layouts for Metal 1, Poly and Contact showing inter-polygon links and hence networks (panels: Metal 1, Contact, Poly).
SCALABILITY OF DOUBLE PATTERNING CODE
Just as OPC has benefitted from the ability to distribute processes across huge numbers of CPUs so that large designs may be processed in an acceptable time, double patterning decomposition, which involves geometrical processing as well as network coloring, should also benefit from this approach if the network coloring can be suitably distributed. Now that a set of scalable, fully double patterning compliant layouts has been created, this can be investigated with layouts resembling those of potential future products.

Double patterning decomposition consists of three main stages (see Fig. 9):
1) Analysis and reduction of geometric spaces to an abstract network
2) The coloring of that network
3) Applying that coloring to the original design so that two distinct masks are defined

The first and final steps are essentially polygon manipulation functions and can easily be distributed amongst multiple processors. In this way faster turnaround times may be achieved by simply adding extra processors, as is commonly done for OPC today. The coloring operation is essentially an unbounded problem, as in principle the coloring network may extend over the entire chip. If the network were really so large, this would limit its ability to be distributed; typically, however, multiple independent networks are found and the coloring of each of these can be assigned to a different processor.

The development of Synopsys’ Proteus Pipelined Technology™ [8] enables the flexibility of combining geometric template based distributable processing with whole chip operations such as coloring, as shown in Fig. 9. As networks typically cross geometrical templates, the CPUs must wait until all the templates which include nodes from a given network have been completed before moving on to color that network. Likewise, all networks within a given geometrical template must be resolved before colors can be assigned and cleanup operations can be applied. While the scalability of the geometrical processing is similar to that for OPC, the scalability of network coloring is normally limited by the number of independent networks, as the coloring of each network is defined as a separate thread running on a single CPU. The scalability of double patterning decomposition would therefore be limited by synchronization between the geometrical and coloring stages, and by the number of networks presented at the coloring step. Fortunately, methods exist to overcome the scaling limit of the number of independent networks and increase coloring scalability. In addition, the ability to then go straight into RET and OPC, verification and fracture operations on a per template basis gives significant time savings over running multiple jobs with input and output GDS files created for each stage.
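The two synchronization points described above (a network waits for every template it touches; a template waits for every network it contains) can be sketched as a simple dependency calculation. This is an illustrative model only, not the Proteus Pipelined Technology™ scheduler: the function name, the dictionary layout and the timings are hypothetical, and it assumes each network gets its own thread as soon as it is ready.

```python
def pipeline_schedule(network_templates, template_done, color_time):
    """Compute when each network can start coloring and each template can start clean-up.

    network_templates: network id -> set of template ids containing its nodes
    template_done:     template id -> time its geometric analysis (stage 1) finishes
    color_time:        network id -> time needed to color it (stage 2)
    """
    # Stage 2 may start only once every template touched by the network is analysed.
    net_start = {net: max(template_done[t] for t in tpls)
                 for net, tpls in network_templates.items()}
    net_done = {net: net_start[net] + color_time[net] for net in network_templates}

    # Stage 3 may start only once every network with nodes in the template is colored.
    tpl_start = {}
    for t, done in template_done.items():
        touching = [net_done[n] for n, tpls in network_templates.items() if t in tpls]
        tpl_start[t] = max(touching, default=done)
    return net_start, tpl_start


# Hypothetical example: three templates, three networks, two of which span templates.
networks = {"n1": {"A"}, "n2": {"A", "B"}, "n3": {"B", "C"}}
done = {"A": 1.0, "B": 2.0, "C": 4.0}
cost = {"n1": 0.5, "n2": 0.5, "n3": 0.5}
net_start, tpl_start = pipeline_schedule(networks, done, cost)
```

In this made-up case template A finishes its analysis at time 1.0 but cannot start clean-up until 2.5, because network n2 also depends on template B. Stalls of this kind are the serial overhead discussed in the text.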
Fig. 9 Double Patterning Decomposition using Proteus Pipeline Technology. [Flow diagram with three stages. Geometry & model processing (massively parallel): identification of minimum spacing and potential split locations. Network processing (limited by number of networks): full chip cost based solve of coloring network, treated as purely abstract data set. Geometrical processing (massively parallel): geometric implementation of solver results & clean-up, then AF, OPC and verification.]
Amdahl’s law describes the degree to which a single computation can be efficiently divided over multiple processors, where s is the proportion that must run serially, N is the number of CPUs available and T is the run time on a single CPU:

$$\mathrm{RunTime} = s\,T + \frac{(1 - s)\,T}{N}$$
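As a worked illustration of this formula, the sketch below evaluates the run time and the corresponding speedup T / RunTime, and inverts the relation to estimate s from a single measured speedup. The function names are hypothetical, and the single-point inversion is a simplification of fitting the law to a full speedup curve.

```python
def amdahl_runtime(T, s, n):
    """Run time on n CPUs for single-CPU time T and serial fraction s."""
    return s * T + (1.0 - s) * T / n


def amdahl_speedup(s, n):
    """Speedup T / RunTime, which is independent of T."""
    return 1.0 / (s + (1.0 - s) / n)


def serial_fraction(speedup, n):
    """Invert Amdahl's law: serial fraction implied by a speedup measured on n > 1 CPUs."""
    return (n / speedup - 1.0) / (n - 1.0)


# Speedup on 16 CPUs for the serial overheads reported for the three test blocks.
for s in (0.127, 0.049, 0.019):
    print(f"s = {s:.3f}: speedup on 16 CPUs = {amdahl_speedup(s, 16):.2f}")
```

The inversion follows from 1/Speedup = s (1 - 1/N) + 1/N, so it is exact only when the measurement follows Amdahl's law; with noisy measurements a least-squares fit over all N would be the more robust choice.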
Fig. 10 Comparison of speed-up curves for different design block sizes, showing the effect of serial overhead due to stalling between geometrical and coloring stages. [Plot: speedup (1 to 13) versus number of processors (1 to 16) by design block area, 10μm templates. Fitted serial overhead: 50um x 50um, 12.7%; 100um x 100um, 4.9%; 200um x 200um, 1.9%.]
Fig. 10 shows scalability results for decomposing 50um, 100um and 200um square design blocks on the contact mask using between 1 and 16 CPUs. Fitting Amdahl’s law to these results shows a rapid decrease of serial overhead, from 12.7% for the 50um square block down to 1.9% for the 200um square block. This trend suggests that over 100 CPUs can be gainfully applied when decomposing large designs. The small design blocks are not representative of any final chip that may be decomposed, but they demonstrate the impact of starting and stopping CPUs between the geometrical and coloring stages. The geometrical template size is 10um, so for a 50um square block 25 templates are created. If these are spread between 16 CPUs, then some CPUs will process only one template while others need to process two. This means several CPUs may need to wait up to ½ the total processing time for the next piece of data to compute. For large designs, each CPU will typically be computing 100s of templates across 100s of CPUs. The massive scalability of geometrical template processing indicates that more sophisticated within-template operations, such as model based determination of cuts, should not impact the scalability of the whole process.
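The template arithmetic in the paragraph above can be checked with a short sketch. It assumes, purely for illustration, that every template takes the same time to process and that templates are handed out in synchronized rounds; the function name and return layout are hypothetical.

```python
import math


def template_rounds(block_um, template_um, n_cpus):
    """Count templates, processing rounds, CPUs busy in the last round, and utilization."""
    per_side = math.ceil(block_um / template_um)
    templates = per_side ** 2
    rounds = math.ceil(templates / n_cpus)
    busy_last_round = templates - (rounds - 1) * n_cpus
    utilization = templates / (rounds * n_cpus)
    return templates, rounds, busy_last_round, utilization


small = template_rounds(50, 10, 16)   # 50um square block, 10um templates, 16 CPUs
large = template_rounds(200, 10, 16)  # 200um square block, same settings
```

Under these assumptions the 50um block yields 25 templates in two rounds, with only 9 of the 16 CPUs busy in the second round, whereas the 200um block yields 400 templates that divide evenly over 16 CPUs. This is consistent with the lower serial overhead observed for the larger blocks.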
CONCLUSION
A methodology has been described which allows the creation of large double patterning compliant ASIC blocks by considering both inter-cell and intra-cell double patterning requirements. With this methodology, a 32nm standard cell set was created which was double patterning compliant at the metal 1 layer, by shrinking an existing 45nm set and adding cell specific double patterning constraints within a standard cell generation tool. A few cells which were not able to converge to a compliant solution needed manual editing. With this standard cell set, large ASIC design blocks were created which were shown to be double patterning compliant, i.e. DRC clean after decomposition. The scalability of double patterning decomposition using Proteus Pipelined Technology depends on the layer type and design style, but was shown to demonstrate excellent linearity up to the limits of our testing and indicated that one should be able to use 100s of CPUs effectively for production runs. Furthermore, the ASIC blocks thus developed provide an excellent vehicle for further engine and recipe development.
REFERENCES
[1] Liebmann, L., “DFM lessons learnt from alt-PSM design,” Proc. SPIE Vol. 6925 (2008).
[2] Drapeau, M., et al., “Patterning Design Split Implementation and Validation for the 32nm Node,” Proc. SPIE Vol. 6521 (2007).
[3] Cork, C., “Checking design conformance and optimizing manufacturability using automated Double Patterning decomposition,” Proc. SPIE Vol. 6925 (2008).
[4] Lucas, K., “Double patterning interactions with wafer processing, OPC and physical design flows,” Proc. SPIE Vol. 6924 (2008).
[5] Lucas, K., “Physical design and mask synthesis considerations for DPT,” Immersion Symposium 2008, The Hague.
[6] Ballhorn, G., “Slashing Turn around time by introducing distributed processing,” Proc. SPIE Vol. 4562, 21st Annual BACUS Symposium on Photomask Technology, p. 183-193 (2001).
[7] Lugg, R., “An Effective Distributed Architecture for OPC & RET Applications,” Proc. SPIE Vol. 4889, p. 903-908 (2002).
[8] Boman, M., “Efficient hardware usage in the mask tapeout flow,” Proc. SPIE Vol. 7274-100 (2009).