
IEEE TRANSACTIONS ON DEVICE AND MATERIALS RELIABILITY, VOL. 6, NO. 2, JUNE 2006


A New SPICE Reliability Simulation Method for Deep Submicrometer CMOS VLSI Circuits

Xiaojun Li, Jin Qin, Bing Huang, Xiaohu Zhang, and Joseph B. Bernstein, Senior Member, IEEE

Abstract—CMOS very large scale integration (VLSI) circuit reliability modeling and simulation have attracted an intense research interest in the last two decades, and as a result, almost all IC reliability simulation tools now try to incrementally characterize the wearout mechanisms of aged devices in iterative ways. These tools are able to accurately simulate the device’s wearout process and predict its impact on the circuit performance. Nevertheless, an excessive simulation time, a tedious device testing work, and a complex parameter extraction process often limit the popularity of these tools in the product design and fabrication stages. In this paper, a new simulation program with integrated circuits emphasis (SPICE) reliability simulation method is developed, which shifts the focus of the reliability analysis from the device wearout to the circuit functionality. A set of accelerated lifetime models and failure equivalent circuit models have been proposed for the most common silicon intrinsic wearout mechanisms, including hot-carrier injection, time-dependent dielectric breakdown, and negative bias temperature instability. The accelerated lifetime models help to identify the most degraded transistors in a circuit in terms of the device’s terminal voltage and current stress profiles. Then, the corresponding failure equivalent circuit models are incorporated into the circuit to substitute these identified transistors. Finally, the SPICE simulation is performed again to check the circuit functionality and analyze the impact of the device wearout on the circuit operation. Device individual wearout effect is lumped into a very limited number of SPICE circuit elements within each failure equivalent circuit model, and the circuit performance degradation and functionality are determined by the magnitude of these additional circuit elements. 
In this new method, it is unnecessary to perform the large number of small-step iterative SPICE simulations that other tools require to obtain accuracy. Therefore, the simulation time is considerably shortened. In addition, only a reduced set of failure equivalent circuit model parameters, rather than a large number of device SPICE parameters, needs to be accurately characterized at each interim wearout stage. Thus, the device testing and parameter extraction work is also significantly simplified. These advantages will allow circuit designers to perform a quick and efficient circuit reliability analysis and to develop practical guidelines for reliable electronic designs.

Index Terms—Accelerated lifetime models, circuit reliability simulation, CMOS, failure equivalent circuit models, hot-carrier injection (HCI), negative bias temperature instability (NBTI), simulation program with integrated circuits emphasis (SPICE), time-dependent dielectric breakdown.

I. INTRODUCTION

Manuscript received August 9, 2005; revised January 12, 2006. The authors are with the Microelectronics Reliability Engineering, Center for Reliability Engineering, University of Maryland, College Park, MD 20740 USA (e-mail: [email protected]; [email protected]). Digital Object Identifier 10.1109/TDMR.2006.876572

THE SCALING of the CMOS technology into deep submicrometer regimes has brought about new reliability challenges, which are forcing dramatic changes in the approaches for assuring IC reliability. Product cost and performance requirements will be substantially affected, or even superseded, by reliability constraints [1]. The traditional reliability-assurance methods, which relied on failure detection and analysis at the end of a lengthy product-development process, are rapidly losing their efficiency due to the reliability trends predicted by the International Technology Roadmap for Semiconductors (ITRS 2003) [2]. For most applications, the current overall chip reliability levels need to be maintained over the next fifteen years, despite the possible risks induced by multiple major technology breakthroughs. This constraint requires a continuous improvement in the reliability per transistor and per unit length of interconnect, because the device dimensions keep shrinking. Scaling pushes the device performance to the limits of technologies and eats away the circuit-reliability margins. Therefore, accurate tradeoffs between performance and reliability must be addressed before committing a design to production. The projected failure-in-time (FIT) for technology nodes from 90 to 65 nm in the ITRS 2003 is on the order of 10 to 100. However, experimentally determining FIT values this low by traditional methods requires a huge number of device hours of testing. Approximately 9 × 10^7 device hours of testing are required to prove a failure rate of 10 FITs at a 60% confidence level if no failures occur during the test [3]. The increased cost and excessive time consumed by testing work demand that accurate lifetime models and efficient reliability tools be available in the product design stages.
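The device-hour figure quoted for a 10-FIT target can be checked with a short calculation. The sketch below assumes a constant (exponential) failure-rate model and a zero-failure test, in which case the required device hours are -ln(1 - CL) / λ; this is a common textbook bound and is not necessarily the exact procedure used in [3]. The function name is hypothetical.

```python
import math

def device_hours_zero_failures(fit, confidence):
    """Device hours needed to demonstrate `fit` at `confidence` with zero failures.

    Assumes an exponential (constant failure-rate) model; this is an
    illustrative assumption, not taken from the paper.
    """
    failure_rate_per_hour = fit * 1e-9  # 1 FIT = 1 failure per 1e9 device hours
    return -math.log(1.0 - confidence) / failure_rate_per_hour

hours = device_hours_zero_failures(fit=10, confidence=0.60)
print(f"{hours:.2e} device hours")  # 9.16e+07 device hours
```

Under these assumptions the result is about 9.2 × 10^7 device hours, consistent with the order of magnitude stated in the text.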
The validity of the voltage and temperature acceleration methods, which has been utilized in the reliability screening and qualification techniques (such as burn in and accelerated life test), becomes questionable due to the diminished margin for proper acceleration of these stress factors. The traditional FIT and acceleration factor determination methods that rely on the multiplication of the temperature and voltage acceleration factors need to be revisited, and the correlation of these factors must be explored and modeled for the purpose of an accurate failure rate prediction. Finally, as circuits become increasingly complex, two irreversible trends can be noted. First, a given device within a chip is stressed for a decreasing fraction of the reliability testing time. Second, a longer delay is required to correct the reliability problem by a process or design iterations [4]. All of the above trends demand that a device lifetime and circuit reliability be characterized and predicted accurately during the product design process. This can only be fulfilled by the effective IC reliability simulation tools.

1530-4388/$20.00 © 2006 IEEE



CMOS circuit reliability simulation has attracted intense research interest in the last two decades. Significant progress in modeling device wearout mechanisms has led to the emergence of quite a few successful reliability simulation tools [5]–[9]. The common feature of these tools is that they physically characterize the device wearout process under a real circuit stress environment and incrementally simulate the circuit performance degradation in iterative ways. This physics-based iterative simulation algorithm often produces accurate simulation results, at the cost of excessive computational and experimental work. Some attempts have been made to improve the simulation efficiency by employing the fast-timing simulation method [10], [11] or by performing a gate-level circuit simulation [12]. However, the device-wearout-based simulation and testing philosophy is preserved. As a result, even though reliability simulation is generally regarded as an essential step in deep submicrometer CMOS circuit design and fabrication, the tedious device testing work for the degraded parameter extraction often discourages chip designers from exercising IC reliability simulation in their everyday work.

In reviewing the reliability-simulation practice in industrial and academic communities, it is obvious that some fundamental concepts and techniques have been universally adopted. These not only form the common foundation of the legacy reliability simulation tools but also nurture new ideas in some previously unresearched areas. These new ideas will give rise to the development of new IC reliability simulation methods that are both efficient and effective. In light of these ideas, a new Maryland Circuit-Reliability-Oriented (MaCRO) simulation program with integrated circuits emphasis (SPICE) simulation method is developed that is based on the rate-of-failure concept and failure equivalent circuit modeling techniques.
The MaCRO consists of a series of accelerated lifetime models and failure equivalent circuit models for the common silicon intrinsic wearout mechanisms, including a hot-carrier injection (HCI), time-dependent dielectric breakdown (TDDB), and negative bias temperature instability (NBTI). The MaCRO simulation is a first-order approach that does not fully characterize the microcosmic interactions among these wearout mechanisms. This assumption simplifies the device-wearout modeling process and makes the MaCRO compatible with the standard simulation tools. The MaCRO has promised a way for system designers to better prepare for the reliability challenges that will be present in the future generation technologies. The overall simulation flow of the MaCRO is straightforward, and the SPICE routine is only called for very limited times to simulate the impact of the device wearout on circuit functionality. The MaCRO models and simulation algorithm are presented in a series of three papers: This paper is a summary of the models and simulation algorithm; the detailed accelerated lifetime models and failure equivalent circuit models for the HCI, TDDB, and NBTI are presented in a separate paper [13]; the third paper demonstrates the simulation process of a simplified static random-access-memory (SRAM) circuit as an example to illustrate how to apply the MaCRO method to the circuit reliability analysis and design-for-reliability practice [14].

This paper is organized as follows. Two commercial state-of-the-art reliability simulation tools are reviewed in Section II. A discussion of their limitations and possible improvements is presented in Section III. A summary of accelerated lifetime models and failure equivalent circuit models for each wearout mechanism is given in Section IV. The overall MaCRO simulation algorithm is illustrated in Section V. Finally, this paper is concluded with Section VI.

II. REVIEW OF RELIABILITY SIMULATION TOOLS

Hot-carrier-induced MOS device wearout is one of the most critical reliability issues for deep submicrometer CMOS integrated circuits. Hot-carrier reliability models and simulation methods have been implemented and widely used in the semiconductor industry for many years. To some extent, the accuracy of the hot-carrier reliability simulation represents the robustness and efficiency of an entire reliability simulator. Therefore, for simplicity, HCI simulation is employed in this section as the vehicle to deliver the basic concepts and flows realized in some commercial state-of-the-art reliability simulation tools.

A. Hot-Carrier Reliability Simulation in Virtuoso UltraSim

Virtuoso UltraSim is the Cadence FastSPICE circuit simulator capable of predicting and validating the timing, power, and reliability of a mixed signal, complex digital, and System-on-Chip (SoC) designs in an advanced technology of 0.13 µm and below. It has a set of specialized reliability models (AgeMos) for the HCI and NBTI simulation [15]. In the simulation, an Age parameter is calculated for each nMOSFET with the following formula:

$$\mathrm{Age}(\tau) = \int_{t=0}^{t=\tau} \left(\frac{I_{sub}}{I_{ds}}\right)^{m} \frac{I_{ds}}{W \cdot H}\, dt \tag{1}$$

where W refers to the channel width of the transistor, m and H are technology-dependent parameters and determined from experiments, Isub is the substrate current, Ids is the drain current and, τ is the stress time. For pMOSFETs, the gate current Igate instead of Isub is used to determine the Age parameter. The degree of the device wearout has been experimentally found to be a function of this Age parameter for wide ranges of channel lengths and stress conditions, and the relationship has a plausible theoretical basis [16]. The simulation starts with the device parameter extraction and modeling. From the SPICE model parameters of fresh devices, some other device parameters are added to accurately model Isub . Saturation current Idsat , threshold voltage Vth , or the maximum transconductance gm can be used as a degradation monitoring parameter. Idsat is a good degradation monitor for digital circuits, whereas Vth is suitable for analog applications. Normally, the stress time resulting in 10% decrease of one of these degradation monitoring parameters is arbitrarily set to the device lifetime. The final step is the AgeMos extraction. Based on the Age parameter calculated


Fig. 1. Hot-carrier reliability-simulation flowchart in Virtuoso UltraSim. Device-wearout modeling is the focus of the reliability analysis [17].

from the fresh simulation, the AgeMos applies the degradation models, which can be fed to most SPICE-like simulators, to the aged circuit simulation. The reliability simulation with the Virtuoso UltraSim is an iterative process, in which a large number of iterations are often needed in order to obtain the accurate modeling results. The simulator can calculate and output the degradation results to predict the lifetime of each MOSFET within a circuit [17]. The overall simulation flow is illustrated in Fig. 1. The fundamental models and algorithms of the reliability simulation realized in the Virtuoso UltraSim found their origins in the BERT (Berkeley Reliability Tools), which gave rise to many other reliability simulation tools. Most of these tools are based on the Age-parameter modeling concept. The main advantages of these BERT-like tools are the accuracy and SPICE compatibility. However, they put a burden on the product designers to correctly extract the device’s degraded parameters, and an inaccurate extrapolation may lead to nonphysical trends, which limits their popularity in the reliability design process.
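As a rough illustration of how an Age parameter of the form in (1) could be accumulated from simulated waveforms, the sketch below integrates (Isub/Ids)^m · Ids/(W·H) over sampled time points with the trapezoidal rule. The function name, the sampling scheme, and all numerical values are assumptions for illustration; this is not the AgeMos or BERT implementation.

```python
def age_parameter(times, i_sub, i_ds, width, h, m):
    """Trapezoidal estimate of Age(tau) = integral of (Isub/Ids)^m * Ids/(W*H) dt.

    times, i_sub, i_ds are equal-length sequences of sampled waveform values.
    width (W), h (H), and m are the quantities named in (1).
    """
    def integrand(isub, ids):
        if ids <= 0.0:
            return 0.0  # no channel current: treat the stress contribution as zero
        return (isub / ids) ** m * ids / (width * h)

    values = [integrand(s, d) for s, d in zip(i_sub, i_ds)]
    age = 0.0
    for k in range(1, len(times)):
        dt = times[k] - times[k - 1]
        age += 0.5 * (values[k] + values[k - 1]) * dt
    return age

# Hypothetical constant-stress example: 10 s of stress, sampled at three points.
times = [0.0, 5.0, 10.0]
age = age_parameter(times, [1e-6] * 3, [1e-3] * 3, width=1e-6, h=1.0, m=3.0)
print(age)
```

With constant currents the integrand is constant, so the result is simply the integrand multiplied by the stress time; real transient waveforms would make the sampling density matter.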

B. Hot-Carrier Reliability Simulation in Eldo

Eldo is a circuit simulator developed by Mentor Graphics, which delivers all the capability and accuracy of the SPICE-level simulation for complex analog circuits and SoC designs. Hot-carrier reliability simulation in Eldo is based on a compact ∆Id model, which directly models the difference of the drain currents between the fresh and aged devices. There exist two competing mechanisms, which lead to the obvious hot-carrier-induced drain-current variations between the fresh and degraded devices: the deviation of Id from its linear dependence of Vds due to the velocity saturation effects, and the decrease of ∆Id/Id due to the reduction of the charged interface states [18].

Fig. 2. HCI reliability simulation in Eldo [19]. Iterative simulation scheme is the main feature.

In Eldo, the ∆Id is modeled with (2)–(5), which unify the subthreshold, linear, and saturation regions with a simple relation for both forward and reverse operation modes [19]

$$\frac{\Delta I_d}{I_d} = \frac{B_6\left(1 - e^{-B_1 V_{gs}}\right) + B_2}{1 + B_5\left(V_{gs} - B_3 V_{th}\right)} \times \frac{N_{it} L_{it}}{L_{\mathrm{eff}}} \times \frac{1}{1 + \alpha\left(V_{ds} - V_{low}\right) + \beta V_{ds}} \tag{2}$$

$$V_{low} = A_3 V_{dsat} \tag{3}$$

$$\alpha = \frac{A_1}{1 + A_4\left(V_{gs} - V_{th}\right)^{A_2}} \tag{4}$$

$$\beta = A_5 V_{gs} + A_6 \tag{5}$$

where Nit is the interface-trap density, Lit is the extension of the damage region in the channel, Leff is the effective channel length, Vgs is the gate-to-source voltage, Vth is the threshold voltage, Vds is the drain-to-source voltage, Vdsat is the drain saturation voltage, and A1 to A6 and B1 to B6 are model-fitting parameters. The same Age parameter defined by (1) is also incorporated to model the “age” of each transistor. The HCI aging process is simulated in an iterative way as depicted in Fig. 2. The period Tage, at which the circuit performance is to be tested, is divided into smaller time intervals T1. The Age table is calculated at the end of each time interval, and a new simulation with Eldo is carried forward. This process is repeated until Tage is reached. This iterative scheme can account for the gradual change of bias conditions as a result of the device wearout. The ∆Id model approach offers a relatively simpler parameter-extraction process. It is suitable for modeling bidirectional stresses and asymmetrical drain-current behaviors. However, because this approach adopts the same Age parameter and a similar small-step iterative process in the degradation simulation, it inherits the same disadvantages of the BERT-like tools as discussed before.

III. LIMITATIONS AND IMPROVEMENTS

Although the brief review in Section II reveals both the advantages and limitations of the contemporary reliability simulation tools, a further discussion is necessary for the sake of identifying the fundamental reasons for these limitations and understanding how the MaCRO models and algorithms overcome some of these limitations. In reliability qualification practice, the definition of device lifetime or failure due to the wearout mechanisms is quite arbitrary. A predefined shift in a certain device parameter is often selected as the criterion for failure. Some examples are a 10% reduction in Ids, a 10% decrease in gm, or a 50-mV shift in Vth. While these parameter drifts generally reflect the degree of device wearout, in real circuit applications, device failures defined in this way may not necessarily result in circuit failures. In order to establish a more realistic failure definition, Li and Hajj at the University of Illinois, Urbana-Champaign (UIUC) [20] proposed a new criterion, which includes the estimation of both the device local damage and circuit global degradation. Jiang et al. at Massachusetts Institute of Technology (MIT) [21] further used a 3% reduction in the critical path delay as the circuit-level failure criterion in the ripple-carry adder case studies. Although a significant improvement has been made in the device-failure modeling, no universally accepted method yet exists to properly define the device lifetime and assess the impact of the device failures on circuit-level reliability. If the device lifetime is defined as a percentage or absolute drift in the device parameters, then an accurate calibration of the difference between the fresh and degraded device parameters is indispensable for accurate circuit reliability simulation. However, the parameter extraction for modeling the individual device wearout to a satisfactory accuracy is extremely tedious and difficult. In the MaCRO, the focus of the reliability analysis is the circuit functionality rather than the device wearout process.
Therefore, an accurate characterization of each device-parameter degradation is not necessary. A set of accelerated lifetime models for various wearout mechanisms is developed to identify the most degraded transistors in a circuit based on their terminal voltage and current waveforms. In this approach, the normalized device lifetime values, instead of the absolute ones, need to be predicted. As a result, the device testing and parameter-extraction work is significantly alleviated.

Device-wearout-focused reliability simulation tools treat the various device-wearout mechanisms with a divide-and-conquer algorithm. Even though some of them, like the BERT, have the capability to deal with electromigration (EM), HCI, and TDDB in the same environment [8], [16], each of these mechanisms is handled by a dedicated module with the assumption that the mechanisms are independent of each other. In reality, transistors in a circuit are exposed to all kinds of stresses simultaneously and suffer from the various wearout mechanisms, which may interact with each other. As a result, the net effect of these combined mechanisms often leads to a precipitous degradation process. Another problem is that some wearout processes are the synergistic effect of two or more wearout mechanisms, which have to be decoupled from each other in order to accurately characterize them individually. For example, both the channel hot-carrier (CHC) injection and biased temperature-instability (BTI) mechanisms will contribute to the interface-trap generation, which is the main reliability monitor in the wearout

process. Recently, some work has been done to uncover this interrelationship of the different wearout mechanisms. La Rosa et al. at IBM [22] investigated the impact of both the NBTI and CHC contributions to the device damage and proposed a methodology to decouple their effect. Yu et al. at University of Central Florida (UCF) [23] experimentally examined the interaction of the NBTI with the TDDB and HCI, and developed a transistor model to evaluate their combined effect on RF circuit-performance degradation. Even with this progress, generally speaking, the device-wearout-focused reliability simulation methods cannot effectively deal with the combined effects of the various mechanisms. In the MaCRO, a set of failure equivalent circuit models are developed to characterize the circuit failures due to the multiple wearout mechanisms. This failure equivalent circuit modeling concept is different from the former concepts and models proposed in literature (e.g., [23]). A thorough review of these former concepts and models for the HCI, TDDB, and NBTI is presented in the following papers. In some sense, these models are kind of rudimentary, but they laid the foundation for further development of any advanced models. In the MaCRO, the improved failure equivalent circuit models will be imported into the SPICE netlists to substitute the most degraded transistors in the circuit. The SPICE simulation with these failure equivalent circuit models will reveal whether the circuit can survive from the device wearout at any specific time. Device-wearout-focused reliability simulation tools only treat transistors suffering wearout mechanisms one by one in a circuit. This is not accurate, because the neighboring devices also degrade at the same time and, therefore, influence terminal waveforms of the transistor under consideration. The effects of the HCI on the operation of the neighboring devices and circuits have been explored in [24]. 
For an nMOS transistor in a circuit, its threshold voltage will decrease, and its subthreshold current will increase due to the excess substrate currents flowing in the neighboring MOS transistors as a result of the HCI and impact ionization effects. Some researchers have realized the problem of neglecting the neighboring effects, but they turned to the other extreme case by taking into account all the transistors’ wearout effects at the same time. Obviously, these two cases are either inaccurate or inefficient. In a real circuit, different transistors operate at different bias points and, therefore, experience different stresses. Device lifetime is roughly exponentially dependent on these stress factors, which may lead to a significant difference (sometimes several orders of magnitude, refer to Fig. 1 in [25]) in the device lifetime values. In the MaCRO, by sorting the normalized device lifetime values and only considering those transistors whose lifetimes are significantly smaller than others, we may obtain both modeling accuracy and computational efficiency in addressing the neighboring effects.

IC reliability analysis has shown that the device DC lifetime is not sufficient to characterize the circuit-performance degradation. Therefore, much work has been done to model the device AC lifetime in circuits from static stress tests. Even though significant progress has been achieved in this field, due to the extreme complexity of the device terminal waveforms in real circuits, there is still no convincing model available that is able to quantitatively predict the device lifetime to a satisfactory accuracy. An accurate and absolute value of the device lifetime is theoretically important in reliability qualification; however, in engineering practice, because of the statistical characteristics of the device failure, an order-of-magnitude variation in the predicted lifetime values is frequent and often tolerable.

Compared with the device-wearout life, a device or a circuit service life is extremely short, which makes the commonly adopted end-of-life characterization method rather ineffective in the reliability analysis. End-of-life methods try to model the rising tail of a bathtub curve, but more important and useful information is the level of the failure rate in the middle part. An identifiable trend in the reliability community is that a hockey stick curve is gradually preempting the bathtub curve in the reliability analysis. With fast developments in deep submicrometer technologies, the integrated circuits become increasingly complex, and both the physical dimensions and processing techniques of each on-chip component are pushed to the limits. Every component is prone to fail in a shorter time, and if it does fail, the whole circuit will be greatly impaired or even fail at the same time. We can therefore approximate a complex integrated circuit with a competing failure system, i.e., a series failure system. For a complex circuit with multiple failure mechanisms, the large number of different trends and distributions of these mechanisms will be averaged out to a constant level of the failure rate. With circuit complexity ever-increasing, and no wearout mechanism dominant in the device, the circuit-failure distribution becomes more and more randomized. In this situation, the distributions of the individual failure mechanisms are not indispensable for predicting the overall circuit failure rate, and we can treat the circuit as a constant failure-rate system.
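The series (competing) failure-system view with constant failure rates can be illustrated numerically. The sketch below assumes that the mechanisms fail independently, so that the system failure rate is the sum of the individual rates, the mean time to failure is its reciprocal, and the reliability is R(t) = exp(-λt). The independence assumption, the function names, and the per-mechanism FIT values are hypothetical and chosen only for illustration.

```python
import math

def series_failure_rate_fit(mechanism_fits):
    """System failure rate (in FIT) of a series system of independent mechanisms."""
    return sum(mechanism_fits)

def mttf_hours(fit):
    """Mean time to failure in hours for a constant failure rate given in FIT."""
    return 1e9 / fit

def reliability(fit, hours):
    """Survival probability R(t) = exp(-lambda * t) for a constant failure rate."""
    return math.exp(-fit * 1e-9 * hours)

# Hypothetical contributions from HCI, TDDB, and NBTI, in FIT.
total_fit = series_failure_rate_fit([10.0, 20.0, 15.0])
print(total_fit)                            # 45.0
print(f"{mttf_hours(total_fit):.3e}")       # 2.222e+07
print(reliability(total_fit, 10 * 8760.0))  # survival probability after ten years
```

In this simplified picture only the summed rate matters, which is why the individual mechanism distributions are not needed to estimate the overall circuit failure rate.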
The failure-rate (λ) parameter solely characterizes the overall rate-of-failure process and reflects the level of system reliability. This rate-of-failure concept is adopted in the MaCRO to help develop the accelerated lifetime models and predict the circuit-reliability factors [26]. In developing the accelerated lifetime models and determining the add-on model parameters for failure equivalent circuit models, we made a quasi-static operation assumption, which trades accuracy for simulation speed. This assumption conforms to the primary purpose of the MaCRO: providing a simple tool for designers to make a quick circuit performance and reliability evaluation. In the literature, some advanced algorithms have been developed to address the AC lifetime problem [27]–[29], which will be incorporated in the MaCRO in future work.

In summary, the value of IC reliability simulation is not to determine the absolute lifetime values of devices and circuits. Rather, it should provide chip designers with simple guidelines to perform a quick circuit performance and reliability evaluation, make appropriate tradeoffs between performance and reliability, and reduce the product-development cost and time. Reliability is unanimously regarded as a vital factor for any successful product. However, reliability simulation has not been actively practiced in the industry due to the reasons discussed above. Most of the aforementioned limitations have been addressed in the MaCRO, which treats the circuit reliability from a different perspective by elevating


Fig. 3. HCI failure equivalent circuit model in the MaCRO. In the model: Vgdx = Vgs − Vth − Vds and VRd = Ids ∆Rd . Vth is the threshold voltage and Ids is the current from node D to S.

the reliability analysis from the device wearout to the circuit functionality. This circuit-functionality-centered method integrates the rate-of-failure concept, accelerated lifetime models, and failure equivalent circuit modeling techniques into a unified framework, and provides the designers an alternative in the efficient circuit reliability analysis and a tool for the systematic design-for-reliability practice.

IV. ACCELERATED LIFETIME MODELS AND FAILURE EQUIVALENT CIRCUIT MODELS

The accelerated lifetime models and failure equivalent circuit models for each wearout mechanism are summarized in this section; detailed processes of developing these models are given elsewhere.

A. HCI

The HCI accelerated lifetime-model equation for nMOSFET is given by the following equation:

$$t_f = A_{HCI} \left(\frac{I_{sub}}{W}\right)^{-n} \exp\left(\frac{E_{aHCI}}{\kappa T}\right) \tag{6}$$

where Isub is the substrate leakage current, EaHCI is the activation energy, W is the channel width, κ is Boltzmann’s constant, T is the temperature, n is a process-related constant, and AHCI is the model prefactor. For the pMOSFET HCI lifetime model, the gate leakage current Igate replaces Isub in (6). The HCI failure equivalent circuit model for nMOSFET is illustrated in Fig. 3, which is based on the ∆Rd model [30] with some improvements. The inclusion of ∆Rd emulates the degradation in drain-to-source current Ids. Both the interface-trap generation and oxide charge trapping contribute to the increase in ∆Rd value. The contribution of oxide charge trapping to the device wearout is neglected in the original ∆Rd model [30], but recent experimental work and the SRAM reliability simulation done for this paper prove that the oxide trapped charge is also a major contributor to the device wearout.

B. TDDB

The TDDB accelerated lifetime-model equation for the nMOSFET is based on the work by Wu and co-workers at IBM [31]–[34] and given by (7)

$$t_f = A_{TDDB} \left(\frac{1}{A}\right)^{1/\beta} F^{1/\beta}\, V_{gs}^{\,a+bT} \exp\left(\frac{c}{T} + \frac{d}{T^{2}}\right) \tag{7}$$

where A = W × L is the device gate-oxide area, β is the Weibull slope parameter, F is the cumulative failure percentage at use condition, Vgs is the gate-to-source voltage, T is the temperature, a, b, c, and d are the model-fitting parameters determined from the experimental work, and ATDDB is the model prefactor. Note that a + bT is always negative. Equation (7) is the result of the various TDDB experimental observations including a power-law voltage acceleration, non-Arrhenius temperature acceleration, and weakest-link area scaling law. The TDDB failure equivalent circuit model for the nMOSFET is illustrated in Fig. 4, in which two split transistors imitate the channel separation by an oxide breakdown path, and the voltage-dependent current source Iox physically represents the conduction mechanism of a hard breakdown path across the oxide. Fig. 4 is the model for the gate-to-channel breakdown scenario, which is a much more frequent statistical event than gate-to-diffusion breakdowns. In contrast, most work in the literature focuses on the gate-to-diffusion breakdown events, which have more severe effects on the circuit functionality. A simple gate-to-source or gate-to-drain parasitic resistance is often used for modeling the gate-to-source or gate-to-drain breakdown effect. Statistically, the probability of the gate-to-diffusion breakdowns will become larger in shorter channel devices due to the nonscalable overlap of the gate-to-diffusion regions. In this situation, which TDDB model to use in the reliability simulation becomes a question. Designers may have to simulate with both gate-to-channel and gate-to-diffusion models and evaluate the circuit function with the worse one. In this paper, we only focus on the gate-to-channel breakdown events, because they are more frequent in the technology we selected to implement the illustrative SRAM design and reliability simulation.

Fig. 4. TDDB equivalent circuit model in the MaCRO. IOX = IS − ID is a voltage-dependent current source representing a breakdown path current injection effect. RD and RS characterize the resistance in the source and the drain extensions, respectively. L1 represents breakdown location by using the source edge as the reference.

C. NBTI

The NBTI accelerated lifetime-model equation for pMOSFET is based on the physics and statistics model proposed by Zafar and co-workers at IBM [35], [36] and shown as (8):

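$$t_f = A_{\mathrm{NBTI}}\, V_{gs}^{-1/\beta} \left[\frac{1}{1 + 2\exp\!\left(-\frac{E_1}{\kappa T}\right)} + \frac{1}{1 + 2\exp\!\left(-\frac{E_2}{\kappa T}\right)}\right]^{-1/\beta} \tag{8}$$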

where β is the model-fitting parameter, E1 is a material-related constant, E2 is a material- and oxide-field-dependent parameter, Vgs is the gate-to-source voltage, and ANBTI is the model prefactor. This NBTI lifetime equation was derived from a fractional power-law model, which characterizes the relation between the threshold-voltage shift and the NBTI stress factors. The fractional power law can capture the threshold-voltage saturation behavior. Another important NBTI phenomenon is the dynamic-recovery behavior. In (8), E2 is a gate-voltage-dependent parameter; therefore, the model can also capture the NBTI dynamic-recovery effects. The NBTI failure equivalent circuit model for the pMOSFET is illustrated in Fig. 5. The NBTI-induced pMOSFET threshold-voltage increase is modeled as a decrease in the absolute gate-to-source voltage. A gate tunneling current flowing through the gate resistance RG raises the voltage at point G. This corresponds to a decrease in the pMOSFET absolute gate-to-source voltage and therefore mimics the threshold-voltage degradation effect. The gate tunneling current is modeled with two voltage-controlled current sources, which follow a power-law formula of the form $I = K V^{P}$.

Fig. 5. NBTI failure equivalent circuit model in the MaCRO. The inclusion of IGD and IGS inherently accounts for oxide breakdown effects and also supplies leakage currents for RG, whose voltage drop is equivalent to the pMOSFET threshold-voltage degradation.

V. CIRCUIT-LIFETIME PREDICTION AND RELIABILITY-SIMULATION ALGORITHMS

The MaCRO accelerated lifetime and failure equivalent circuit models can be tailored for different purposes of the reliability analysis. If the circuit lifetime is of primary interest, we can use the accelerated lifetime models to predict the device and circuit lifetimes accurately by properly extracting all model parameters; if the circuit functionality is of primary interest, we can quickly identify the weakest devices with normalized lifetime

calculation and utilize the failure equivalent circuit models to characterize the circuit behaviors in terms of the different wearout mechanisms. The flowchart and simulation algorithm for these two distinct purposes are presented in this section.

A. Circuit-Lifetime and Failure-Rate Prediction

The lifetime of each transistor in a circuit with respect to each wearout mechanism is given by (6), (7), or (8). To obtain the lifetime of the entire circuit, we need to combine the effects of the different failure mechanisms across the different structures. This requires information on the time-dependent lifetime distribution for each mechanism. In engineering applications, the FIT value is normally used to quantify product reliability. FIT represents the number of failures per $10^{9}$ device hours of accelerated stress test. Most FIT calculation methods apply only to systems with a constant failure rate for each failure mechanism, and special treatment is required for systems with time-variant failure behaviors [37]. State-of-the-art very large scale integration (VLSI) devices are complex systems with millions of individual transistors. Each transistor has at least a dozen failure mechanisms associated with it. Simulation shows that as the number of failure mechanisms in a VLSI circuit increases to five or more, the Weibull shape parameter shifts toward unity, unless all the failure mechanisms have the same shape parameter and similar characteristic life. This simple observation implies that the failure rate of a VLSI device approaches a constant level as it becomes increasingly complex, and that it is difficult to distinguish any specific failure mechanism from the others. A good example of how increasing complexity results in a constant failure rate is the decrease in Weibull slope as the number of possible EM failure links in a device increases.
EM is one of the most important wearout failure mechanisms in microelectronic circuits. Each of those EM failure links has a strength associated with it, which will vary with some distribution based on the variables from the design and process. The stress for each link is also a random variable. This


series of random strengths, stresses, and the possibility of some lower strength links lead to a large spread in the probability distribution of the weakest link. With enough links, the overall probability distribution function becomes an exponential one. All these pieces of evidence prompted us to make the constant failure-rate assumption. The constant failure-rate-based reliability method for electronic circuits allows VLSI manufacturers to test parts under accelerated conditions, assuming that all failure mechanisms can be accelerated in approximately the same proportion. The resulting failure rate can then be extrapolated to operating conditions considering the temperature, frequencies, and voltages. In order to derive a simple model for lifetime prediction, we made another practical assumption: we approximate each failure mechanism with an exponential distribution. In this way, the failure rate of each failure mechanism also becomes a constant. With the above two assumptions, we apply the standard sum-of-failure-rates (SOFR) model to predict the system’s failure rate from its individual failure mechanisms [38]. From the SOFR model, the mean-time-to-failure (MTTF) of a circuit composed of n units can be related to the lifetime of each unit (MTTFij) due to each of the m individual failure mechanisms

$$\mathrm{MTTF} = \frac{1}{\sum_{i=1}^{n} \sum_{j=1}^{m} \frac{1}{\mathrm{MTTF}_{ij}}}. \tag{9}$$
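The step behind (9) can be written out explicitly. The sketch below assumes, as the SOFR model usually does, that the failure times are independent and that the circuit fails at the first failure of any unit (a series system). The symbols $\lambda_{ij}$ and $R(t)$ are introduced here for illustration only.

```latex
% lambda_{ij} = 1 / MTTF_{ij}: constant failure rate of unit i due to mechanism j
% R(t): probability that no unit has failed by time t
R(t) = \prod_{i=1}^{n}\prod_{j=1}^{m} e^{-\lambda_{ij} t}
     = \exp\!\Big(-t \sum_{i=1}^{n}\sum_{j=1}^{m} \lambda_{ij}\Big),
\qquad
\mathrm{MTTF} = \int_{0}^{\infty} R(t)\,dt
     = \frac{1}{\sum_{i=1}^{n}\sum_{j=1}^{m} \lambda_{ij}} .
```

Because the product of exponentials is again an exponential, the combined system keeps a constant failure rate equal to the sum of the individual rates.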

The FIT is interchangeable with the MTTF according to its definition for the constant failure-rate system

$$\mathrm{FIT} = \frac{10^{9}}{\mathrm{MTTF}}. \tag{10}$$
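As an illustration of how (9) and (10) combine, the following minimal Python sketch sums the per-unit, per-mechanism failure rates and converts the result to FIT. The function names and the numerical values are hypothetical and are not taken from the MaCRO implementation; MTTF values are assumed to be in hours.

```python
from typing import Sequence


def sofr_mttf(mttf_ij: Sequence[Sequence[float]]) -> float:
    """Combine per-unit, per-mechanism MTTFs (in hours) as in (9).

    mttf_ij[i][j] is the MTTF of unit i due to mechanism j. Under the
    constant-failure-rate assumption each entry contributes a rate
    1 / MTTF_ij, and the rates add.
    """
    total_rate = 0.0
    for unit in mttf_ij:
        for mttf in unit:
            if mttf <= 0:
                raise ValueError("MTTF values must be positive")
            total_rate += 1.0 / mttf
    if total_rate == 0.0:
        raise ValueError("at least one MTTF value is required")
    return 1.0 / total_rate


def fit_from_mttf(mttf_hours: float) -> float:
    """Convert an MTTF in hours to FIT (failures per 1e9 device hours), as in (10)."""
    if mttf_hours <= 0:
        raise ValueError("MTTF must be positive")
    return 1e9 / mttf_hours


if __name__ == "__main__":
    # Hypothetical data: two transistors, three mechanisms (HCI, TDDB, NBTI).
    example = [[2e6, 4e6, 4e6], [1e6, 2e6, 2e6]]
    circuit_mttf = sofr_mttf(example)
    print(f"circuit MTTF = {circuit_mttf:.1f} h, FIT = {fit_from_mttf(circuit_mttf):.1f}")
```

Note that the shortest individual lifetime dominates the sum, so the circuit MTTF is always below the smallest MTTFij.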

If all the parameters of the accelerated lifetime models presented in Section IV have been extrapolated from the device testing work, then from (6) to (10) we can calculate the MTTF and FIT values of the circuit under consideration. The flowchart of the circuit lifetime and failure-rate-prediction process is depicted in Fig. 6. The device/circuit lifetime and failure-rate-prediction method shown in Fig. 6 can also be used in reliability projections for future technologies. With the availability of the latest device SPICE models and technology-dependent model-fitting parameters, we can predict the reliability trends of the 90 nm process and beyond in light of the wearout mechanisms being discussed. The last point deserving special attention in the lifetime prediction is the accuracy limit imposed by the quasi-static assumption, which neglects the HCI/TDDB AC acceleration effects. In estimating device terminal voltage and current stress profiles with SPICE, even though the device operations are dynamic, for simplicity we only calculate the time-average values of these terminal waveforms. If the terminal waveforms are clean and regular, a duty-cycle method can be applied instead of the time-average method to improve accuracy. The waveform-averaging method based on the duty cycles is used in the SRAM reliability-simulation work. For such a circuit with regular operation patterns, we can divide the period of one



Fig. 6. Flowchart of the device and circuit lifetime and failure-rate-prediction process with the MaCRO accelerated lifetime models. The SPICE simulation predicts the device terminal voltage and current stress profiles; the model-fitting parameters are determined from the device testing work.

operation sequence into small steps according to the waveform patterns. During each small step, we treated the stress profile as quasi-static and integrated the stress contribution from each step over the whole period. Finally, we extended one period to the full simulation time. This quasi-static method becomes time consuming for complex circuits operating in irregular patterns. Dynamic integration algorithms may be required to improve both the accuracy and speed for more complex circuits. In general, there is no accurate model for dynamic-stress analysis, which, in addition to the complexity of extracting all model parameters, limits the applicability of the MaCRO lifetime prediction. To overcome this limitation, MaCRO shifts the focus of the reliability analysis from absolute lifetime prediction and device wearout to normalized lifetime calculation and circuit functionality.

B. Circuit Reliability-Simulation Algorithms

The MaCRO circuit-reliability-simulation algorithm, which investigates the circuit-reliability behavior with failure equivalent circuit models, is illustrated in Fig. 7. It is fundamentally a two-step SPICE simulation process. First, the SPICE simulation is performed without considering any failure mechanisms. From the first simulation run, the terminal voltage and current stress profiles for each transistor can be obtained. With the average terminal voltage and/or current information, the accelerated lifetime models for HCI, TDDB, and NBTI can be called to compute the normalized lifetime of each mechanism for each transistor. A table of sorted device lists for HCI, TDDB, and NBTI, each ranked by the normalized lifetime values, is then generated so that designers can identify the most degraded transistors. After identifying the most degraded transistors, MaCRO calls the SPICE engine again. The second-round SPICE simulation is performed by substituting those identified transistors with the corresponding failure equivalent circuit models, individually or jointly, depending on whether a specific transistor experiences a single or multiple wearout mechanisms.
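The ranking step between the two SPICE runs can be sketched as follows. This is a minimal Python illustration, not the MaCRO code: the function names, the device names, the placeholder power-law model (only a voltage term of the kind appearing in (7)), and the choice to normalize by the longest lifetime in the list are all assumptions made for the example.

```python
from typing import Callable, Dict, List, Sequence, Tuple


def time_average(segments: Sequence[Tuple[float, float]]) -> float:
    """Duration-weighted average of a piecewise-constant waveform.

    Each segment is (duration, value); units only need to be consistent.
    """
    total = sum(d for d, _ in segments)
    if total <= 0:
        raise ValueError("total duration must be positive")
    return sum(d * v for d, v in segments) / total


def power_law_lifetime(exponent: float) -> Callable[[float], float]:
    """Placeholder lifetime model: lifetime proportional to V**exponent.

    A negative exponent means a higher stress voltage gives a shorter life.
    The prefactor is omitted because it cancels after normalization.
    For pMOSFETs, pass the absolute value of the stress voltage.
    """
    def model(v: float) -> float:
        if v <= 0:
            raise ValueError("stress voltage must be positive")
        return v ** exponent
    return model


def rank_by_normalized_lifetime(
    avg_stress: Dict[str, float],
    model: Callable[[float], float],
) -> List[Tuple[str, float]]:
    """Return (device, normalized lifetime) pairs, most degraded first.

    Lifetimes are divided by the longest one, so the least stressed
    device gets 1.0 (an assumed normalization, for illustration only).
    """
    raw = {name: model(v) for name, v in avg_stress.items()}
    longest = max(raw.values())
    return sorted(((n, t / longest) for n, t in raw.items()), key=lambda p: p[1])


if __name__ == "__main__":
    # Hypothetical averaged gate stress (volts) from a first SPICE run.
    stress = {
        "M1": time_average([(1.0, 1.2), (1.0, 0.8)]),
        "M2": 1.2,
        "M3": 0.8,
    }
    for name, life in rank_by_normalized_lifetime(stress, power_law_lifetime(-2.0)):
        print(f"{name}: {life:.3f}")
```

The devices at the top of the returned list are the candidates for substitution with failure equivalent circuit models in the second run; that substitution is netlist-specific and is not shown here.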

The model parameters for each failure equivalent circuit are calculated with a dedicated MATLAB routine, which contains both predefined device/process parameters and user-input parameters. In the second SPICE simulation run, the circuit performance and functionality are expected to change due to the incorporation of the failure equivalent circuit models, which may have changed the circuit internal connections, biasing networks, and local topology. The circuit functionality may or may not be preserved, depending on the magnitude of these additional circuit elements. With only a few SPICE simulation runs, the circuit-functional lifetime and failure behaviors can be easily predicted. With this information, circuit designers can quickly perform design iterations to improve the circuit reliability if the circuit-functional lifetime falls short of the specifications. They can also work on specific devices in a circuit, sweep their failure equivalent model parameters, and find the critical values, corresponding to a specific device-wearout level, at which the circuit function fails. From this kind of analysis, designers can explore the circuit-reliability margins and make appropriate performance and reliability tradeoffs. MaCRO also offers circuit designers a possible way to investigate the interactions between different wearout mechanisms. In the SRAM reliability-simulation work [14], we combined the two closely related mechanisms, TDDB and NBTI, in a single SPICE circuit model. We also investigated the different effects on the SRAM voltage transfer curves (VTC) when combining other mechanisms. The results indicate that HCI and TDDB have opposite effects on the VTC drift, while NBTI has no observable effect. Their interactive effects on the SRAM static noise margin (SNM) have also been investigated.

VI. CONCLUSION

In this paper, a new SPICE reliability simulation method is developed, which shifts the focus of the reliability analysis from the device wearout to the circuit functionality. A set of accelerated lifetime models and failure equivalent circuit


Fig. 7. MaCRO circuit reliability analysis and simulation algorithm.

models are proposed for common silicon intrinsic wearout mechanisms, including HCI, TDDB, and NBTI. In this new method, the accelerated lifetime models help to identify the most degraded transistors in a circuit based on their time-variant terminal voltage and current waveforms. Then, the failure equivalent circuit models are used to substitute those identified transistors in the SPICE simulation to investigate the impact of the device wearout on the circuit functionality. Device-wearout effects are lumped into a limited number of failure equivalent circuit model parameters, and the circuit functionality and performance degradation are determined by the magnitude of these model parameters. In this new method, it is unnecessary to perform a large number of small-step iterative SPICE simulations; therefore, the simulation time is greatly reduced. In addition, it is not imperative to accurately characterize all device and model parameters at each interim stage of the wearout process. Therefore, the testing and parameter-extraction work is also significantly reduced. These advantages allow circuit designers to perform efficient circuit reliability analysis and to develop practical design-for-reliability guidelines.

ACKNOWLEDGMENT

The authors would like to thank the reviewers and the editors of this paper for their insightful comments and practical suggestions on this work.



REFERENCES

[1] R. Blish, T. Dellin, S. Huber et al., Critical Reliability Challenges for the International Technology Roadmap for Semiconductor (ITRS), 2003, Austin, TX: Int. SEMATECH, Technology Transfer # 03024377A-TR.
[2] International Technology Roadmap for Semiconductors, Process Integration, Devices, and Structure. ITRS 2003 ed, pp. 32–36. [Online]. Available: http://www.itrs.net/Common/2003ITRS/PIDS2003.pdf
[3] Calculating MTTF When You Have Zero Failures, Staffordshire, U.K.: Technical Brief from Relex Software Corporation, pp. 1–4. [Online]. Available: http://www.relex.com/resources/art/calculating_MTTF_with_zero_failures.pdf
[4] C. Hu, “Future CMOS scaling and reliability,” Proc. IEEE, vol. 81, no. 5, pp. 682–689, May 1993.
[5] S. Aur, D. E. Hocevar, and P. Yang, “HOTRON—A circuit hot electron effect simulator,” in Proc. IEEE Int. Conf. Comput.-Aided Des. Tech. Dig., Nov. 1987, pp. 256–259.
[6] B. J. Sheu, W. J. Hsu, and B. W. Lee, “An integrated-circuit reliability simulator-RELY,” IEEE J. Solid-State Circuits, vol. 24, no. 2, pp. 473–477, Apr. 1989.
[7] A. T. Yang and S. M. Kang, “iSMILE: A novel circuit simulation program with emphasis on new device model development,” in Proc. 26th ACM/IEEE DAC, Jun. 1989, pp. 630–633.
[8] R. H. Tu, E. Rosenbaum, W. Y. Chan, C. C. Li, E. Minami, K. Quader, P. K. Ko, and C. Hu, “Berkeley reliability tools-BERT,” IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst., vol. 12, no. 10, pp. 1524–1534, Oct. 1993.
[9] X. D. Xuan, A. Chatterjee, A. D. Singh, N. P. Kim, and M. T. Chisa, “IC reliability simulator ARET and its application in design-for-reliability,” in Proc. 12th Asian Test Symp., Nov. 2003, pp. 18–21.
[10] Y. H. Shih, Y. Leblebici, and S. M. Kang, “ILLIADS: A fast timing and reliability simulator for digital MOS circuits,” IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst., vol. 12, no. 9, pp. 1387–1402, Sep. 1993.
[11] S. S. Chung, T. S. Chang, and P. C. Hsu, “A high level simulator feasible for reliability analysis of VLSI circuits,” in Proc. 4th Annu. IEEE Int. ASIC Conf. and Exhibit, Sep. 1991, p. P15-2/1-4.
[12] L. F. Wu, J. Fang, H. Yonezawa, and Y. Kawakami, “GLACIER: A hot carrier gate level circuit characterization and simulation system for VLSI design,” in Proc. IEEE 1st ISQED, Mar. 2000, pp. 73–79.
[13] X. Li and J. Bernstein, “Advanced semiconductor wearout mechanisms lifetime and SPICE equivalent circuit modeling,” IEEE Trans. Device Mater. Rel., to be published.
[14] ——, “SRAM circuit failure modeling and reliability simulation with SPICE,” IEEE Trans. Device Mater. Rel., to be published.
[15] Cadence Virtuoso UltraSim Full-Chip Simulator Datasheet, pp. 1–4, Cadence Design Systems, Inc. [Online]. Available: http://www.cadence.com/datasheets/4908_VirtuosoUS_DSfnl.pdf
[16] C. Hu, “IC reliability simulation,” IEEE J. Solid-State Circuits, vol. 27, no. 3, pp. 241–246, Mar. 1992.
[17] Reliability Simulation in Integrated Circuit Design, White paper, pp. 1–11, Cadence Design Systems, Inc. [Online]. Available: http://www.cadence.com/whitepapers/5082_ReliabilitySim_FNL_WP.pdf
[18] K. N. Quader, C. C. Li, R. Tu et al., “A bidirectional NMOSFET current reduction model for simulation of hot-carrier-induced circuit degradation,” IEEE Trans. Electron Devices, vol. 40, no. 12, pp. 2245–2254, Dec. 1993.
[19] M. Karam, W. Fikry, H. Haddara, and H. Ragai, “Implementation of hot-carrier reliability simulation in Eldo,” in Proc. IEEE Int. Symp. Circuits and Systems (ISCAS), 2001, vol. 5, pp. 515–518.
[20] P. C. Li and I. N. Hajj, “Computer-aided redesign of VLSI circuits for hot-carrier reliability,” IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst., vol. 15, no. 5, pp. 453–464, May 1996.
[21] W. Jiang, H. Le, J. Chung, and T. Kopley, “Assessing circuit-level hot-carrier reliability,” in Proc. IEEE 36th IRPS, Mar. 1998, pp. 173–179.
[22] G. La Rosa, F. Guarin, S. Rauch, A. Acovic, J. Lukaitis, and E. Crabbe, “NBTI-channel hot carrier effects in PMOSFETs in advanced CMOS technologies,” in Proc. IEEE 35th IRPS, Apr. 1997, pp. 282–286.
[23] C. Z. Yu, Y. Liu, A. Sadat, and J. S. Yuan, “Impact of temperature-accelerated voltage stress on PMOS RF performance,” IEEE Trans. Device Mater. Rel., vol. 4, no. 4, pp. 664–669, Dec. 2004.
[24] K. Sakui, S. S. Wong, and B. A. Wooley, “The effects of hot carriers generation on the operation of neighboring devices and circuits,” in VLSI Symp. Tech. Dig., May 1993, pp. 11–12.
[25] T. Horiuchi, “In-circuit hot-carrier model and its application to inverter chain optimization,” IEEE Trans. Electron Devices, vol. 43, no. 9, pp. 1428–1432, Sep. 1996.
[26] X. Li, B. Huang, J. Qin, X. Zhang, M. Talmor, Z. Gur, and J. B. Bernstein, “Deep submicron CMOS integrated circuit reliability simulation with SPICE,” in Proc. IEEE 6th ISQED, Mar. 2005, pp. 382–389.
[27] M. M. Kuo, K. Seki, P. M. Lee, J. Y. Choi, P. K. Ko, and C. Hu, “Simulation of MOSFET lifetime under ac hot-electron stress,” IEEE Trans. Electron Devices, vol. 35, no. 7, pp. 1004–1011, Jul. 1988.
[28] W. J. Hsu, B. J. Sheu, S. M. Gowda, and C. G. Hwang, “Advanced integrated-circuit reliability simulation including dynamic stress effects,” IEEE J. Solid-State Circuits, vol. 27, no. 3, pp. 247–257, Mar. 1992.
[29] Y. Leblebici and S. M. Kang, “Simulation of hot-carrier induced MOS circuit degradation for VLSI reliability analysis,” IEEE Trans. Rel., vol. 43, no. 2, pp. 197–206, Jun. 1994.
[30] N. Hwang and L. Forbes, “Hot-carrier induced series resistance enhancement model (HISREM) of nMOSFET’s for circuit simulation and reliability projections,” Microelectron. Reliab., vol. 35, no. 2, pp. 225–239, Feb. 1995.
[31] E. Y. Wu, E. J. Nowak, A. Vayshenker, W. L. Lai, and D. L. Harmon, “CMOS scaling beyond the 100-nm node with silicon-dioxide-based gate dielectrics,” IBM J. Res. Develop., vol. 46, no. 2/3, pp. 287–298, Mar./May 2002.
[32] E. Wu, J. Sune, W. Lai et al., “Interplay of voltage and temperature acceleration of oxide breakdown for ultra-thin oxides,” Microelectron. Eng., vol. 59, no. 1–4, pp. 25–31, Nov. 2001.
[33] E. Wu, J. Sune, W. Lai, E. Nowak, L. McKenna, A. Vayshenker, and D. Harmon, “Interplay of voltage and temperature acceleration of oxide breakdown for ultra-thin gate oxides,” Solid State Electron., vol. 46, no. 11, pp. 1787–1798, Nov. 2002.
[34] E. Wu and J. Sune, “Power-law voltage acceleration: A key element for ultra-thin gate oxide reliability—invited paper in special issue of microelectronics reliability,” Microelectron. Reliab., vol. 45, no. 12, pp. 1809–1834, Dec. 2005.
[35] S. Zafar, B. Lee, J. Stathis, A. Callegari, and T. Ning, “A model for negative bias temperature instability (NBTI) in oxide and high κ pFETs,” in VLSI Symp. Tech. Dig., 2004, pp. 208–209.
[36] S. Zafar, “Statistical mechanics based model for negative bias temperature instability induced degradation,” J. Appl. Phys., vol. 97, no. 1, pp. 1–9, Jan. 2005.
[37] “Methods for calculating failure rates in units of FITs,” JEDEC Publication, Jul. 2001, Arlington, VA: JEDEC Solid State Technology Association.
[38] J. Srinivasan, S. V. Adve, P. Bose, J. A. Rivers, and C.-K. Hu, “RAMP: A model for reliability aware microprocessor design,” IBM Research Division, Yorktown Heights, NY, IBM Research Rep. RC23048, Dec. 2003.

Xiaojun Li received the B.S. degree in physics and the M.S. degree in semiconductor device and physics from Wuhan University, Wuhan, China, in 1995 and 1998, respectively, and the M.S. and Ph.D. degrees in microelectronics reliability engineering from the University of Maryland, College Park, in 2004 and 2005, respectively. His research work focused on deep submicrometer CMOS VLSI circuit reliability modeling, simulation, and design. Since 2005, he has been a Quality Reliability Engineer with the Flash Memory Group, Intel Corporation, CA. His work includes CMOS circuit failure modeling and the development of Flash reliability simulation and prediction tools.

Jin Qin received the M.S. degree in reliability engineering from the University of Maryland, College Park, in 2004. He is currently working toward the Ph.D. degree in reliability engineering at the same university. His research interests include reliability testing, reliability data analysis, and microelectronic system reliability estimation.


Bing Huang received the B.S. degree in mining engineering from the University of Science and Technology of Beijing, Beijing, China, in 1977, and the M.S. degree in nuclear engineering from Tsinghua University, Beijing, in 2000. He is currently working toward the Ph.D. degree in reliability engineering at the University of Maryland, College Park. In 2001, he joined the Center for Reliability Engineering, University of Maryland, as a Research Assistant responsible for SRAM accelerated testing. Since 2004, he has worked on a NASA project studying the impact of microprocessor hardware faults on software reliability. His research interests include microelectronic device reliability modeling and testing, and microprocessor fault modeling and simulation.

Xiaohu Zhang received the M.S. and B.S. degrees in mechanical engineering from Beijing Institute of Technology, Beijing, China, in 1995 and 1998, respectively. He is currently working toward the Ph.D. degree in the microelectronic reliability program at the University of Maryland, College Park. His research interests include power devices, microelectronic device modeling, and reliability analysis.


Joseph B. Bernstein (S’81–M’89–SM’01) received the Ph.D. degree in electrical engineering from Massachusetts Institute of Technology (MIT), Cambridge, in 1990. He is an Associate Professor of reliability engineering at the Department of Mechanical Engineering, University of Maryland, College Park, with appointments in electrical engineering and the Institute for Research in Electronics and Applied Physics. He is actively involved in several areas of microelectronics reliability and physics of failure research including power device reliability, gate-oxide integrity, radiation effects, MEMS, and laser programmable metal interconnect. He supervises the laboratory for laser processing of microelectronic devices and is the Head of the microelectronics device reliability program. His research interests include thermal, mechanical, and electrical interactions of failure mechanisms of ULSI devices. He also works extensively with the semiconductor industry on projects relating to laser processing for defect avoidance, programmable interconnect, and repair in microelectronic circuits and packaging. He was selected as a Fulbright Senior Researcher/Lecturer and set up a joint center for reliable electronics with Tel Aviv University, Tel Aviv, Israel.