

IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, VOL. 33, NO. 1, JANUARY 2014

Cross-Layer Modeling and Simulation of Circuit Reliability

Yu Cao, Senior Member, IEEE, Jyothi Velamala, Member, IEEE, Ketul Sutaria, Student Member, IEEE, Mike Shuo-Wei Chen, Member, IEEE, Jonathan Ahlbin, Member, IEEE, Ivan Sanchez Esqueda, Member, IEEE, Michael Bajura, Member, IEEE, and Michael Fritze, Member, IEEE

Abstract—Integrated circuit design in the late CMOS era is challenged by ever-increasing variability and reliability issues. The situation is further compounded by real-time uncertainties in workload and ambient conditions, which dynamically influence the degradation rate. To improve design predictability and guarantee system lifetime, accurate modeling and simulation tools for reliability are essential to both digital and analog circuits. This paper presents cross-layer solutions for emerging reliability threats, including: 1) device-level modeling of reliability mechanisms, such as transistor aging and its statistical behavior; 2) circuit-level long-term aging models that capture unique operation patterns in digital and analog design, and directly predict the degradation; and 3) simulation methods for very-large-scale designs. Built on the long-term model, the new methods significantly enhance the accuracy and efficiency of reliability analysis. As validated by silicon data, these solutions close the gap between the underlying reliability physics and circuit/system design for resilience.

Index Terms—Bias temperature instability, circuit simulation, integrated circuit reliability, reliability modeling.

I. Introduction

THE challenges of designing and manufacturing robust integrated systems are enormous, especially in the presence of multiple variability and reliability issues, such as bias temperature instability (BTI), channel hot carrier (CHC), time-dependent dielectric breakdown (TDDB), electromigration (EM), and their interaction with variations [1]–[6]. While some of them can be handled by the manufacturing process, comprehensive design and test solutions are necessary to assess and manage the impact of unreliability. This in turn requires accurate and efficient reliability models. The situation is further compounded by real-time uncertainties in workload and ambient conditions, which dynamically influence the degradation rate [5], [7]. To improve design predictability and guarantee system lifetime, modeling and simulation tools for reliability are essential to both digital and analog circuits [6], [8]–[10].

A. Definition of Device and Circuit Reliability

Historically, reliability effects such as TDDB and EM have been evaluated against an empirical threshold of performance shift, as shown in Fig. 1. Since these effects usually induce sudden and permanent damage, the exact value of the reliability threshold has only a marginal impact on the lifetime [Fig. 1(a)] [11]. With the aggressive scaling of CMOS technology, new reliability mechanisms, especially negative BTI (NBTI) and CHC, have become more pronounced and dominate the lifetime of devices and circuits. In contrast to TDDB, their effect is more gradual, or even recoverable [12]; a small difference in the reliability threshold may result in a dramatic shift in the estimated lifetime [Fig. 1(b)]. In this case, a better approach is to provide a parametric prediction of the degradation, such that designers are able to examine the tradeoffs among speed, power, cost, and reliability [8]. This new trend requires the development of aging models and simulation tools that correctly capture the physics and efficiently support aging diagnosis during the design stage [6].

B. Reliability Simulation Tools

Manuscript received June 9, 2013; revised September 17, 2013; accepted October 14, 2013. Date of current version December 16, 2013. This work was supported in part by the Semiconductor Research Corporation and in part by the DARPA – Integrity and Reliability of Integrated Circuits Program under Grant HR0011-11-C-0067. This paper was recommended by Associate Editor V. K. Narayanan. Y. Cao, J. Velamala, and K. Sutaria are with the School of Electrical, Computer and Energy Engineering, Arizona State University, Tempe, AZ 85287 USA (e-mail: [email protected]; [email protected]; [email protected]). M. S.-W. Chen is with the Department of Electrical Engineering, University of Southern California, Los Angeles, CA 90089 USA (e-mail: [email protected]). J. Ahlbin, I. S. Esqueda, M. Bajura, and M. Fritze are with the Information Sciences Institute, University of Southern California, Arlington, VA 22203 USA (e-mail: [email protected]; [email protected]; [email protected]; [email protected]). Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/TCAD.2013.2289874

To date, research on aging mechanisms is active mainly within the device and reliability physics communities. The relative lack of design knowledge and CAD tools creates a further barrier to managing the impact of device degradation on circuit performance [9]. Leading industrial companies develop their own reliability models and tools. These tools, however, are usually proprietary, customized to a specific technology, and not available for general use [6]. Commercially available reliability tools suffer from inaccuracy in long-term prediction, mainly due to their extrapolation method, as well as from high simulation cost [13]–[17]. One example is the flow based on the Berkeley reliability simulation framework for CHC and BTI (Fig. 2) [18], [19].

0278-0070 © 2014 IEEE

CAO et al.: CROSS-LAYER MODELING AND SIMULATION OF CIRCUIT RELIABILITY


Fig. 1. Traditional definition of reliability is only appropriate for sudden failures. (a) For TDDB, the exact threshold value has little impact on the prediction of lifetime. But such a definition is not applicable to gradual shift. (b) Prediction of lifetime is highly sensitive to the threshold value, as observed in negative bias temperature instability (NBTI).

Fig. 2. Typical simulation flow used by conventional reliability tools. The extrapolation method is employed for long-term lifetime prediction.

1) First, several device-level reliability parameters are collected from silicon data at an accelerated stress condition.
2) Second, the simulator samples the netlist files and their input stimulus in the SPICE environment, to calibrate the workload and dynamic stress conditions.
3) Finally, based on short-term SPICE simulations, these tools extrapolate the aging rate and the lifetime.

In this approach, any small modeling or calibration error is amplified during the extrapolation, causing a dramatic error in the long-term prediction. In addition, the tracking of aged parameters through SPICE simulations makes these tools expensive in computation and memory usage [19], [20]. While they are able to calculate circuit degradation in small designs, performance evaluation of large-scale designs with millions of gates is impractical.

To overcome these problems, a generic simulation tool that efficiently predicts long-term degradation will be extremely useful. For large-scale VLSI designs, it should be sensitive to both process and digital operation conditions, such as supply voltage (Vdd), temperature (T), and input signal duty cycle (α) [6], [12]. These parameters are not spatially or temporally uniform, but vary significantly from gate to gate and from time to time, due to the uncertainty in circuit topologies and operations. For analog and mixed-signal (AMS) designs, reliability prediction is even more challenging [21]: although the aging-induced shift in transistor threshold voltage (Vth) does not affect the stress voltage in a digital logic gate, the Vth shift does have a significant impact on key AMS parameters, such as the bias condition, offset, and gain. In the AMS case, the extrapolation method based on short-term sampling does not account for the changing nature of the operating condition and thus may lead to incorrect long-term prediction.

In summary, it is necessary to develop new models and simulation methodologies to improve circuit reliability prediction in both digital and AMS design under dynamic operations. This paper presents a cross-layer approach toward this goal, from device-level modeling of reliability physics, to circuit-level long-term aging models that are customized for digital and AMS design, and to large-scale reliability simulation methods. The results are demonstrated with design examples that experience severe reliability challenges.
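The amplification of calibration errors by the extrapolation flow can be made concrete with a small sketch. It assumes, purely for illustration, that the degradation follows a power law k·t^n and that the fitted curve agrees with the true one at the end of the short-term window, so that only the exponent mismatch matters. The window length (1000 s), the 10-year horizon, and the exponent errors are hypothetical values, not data from any tool or from this paper.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def extrapolation_error(n_true, n_fit, t_short, t_long):
    """Relative error of the extrapolated shift at t_long.

    Both curves are power laws k * t**n that agree at t_short, so the
    prefactor cancels and only the exponent mismatch remains.
    """
    return (t_long / t_short) ** (n_fit - n_true) - 1.0


if __name__ == "__main__":
    t_short = 1.0e3                   # hypothetical short-term window (s)
    t_long = 10 * SECONDS_PER_YEAR    # hypothetical lifetime target (s)
    n_true = 0.16                     # illustrative time exponent
    for rel in (0.01, 0.05, 0.10):    # assumed relative error in the fitted exponent
        err = extrapolation_error(n_true, n_true * (1 + rel), t_short, t_long)
        print(f"exponent error {rel:.0%} -> shift error at 10 years {err:.1%}")
```

Under these assumptions, a 5% error in the fitted exponent already grows to roughly an 11% error in the predicted shift after ten years, and the error keeps growing with the ratio between the prediction horizon and the sampling window.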

II. Device-Level Modeling of Reliability Physics

In scaled CMOS technology, NBTI and CHC are the dominant mechanisms of long-term degradation. Their primary impact at the device level is the gradual increase in the magnitude of the transistor Vth [12]. Since Vth directly affects the delay of digital gates, the speed of logic paths decreases over time under these aging effects. Similarly, in AMS circuits, increased Vth degrades gain and other performance metrics. To estimate circuit degradation rates, the fundamental step is to model the Vth shift (ΔVth) under various conditions.

A. Compact Modeling of NBTI

NBTI mostly affects the pMOS transistor in standby mode, while positive BTI (PBTI) becomes more pronounced in high-k technologies [1]. There are two phases of NBTI, depending on the bias condition. When a negative gate voltage is applied to a pMOS transistor (e.g., Vg = 0, so that Vgs = −Vdd), degradation occurs; this phase is referred to as stress. When the gate voltage is 0 or positive (e.g., Vg = Vdd, so that Vgs = 0), NBTI-induced degradation can be partially recovered. This second phase is referred to as recovery and has a significant impact on aging analysis during dynamic circuit operation.

Reaction-Diffusion (RD): The classical description of NBTI is the reaction-diffusion theory, in which the two mechanisms contributing to NBTI are the breaking of Si-H bonds at the silicon substrate/oxide interface (i.e., reaction), followed by diffusion of hydrogen species from the interface into the gate oxide and the gate (i.e., diffusion), allowing the buildup of interface traps [22], [23]. Given the initial concentration of the Si-H bonds (N0) and the concentration of inversion carriers (P), the generation rate of interface traps, NIT, in the stress phase is given by

$$\frac{dN_{IT}}{dt} = k_F \left(N_0 - N_{IT}\right) P - k_R N_H N_{IT} \tag{1}$$

where kF and kR are the rates of the forward and reverse reactions, and NH is the density of hydrogen species. Akin to other reactions, the generation rate is an exponential function of electric field and temperature; it is also proportional to the density of reaction species [24]. Meanwhile, hydrogen species generated from the reaction diffuse away toward the gate, driven by the gradient of hydrogen density. This process is governed by

$$\frac{dN_H}{dt} = D_H \frac{d^2 N_H}{dx^2}. \tag{2}$$

The solution of (1) and (2) exhibits a power-law dependence on the stress time [12], [24], [25]

$$\Delta V_{th} = \frac{q N_{IT}}{C_{ox}} = k \sqrt{V_{gs} - V_{th}} \cdot t^{n}. \tag{3}$$

The exact value of the power-law index n indicates the type of diffusion species. For NBTI, n is about 0.16, and k has an exponential dependence on voltage and temperature. In the recovery phase, hydrogen species diffuse back to the interface and partially anneal the broken Si-H bonds. This phase is dominated by the diffusion process.

The RD theory partially explains the degradation behavior during the stress phase of NBTI. However, it predicts that the recovery phase is independent of the electric field and the temperature, which contradicts some measurement data [26], [27]. Furthermore, the RD model is not accurate enough to fully explain the statistics of NBTI or the fast recovery [28], [29]. These questions lead to an alternative theory, the trapping/detrapping mechanism [27], [30].

Trapping/Detrapping (TD): Charge trapping and detrapping at localized states (charge traps) resolve the limitations of the RD theory. Several works have presented the evidence, such as the discrete Vth shifts observed with fast measurement techniques [31]. Fig. 3 illustrates the physical picture of TD: a trap may capture a charge carrier or emit a trapped charge, thus modulating Vth [30]. In the stress phase, a trap has a higher probability of capturing charges, while the probability of emission is higher in the recovery phase. The occupation probability increases with voltage and temperature. TD events are stochastic in nature and hence, the TD model is based on the statistics of trap properties. The basic modeling assumptions in the TD theory are the same as the ones used for low-frequency noise, since the charge trapping dynamics in NBTI are similar to those causing low-frequency noise [30], [32].
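As a brief aside on the RD power law in (3), the sketch below illustrates why a gradual mechanism makes the lifetime so sensitive to the chosen failure threshold, the situation depicted in Fig. 1(b). It lumps all bias and temperature dependence into a single prefactor `k_eff`, which is a hypothetical value chosen only for illustration, and uses n = 0.16 from the text. The two threshold values are likewise assumptions.

```python
def lifetime_from_threshold(dvth_crit, k_eff, n=0.16):
    """Time at which the power law k_eff * t**n reaches dvth_crit.

    This simply inverts dVth = k_eff * t**n for t.
    """
    return (dvth_crit / k_eff) ** (1.0 / n)


if __name__ == "__main__":
    K_EFF = 2.0e-3  # hypothetical lumped prefactor, in V / s**n
    base = lifetime_from_threshold(0.050, K_EFF)     # 50 mV failure criterion
    relaxed = lifetime_from_threshold(0.055, K_EFF)  # criterion relaxed by 10%
    print(f"lifetime ratio for a 10% larger threshold: {relaxed / base:.2f}")
```

Because the lifetime scales as the threshold raised to the power 1/n, a 10% change in the criterion changes the estimated lifetime by a factor of about 1.8 when n = 0.16, independent of the assumed prefactor.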
The three main assumptions about the trap properties are:
1) the number of traps follows a Poisson distribution, which is common for a discrete process;
2) capture and emission time constants are uniformly distributed on the logarithmic scale; this microscopic assumption is critical to deriving the logarithmic time evolution at the macroscale;
3) the distribution of trap energy follows a U-shape, which is verified by silicon measurement and is key to the voltage and temperature dependence of aging.

Based on these assumptions, the average number of occupied traps under constant stress is given by

$$n(t) = N \cdot p_{01}(t, \tau_c, \tau_e) \tag{4}$$

where N is the Poisson parameter for the trap distribution and p01 is the trap occupation probability

$$p_{01}(t) = \frac{\tau_e \left[1 - \exp\left(-t/\tau_{eq}\right)\right]}{\tau_e + \tau_c} \tag{5}$$

which is a function of stress time t, capture and emission time constants (τc, τe), and 1/τeq = 1/τc + 1/τe. Using (4) and (5) and substituting the distributions for the time constants

$$n(t) = \phi \cdot \int_{10^{-p_{max}} t}^{10^{-p_{min}} t} \frac{1 - e^{-u}}{u}\, du \tag{6}$$

where φ is proportional to the number of available traps per transistor; pmax and pmin correspond to fastest and slowest traps, respectively [28], [30]. By integrating the number of occupied traps over the stress time, the average shift in Vth is obtained as

$$\Delta V_{th}(t) = \phi \cdot \left[A + B \log\left(1 + Ct\right)\right]. \tag{7}$$

Distinct from the power law of the RD model, the threshold voltage shift follows a logarithmic dependence on the stress time. The TD model also predicts the exponential dependence on voltage and temperature. Fig. 4 shows that the model prediction matches 65 nm silicon data under constant stress (i.e., static NBTI). Such a time evolution has a far-reaching impact on the aging behavior.

Fig. 3. Statistical trapping/detrapping events lead to the generation of interface charges during the aging process of NBTI.

Fig. 4. Under constant stress, compact model based on the TD theory matches the logarithmic time dependence.

Furthermore, Fig. 5 presents the impact of interface charges on carrier mobility, where the effective field Eeff ∝ (Vgs + Vth)/Tox [33]. Since the trapped charges mainly affect the strength of Coulomb scattering, the mobility only degrades in the region of low gate bias voltage. Therefore, the degradation of carrier mobility under NBTI only needs to be considered in some AMS designs, but is negligible in digital design [12]. The shift of other device parameters, such as the linear and the saturation current, can be predicted from the nominal device model with the shift of Vth and the mobility.

Aging Statistics: The TD mechanism is statistical in nature, introducing more uncertainty into aging analysis. Statistical NBTI

data are collected from a 65 nm test chip at 1.8 V [34]. From this stress data, parameters in the TD model (7) are extracted. Fig. 6 compares aging statistics with model prediction; the variation is attributed mainly to the randomness in parameter φ, which is proportional to the fluctuation of trap numbers; other model parameters do not suffer a significant amount of variation. By extracting the randomness in φ from short-term measurement (e.g., 20 ks), the TD model reliably predicts the increase of both mean and variance at longer stress time (e.g., 200 ks) [28].

The variation in ΔVth is also a function of transistor size, similar to the random variation in fresh Vth [35]. For a given length (L), devices with smaller width (W) have a larger initial Vth value. But ΔVth is independent of the initial Vth, as shown in Fig. 7(a). Fig. 7(b) illustrates that the variability of ΔVth due to stochastic NBTI is inversely proportional to the square root of transistor area, similar to the behavior of Vth variation due to random dopant fluctuations.

Fig. 5. NBTI only affects carrier mobility when gate bias is low, but not in the strong inversion region.

Fig. 6. TD model (7), fitted from data < 20 ks under 1.8 V, 125 °C, well predicts the long-term statistics at 200 ks. The shift of Vth approximately follows the normal distribution.

Fig. 7. (a) No correlation between fresh Vth (t = 0) and Vth shift from NBTI. (b) Decrease in the variation of φ with larger transistor size.

In design practice, the mean and the variance of statistical parameters [e.g., φ in (7)] need to be characterized from silicon data [28]. By embedding them into the model, aging

simulation tools are able to predict the fluctuations in the degradation through Monte Carlo sampling.

Dynamic NBTI Model: Today's circuits typically have a reduced activity factor through dynamic voltage scaling (DVS) to reduce power consumption. Since the degradation rate is highly sensitive to the stress voltage, DVS leads to a different amount of circuit aging compared to the constant stress case. Compact models for dynamic NBTI have been proposed based on the RD mechanism [25], [36], [37]. These models capture some important characteristics, such as the frequency independence through many switching cycles, but still have difficulties in explaining the fast recovery. From the perspective of TD, NBTI under changing stress voltages is explained as a dynamic equilibrium that is determined by both the initial number of occupied traps and the occupation probability, which is voltage dependent. Assuming the voltage switches from V1 to V2 at t = t0, a closed-form solution is obtained following a derivation similar to that of the static NBTI model [38]

$$\Delta V_{th}(t) = \phi_2 \cdot \left[A + B \log\left(1 + Ct\right)\right] + \phi_1 \cdot B \log\left(\frac{1 + C\left(t + t_0\right)}{1 + Ct}\right) \tag{8}$$

where φ1 corresponds to that under V1 and φ2 corresponds to that under V2. The Vth shift in (8) can be physically interpreted as a sum of two components, Δ1 and Δ2, which are proportional to φ1 and φ2, respectively. When the voltage changes to a lower value (i.e., V1 > V2), traps initially emit some excess charge carriers, since the occupation probability is lower under V2. Therefore, Δ2 dominates ΔVth initially, showing the recovery. When the stress under V2 continues for a longer time, Δ1 eventually takes over and ΔVth increases. If V2 > V1, the degradation rate increases at t0. Fig. 8 validates the dynamic model, especially the nonmonotonic behavior from V1 = 1.5 V to V2 = 1.2 V.

Fig. 8. TD model predicts the dynamic behavior under DVS, such as the transition period and the convergence to the constant stress condition.

In reality, both RD and TD events may occur together during NBTI, causing permanent damage and fast recovery, respectively [39], [40]. Depending on the fabrication process, one mechanism may be more pronounced than the other. While (7) and (8) present NBTI models based on the TD mechanism, RD models are widely available in [12] and other works. Furthermore, research efforts are active in decomposing these two mechanisms from experimental data, to provide a solid basis for model parameter extraction and reliable prediction of circuit aging [41], [42].

B. Compact Modeling of CHC

The effect of channel hot carriers is mainly observed in nMOS transistors. The main source of the hot carriers is the heating inside the channel during dynamic switching [43]. These energetic carriers may lead to impact ionization within the substrate and generate electrons or holes inside the channel. The heated carriers themselves can be injected into the gate dielectric. During this process, the injected carriers generate interface or bulk oxide defects. As a result, the transistor characteristics, such as threshold voltage and transconductance, degrade over time.
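Returning briefly to the TD-based NBTI expressions, (7) and (8) are simple enough to evaluate directly. The following is a minimal sketch rather than the authors' implementation: it reads t in (8) as the time elapsed since the voltage switch, takes log as the natural logarithm, and uses placeholder values for A, B, C, φ1, φ2, and t0. None of these numbers come from the paper or from silicon data.

```python
import math


def static_shift(t, phi, a, b, c):
    """Static TD model in the form of (7): phi * [A + B*log(1 + C*t)]."""
    return phi * (a + b * math.log(1.0 + c * t))


def dynamic_shift(t, t0, phi1, phi2, a, b, c):
    """Dynamic TD model in the form of (8), t seconds after the switch.

    phi1 belongs to the voltage applied for t0 seconds before the switch,
    phi2 to the voltage applied afterwards.
    """
    grow = phi2 * (a + b * math.log(1.0 + c * t))
    decay = phi1 * b * math.log((1.0 + c * (t + t0)) / (1.0 + c * t))
    return grow + decay


if __name__ == "__main__":
    # Placeholder parameters in arbitrary units (assumptions, not fitted values).
    A, B, C, T0 = 0.0, 1.0, 1.0, 1000.0
    for t in (0.0, 10.0, 1.0e3, 1.0e5):
        down = dynamic_shift(t, T0, phi1=2.0, phi2=1.0, a=A, b=B, c=C)
        up = dynamic_shift(t, T0, phi1=1.0, phi2=2.0, a=A, b=B, c=C)
        print(f"t = {t:8.0f} s  high-to-low: {down:6.2f}  low-to-high: {up:6.2f}")
```

With these placeholder values, the high-to-low case first recovers and then rises again, which is the nonmonotonic behavior described for Fig. 8, while the low-to-high case increases monotonically.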
The power-law model (t^n model, where t is the stress time) explains the time dependence of ΔVth due to CHC well, since CHC can be microscopically described as the generation and distribution of interface or oxide traps (charges). Similar to the RD mechanism for NBTI, there are two phases in this modeling framework [12]: the generation phase, where some Si-O bonds are broken under the electrical stress, and the distribution phase of trapped charges. While NBTI happens uniformly in the channel, CHC impacts primarily the drain end. The degradation of Vth caused by CHC is given by [12]

$$\Delta V_{th}(t) = \frac{q}{C_{ox}} K \sqrt{V_{gs} - V_{th}}\, \exp\left(\frac{E_{ox}}{E_0}\right) \exp\left(-\frac{\varphi_{it}}{q \lambda E_m}\right) t^{n} \tag{9}$$

where Eox is the vertical electric field and Em is the maximum lateral electric field. λ is interpreted as the hot-electron mean free path and ϕit as the minimum energy that a hot electron must have to create an impact ionization. The temporal degradation rate is governed by the time exponent, n, which is about 0.45. Unlike NBTI, CHC degradation does not recover. Furthermore, for AMS designs, the degradation of other device parameters, such as the mobility, needs to be considered for an accurate prediction [43].

C. Verilog-A Implementation

To support transistor-level simulation of aging effects, models for NBTI and CHC are implemented in SPICE using a voltage-controlled voltage source (VCVS), which is coded as a Verilog-A module [44], [45]. Fig. 9 illustrates the subcircuit model for a pMOS transistor. The VCVS emulates the increase in Vth as a decrease in Vg, resulting in the degradation of other transistor characteristics. This module is applied to each transistor, capturing the degradation rates due to various operation conditions. In the case of CHC, an additional voltage-controlled current source (VCCS) may be necessary between the drain and the source node, in order to simulate the degradation of carrier mobility and other parameters [43].

Fig. 9. Sub-circuit model for NBTI in the pMOS transistor. A similar module is used to implement CHC.

Fig. 10 demonstrates one circuit example using this implementation. It evaluates the frequency degradation (ΔF) of a voltage-controlled oscillator (VCO) under different stress voltages. Due to the different voltage dependence of NBTI and CHC, ΔF is dominated by NBTI under a lower stress voltage. As the voltage increases, the impact of CHC is more pronounced, indicated by a larger time exponent n.

Fig. 10. SPICE simulations with dynamic NBTI and CHC models predict the different behavior of ΔF under different stress voltages.
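The observation that a larger CHC contribution shows up as a larger time exponent can be illustrated with a small sketch. It assumes, for illustration only, that the total degradation is the sum of two power laws with the exponents quoted in the text (0.16 for NBTI and about 0.45 for CHC) and that the observed metric is proportional to this sum. The prefactors `a_nbti` and `a_chc` are hypothetical weights standing in for the voltage dependence of each mechanism.

```python
def effective_exponent(t, a_nbti, a_chc, n_nbti=0.16, n_chc=0.45):
    """Local log-log slope d(ln f)/d(ln t) of f(t) = a_nbti*t**n_nbti + a_chc*t**n_chc."""
    nbti = a_nbti * t ** n_nbti
    chc = a_chc * t ** n_chc
    return (n_nbti * nbti + n_chc * chc) / (nbti + chc)


if __name__ == "__main__":
    t = 1.0e4  # observation time in seconds (arbitrary choice)
    for a_chc in (0.0, 0.01, 0.1, 1.0):  # hypothetical CHC weights
        slope = effective_exponent(t, a_nbti=1.0, a_chc=a_chc)
        print(f"a_chc = {a_chc:5.2f}  effective exponent = {slope:.3f}")
```

The effective exponent moves from the NBTI value toward the CHC value as the CHC weight grows, which is the qualitative trend attributed to higher stress voltages in the discussion of Fig. 10.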

III. Circuit-Level Long-Term Modeling

The impact of reliability effects on IC design is in the long term (e.g., 5 years or even 25 years). This property distinguishes reliability from other circuit performance metrics, such as speed and power: the lifetime of an IC product can only be projected from short-term samples and cannot be directly measured. In this context, the short-term or instantaneous models have only limited use in design practice; what is needed is a long-term model that is capable of predicting the degradation after many cycles of operation. Such a long-term model should be sensitive to circuit-level parameters, including DVS, switching activity, signal patterns, temperature, etc. This section presents the development of long-term aging models for digital and analog designs separately, because they differ in their operation.

A. Digital Operation

A transistor in a digital design usually experiences a rail-to-rail input signal throughout its lifetime. For the CHC effect, the total amount of degradation is simply the integration of ΔVth from each stress period, since there is no recovery. For the NBTI effect, it is more complicated due to both the stress and the recovery effects. Fig. 11 presents ΔVth under NBTI during multiple cycles. Because of the structure of the logic path, the period under stress (e.g., Vg = 0 for a pMOS transistor) can be any value between 0% and 100% of the clock period (Tclk), which is characterized by the signal probability (or duty cycle) α.

Fig. 11. Shift of Vth by NBTI under multicycle operations. Model parameters in the long-term NBTI model are defined.

RD-based Model: Based on the reaction-diffusion mechanism, the stress phase is a combination of both the reaction and the diffusion effects, while the recovery phase is dominated by the diffusion effect. For each phase, a compact model of ΔVth(t) is expressed in (10) and (11) [7].

If (10) and (11) are directly used to predict the long-term shift of Vth at a time t, the device has to be simulated for m = t/Tclk cycles, which is impractical. However, it is possible to obtain a closed form for the upper bound on ΔVth as a function of α, Tclk, and t. As shown in Fig. 11

$$\Delta V_{ths,m} = \left( K_v\, \alpha^{1/2}\, T_{clk}^{1/2} + \sqrt[2n]{\Delta V_{thr,m-1}} \right)^{2n} \tag{12}$$

$$\Delta V_{thr,m} = \Delta V_{ths,m} \left[ 1 - \frac{2 \xi_1 T_e + \sqrt{\xi_2 C_T \left(1 - \alpha\right) T_{clk}}}{\left(1 + \delta\right) T_{ox} + \sqrt{C_T\, m\, T_{clk}}} \right] = \Delta V_{ths,m} \cdot \beta_m. \tag{13}$$

Using (12) and (13), ΔVths,m+1 is obtained as a function of ΔVths,m and then, repeatedly, ΔVths,i can be replaced by ΔVths,i−1 for i = m, ..., 1.
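To make the cost of the cycle-by-cycle approach concrete, the sketch below iterates the stress and recovery recursion in the form of (12) and (13) for a small number of cycles. It is an illustration under stated assumptions, not the paper's simulator: the symbol names follow (12) and (13), `c_t` stands for the constant that multiplies the time terms under the square roots in (13), and every numerical value in `PARAMS` is a placeholder chosen only so that the code runs.

```python
import math


def beta_m(m, alpha, t_clk, xi1, xi2, t_e, t_ox, delta, c_t):
    """Fraction of the stress-phase shift remaining after recovery in cycle m, as in (13)."""
    num = 2.0 * xi1 * t_e + math.sqrt(xi2 * c_t * (1.0 - alpha) * t_clk)
    den = (1.0 + delta) * t_ox + math.sqrt(c_t * m * t_clk)
    return 1.0 - num / den


def shift_after_cycles(m_cycles, k_v, n, alpha, t_clk, **recovery):
    """Iterate (12) and (13) cycle by cycle; returns (stress-end, recovery-end) shifts."""
    s = 0.0  # shift at the end of the stress phase
    r = 0.0  # shift at the end of the recovery phase
    for m in range(1, m_cycles + 1):
        s = (k_v * math.sqrt(alpha * t_clk) + r ** (1.0 / (2.0 * n))) ** (2.0 * n)
        r = s * beta_m(m, alpha, t_clk, **recovery)
    return s, r


# Placeholder values (assumptions for illustration, not extracted parameters).
PARAMS = dict(k_v=1.0e-3, n=0.16, alpha=0.5, t_clk=1.0e-3,
              xi1=0.9, xi2=0.5, t_e=1.2, t_ox=1.2, delta=1.0, c_t=50.0)

if __name__ == "__main__":
    for cycles in (10, 100, 1000):
        s, r = shift_after_cycles(cycles, **PARAMS)
        print(f"{cycles:5d} cycles: stress-end {s:.4f}, recovery-end {r:.4f}")
```

Each additional cycle costs one loop iteration, so ten years at a 1 GHz clock would require roughly 3 × 10^17 iterations per transistor. This is why a closed-form bound in terms of α, Tclk, and t is preferred over direct iteration.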
