Soft-Error Vulnerability of Sub-100-nm Flip-Flops
14th IEEE International On-Line Testing Symposium 2008

Tino Heijmen
NXP Semiconductors

978-0-7695-3264-6/08 $25.00 © 2008 IEEE. DOI 10.1109/IOLTS.2008.12

Abstract
The soft-error vulnerability of flip-flops has become an important factor in IC reliability in sub-100-nm CMOS technologies. In the present work the soft-error rate (SER) of a 65-nm flip-flop has been investigated with the use of alpha-accelerated testing. Simulations have been applied to study the flip-flop SER sensitivity in detail. Furthermore, an easy-to-use approach is presented to make an accurate estimation of the contribution of flip-flops to the SER of an IC. The method is applicable to frequencies well below 1 GHz. The approach is based on a set of expressions for the timing vulnerability factor (TVF) of the master and slave latches of the flip-flop. With this approach it is possible to make an accurate estimation of the flip-flop SER parameters.

1. Introduction
Radiation-induced soft errors are an increasingly important threat to the reliability of integrated circuits (ICs) processed in advanced CMOS technologies. Until recently the soft-error rate (SER) at the IC level was typically dominated by the contributions from the embedded SRAM cores. However, in sub-100-nm CMOS processes the contribution from logic is often non-negligible, especially if the embedded memories are protected with error-correction coding and the IC has to meet stringent SER requirements. Logic SER can be categorized into contributions from flip-flops and contributions from combinational logic. Soft errors in flip-flops are caused in a manner that is similar to upsets in SRAM bit-cells: the collection of radiation-induced charges results in a bit flip if the collected charge exceeds the critical value Qcrit. In contrast, charge collection in combinational logic leads to a voltage glitch, which only results in a soft error if the glitch is latched by a flip-flop. Hence, the SER due to upsets in combinational logic increases linearly with clock frequency. It has been reported that for chips running at GHz frequencies, e.g., microprocessors, the SER contribution from combinational logic can be as large as 10% [1]. However, for applications that run at frequencies well below 1 GHz the SER contribution from flip-flops is generally dominant over the combinational logic contribution.

There is a trend that the SER risk is increasing for several applications that typically run at moderate frequencies. Examples include the automotive and medical domains. Because of the lower frequency in these application domains, certain aspects of the simulation approach can be simplified compared to methodologies that are used for ICs running at GHz frequencies.

In the current paper a detailed analysis is presented of the soft-error vulnerability of flip-flops in sub-100-nm CMOS technologies. The focus is on edge-triggered D-type FFs without reset, preset, or hold functionality. These so-called “plain” FFs are widely used in IC designs. Previous measurements on 90-nm FFs showed that a “clear” FF (with reset functionality) has on average a 45% lower alpha-SER, but a 10% higher neutron-SER, compared to a plain FF [2].

In general, the contribution of a circuit element to the SER of an IC depends on three factors [3], [4]:
• nominal SER, i.e., the SER of an isolated circuit element under static conditions,
• timing vulnerability factor (TVF), i.e., the fraction of time that the circuit element in an IC is vulnerable,
• architectural vulnerability factor (AVF), i.e., the probability that a soft error in a circuit element is observed at the IC level.
The present work focuses on the nominal SER and the TVFs of the master and slave latches in flip-flops. The results can be used to estimate the SER of ICs. The AVF is outside the scope of the current paper.

The outline of the paper is as follows. In Sec. 2 the alpha-accelerated SER testing of a 65-nm flip-flop is discussed. The results are compared with data from a 90-nm flip-flop. In Sec. 3 the SER vulnerability is studied in detail with the use of circuit simulations and an analytical SER model. Section 4 treats the TVFs of the master and slave latches in a flip-flop. Section 5 discusses the contributions of the master and slave latches to the SER at the IC level. In Sec. 6 the different parts of our approach are combined into a method to estimate the flip-flop contribution to the SER of an IC. Finally, conclusions are presented in Sec. 7.

2. Alpha-accelerated SER tests
A dedicated test vehicle was used for the SER characterization of a 65-nm flip-flop (FF65) from a standard-cell library. The vehicle consists of scan chains with a total of 50k flip-flops. Alpha-accelerated SER tests were performed on this vehicle using an Am-241 alpha-emitting source.

Flip-flops, in contrast with SRAMs, generally have a SER that strongly depends both on the stored data and on the clock state. A flip-flop contains two latches, a master and a slave, which are sensitive during different parts of the clock cycle. For the flip-flops investigated in the current work the master latch is vulnerable to soft errors when the clock is high and the slave latch is sensitive when the clock is low. This is illustrated in Figure 1. Because of this, the flip-flops were tested under four different static conditions, i.e., with the clock signal fixed at either a high or a low voltage level and either an All-1 or an All-0 pattern written into the scan chains. The experimental data are compared with previously published SER results for a 90-nm flip-flop (FF90) [2].

Figure 1. Equivalent circuit of a master-slave flip-flop for (a) CLK=High and (b) CLK=Low. Arrows indicate circuit nodes sensitive to upsets.

Figure 2 shows the measured alpha-SER for FF90 and FF65 at VDD=1.2 V. For all four conditions the FF65 shows a lower SER than FF90, but the difference is smaller for CLK=High than for CLK=Low. The two flip-flops are comparable in schematic and in properties.

Figure 2. Alpha-SER of plain FFs. [Bar chart: alpha-SER (arb. units) of the 90-nm and 65-nm flip-flops for CLK=High and CLK=Low, each with Data=1 and Data=0.]

The slave is sensitive to upsets if the clock is low. The output driver of the flip-flop is the load of the slave latch. In FF65 the output driver is bigger than in FF90 and therefore has a relatively large gate capacitance. This results in a stabilization of the data stored in the slave latch. Therefore, the SER at CLK=Low for FF65 is substantially smaller than for FF90. This result shows that the gate capacitance plays an important role in the soft-error vulnerability of logic cells.

The schematic of the master latch, which is sensitive at CLK=High, is identical for the two flip-flops. For Data=1 the FF65 has a 40% lower SER, whereas the result is comparable for Data=0. The difference in SER is caused by subtle differences in the layout design. This result shows that details of the cell design have an effect on the SER. In many cases, a change in the schematic or layout makes the cell more sensitive in one condition and less sensitive in another.

3. SER modeling
Accelerated SER testing of a circuit element, such as a flip-flop, requires a dedicated test vehicle that includes a sufficiently high number of these elements. Therefore, in general only a very limited part of a library can be characterized with accelerated tests. An alternative is to use models to estimate the nominal SER. An accurate model can be obtained by calibrating the model parameters with experimental data. Such a model can also be applied to study the relative vulnerability of the different circuit nodes, as is discussed below.

It has been discussed in [2] that the SER of flip-flops can be accurately modeled if both electron and hole collection are taken into account. The following expression includes two separate terms, for the contributions from the collection of radiation-induced electrons and holes, respectively,

\[ \mathrm{SER} = \kappa \left[ A_{\mathrm{diff},N} \exp\left( -Q_{\mathrm{crit},e} / \eta_{\mathrm{elec}} \right) + A_{\mathrm{diff},P} \exp\left( -Q_{\mathrm{crit},h} / \eta_{\mathrm{hole}} \right) \right], \qquad (1) \]

where Adiff,N and Adiff,P denote the NMOS and PMOS drain diffusion area, respectively. Qcrit,e and Qcrit,h denote the critical charge for upsets due to the collection of electrons and holes. The parameters ηelec and ηhole denote the charge collection efficiency for electrons and for holes. ηelec and ηhole, together with the overall scaling factor κ, are determined by fitting the model expression to experimental SER data. The model agreed within 30% with a set of experimental alpha- and neutron-SER data for 90-nm flip-flops [2]. The model expression of Eq. (1) was fitted to the measured alpha-SER data for the 65-nm flip-flop discussed in Sec. 2. The critical charges Qcrit,e and Qcrit,h were calculated using SPICE-like circuit simulations, in which a current pulse is injected into specific circuit nodes. The pulse had the waveform [5]

\[ I_{\mathrm{pulse}}(t) = \frac{2}{\sqrt{\pi}} \, \frac{Q_{\mathrm{tot}}}{\tau} \sqrt{\frac{t}{\tau}} \, \exp\left( -\frac{t}{\tau} \right), \qquad (2) \]

where the timing parameter τ was set to 3 ps. It was discussed in [2] that applying this pulse width results in accurate simulation results.

The model expression of Eq. (1), using Qcrit,e and Qcrit,h values calculated from circuit simulations with the pulse current of Eq. (2), was used to evaluate the vulnerability of specific circuit nodes in FF65. In particular, the vulnerability of the two storage nodes in the two latches was studied in detail. Figure 3 schematically depicts the feedback loop that is used in both the master and the slave latch. The loop consists of a feed-forward inverter and a feedback tri-state inverter. The two inner transistors of the tri-state inverter have clock signals at the gate inputs. By definition, the latch stores a ‘1’ if the left node NL is low and the right node NR is high. The circles in Figure 3 denote the transistor drain junctions that are reverse-biased, for the cases in which the latch (a) stores a 1 and (b) stores a 0. Junctions that are reverse-biased are able to collect charges that are induced by ionizing particles, such as alpha particles. NMOS drains can collect electrons, whereas PMOS drains can collect holes. This charge collection results in a current pulse, which in this work is modeled with Eq. (2). If the latch stores a 1 then hole collection results in a current pulse at node NL and electron collection results in a pulse at node NR. The reverse order applies to Data=0.

Figure 3. Schematic of the feedback loop of a latch that (a) stores a 1 (NL low, NR high) and (b) stores a 0 (NL high, NR low); the circles denote the sensitive drain junctions.

The Qcrit,e and Qcrit,h of FF65 have been calculated with the use of circuit simulations. The results are listed in Table 1. In all cases there is a clear difference between Qcrit,e and Qcrit,h. The smaller of the two critical charges dominates the SER. Which of the two critical charges is the smaller depends on the stored data, both for the master and the slave latch. The smallest Qcrit value is in all cases associated with a current pulse at node NL. This means that the latch is most vulnerable to charge collection by the drain junctions of the transistors in the tri-state feedback inverter. This can be explained by a difference in drive strength of the feed-forward inverter and the feedback tri-state inverter. The transistors in the feed-forward inverter are relatively wide in order to have a large drive current and therefore a small internal propagation delay of the flip-flop. In contrast, the feedback tri-state inverter is typically designed with minimum-sized transistors, because it is not in the critical path. Because the feed-forward inverter can supply a large current, it is able to compensate for a current pulse at node NR. In contrast, the small feedback tri-state inverter is weak in compensating for a pulse at node NL.

To study the difference in vulnerability of the two storage nodes in more detail, the SER contributions from upsets at nodes NL and NR have been calculated with Eq. (1), using the Qcrit data of Table 1. The results, listed in Table 2, show that the latch SER is dominated by upsets caused by current pulses injected into node NL. The difference in vulnerability is extremely large for the slave latch, where node NR is practically immune to upsets.
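The node-level comparison described above can be illustrated with a small sketch of Eq. (1). This is a minimal sketch, not the calibrated model: the function names are hypothetical, and the values used for κ, the diffusion areas, and the collection efficiencies are placeholders, since the fitted values are not given here. Only the two critical charges are taken from Table 1 (master latch, Data=1).

```python
import math


def collection_term(q_crit_fc, eta_fc, a_diff=1.0):
    """One term of Eq. (1) without kappa: A_diff * exp(-Qcrit / eta)."""
    return a_diff * math.exp(-q_crit_fc / eta_fc)


def latch_ser(q_crit_e_fc, q_crit_h_fc, eta_elec_fc, eta_hole_fc,
              a_diff_n=1.0, a_diff_p=1.0, kappa=1.0):
    """Eq. (1): returns (total SER, electron part, hole part) in arbitrary units."""
    electron_part = kappa * collection_term(q_crit_e_fc, eta_elec_fc, a_diff_n)
    hole_part = kappa * collection_term(q_crit_h_fc, eta_hole_fc, a_diff_p)
    return electron_part + hole_part, electron_part, hole_part


# Master latch, Data=1 (Table 1): Qcrit,e = 8.7 fC (node NR), Qcrit,h = 4.3 fC (node NL).
# ETA_FC is a placeholder collection efficiency, NOT a fitted value from the paper.
ETA_FC = 2.0
total, electron_part, hole_part = latch_ser(8.7, 4.3, ETA_FC, ETA_FC)
print(f"hole/electron SER ratio: {hole_part / electron_part:.1f}")
```

With equal placeholder areas and efficiencies, the term with the smaller critical charge dominates exponentially, which is the qualitative behaviour described in the text. Reproducing the absolute values of Table 2 would require the fitted parameters and the actual diffusion areas.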

Table 1. Critical charges.

Latch   Data  Electron pulse  Qcrit,e (fC)  Hole pulse  Qcrit,h (fC)
Master  1     NR              8.7           NL          4.3
Master  0     NL              3.7           NR          6.7
Slave   1     NR              >100          NL          6.1
Slave   0     NL              6.0           NR          26.2
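The critical charges in Table 1 come from circuit simulations in which a current pulse is injected into a node. As a minimal sketch, assuming the pulse has the form I(t) = (2/√π)(Qtot/τ)√(t/τ)exp(−t/τ) with τ = 3 ps, the snippet below evaluates the waveform and checks numerically that it delivers the total charge Qtot. The function names and integration settings are illustrative choices, and no circuit simulation is performed.

```python
import math

TAU_S = 3e-12  # timing parameter tau = 3 ps, as stated in the text


def pulse_current(t_s, q_tot_c, tau_s=TAU_S):
    """Current (A) at time t_s (s) for a pulse carrying total charge q_tot_c (C)."""
    if t_s <= 0.0:
        return 0.0
    x = t_s / tau_s
    return (2.0 / math.sqrt(math.pi)) * (q_tot_c / tau_s) * math.sqrt(x) * math.exp(-x)


def injected_charge(q_tot_c, tau_s=TAU_S, n_tau=40.0, steps=20000):
    """Trapezoidal integral of the pulse from 0 to n_tau * tau_s (C)."""
    dt = n_tau * tau_s / steps
    total = 0.0
    for i in range(steps):
        i0 = pulse_current(i * dt, q_tot_c, tau_s)
        i1 = pulse_current((i + 1) * dt, q_tot_c, tau_s)
        total += 0.5 * (i0 + i1) * dt
    return total


q_tot = 4.3e-15  # 4.3 fC: Qcrit,h of the master latch for Data=1 (Table 1)
charge = injected_charge(q_tot)
print(f"integrated charge: {charge * 1e15:.3f} fC")
```

Under this assumed waveform the current peaks at t = τ/2, so with τ = 3 ps most of the charge is delivered within a few picoseconds.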

Figure 4. Schematic of two flip-flops separated by a series of inverters. [Schematic labels: data input, D and Q pins of FF1 and FF2, CLK pins driven by the clock.]

Table 2. SER (in arbitrary units) due to current pulses at nodes NL and NR. Values marked (e) and (h) correspond to upsets due to the collection of electrons and holes, respectively.

Latch   Data  SER at NL  SER at NR  Ratio
Master  1     21 (h)     1.3 (e)    16×
Master  0     85 (e)     1.4 (h)    63×
Slave   1     3.9 (h)    ≈0 (e)     >100
Slave   0     11 (e)     ≈0 (h)     >100

The near-immunity of node NR in the slave latch arises because not only is the node charge of NR stabilized by the large feed-forward inverter, but the feedback loop is also stabilized by the large gate capacitance of the output driver. This example demonstrates that the contributions from both electron and hole collection need to be taken into account. For example, Table 2 shows for the master latch that if Data=1 the SER due to hole collection is 16× as high as the SER due to electron collection. In contrast, if Data=0 electron collection results in a 63× higher SER contribution compared to hole collection. Therefore, a model that considers only upsets due to the collection of electrons would not give accurate SER values.

4. Timing vulnerability factor
The contribution of a flip-flop to the SER of an IC is lower than the nominal SER of the flip-flop under static conditions, because of derating factors [3], [4]. Firstly, the flip-flop will not be sensitive to soft errors during the complete clock period, but only during a window of vulnerability (WoV). The timing vulnerability factor (TVF) denotes the fraction of time that the flip-flop is vulnerable. In accelerated tests, the SER is measured under static conditions, i.e., with the clock signal fixed at either a low or a high voltage level. Applying the flip-flop under dynamic conditions results in a derating, which is accounted for by the TVF. The TVF of sequentials has been discussed elsewhere [3]. Below we discuss the specific case of a master-slave flip-flop, operated at a relatively low clock frequency. Please note that the TVF is associated here with the propagation of a soft error that has been generated in a sequential element. The generation and propagation of soft errors in combinational logic was not studied in the work presented here.

Secondly, the SER contribution from the flip-flop depends also on the (micro-)architecture of the IC and on its application. This dependency is represented by the architectural vulnerability factor (AVF), which equals the fraction of soft errors in the circuit that are observed at the system level. A discussion of the AVF is outside the scope of the current paper.

The TVFs of the master and slave latches of a flip-flop depend on the propagation delay of the combinational logic separating the flip-flop (FF1) from the next down-stream flip-flop (FF2). This is illustrated in Figure 4, where a series of inverters represents the combinational logic block. When the clock is high, the master latch is vulnerable. A bit flip in the master latch of FF1 will propagate through the combinational logic and will be latched in FF2 at the next rising clock edge. Let us assume that the clock has a 50% duty cycle. If the propagation delay tprop of the combinational block is less than 50% of the clock cycle time tcycle, then any bit flip in the master latch of FF1 will arrive at FF2 in time to be latched. In this case the TVF of the master latch equals ½. If tprop > ½ tcycle, then a bit flip in the master latch has to be generated at least tprop before the next rising clock edge, i.e., within the first tcycle – tprop of the clock period, in order to arrive at FF2 in time to be latched. In this case the TVF equals 1 – (tprop / tcycle).

The impact of tprop is different for the TVF of the slave latch, which is vulnerable when the clock is low. If tprop ≤ ½ tcycle, a bit flip in the slave latch has to be generated within the first ½ tcycle – tprop of the clock-low phase in order to arrive in time at FF2 to be latched. Thus, the TVF then equals ½ – (tprop / tcycle). If tprop > ½ tcycle, then a bit flip in the slave latch of FF1 cannot be propagated to FF2.
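The case distinctions for the master and slave latch can be collected into two small functions. This is a minimal sketch assuming a 50% duty cycle and 0 ≤ tprop ≤ tcycle; the function names are illustrative, and the tprop values in the example are chosen only for demonstration.

```python
def tvf_master(t_prop, t_cycle):
    """TVF of the master latch (vulnerable while the clock is high)."""
    ratio = t_prop / t_cycle
    if ratio <= 0.5:
        return 0.5
    return max(0.0, 1.0 - ratio)


def tvf_slave(t_prop, t_cycle):
    """TVF of the slave latch (vulnerable while the clock is low)."""
    ratio = t_prop / t_cycle
    return max(0.0, 0.5 - ratio)


# Example with t_cycle = 2 ns and three illustrative propagation delays (in ns).
for t_prop_ns in (0.5, 1.0, 1.5):
    print(t_prop_ns, tvf_master(t_prop_ns, 2.0), tvf_slave(t_prop_ns, 2.0))
```

Both functions take the delay and the cycle time in the same unit, so only the ratio tprop / tcycle matters.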

The results are summarized in Table 3. In comparison, the TVF of a flow-through latch equals ½ – ½ (tprop / tcycle), for all values of tprop [3]. The TVF also depends on the internal propagation delay and the setup, rise, and fall times of the flip-flop and on the clock skew and jitter [3], but these dependencies have been ignored here. This simplification is valid if these variables are much smaller than tcycle. This is generally true if the clock frequency is relatively low. Note that when the frequency is increased (i.e., tcycle is reduced), then the TVF either decreases or remains constant.

Table 3. Flip-flop TVF as a function of tprop.

Latch   tprop ≤ tcycle / 2      tprop > tcycle / 2
Master  ½                       1 – (tprop / tcycle)
Slave   ½ – (tprop / tcycle)    0

The expressions in Table 3 were validated with circuit simulations in which current pulses were injected into circuit nodes of FF1, using a simulation setup similar to that in [3]. The current pulses were strong enough to induce bit flips in the corresponding latch. The effect of the disturbance was monitored at the output Q of FF2. The value of tcycle was equal to 2 ns, while tprop was varied by including a variable number of inverters between the two flip-flops. The rise and fall times of the clock signal equaled 100 ps. The results are shown in Figure 5. The data obtained from the circuit simulations agree quite well with the expressions of Table 3. This result shows that for circuits with moderate clock frequencies (≤ 500 MHz) these simple expressions are an efficient and sufficiently accurate means to estimate the TVF of the latches in a flip-flop. The simulated TVFs for the 0→1 and 1→0 bit flips differ by 10% in some cases. This is better than the 30% accuracy of the analytical model for the nominal SER. The variations in the TVF are caused by differences in the circuit design of the flip-flop, which affect its timing characteristics. This shows again that the details of the circuit design have an impact on the soft-error vulnerability of a flip-flop.

Figure 5. TVF from simple expressions compared with circuit simulation results. [Plot: TVF (0 to 0.5) versus tprop / tcycle (0 to 1.0); simulated series master: 1→0, master: 0→1, slave: 1→0, slave: 0→1, together with the Table 3 expressions for master and slave.]

To analyze the SER of an IC the average TVF of the flip-flops is important. This average depends on the propagation delay distribution for the combinational logic paths in the IC. If a uniform distribution is assumed then from the expressions in Table 3 it follows that the average TVF for the master latch equals 3/8, whereas the average slave TVF is 1/8. If most paths have a relatively short propagation delay, then the average TVF values will be larger. As a result, if the clock frequency is lowered the average TVF of both the master and the slave latches is increased. Figure 5 shows that the TVF is larger for the master latch than for the slave latch. Because of this, the master latch will in general contribute more to the flip-flop SER and, consequently, to the SER at the IC level than the slave latch. Therefore, if a flip-flop design needs to be optimized for lower SER, improvement of the master latch is most effective.
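The averages of 3/8 and 1/8 follow from integrating the expressions of Table 3. As a worked check, assume that x = tprop / tcycle is uniformly distributed on the interval [0, 1]; this interval is an assumption, since the text only states that the distribution is uniform.

```latex
\begin{align*}
\langle \mathrm{TVF}_M \rangle
  &= \int_0^{1/2} \tfrac{1}{2}\,dx + \int_{1/2}^{1} (1 - x)\,dx
   = \tfrac{1}{4} + \tfrac{1}{8} = \tfrac{3}{8}, \\
\langle \mathrm{TVF}_S \rangle
  &= \int_0^{1/2} \left( \tfrac{1}{2} - x \right) dx + \int_{1/2}^{1} 0\,dx
   = \tfrac{1}{8}.
\end{align*}
```

Both averages lie below the worst-case value of ½, and the master average is three times the slave average, consistent with the conclusion that the master latch generally contributes more.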

5. SER of master and slave latch
To calculate the contribution of a master-slave flip-flop to the SER at the IC level, the contributions from the master and the slave latch have to be summed,

\[ \mathrm{SER}_{FF} = \mathrm{SER}_{M} + \mathrm{SER}_{S}. \qquad (3) \]

The AVF is the same for the two latches, but the nominal SER and the TVF are in general different for the master and the slave,

\[ \mathrm{SER}_{FF} = \left( \mathrm{SER}_{M}^{\mathrm{nom}} \, \mathrm{TVF}_{M} + \mathrm{SER}_{S}^{\mathrm{nom}} \, \mathrm{TVF}_{S} \right) \mathrm{AVF}, \qquad (4) \]

where SERnom denotes the nominal SER. The maximum value of TVFM and TVFS is ½ (assuming that the duty cycle of the clock is 50%). However, the WoV of the flip-flop can very well be larger than ½ tcycle, because the master and the slave are vulnerable during different parts of the clock period. This is different from what was stated in [4], where the TVF of a master-slave flip-flop was discussed. Please note that TVF is called timing derating (TD) in [4]. It follows from Eq. (4) that the individual TVFs of the master and slave should be taken into account, rather than an average TVF for the complete flip-flop. The nominal SER values of the master and the slave latch are obtained from accelerated test results or from models. If p1 and p0 denote the probabilities that the flip-flop stores a 1 or a 0, respectively, then the nominal SER of the master and the slave is given by,

\[ \mathrm{SER}_{M}^{\mathrm{nom}} = p_1 \, \mathrm{SER}_{M}^{\mathrm{Data}=1} + p_0 \, \mathrm{SER}_{M}^{\mathrm{Data}=0}, \]
\[ \mathrm{SER}_{S}^{\mathrm{nom}} = p_1 \, \mathrm{SER}_{S}^{\mathrm{Data}=1} + p_0 \, \mathrm{SER}_{S}^{\mathrm{Data}=0}. \qquad (5) \]

6. Flip-flop SER estimation approach
The alpha-induced SER of the investigated FF65 is about 20% lower compared to a similar FF in 90 nm. This is in agreement with other studies, which predict that the SER per bit of 90-nm and 65-nm latches are comparable, with a slightly lower average SER in 65 nm for a set of different latches. This suggests that the SER of flip-flops and latches starts to follow the trend that is observed for the SRAM SER per bit. This trend has its peak at the 130-nm or 90-nm node (depending on the details of the process technology) and shows a decrease in more advanced technologies. However, in general the SER scaling trend is more ambiguous for flip-flops than for SRAM, because details of the flip-flop cell schematics and layout can substantially affect the nominal SER.

If accelerated SER data are not available, the use of a model that has been calibrated with experimental data is an alternative for the SER characterization of a flip-flop. If neither experimental data nor models are available, the value of 0.001 FIT/flip-flop is a reasonable guess for the nominal SER of flip-flops in sub-100-nm bulk CMOS technologies [7]. For a master-slave flip-flop this implies that both for the master and for the slave latch a nominal SER of 0.001 FIT/latch is applied.

The simulation study using the model of Eq. (1) shows that both the contribution from electron collection (by NMOS drains) and hole collection (by PMOS drains) must be taken into account. It also demonstrates that one of the storage nodes is much more vulnerable to radiation-induced current pulses than the complementary node. This information is useful when optimizing the flip-flop design. Although it is not feasible to reduce the SER by orders of magnitude in this way, optimizing flip-flop cell layouts can help to lower the logic SER significantly.

The study presented here shows that for relatively low clock speeds an estimation of the TVF of the master and slave latches can be made. This estimation is based on the distribution of the propagation delay of the combinational logic blocks separating the flip-flops. In the worst case, the TVF of both the master and the slave latch is equal to ½. If the propagation delay distribution of the combinational logic blocks is available the TVFs of the latches can be calculated more accurately. The TVF will then be lower than ½. Especially for the slave latch the difference can be significant. Application of the expressions listed in Table 3 allows a more accurate SER calculation.

7. Conclusions
The current paper offers an approach to estimate the SER-related flip-flop parameters that are essential in the SER simulation of sub-100-nm ICs. The nominal SER of flip-flops can be obtained from accelerated SER test data or from models calibrated with experimental SER data. The nominal flip-flop SER does not show large differences between the 90-nm and 65-nm nodes. The TVF of the flip-flops can be calculated with the use of propagation delay distribution data. Alternatively, a worst-case TVF of ½ can be applied for both the master and the slave latch. Future work is the validation of the model with accelerated SER testing under dynamic conditions. When the nominal SER values and TVFs have been determined, SER simulation studies can focus on the architectural vulnerability factors (AVFs) for the different circuit elements. These AVFs are specific for a given IC design and strongly depend on application and micro-architectural details.
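As a closing illustration, the estimation approach can be sketched in a few lines that combine Eq. (5) for the nominal SER with Eq. (4) for the flip-flop contribution. This is a minimal sketch under stated assumptions: the function names are hypothetical, AVF = 1 and p1 = 0.5 are placeholder choices, 0.001 FIT/latch is the fallback nominal SER mentioned in Sec. 6, and the TVFs of 3/8 and 1/8 are the uniform-distribution averages from Sec. 4.

```python
import math


def nominal_ser(p1, ser_data1, ser_data0):
    """Eq. (5): data-weighted nominal SER of one latch, with p0 = 1 - p1."""
    return p1 * ser_data1 + (1.0 - p1) * ser_data0


def flip_flop_ser(ser_nom_m, ser_nom_s, tvf_m, tvf_s, avf):
    """Eq. (4): flip-flop contribution to the SER at the IC level."""
    return (ser_nom_m * tvf_m + ser_nom_s * tvf_s) * avf


# Fallback nominal SER of 0.001 FIT per latch, taken as identical for both data states.
ser_m = nominal_ser(0.5, 0.001, 0.001)
ser_s = nominal_ser(0.5, 0.001, 0.001)

# AVF = 1.0 is a placeholder (every latched error is assumed to be observed).
average_case = flip_flop_ser(ser_m, ser_s, tvf_m=3 / 8, tvf_s=1 / 8, avf=1.0)
worst_case = flip_flop_ser(ser_m, ser_s, tvf_m=0.5, tvf_s=0.5, avf=1.0)
print(f"average-TVF estimate: {average_case:.6f} FIT per flip-flop")
print(f"worst-case estimate:  {worst_case:.6f} FIT per flip-flop")
```

With measured or modeled nominal SER values per data state, the two calls to `nominal_ser` would take the Data=1 and Data=0 values of the master and slave latch separately, and the AVF would come from an architecture-level analysis.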

Acknowledgement
The author would like to thank André Nieuwland and Bram Kruseman for critically reading the manuscript.

References
[1] S. Mitra et al., “Robust system design with built-in soft-error resilience”, Computer 38(2), pp. 43-52, Feb. 2005.
[2] T. Heijmen et al., “A comprehensive study on the soft-error rate of flip-flops from 90-nm production libraries”, IEEE TDMR 7(1), pp. 84-96, Mar. 2007.
[3] N. Seifert and N. Tam, “Timing vulnerability factors of sequentials”, IEEE TDMR 4(3), pp. 516-522, Sep. 2004.
[4] H.T. Nguyen et al., “Chip-level soft error estimation method”, IEEE TDMR 5(3), pp. 365-381, Sep. 2005.
[5] L.B. Freeman, “Critical charge calculations for a bipolar SRAM array”, IBM J. Res. Dev. 40(1), pp. 77-89, Jan. 1996.
[6] N. Seifert et al., “Radiation-induced soft error rates of advanced CMOS bulk devices”, IRPS 2006, pp. 217-225.
[7] Wrap-up of the 2nd workshop on System Effects of Logic Soft Errors (SELSE 2), http://selse2.selse.org/.
