Nat.Lab. Unclassified Report 2002/828 Date of issue: 08/02
Radiation-induced soft errors in digital circuits
A literature survey
Tino Heijmen
Authors’ address data: Tino Heijmen WAY41;
[email protected]
© Philips Electronics Nederland BV 2002
All rights are reserved. Reproduction in whole or in part is prohibited without the written consent of the copyright owner.
Unclassified Report:
2002/828
Title:
Radiation-induced soft errors in digital circuits: A literature survey
Author(s):
Tino Heijmen
Part of project:
Embedded memories & embedded logic
Customer:
Van de Mortel, Beenker, Schrooten, Hendrickx
Keywords:
Soft-errors; SER; memories; alpha-particles; cosmic-neutrons; device-hardening; SRAM; reliability
Abstract:
The current technical note gives an overview of the available literature on the soft error rate (SER) of semiconductor devices at sea level. The main focus is on SRAM circuits, but other memories and logic are considered also. The radiation sources causing ionizing particles in semiconductors are discussed and the physical mechanisms of the upset of data bits are treated. Reported work on soft error modeling and simulation is reviewed and test methods are summarized. The impact of technology scaling on SER is discussed. Several techniques are treated that were proposed to improve the SER sensitivity of circuit designs. The report includes a discussion about the impact of the SER issue on deep-submicron design in coming years.
Conclusions:
The soft error issue is increasingly important in deep-submicron design. Nowadays, the topic is particularly relevant for memories, but in the near future it will also have to be addressed in logic design. The radiation sources inducing soft errors are alpha particles (from wafer and package materials) and cosmic neutrons. The two types of particles require their own measurement techniques. Different modeling and simulation methods have been developed at both the device and the circuit level. Several solutions have been proposed to improve the SER sensitivity of devices. Eventually, the application of error detection and correction methodologies will be necessary to obtain sufficient reliability. Technology scaling increases both the total SER and the percentage of multiple-bit errors.
Contents

List of tables
List of figures
Terms and definitions
1 Introduction
2 Historical overview
3 Radiation sources
  3.1 Alpha particles
  3.2 High-energy cosmic neutrons
  3.3 Neutron-induced boron fission
4 Physical mechanism
  4.1 Charge collection
  4.2 Circuit functioning
  4.3 Critical charge
5 Modeling and simulation
  5.1 Modeling of radiation sources and charge tracks
  5.2 Device level modeling
  5.3 Circuit level modeling
  5.4 SER simulators
  5.5 Empirical models
6 SER testing methods
  6.1 System SER test
  6.2 Accelerated SER test
  6.3 Alpha ASER test
  6.4 Neutron ASER test
7 SER in logic and in systems
  7.1 SER in logic
  7.2 SER in systems
8 Scaling effects
  8.1 Scaling of critical charge and collection efficiency
  8.2 Comparison of SRAM, DRAM, and 1TRAM
  8.3 Multiple-bit soft errors
9 Improvement of SER sensitivity
  9.1 Improvements in materials, shielding, and packaging
  9.2 Process modifications
  9.3 Component hardening
  9.4 System solutions
10 Discussion
Distribution
List of Tables

1 Half-life and kinetic energy of the emitted alpha particles for the decay of Th-232 and its associated daughter products.
2 Half-life and kinetic energy of the emitted alpha particles for the decay of U-238 and its associated daughter products [7].
3 Half-life and kinetic energy of the emitted alpha particles for the decay of Am-241 and its associated daughter products.
4 Natural alpha emission rates of process and package materials [7, 85].
5 Comparison of the contributions to the SER of SRAMs from two production technologies [9].
List of Figures

1 Measured SER for an IBM 36kB DRAM chip [66].
2 Correlation between data from accelerated and system SER measurements on SRAMs [41, 49].
3 Timeline of SER research [5].
4 Alpha energy spectrum obtained from a thick foil of Th-232 [5, 6, 7].
5 Differential charges of a proton, an alpha particle, a lithium recoil, and a silicon recoil in silicon [104].
6 Range of a proton, an alpha particle, a lithium recoil, and a silicon recoil in silicon [104].
7 Alpha counting [6].
8 Neutron activation analysis [6].
9 Cosmic ray disintegration, causing a cascade of nuclear reactions [22].
10 Cosmic neutron flux at sea level as a function of the neutron kinetic energy [7, 40, 102].
11 Generation of electron-hole pairs in silicon by alpha particles (or other light charged particles) and by a heavy recoiling nucleus produced by the collision of a high energy neutron with a silicon nucleus [58].
12 Burst generation rate as a function of neutron energy for various silicon recoil energies [6, 7, 52].
13 $^{10}$B fission producing ionizing particles [6].
14 Cumulative probability of a $^{10}$B fission event as a function of the neutron energy [7].
15 Only ionizing particles generated in the first few BPSG layers are capable of inducing soft errors [6].
16 Charge generation and collection after a particle hit [6]. The picture on the right shows the funnel generated by the particle strike. The distortion of the local field is indicated.
17 Transient current (a) and collected charge (b) at a struck junction, for two different substrate doping concentrations [34].
18 Schematic representation of carrier paths, several ps after the ion strike [16].
19 Schematic representation of an SRAM cell. The shaded rectangles indicate the areas that are sensitive to strikes by ionizing particles, for the case that the cell stores a logical “1” [6]. The transistors N1 and N1a, on the one hand, and N2 and N2a, on the other hand, generally share a single diffusion area.
20 Layout of a 6-transistor SRAM cell [21]. The red box indicates the boundaries of the unit, the green regions are the polysilicon lines.
21 Simulation flow for circuit alpha-SER [15].
22 Basic flow diagram for the soft error Monte Carlo modeling (SEMM) program [62, 84].
23 Configuration for α ASER testing [6, 40].
24 A transient pulse will only be latched if it coincides with the clocking edge. The probability that a pulse is latched is larger if the pulse is wider or the frequency is higher [22].
25 Direct hit of a latch and the propagation of a soft error hit in combinatorial logic [4].
26 Dynamic pipeline latch [42].
27 Integrated testing environment [38].
28 Manifestation of an injected fault at the system level [38].
29 Amount of charge representing a data bit as a function of process technology generation [22].
30 Scaling of the three contributions to SER as a function of voltage [5, 6].
31 Scaling of the SER per Mbit for DRAM [6].
32 Scaling of systems including DRAM [6].
33 Scaling of the SER per Mbit for SRAM [6]. The two different SER curves correspond to the application of different packaging materials.
34 Scaling of systems including SRAM [6]. The two different SER curves correspond to the application of different packaging materials.
35 SER as a function of technology generation for 6T SRAM, pseudo-6T SRAM, and 1TRAM devices [53]. The SER of 1TRAM may be too optimistic, see [81].
36 The probability of a multiple-bit soft error for an incident neutron compared with an alpha particle hit [76]. The dashed checkers denote junctions corresponding to a corrupted bit.
37 Shielding from alphas by a polyimide layer (a); collected charge as a function of polyimide layer thickness (b) [6].
38 Keep-away zones for solder bumps [6].
39 Illustration of the effect on SER sensitivity from the inclusion of a buried p+ layer (a) and using an SOI process (b) [6]. (Usually, SOI junction diffusions touch the buried oxide layer.)
40 SER hardening of an SRAM cell by the inclusion of two feedback resistors [2].
41 Principle of triple modular redundancy (TMR) [22].
42 Traditional ECC architecture applied in memories [22].
43 Generic temporal sampling latch (a) and corresponding clock scheme (b) [55].
44 Time redundancy approach for detecting single-event transient pulses, with (a) a single-clock and (b) a two-clock implementation, both giving the same signal diagram (c) [22].
45 SER characterization methodology for stand-alone memory chips at TI [5, 6].
46 Actions on SER robustness performed during the design process of stand-alone memory chips at TI [5, 6].
Terms and definitions

The following terms and definitions are frequently used in publications on soft errors, cf. Sec. 2 of [40].

ASER: Accelerated soft error rate. The ASER is the error rate obtained in the presence of an extra ionizing radiation source.

BGR: Burst generation rate. The BGR $B(E_\mathrm{neutron}, E_\mathrm{recoil})$ quantifies the rate at which secondary reaction products of recoil energy $E_\mathrm{recoil}$ are produced from neutrons of energy $E_\mathrm{neutron}$.

Critical charge ($Q_\mathrm{crit}$): The minimum amount of charge, collected at a sensitive node due to a particle strike, that produces a soft error. If a device node collects charge in excess of its critical charge after an ionizing radiation event, the data state of that circuit will be upset.

DUT: Device under test.

ECC: Error-correction code. Sometimes the term error detection and correction (EDAC) is used instead.

FIT: Failure-in-time. One FIT denotes one failure in $10^9$ device hours.

Fluence: The particle flux integrated over the time required for the entire run, expressed as particles/cm$^2$.

Flux: The number of particles passing through a unit area per unit time (particles/cm$^2$ s).

Hard error: A permanent circuit or device failure. The error is “hard” because the data is lost and the circuit or device no longer functions properly, even after power reset and re-initialization. Hard errors, such as gate rupture or destructive latch-up, are usually not an issue for alpha-particle events but can be caused by cosmic events.

LET: Linear energy transfer. The LET is the amount of energy deposited per unit path length in silicon. A LET of 1 MeV cm$^2$/mg is equivalent to a deposited charge of ≈ 10 fC/µm or a stopping power of ≈ 0.227 MeV/µm.

MBU: Multiple-bit upset. An MBU is an event that induces a data error or upset in which the state of more than one latch or memory cell is reversed.

Sensitive volume: A region, or multiple regions, of a device from which deposited charge can be collected by device nodes, such that an SEU is produced.

SEU: Single-event upset. An SEU is an event that induces a data error or upset in which the state of a latch or memory cell is reversed (“1” to “0”, or vice versa).

Soft error: An SEU in a latch or memory cell that can be correctly rewritten. The error is “soft” because the circuit itself is not permanently damaged and behaves normally after the data state has been restored. Also called soft fail.

SER: Soft error rate. The rate at which soft errors occur.

SSER: System soft error rate. The SSER is the unaccelerated SER, measured with no radiation source present other than what is there naturally. SSER testing is also called life testing.
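The unit definitions above (FIT and LET) lend themselves to a small worked conversion. The following Python sketch is illustrative only: the function names and the example figure of 1000 FIT are assumptions made here, not values from the report, and the LET conversion simply applies the approximate factor of 10 fC/µm per MeV cm$^2$/mg quoted in the LET definition.

```python
# Illustrative unit helpers for the FIT and LET definitions.
# Function names and example numbers are assumptions, not taken from the report.

HOURS_PER_YEAR = 24 * 365  # 8760 h, ignoring leap years


def fit_to_failures_per_hour(fit: float) -> float:
    """One FIT is one failure in 1e9 device hours."""
    return fit / 1e9


def mean_years_between_failures(fit: float, n_devices: int = 1) -> float:
    """Mean time between soft errors, in years, for n_devices identical devices."""
    rate_per_hour = fit_to_failures_per_hour(fit) * n_devices
    return 1.0 / rate_per_hour / HOURS_PER_YEAR


def let_to_charge_fc_per_um(let_mev_cm2_per_mg: float) -> float:
    """Approximate charge deposited in silicon (fC/um), using ~10 fC/um per MeV cm2/mg."""
    return 10.0 * let_mev_cm2_per_mg


if __name__ == "__main__":
    # Hypothetical device rated at 1000 FIT.
    print(f"single device: {mean_years_between_failures(1000.0):.1f} years between errors")
    print(f"1000 devices:  {mean_years_between_failures(1000.0, 1000) * 365:.1f} days between errors")
    print(f"LET of 2 MeV cm2/mg: about {let_to_charge_fc_per_um(2.0):.0f} fC/um")
```

The point of the example is that a per-device rate that looks negligible becomes a system-level concern once many devices are operated together.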
1 Introduction

In electronic circuits, elemental data bits are represented by small packets of charge. When these charge packets are modified by noise, the stored information may be changed. A soft error is a random error induced by an event that corrupts the data stored in the device but does not damage the device itself [6]. In contrast, when a hard error occurs, the device stops functioning permanently because its internal structure is damaged. Soft errors can be caused by radiation, electromagnetic interference, or electrical noise. The current technical note (TN) is dedicated to soft errors caused by radiation. Besides soft errors, radiation can cause hard errors by damaging the device, for example by gate rupture, latch-up, or VT shift. Hard errors are not covered in the present TN.

Research on soft errors has taken two rather independent paths: applications at sea level and systems used at high altitudes. The current TN focuses on the sea-level application area. Although many papers have been published on satellite SER, these are of limited value for terrestrial SER studies, because the radiation causing soft errors at high altitudes is very different from the radiation at sea level. Radiation types such as high-energy galactic cosmic rays, containing extremely energetic particles, and high-energy protons from trapped radiation belts are important for space environments but not for terrestrial applications [41].

Although the occurrence of soft fails in space and satellite applications has been known since the 1950s, the first evidence of soft errors at sea level due to radiation was given in 1978 by May and Woods from Intel [56, 57]. Since then, the soft error issue has been an important topic in the semiconductor industry. A short historical overview of soft-error-related work is presented in Sec. 2. Within Philips, research on soft errors was performed in the 1980s.
The results were published in several internal reports [43, 44, 48, 54].

With continuing technology scaling, circuits become more sensitive to soft errors. Recently, manufacturers of embedded static and dynamic memories have started to receive questions from their customers about the soft-error rate (SER) performance of their products. Questions raised by external customers of the library technology group (LTG) of Philips Semiconductors (PS) triggered a research project in the group Digital Design & Test at Philips Research. One of the first actions was to collect knowledge from literature sources. The present TN is the result of this literature study.

There are three primary radiation sources causing soft errors: alpha particles, high-energy cosmic rays, and neutron-induced boron fission. Alpha particles originate from radioactive impurities in chip and package materials. Alphas induce soft errors by directly generating charge in the silicon device. Cosmic rays, predominantly neutrons, generate charge indirectly by colliding with nuclei within the chip; the products of such collisions are particles that are capable of ionizing silicon atoms. The cosmic ray component of the SER has become more important as increased material purity has reduced the SER due to alpha particles from radioactive contaminants. The third source, boron fission, occurs when a low-energy (thermal) neutron hits a $^{10}$B nucleus, which then breaks up into an alpha and a lithium recoil. This source gives a significant contribution if specific materials, in particular boron phospho-silicate glass (BPSG), have been used in the fabrication of the chip. The three sources that generate soft errors in semiconductor devices are discussed in more detail in Sec. 3. Section 4 addresses the mechanisms by which an ionizing particle can induce a soft error in a silicon device.

Basically, there are four different methods to evaluate the SER of a device: accelerated tests, system tests, simulation, and field reports. Testing methods can be categorized into accelerated SER (ASER) test methods, which measure the SER in the presence of a relatively strong extra radiation source, and system SER (SSER) test methods, which evaluate the SER under nominal conditions. Both accelerated and system test approaches are discussed in Sec. 6. It is also possible to simulate soft errors. Modeling of soft errors can be done at the device level or the circuit level. Simulation and modeling of soft errors are treated in Sec. 5. Finally, field reports of device reliability provide information on SERs. Obviously, the last method is the least favorable one. In practice, however, customer feedback is often the trigger to start studies on SER [103].

Soft errors occur in SRAM and DRAM devices, but not in ferroelectric RAMs (FRAMs), magnetic RAMs (MRAMs), or flash memories [25]. In the late 1970s, dynamic memories were the most vulnerable devices. Because of ongoing technology scaling, however, (embedded) SRAMs are currently the most critical with regard to soft errors. Soft error rates for logic circuits used to be negligible compared with the failure rate of memory devices. However, for 100 nm technologies and beyond, logic SER has to be taken into account. Soft errors in logic circuits are the topic of Sec. 7. This section also discusses the failure rates of systems containing both logic and memories.
The sensitivity to soft errors increases with decreasing device dimensions and operating voltages in deep-submicron technologies. An important parameter is the critical charge, which is the minimum amount of charge needed to change a stored data bit. The critical charge is the product of the voltage difference between a logical “1” and a “0” and the capacitance of the node storing the data bit. Both factors decrease with successive technology scaling. On the other hand, the efficiency of charge collection is also reduced with decreasing feature sizes. The probability of a multiple-bit upset increases relatively fast with technology scaling, compared with the probability of the upset of a single bit. This difference in scaling may have an impact on the application of error-correction codes. Issues related to technology scaling are treated in Sec. 8.

Several different measures have been proposed to reduce the SER of a circuit or a system. The concentration of alpha-emitting radioactive contaminants can be decreased by using purified materials in the production of chips and packages. Shielding layers can also be applied. Furthermore, several modifications at the process level have been proposed to reduce the SER. Many SER-hardened devices have been reported, using feedback loops, resistive hardening, or an increased node capacitance. A different approach is to accept the occurrence of single-bit upsets and to apply error-correction codes so that these upsets do not lead to erroneous signals at the outputs of the circuit. The different approaches to improve SER are discussed in Sec. 9.

Soft errors form one of the challenges of designing in deep-submicron technologies, alongside complexity management, timing closure, power dissipation, etc. They can be regarded as one of the effects encompassed by the term signal integrity, which also includes cross-talk, ground bounce, etc. With continuing technology scaling, soft errors will be an increasingly important topic in circuit design, both for memories and logic. At the end of the present TN, Sec. 10 discusses the impact of SER on deep-submicron digital design.
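The Introduction describes the critical charge as the product of the voltage swing and the capacitance of the storage node. The minimal Python sketch below shows how that product shrinks when both factors scale down. The capacitance and voltage values are hypothetical round numbers chosen for illustration; they are not data from the report.

```python
# Critical charge as the product of node capacitance and voltage swing.
# All node parameters below are hypothetical illustration values.

ELEMENTARY_CHARGE_C = 1.602176634e-19  # charge of one electron, in coulomb


def critical_charge_fc(node_capacitance_ff: float, voltage_swing_v: float) -> float:
    """Return Q_crit in fC; fF multiplied by V gives fC directly."""
    return node_capacitance_ff * voltage_swing_v


def charge_in_electrons(charge_fc: float) -> float:
    """Express a charge given in fC as a number of electrons."""
    return charge_fc * 1e-15 / ELEMENTARY_CHARGE_C


if __name__ == "__main__":
    # (label, capacitance in fF, voltage swing in V): assumed values only.
    hypothetical_nodes = [
        ("older node", 10.0, 3.3),
        ("newer node", 2.0, 1.2),
    ]
    for label, cap_ff, swing_v in hypothetical_nodes:
        q_fc = critical_charge_fc(cap_ff, swing_v)
        print(f"{label}: Q_crit = {q_fc:.1f} fC "
              f"(about {charge_in_electrons(q_fc):,.0f} electrons)")
```

Because capacitance and voltage both enter as factors, a moderate reduction in each gives a much larger reduction in the charge needed to upset the node.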
2 Historical overview

The first to report that alpha particles from package materials are a serious cause of soft errors at sea level were May and Woods in the late 1970s. Their investigations were motivated by a serious industrial problem at Intel concerning operational errors in a DRAM series. The cause of the problem appeared to be a relatively high trace radioactivity in the packaging material. The results of their work were presented at the IRPS conference [56] and later published in a journal paper [57]. The event reported by May and Woods was recognized as very important, and preprints of their papers circulated through the industry before they were officially published. It was followed rapidly by similar studies such as [99].

Researchers from IBM realized that if alpha particles could generate circuit soft errors, cosmic radiation could possibly do the same, even though heavy ions and alphas are absent in the cosmic ray flux at sea level. Indeed, Ziegler and Lanford reported in 1979 that cosmic rays also induce soft errors at terrestrial altitudes [106]. This paper was followed by a more detailed study that also addressed the amount of charge needed to upset a circuit [107]. Because of the use of materials with low alpha emission rates, cosmic neutrons replaced alpha particles as the main source of memory SER during the 1990s. McKee et al. from TI showed in 1996 that high-energy neutrons from cosmic rays were the major contributor to the system SER of the then-current generation of DRAMs [58]. In 1995, Baumann et al. presented a study showing that boron compounds are a non-negligible source of soft errors [8].

The first major paper covering the modeling of SER was published by Kirkpatrick from IBM in 1979 [47]. It was followed by a more complete approach by Sai-Halasz et al., which also considered some of the details of the interaction between radiation and circuits, both for alpha particles and for cosmic radiation [72, 73, 74].
An important contribution to the understanding of the mechanism by which a charged particle upsets a circuit was made by Hsieh et al. They found that the charge generated by a particle along its track distorts the local electrical fields, such that charge is pulled up the track towards the surface. This effect was named funneling [32, 33, 34].

IBM started field tests in 1983, using a portable tester with several hundreds of chips. These measurements provided evidence that even at sea level cosmic radiation contributes significantly to the SER, and that its effect increases exponentially with altitude [65, 66], see Fig. 1. The solid curve in this figure shows the alpha-particle component of the SER. The vertical distance above the solid curve corresponds to the cosmic ray component. It was also shown that there is a close correlation between the SER and the neutron flux. The test results further indicated that cosmic rays generate a relatively large number of multiple-bit errors, because the generated charge is generally larger than for an alpha-particle event. The first accelerated tests of sensitivity to cosmic rays were performed in 1985. Reference [105] gives an overview of soft error experiments at IBM from 1978 to 1994.

Figure 1: Measured SER for an IBM 36kB DRAM chip [66]. (Axes: SER in $10^6$ FIT versus real time in khr. Data labels: S = sea level, U = underground, B = Boulder (1.6 km), L = Leadville (3.1 km).)

The fact that the SER of circuits is not exclusively caused by alpha particles was further demonstrated in [49] (see also [41]). Data was collected for various SRAM types, obtained from system SER measurements in which a large number of memory devices were tested for a long time and the few occurring soft errors were logged. These data were compared with the corresponding accelerated measurements, where memories were exposed to a high flux of alpha particles. If alpha particles were the only source of soft errors, the correlation between the two sets of experimental data would be linear. However, when the system SER results are plotted against data from accelerated measurements, as shown in Fig. 2, there is a clear deviation from the linear correlation. The data points are represented in Fig. 2 by circles, the solid line shows a fit through the data points, and the dashed line denotes the linear extrapolation. The system SER is larger than would be expected from the results of the accelerated tests with alpha sources, in particular for low SER values. The difference can be explained if the contribution from neutrons is included. An extensive discussion on soft errors in computer systems at sea level was presented in [64].

Figure 2: Correlation between data from accelerated and system SER measurements on SRAMs [41, 49]. (Axes: system SER in FITs versus accelerated SER in hits/hr, both logarithmic.)

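The deviation from a linear correlation in Fig. 2 can be illustrated with a toy model in which the system SER is the sum of an alpha term proportional to the accelerated result and a constant neutron term. This Python sketch is an assumption made for illustration: the additive form follows the explanation in the text, but both coefficients are hypothetical and are not fitted to the data of [41, 49].

```python
# Toy model: system SER = alpha term (scales with accelerated result) + neutron floor.
# Both coefficients are hypothetical; they are not fitted to any measured data.

ALPHA_FIT_PER_HIT_PER_HR = 10.0  # assumed scale between accelerated and system SER
NEUTRON_FLOOR_FIT = 50.0         # assumed neutron contribution, independent of alpha test


def system_ser_fit(accelerated_hits_per_hr: float) -> float:
    """System SER in FIT predicted by the additive toy model."""
    return ALPHA_FIT_PER_HIT_PER_HR * accelerated_hits_per_hr + NEUTRON_FLOOR_FIT


def excess_ratio(accelerated_hits_per_hr: float) -> float:
    """Ratio of the modeled system SER to the alpha-only linear extrapolation."""
    alpha_only = ALPHA_FIT_PER_HIT_PER_HR * accelerated_hits_per_hr
    return system_ser_fit(accelerated_hits_per_hr) / alpha_only


if __name__ == "__main__":
    for accel in (0.1, 1.0, 10.0, 100.0, 1000.0):
        print(f"{accel:8.1f} hits/hr -> system SER {system_ser_fit(accel):9.1f} FIT, "
              f"{excess_ratio(accel):6.2f}x the linear extrapolation")
```

In this sketch the ratio approaches 1 for devices with a high alpha-induced SER and grows for devices with a low one, which is the qualitative behavior described for the low-SER end of Fig. 2.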
The history of research on SER is visualized in Fig. 3, reproduced from [5].
Figure 3: Timeline of SER research [5].
3 Radiation sources

In a recent review paper [7], the three primary sources for the induction of soft errors in semiconductor devices have been discussed in detail:

1. Alpha particles
2. High-energy cosmic neutrons
3. Neutron-induced boron fission

In the current section, the charge-generating characteristics of these three sources are treated. Other radiation capable of inducing soft errors, such as heavy ions, is not considered here, because it only occurs in space or in the upper atmosphere.
3.1 Alpha particles A significant source of ionizing radiation in semiconductor devices is formed by alpha particles from the wafer, the package material, or solder bumps [97]. An alpha particle is a double-ionized helium atom (4 He2+ ), composed of two protons and two neutrons. Packaging materials and solder contain traces of radioactive isotopes, which can produce alphas, next to beta and gamma rays, when they decay to a lower energy state. The emitted alphas have kinetic energies that are typically in the range of ≈ 4–9 MeV. Radioactive isotopes are unstable, emitting various types of radiation as they decay into other isotopes [6]. The concentration N of a radioactive isotope falls exponentially with time, N (t) = N0 e−λt , (1) where N0 is the concentration at t = 0. The half-life t1/2 of a radioactive species is defined as the time that is required for 50% of the parent product to decay into the daughter product(s), ln 2 . (2) t1/2 = λ The standard unit of measure is the Curie, abbreviated as Ci, with 1 Ci being equal to 3.7 ×1010 decays/sec. Although many different radioactive isotopes are present, the most dominant sources of alphas in packaging materials are Th-232 and U-238 isotopes, together with their daughter products. The decay of Th-232 and U-238 and their intermediates is shown in Tables 1 and 2, respectively. Both the half-times and the kinetic energies of the emitted alphas are listed. Note that Bi-212 decays to either Po-212 (64%) with emission of a beta particle, or to Tl-208 (36%) with emission of an alpha particle. Both daughters decay to Pb-208, by emitting an alpha (Po-212) or a beta particle (Tl-208). In solders, alphas are produced by the decay of Po-210 and Pb-210 impurities. Po-210 (t1/2 = 138 days) decays to the stable Pb-206 isotope by emitting an alpha with a kinetic c Philips Electronics Nederland BV 2002
7
2002/828
Unclassified Report
Species
Half-life
Th-232 Ra-228 Ac-228 Th-228 Ra-224 Rn-220 Po-216 Pb-212 Bi-212 Po-212 (64%) or Tl-208 (36%) Pb-208
1.41×1010 yrs 5.76 days 6.13 hrs 1.91 yrs 3.66 days 55.6 sec 0.15 sec 10.64 hrs 60.6 min 304 nsec 3.05 min stable
Emitted α-energy particle (MeV) α 4.016; 3.957 β β α 5.426; 5.343 α 5.686; 5.449 α 4.785; 4.602 α 6.288 β 6.779 α or β 6.336; 6.297 α 8.785 β
Table 1: Half-life and kinetic energy of the emitted alpha particles for the decay of Th-232 and its associated daughter products.
Species U-238 Th-234 Pa-234 U-234 Th-230 Ra-226 Rn-222 Po-218 Pb-214 Bi-214 Po-214 Pb-210 Bi-210 Po-210 Pb-206
Half-life 4.47×109 yrs 24.1 days 6.69 hrs 2.45×105 yrs 7.45×104 yrs 1.60×103 yrs 3.82 days 3.05 min 26.8 min 19.7 min 164 µsec 22.3 yrs 5.01 days 138.4 days stable
Emitted α-energy α 4.196; 4.149 β β α 4.774; 4.723 α 4.688; 4.621 α 4.785; 4.602 α 5.490 α 6.002 β β α 7.687 β β α 5.305
Table 2: Half-life and kinetic energy of the emitted alpha particles for the decay of U-238 and its associated daughter products [7].
8
c Philips Electronics Nederland BV 2002
Unclassified Report
2002/828
energy of 5.3 MeV. Pb-210 (t1/2 = 22.3 years) decays to Po-210 by successively emitting two beta particles. The preferred source for testing device sensitivity to alpha particles from these solder impurities is Am-241 [40]. The decay of Am-241 and its intermediates is shown in Table 3.

Species         Half-life       Emitted particle   α-energy (MeV)
Am-241          432 yrs         α                  5.486; 5.443
Np-237          2.14×10^6 yrs   α                  4.787; 4.770
Pa-233          27 days         β
U-233           1.59×10^5 yrs   α                  4.824; 4.783
Th-229          7.34×10^3 yrs   α                  4.842; 5.054
Ra-225          14.8 days       β
Ac-225          10 days         α                  5.829; 5.793
Fr-221          4.9 min         α                  6.340; 6.126
At-217          32.3 ms         α                  7.066
Bi-213          45.5 min        α or β             5.55; 5.87
Po-213 (98%)    4.19 µsec       α                  8.377
or Tl-209 (2%)  2.2 min         β
Pb-209          3.25 hrs        β
Bi-209          stable

Table 3: Half-life and kinetic energy of the emitted alpha particles for the decay of Am-241 and its associated daughter products.

In equilibrium, the emission of alphas from a daughter product would be equal to the emission from the parent (secular equilibrium), because the half-lives of the daughters are shorter than the half-life of the parent. However, most semiconductor materials are highly purified. Therefore, the alpha-emitting impurities will in general not be in secular equilibrium.

The emitted alphas have sharply defined energies. In practice, however, the alphas are generally produced inside a thick foil (thick with respect to the range that alphas have in that material, which is typically several µm). When an alpha travels from where it was generated (which can be anywhere in the foil) to the surface, it loses energy. Therefore, the energy levels listed in Tables 1, 2, and 3 are broadened. The alpha energy spectrum from a thick foil of Th-232 is shown in Fig. 4. The characteristic decay peaks cannot be clearly distinguished, since the spectral lines are broadened due to energy loss in the foil. In most cases, the semiconductor device will "see" such a broadened distribution of incident alpha flux.

An exception occurs if the alphas are generated at or very near the surface of the source, which is the case for thin layers. In this case, the energy spectrum of the emitted alphas will be a set of sharp spectral lines. One example is a residue of alpha-emitting impurities left after a wet etch with phosphoric acid batches. Another example is formed by flip-chip solder bumps, where the alpha-emitting impurities are located at the surface.
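The decay law of Eq. (1) and the half-life relation of Eq. (2) can be turned into numbers with a few lines of Python. The sketch below is only an illustration: the function names are hypothetical, the Po-210 half-life is taken from Table 2, and the number of atoms used for the activity estimate is an arbitrary assumed value, not a measured contamination level.

```python
import math

CURIE_IN_DECAYS_PER_SEC = 3.7e10  # 1 Ci, as defined in the text
SECONDS_PER_DAY = 86400.0


def decay_constant(half_life):
    """Eq. (2) rearranged: lambda = ln(2) / t_half (inverse unit of half_life)."""
    return math.log(2.0) / half_life


def remaining_fraction(elapsed, half_life):
    """Eq. (1): N(t)/N0 = exp(-lambda * t); both arguments in the same unit."""
    return math.exp(-decay_constant(half_life) * elapsed)


def activity_in_curie(n_atoms, half_life_seconds):
    """Activity lambda * N, converted from decays/sec to Ci."""
    return decay_constant(half_life_seconds) * n_atoms / CURIE_IN_DECAYS_PER_SEC


if __name__ == "__main__":
    po210_half_life_days = 138.4  # from Table 2
    for days in (138.4, 365.0, 1000.0):
        print(days, "days:", remaining_fraction(days, po210_half_life_days))

    n_atoms = 1.0e12  # hypothetical number of Po-210 atoms, for illustration only
    print(activity_in_curie(n_atoms, po210_half_life_days * SECONDS_PER_DAY), "Ci")
```

Because the Po-210 half-life is short compared with the 22.3 years of its parent Pb-210, the Po-210 activity in a freshly purified solder is not constant in time, which is one reason why secular equilibrium cannot be taken for granted.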
Figure 4: Alpha energy spectrum obtained from a thick foil of Th-232 [5, 6, 7].

Because the energy and the trajectory of an alpha largely determine the probability that it will cause a soft error, knowledge of the energy spectrum of the incident alpha flux is key to an accurate determination of SERs for semiconductor devices.

Alpha particles traveling in silicon induce electron-hole pairs through electrical interactions. For each electron-hole pair that is created, on average 3.6 eV of kinetic energy is lost. This implies that an alpha can cause a burst of typically a million electron-hole pairs. As the alpha loses kinetic energy, its speed is lowered, which increases the time available to induce electron-hole pairs. Therefore, the charge generation rate increases with the distance traveled by the alpha and has a maximum near the end of the trajectory. This nonlinear response is one of the reasons that accurate knowledge of the energy spectrum of the incident alpha flux is necessary to determine device SERs correctly.

A convenient metric in SER calculations is the differential charge dQ/dx. The differential charge generated by several particles, including the alpha particle, is shown in Fig. 5 as a function of the kinetic energy. Figure 6 illustrates the corresponding ranges of the particles in silicon. Both the differential charge and the range curves were computed with the SRIM Monte Carlo program [104]. The differential charge can also be expressed as a linear energy transfer (LET) or as a stopping power. A LET of 1 MeV cm²/mg corresponds to a differential charge of ≈ 10 fC/µm. On the other hand, a stopping power of 1 MeV/µm is equivalent to ≈ 44 fC/µm of charge in silicon. As can be seen from Fig. 5, an alpha generates a maximum differential charge of approximately 16 fC/µm. In total, a 1 MeV alpha particle creates approximately 42 fC of charge [22].
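The conversion between deposited energy and generated charge follows directly from the 3.6 eV per electron-hole pair quoted above. The following Python sketch (function names are illustrative, not taken from the cited literature) reproduces the order of magnitude of the numbers in the text. With this simple conversion, 1 MeV corresponds to about 44.5 fC, in line with the ≈ 44 fC/µm per MeV/µm stated above and slightly above the ≈ 42 fC quoted from [22].

```python
PAIR_ENERGY_EV = 3.6                   # mean energy per electron-hole pair in silicon
ELEMENTARY_CHARGE_C = 1.602176634e-19  # charge of one carrier in coulomb


def electron_hole_pairs(energy_mev):
    """Number of electron-hole pairs created when energy_mev is deposited."""
    return energy_mev * 1.0e6 / PAIR_ENERGY_EV


def generated_charge_fc(energy_mev):
    """Charge of one carrier type, in fC, for a deposited energy in MeV."""
    return electron_hole_pairs(energy_mev) * ELEMENTARY_CHARGE_C * 1.0e15


def differential_charge_fc_per_um(stopping_power_mev_per_um):
    """Convert a stopping power in MeV/um into a differential charge in fC/um."""
    return generated_charge_fc(stopping_power_mev_per_um)


if __name__ == "__main__":
    for energy in (1.0, 4.0, 9.0):  # MeV; 4-9 MeV is the typical alpha range
        print(energy, "MeV:", electron_hole_pairs(energy), "pairs,",
              generated_charge_fc(energy), "fC")
```

The sketch only covers the total charge. How that charge is distributed along the track requires the energy-dependent dQ/dx curves of Fig. 5.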
Sources of alpha-emitting impurities in packaged semiconductor devices are the materials used to package or fabricate the chip, including materials used during the fabrication process. Alpha particle emission rates of some production materials are listed in Table 4. The data in this table were obtained from [7] and [85]. In general, the main source of alphas is the package: the mold compound, the flip-chip underfill, solder bumps, etc. Other alpha sources are the applied metals, in particular the chip interconnect, bond wires, gold lid plates, and lead-frame alloys. Using low-emission materials, a limit of ≈ 0.001 α/(cm² hr) for the
Figure 5: Differential charges of a proton, an alpha particle, a lithium recoil, and a silicon recoil in silicon [104]. (Axes: dQ/dx in fC/µm versus particle energy in MeV, both logarithmic.)

Figure 6: Range of a proton, an alpha particle, a lithium recoil, and a silicon recoil in silicon [104]. (Axes: range in µm versus particle energy in MeV, both logarithmic.)
packaged product is achievable nowadays. The present limit for alpha counting detection is approximately the same.

Material                 Emission rate [α/(cm² hr)]
Cu metal                 0.0019
Al metal                 0.0014
Fully processed wafer    < 0.001
Mold compound            0.024 – < 0.002
Flip-chip underfill      0.002 – 0.0009
Pb-based solders         7.200 – < 0.002
Package                  0.01 – 0.001

Table 4: Natural alpha emission rates of process and package materials [7, 85].

The alpha particle activity of a wafer can be measured by alpha counting or by neutron activation analysis (NAA) [6]. In alpha counting, the alpha flux emitted by the wafer sample is determined by an ionization detector, see Fig. 7. The detector signal is amplified and led to a discriminator and counter. The advantages of alpha counting are that the alpha flux is measured directly, that sample preparation is not required, and that the equipment is simple. Drawbacks are that large-area samples and long counting times are required, and that the method is ineffective for low-flux materials.

In NAA, the wafer is activated by a flux of thermal neutrons, which are captured by the radioactive impurities in the wafer, see Fig. 8. After the activation, the impurities decay to secondary products, whose concentration can be measured by counting the emitted gamma photons. The method is very sensitive, detects many different impurities, and is relatively quick. The disadvantages are the extreme sensitivity to contamination, the necessity of a neutron facility, and the fact that certain materials cannot be analyzed.

Currently, the industry faces a lack of alpha metrology, while knowledge of the quality of materials is fundamental to estimating SER sensitivity [27]. The alpha emission rates of fully processed wafers as specified by different companies vary significantly. While ST claims an emission rate of 0.0003 particles/cm²-hr [85], TI reports a value of 0.0009 [5, 7] and TSMC gives an upper bound of 0.0017 [67], almost a factor of 6 higher than the ST value. The reliability of the reported numbers is not obvious in every case. However, it is clear that the detection limit of the test equipment plays an important role. For example, the emission rate of a wafer fully processed in a 0.13 µm technology was determined by TSMC as 0.000 ± 0.0017 particles/cm²-hr.
The emission rate is published as an upper bound of 0.0017 particles/cm²-hr, a number that is completely determined by the error margin of the alpha activity measurement [67].

The fact that materials used during the fabrication process cannot be neglected as possible sources of radioactive impurities was illustrated by what is now known as the "Hera problem" of IBM. In this case, the SER of a memory chip was found to increase temporarily by several orders of magnitude, due to radioactive contamination of a few samples in a lot of etch acid bottles [105]. Contamination in implants could also be a source of alpha particles.
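The need for large sample areas and long counting times in alpha counting can be made plausible with simple counting statistics. The sketch below assumes that the detected counts follow a Poisson distribution and that detector background and efficiency can be ignored. Both are simplifying assumptions that are not taken from the cited references, and the function names and sample values are illustrative only.

```python
import math


def expected_counts(rate_per_cm2_hr, area_cm2, hours):
    """Expected number of detected alphas for a given emission rate."""
    return rate_per_cm2_hr * area_cm2 * hours


def relative_uncertainty(counts):
    """One-sigma relative uncertainty of a Poisson count, 1/sqrt(N)."""
    return 1.0 / math.sqrt(counts)


def hours_for_precision(rate_per_cm2_hr, area_cm2, relative_precision):
    """Counting time needed to reach the requested relative precision."""
    required_counts = 1.0 / relative_precision ** 2
    return required_counts / (rate_per_cm2_hr * area_cm2)


if __name__ == "__main__":
    rate = 0.001    # alphas/(cm^2 hr), the low-emission limit quoted in the text
    area = 1000.0   # cm^2, hypothetical total sample area
    print(hours_for_precision(rate, area, 0.10), "hours for 10% precision")
    print(relative_uncertainty(expected_counts(rate, area, 24.0)))
```

Under these assumptions, a material emitting 0.001 α/(cm² hr) yields only about one count per hour from 1000 cm² of sample, which illustrates why reported values for low-emission materials are often dominated by the measurement error margin.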
Figure 7: Alpha counting [6].
Figure 8: Neutron activation analysis [6].
This concludes the discussion of alpha particles for now. In Sec. 10 the methods that are employed to reduce the alpha particle sensitivity of semiconductor devices are discussed.
3.2 High-energy cosmic neutrons

Soft errors are not only caused by alpha particles, but also by cosmic rays. Primary cosmic radiation contains both galactic particles (e.g., from supernovae), with energies above 1 GeV, and particles from the solar wind, with energies < 1 GeV. These two types of cosmic particles include protons (89%), alpha particles (10%), and heavier nuclei (1%). Primary cosmic rays react with the earth's atmosphere via strong nuclear interactions, producing complex cascades of second and higher generation particles [101], see Fig. 9. Most charged particles from cosmic rays are trapped or deflected by the geomagnetic field of the Earth [22]. Only 1% of the neutrons created by cosmic rays reach the surface of the Earth. The flux of neutrons with energies above 1 MeV at sea level is approximately 25 neutrons/cm²-hr [107]. Neutrons with energies of approximately 5 MeV or higher are capable of causing soft errors; the exact threshold depends on the properties of the silicon device. Particles that are subject to the strong interaction (also called the nuclear force) are named hadrons.
Figure 9: Cosmic ray disintegration, causing a cascade of nuclear reactions [22].

At terrestrial altitudes (as opposed to flight or satellite altitudes) the predominant particles in the cascade are protons, neutrons, electrons, muons, and pions. Muons and pions are short-lived, and protons and electrons are attenuated by Coulombic interactions with the atmosphere. At sea level the proton flux is less than 5% of the total. The electron flux is comparable to the neutron flux at sea level, but the charge generation capability of electrons is orders of magnitude lower than that of neutrons, at least for the particle energies of interest [106]. Therefore, neutrons are the dominant cosmic particles producing soft errors in devices at terrestrial altitudes, because of their relatively high flux and their stability. The SER of devices due to cosmic rays can be estimated with sufficient accuracy by considering neutron interactions only.
Figure 10 shows the cosmic neutron flux at sea level. The curve of Fig. 10, reproduced from [40], represents the number of neutrons that are incident on a device at sea level, as a function of the neutron kinetic energy. In total, the flux of energetic neutrons at sea level is about 200,000 neutrons/cm²-yr, which means that every VLSI chip has 10^4–10^5 energetic neutrons passing through it every year. The neutron flux varies by as much as 30% as a function of the solar cycle. Furthermore, the neutron flux is strongly dependent on altitude and depends weakly on geographical location [102]. At a height of 2,000 m the neutron flux is five times as high as at sea level [22]. The geographical dependency is expressed through the geomagnetic rigidity cutoff factor. The curve shown in Fig. 10 corresponds to a nominal neutron flux in New York City (NYC) (see also [40, Annex E]). Because both the altitude and the geomagnetic rigidity are approximately the same for the Netherlands and NYC, the neutron flux in the Netherlands is comparable to the flux shown in Fig. 10.
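The flux figures quoted in this section are easily checked against each other. The Python sketch below converts the hourly sea-level flux into the number of neutrons passing through a chip per year. The chip area is a hypothetical value chosen for illustration, and the altitude factor of 5 is the value quoted above for 2,000 m.

```python
HOURS_PER_YEAR = 8760.0  # 365 days


def neutrons_per_year(flux_per_cm2_hr, area_cm2, altitude_factor=1.0):
    """Neutrons crossing a chip of the given area in one year.

    altitude_factor scales the sea-level flux (1.0 = sea level).
    """
    return flux_per_cm2_hr * HOURS_PER_YEAR * area_cm2 * altitude_factor


if __name__ == "__main__":
    sea_level_flux = 25.0  # neutrons/(cm^2 hr) above 1 MeV, as quoted in the text
    chip_area = 0.1        # cm^2, hypothetical chip area
    print(neutrons_per_year(sea_level_flux, 1.0))             # per cm^2 per year
    print(neutrons_per_year(sea_level_flux, chip_area))       # per chip, sea level
    print(neutrons_per_year(sea_level_flux, chip_area, 5.0))  # per chip at 2,000 m
```

A flux of 25 neutrons/cm²-hr corresponds to roughly 2 × 10^5 neutrons/cm²-yr, consistent with the total quoted above, and chip areas of about 0.05–0.5 cm² then give the stated 10^4–10^5 neutrons per chip per year.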
(Figure 10 axes: neutron flux in neutrons/(cm² sec MeV) versus neutron energy in MeV, both logarithmic.)
Figure 10: Cosmic neutron flux at sea level as a function of the neutron kinetic energy [7, 40, 102].

Because neutrons are uncharged particles, they do not directly generate ionization. Instead, high-energy neutrons interact with silicon (and other chip materials), producing charge-generating particles. These interactions are extremely complicated and depend on the incident neutron energy. The primary interaction by which cosmic neutrons induce soft errors is the induction of silicon recoil. When a high-energy neutron collides with a silicon nucleus, it can transfer enough energy to knock the nucleus from the lattice. The collisions can be either elastic or inelastic. In the latter case the silicon nucleus is excited to a higher state and will emit gamma radiation afterwards. Generally, the resulting silicon recoil eventually breaks into smaller fragments, each of which can generate charge. It is also possible that the incident neutron is absorbed by the nucleus, which then breaks into secondary particles.

These interactions of neutrons with silicon nuclei create secondary ions with large variations in properties. The created species can range from H to Si, possibly P, including several isotopes. Advanced quantum mechanical calculations of the nuclear interactions are needed to determine the nature of the recoil fragments in detail [88]. However, satisfactory agreement with experimental data can be obtained when the silicon recoil is treated as a single event. In the remainder of the current section, this assumption will be used.

Because neutrons create ionizing particles indirectly, the tracks of the created secondary ions can start anywhere and in any direction in the device. This is in contrast with alpha particles or other light charged particles with sufficient ionizing power, which arrive from outside the device. This implies that in the case of neutrons, the number of geometrical situations with respect to ion tracks and sensitive regions of the devices is much larger. Furthermore, a collision of a neutron with a silicon nucleus results in more than one ionizing particle. Also, in contrast with alpha particles, most of the neutrons entering a device pass through the silicon without interaction. Only a small fraction gets involved in a nuclear reaction, which is also called a spallation reaction [59]. This difference in nature between neutrons and light charged particles such as alphas is illustrated in Fig. 11.
Figure 11: Generation of electron-hole pairs in silicon by alpha particles (or other light charged particles) and by a heavy recoiling nucleus produced by the collision of a high-energy neutron with a silicon nucleus [58].

Energy distributions of silicon recoils can be computed by nuclear physics simulations. The burst generation rate (BGR), B(E_neutron, E_recoil), quantifies the rate at which incident neutrons with energy E_neutron produce secondary reaction products of energy E_recoil [52]. The term "charge burst" refers to charge produced by a recoiling heavy nucleus. Figure 12, adapted from [7], illustrates the charge burst generation rate in silicon as a function of the energy of the incident neutron, for various silicon recoil energies. It is clear that the probability of a recoil decreases rapidly with increasing recoil energy. Furthermore, for higher-energy recoils, the burst generation rate increases with increasing neutron energy.

The actual impact on a semiconductor device is determined by the amount of charge that an event deposits in the active device volume. The differential charge and range curves for a silicon recoil in silicon are depicted in Figs. 5 and 6. Comparison with the differential charge curve for the alpha particle shows that the charge density per traveled distance is significantly
(Figure 12 axes: burst generation rate in cm²/µm³, logarithmic, versus neutron energy in MeV, 0–1000, linear. Curves: E_recoil = 0.1, 1, 5, 10, and 15 MeV.)
Figure 12: Burst generation rate as a function of neutron energy for various silicon recoil energies [6, 7, 52].

higher for silicon recoils than for alpha particles. Therefore, a cosmic neutron event has a significantly higher potential to upset a semiconductor device than an alpha particle event. For silicon recoils the charge density per distance traveled is up to 150 fC/µm, while it is less than 16 fC/µm for alpha particles. Silicon recoils are generally stopped within a few µm, because they lose their energy rapidly, see Fig. 6. For the same energy, a heavy ion, with a high LET and a short range, is more likely to cause an upset than a light particle, with a smaller LET and a larger range [36].

While chips can be shielded from alpha particles, significant shielding of cosmic neutrons is not possible, at least not at the chip level. Concrete has been shown to reduce the cosmic neutron flux by approximately a factor of 1.4 per foot (i.e., a factor of 3 per meter) [7]. In particular the flux of neutrons with energies below 50 MeV is dependent on location. Therefore, the SER of a system due to cosmic neutrons will depend on its location inside a building. Also, due to the source of the particles, the SER cannot be reduced with keep-out zones or high-purity materials. The only way to deal with cosmic neutron SER is by reducing the device sensitivity, either by design or by process adjustment.

Ziegler [103] has pointed out recently that the SER sensitivity of SRAMs as a function of neutron energy has changed considerably over the last fifteen years. The contribution of lower-energy neutrons has increased significantly for successive SRAM generations. The median SER energy has shifted from ≈ 1 GeV for a typical SRAM chip from 1988 to less than 100 MeV for a chip manufactured in 2000. (The median indicates that 50% of the chip fails are induced by particles below that energy and the other 50% by particles above that energy.)
The sensitivity to high-energy neutrons dropped by almost two decades, because
of the much smaller device volumes, which allow less ionizing charge to be deposited. On the other hand, the sensitivity to low-energy (< 20 MeV) neutrons increased, which has not been explained yet.
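The shielding numbers quoted earlier in this section for concrete can be checked with a short calculation. The sketch below assumes that the attenuation is purely exponential in thickness, with the factor of 1.4 per foot taken from the text. This is a simplification, since the actual attenuation depends on the neutron energy, and the function names are illustrative.

```python
FEET_PER_METER = 1.0 / 0.3048


def attenuation_factor(thickness_m, factor_per_foot=1.4):
    """Flux reduction factor for a concrete layer, assuming exponential attenuation."""
    return factor_per_foot ** (thickness_m * FEET_PER_METER)


def transmitted_flux(incident_flux, thickness_m):
    """Neutron flux behind a concrete layer of the given thickness."""
    return incident_flux / attenuation_factor(thickness_m)


if __name__ == "__main__":
    print(attenuation_factor(1.0))      # reduction factor for 1 m of concrete
    for floors in (1, 2, 4):
        slab = 0.3 * floors             # m, hypothetical 0.3 m of concrete per floor
        print(floors, "floors:", transmitted_flux(25.0, slab), "neutrons/(cm^2 hr)")
```

Under this assumption a factor of 1.4 per foot indeed corresponds to roughly a factor of 3 per meter, so that only several floors of concrete above a system reduce the neutron-induced SER appreciably.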
3.3 Neutron-induced boron fission

Besides alpha particles and high-energy cosmic neutrons, a third source of ionizing particles in semiconductor devices is secondary radiation induced by the interaction of cosmic neutrons with boron nuclei [8]. Two isotopes of boron exist, ¹⁰B (19.9% abundance) and ¹¹B (80.1% abundance). Compared to other isotopes, the ¹⁰B isotope is highly unstable when exposed to neutrons. Furthermore, while other isotopes emit only gamma photons after absorbing a neutron, the ¹⁰B nucleus fissions (i.e., breaks apart), producing an excited ⁷Li recoil nucleus and an alpha particle, see Fig. 13. The lithium nucleus is emitted with an energy of 0.840 MeV (94%) or 1.014 MeV (6%); the emitted alpha particle has an energy of 1.47 MeV. Both particles generate charge in silicon and can therefore cause soft errors.
Figure 13: ¹⁰B fission producing ionizing particles [6].
Although neutrons with any energy can induce fission, the probability decreases rapidly with increasing neutron energy. Therefore, only thermal (i.e., low-energy) neutrons need to be considered. Figure 14 shows the cumulative probability function of ¹⁰B fission, computed from the neutron capture cross section for ¹⁰B and the cosmic neutron background flux. In other words, Fig. 14 depicts the chance that the fission is caused by a neutron with an energy below the given value. The 90% cumulative probability point is indicated. However, the thermal neutron background is not well-defined. The local flux can vary significantly, because neutrons with lower energies are scattered more easily.

Integral plots showing the cumulative probability, such as Fig. 14, are generally used to visualize SERs. These are more informative than plots depicting the product of the particle flux and the SER cross section without integration, because of the shape of the resulting curve [108].

Figures 5 and 6 show the differential charge and the range of a lithium recoil in silicon.
Figure 14: Cumulative probability of a ¹⁰B fission event as a function of the neutron energy [7]. (Axes: cumulative probability, 0–1, linear, versus neutron energy in eV, logarithmic.)
The maximum charge generation rate of a lithium recoil (with a kinetic energy of 0.84 MeV) is 25 fC/µm. Figure 5 shows that the lithium nucleus generates more charge along its trajectory than the simultaneously emitted alpha. Therefore, the lithium recoil has a higher probability of creating a soft error.

Boron is used extensively as a p-type dopant and also in boron phospho-silicate glass (BPSG) layers. Boron is added to PSG to improve the step coverage and contact reflow at lower processing temperatures. Typically, the ¹⁰B concentration in BPSG is thousands of times higher than in diffusion and implant layers. For conventional processes, BPSG is the dominant source of boron fission and, in some cases, the primary source of soft errors. Only BPSG in close proximity to the silicon substrate is a threat, because the range of both the alpha particle and the lithium recoil is less than 3 µm. In most cases, the emitted particles have insufficient energy beyond 0.5 µm to create a soft error.

Soft errors due to the activation of ¹⁰B in BPSG can be eliminated by excluding BPSG from the process flow. Because of the limited ranges of the emitted alpha particle and lithium recoil, only the first level of BPSG needs to be replaced by a different dielectric, see Fig. 15. If the unique properties of BPSG are needed, a process flow including BPSG with enriched ¹¹B could be used.

It was demonstrated in [9] that neutron-induced boron fission can be the dominant source of soft errors in production SRAM devices. Tests were performed with cold neutron beams on 512 kbit SRAMs fabricated in a 0.25 µm CMOS process with BPSG and in a 0.18 µm CMOS process without BPSG. The SER induced by boron fission was compared with the contributions due to alphas and high-energy neutrons. The results are reproduced in Table 5. The use of low-alpha materials, with an emission rate of 0.002 alphas/cm²-hr, was assumed.
If standard (i.e., not low-alpha) materials are used, the SRAM SER would probably be dominated by the contribution due to alphas. Experimental results for the SER in SRAM devices due to boron fission were also reported in [86].
Figure 15: Only ionizing particles generated in the first few BPSG layers are capable of inducing soft errors [6].
Contribution            0.25 µm (with BPSG)   0.18 µm (no BPSG)
Alpha particles         4%                    18%
High-energy neutrons    15%                   82%
¹⁰B fissions            81%                   0%
Total SER (a.u.)        7.5                   1.0

Table 5: Comparison of the contributions to the SER of SRAMs from two production technologies [9].
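The percentages in Table 5 are relative to a different total SER for each technology. Multiplying each share by the corresponding total makes the absolute contributions (in the same arbitrary units) directly comparable. The Python sketch below does only this arithmetic on the values of Table 5; the dictionary layout and names are illustrative.

```python
# Values copied from Table 5; totals are in arbitrary units (a.u.).
TABLE_5 = {
    "0.25 um (with BPSG)": {
        "total": 7.5,
        "shares": {"alpha": 0.04, "neutron": 0.15, "boron": 0.81},
    },
    "0.18 um (no BPSG)": {
        "total": 1.0,
        "shares": {"alpha": 0.18, "neutron": 0.82, "boron": 0.00},
    },
}


def absolute_contributions(entry):
    """Scale the fractional shares by the total SER of that technology."""
    return {source: share * entry["total"]
            for source, share in entry["shares"].items()}


if __name__ == "__main__":
    for technology, entry in TABLE_5.items():
        print(technology, absolute_contributions(entry))
```

In absolute terms the alpha and high-energy neutron contributions of the two technologies are of comparable magnitude, so that the difference in total SER is mainly due to the boron fission term.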
4 Physical mechanism

4.1 Charge collection

An alpha particle or a recoil fragment induced by a neutron has a linear trajectory in silicon. Energy is transferred to the device by Coulombic forces between the ionizing particle and the electrons in the silicon atoms. Initially, very hot electron-hole pairs are created, which subsequently decay into plasmons, which in turn decay into a cascade of carriers. Finally, the complex thermalization process results in one electron-hole pair for every 3.6 eV of energy transferred by the particle [34].

An ionizing particle that strikes a semiconductor device thus creates within picoseconds a highly conductive, neutral, and localized column of electron-hole pairs. Initially, the track radius is ≈ 0.1 µm and the charge density is ≈ 10^19–10^21 cm⁻³ [6]. If a junction has been struck, the device returns to its initial equilibrium state through two different mechanisms,

1. Drift of the carriers through local electrical fields, known as funneling, and
2. Diffusion of excess electrons and holes.

Recombination, a third possible mechanism, is in general negligible.

When an ionizing particle strikes a p-n junction, the generated column of carriers disturbs the electric fields of the depletion region. The created electrons and holes drift to the positive and negative potential sides of the depletion region, respectively. Therefore, the net charge in the original depletion region is reduced and the potential across it drops. The degree of potential drop depends on the number of electron-hole pairs generated within the depletion layer, according to Poisson's equation. Simultaneously, the electron-hole pairs in the quasi-neutral substrate region start to separate by one Debye length after one dielectric relaxation time. The slow-moving holes located in the center and the electrons at the edge of the free carrier column together determine the net charge.
The equipotential lines of the depletion layer rapidly spread down and envelop the entire length of the track, see Fig. 16. The electric field spreads down along the track into regions that originally were field-free. This generated electric field accomplishes the charge collection mechanism known as the funneling effect. Funneling is a relatively fast process, producing a sharp peak in the disturbing current. It generally starts a few picoseconds after the particle strike. Within ≈ 500 ps, the generated carrier density near the junction is comparable to the substrate doping, and the disturbed field relaxes back to its original position as the junction re-establishes itself. The timescale for charge collection by diffusion, as discussed below, is usually in nanoseconds.

The funneling concept was introduced by Hsieh [32, 33, 34]. It was named "funnel" because of the typical shape of the carrier column, see Fig. 16 (b). An in-depth theoretical discussion of electric currents in ion tracks, as related to charge collection in silicon devices, was published in [23].

Funneling strongly depends on the doping concentration of the substrate [34]. The speed of field spreading and charge collection is directly related to the dielectric relaxation time,
Figure 16: Charge generation and collection after a particle hit [6]. The picture on the right, panel (b), shows the funnel generated by the particle strike. The distortion of the local field is indicated.
which is proportional to the resistivity of the substrate material. Substrates with a higher resistivity, because of a lower doping concentration, show a slower field distortion. The funneling mechanism continues to collect charge until the carrier density is comparable to the substrate doping. Therefore, the amount of collected charge is larger if the doping concentration of the substrate is lower. The decrease in charge collection for highly doped substrates is thus caused by the reduction in funneling, not by a decreased carrier lifetime.

Figure 17 illustrates the charge collection in a junction struck by an alpha particle with an energy of 5 MeV, for two different doping concentrations of the substrate. The transient current is depicted in Fig. 17(a), showing that the current spike occurs more rapidly, has a higher maximum, and decays faster if the substrate concentration is larger. The charge collected by the struck node is shown in Fig. 17(b), which illustrates that the lightly doped substrate collects significantly more charge in the end.

If the incident angle is large, it is possible that an ionizing particle penetrates between n+ regions, for example, the source and the drain of a MOS transistor. In this case, the resulting distortion of the electrical field can cause the transistor channel to turn on, which can result in a state change.

Recently, a device simulation study was performed on simple semiconductor structures to analyze the basic mechanisms that are induced by a high density of electron-hole pairs created locally by an ionizing particle [37]. It appears that the funneling concept presents a view that is too restricted to allow for an accurate evaluation of the induced effects. Instead, all electrical modifications induced by the ion throughout the structure have to be considered.
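The dependence of the field-spreading speed on substrate doping can be illustrated with the textbook expression for the dielectric relaxation time, τ = ε ρ, where ρ = 1/(q µ N) is the substrate resistivity. The sketch below is a rough estimate under stated assumptions: a p-type substrate, a constant hole mobility of 450 cm²/(V s) (in reality the mobility depends on doping), and doping concentrations in cm⁻³ matching the two values of Fig. 17. None of these parameter choices or function names come from the cited work.

```python
EPS0_F_PER_CM = 8.8541878128e-14       # vacuum permittivity in F/cm
EPS_R_SILICON = 11.7                   # relative permittivity of silicon
ELEMENTARY_CHARGE_C = 1.602176634e-19  # C
ASSUMED_HOLE_MOBILITY = 450.0          # cm^2/(V s), assumed constant (hypothetical)


def resistivity_ohm_cm(doping_cm3, mobility=ASSUMED_HOLE_MOBILITY):
    """Resistivity rho = 1 / (q * mu * N) of a uniformly doped substrate."""
    return 1.0 / (ELEMENTARY_CHARGE_C * mobility * doping_cm3)


def dielectric_relaxation_time_s(doping_cm3, mobility=ASSUMED_HOLE_MOBILITY):
    """Dielectric relaxation time tau = eps * rho, in seconds."""
    return EPS_R_SILICON * EPS0_F_PER_CM * resistivity_ohm_cm(doping_cm3, mobility)


if __name__ == "__main__":
    for doping in (1.0e15, 7.0e15):  # assumed to be in cm^-3
        tau_ps = dielectric_relaxation_time_s(doping) * 1.0e12
        print(doping, "cm^-3:", resistivity_ohm_cm(doping), "ohm cm,", tau_ps, "ps")
```

With a fixed mobility the relaxation time scales inversely with the doping concentration, which is in line with the faster current spike shown for the more heavily doped substrate in Fig. 17(a).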
It is argued that the fast current peak (the prompt phase of the charge collection) is not due to the drift of carriers under influence of the disturbed electrical fields,
Figure 17: Transient current (a) and collected charge (b) at a struck junction, for two different substrate doping concentrations, N_B = 1×10^15 and N_B = 7×10^15 [34]. (Axes: (a) current in µA versus time in ns; (b) collected charge in fC versus time in ns.)
but instead is caused by capacitive coupling between the track and the device electrodes. The drain current (both magnitude and duration) was found to be mainly determined by the initial ion track conductivity [36].

Charge collection by diffusion is a process that takes several nanoseconds and is therefore significantly slower than charge collection by drift. Initially, diffusion is ambipolar, i.e., the electrons and holes diffuse together, keeping the semiconductor electrically neutral. The track cylinder expands radially because of this ambipolar diffusion. While the excess electrons and holes deeper in the substrate diffuse away, part of the diffusing electrons are captured by the sensitive node. An extensive mathematical analysis of diffusion in relation to charge collection was presented in [24].

If a biased active region is hit, charge collection is dominated by drift, with a secondary contribution from diffusion. On the other hand, if the strike takes place a few micrometers from the depletion region, funneling can still occur but with a time delay and at a lower magnitude. Initially, the induced carriers spread out through ambipolar diffusion. When the charge cloud reaches the depletion region and the carrier density is sufficiently high, the internal field is disturbed and the funneling process begins. If the particle strikes at some distance from a junction, diffusion is the main mechanism.

Besides funneling and diffusion, a third mechanism, due to the bipolar effect, can contribute to charge collection. This mechanism originates from the fact that a MOS transistor includes a parasitic bipolar transistor, with the source acting as the emitter and the region under the gate as the base. The bipolar mechanism is particularly important for circuits designed in silicon-on-insulator (SOI) technologies [39, 91].
In SOI technologies, the region under the gate, where the conduction occurs, is called the body, to distinguish it from the silicon substrate under the buried oxide. This body is often floating in SOI, i.e., it is not referenced to a potential. Body contacts are applied in some cases, but usually not in digital circuits, because of the additional area that the contacts consume. The floating body
is responsible for the transistor latch effect, which can be triggered by a single ionizing particle. Minority carriers generated in the drain-body junction can forward bias the source-body junction, causing the parasitic bipolar transistor to turn on. Part of the holes in the funnel diffuse to the source region, others drift to the transistor body and form the base current of the parasitic bipolar transistor [16]. Electrons can then flow from the source to the body and, via the funnel, to the drain. Basically, the parasitic bipolar transistor of the CMOS device amplifies the collected charge, thus reducing the sensitivity threshold. A survey of the upset mechanism in SOI transistors was published in [63].

Although the bipolar effect is more important in SOI processes, in bulk CMOS technologies the critical charge is also partly determined by the parasitic bipolar transistor included in MOS transistors [16, 19]. Figure 18 illustrates the bipolar mechanism. The drain current is shared between the source and the substrate electrodes during the first perturbation stage. Later, when the electrostatic potentials re-establish themselves, the source current changes sign, because electrons are collected that move from the funnel towards the source/substrate junction by diffusion and under the influence of the electric fields [70]. Note that device simulations are needed to investigate such mechanisms, see Sec. 5.
Figure 18: Schematic representation of carrier paths, several ps after the ion strike [16]. (Labels in the figure: Vd, Id, Is, Ib, drain, gate, source, p++ contact, p-well region.)
4.2 Circuit functioning
Charge collection by a node storing a data bit can cause a circuit upset. The most sensitive structure in semiconductor devices is the reverse-biased junction. The n+-p junction is more sensitive than the p+-n junction because electrons are more mobile than holes. Weakly driven reverse-biased nodes are more susceptible than strongly driven reverse-biased nodes. Forward-biased junctions are not sensitive, because the available current is sufficient to compensate the charge collection. Figure 19 shows the sensitive areas in an SRAM cell when the cell stores a logical “1”. The most sensitive region is the drain of transistor N1, the nMOS transistor in the “OFF” state, i.e., the load transistor of the inverter in the “1” state [70]. The drain area of N1 is
more sensitive than the drain of P2, because of the higher mobility of electrons compared with holes. In the case that the cell stores a “0”, the drains of transistors N2 and P1 are sensitive. In many practical situations, both nMOS and pMOS source/drain areas contribute to the upset rate of a node, not only in an SRAM cell but also, for example, in a dynamic register latch [42]. Charge generated by a single ionizing particle can be shared by two or more neighboring nodes. In these cases, the contributions from the different nodes have to be determined separately. Usually, circuit simulation is needed to calculate such effects.
(Circuit labels in Figure 19: VDD, VSS, bitline True, bitline False, wordline, P1, P2, N1, N1a, N2, N2a.)
Figure 19: Schematic representation of an SRAM cell. The shaded rectangles indicate the areas that are sensitive to strikes by ionizing particles, for the case that the cell stores a logical “1” [6]. The transistors N1 and N1a, on the one hand, and N2 and N2a, on the other hand, generally share a single diffusion area.
The diffusion area of the access transistor N1a, which is connected to the output of the inverter in the “1” state, is capable of collecting charge induced by an ionizing particle. However, the access and the load transistor (both NMOS) usually share a single diffusion area in the physical layout, see Fig. 20. The access transistors do not influence the change of state due to the SEU. The electric behavior of the SRAM cell during the upset is determined by the four transistors storing the logic state [70].
Because the conduction of the MOS transistors is determined by the gate potential, the node voltages have to be modified if the logic state of an SRAM cell is to be changed. The voltage variation must both exceed the threshold of the target inverter and last long enough to change the state of the second inverter. The potential variation is time dependent, because it depends on the equivalent node capacitance and the conduction states of the two transistors connected to the node. The equivalent capacitance of the node storing the data bit includes the two gate capacitances of the node and the capacitances of the drain-substrate junctions of the adjacent NMOS and PMOS transistors [71]. Typically, the inverter containing the struck transistor flips about 0.1–1 ns before the other inverter [70].
Figure 20: Layout of a 6-transistor SRAM cell [21]. The red box indicates the boundaries of the unit, the green regions are the polysilicon lines.
A particle hit has a different effect on dynamic and static RAM cells. DRAM cells are periodically refreshed with a refresh time that is relatively large (several ms) compared with the charge-collection times (several ps). Therefore, the total collected charge within the refresh cycle determines whether a DRAM cell changes state. Because leakage mechanisms are draining the storage capacitor of its charge, the critical charge of the DRAM cell decreases with time until the cell is refreshed. On the other hand, SRAM cells are stabilized by the cross-coupled inverters, which have a short response time (tens of ps). A current into the drain of the OFF NMOS transistor can only produce an upset if the current magnitude exceeds the maximum current that the connected PMOS transistor can drive. The higher the current magnitude, the faster the node potential varies. The potential threshold must be exceeded long enough to change the state of the second inverter and to ensure locking. Thus, a failure occurs only if enough charge is collected within a sufficiently short time, which makes pulse duration important for SEUs in SRAM cells [71].
Tracks that do not cross the drain of the OFF n-channel MOSFET, but pass close to it, are also capable of causing an SEU of an SRAM cell [69]. In this case, the current pulse is delayed, its width is decreased, and its peak value is reduced. The current shape is mainly controlled by the diffusion of the deposited charges and the collection of the charges that reach the drain junction. This is because the density of the carriers as they reach the junction is no longer high enough to collapse the electric field surrounding the junction. SEUs induced by nuclear reactions are only possible if the minimum distance between the ion track and the NMOS transistor drain is smaller than 2 µm.
4.3 Critical charge
The concept of critical charge (Q_crit) has been well established in SER research. However, the definition of critical charge depends on the techniques that are used to determine it [82]. In principle, the critical charge of a circuit should be defined in terms of the actual chip circuitry, including all process, device, and layout aspects. Because SEUs have different effects on SRAM and DRAM cells, the specification of Q_crit in SRAM and DRAM circuits is different as well.
The time a circuit needs to recover from a disturbing event depends on how close the collected charge is to the critical charge. If the collected charge is very close, the circuit could go into a metastable state with a long recovery time. Cell information is then lost, although the cell may return to the correct state and, by chance, the correct data could be read out.
The critical charge of a circuit is not a single-valued quantity but is a function of the shape of the transient charge pulse generated by the disturbing event, the position of the circuit on the chip, the power supply voltage, and process variations. An accurate calculation of the chip critical charge thus requires a circuit model incorporating all relevant process, device, and operating parameters. For this reason, SER calculations that are solely based on device simulations present an incomplete picture [82]. Although the critical charge calculated from theoretical models significantly increases for higher temperatures, this effect has not been observed experimentally. In fact, measured chip SER is only weakly dependent on temperature [108].
Recently, it was found by device simulation that the struck node current does not depend on the layout of the heavily doped regions of the cell. It is only determined by the ion column characteristics, in particular, by the column conductivity, which is much higher than the surrounding impedances in the cell [71].
In an SRAM cell, the node capacitance must be discharged and the PMOS drain current has to be overcome to change the potential of the storage node. The critical charge Q_crit, needed to flip the state of the SRAM cell, can be defined as [69],
$$ Q_{\mathrm{crit}} = \int_0^{T_{\mathrm{flip}}} I_{DN} \, \mathrm{d}t , \tag{3} $$
where I_DN denotes the drain current induced by the ionizing particle in the NMOS transistor and the flipping time T_flip defines the irreversibility point after which the feedback mechanism takes over from the noise current to continue the flipping process. The node charge is defined as the product of the equivalent node capacitance and the voltage change, which approximately equals the supply voltage,
$$ Q_{\mathrm{node}} \equiv C_{\mathrm{node}} V_{DD} . \tag{4} $$
Q_node is a lower limit of the charge needed to cause an upset. In good approximation, the critical charge can be estimated as the sum of a capacitive component Q_node and a conduction term [69, 71],
$$ Q_{\mathrm{crit}} = Q_{\mathrm{node}} + I_{DP} T_{\mathrm{flip}} , \tag{5} $$
where I_DP is the average drain current from the driving PMOS transistor. In many practical situations, however, it is acceptable to neglect the second term.
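As a numerical illustration of the estimates (4) and (5), the following minimal Python sketch evaluates the capacitive and conduction terms. The function names and all parameter values are hypothetical and chosen only for illustration; they are not taken from the cited references.

```python
def node_charge(c_node: float, v_dd: float) -> float:
    """Capacitive component of Eq. (4): Q_node = C_node * V_DD (coulombs)."""
    return c_node * v_dd


def critical_charge(c_node: float, v_dd: float,
                    i_dp: float = 0.0, t_flip: float = 0.0) -> float:
    """Estimate of Eq. (5): Q_crit = Q_node + I_DP * T_flip (coulombs).

    With the default arguments the conduction term is neglected.
    """
    return node_charge(c_node, v_dd) + i_dp * t_flip


# Hypothetical example values (assumptions, not taken from the report).
C_NODE = 2.0e-15   # equivalent node capacitance: 2 fF
V_DD = 1.2         # supply voltage: 1.2 V
I_DP = 50.0e-6     # average PMOS drain current: 50 uA
T_FLIP = 20.0e-12  # flipping time: 20 ps

q_node = node_charge(C_NODE, V_DD)
q_crit = critical_charge(C_NODE, V_DD, I_DP, T_FLIP)
print(f"Q_node = {q_node * 1e15:.2f} fC, Q_crit = {q_crit * 1e15:.2f} fC")
```

For these assumed values the capacitive term is 2.4 fC and the conduction term adds 1 fC, which shows how the relative weight of the second term can be checked before deciding to neglect it.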
5 Modeling and simulation
Simulation methods are necessary to investigate the soft-error sensitivity of different design elements at an early stage. The modeling of soft errors basically incorporates the modeling of three aspects:
• the radiation sources, their geometry, and charge track generation;
• the charge transport in the device fields and the charge collection by the critical nodes;
• the circuit failure.
To predict SERs of product chips accurately and efficiently, both the physics of charge generation and collection and the circuit responses to transient noise pulses must be understood. Both tasks are critical and challenging. Reference [82] gives an overview of the approaches and methodologies that are applied to investigate SERs in integrated circuits.
At present, comprehensive models to accurately predict the SER of SRAMs are not available [103]. The prediction of device SER within a factor of 2 requires the inclusion of accurate information about the sensitivity to low-energy (< 100 MeV) neutrons. A reliable estimate of the corresponding flux is also necessary. Neither type of data is currently available.
5.1 Modeling of radiation sources and charge tracks
Modeling alpha radiation sources and the charge tracks they generate in the device is relatively straightforward. If the energy spectrum, the alpha flux, and the geometries of the radiation sources are known, the charge production can be simulated by a Monte Carlo approach, randomly choosing positions, track angles, and emission energies [83]. The length of the alpha track can be determined from the stopping power for alpha particles in silicon.
The situation is more complicated for neutron-induced soft errors, due to the nuclear reactions that are involved. The charge collection from all tracks generated by the different nuclear fragments has to be calculated simultaneously. For accurate simulations, the details of the nuclear reactions have to be known, including the number of simultaneously produced particles, their atomic numbers, their atomic weights, the incident angles, and the incident energies.
In several studies, a simple assumption named the rectangular parallelepiped model [also known as first-order model (FOM)] is used. This model assumes that all charges created within a sensitive volume associated with the junction of interest are collected. If this amount is larger than the critical charge, an SEU occurs. It has been reported that the assumption is meaningful, provided that a second layer is included, from which part of the generated charge is collected by the junction [60].
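The combination of Monte Carlo track sampling and the rectangular parallelepiped model described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the method of any cited tool: the charge deposited per unit track length is taken as a constant (a real stopping power varies with particle energy), the source is a plane at the top face of the volume, and all names and numbers are hypothetical.

```python
import math
import random


def chord_length(origin, direction, box, track_range):
    """Length of a straight track inside an axis-aligned box.

    The box spans [0, size] along each axis; `direction` is a unit vector,
    and the track stops after `track_range` (slab intersection method).
    """
    t_near, t_far = 0.0, track_range
    for o, d, size in zip(origin, direction, box):
        if abs(d) < 1e-12:
            if o < 0.0 or o > size:   # parallel to this slab and outside it
                return 0.0
            continue
        t0, t1 = (0.0 - o) / d, (size - o) / d
        if t0 > t1:
            t0, t1 = t1, t0
        t_near, t_far = max(t_near, t0), min(t_far, t1)
        if t_near >= t_far:
            return 0.0
    return t_far - t_near


def upset_fraction(n_events, box, q_crit, charge_per_um, track_range,
                   margin, seed=1):
    """Fraction of simulated tracks that deposit more than q_crit in the box."""
    rng = random.Random(seed)
    lx, ly, lz = box
    upsets = 0
    for _ in range(n_events):
        # Emission point on the plane of the top face, widened by a margin.
        origin = (rng.uniform(-margin, lx + margin),
                  rng.uniform(-margin, ly + margin), lz)
        # Isotropic direction in the downward hemisphere.
        cos_t = rng.uniform(0.0, 1.0)
        sin_t = math.sqrt(1.0 - cos_t * cos_t)
        phi = rng.uniform(0.0, 2.0 * math.pi)
        direction = (sin_t * math.cos(phi), sin_t * math.sin(phi), -cos_t)
        # First-order model: all charge created inside the volume is collected.
        q_collected = charge_per_um * chord_length(origin, direction, box,
                                                   track_range)
        if q_collected > q_crit:
            upsets += 1
    return upsets / n_events


# Hypothetical inputs: lengths in micrometers, charges in fC.
BOX = (1.0, 1.0, 2.0)
frac = upset_fraction(20000, BOX, q_crit=5.0, charge_per_um=10.0,
                      track_range=25.0, margin=2.0)
print(f"fraction of tracks causing an upset: {frac:.4f}")
```

Multiplying such a fraction by the particle emission rate over the sampled source area would give an upset rate; the second collection layer mentioned in [60] is not included in this sketch.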
Generally, the sensitive volume is a basic concept in error rate calculations [16], as is the critical charge. The sensitive volume is used as a parameter more commonly for space applications than for terrestrial applications. The linear energy transfer (LET) parameter is also typically used in studies on SER in space applications [59].
To model the charge transport due to drift and diffusion, a Monte Carlo procedure can be used [83]. Information on the vertical impurity concentration profiles is necessary. A three-dimensional random-walk approach based on the Monte Carlo method is used in the proprietary IBM tool SEMM [62, 82] to investigate the transport of excess carriers through a device. The Monte Carlo approach largely reduces the need to solve three-dimensional device equations, allowing the simulation of several devices simultaneously. The Monte Carlo transport model is much simpler and less expensive than 3D device modeling methods when a large number of devices are to be modeled. Nevertheless, it requires a considerable amount of processing time and memory capacity.
5.2 Device level modeling
Device modeling is a form of SER evaluation that allows focusing on weak links in the circuit and device design at an early stage. Examples of weak links include floating bit lines, insufficient margins on sense amplifiers, and junctions exposed to the chip substrate [103]. Device simulation offers the possibility to study the phenomena that are at the origin of the SEU, which is not possible with experimental methods. Simulations in the device domain are usually computationally expensive; studies of several months on parallel computer systems using years of CPU time are no exception [21].
Device simulations focus on the collection dynamics of the charge generated by ionizing particles and can be used to study the device details, such as funneling and ion track structure [87]. However, this type of simulation does not fully take into account the chip circuitry. An ideal study case to investigate ionizing-particle induced phenomena in SRAMs is the modeling of an entire memory cell in the device domain [70]. To predict whether cell flipping is induced, properties of the ionizing particle have to be connected to electrical properties of the device. Several research groups work in this field, including the groups of Palau at the Université Montpellier II [13, 16, 36, 37, 69, 70, 71, 98] and Dodd at the Sandia National Laboratories [17, 18, 19, 20, 21]. Device simulation can be used to develop simple models for the shapes of the current pulses [69, 80].
To simulate the process of charge collection in detail, in many studies Poisson’s equation and the electron and hole continuity equations are solved simultaneously [32, 33, 34, 39]. Writing ε for the permittivity of the semiconductor, these equations read
$$ \nabla \cdot \left( \varepsilon \nabla \phi \right) = q \left( n - p - N_D^{+} + N_A^{-} \right) , \tag{6} $$
$$ \frac{\partial n}{\partial t} = \frac{k_B T}{q} \, \nabla \cdot \left[ \mu_n \, n_{\mathrm{int}}^{\mathrm{eff}} \exp\left( \frac{q \phi}{k_B T} \right) \nabla \left( \frac{n}{n_{\mathrm{int}}^{\mathrm{eff}}} \exp\left( \frac{-q \phi}{k_B T} \right) \right) \right] - R , \tag{7} $$
$$ \frac{\partial p}{\partial t} = \frac{k_B T}{q} \, \nabla \cdot \left[ \mu_p \, n_{\mathrm{int}}^{\mathrm{eff}} \exp\left( \frac{-q \phi}{k_B T} \right) \nabla \left( \frac{p}{n_{\mathrm{int}}^{\mathrm{eff}}} \exp\left( \frac{q \phi}{k_B T} \right) \right) \right] - R , \tag{8} $$
where φ is the electrostatic potential, n and p denote the electron and hole densities, respectively, the parameter R represents the net recombination rate due to Shockley–Read–Hall (SRH) and Auger recombination, n_int^eff is the effective intrinsic carrier density, and µ_n and µ_p denote the electron and hole mobilities, respectively. Most device simulation approaches use finite-element methods to solve the set of equations numerically.
Reference [35] presented a simple physical model describing the most important features of funneling. The analytical model of [35] is sometimes used in device simulations for the charge collection due to drift. A similar analytical model for the charge collection due to diffusion was reported in [42]. Recently, a model was presented to simulate the current and the collected charge induced by the bipolar mechanism [91]. The bipolar effect is particularly important for SOI technologies.
A state-of-the-art device-level simulation of a memory cell starts with the setup of its three-dimensional structure and the associated grid [69]. The setup reflects the dimensions and the doping profiles of the simulated technology; mesh refinement is performed in the regions of interest. When the cell has been defined, the physical models to be used in the simulation are selected. In addition to the standard drift-diffusion laws, the models include the field- and concentration-dependent mobility, taking into account the effect of velocity saturation of the carriers; concentration-dependent SRH and Auger recombination; and band-gap narrowing for heavily doped regions. The effect of a particle strike is then simulated by a column of electron-hole pairs with a Gaussian radial distribution. One of the difficulties in modeling soft errors is to predict the percentage of generated electron-hole pairs that are actually collected by the critical node where the data is stored.
The interaction between the ionizing particle and the semiconductor material is generally modeled in 3D device simulators with a column of electron-hole pairs, where the main axis is the particle track. The charge generation model uses a Gaussian radial distribution of charges with a fixed characteristic radius of 0.1 µm and a Gaussian time distribution centered at 1 ps. The carrier generation is taken into account through an external generation function added to the semiconductor continuity equations.
Using results from full-cell device simulations, it has been possible to develop a method to predict SEU rates due to neutrons, based on the effect of the secondary ions [36]. The method makes use of the possibility to define the properties that the secondary ions need in order to induce an upset as a function of the location of the generation point in the device. Also, by performing device simulations for several LET values, the evolution of the sensitive area as a function of LET can be demonstrated [21].
Simulations in the device domain used to be very difficult to perform because of limitations of the numerical techniques and a prohibitive computational burden. Therefore, a simulation methodology named mixed-mode simulation was developed, which is presently widely applied [16, 18, 70]. The principle of this technique is to limit the use of the device simulator to the struck transistor, while the rest of the cell is represented by compact circuit models. The device and circuit equations are then solved simultaneously. An important drawback of this method is that coupling effects between adjacent transistors cannot be taken into account. It is also not possible to study the drift or diffusion of generated
carriers through the differently doped regions of the transistors.
5.3 Circuit level modeling
Circuit simulations can be used to determine the SER sensitivity of different circuit configurations. To compute the critical charge Q_crit, a series of alpha particle models with varying energies are applied in the circuit simulation. The model with the lowest energy that is capable of corrupting a logical output defines the critical charge. Clearly, estimating Q_crit by simulating the circuit with all parasitic capacitances and diodes extracted from the actual layout is superior to using an approximation such as (4) in Sec. 4 [31].
Thus, to model the chip failure rate, a circuit analysis has to be done to determine the effect of the current pulses at the various nodes [83]. The expected shape of the current pulse has to be known beforehand for each of the p/n junctions. The shape of the current pulse generated by an ionizing particle depends on the exact geometry of the junction, the incident angle of the striking particle, the doping profile, etc. In practice, however, a reasonable approximation of the exact pulse shape gives satisfactory results. The current shape is often approximated by a double exponential,
$$ i_{\mathrm{coll}}(t) = \frac{Q_{\mathrm{coll}}}{\tau_{\mathrm{fall}} - \tau_{\mathrm{rise}}} \left[ \exp\left( \frac{-t}{\tau_{\mathrm{fall}}} \right) - \exp\left( \frac{-t}{\tau_{\mathrm{rise}}} \right) \right] , \tag{9} $$
where Q_coll represents the total collected charge, and τ_rise and τ_fall are the rise and decay time constants of the pulse, respectively. Piece-wise linear waveforms have been applied as well, with a peak representing the funneling charge collection and a tail accounting for the diffusion charge collection [15].
A compact simulation methodology for SERs due to alpha particles was proposed in [15] and [27]. The simulation flow is depicted in Fig. 21. The circuit critical charges are estimated from the circuit netlist, including parasitics, and model waveforms representing the injected current pulse due to alpha particle strikes. The critical charge data are combined with a compact model for charge generation and collection through a statistical algorithm, resulting in SER data. Information on the geometry of the chip is required as input. The difficult part is, of course, the construction of the compact model.
Figure 21: Simulation flow for circuit alpha-SER [15]. (Blocks in the figure: schematic, layout, RC extraction, circuit netlist, alpha waveform, critical charge simulation, critical charges, geometry extraction, geometry, compact model based statistical SER FIT simulation, circuit SER FIT.)
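The double-exponential pulse of (9) is easy to evaluate numerically, for example when building a current source for a circuit simulation. The Python sketch below evaluates the pulse, its peak time, and a numerical check that the pulse integrates to the total collected charge. The function names and the parameter values are hypothetical and serve only as an illustration.

```python
import math


def double_exp_current(t, q_coll, tau_rise, tau_fall):
    """Double-exponential current pulse of Eq. (9); zero for t < 0."""
    if t < 0.0:
        return 0.0
    scale = q_coll / (tau_fall - tau_rise)
    return scale * (math.exp(-t / tau_fall) - math.exp(-t / tau_rise))


def peak_time(tau_rise, tau_fall):
    """Time at which the pulse of Eq. (9) reaches its maximum."""
    return (tau_rise * tau_fall / (tau_fall - tau_rise)
            * math.log(tau_fall / tau_rise))


def collected_charge(q_coll, tau_rise, tau_fall, t_end, n_steps=20000):
    """Trapezoidal integral of the pulse from 0 to t_end."""
    dt = t_end / n_steps
    total = 0.0
    for k in range(n_steps + 1):
        weight = 0.5 if k in (0, n_steps) else 1.0
        total += weight * double_exp_current(k * dt, q_coll,
                                             tau_rise, tau_fall)
    return total * dt


# Hypothetical pulse parameters (assumptions for illustration only).
Q_COLL = 10.0e-15    # total collected charge: 10 fC
TAU_RISE = 5.0e-12   # rise time constant: 5 ps
TAU_FALL = 50.0e-12  # decay time constant: 50 ps

t_pk = peak_time(TAU_RISE, TAU_FALL)
q_num = collected_charge(Q_COLL, TAU_RISE, TAU_FALL, t_end=20.0 * TAU_FALL)
print(f"peak at {t_pk * 1e12:.2f} ps, integrated charge {q_num * 1e15:.4f} fC")
```

The peak time follows from setting the time derivative of (9) to zero; the integration check confirms that the prefactor in (9) normalizes the pulse to Q_coll.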
A standard method for computing critical charges was presented in [26]. The method was demonstrated for bipolar SRAM arrays. Soft error events were simulated by connecting a current source to the cell. For the pulse shape a modification of the expression of (9) was used, characterized by a single time constant τ_pulse,
$$ i_{\mathrm{coll}}(t) = K \, \frac{Q_{\mathrm{coll}}}{\tau_{\mathrm{pulse}}} \sqrt{\frac{t}{\tau_{\mathrm{pulse}}}} \, \exp\left( \frac{-t}{\tau_{\mathrm{pulse}}} \right) , \tag{10} $$
where K is a constant such that the pulse shape integrates to unity. First, the cell operation was studied as a function of the isolated-cell standby current. Next, the peripheral circuits and array organization were investigated to understand their effects on the cell current variation.
In order to obtain numerical values for the critical charge, an operational definition of critical charge is required. This definition has to be physically meaningful, quantifiable in a repeatable manner, and calculable by a circuit simulator. Therefore, the charge that results in a cell recovery to a 100 mV cell differential voltage after 10 ns was selected as the working definition in [26]. This method of defining a critical charge is generally applicable to other memory and logic types. However, the typical parameters depend on many factors, such as technology and applied design style, and could therefore be unique for each design.
Several other studies on circuit models for simulation of soft errors in SRAM devices have been published [12, 35, 51, 75]. A review of simulation work on SRAM devices in SOI technologies can be found in [63].
The cells in a storage array are designed similarly. However, they can have dissimilar operating characteristics due to differences in their electric and thermal environments. For example, it was observed in [26] that the largest Q_crit was obtained for the cell nearest to the word decoder, with the highest temperature and the lowest line resistance. The spread in critical charge for the bipolar array studied there was predicted to be from 340 to 755 fC.
Thus, the upper bound on Q_crit was more than a factor of two larger than the lower bound! Applying a 3σ tolerance for the critical charge resulted in a 60× variation in simulated SER. This illustrates that large variations in SER are to be expected.
Because a soft error is a very low-probability event and usually only a small number of discrete observations are made (either from simulations or measurements), determining the confidence bounds of the SER results is important [82]. In the case of a relatively large number of failures, a Gaussian statistical distribution can be applied. However, for small numbers of failures the Poisson distribution is more adequate.
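To illustrate the remark on confidence bounds, the following Python sketch computes an exact two-sided confidence interval for the mean of a Poisson distribution from a small number of observed failures, using only the standard library. The interval construction shown here (inverting the Poisson cumulative distribution by bisection) is one common choice and is not necessarily the procedure used in [82]; the function names and the example exposure are hypothetical.

```python
import math


def poisson_cdf(k, mu):
    """P(X <= k) for X ~ Poisson(mu), with terms evaluated in log space."""
    if k < 0:
        return 0.0
    if mu <= 0.0:
        return 1.0
    log_mu = math.log(mu)
    terms = (math.exp(i * log_mu - mu - math.lgamma(i + 1))
             for i in range(k + 1))
    return min(1.0, math.fsum(terms))


def _solve_decreasing(func, target, lo, hi, iterations=200):
    """Bisection for func(mu) == target, with func decreasing in mu."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if func(mid) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def poisson_interval(k, confidence=0.95):
    """Exact two-sided confidence interval for a Poisson mean, given k events."""
    alpha = 1.0 - confidence
    hi = k + 10.0 * math.sqrt(k + 1.0) + 20.0   # generous upper bracket
    lower = 0.0 if k == 0 else _solve_decreasing(
        lambda mu: poisson_cdf(k - 1, mu), 1.0 - alpha / 2.0, 0.0, hi)
    upper = _solve_decreasing(
        lambda mu: poisson_cdf(k, mu), alpha / 2.0, 0.0, hi)
    return lower, upper


def rate_interval(k, exposure, confidence=0.95):
    """Confidence interval for the failure rate, for k failures in `exposure`."""
    lower, upper = poisson_interval(k, confidence)
    return lower / exposure, upper / exposure


# Hypothetical example: 3 failures observed in 1.0e6 device-hours.
low_rate, high_rate = rate_interval(3, 1.0e6)
print(f"95% interval: {low_rate:.3e} to {high_rate:.3e} failures per device-hour")
```

Even when no failure is observed, the upper bound is non-zero, which is why a test without fails still yields a finite upper limit on the SER.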
5.4 SER simulators
The soft error Monte Carlo Modeling (SEMM) program is an IBM proprietary tool to calculate the SER of semiconductor chips from design and radiation data [62, 82, 83, 84]. As such, it is an example of an industrial approach to SER modeling. SEMM is applied mainly to determine if chip designs meet SER specifications and enables the designer to
make changes in order to meet specifications. It is also used to determine specifications on the level of radioactive impurities in the chip and package materials. SEMM is used for soft errors caused by both alphas and cosmic particles. A detailed description of the SEMM package was given in [84]; a simplified flow diagram is shown in Fig. 22.
The SEMM kernel needs a number of inputs. A geometrical description of the semiconductor region of interest has to be available in the form of a layout file. The vertical dopant profiles for the devices and the radiation source data have to be provided as well. In the case of an alpha source, a geometrical description of the source (or sources) is given. For cosmic sources, a nuclear physics program computes the energies and directions of the generated secondary particles for each possible primary collision event [88].
Figure 22: Basic flow diagram for the soft error Monte Carlo modeling (SEMM) program [62, 84]. (Blocks in the figure: layout, dopant profiles, radiation data, SEMM, charge collection data, wiring rules, alpha/cosmic flux, circuit simulator, critical charges, post processing, total SER/bit.)
A typical SEMM simulation includes 10,000 events, or more if the SER is low, and is the time-consuming step in the flow. Each event that results in a significant charge collection is listed in a charge collection file. Each entry in this file gives transient details of the charge collection at all relevant locations in the circuit. The charge collection file is applied later in the post-processing step. Additional inputs needed for post-processing include the critical charge definitions, obtained from circuit simulations. Wiring rules can also be specified, which define how the charges that are collected at the device nodes combine to form a composite charge. Finally, the alpha or cosmic flux data are provided. Integration over all energies then gives the total SER per bit or per chip.
Another SER simulation tool is the neutron induced soft error simulator (NISES) from Fujitsu [93, 94, 95], which models neutron-induced charge collection by combining the modeling of neutron-nucleus reactions and electron-hole pair generation. The nuclear reaction simulation is based on the antisymmetrized molecular dynamics (AMD) method. Electron-hole pair generation is modeled following the approach proposed by Ziegler [101]. Neutron-silicon reactions are randomly generated in the silicon substrate using a Monte Carlo method.
5.5 Empirical models
Neutron-induced soft error rates can be computed with simulators such as SEMM [62] or NISES [89], discussed above, or with simplified models. A simple method to estimate neutron-induced SER was reported in [90]. The method is based on a modified version of the burst generation rate (BGR) model [52, 106]. According to the version of the BGR model discussed in [52, 95], the neutron-induced SER can be expressed as a function of the critical charge Q_crit,
$$ \mathrm{SER}(Q_{\mathrm{crit}}) = \mathrm{BGR}(Q_{\mathrm{crit}}) \, \Phi_{\mathrm{neutron}} \, V_{\mathrm{sens}} \, C , \tag{11} $$
where Φ_neutron is the neutron flux, V_sens is the sensitive volume (defined as the sensitive area A_sens times the sensitive depth d_sens), and C denotes the collection efficiency. If all charges induced in the sensitive volume are assumed to be collected by the junction, C is equal to 1. Because the induced charge is dependent on the sensitive depth d_sens, the BGR was redefined in [90] as BGR(d_sens; Q_crit). The parameter C, which has to be determined from experimental data, can then be discarded and the SER can be expressed as,
$$ \mathrm{SER}(Q_{\mathrm{crit}}) = \mathrm{BGR}(d_{\mathrm{sens}}; Q_{\mathrm{crit}}) \, \Phi_{\mathrm{neutron}} \, V_{\mathrm{sens}} . \tag{12} $$
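As a numerical illustration of (12), the sketch below multiplies a BGR value by the neutron flux and the sensitive volume and converts the result to FIT (failures per 10^9 device-hours). All numbers are placeholders chosen for illustration, and the units (BGR in cm²/µm³, flux in neutrons/cm²/h, lengths in µm) are an assumption of this sketch; in the method of [90] the BGR value would be taken from tabulated simulation results.

```python
import math


def ser_per_hour(bgr, flux, a_sens, d_sens):
    """Eq. (12): SER = BGR(d_sens; Q_crit) * flux * V_sens, in errors per hour.

    Assumed units: bgr in cm^2/um^3, flux in neutrons/(cm^2 h),
    a_sens in um^2, d_sens in um.
    """
    v_sens = a_sens * d_sens   # sensitive volume in um^3
    return bgr * flux * v_sens


def to_fit(errors_per_hour):
    """Convert errors per hour to FIT (failures per 1e9 device-hours)."""
    return errors_per_hour * 1.0e9


# Placeholder inputs (hypothetical, for illustration only).
BGR_VALUE = 1.0e-13   # cm^2/um^3, for the assumed d_sens and Q_crit
FLUX = 20.0           # neutrons/(cm^2 h)
A_SENS = 0.5          # um^2
D_SENS = 1.0          # um

ser_bit = ser_per_hour(BGR_VALUE, FLUX, A_SENS, D_SENS)
fit_per_mbit = to_fit(ser_bit) * 1.0e6
print(f"SER = {ser_bit:.2e} errors/h per bit, {fit_per_mbit:.0f} FIT/Mbit")
```

Because the SER in (12) is linear in each factor, an uncertainty of a given factor in the BGR value or in the sensitive volume translates directly into the same factor in the predicted SER.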
The BGR values for a range of values of Q_crit and d_sens were computed by the NISES simulator and reported in [90]. The parameters in the model are determined from design and process data. The sensitive area A_sens is defined as the junction area plus the depletion area. The sensitive depth is computed using the funneling model [75],
$$ d_{\mathrm{sens}} = W \left( 1 + \beta \, \frac{N_0}{N_0 + N_A} \right) , \tag{13} $$
where W is the junction depth plus the depletion width. The value of Q_crit is approximated as the product of the node capacitance C_node and the supply voltage V_DD. Although the estimations of the model parameters are rough, the results are reasonably good for neutron-induced SERs, because these vary less with changes in the parameters than alpha-induced SERs. The model has been shown to predict measured neutron-induced SERs within a factor of five for 0.35 and 0.25 µm technologies. The accuracy of the model for designs in technologies with gate lengths below 0.25 µm is unknown.
A different methodology for SER characterization of an IC manufacturing process was proposed in [29], where an empirical method was reported that is capable of predicting the cosmic-ray neutron SER of an arbitrary circuit in a particular process. The model is intended to help control the SER performance of, for example, SRAM circuits in the design stage. Given the neutron flux Φ_neutron, the number of soft errors N_error occurring during a time T can be expressed as
$$ N_{\mathrm{error}} = \sigma_{\mathrm{total}} \, \Phi_{\mathrm{neutron}} \, T , \tag{14} $$
where σ_total is the total cross section of a circuit, which is the sum of the partial cross sections σ_i with which each node contributes to the total SER,
$$ \sigma_{\mathrm{total}} = \sum_i \sigma_i . \tag{15} $$
Given certain strike conditions, the waveform of the noise pulse can be described by a charge Q and a vector s describing the pulse shape. Q_crit is a function of the supply voltage V_DD and s, and can be determined by circuit simulation. For a given neutron radiation environment, the cross section of a node with diffusion area A_diff can be expressed as,
$$ \sigma_i (V_{DD}, A_{\mathrm{diff}}, Q_{\mathrm{crit}}) = A_{\mathrm{diff}} \, p_{\mathrm{env}} \left( V_{DD}, Q_{\mathrm{crit}}(V_{DD}, s_{\mathrm{eff}}) \right) . \tag{16} $$
In the case that the critical charge Q_crit of the node does not depend on the pulse shape, (16) reduces to,
$$ \sigma_i (V_{DD}, A_{\mathrm{diff}}, Q_{\mathrm{crit}}) = A_{\mathrm{diff}} \, p_{\mathrm{env}} \left( V_{DD}, Q_{\mathrm{crit}}(V_{DD}) \right) . \tag{17} $$
The function p_env is specific for each process, diffusion type (n-type or p-type), and neutron energy spectrum. Comparison with the BGR model [90] shows that p_env can be regarded as the product of the sensitive depth and the burst-generation rate function,
$$ p_{\mathrm{env}} \left( V_{DD}, Q_{\mathrm{crit}}(V_{DD}, s_{\mathrm{eff}}) \right) = d_{\mathrm{sens}} \, \mathrm{BGR} \left( d_{\mathrm{sens}}(V_{DD}); Q_{\mathrm{crit}} \right) . \tag{18} $$
This empirical model has two advantages compared with the BGR model of [90]. First, the model does not include the sensitive depth, which is difficult to determine accurately. Second, the dependence of the critical charge on the charge collection waveform is accounted for. A drawback is that the model needs to be calibrated for the manufacturing process of interest.
The parameters in the empirical model were determined using measurements on a test chip [31]. Because the charge collection process is fairly independent of the cell type, parameters derived from experimental data for one type of cell are applicable to all other cells for a given technology [77]. The sensor cell on the test chip is operated in one of two modes during SER measurements. In the first mode, the critical charge of the node of interest is independent of the waveform shape and the measured SER cross sections are used to determine the function p_env for the p-type and n-type diffusions separately. Critical charges can be determined within a few percent, which is significantly more accurate than when circuit simulations are used, because that approach leads to significant errors due to process variations. In the other mode, Q_crit depends on the pulse shape. The SER cross sections measured in that mode are applied to extract the value of s_eff by least-squares fitting.
It was verified that the empirical model can be satisfactorily applied to other circuits manufactured in the same process as the test circuits. The calculated SERs agreed within a factor of 1.8 with experimental data.
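The bookkeeping of the cross-section model in (14), (15), and (17) can be sketched as follows. The exponential form used for p_env is purely a placeholder assumed for this illustration; in the method of [29] the function is extracted from test-chip measurements for each diffusion type, at a given supply voltage. All names and numbers are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Node:
    a_diff: float    # diffusion area in cm^2
    q_crit: float    # critical charge in fC
    diffusion: str   # "n" or "p"


def p_env_placeholder(p0, q_scale):
    """Return a hypothetical p_env(Q_crit) = p0 * exp(-Q_crit / q_scale)."""
    return lambda q_crit: p0 * math.exp(-q_crit / q_scale)


def total_cross_section(nodes, p_env):
    """Eqs. (15) and (17): sum of A_diff * p_env(Q_crit) over all nodes."""
    return sum(node.a_diff * p_env[node.diffusion](node.q_crit)
               for node in nodes)


def expected_errors(sigma_total, flux, duration):
    """Eq. (14): N_error = sigma_total * flux * T."""
    return sigma_total * flux * duration


# Hypothetical calibration, one function per diffusion type (fixed V_DD).
P_ENV = {"n": p_env_placeholder(1.0e-5, 5.0),
         "p": p_env_placeholder(5.0e-6, 5.0)}

# Hypothetical circuit with two sensitive nodes.
NODES = [Node(a_diff=1.0e-9, q_crit=5.0, diffusion="n"),
         Node(a_diff=2.0e-9, q_crit=5.0, diffusion="p")]

sigma = total_cross_section(NODES, P_ENV)
n_err = expected_errors(sigma, flux=20.0, duration=1.0e9)  # flux per cm^2 h
print(f"sigma_total = {sigma:.3e} cm^2, expected errors in 1e9 h: {n_err:.3e}")
```

Keeping p_env as a separate, replaceable function per diffusion type mirrors the calibration step of the empirical model: only that function changes when the model is moved to another process.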
6 SER testing methods

Basically, circuit SER can be measured by using either system or accelerated test methods. In the case of system SER (SSER) testing, SER is measured under nominal conditions. Accelerated SER (ASER) testing includes the presence of an extra ionizing radiation source with a relatively high activity, for example, an alpha-emitting source or a neutron (or proton) beam. Special benchmark circuits can be used to extract more information about the SEU phenomenon than is possible with “off-the-shelf” circuits [31]. For example, specific test circuits can be used to measure critical charges.

Several assumptions are made in accelerated testing [108]. First, the intensity of the extra source must be such that circuits are allowed to recover to a quiescent state between fails. Otherwise, the chip may be anomalously sensitive. Secondly, testing is performed either under static or under dynamic conditions. In real systems, soft error sensitivity can be increased by noise due to circuit switching. This is particularly relevant for testing of cache memories or embedded memories in logic. An eightfold difference in SER has been observed between static and dynamic tests for some chips.

In principle, accelerated SER testing can only be used to obtain relative SER data, because of the required extrapolation. The only method to accurately measure absolute error rates is system SER testing. Neither testing method is capable of identifying specific causes of soft errors. This requires device modeling, which is discussed in Sec. 5. Test vehicles, however, can help to isolate and quantify various sources of soft errors [27].

An industry standard for the measurement of soft errors in semiconductor devices was set by JEDEC in 2001. The standard, labeled JESD89 [40], specifies the requirements and procedures for terrestrial SER testing of integrated circuits. Furthermore, it defines a standardized methodology for reporting the results of the tests. The procedures documented in this standard apply primarily to memory devices (DRAMs and SRAMs) but can also be used for logic devices, with some adjustments.
6.1 System SER test

System SER (SSER) measurement consists of testing a large number of devices for a sufficiently long period of time, i.e., for weeks or months. Enough soft errors have to be collected to determine an accurate estimate of the SER. The method does not require special experimental equipment, nor extrapolation or assumptions to determine the data of interest. Therefore, SSER testing is a direct method of measuring the SER of the actual product. However, it requires that hundreds or thousands of devices are tested in parallel for long periods of time. This is only feasible in a production environment, not for research purposes. A survey of SSER experiments, also called field testing, was given in [66].

The large number of devices and the long testing time are necessary to estimate SER with sufficient accuracy, because of the low failure rates. An example was presented in [103].
Consider a tester holding 500 test chips with an estimated SER of $5 \times 10^{-6}$ fails/hr. It takes 16 fails to obtain 50% reliability at 2σ, which will cost about nine months. In general, this testing time is unacceptable.

An alternative is to perform SSER tests at lower supply voltages, since SER sensitivity increases drastically with reduced voltage. A problem with this approach is that the various components of device SER have different voltage dependencies, as is discussed in Sec. 8. For example, SER due to alpha particles shows a much stronger bias dependency than neutron-induced SER. If the voltage is reduced, the contribution of alpha particles to the increase in SER will be larger than the cosmic component. Because the purpose of SSER testing is to evaluate device SER under nominal conditions, results from tests at reduced voltages have to be scaled to the nominal voltage. This can only be done if the voltage dependency is known for each of the contributions to the SER. The relative magnitude of each component also has to be known. In practice, it is very difficult to obtain these two types of data with sufficient accuracy.

Two conclusions with regard to SSER testing are drawn in [103]. The first is that SSER testing is useful for the validation of results from accelerated tests or modeling, but not for the evaluation of actual device SER, because testing is too expensive and takes too long. Secondly, reduced-voltage testing should be rejected because the results are not reliable enough due to the assumptions that have to be made.
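The arithmetic behind the tester example from [103] can be checked with a short sketch. It assumes that the quoted SER is per chip, that fails follow Poisson counting statistics (so that the relative spread of a count N is 1/√N), and that the "50% at 2σ" figure refers to that spread; these readings are assumptions, and the function names are illustrative only.

```python
import math

def sser_test_duration_hours(n_fails, n_devices, ser_per_device_per_hour):
    """Expected time needed to collect n_fails soft errors on one tester."""
    return n_fails / (n_devices * ser_per_device_per_hour)

def relative_uncertainty(n_fails, n_sigma=2.0):
    """Relative n-sigma spread of a Poisson count (sigma = sqrt(N))."""
    return n_sigma / math.sqrt(n_fails)

# Values from the example in the text: 500 chips, 5e-6 fails/hr, 16 fails.
hours = sser_test_duration_hours(16, 500, 5e-6)
months = hours / (24 * 365.25 / 12)  # average month length in hours

print(f"test time: {hours:.0f} hr (about {months:.1f} months)")
print(f"2-sigma relative spread: {relative_uncertainty(16):.0%}")
```

Under these assumptions the test time comes out at roughly 6400 hours, in line with the nine months quoted above.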
6.2 Accelerated SER test

A more commonly applied method to determine SER is accelerated SER (ASER) testing. In this method, devices are exposed to an extra radiation source. The energy spectrum of the emitted radiation is well-known and the intensity is both well-defined and typically several orders of magnitude higher than the intensity of background radiation. ASER testing requires only a fraction of the time that is needed for SSER tests. Only a few devices are needed and measurements can be performed within days or hours. Disadvantages of ASER testing are that extrapolation of the measured data is required and that different radiation sources have to be used to make sure that soft errors induced by both alpha particles and cosmic neutrons are accounted for.

It is important to stress that data from alpha-particle tests cannot be used to determine cosmic-ray induced SER, and that data from cosmic radiation tests cannot be used to predict alpha-induced SER. This is because the magnitude, shape, and distribution of the generated noise pulses are different for alpha particles and cosmic radiation. Therefore, voltage, timing and other correlations differ for the two sources. Testing is useful only when knowledge about the radiation environment at use conditions is available [27]. An overall assessment of the SER sensitivity of a device is only complete when both the alpha and the cosmic-ray components have been accounted for.

Accelerated tests are generally performed at different operating voltages to evaluate nonstandard operation and to obtain data for SER modeling. In some cases, the temperature of the DUT is also varied.
6.3 Alpha ASER test

Alpha sources can be categorized into sources simulating thorium (Th-232) and uranium (U-238) impurities in package materials, and sources simulating alpha-emission from Am-241 and Pb-210 in lead-based solder bumps [40]. In practice, however, the differences between the two types of sources are small, because both have approximately the same energy spectrum. For example, a Th-232 source was used in [97] to measure the SRAM SER due to alphas emitted by solder bumps.

Alpha sources for ASER testing come in a wide variety of form factors and isotopes that define their energy distribution, intensity, and applicability to simulating SER. The most widely used alpha sources are thorium isotopes (Th-228, Th-232) and Am-241. In general, solid foils are the best alpha sources since the alpha spectrum will be close to the spectrum in the real device. The typical configuration is either a metal substrate on which a radio-isotope has been deposited and diffusion-bonded by annealing or a solid radio-isotope foil.

The spacing between the extra alpha source and the DUT should be less than 1 mm to ensure that the DUT is exposed to alpha particles with virtually all possible angles of incidence. Also, the foil source area should be significantly larger than the area of the DUT. This particular configuration, shown in Fig. 23, resembles the situation under nominal conditions as closely as possible. The surface of a contaminated chip package or the aluminum wires on the chip contaminated with radio-active impurities can be considered as an infinitely thin layer of radiating material [42].
Figure 23: Configuration for α ASER testing [6, 40]. [Figure: cross section of the test setup, with labels: bond wire, adhesive with spacer, alpha-emitting source foil, test die, ceramic package.]

The measured SER is expressed as failure-in-time (FIT), where one FIT corresponds to one failure in $10^9$ device hours. The alpha FIT can be calculated using the relation,

$$\mathrm{FIT} = \frac{N_\mathrm{fail}}{T_\mathrm{test}}\, \frac{\Phi_\mathrm{nom}}{\Phi_\mathrm{src}}\, 10^9 , \tag{19}$$

where $N_\mathrm{fail}$ is the experimentally determined number of failures, $T_\mathrm{test}$ is the test duration in hours, and $\Phi_\mathrm{nom}$ and $\Phi_\mathrm{src}$ denote the alpha particle flux due to the impurities in the device (the nominal situation) and due to the alpha source, respectively, with $\Phi_\mathrm{src} \gg \Phi_\mathrm{nom}$.
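As a minimal sketch, Eq. (19) can be evaluated as follows, assuming it is read as the fail rate under the source scaled by the ratio of nominal flux to source flux. The function name and the numbers in the example are hypothetical and are not taken from any cited measurement. No geometrical correction factor is included.

```python
def alpha_fit(n_fail, t_test_hours, flux_nominal, flux_source):
    """Alpha-induced FIT from an accelerated test, following Eq. (19).

    flux_nominal and flux_source must be in the same units
    (for example alphas/cm^2-hr); only their ratio matters.
    """
    if t_test_hours <= 0 or flux_source <= 0:
        raise ValueError("test time and source flux must be positive")
    fails_per_hour_accelerated = n_fail / t_test_hours
    # Scale down to the nominal flux, then express per 1e9 device hours.
    return fails_per_hour_accelerated * (flux_nominal / flux_source) * 1e9

# Hypothetical example: 200 fails in 2 hours under a source that is
# 1e9 times more intense than the nominal alpha flux.
print(alpha_fit(n_fail=200, t_test_hours=2.0,
                flux_nominal=1e-3, flux_source=1e6))
```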
Equation (19) is an approximation, because it implicitly assumes that the energy spectra of the alphas coming from the source and in the nominal flux are the same and that the angular distributions are identical in both cases. These two conditions are in general not completely fulfilled [48]. To compute the FIT from measured ASER data, a geometrical factor has to be included. This factor is necessary because the experimental setup is not an ideal representation of the nominal situation [48, 81, 97].

The basic test requirement for memory arrays is that a known data pattern is stored in the array while the device is exposed to radiation and that the stored data is compared with the original pattern after the device has been irradiated. The applied data pattern generally has little effect on the measured SER. The memory could be filled with either all zeros, all ones, or a checkerboard pattern with alternating ones and zeros.

Besides the radiation source, the experimental setup basically contains an input stimulus generator and a response recorder. Testing requires some sequence of writing data to the device under test (DUT), reading the data back, comparing the output data with the written data, and tabulating the number of fails. For simple memory arrays, a bit is failing when the data read from that bit is different from the last data written to that bit. The addresses of the failing bits and the fail times can also provide useful information. Several testing procedures are used for the loading and interrogation of test arrays [105], including,

• Write once and read once (WORO). The pattern is written only at the beginning and read only at the end of the test run. The number of corrupted bits must be kept below 5% of the total, to avoid bits flipping state back and forth.
• Write once and read many (WORM). The pattern is written only at the beginning, but continuously read. A typical read cycle time is 100 ns.
• Read and write (R/W). The array is loaded with a pattern and the memory is continuously read, correcting any errors as they occur.
• Read and write complement (R/W-C). Same as R/W, except that the complement of the current state is written during every cycle.
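The write, read, compare, and tabulate sequence can be sketched in software for the WORO procedure. This is only an illustration of the bookkeeping: the irradiation step is replaced by a hypothetical `inject_upsets` function that flips randomly chosen, distinct bits, and all names are assumptions rather than part of any test standard.

```python
import random

def checkerboard(rows, cols):
    """Checkerboard data pattern with alternating zeros and ones."""
    return [[(r + c) % 2 for c in range(cols)] for r in range(rows)]

def inject_upsets(array, n_upsets, rng):
    """Stand-in for irradiation: flip n_upsets distinct random bits in place."""
    cols = len(array[0])
    for cell in rng.sample(range(len(array) * cols), n_upsets):
        r, c = divmod(cell, cols)
        array[r][c] ^= 1

def compare(written, read_back):
    """Return the addresses of bits that differ from the written pattern."""
    return [(r, c)
            for r, row in enumerate(written)
            for c, bit in enumerate(row)
            if read_back[r][c] != bit]

def woro_run(rows, cols, n_upsets, seed=0):
    """Write once, expose, read once; report fails and the 5% validity check."""
    rng = random.Random(seed)
    written = checkerboard(rows, cols)
    stored = [row[:] for row in written]   # contents of the device under test
    inject_upsets(stored, n_upsets, rng)   # exposure to the source
    fails = compare(written, stored)       # single read at the end of the run
    fraction = len(fails) / (rows * cols)
    return fails, fraction, fraction < 0.05

fails, fraction, valid = woro_run(rows=32, cols=32, n_upsets=10)
print(len(fails), f"{fraction:.2%}", valid)
```

Because the sketch flips distinct bits, every upset is counted. In a real WORO run a bit can be hit twice and flip back, which is why the corrupted fraction is kept small.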
6.4 Neutron ASER test

Information on accelerated testing of neutron-induced SER (nASER) can be found in the JESD89 standard [40] and in [108]. The cosmic neutron flux at sea level is discussed in Sec. 3.2 (see Fig. 10). The energy distribution of the Weapons Neutron Research (WNR) beam at Los Alamos National Laboratory closely matches the terrestrial neutron spectrum, with an intensity that is $1.38 \times 10^8$ times higher. This means that one hour in the beam is equivalent to 15,750 years under nominal conditions at sea level. Because its spectrum matches the actual environment so closely, the beam can be used in a relatively transparent way to determine SER from
high-energy neutrons (> 1 MeV). Therefore, the WNR beam at Los Alamos is the preferred facility. An alternative is to perform a series of experiments with mono-energetic sources at energies of 10 (or 20), 50, 100, and 150 MeV. The point data can then be used to estimate a continuous cross-section curve. Reference [108] reported rules of thumb to estimate the SER probability using data measured at only one or two energies. For energies greater than 50 MeV, protons produce reactions in silicon that are very similar to those generated by neutrons. Therefore, proton facilities, which are generally more readily available, may be used instead of neutron facilities for higher energy ranges. However, protons could produce failure modes that neutrons do not, such as total ionizing dose failure. Ziegler [103] suggests SER testing with neutron beams for energies below 30 MeV and protons above this energy. A list of available neutron and proton test facilities is given in [40, Annex D]. The experimental beam must be uniform over an area that is larger than the chip size. To compute SER levels, an accurate estimate of the neutron flux as a function of energy is required. A method to determine the neutron spectrum was described in [102] (see also [40, Annex E]). The best monitors for nuclear beams are so-called golden chips — chips with a known soft error sensitivity [108]. A possible alternative is to use a Faraday cup [109]. The required test equipment is basically the same as for alpha particle SER testing. Special requirements have to be fulfilled with respect to the SER sensitivity of the tester, because of the ambient radiation that is present in the experimental room [108]. Furthermore, the test equipment must allow the controller to operate the tester from some distance (behind shielding), and must be portable. Mapping of the observed failures as a function of the physical position on the chip can provide useful information. 
Even with a uniform beam, some fail pattern may exist because of voltage loss, which is a function of the position on the chip.

The SER is proportional to the integral of the bit failure cross section (which is energy-dependent) times the distribution of neutron energies,

$$\mathrm{SER} = N_\mathrm{bit} \int_0^\infty \sigma_\mathrm{bit}(E)\, \Phi_\mathrm{nom}(E)\, dE , \qquad \sigma_\mathrm{bit}(E) = \frac{N_\mathrm{fail}}{(N_\mathrm{neutron}/\mathrm{cm}^2)\, N_\mathrm{bit}} , \tag{20}$$

where $\sigma_\mathrm{bit}(E)$ is the bit-fail cross section and $\Phi_\mathrm{nom}(E)$ is the neutron flux under nominal conditions as a function of the neutron energy $E$. A cross section can be imagined as a target that is fired at. The larger the target area, the greater the probability of a hit. Analogously, the probability for a chip to be “hit” by a soft error is proportional to the size of its cross section.

If the WNR neutron beam is used, with its energy spectrum that is very similar to the terrestrial spectrum, the obtained SEU cross-section per bit can be applied directly to estimate the terrestrial failure rate. When integrated over the energy range of 10 to $10^4$ MeV, the neutron flux is approximately 14 neutrons/cm$^2$-hr. Neutrons with energies below 10 MeV can be discarded, because these will induce very few upsets compared to higher
energy neutrons. Therefore, the terrestrial SER is estimated as,

$$\mathrm{SER} \approx 14\, \sigma_\mathrm{WNR}\, N_\mathrm{bit} . \tag{21}$$
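The chain from a beam measurement to a terrestrial estimate can be sketched as follows. The per-bit cross section is computed as in Eq. (20) and inserted into Eq. (21). The constants 14 neutrons/cm²-hr and 1.38×10⁸ are taken from the text; the fail count, fluence, and array size in the example are hypothetical, as are the function names.

```python
HOURS_PER_YEAR = 24 * 365.25
WNR_ACCELERATION = 1.38e8   # beam intensity relative to sea level (from the text)
SEA_LEVEL_FLUX = 14.0       # neutrons/cm^2-hr for 10 MeV to 1e4 MeV (from the text)

def bit_cross_section(n_fail, fluence_per_cm2, n_bit):
    """Per-bit fail cross section in cm^2: fails per unit fluence per bit."""
    return n_fail / (fluence_per_cm2 * n_bit)

def terrestrial_ser(sigma_bit_cm2, n_bit, flux=SEA_LEVEL_FLUX):
    """Eq. (21): estimated sea-level SER in fails per hour."""
    return flux * sigma_bit_cm2 * n_bit

def to_fit(fails_per_hour):
    """Convert fails per hour to FIT (fails per 1e9 device hours)."""
    return fails_per_hour * 1e9

# One hour in the beam expressed as equivalent years at sea level.
equivalent_years = WNR_ACCELERATION / HOURS_PER_YEAR

# Hypothetical beam run: 100 fails on a 1 Mbit array after 1e10 neutrons/cm^2.
n_bit = 2 ** 20
sigma = bit_cross_section(n_fail=100, fluence_per_cm2=1e10, n_bit=n_bit)
fit = to_fit(terrestrial_ser(sigma, n_bit))

print(f"1 beam hour is about {equivalent_years:.0f} years at sea level")
print(f"sigma_bit = {sigma:.2e} cm^2, SER = {fit:.0f} FIT")
```

The sketch applies the 10 MeV cutoff implicitly through the flux constant, so it does not cover devices with significant thermal-neutron sensitivity.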
If devices have significant SER sensitivity to thermal neutrons, which is the case if boron fission is important, the 10 MeV cutoff is not appropriate. In some cases, accelerated testing of soft errors induced by boron fission ($^{10}$B-ASER) has to be performed, for example if BPSG has been used as a dielectric. In this case, use of low-energy neutron beams is required to determine the SER due to cosmic neutrons with energies less than 1 MeV.
7 SER in logic and in systems

7.1 SER in logic

Continuing technology scaling and the increase in frequency towards the GHz range have led to a number of consequences for the SER sensitivity of logic. First, critical charges are decreasing, due to smaller node capacitances and lower operating voltages. Furthermore, the propagation of a transient pulse (an SE glitch) has become much more efficient. Therefore, the SER sensitivity of logic is a substantial and growing concern [4].

Soft errors in logic differ from those in memories in several ways [4]. The effects of soft errors in logic are dynamic, as they depend on the arrival time and the width of the generated transient pulse. Also, the susceptibility to soft errors depends on the input vectors of the circuit. Another difference is that logic modules consist of non-regular topologies, whereas memories are composed of array-like structures. Furthermore, SEUs in logic are not one-to-one mappable to observable errors, unlike upsets in memories. Finally, the critical charge for an input of a logic gate is (slightly) affected by the levels at other inputs, because these levels influence the logic threshold of the gate [42].

The mechanism by which an ionizing particle can accomplish data corruption is more complex for digital logic circuits than for memories. A transient pulse that is generated in combinatorial logic can only cause false data at a critical point in the circuit if the following conditions are fulfilled:

1. The transient pulse has to be strong enough to generate a signal on one of the nodes in the circuit. The required strength of the pulse is dependent on the sensitivity of the nodes. The noise pulse has to compete with the functional signal pulses.
2. The combinatorial logic in the circuit has to be fast enough to propagate the signal. Nowadays, signals with a pulse width of ≈ 200 ps can be propagated.
3. The path that is traveled by the pulse has to be logically enabled. In other words, the path has to be sensitized.
4. The fault has to be latched. The probability that a noise pulse overlaps with a rising clock edge increases with clock frequency and pulse width, see Fig. 24. The dependency on pulse width is related to the dynamic properties of the latch, in particular to its setup and hold times. Typically, in the order of 1% of the glitches in the combinatorial part of a data path are latched [77].

An erroneous signal that is latched is named a soft fault. The soft fault is called a soft error only if it corrupts an output. Two types of soft error hits are shown in Fig. 25. A particle can directly hit a latch, or it can hit a susceptible node in the combinatorial logic. Dependent on the active logic paths, the generated noise pulse can result in a soft fault in the latch.

The susceptibility to soft errors of combinatorial logic is still relatively small compared with memories [77]. However, the SER sensitivity of the static latches that are used in
Figure 24: A transient pulse will only be latched if it coincides with the clocking edge. The probability that a pulse is latched is larger if the pulse is wider or the frequency is higher [22].
Figure 25: Direct hit of a latch and the propagation of a soft error hit in combinatorial logic [4].
core logic is approaching the sensitivity of memory cells. On the other hand, the fraction of upsets that result in a soft error at the system level is much smaller for latches than for memories. Therefore, the effective SER of core logic is still significantly smaller than the memory SER.

The neutron-induced component of the SER is larger for logic (and DRAM) than for SRAM, because the critical charges are larger. In fact, neutrons are the dominant source of soft errors for the current logic and DRAM technologies [58, 93, 94]. Latch circuits in 0.35 µm and older technologies are immune to soft errors induced by alphas, because their critical charges are usually ≥ 100 fC and the maximum collected charge due to an alpha particle is typically 50–70 fC [92].

An example of a deep-submicron CMOS circuit that is susceptible to soft errors is the dynamic pipeline register, where data is stored on floating nodes. These floating nodes are particularly sensitive, because the charge that is collected at the source and drain junctions of the pass transistors cannot be removed when these pass transistors are turned off [42]. A latch that is used in such a dynamic pipeline is shown in Fig. 26.
Figure 26: Dynamic pipeline latch [42]. [Figure: circuit schematic, with labels: sensitive junctions, node capacitance.]

If the upset rate $UR_{n,i}$ corresponds to hits in an nMOS source/drain junction (which can only discharge a node) and $UR_{p,i}$ corresponds to hits in a pMOS source/drain junction (which can only charge a node), the SER of a dynamic logic circuit can be expressed as [42, 77],

$$\mathrm{SER} = \sum_{i=1}^{n} N_i\, P_{\mathrm{obs},i}\, \frac{T_{\mathrm{sens},i}}{T_\mathrm{clk}} \left[ P_{H,i}\, UR_{n,i} + P_{L,i}\, UR_{p,i} \right] , \tag{22}$$

where $n$ is the number of different types of floating nodes, $N_i$ is the number of floating nodes of type $i$, $P_{\mathrm{obs},i}$ denotes the probability that an upset causes an observable error at the outputs of the circuit, $T_{\mathrm{sens},i}$ is the time that node $i$ is sensitive to an upset, $T_\mathrm{clk}$ is the clock period, and $P_{H,i}$ and $P_{L,i}$ denote the probability that node $i$ stores a logical “1” or a logical “0”, respectively.

Reference [77] categorizes soft errors into synchronous soft errors, showing a clock cycle time dependence, and asynchronous soft errors, which are independent of the clock frequency. For example, the SER of a memory is insensitive to system timing and is thus asynchronous. Also, soft errors due to the upset of latch data are asynchronous, because the time fraction that the node is susceptible to SEUs is independent of the clock frequency. The SER of combinatorial logic, however, is cycle time dependent [78].
Therefore, SERs in combinatorial logic increase linearly with increasing frequency, whereas SERs in sequential logic are approximately frequency-independent.

A study on the modeling of soft errors in combinatorial logic was performed at the VHDL level in [4]. This type of modeling involves several steps. First, the sensitivity of a node to a strike by an ionizing particle has to be determined by a statistical method. Next, the generation of signals that exceed the noise margin is calculated with an analytical model. The propagation of a signal to a latch through the active paths is determined using a set of test vectors. The latching of the signal during the setup-and-hold window, resulting in a soft fault, is again described statistically. Finally, the propagation of the soft fault to the external outputs is modeled by applying a test vector set.
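The dynamic-logic expression of Eq. (22) lends itself to a direct numerical evaluation. The sketch below assumes the equation is read as a sum over node types of N_i · P_obs,i · (T_sens,i / T_clk) · (P_H,i · UR_n,i + P_L,i · UR_p,i); the class name, field names, and example values are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class FloatingNodeType:
    count: int      # N_i, number of floating nodes of this type
    p_obs: float    # probability that an upset is observable at the outputs
    t_sens: float   # time the node is sensitive (same unit as t_clk)
    p_high: float   # probability that the node stores a logical 1
    p_low: float    # probability that the node stores a logical 0
    ur_n: float     # upset rate for nMOS junction hits (discharge a node)
    ur_p: float     # upset rate for pMOS junction hits (charge a node)

def dynamic_logic_ser(node_types, t_clk):
    """Sum the contribution of every floating-node type, as in Eq. (22)."""
    total = 0.0
    for nt in node_types:
        # An nMOS hit matters when a 1 is stored, a pMOS hit when a 0 is stored.
        upset_rate = nt.p_high * nt.ur_n + nt.p_low * nt.ur_p
        total += nt.count * nt.p_obs * (nt.t_sens / t_clk) * upset_rate
    return total

# Hypothetical circuit: 1000 identical nodes, sensitive for half the clock period.
nodes = [FloatingNodeType(count=1000, p_obs=0.1, t_sens=1e-9,
                          p_high=0.5, p_low=0.5, ur_n=2e-9, ur_p=1e-9)]
print(dynamic_logic_ser(nodes, t_clk=2e-9))
```

The result carries the units of the upset rates, so rates given in upsets per hour yield an SER in fails per hour.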
7.2 SER in systems

The failure rate of microprocessors, containing both random logic and memory cells, was studied in [77]. Several generations of Compaq’s Alpha microprocessor were investigated for alpha-particle induced SER. The technologies that were considered ranged from 0.5 to 0.18 µm process nodes. Simplified, the overall SER for a given pattern (instructions and data) can be expressed as,

$$\mathrm{SER}_\mathrm{sys} = \delta_\mathrm{cache}\, \mathrm{SER}_\mathrm{cache} + \delta_\mathrm{core}\, \mathrm{SER}_\mathrm{core} , \tag{23}$$

where $\delta_\mathrm{cache}$ and $\delta_\mathrm{core}$ are the effective derating factors for the cache and core contributions, respectively. These derating factors denote the fractions of soft errors that actually result in a system error. It was found that the chip-level SER for microprocessors manufactured in a 0.5 µm technology was dominated by the SRAMs in the cache memory. For chips produced in a 0.18 µm process, however, the SER of the core logic was larger than the SRAM SER. One of the causes is that ECC protection of the data cache is applied in the newer versions. As a consequence, the SER sensitivity of the Alpha microprocessor has decreased over the latest process generations.

Further, it was observed in [77] that the increase in clock frequency had a relatively small effect on the SER. For a frequency of 200 MHz, the SER was found to be approximately 20% higher than for a frequency of 20 MHz. The relatively weak dependence on frequency suggests that the majority of soft errors in the core are asynchronous.

Reference [38] reported an integrated test environment for the evaluation of the SER sensitivity of memory subsystems and its impact at the system level. The study was motivated by the fact that memory subsystem reliability is an increasingly important issue in system design. Contrary to performance figures of merit, however, reliability marks for memory subsystems are usually not available.
This makes it very difficult or impossible to accurately assess the reliability impact of memory subsystem errors on a large, complex system. Therefore, an experimental environment for the characterization and gradation of the SER sensitivity of memory subsystems was proposed. The key components of the method are circuit simulation, testing of radiation effects, and
mixed-mode simulation, see Fig. 27. The SEU susceptibility of sensitive design components is determined by circuit simulations in which transient faults are inserted. The sensitive components include storage cells, the sense amplifier, data-bit lines, and the decoder. The circuit simulations model a single-event radiation effect as a device-level charge collection process. The circuit simulations serve to estimate parameters such as the critical charge. Ions from a cyclotron facility with mono-energetic beams were injected into target memories to test the radiation effects. The percentage of multi-bit errors was found to differ significantly for the DRAM components of different manufacturers.

The SER statistics obtained from the experiments were fed to a mixed-mode system-level simulator. The simulator applies gate-level and behavioral models as design input. The simulation engine for fault injection consists of three parts: an event (“experiment”) scheduler, a bit-wise parallel simulator, and a fault/error behavior monitor. These three programs enable the study of the correlation between the component-level SERs and the system-level impact.
Figure 27: Integrated testing environment [38]. [Figure: block diagram, with labels: circuit description, transient fault, circuit-level simulator (Spice), radiation experiments, input parameter processor, workload (instruction stream), mixed-mode system-level simulator, design/configuration analysis, statistical analysis, system impacts analysis.]

A fault can manifest itself at the system level in different forms, as is illustrated in Fig. 28. In a case study with DRAM subsystems, it was found that 5% of the total fault injections resulted in a system error. It was demonstrated that parity check is not effective for improving reliability if multi-bit upsets are important. On the other hand, the inclusion of error correction code reduced the system-level SER drastically, but not to zero.
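The limitation of parity checking under multi-bit upsets can be made concrete with a small sketch. It assumes a single even-parity bit per stored word, which is a generic scheme and not necessarily the one studied in [38]; the function names are illustrative.

```python
def parity_bit(bits):
    """Even-parity bit: 1 if the word contains an odd number of ones."""
    return sum(bits) % 2

def parity_detects(word, flipped_positions):
    """Return True if a parity check notices the given bit flips."""
    stored_parity = parity_bit(word)       # computed when the word is written
    corrupted = list(word)
    for pos in flipped_positions:
        corrupted[pos] ^= 1                # upset in the stored data
    return parity_bit(corrupted) != stored_parity

word = [1, 0, 1, 1, 0, 0, 1, 0]
print(parity_detects(word, [3]))      # single-bit upset
print(parity_detects(word, [3, 4]))   # double-bit upset in the same word
```

A single flip changes the parity and is detected, whereas two flips in the same word restore the original parity and pass unnoticed.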
Figure 28: Manifestation of an injected fault at the system level [38]. [Figure: flow diagram, with labels: bit-flip fault injected into memory; overwritten; non-overwritten; correct result; error detection and recovery logic; complete termination; wrong result; no error detected; error recovered; illegal op-code; illegal address; time-out; incomplete termination; error detected (not recovered).]
8 Scaling effects

The importance of the SER topic increases with every new IC process generation. Technology scaling has a number of implications for the SER [27]. First, due to smaller feature sizes, node capacitances are reduced. Because the supply voltage also decreases, the amount of charge that represents a single data bit is decreasing, as is shown in Fig. 29. Furthermore, the advancement in chip design has resulted in an increasingly large number of transistors per chip. Thus, even if the SER sensitivity per cell or per gate remains the same, the chip SER increases. Also, in some cases the number of chips that is incorporated in a system increases, which has a negative effect on the SER at system level. The use of flip-chip packaging also has an impact on SER, because solder bumps are contaminated with radioactive impurities producing alpha particles. An additional consequence of the voltage lowering is that the noise margins decrease. Moreover, gate delays reduce and clock frequencies increase. The latter three trends are particularly important for the SER of digital logic.
Figure 29: Amount of charge representing a data bit as a function of process technology generation [22].

Soft errors have become a concern for random logic as well as for memories, because of ongoing technology scaling and the application of design styles that aim for high performance, for example, by using dynamic logic. Furthermore, the percentage of failure events that involve more than one bit is increasing, due to the lower voltage, smaller node capacitance, and higher density.
8.1 Scaling of critical charge and collection efficiency

Two important parameters affecting the SER are the supply voltage and the node capacitance, because these are directly related to the critical charge. In first order, the critical charge can be approximated by $Q_\mathrm{crit} = V_{DD}\, C_\mathrm{node}$. Technology scaling from the 0.35 µm
to the 0.18 µm node has led to a reduction of $Q_\mathrm{crit}$ that can be attributed almost completely to the decrease in supply voltage. Reference [29] showed that the error probability for a given value of $Q_\mathrm{crit}$ is larger if the supply voltage is higher and, consequently, if the node capacitance is lower. This is caused by the fact that the electric field inside a depletion region is higher and the amount of charge collected by funneling is larger if the supply voltage is higher.

The dependence of the SER on the critical charge is roughly exponential. Consequently, lowering the supply voltage for a given circuit results in an exponential increase in SER. However, this does not imply that the SER increases at the same rate from one technology generation to the next [77]. The reduction of the supply voltage and the node capacitances results in a decrease in $Q_\mathrm{crit}$, but this decrease is (partly) compensated by a reduction of the collection efficiency. The smaller feature sizes and changes in the process lead to a lower efficiency of the charge collection process, as the collection volume decreases and the funneling mechanism weakens. The balance between the reduction in critical charge and the decrease in collection efficiency determines if the SER per bit increases or decreases. In fact, the failure rates may improve from one process generation to the next.

Decreasing the operating voltage for a given circuit was found to lead to an exponential increase in the SER of 2.1–2.2 decades per volt [14]. This result holds both for memory and for dynamic logic circuits. Accelerated tests on single- and double-port RAMs in a 0.18 µm technology showed a similar relation [11]. The measured SER at $V_{DD}$ = 1.44 V was approximately 10 times larger than at $V_{DD}$ = 2.16 V. The relation between SER and critical charge for Alpha microprocessors was found to be very similar to the relation for SRAMs and dynamic logic circuits [14].
Because the correlation is so comparable for different circuit types, relatively simple test circuits can be used to determine the relationship between critical charges and the SER of actual products. Reference [45] described an experiment in which the neutron-induced SER of static latches was measured as a function of diffusion collection area and supply voltage. The error rates were found to increase linearly with the diffusion area on the charge collecting node. It was predicted that the neutron-induced SER of a single latch will increase if the supply voltage scales by a factor of 0.7 for each process generation, but will be approximately constant if the supply voltage scales by a factor of 0.8 only, assuming worst-case scaling for the diffusion area, effective gate length, and oxide thickness. Empirical models predicted that for general CMOS circuits the neutron-induced SER per bit decreases at least linearly with decreasing feature size [30]. This study applied the supply voltage predicted by the SIA road map for each technology node. Taking the increasing number of bits per die into account, a linear decrease in SER per bit implies a linear increase in SER per chip.

The characteristics of SER scaling depend on process generation and on technological properties. However, a general scaling trend as a function of supply voltage can be distinguished for the three contributions to SER, namely alpha particles, cosmic neutrons, and boron fission [5, 6, 27, 41, 80, 93, 94]. Figure 30 shows the scaling trends for the three components, using normalized units. All three components have an exponential dependency on the supply voltage. However, the decrease in SER with increasing voltage is much stronger for alpha particles than for cosmic neutrons. The voltage dependency of the SER induced by boron fission is between the dependencies for alpha-induced and neutron-induced SER. The difference in voltage dependency for alpha particles, high-energy neutrons, and thermal neutrons is a direct consequence of the difference in charge generated by the ionizing particles, as was discussed in Sec. 3. If the majority of the ions produce an amount of charge that is much larger than the critical charge, as is the case for cosmic neutrons, then the corresponding SER is relatively insensitive to changes in the critical charge and, therefore, also insensitive to changes in the supply voltage. If the generated charge is of the same order of magnitude as the critical charge, however, the dependency is much stronger, as is the case for alpha particles.
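As a rough illustration of the exponential voltage dependence discussed above, the following sketch assumes that log10(SER) varies linearly with supply voltage, so that a single slope in decades per volt describes the trend. The function names are illustrative and do not come from the cited references; the numerical inputs are the values quoted in the text from [11] and [14].

```python
import math


def ser_ratio(v_low, v_high, decades_per_volt):
    """Factor by which the SER at v_low exceeds the SER at v_high.

    Assumption: log10(SER) decreases linearly with supply voltage.
    """
    return 10.0 ** (decades_per_volt * (v_high - v_low))


def implied_decades_per_volt(v_low, v_high, ratio):
    """Slope (decades per volt) implied by an SER ratio between two voltages."""
    return math.log10(ratio) / (v_high - v_low)


if __name__ == "__main__":
    # Slope implied by the roughly 10x increase between 2.16 V and 1.44 V [11].
    print(implied_decades_per_volt(1.44, 2.16, 10.0))
    # Ratios that a slope of 2.1 to 2.2 decades per volt [14] would give
    # over the same voltage interval.
    print(ser_ratio(1.44, 2.16, 2.1), ser_ratio(1.44, 2.16, 2.2))
```

Under this simple model, the two measured points of [11] correspond to a slope of about 1.4 decades per volt, so the "similar relation" should be read as a qualitative agreement rather than an identical slope.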
Figure 30: Scaling of the three contributions to SER as a function of voltage [5, 6].

At extremely low bias regimes, a saturation (roll-off) effect was observed for the scaling of the alpha-induced SER in SRAMs manufactured in the 0.25 µm technology node and below [27]. This roll-off effect starts at higher voltages if the process generation is more advanced. The percentage of alpha hits that is capable of inducing an SEU increases exponentially in each successive technology, due to the reduction in critical charge. However, this percentage saturates for critical charges below 10 fC, because then almost all alphas cause an upset. In that case, the dependence on voltage is approximately the same as for cosmic neutrons.

The upset mechanism for various strike locations and the dependence on gate length scaling was studied in [19] by device simulation. As technology scales, the drain of an OFF transistor outside a well structure is becoming increasingly sensitive to upsets, compared to inside-the-well structures. This change is probably caused by the growing importance of the bipolar charge-collection mechanism and is more likely in p-substrate than in n-substrate technologies.
8.2 Comparison of SRAM, DRAM, and 1TRAM

Historically, the SER of DRAMs used to be poor because the stored charge was relatively small and the charge collection efficiency was relatively large [53]. Six-transistor (6T) SRAM devices traditionally had superior SER immunity because of the more stable cells and the high signal levels. However, technology scaling has dramatically changed the relative SER sensitivity of different memory types [6, 53]. DRAM devices generally have improved SER per bit with every new generation, because the charge-collection efficiency decreases faster than the critical charge. The SER of DRAM as a function of product generation is shown in Fig. 31. The decreased sensitivity per bit matches the increase in density, such that the SER of a system including DRAM devices is approximately constant, as is shown in Fig. 32. Nowadays, the SER of DRAM devices is typically of the order of a thousand FITs per megabit (Mbit) when operated at full speed.

The situation for SRAM systems is completely different. The SER per bit of SRAM devices tends to worsen by a factor of 5 to 10 for each new process generation because the critical charge decreases faster than the charge-collection efficiency. This results in an increasing SER per Mbit for every product generation, as is shown in Fig. 33, and an even stronger increase in system SER, see Fig. 34. Specific measures, such as the elimination of BPSG from the process, or the application of low-alpha package materials, are required to keep the SER at an acceptable level. The SER of 6T SRAM operating at full speed is rapidly exceeding the desirable threshold of 1,000 FITs per Mbit. The technology scaling of logic SER is approximately similar to the trend for 6T SRAM [6]. Therefore, the SER of systems containing both random logic and SRAM is a growing point of concern. However, the system-level SER generally does not follow monotonic trends, because many factors are involved, including design, process, and packaging measures [77].

Recently, 1TRAM (also known as 1T-SRAM) has been introduced as embedded memory in SoCs [53]. The 1TRAM cell consists of a capacitor of less than 10 fF and an access transistor and is essentially a planar DRAM cell with the capacitor implemented using a MOS structure. The most important advantages of the cell are its relatively small area (compared with an SRAM cell) and its compatibility with a logic manufacturing process (in contrast with embedded DRAM). In terms of SER, the 1TRAM cell has several advantages over the SRAM cell [53]. First, the larger node capacitance of the 1TRAM cell results in a larger critical charge. Also, the SER vulnerability during sensing operation is smaller, because of short bit lines, large internal signal levels, and fast sensing speed. Furthermore, some 1TRAM implementations use only p-channel devices, for example, the 1TRAM of MoSys [53]. This makes the collection efficiency in the cell relatively low, since holes have a lower mobility than electrons.

The dependence of the SER of 1TRAM devices on the critical charge is roughly linear [81]. This is in contrast with conventional memories, whose SERs exhibit a strong exponential dependency. A possible cause is the roll-off effect, discussed above, related to the small critical charge of the 1TRAM cell.
Figure 31: Scaling of the SER per Mbit for DRAM [6].
Figure 32: Scaling of systems including DRAM [6].
Figure 33: Scaling of the SER per Mbit for SRAM [6]. The two different SER curves correspond to the application of different packaging materials.
Figure 34: Scaling of systems including SRAM [6]. The two different SER curves correspond to the application of different packaging materials.
Figure 35: SER as a function of technology generation for 6T SRAM, pseudo-6T SRAM, and 1TRAM devices [53]. The SER of 1TRAM may be too optimistic, see [81].

Although for older technologies the SER per Mbit for 1TRAM is worse than for SRAM, the alpha-induced SER is better for the 0.13 µm technology node and beyond [81]. 1TRAM is compared with 6T SRAM and pseudo-6T SRAM in Fig. 35. Four-transistor SRAM and pseudo-6T SRAM (6T SRAM with a poly-PMOS load) perform even worse in terms of SER sensitivity than 6T SRAM, with unacceptably high SER at the 0.18 µm node and below. The SER for 1TRAM devices depicted in Fig. 35 is probably too optimistic. The SER of 1TRAM increases more slowly with technology generation because the cell capacitance is kept approximately constant. The empirical model of [90] was used in [81] to predict similar scaling characteristics for the neutron-induced SER of 1TRAM. Furthermore, the dependency of the SER on supply voltage, temperature, and process variations was found to be smaller for 1TRAM than for SRAM.
8.3 Multiple-bit soft errors

In principle, an ionizing particle can induce more than one bit failure [98]. This is in particular relevant for soft errors generated by cosmic neutrons, because these generate much more charge than alpha particles, see Sec. 3. Furthermore, a neutron-nucleus reaction can produce several ionizing particles and each of them may corrupt a bit, as is shown in Fig. 36. It was found experimentally [25] that a neutron with an energy of 300 MeV or more is needed to cause a multi-bit soft error. This threshold is significantly higher than the ≈ 5 MeV of neutron energy that is needed to generate a single-bit soft error.

Three mechanisms contribute to multiple-bit upsets (MBUs) [98]:
I. Two or more particles (alphas or neutrons) interact with the device at the same time, where each particle triggers an SEU independently. Under nominal conditions, this is very unlikely. Therefore, this mechanism can be neglected.
II. An incident nucleon creates two or more ionizing particles and each of them induces an SEU separately.
III. One ionizing particle passes through several cells and deposits sufficient charge to upset more than one bit.

[Figure 36 labels: Neutron, Alpha, Si chip; charge deposition of ~ 15 fC/µm and ~ 150 fC/µm]
Figure 36: The probability of a multiple-bit soft error for an incident neutron compared with an alpha particle hit [76]. The dashed checkers denote junctions corresponding to a corrupted bit.

For current technologies, the type II and type III mechanisms occur with roughly the same probability. Type II MBUs occur for technologies that are sensitive to light particles, because a nuclear reaction generally produces one heavy particle (recoil) and several lighter particles, for example, alphas. Since future processes will be increasingly sensitive to light particles, it is likely that the relative probability of type II MBUs will increase.

For current DRAM technologies, MBUs due to alpha particles are assumed to be negligible [76]. It was observed that the probability of multiple-bit errors of multiplicity n ≥ 2 in DRAMs was approximately n orders of magnitude less than the probability of a single-bit error. The percentage of MBUs is usually smaller for SRAMs than for DRAMs manufactured in the same technology, because of the lower density of SRAM cells [25].

Reference [98] computed SEU and MBU cross sections from simulations for a commercial 0.25 µm SRAM-CMOS technology, with a critical charge of ≈ 13 fC (E_crit ≈ 0.3 MeV). The SEU and MBU cross sections are measures of the occurrence probability of one single-bit or multiple-bit upset, respectively. The MBU cross section was observed to be more than two orders of magnitude smaller than the SEU cross section. For an SRAM manufactured in a 0.18 µm process, it was reported that 99.9% of the failures corresponded to a single bit [67]. However, the relative contribution of MBUs increases with decreasing critical charge.
The relative scaling of SEU and MBU cross sections can be studied using a simple model based on the approximation Q_crit = C_node VDD [98]. The node capacitance C_node of an SRAM cell can be estimated as the summed gate capacitances of the NMOS and PMOS transistors associated with the node. However, better models are needed to understand the impact of charge sharing on scaling [5].
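The approximation above can be sketched in a few lines. The example below assumes the textbook parallel-plate estimate for a gate capacitance (oxide permittivity times gate area divided by oxide thickness), which is not spelled out in the text; the function names and the transistor dimensions are hypothetical and serve only to show the arithmetic.

```python
EPS0_F_PER_M = 8.854e-12   # vacuum permittivity
EPS_R_SIO2 = 3.9           # relative permittivity of SiO2 (assumed gate oxide)


def gate_capacitance_ff(width_um, length_um, tox_nm):
    """Parallel-plate gate capacitance in fF (assumption: C = eps * W * L / t_ox)."""
    area_m2 = (width_um * 1e-6) * (length_um * 1e-6)
    cap_f = EPS_R_SIO2 * EPS0_F_PER_M * area_m2 / (tox_nm * 1e-9)
    return cap_f * 1e15


def node_capacitance_ff(gates):
    """Sum the gate capacitances of the transistors attached to the node.

    `gates` is a list of (width_um, length_um, tox_nm) tuples.
    """
    return sum(gate_capacitance_ff(w, l, t) for (w, l, t) in gates)


def critical_charge_fc(c_node_ff, vdd_v):
    """Q_crit = C_node * VDD; fF times V gives fC."""
    return c_node_ff * vdd_v


if __name__ == "__main__":
    # Hypothetical NMOS and PMOS gates on one storage node (not from the report).
    gates = [(0.5, 0.25, 5.0), (0.8, 0.25, 5.0)]
    c_node = node_capacitance_ff(gates)
    print(c_node, critical_charge_fc(c_node, 2.5))
```

Such an estimate ignores diffusion and interconnect capacitance and the dynamic response of the cell, so it is at best a first-order indication.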
If both the supply voltage and the gate oxide thickness are reduced, such that the electric field remains constant, the scaling of the critical charge will follow an s^2 law, where s is the scaling factor [98]. In this case, the SEU cross section scales approximately linearly with s. If the oxide thickness is maintained constant while the supply voltage is reduced, the critical charge will scale with s^3. The SEU cross section is then almost constant, because the decrease in Q_crit is balanced by the reduction in sensitive volume. The scaling of the MBU cross section also involves the distance between the cells, besides the critical charge and the sensitive volume [98]. In the cases that Q_crit follows an s^2 or an s^3 law, the MBU cross section will roughly scale with 1/s or 1/s^2, respectively. Simulations suggest that while the MBU rate is less than one percent of the SEU rate for a 0.25 µm process, the MBU contribution will be several percent for 0.13 µm technologies. The ratio between MBU and SEU cross sections will scale with a factor between 1/s^2 and 1/s^4. Therefore, it is very likely that MBUs will become increasingly important for successive technologies and will no longer be negligible for processes below 0.13 µm.

One possible technique to suppress soft errors is the usage of error-correction codes (ECC). However, if MBUs are non-negligible, conventional ECC may not be sufficient for certain configuration patterns. Multiple-bit SER decreases exponentially with the distance between the storage nodes [76]. Therefore, double-bit SER can be significantly reduced by reading every other bit simultaneously and checking for parity, instead of reading all bits in a row simultaneously.
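The benefit of reading every other bit and checking parity per group can be illustrated with a small model. The sketch below is an assumption-laden toy: a row is a list of bits, one even-parity bit is kept per interleaved group, and an upset flips the listed physical positions. The function names are hypothetical.

```python
def parity(bits):
    """Even-parity value of a sequence of 0/1 bits."""
    return sum(bits) % 2


def group_parities(row, ways):
    """Parity per interleaved group; group g holds positions g, g + ways, ..."""
    return [parity(row[g::ways]) for g in range(ways)]


def upset_detected(row, flipped_positions, ways):
    """Return True if flipping the given physical bits changes any group parity."""
    stored = group_parities(row, ways)
    corrupted = list(row)
    for pos in flipped_positions:
        corrupted[pos] ^= 1
    return group_parities(corrupted, ways) != stored


if __name__ == "__main__":
    row = [1, 0, 1, 1, 0, 0, 1, 0]
    # Two physically adjacent bits are upset by one particle.
    print(upset_detected(row, [3, 4], ways=1))  # whole row read at once
    print(upset_detected(row, [3, 4], ways=2))  # every other bit read together
```

With a single parity group, the two adjacent flips cancel and go unnoticed; with two interleaved groups, each group sees only one flip and the error is flagged.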
9 Improvement of SER sensitivity

Methods to improve SER are commonly known as mitigation techniques. Improvement of the SER of semiconductor devices can be achieved at different levels. These can be categorized as follows:
1. Application of materials with low alpha-emission rates in the package, the chip, and the IC process; usage of shielding layers; modifications in the package design.
2. Process modifications, or the usage of alternative technologies such as silicon-on-insulator (SOI) or silicon-on-sapphire (SOS).
3. SER hardening of the circuit components (memory cells, latches, gates).
4. System solutions, in particular the application of redundancy, parity, and error detection and correction (EDAC).
9.1 Improvements in materials, shielding, and packaging

The alpha-emission rates of the materials in the package, the wafer, and the manufacturing process can have a large effect on SER. In the late 1970s and early 1980s, alpha particles, emitted in particular from the package, were the main source of soft errors. During the last few decades, however, failure rates due to alpha particles have been significantly improved by using purified materials. For packages, mold compounds with low concentrations of alpha-emitting particles are available. The effect of using low-alpha packaging materials can be quite large, see for example Table 4 in Sec. 3 and Fig. 33 in Sec. 8. The purity of the materials used in the wafer is also important. Nowadays, the alpha-emission rate of a fully processed wafer is of the same order of magnitude as the emission rate of the package. Finally, the materials used in the manufacturing process should also have low concentrations of radioactive contaminants.

The introduction of flip-chip technology significantly increased the alpha flux due to the lead present in the solder bumps, as was discussed in Sec. 3. This effect can be reduced by using lead with low concentrations of radioactive impurities.

A different approach is to use a shielding layer of polyimide to prevent alpha particles from the package from reaching the chip. However, the shielding layer should be sufficiently thick. If a thin layer is applied, the alpha particles reach the devices with decreased energy. These particles have a relatively large charge generation rate, see Fig. 5 in Sec. 3. A polyimide layer that is too thin therefore deteriorates the SER, instead of improving it, as is illustrated in Fig. 37. A shielding layer of ≈ 20 µm blocks all alphas [6]. Remarkably, a shielding layer of 30 µm appeared not to be effective in accelerated tests on SRAMs using an Am-241 source [67]. The same shielding effect as from the polyimide layer is obtained if a lid coat is placed between the package and the die [77]. Shielding layers are effective only for alpha particles; chips cannot be shielded from cosmic neutrons.
Figure 37: Shielding from alphas by a polyimide layer (a); collected charge as a function of polyimide layer thickness (b) [6].

Solder bumps can be placed such that they are not in the direct neighborhood of the sensitive areas of the chip. Because of the short range of alphas in silicon, the impact of alphas originating from solder bumps can be drastically reduced by using keep-away zones [6], see Fig. 38.
Figure 38: Keep-away zones for solder bumps [6].

In summary, the following solutions have been discussed in the current section [6]:
• low-alpha package materials,
• low-alpha wafer materials,
• purified processing materials,
• low-alpha lead in solder bumps,
• shielding layers,
• keep-away zones for solder bumps.
9.2 Process modifications

Several process solutions have been proposed to reduce SER sensitivity of circuits, including the usage of well structures, buried layers, deep trench isolation, and implants under the most sensitive nodes. Higher impurity concentrations in the substrate result in lower SER, because the funneling mechanism stops when the carrier density near the junction is approximately equal to the substrate doping [32]. Also, highly doped substrates are less sensitive to variations in the bias voltage.

The difference between inside-the-well and outside-the-well junctions was discussed in [19]. For example, an n-type diffusion is more sensitive to upsets if it is inside a p-type substrate than if it is inside a p-type well. This is because the well-substrate junction collects a substantial amount of the induced charge. This difference was quantified experimentally in [29]. As a rule of thumb, the maximum depth of charge collection is approximately half the well depth [42]. The difference in sensitivity for single-well and triple-well devices has been demonstrated experimentally for DRAM SERs [25].

In [3] and [86], an ion-implanted shallow well structure was suggested to reduce the soft error sensitivity in SRAMs. A shallow well structure both reduced the amount of deposited charge in a well layer and decreased the charge transport to the storage nodes. For a 0.4 µm twin well process, it was found that decreasing the well depth from 2.0 to 1.3 µm reduced the charge collected at the storage nodes by 30–50% and decreased the SER by 70–80%. On the other hand, TSMC reported a 50–100% reduction in the SER of 6T SRAM if a deep n-well is used [67].

Wafer thinning has also been proposed as a way to reduce SEU sensitivity [20]. It was shown that the overall SEU threshold LET can be significantly increased if the substrate thickness is reduced to 0.5 µm. In practice, however, several criteria would have to be met to make the thinning of fully processed wafers possible.

The contribution of boron fission to the SER can be reduced to almost zero by eliminating BPSG from the process flow. If the use of BPSG is necessary, enriched 11B could be used in the BPSG layers [6].

Silicon-on-insulator (SOI) technologies are relatively insensitive to soft errors. SOI is usually applied for low-power and high-performance applications [50]. The transistors that are used in SOI technologies can be categorized into three major types [63]:
• partially depleted (PD),
• fully depleted (FD),
• bulk-like.
Partially depleted, fully depleted, and bulk-like SOI transistors are also known as thin film, ultra-thin film, and thick film transistors, respectively. The three structure types differ mainly by the thickness of the silicon film, which determines the mode of operation. Fully depleted SOI is particularly insensitive to radiation-induced soft errors, because an active device does not include a parasitic bipolar transistor. Silicon-on-sapphire (SOS) technologies also perform well in terms of SER reliability. SOS can be regarded as a simplified version of a thin film CMOS SOI process, without a buried silicon substrate. The SER in SOI (or SOS) is lower than in a corresponding bulk CMOS process because the volume in which charges are generated is smaller. A similar effect can be obtained by including a buried layer in bulk CMOS, see Fig. 39.
Figure 39: Illustration of the effect on SER sensitivity from the inclusion of a buried p+ layer (a) and using an SOI process (b) [6]. (Usually, SOI junction diffusions touch the buried oxide layer.)

It was discussed in Sec. 4 that the bipolar effect in SOI is a point of concern. Body ties are used in most of the hardened technologies to suppress the parasitic bipolar transistor [63, 86]. Basically, the emitter and the base are tied together to limit the increase in VBE under minority carrier injection. It is also possible to modify the operating mode of the SOI transistor by thinning the silicon film, such that the conduction in the body is controlled simultaneously by both the gate and the substrate voltages. The floating body effects are then significantly reduced.

Applying SOI technology instead of the corresponding bulk process improves the SER by a factor in the range of 2 to 8 [28, 91]. However, the cost of materials, especially of the wafers, is higher for SOI.

In a recent paper [21], using 3D device simulations and experimental techniques, the SEU-sensitive volumes of bulk and SOI CMOS SRAMs were compared. It was found that the measured sensitive volumes in SOI SRAMs were significantly larger than predicted by device simulation. Not only the gate regions in SOI SRAMs appeared to be sensitive to ion strikes, as predicted by earlier studies, but also the reverse-biased drain regions. It was suggested that charge collection from either within the buried oxide or the substrate (or a combination of both) occurs. In particular for thick SOI technologies, a strong dependence between the angle of incidence and the collected charge has been predicted [63].
Thus, the solutions at the process level are, in summary:
• well structures,
• buried layers,
• deep trench isolation,
• implants under the most sensitive nodes,
• wafer thinning,
• elimination of BPSG,
• application of 11B-enriched BPSG,
• manufacture in an SOI or SOS technology.
9.3 Component hardening

There are two basic approaches to improve SER sensitivity at the circuit level. On the one hand, the components applied in the design can be modified such that they are less susceptible to soft errors. The main goal of this approach, often named design hardening, is to manufacture SER-reliable circuits using standard CMOS processing without additional masks [96]. On the other hand, one can accept that soft errors occur at a certain rate and include extra circuitry to detect and correct them. Error detection and correction is discussed in the next subsection. In either of the two approaches, trade-offs between SER sensitivity and other design metrics (speed, power dissipation, area) determine if a design hardening method is suitable, as every possible solution has its penalty.

Solutions to reduce the SER sensitivity of components can be categorized as techniques to increase the capacitance of the storage node, to reduce the charge collection efficiency, or to compensate for charge loss [100]. The applied design style can have an important effect on SER. For example, it was demonstrated that level-sensitive latches using transmission gates are more sensitive than edge-triggered static latches, because the former use floating nodes to store information [77].

A straightforward method to improve SER sensitivity is to enlarge the critical charges by increasing the capacitance of the storage nodes [67]. In fact, if all critical charges are sufficiently large (> 35 fC for a 0.35 µm technology [41]), alpha particles are not able to upset a circuit and neutrons are the only source of soft errors. An explicit feedback capacitor can be added to the node capacitances [45]. An SER-hardened SRAM cell reported in [68] used stacked cross-coupled interconnect to increase the capacitor area. Enlargement of the node capacitances is not only applied in memory design, but was also shown to be an efficient way to improve the SER sensitivity of sequential or domino nodes in high-performance circuits [46]. The main drawback of increasing the node capacitances is that generally the cell area is increased.
In terms of SER improvement, the capacitance from gates and interconnect must be distinguished from diffusion capacitances [100]. The addition of extra gate or interconnect capacitance increases SER robustness because it has little effect on the charge collection process. On the other hand, the addition of diffusion capacitances increases the charge collection efficiency, because the total collection area of the node is augmented. This effect could cancel out the impact of the higher node capacitance, resulting in an increase in SER, instead of a reduction. Reference [45] demonstrated experimentally the increase in SER due to enlargement of the diffusion area. It was shown for a latch that reducing the diffusion area, by properly sizing the transistor stack, resulted in an improvement of the SER, at the expense of only small setup and power penalties.

Increasing the widths of the pMOS transistors in an SRAM cell has a positive impact on the SER, at the cost of a larger cell area [67]. First, it results in a stronger pull-up gain of the cell, which implies that the stored charge is stabilized more efficiently. Furthermore, the node capacitance is increased. Increasing the widths of the nMOS transistors, however, has a negative impact on SER. This is because a wider transistor has a larger sensitive drain area and, consequently, a higher collection volume. This effect outweighs the positive effect of the increased node capacitance [42].

The SER sensitivity of SRAM cells and latches can be improved by adding feedback resistors between the output of one inverter and the input of the other [2, 79], see Fig. 40. The transient pulse induced by an ionizing particle is then filtered by the two resistors, which slow down the circuit such that it does not have sufficient time to flip state. However, the inclusion of feedback resistors in a memory element has the drawback that the write speed is lowered [96].

[Figure 40 labels: VDD, VSS, bitline True, bitline False, wordline, P1, P2, N1, N2, N1a, N2a]
Figure 40: SER hardening of an SRAM cell by the inclusion of two feedback resistors [2].

A different solution is the addition of extra active devices to the circuit that are connected to the critical nodes. In this manner, memory elements are provided with an appropriate feedback mechanism that restores the data when corrupted by an ion strike [96]. Keeper devices can compensate for some of the charge loss during a strike by actively driving the node high or low. The strength of the introduced feedback loop is of importance, because it affects not only the SER susceptibility but also other design metrics, such as speed and power dissipation. Reference [10] proposed a method for the SER-hardened design of digital CMOS circuits using additional transistors placed in isolated wells that are actively biased.

Another circuit parameter influencing SER is the voltage switching point of the logic gates connected to the sensitive storage nodes [100]. Ideally, the switching point is halfway between VDD and ground, but in real designs it is often adjusted to improve speed. A large skew in switching point makes a storage node more susceptible to an upset.

The biasing polarity of the diffusion at the storage nodes can also be used to effectively reduce SER [100]. The n+ and p+ diffusion areas are mainly sensitive to upsets if they are biased at logic high (VDD) or logic low (VSS) levels, respectively. Therefore, a storage node that is dominated by n+ (p+) diffusion is most susceptible to a “1 to 0” (“0 to 1”) transition. Because of this, the SER can be improved by adjusting the circuit switching points, such that “1 to 0” and “0 to 1” transition probabilities are balanced.

For memory cells that are particularly sensitive during write operations, the SER can be improved by shortening the bit lines [53]. Design solutions that result in a decreased bit-line floating time can also be helpful. In practice, however, the SRAM SER is approximately independent of cycle time, as upsets can be predominantly attributed to the cell. Failures in the periphery of the memory only show up for very short cycle times [67].

The component hardening methods can be summarized as:
• increase of critical node capacitance,
• increase of pull-up gain,
• feedback resistors,
• active feedback devices,
• switching point optimization,
• bit line shortening.
9.4 System solutions

Methodologies have been proposed that aim to provide SER robustness by design [22]. These approaches include the application of parity check, error detection and correction (EDAC) schemes, and time or space redundancy. The goal is to have cost-effective designs with high performance that are not prone to soft errors. Reference [22] calls such design strategies corrective intelligence.

A traditional technique to achieve increased system reliability is redundancy. Figure 41 shows a functional level example of triple modular redundancy (TMR). Obviously, such solutions are unacceptably expensive for commercial applications.
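The voting step of TMR can be sketched as a bitwise two-out-of-three majority function. This is a generic illustration of the principle and not the specific implementation of [22]; the function name is hypothetical, and the three module outputs are modeled as integer words.

```python
def majority_vote(a, b, c):
    """Bitwise 2-out-of-3 majority of three redundant module outputs."""
    return (a & b) | (a & c) | (b & c)


if __name__ == "__main__":
    correct = 0b10110010
    upset = correct ^ 0b00001000      # one module output with a single flipped bit
    print(bin(majority_vote(correct, upset, correct)))
```

The voter masks any error that is confined to one of the three copies; if two copies are corrupted in the same bit position, the wrong value wins the vote.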
Figure 41: Principle of triple modular redundancy (TMR) [22].

Traditionally, the use of redundancy in memory design implies the addition of redundant word or bit lines. These spare lines are made available through fusible devices if testing shows that other lines are not functioning correctly. This type of redundancy works only for hard errors and not for soft errors, because these occur occasionally and randomly in time and space. The general approach for applying redundancy to deal with both soft and hard errors in memories is to use redundancy in the coding. Longer codes that allow error detection and correction are then applied. A traditional architecture of an error correction code (ECC) is depicted in Fig. 42. On-chip ECC protection of random core logic is much more difficult and costly than it is for memories [77].
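To make the idea of redundancy in the coding concrete, the sketch below implements the textbook Hamming(7,4) single-error-correcting code. It is chosen only because it is small; the architecture of Fig. 42 and the codes used in actual memories (typically over wider words) are not reproduced here, and the function names are illustrative.

```python
def hamming74_encode(data):
    """Encode 4 data bits into a 7-bit codeword [p1, p2, d1, p4, d2, d3, d4]."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4      # checks codeword positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4      # checks codeword positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4      # checks codeword positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming74_decode(codeword):
    """Correct at most one flipped bit; return (data bits, syndrome).

    A non-zero syndrome is the 1-based position that was flipped back.
    """
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s4
    if syndrome:
        c[syndrome - 1] ^= 1
    return [c[2], c[4], c[5], c[6]], syndrome


if __name__ == "__main__":
    word = [1, 0, 1, 1]
    stored = hamming74_encode(word)
    stored[5] ^= 1                         # single-bit soft error
    print(hamming74_decode(stored))        # data recovered, syndrome gives position
```

A code of this kind corrects any single-bit error in a word, but a double-bit error in the same word is miscorrected, which is why multiple-bit upsets need additional measures.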
Figure 42: Traditional ECC architecture applied in memories [22].

The challenge is to include an EDAC method in the design with the smallest possible penalties for speed, area, and cost efficiency. For a system-on-chip (SoC) using memories with a failure rate of 10,000 FIT/Mbit, the system error rate would be about one error per week. This low error rate does not justify a significant deterioration of the performance and the cost efficiency [22].

Parity checks and simple error correction codes (ECC) cannot be used to detect and correct a multiple-bit error in a word. However, interleaving the data lines of the arrays, such that physically adjacent bits are not part of the same word, reduces the probability of undetectable and/or uncorrectable multi-bit errors [6, 25, 59, 100].
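Failure rates such as those quoted above can be converted into a mean time between errors using the standard definition of the FIT (one failure per 10^9 device-hours). The helper names below are hypothetical, and the memory size in the usage example is an arbitrary assumption, since the text does not state the amount of memory behind the one-error-per-week figure.

```python
HOURS_PER_FIT_BASE = 1e9   # 1 FIT = 1 failure per 1e9 device-hours


def system_fit(fit_per_mbit, memory_mbit):
    """Total failure rate in FIT, assuming all bits contribute equally."""
    return fit_per_mbit * memory_mbit


def mean_hours_between_errors(fit):
    """Mean time between errors in hours for a given FIT rate."""
    return HOURS_PER_FIT_BASE / fit


def memory_for_interval(fit_per_mbit, hours):
    """Memory size (Mbit) at which the mean time between errors equals `hours`."""
    return HOURS_PER_FIT_BASE / (fit_per_mbit * hours)


if __name__ == "__main__":
    total = system_fit(10_000, 64)                  # 64 Mbit is an assumed size
    print(mean_hours_between_errors(total) / 24.0)  # mean days between errors
    print(memory_for_interval(10_000, 7 * 24))      # Mbit giving one error per week
```

The last function shows how the memory size implied by a quoted error interval could be back-calculated under the same equal-contribution assumption.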
An SER-hardened latch that applies temporal sampling was proposed in [55]. The circuit is capable of eliminating both upsets of the data stored in the latches and induced transients in the combinatorial logic or in the clock or control lines. Multiple parallel sampling flip-flops are applied to create spatial parallelism, which prevents SEUs in the static latches. Multiple sampling of the data over time results in temporal parallelism, which provides immunity to induced transients in combinatorial logic, clock, or control lines. Combining a temporal sampling stage, controlled by three different clocks, with a majority voting stage, using a fourth clock, results in temporal redundancy, see Fig. 43. The penalties of the method are increased area, because latches are replaced by much larger circuits, and loss of speed, due to the required delays between the clock edges of the three sampling flip-flops.

[Figure 43 labels: IN, OUT, MAJ, Master Clock, CLKA, CLKB, CLKC, CLKD; stages: Temporal Sampling, Synchronous Voting; panels (a) and (b).]
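The principle of temporal sampling followed by majority voting can be sketched behaviourally. The model below is an illustration under simplifying assumptions, not the circuit of [55]: the signal is an ideal logic value that is inverted during a transient, the three sampling instants stand in for the three sampling clocks, and all timing values (in picoseconds) and function names are hypothetical.

```python
def majority(a, b, c):
    """Two-out-of-three majority vote on single bits."""
    return (a & b) | (a & c) | (b & c)


def signal_with_transient(t, value, t_start, width):
    """Logic value at time t; inverted while t_start <= t < t_start + width."""
    return value ^ 1 if t_start <= t < t_start + width else value


def temporal_sample(value, t_start, width, t0, delta):
    """Sample at t0, t0 + delta, t0 + 2*delta and vote on the three samples."""
    samples = [
        signal_with_transient(t0 + k * delta, value, t_start, width)
        for k in range(3)
    ]
    return majority(*samples), samples


if __name__ == "__main__":
    # Transient shorter than the sampling spacing: only one sample is wrong.
    print(temporal_sample(value=1, t_start=950, width=300, t0=1000, delta=500))
    # Transient longer than the spacing: two samples are wrong, the vote fails.
    print(temporal_sample(value=1, t_start=950, width=600, t0=1000, delta=500))
```

The second call shows why the spacing between the sampling clock edges must exceed the widest expected transient, which is the origin of the speed penalty mentioned above.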
Figure 43: Generic temporal sampling latch (a) and corresponding clock scheme (b) [55].

Time redundancy techniques take advantage of the transient characteristics of the pulse to compare signal values at different time instances [22]. This approach uses an extra latch to store the output of a combinatorial circuit, with a delay δ compared to the first latch. The delay value is set to δ = D_trans + D_setup, where D_trans is the maximum width of the transient pulse and D_setup is the setup time of the latch. Figure 44 shows two possible implementations and the corresponding signal diagram.

Figure 44: Time redundancy approach for detecting single-event transient pulses, with (a) a single-clock and (b) a two-clock implementation, both giving the same signal diagram (c) [22].

Reference [1] reported an approach to improve the radiation tolerance of SRAM-based field-programmable gate arrays (FPGAs). The method uses read-back error detection and reconfiguration for error correction.

An EDAC method named transparent error correction (TEC) was introduced by MoSys to improve the reliability of their 1TRAM (also called 1T-SRAM) memories in 0.13 µm technology and below [61]. The approach includes ECC bits per 32-bit word at Hamming level 1 and is claimed to have no area penalty. Simulation results at an operating frequency of 200 MHz showed a reduction in SER from about 1,000 to about 10 FITs/Mbit.

In summary, system solutions include:
• redundancy in the coding,
• temporal sampling,
• time redundancy,
• read-back error detection and reconfiguration (in FPGAs),
• transparent error correction (in 1TRAM).
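The time redundancy scheme with delay δ = D_trans + D_setup can be sketched as follows. This is a behavioural illustration only: the combinatorial output is modelled as an ideal logic value inverted during the transient, the setup window itself is not modelled, and the function names and picosecond values are hypothetical.

```python
def capture_delay(d_trans_ps, d_setup_ps):
    """Delay between the two latches: delta = D_trans + D_setup."""
    return d_trans_ps + d_setup_ps


def detect_transient(correct_value, pulse_start_ps, pulse_width_ps,
                     clock_edge_ps, delta_ps):
    """Capture the output at the clock edge and again delta later.

    Returns (error_flag, first_capture, second_capture). The error flag is
    raised when the two captured values differ.
    """
    def output_at(t_ps):
        hit = pulse_start_ps <= t_ps < pulse_start_ps + pulse_width_ps
        return correct_value ^ 1 if hit else correct_value

    first = output_at(clock_edge_ps)
    second = output_at(clock_edge_ps + delta_ps)
    return first != second, first, second


if __name__ == "__main__":
    delta = capture_delay(d_trans_ps=200, d_setup_ps=50)
    # A 200 ps transient overlapping the clock edge corrupts the first latch only.
    print(detect_transient(1, pulse_start_ps=900, pulse_width_ps=200,
                           clock_edge_ps=1000, delta_ps=delta))
```

Because δ exceeds the maximum pulse width, a single transient cannot be present at both capture instants, so a corrupted capture always shows up as a mismatch between the two latches.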
10 Discussion

Companies that have activities in stand-alone SRAM and/or DRAM chips have considered SER sensitivity as an essential issue for many product generations. Figure 45 illustrates the methodology that is followed by TI to characterize SER via measurement and modeling [5, 6]. Figure 46 shows the time-scale on which the different actions on SER robustness are performed during the design process.

Reference [6] formulated challenges in SER research in four separate fields:

1. Environment. To make reliable estimations of SER sensitivity, one needs information about the flux of ionizing particles, either alpha particles or cosmic neutrons. The cosmic ray flux is not sufficiently understood, especially at lower energies. To estimate the alpha flux in nominal situations more accurately, the characterization of the alpha emission by chip and package materials needs to be improved.
2. Modeling. Better models are needed to comprehend the impact of charge sharing as technology scaling continues. Accurate models are needed to allow early intervention at the design stage.
3. Characterization. The correlation between the SER obtained from accelerated testing and the system SER needs to be better understood.
4. System testing. Nowadays, system SER test at reduced voltages suffers from the drawback that the different scaling behavior of the individual sources is not understood well enough. System SER test methods have to be developed that both apply reduced voltages and provide accurate results.

Alpha-accelerated SER testing is the main evaluation method to obtain information on alpha-induced SER. Such tests are relatively simple and do not require any specific test equipment, except for an alpha-emitting radiation source. Experimental results obtained from test chips designed in a given technology can be used to estimate SER for other designs in the same process, because the SER of different structures in the same technology is comparable, as was discussed in Sec. 5.
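As an illustration of how an alpha-accelerated measurement might be translated into a nominal SER, the sketch below scales the measured error rate by the ratio of the source flux to the nominal flux. It assumes that the error rate is proportional to the flux and that the source reproduces the energy and angular distribution of the nominal alpha emission; both assumptions are simplifications, in line with the characterization challenge listed above. All numerical values and the function name are hypothetical.

```python
def accelerated_to_field_fit_per_mbit(n_errors, test_hours, mbits,
                                      source_flux, field_flux):
    """Extrapolate an accelerated test result to nominal conditions.

    source_flux and field_flux must use the same unit (e.g. alphas/cm^2/h).
    Returns the estimated SER in FIT/Mbit (errors per 1e9 hours per Mbit).
    """
    accelerated_rate = n_errors / (test_hours * mbits)  # errors/h/Mbit in test
    acceleration_factor = source_flux / field_flux
    return accelerated_rate / acceleration_factor * 1e9


if __name__ == "__main__":
    # Hypothetical test: 500 errors in 1 hour on a 4 Mbit array.
    fit = accelerated_to_field_fit_per_mbit(
        n_errors=500, test_hours=1.0, mbits=4.0,
        source_flux=1e6, field_flux=1e-3)
    print(f"estimated alpha SER: {fit:.1f} FIT/Mbit")
```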
Neutron-accelerated SER tests are much more complicated than alpha-accelerated tests. First, the use of a nuclear reactor facility is required, which is expensive and may give organizational problems. Furthermore, it will be difficult to perform tests in a different environment, because of the difference in test equipment, software, test vectors, etc. The use of simplified empirical models could be helpful, as was discussed in Sec. 5. Such empirical models need to be calibrated once with the use of experimental data for a given technology. Once the parameter values have been estimated, the model can be used for other designs in the same process. It is also possible to apply a scaled version of an empirical model that was constructed for a previous technology generation. The accuracy of empirical models is limited, but this is acceptable in situations where the alpha-induced SER component dominates over the neutron-induced contribution. This is the case for designs with relatively small critical charges, such as SRAM circuits. However, the neutron-induced component is dominant if the critical charges are large, which is the case for logic circuitry designed in present-day technologies. The relatively poor accuracy of the model may then be unacceptable and neutron-accelerated SER testing may be inevitable to obtain accurate SER estimates.

Figure 45: SER characterization methodology for stand-alone memory chips at TI [5, 6].

Figure 46: Actions on SER robustness performed during the design process of stand-alone memory chips at TI [5, 6].

The critical charges of the storage nodes are the key parameters at the circuit level. Circuit simulations computing the critical charges are needed for the prediction of the SER sensitivity of a given circuit configuration. The approach reported by [26] appears to be the most viable. The method simulates the induced transient pulse with a current source that is added to a circuit netlist, with parasitic capacitances and diodes extracted from the actual layout. The shape of the current pulse needs to be taken into account, especially for circuits that are stabilized by a feedback construction, such as SRAM cells.

As the SER sensitivity increases with ongoing technology scaling, solutions such as the use of purified materials, process improvements, and the application of device hardening techniques will not be sufficient to assure reliable designs. Eventually, the use of error detection and correction methods will be necessary, both for memories and for logic. Reference [11] stated that memory designs below 0.18 µm and logic designs below the 0.13 µm technology node should include error correction. However, the decision to use ECC for a design in a given technology will depend on many factors. The reliability that is required depends on the application of the memory or logic design. Therefore, the application area is an important issue when considering the inclusion of error correction. In general, solutions to improve SER have specific penalties in speed, power dissipation, area, and design complexity. A trade-off has to be found between SER reliability and other design metrics.
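The dependence of the critical charge on the shape of the injected current pulse, noted earlier in this section, can be illustrated with a toy model. The sketch below is not the method of [26]: it replaces the extracted netlist by a single storage node with capacitance C, models the restoring transistor as a fixed conductance towards the supply, assumes a double-exponential current pulse, and declares an upset when the node voltage drops below half the supply voltage. All parameter values and names are hypothetical. Units are fF, V, ps, fC, and mS, so that currents are in fC/ps.

```python
import math


def strike_current(t_ps, charge_fc, tau_rise_ps, tau_fall_ps):
    """Double-exponential pulse whose time integral equals charge_fc."""
    scale = charge_fc / (tau_fall_ps - tau_rise_ps)
    return scale * (math.exp(-t_ps / tau_fall_ps) - math.exp(-t_ps / tau_rise_ps))


def node_upsets(charge_fc, tau_rise_ps, tau_fall_ps,
                c_ff=2.0, g_ms=0.02, vdd=1.2, dt_ps=0.2):
    """Forward-Euler integration of the node voltage during the strike."""
    v = vdd
    t = 0.0
    t_end = 10.0 * tau_fall_ps
    while t < t_end:
        i_strike = strike_current(t, charge_fc, tau_rise_ps, tau_fall_ps)
        i_restore = g_ms * (vdd - v)  # restoring current towards the supply
        v += dt_ps * (i_restore - i_strike) / c_ff
        if v < 0.5 * vdd:  # crude upset criterion
            return True
        t += dt_ps
    return False


def critical_charge(tau_rise_ps, tau_fall_ps, q_lo=0.0, q_hi=50.0, iterations=30):
    """Bisection on the injected charge for the smallest upsetting value (fC)."""
    for _ in range(iterations):
        q_mid = 0.5 * (q_lo + q_hi)
        if node_upsets(q_mid, tau_rise_ps, tau_fall_ps):
            q_hi = q_mid
        else:
            q_lo = q_mid
    return q_hi


if __name__ == "__main__":
    for tau_fall in (50.0, 200.0):
        q_crit = critical_charge(tau_rise_ps=5.0, tau_fall_ps=tau_fall)
        print(f"tau_fall = {tau_fall:5.0f} ps -> Qcrit ~ {q_crit:.2f} fC")
```

In this toy model a slower pulse gives the restoring current more time to replace the collected charge, so the critical charge exceeds the static value C·VDD/2 and grows with the pulse width. A full circuit simulation with the actual feedback loop is needed for quantitative results.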
The impact at the system level of the SER of a subsystem, for example an embedded memory, has to be taken into account, see Sec. 7.2. Detailed studies are needed to decide if a particular measure to reduce SER is necessary. Otherwise, solutions could be included at a high cost that do not have a significant effect at the system level. The main conclusion should be that the increasing SER sensitivity of systems and subsystems cannot be ignored. Designers have to realize that the situation with respect to soft errors will not improve in coming years, unless revolutionary technologies emerge.
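The system-level bookkeeping referred to above can be sketched with the definition of the FIT unit (one failure per 10^9 device hours). The example sums the contributions of memory blocks and converts the total to a mean time between errors. The 16 Mbit capacity is a hypothetical value; the two SER levels are those quoted for 1T-SRAM without and with TEC [61].

```python
def system_fit(blocks):
    """Total FIT of a system from (fit_per_mbit, mbits) pairs, one per block."""
    return sum(fit_per_mbit * mbits for fit_per_mbit, mbits in blocks)


def mean_hours_between_errors(total_fit):
    """Mean time between errors in hours; 1 FIT = 1 error per 1e9 hours."""
    return 1e9 / total_fit


if __name__ == "__main__":
    HOURS_PER_YEAR = 24 * 365
    for label, fit_per_mbit in (("without TEC", 1000.0), ("with TEC", 10.0)):
        total = system_fit([(fit_per_mbit, 16.0)])  # hypothetical 16 Mbit block
        years = mean_hours_between_errors(total) / HOURS_PER_YEAR
        print(f"{label}: {total:.0f} FIT, one error per {years:.1f} years on average")
```

Whether such an error rate is acceptable depends on the application and on the number of systems in the field, which is why the required reliability has to be set per application area.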
References

[1] P. Alfke and R. Padovani. Radiation tolerance of high-density FPGAs. In Proc. MAPLD98 Conf., 1998. [Online] http://www.xilinx.com/appnotes/HiDensityFPGAs.pdf.
[2] G.M. Anelli. Design and characterization of radiation tolerant integrated circuits in deep submicron CMOS technologies for the LHC experiments. Ph.D. thesis, Dec. 2000. [Online] http://rd49.web.cern.ch/RD49.
[3] Y. Arita, M. Takai, I. Ogawa, T. Kishimoto, K. Sonoda, and K. Tsutsumi. Soft error improvement in SRAMs by ion implanted shallow well structure. In Proc. Conf. Ion Implant. Techn., 2000.
[4] A.E. Baranski, L.W. Massengill, D.O. Van Nort, J. Meng, and B.L. Bhuva. Single event faults in combinational logic — modeling vulnerability during VHDL design. In Proc. SRC Top. Res. Conf. Rel., Oct. 2000. [Online] http://www.sematech.org/public/news/conferences/Reliability4/.
[5] R.C. Baumann. Soft error characterization and modeling methodologies at TI: Past, present, and future. In Proc. SRC Top. Res. Conf. Rel., Oct. 2000. [Online] http://www.sematech.org/public/news/conferences/Reliability4/.
[6] R.C. Baumann. Silicon amnesia: a tutorial on radiation induced soft errors. IRPS, 2001. [A pdf file containing the slides of this tutorial is available on request from the author of the current Technical Note.]
[7] R.C. Baumann. Soft errors in advanced semiconductor devices—Part I: the three radiation sources. IEEE Trans. Dev. Mat. Rel., 1(1):17–22, Mar. 2001.
[8] R.C. Baumann, T.Z. Hossain, S. Murata, and H. Kitagawa. Boron compounds as a dominant source of alpha particles in semiconductor devices. In Proc. IEEE Int. Rel. Phys. Symp. (IRPS), pages 297–302, 1995.
[9] R.C. Baumann and E.B. Smith. Neutron-induced boron fission as a major source of soft errors in deep submicron SRAM devices. In Proc. IEEE Int. Rel. Phys. Symp. (IRPS), pages 152–157, 2000.
[10] M.P. Baze, S.P. Buchner, and D. McMorrow. A digital CMOS design technique for SEU hardening. IEEE Trans. Nucl. Sci., 47(6):2603–2608, Dec. 2000.
[11] J. Borel. Soft errors at ground level. IEEE Design & Test of Computers, 18(6):66–67, Nov./Dec. 2001.
[12] P.M. Carter and B.R. Wilkins. Influences on soft error rates in static RAMs. IEEE J. Solid-State Circuits, 22(3):430–436, June 1987.
[13] K. Castellani-Coulié, J.M. Palau, G. Hubert, M.C. Calvet, P.E. Dodd, and F. Sexton. Various SEU conditions in SRAM studied by 3-D device simulation. IEEE Trans. Nucl. Sci., 48(6):1931–1936, Dec. 2001.
[14] N. Cohen, T.S. Sriram, N. Leland, D. Moyer, S. Butler, and R. Flatley. Soft error considerations for deep-submicron CMOS circuit applications. In Proc. IEEE Int. Dev. Meet. (IEDM), pages 315–318, 1999.
[15] C. Dai, N. Hakim, S. Hareland, J. Maiz, and S.W. Lee. Alpha-SER modeling & simulation for sub-0.25µm CMOS technology. In Proc. VLSI Tech., pages 81–82, 1999.
[16] C. Detcheverry, C. Dachs, E. Lorfèvre, C. Sudre, G. Bruguier, J.M. Palau, J. Gasiot, and R. Ecoffet. SEU critical charge and sensitive area in a submicron CMOS technology. IEEE Trans. Nucl. Sci., 44(6):2266–2273, Dec. 1997.
[17] P.E. Dodd. Device simulation of charge collection and single-event upset. IEEE Trans. Nucl. Sci., 43(2):561–575, April 1996.
[18] P.E. Dodd and F.W. Sexton. Critical charge concepts for CMOS SRAMs. IEEE Trans. Nucl. Sci., 42(6):1764–1771, Dec. 1995.
[19] P.E. Dodd, F.W. Sexton, G.L. Hash, M.R. Shaneyfelt, B.L. Draper, A.J. Farino, and R.S. Flores. Impact of technology trends on SEU in CMOS SRAMs. IEEE Trans. Nucl. Sci., 43(6):2797–2804, Dec. 1996.
[20] P.E. Dodd, M.R. Shaneyfelt, E. Fuller, J.C. Pickel, F.W. Sexton, and P.S. Winokur. Impact of substrate thickness on single-event effects in integrated circuits. IEEE Trans. Nucl. Sci., 48(6):1865–1871, Dec. 2001.
[21] P.E. Dodd, M.R. Shaneyfelt, K.M. Horn, D.S. Walsh, G.L. Hash, T.A. Hill, B.L. Draper, J.R. Schwank, F.W. Sexton, and P.S. Winokur. SEU-sensitive volumes in bulk and SOI SRAMs from first-principles calculations and experiments. IEEE Trans. Nucl. Sci., 48(6):1893–1903, Dec. 2001.
[22] E. Dupont, M. Nicolaidis, and P. Rohr. Embedded robustness IPs for transient-error-free ICs. IEEE Design & Test of Computers, 19(3):56–70, May/June 2002.
[23] L.D. Edmonds. Electric currents through ion tracks in silicon devices. IEEE Trans. Nucl. Sci., 45(6):3153–3164, Dec. 1998.
[24] L.D. Edmonds. A time-dependent charge-collection efficiency for diffusion. IEEE Trans. Nucl. Sci., 48(5):1609–1622, Oct. 2001.
[25] A. Eto, M. Hidaka, Y. Okuyama, K. Kimura, and M. Hosono. Impact of neutron flux on soft errors in MOS memories. In Proc. IEEE Int. Dev. Meet. (IEDM), pages 367–370, 1998.
[26] L.B. Freeman. Critical charge calculations for a bipolar SRAM array. IBM J. Res. Dev., 40(1):119–129, Jan. 1996.
[27] S. Hareland, C. Dai, S. Walsta, and J. Maiz. Intel perspectives on SER. In Proc. SRC Top. Res. Conf. Rel., Oct. 2000. [Online] http://www.sematech.org/public/news/conferences/Reliability4/.
[28] S. Hareland, J. Maiz, M. Alavi, K. Mistry, S. Walsta, and C. Dai. Impact of CMOS process scaling and SOI on the soft error rates of logic processes. In Proc. VLSI Tech., pages 73–74, 2001.
[29] P. Hazucha and C. Svensson. Cosmic ray soft error rate characterization of a standard 0.6-µm CMOS process. IEEE J. Solid-State Circuits, 35(10):1422–1429, Oct. 2000.
[30] P. Hazucha and C. Svensson. Impact of CMOS technology scaling on the atmospheric neutron soft error rate. IEEE Trans. Nucl. Sci., 47(6):2586–2594, Dec. 2000.
[31] P. Hazucha and C. Svensson. Optimized test circuits for SER characterization of a manufacturing process. IEEE Trans. Solid-State Circuits, 35(2):142–148, Feb. 2000.
[32] C.M. Hsieh, P.C. Murley, and R.R. O’Brien. Dynamics of charge collection from alpha-particle tracks in integrated circuits. In Proc. IEEE Int. Rel. Phys. Symp. (IRPS), pages 38–39, 1981.
[33] C.M. Hsieh, Ph.C. Murley, and R.R. O’Brien. A field-funneling effect on the collection of alpha-particle-generated carriers in silicon devices. IEEE Elec. Dev. Lett., 2(4):103–105, Apr. 1981.
[34] C.M. Hsieh, Ph.C. Murley, and R.R. O’Brien. Collection of charge from alpha-particle tracks in silicon devices. IEEE Trans. Elec. Dev., 30(6):686–693, June 1983.
[35] C. Hu. Alpha-particle-induced field and enhanced collection of carriers. IEEE Elec. Dev. Lett., 3(2):31–34, Feb. 1982.
[36] G. Hubert, J.M. Palau, K. Castellani-Coulié, M.C. Calvet, and S. Fourtine. Detailed analysis of secondary ions’ effect for the calculation of neutron-induced SER in SRAMs. IEEE Trans. Nucl. Sci., 48(6):1953–1959, Dec. 2001.
[37] G. Hubert, J.M. Palau, Ph. Roche, B. Sagnes, J. Gasiot, and M.C. Calvet. Study of basic mechanisms induced by an ionizing particle on simple structures. IEEE Trans. Nucl. Sci., 47(3):519–526, June 2000.
[38] S.H. Hwang and G.S. Choi. A reliability testing environment for off-the-shelf memory subsystems. IEEE Design & Test of Computers, 17(3):116–124, July 2000.
[39] H. Iwata and T. Ohzone. Numerical analysis of alpha-particle-induced soft errors in SOI MOS devices. IEEE Trans. Elec. Dev., 39(5):1184–1190, May 1992.
[40] JESD89. Measurement and reporting of alpha particles and terrestrial cosmic ray-induced soft errors in semiconductor devices, Aug. 2001. [Online] http://www.jedec.org/.
[41] A.H. Johnston. Scaling and technology issues for soft error rates. In Proc. SRC Top. Res. Conf. Rel., Oct. 2000. [Online] http://www.sematech.org/public/news/conferences/Reliability4/.
[42] T. Juhnke and H. Klar. Calculation of the soft error rate of submicron CMOS logic circuits. IEEE J. Solid-State Circuits, 30(7):830–834, July 1995.
[43] S.Y. Kan. Nuclear radiation induced soft errors in integrated electronics devices. Part 1. Literature review. Nat.Lab. Report 5704, 1981.
[44] S.Y. Kan. Some soft error measurements on video frame CCD memories. Part 2 of nuclear radiation induced soft errors in integrated electronic devices. Nat.Lab. Technical Note 1982/032, 1982.
[45] T. Karnik, B. Bloechel, K. Soumyanath, V. De, and S. Borkar. Scaling trends of cosmic ray induced soft errors in static latches beyond 0.18µ. In Proc. Symp. VLSI Circ., pages 61–62, 2001.
[46] T. Karnik, S. Vangal, V. Veeramachaneni, P. Hazucha, V. Erraguntla, and S. Borkar. Selective node engineering for chip-level soft error rate improvement. In Proc. Symp. VLSI Circ., pages 204–205, 2002.
[47] S. Kirkpatrick. Modeling diffusion and collection of charge from ionizing radiation in silicon devices. IEEE Trans. Elec. Dev., 26(11):1742–1753, Nov. 1979.
[48] K. de Kort and P. Damink. α-particle induced soft errors in 16K SRAMs. Nat.Lab. Report 6200, 1987.
[49] C. Lage, D. Burnett, T. McNelly, K. Baker, A. Bormann, D. Dreier, and V. Soorholtz. Soft error rate and stored charge requirements in advanced high-density SRAMs. In Proc. IEEE Int. Dev. Meet. (IEDM), pages 821–824, 1993.
[50] R. van Langevelde, A.J. Annema, D.B.M. Klaassen, A.H. Montree, H.J.M. Veendrick, and P.H. Woerlee. A comparison between silicon-on-insulator and bulk CMOS for low-voltage applications. Nat.Lab. Technical Note 204/99, 1999.
[51] M. Lee, W.I. Sze, and C.M.M. Wu. Static noise margin and soft-error rate simulations for thin film transistor cell stability in a 4 Mbit SRAM design. In Proc. IEEE Int. Symp. Circ. Syst. (ISCAS), volume 2, pages 937–940, 1995.
[52] J.R. Letaw and E. Normand. Guidelines for predicting single-event upsets in neutron environments. IEEE Trans. Nucl. Sci., 38(6):1500–1506, Dec. 1991.
[53] W. Leung, F.C. Hsu, and M.E. Jones. The ideal SoC memory: 1T-SRAM. In Proc. 13th Ann. IEEE Int. ASIC/SOC conference, pages 32–36, 2000.
[54] F.J. List, M. Vertregt, and B. Walsh. Documentation on Backend PEM modules: double 1K1 SRAM for soft error analysis; stand alone cell modules; alpha-particle sensor (Mega Note M89/16). Nat.Lab. Technical Note 146/89, 1989.
[55] D.G. Mavis and P.H. Eaton. Soft error rate mitigation techniques for modern microcircuits. In Proc. IEEE Int. Rel. Phys. Symp. (IRPS), pages 216–225, 2002.
[56] T.C. May and M.H. Woods. A new physical mechanism for soft errors in dynamic memories. In Proc. IEEE Int. Rel. Phys. Symp. (IRPS), pages 33–40, 1978.
[57] T.C. May and M.H. Woods. Alpha-particle-induced soft errors in dynamic memories. IEEE Trans. Elec. Dev., 26(1):2–9, Jan. 1979.
[58] W.R. McKee, H.P. McAdams, E.B. Smith, J.W. McPherson, J.W. Janzen, J.C. Ondrusek, A.E. Hyslop, D.E. Russell, R.A. Coy, D.W. Bergman, N.Q. Nguyen, T.J. Aton, L.W. Block, and V.C. Huynh. Cosmic ray neutron induced upsets as a major contributor to the soft error rate of current and future generation DRAMs. In Proc. IEEE Int. Rel. Phys. Symp. (IRPS), pages 1–6, 1996.
[59] P.J. McNulty. Extracting SER parameters from circuit simulation codes. In Proc. SRC Top. Res. Conf. Rel., Oct. 2000. [Online] http://www.sematech.org/public/news/conferences/Reliability4/.
[60] P.J. McNulty, Ph. Roche, J.M. Palau, and J. Gasiot. Threshold LET for SEU induced by low energy ions. IEEE Trans. Nucl. Sci., 46(6):1370–1377, Dec. 1999.
[61] MoSys. 1T-SRAM quality, May 2002.
[62] P.C. Murley and G.R. Srinivasan. Soft-error Monte Carlo modelling program, SEMM. IBM J. Res. Dev., 40(1):109–118, Jan. 1996.
[63] O. Musseau. Single-event effects in SOI technologies and devices. IEEE Trans. Nucl. Sci., 43(2):603–613, Apr. 1996.
[64] E. Normand. Single event upset at ground level. IEEE Trans. Nucl. Sci., 43(6):2742–2750, Dec. 1996.
[65] T.J. O’Gorman. The effect of cosmic rays on the soft error rate of a DRAM at ground level. IEEE Trans. Elec. Dev., 41(4):553–557, April 1994.
[66] T.J. O’Gorman, J.M. Ross, A.H. Taber, J.F. Ziegler, H.P. Muhlfeld, C.J. Montrose, H.W. Curtis, and J.L. Walsh. Field testing for cosmic ray soft errors in semiconductor memories. IBM J. Res. Dev., 40(1):41–49, Jan. 1996.
[67] T.C. Ong. TSMC 6T-SRAM soft error rate (SER) summary. TSMC, Oct. 2001.
[68] F. Ootsuka, M. Nakamura, T. Miyake, S. Iwahashi, Y. Ohira, T. Tamaru, K. Kikushima, and K. Yamaguchi. A novel 0.20 µm full CMOS SRAM cell using stacked cross couple with enhanced soft error immunity. In Proc. IEEE Int. Dev. Meet. (IEDM), pages 205–208, 1998.
[69] J.M. Palau, G. Hubert, K. Coulie, B. Sagnes, M.C. Calvet, and S. Fourtine. Device simulation study of the SEU sensitivity of SRAMs to internal ion tracks generated by nuclear reactions. IEEE Trans. Nucl. Sci., 48(2):225–231, April 2001.
[70] Ph. Roche, J.M. Palau, K. Belhaddad, G. Bruguier, R. Ecoffet, and J. Gasiot. SEU response of an entire SRAM cell simulated as one contiguous three dimensional device domain. IEEE Trans. Nucl. Sci., 45(6):2534–2543, Dec. 1998.
[71] Ph. Roche, J.M. Palau, G. Bruguier, C. Tavarnier, R. Ecoffet, and J. Gasiot. Determination of key parameters for SEU occurrence using 3-D full cell SRAM simulations. IEEE Trans. Nucl. Sci., 46(6):1354–1362, Dec. 1999.
[72] G.A. Sai-Halasz. Cosmic ray induced soft error rate in VLSI circuits. IEEE Elec. Dev. Lett., 4(6):172–174, June 1983.
[73] G.A. Sai-Halasz and M.R. Wordeman. Monte Carlo modeling of the transport of ionizing radiation created carriers in integrated circuits. IEEE Elec. Dev. Lett., 1(10):211–213, Oct. 1980.
[74] G.A. Sai-Halasz, M.R. Wordeman, and R.H. Dennard. Alpha-particle-induced soft error rate in VLSI circuits. IEEE Trans. Elec. Dev., 29(4):725–731, Apr. 1982.
[75] S. Satoh, R. Sudo, H. Tashiro, N. Higaki, and N. Nakayama. CMOS-SRAM soft-error simulation system. In Proc. IEEE Int. Rel. Phys. Symp. (IRPS), pages 339–343, 1994.
[76] S. Satoh, Y. Tosaka, and S.A. Wender. Geometric effect of multiple-bit soft errors induced by cosmic ray neutrons on DRAMs. IEEE Elec. Dev. Lett., 21(6):310–312, June 2000.
[77] N. Seifert, D. Moyer, N. Leland, and R. Hokinson. Historical trend in alpha-particle induced soft error rates of the Alpha microprocessor. In Proc. IEEE Int. Rel. Phys. Symp. (IRPS), pages 259–265, 2001.
[78] N. Seifert, X. Zhu, D. Moyer, R. Mueller, R. Hokinson, N. Leland, M. Shade, and L. Massengill. Frequency dependence of soft error rates for sub-micron CMOS technologies. In Proc. IEEE Int. Dev. Meet. (IEDM), pages 323–326, 2001.
[79] F.W. Sexton, W.T. Corbett, R.K. Treece, K.J. Hass, K.L. Hughes, C.L. Axness, G.L. Hash, M.R. Shaneyfelt, and T.G. Wunsch. SEU simulation and testing of resistor-hardened D-latches in the SA3300 microprocessor. IEEE Trans. Nucl. Sci., 38(6):1521–1528, Dec. 1991.
[80] H. Shin. Modeling of alpha-particle-induced soft error rate in DRAM. IEEE Trans. Elec. Dev., 46(9):1850–1857, Sep. 1999.
[81] D. Sinitsky, S. Peng, J. Wang, T.C. Ong, E. Chen, and F.C. Hsu. SER reliability of 1TRAM designs. In Proc. IEEE Int. Rel. Phys. Symp. (IRPS), pages 226–229, 2002.
[82] G.R. Srinivasan. Modeling the cosmic-ray-induced soft-error rate in integrated circuits: An overview. IBM J. Res. Dev., 40(1):77–89, Jan. 1996.
[83] G.R. Srinivasan, P.C. Murley, and H.K. Tang. Accurate, predictive modeling of soft error rate due to cosmic rays and chip alpha radiation. In Proc. IEEE Int. Rel. Phys. Symp. (IRPS), pages 12–16, 1994.
[84] G.R. Srinivasan, H.K. Tang, and P.C. Murley. Parameter-free, predictive modeling of single event upsets due to protons, neutrons, and pions in terrestrial cosmic rays. IEEE Trans. Nucl. Sci., 41(6):2063–2070, Dec. 1994.
[85] ST. eSRAM single upset event. Application note (draft).
[86] M. Takai, Y. Arita, S. Abo, T. Iwamatsu, S. Maegawa, H. Sayama, Y. Yamaguchi, M. Inuishi, and T. Nishimura. Evaluation of soft errors in DRAM and SRAM using nuclear microprobe and neutron source. In Proc. Eur. Solid-State Dev. Res. Conf. (ESSDERC), 2001.
[87] E. Takeda, K. Takeuchi, D. Hisamoto, T. Toyabe, K. Ohshima, and K. Itoh. A cross section of α-particle-induced soft-error phenomena in VLSI’s. IEEE Trans. Elec. Dev., 36(11):2567–2575, Nov. 1989.
[88] H.H.K. Tang. Nuclear physics of cosmic ray interaction with semiconductor materials: Particle-induced soft errors from a physicist’s perspective. IBM J. Res. Dev., 40(1):91–108, Jan. 1996.
[89] Y. Tosaka, H. Kanata, T. Itakura, and S. Satoh. Simulation technologies for cosmic ray neutron-induced soft errors: Models and simulation systems. IEEE Trans. Nucl. Sci., 46(3):774–780, June 1999.
[90] Y. Tosaka, H. Kanata, S. Satoh, and T. Itakura. Simple method for estimating neutron-induced soft error rates based on modified BGR model. IEEE Elec. Dev. Lett., 20(2):89–91, Feb. 1999.
[91] Y. Tosaka and S. Satoh. Soft error modeling and simulation for SOI circuits. In Proc. SRC Top. Res. Conf. Rel., Oct. 2000. [Online] http://www.sematech.org/public/news/conferences/Reliability4/.
[92] Y. Tosaka, S. Satoh, T. Itakura, H. Ehara, T. Ueda, G.A. Woffinden, and S.A. Wender. Measurement and analysis of neutron-induced soft errors in sub-half-micron CMOS circuits. IEEE Trans. Elec. Dev., 45(7):1453–1458, July 1998.
[93] Y. Tosaka, S. Satoh, T. Itakura, K. Suzuki, T. Sugii, H. Ehara, and G.A. Woffinden. Cosmic ray neutron-induced soft errors in sub-half micron CMOS circuits. IEEE Elec. Dev. Lett., 18(3):99–101, Mar. 1997.
[94] Y. Tosaka, S. Satoh, K. Suzuki, T. Sugii, H. Ehara, G.A. Woffinden, and S.A. Wender. Impact of cosmic ray neutron induced soft errors on advanced submicron CMOS circuits. In Proc. VLSI Tech., pages 148–149, 1996.
[95] Y. Tosaka, S. Satoh, K. Suzuki, T. Sugii, N. Nakayama, H. Ehara, G.A. Woffinden, and S.A. Wender. Measurements and analysis of neutron-reaction-induced charges in a silicon surface region. IEEE Trans. Nucl. Sci., 44(2):173–178, Apr. 1997.
[96] R. Velazco, D. Bessot, S. Duzellier, R. Ecoffet, and R. Koga. Two CMOS memory cells suitable for the design of SEU-tolerant VLSI circuits. IEEE Trans. Nucl. Sci., 41(6):2229–2233, Dec. 1994.
[97] VLSI Technology, Inc. Radiation hardness of SRAM devices in 0.20µm process. Internal Document Corporate Quality and Reliability.
[98] F. Wrobel, J.M. Palau, M.C. Calvet, O. Bersillon, and H. Duarte. Simulation of nucleon-induced nuclear reactions in a simplified SRAM structure: Scaling effects on SEU and MBU cross sections. IEEE Trans. Nucl. Sci., 48(6):1946–1952, Dec. 2001.
[99] D.S. Yaney, J.T. Nelson, and L.L. Vanskike. Alpha-particle tracks in silicon and their effect on dynamic MOS RAM reliability. IEEE Trans. Elec. Dev., 26(1):10–16, Jan. 1979.
[100] K. Zhang, S. Hareland, B. Senyk, and J. Maiz. Methods for reducing soft errors in deep submicron integrated circuits. In Proc. Int. Conf. Solid-State and Integr. Circ. Techn., pages 516–519, 1998.
[101] J.F. Ziegler. Terrestrial cosmic rays. IBM J. Res. Dev., 40(1):19–39, Jan. 1996.
[102] J.F. Ziegler. Terrestrial cosmic ray intensities. IBM J. Res. Dev., 42(1):117–139, Jan. 1998.
[103] J.F. Ziegler. Review of accelerated testing of SRAMs. In Proc. SRC Top. Res. Conf. Rel., Oct. 2000. [Online] http://www.sematech.org/public/news/conferences/Reliability4/.
[104] J.F. Ziegler and J.P. Biersack. SRIM stopping and range of ions in matter program. [Online] http://www.SRIM.org/.
[105] J.F. Ziegler, H.W. Curtis, H.P. Muhlfeld, C.J. Montrose, B. Chin, M. Nicewicz, C.A. Russell, W.Y. Wang, L.B. Freeman, P. Hosier, L.E. LaFave, J.L. Walsh, J.M. Orro, G.J. Unger, J.M. Ross, T.J. O’Gorman, B. Messina, T.D. Sullivan, A.J. Sykes, H. Yourke, T.A. Enger, V. Tolat, T.S. Scott, A.H. Taber, R.J. Sussman, W.A. Klein, and C.W. Wahaus. IBM experiments in soft fails in computer electronics (1978–1994). IBM J. Res. Dev., 40(1):3–18, Jan. 1996.
[106] J.F. Ziegler and W.A. Lanford. Effect of cosmic rays on computer memories. Science, 206:776–788, Nov. 1979.
[107] J.F. Ziegler and W.A. Lanford. The effect of sea level cosmic rays on electronic devices. J. App. Phys., 52(6):4305–4311, June 1981.
[108] J.F. Ziegler, H.P. Muhlfeld, C.J. Montrose, H.W. Curtis, T.J. O’Gorman, and J.M. Ross. Accelerated testing for cosmic soft-error rate. IBM J. Res. Dev., 40(1):51–72, Jan. 1996.
[109] J.F. Ziegler, P.A. Saunders, and T.H. Zabel. Portable Faraday cup for nonvacuum proton beams. IBM J. Res. Dev., 40(1):73–76, Jan. 1996.