A Sensor to Detect Normal or Reverse Temperature Dependence in Nanoscale CMOS Circuits

David Wolpert and Paul Ampadu
ECE Dept., University of Rochester, Rochester, NY 14627

Abstract

The temperature dependence of MOSFET drain current varies with supply voltage. Two distinct voltage regions exist: a normal dependence (ND) region, where an increase in temperature decreases drain current, and a reverse dependence (RD) region, where an increase in temperature increases drain current. Knowledge of the temperature dependence is critical for avoiding overheating and wasted performance from excessive guardbands. In this paper, we present the first temperature dependence sensor to detect whether a system is operating in the ND or RD region. The dependence sensor occupies an area of 985 NAND2 equivalent gates. The sensor consumes 15.9 pJ per sample at a supply voltage of 1 V, with a 1°C resolution over the military-specified temperature range of -55°C to 125°C.
1. Introduction

Changes in temperature affect system speed, power, and reliability by altering the threshold voltage VT and mobility µ in each device [1]. The resulting changes in device current can lead to timing failure or cause circuits to exceed power or energy budgets. Two temperature dependences exist: the normal dependence (ND) region, where drain current ID (and thus device speed) decreases with increasing temperature, and the reverse dependence (RD) region, where ID increases with increasing temperature [2]. Between these two regions there is a temperature-insensitive supply voltage VINS [3], above which circuits operate in the ND region, and below which circuits operate in the RD region. Existing temperature sensor designs do not consider this change in temperature dependence.

Temperature sensors have two purposes in integrated circuit design: to ensure reliability by (i) preventing physical damage to a chip from overheating and thermal runaway, and (ii) alerting the system when a detected temperature may cause circuits to exceed their timing requirements. Many integrated temperature sensors [4]-[8] measure differences in BJT base-emitter voltage (ΔVBE) [4]-[6] or circuit delay [7][8] to determine operating temperature. The RD region is not observable in ΔVBE [9], making sensors based on these inputs less useful for detecting temperature-induced timing failure in the RD region than MOSFET-based designs.

In this paper, we discuss a method of determining a circuit's temperature dependence with oscillator-based temperature sensors [7][8]. In technologies larger than ~45 nm, the RD region only occurs at very low voltages [10][11]; thus, prior integrated temperature sensors are designed for the ND region, and indicate that a circuit is overheating when the oscillator delay or device current increases beyond some threshold. Unfortunately, the RD region has different failure criteria; a circuit in the RD region will be overheating when oscillator delay decreases beyond some threshold. In addition, a circuit in the RD region will fail timing if its temperature decreases beyond a threshold, 'overcooling' rather than overheating.
Table 1. VINS approaches VNOM as technology scales

Technology   VNOM    VINS     VINS/VNOM
90 nm        1.2 V   0.37 V   0.31
65 nm        1.1 V   0.40 V   0.36
45 nm        1.0 V   0.61 V   0.61
32 nm        0.9 V   0.69 V   0.77
22 nm        0.8 V   0.73 V   0.91

As technology scales, it will become increasingly important to include both temperature dependence regions in sensor design. As shown in Table 1, VINS is fast approaching the designated nominal supply voltage VNOM in nanoscale technologies [10][11]. The data in Table 1 were generated using predictive technology models [12], with the 45 nm, 32 nm, and 22 nm nodes using high-κ dielectric/metal gate models. As VINS approaches VNOM, adaptive systems which vary supply voltage to reduce energy consumption or improve reliability [13]-[16] will have operating voltages in both the ND and RD regions. If the temperature dependence is not known, a decrease in oscillator delay could indicate that (i) the circuit is in the ND region and is not overheating, or (ii) the circuit is in the RD region and is overheating. Thus, including the RD region in sensor design is critical for detecting overheating circuits and diagnosing timing failures. To include both of these dependences, a system is needed to detect the temperature dependence region in which a circuit is operating. This paper presents a sensor that achieves these goals, allowing it to be used to detect overheating and timing failures over a wide range of operating voltages.

The remainder of the paper is organized as follows: Section 2 provides an overview of the normal and reverse temperature dependences. In Section 3, we present a sensor to determine the temperature dependence region of a circuit at a given supply voltage. Results are presented in Section 4, and conclusions are given in Section 5.
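As a quick check on the scaling trend in Table 1, the short sketch below recomputes the VINS/VNOM ratios and classifies an operating voltage as ND or RD. The table values are copied from Table 1; the function names and the 70% voltage-scaling example are hypothetical and only illustrate how an adaptive-voltage system can cross from one region to the other.

```python
# (technology node in nm, VNOM in V, VINS in V), copied from Table 1.
TABLE_1 = [
    (90, 1.2, 0.37),
    (65, 1.1, 0.40),
    (45, 1.0, 0.61),
    (32, 0.9, 0.69),
    (22, 0.8, 0.73),
]


def vins_ratios(table):
    """Return {node: VINS/VNOM} rounded to two decimals."""
    return {node: round(vins / vnom, 2) for node, vnom, vins in table}


def region(vdd, vins):
    """Classify a supply voltage: ND above VINS, RD below VINS."""
    if vdd > vins:
        return "ND"
    if vdd < vins:
        return "RD"
    return "INSENSITIVE"  # VDD exactly at VINS


if __name__ == "__main__":
    print(vins_ratios(TABLE_1))
    # Hypothetical adaptive system that scales VDD down to 70% of VNOM.
    for node, vnom, vins in TABLE_1:
        print(node, "nm at 0.7*VNOM:", region(0.7 * vnom, vins))
```

Under this assumed 70% scaling, the 90 nm node stays in the ND region while the 22 nm node falls into the RD region, which is the situation the dependence sensor is meant to detect.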
2. Normal and reverse temperature dependences

MOSFET mobility and threshold voltage are related to temperature by the following empirical expressions [1]:

$$\mu = \mu_0 \left( T / T_0 \right)^{\alpha_\mu} \tag{1}$$

$$V_T(T) = V_{T0} + \alpha_{V_T} \left( T - T_0 \right), \tag{2}$$
where T is the temperature, T0 is the nominal temperature, µ0 is the mobility at T0, αµ is an empirical parameter referred to as the mobility temperature exponent, VT0 is the threshold voltage at T0, and αVT = ∂VT/∂T is a negative constant referred to as the threshold voltage temperature coefficient. In the temperature region of concern (between -55°C and 125°C, the range of military operating temperatures [17]), both µ and VT decrease with increasing temperature [1]; however, decreasing µ decreases ID while decreasing VT increases ID. When the impact of a change in µ on ID is larger than the impact of a change in VT on ID, increasing temperature decreases ID (the normal temperature dependence). When the VT impact on ID is dominant, increasing temperature increases ID (the reverse temperature dependence). The dominant factor depends on the supply voltage, VDD; as VDD approaches VT, a small change in VT causes
a larger change in ID.

Fig. 1. Temperature dependence of device current across a range of supply voltages for (a) 90 nm technology, (b) 22 nm high-κ/metal gate technology.

VINS is the voltage at which the µ and VT impacts on ID counterbalance each other, i.e. the value of VDD at which
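$$\frac{\partial \mu}{\partial T} \cdot \frac{\partial I_D(V_{DD})}{\partial \mu} + \frac{\partial V_T}{\partial T} \cdot \frac{\partial I_D(V_{DD})}{\partial V_T} = 0, \tag{3}$$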
where ∂µ/∂T is the temperature dependence of mobility [18], ∂ID(VDD)/∂µ is the mobility impact on device current for a given supply voltage (∂ID(VDD)/∂µ > 0 in the phonon scattering-limited regime), ∂VT/∂T is the temperature dependence of the threshold voltage [1], and ∂ID(VDD)/∂VT is the threshold voltage impact on device current for a given supply voltage (∂ID(VDD)/∂VT < 0). Circuits operate in the ND region when VDD > VINS, are nearly insensitive to temperature when VDD = VINS, and operate in the RD region when VDD < VINS. The dependence regions are shown in Fig. 1(a) for a set of I-V curves over the range of military operating temperatures, using diode-connected PMOS and NMOS devices from a 90 nm technology model [12]. In Fig. 1, VINS occurs inside of the shaded regions. VINS occurs at increasingly larger supply voltages as technology scales, particularly with the use of high-κ dielectrics to replace SiO2 [4] (high-κ dielectrics reduce µ and change ∂µ/∂T [19], altering the balance of the µ and VT impacts on ID). The ID dependences in a 22 nm technology with high-κ dielectrics are shown in Fig. 1(b). VINS in the PMOS device increases from ~375 mV at 90 nm to ~575 mV at 22 nm. The 22 nm NMOS device is in the RD region over the entire range of supply voltages.
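The balance expressed by Eq. 3 can be illustrated numerically. The sketch below combines Eqs. 1 and 2 with an alpha-power-law current model, ID ∝ µ(T)·(VDD − VT(T))^a, and searches for the supply voltage at which ∂ID/∂T changes sign. The alpha-power law and every parameter value here are assumptions chosen for illustration; they are not taken from the paper or from the predictive technology models [12], and the model ignores subthreshold conduction.

```python
# All parameter values below are hypothetical, for illustration only.
T0 = 300.0          # nominal temperature, K
MU0 = 1.0           # normalized mobility at T0
ALPHA_MU = -1.5     # mobility temperature exponent (Eq. 1)
VT0 = 0.30          # threshold voltage at T0, V
ALPHA_VT = -1.0e-3  # threshold voltage temperature coefficient, V/K (Eq. 2)
ALPHA_SAT = 1.3     # exponent of the assumed alpha-power current law


def mobility(temp):
    """Eq. 1: mu = mu0 * (T / T0) ** alpha_mu."""
    return MU0 * (temp / T0) ** ALPHA_MU


def threshold(temp):
    """Eq. 2: VT(T) = VT0 + alpha_VT * (T - T0)."""
    return VT0 + ALPHA_VT * (temp - T0)


def drain_current(vdd, temp):
    """Assumed alpha-power law, valid only above threshold."""
    overdrive = vdd - threshold(temp)
    if overdrive <= 0:
        raise ValueError("model only covers VDD above VT")
    return mobility(temp) * overdrive ** ALPHA_SAT


def did_dt(vdd, temp=T0, dt=0.01):
    """Central-difference estimate of dID/dT at a fixed supply voltage."""
    return (drain_current(vdd, temp + dt) - drain_current(vdd, temp - dt)) / (2 * dt)


def find_vins(lo, hi, temp=T0, iterations=60):
    """Bisect for the VDD where dID/dT = 0.

    Expects dID/dT > 0 at lo (RD region) and dID/dT < 0 at hi (ND region).
    """
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if did_dt(mid, temp) > 0:
            lo = mid  # still in the RD region, move up
        else:
            hi = mid  # in the ND region, move down
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    print("VINS estimate (V):", find_vins(0.35, 1.2))
```

For this simple model the zero crossing can also be derived by hand: setting d(ln ID)/dT = 0 gives VINS = VT0 + a·αVT·T0/αµ, which the bisection result should match for the assumed parameters.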
3. Sensing temperature dependence

This section proposes a temperature dependence sensor built from two component temperature sensors. Ring oscillator-based temperature sensors provide a good indication of circuit speed (which does not depend on temperature alone; voltage stability is also important in these sensors). In the ND region, this information can determine if circuits are overheating; when the oscillator frequency fosc falls below a certain threshold frequency fTh (defined by the temperature at which a system may fail, with suitable guardband), cooling systems can be triggered or operating frequency can be throttled. fTh is dependent on the size of the oscillator and the operating voltage, VOP; however, when VOP is reduced to the RD region, overheating is indicated by fosc > fTh, not fosc < fTh. This important difference will cause existing sensors to misdiagnose overheating circuits in the RD region.
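The region-dependent overheating criterion described above can be written as a small decision function. This is a minimal sketch: the function name, the string labels for the regions, and the example frequencies are hypothetical, and in practice fTh would be chosen separately for each oscillator size and operating voltage.

```python
def is_overheating(f_osc, f_th, region):
    """Apply the overheating test for the given temperature dependence region.

    ND region: heating slows the oscillator, so overheating is f_osc < f_th.
    RD region: heating speeds the oscillator, so overheating is f_osc > f_th.
    """
    if region == "ND":
        return f_osc < f_th
    if region == "RD":
        return f_osc > f_th
    raise ValueError("region must be 'ND' or 'RD'")


if __name__ == "__main__":
    # Hypothetical readings: oscillator at 1.2 GHz, threshold at 1.0 GHz.
    print(is_overheating(1.2e9, 1.0e9, "ND"))  # oscillator still fast in ND
    print(is_overheating(1.2e9, 1.0e9, "RD"))  # same reading in RD
```

The same pair of frequencies yields opposite conclusions in the two regions, which is why a sensor that assumes the ND region misdiagnoses a circuit operating in the RD region.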
To change the overheating criteria for the ND and RD regions, we must have a sensor which can determine the dependence region of a circuit, TempDep, which may be represented as
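$$\mathrm{TempDep} = \begin{cases} 1, & \dfrac{\partial \tau}{\partial T} \ge 0 \quad (\mathrm{ND}) \\[4pt] 0, & \dfrac{\partial \tau}{\partial T} < 0 \quad (\mathrm{RD}) \end{cases} \tag{4}$$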
where ∂τ/∂T is the change in delay induced by a change in temperature. The dependence sensor requires a change in temperature to provide a reading. If the sensor is in an environment where temperature changes very slowly, the sample time may have to be increased before a reading can be provided. For system design, this is of limited importance; if the temperature is not changing, there is no temperature impact on delay.

The temperature dependence can be determined using two digital temperature sensors: one at the same operating voltage as the circuit being monitored, VOP, and another at a voltage with a known temperature dependence, VKTD (far enough from VINS that even with process and voltage variation the temperature dependence will remain known). If the two sensor readings change in the same direction, the dependence at VOP matches the known dependence at VKTD; if the readings change in opposite directions, the dependence at VOP is the opposite of the known dependence. Thus, the sensor output Sens_Out can be represented as
$$\mathrm{Sens\_Out} = \begin{cases} 1, & \mathrm{TempDep}(V_{OP}) \ne \mathrm{TempDep}(V_{KTD}) \\ 0, & \mathrm{TempDep}(V_{OP}) = \mathrm{TempDep}(V_{KTD}) \end{cases} \tag{5}$$
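Eqs. 4 and 5 translate directly into code. The sketch below is only an illustration of the two definitions; the function names are hypothetical, and the finite differences stand in for the partial derivative ∂τ/∂T.

```python
def temp_dep(delta_delay, delta_temp):
    """Eq. 4: return 1 (ND) if delay rises with temperature, else 0 (RD).

    delta_delay / delta_temp approximates the derivative of delay with
    respect to temperature between two samples.
    """
    if delta_temp == 0:
        raise ValueError("a temperature change is required for a reading")
    return 1 if delta_delay / delta_temp >= 0 else 0


def sens_out(dep_op, dep_ktd):
    """Eq. 5: 1 if the dependence at VOP differs from that at VKTD, else 0."""
    return 1 if dep_op != dep_ktd else 0


def dependence_at_vop(sensor_output, dep_ktd):
    """Recover TempDep(VOP) from Sens_Out and the known TempDep(VKTD)."""
    return dep_ktd ^ sensor_output


if __name__ == "__main__":
    known = 1  # VKTD chosen in the ND region (an assumption for this example)
    # Delay at VOP fell by 3 units while temperature rose by 5 degrees: RD.
    dep_op = temp_dep(-3.0, 5.0)
    out = sens_out(dep_op, known)
    print(dep_op, out, dependence_at_vop(out, known))
```

Because Sens_Out only reports agreement or disagreement, the system still needs the known dependence at VKTD to label the region at VOP, as the last helper shows.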
3.1. Temperature dependence sensor

The temperature dependence sensor implementation is shown in Fig. 2. Two component temperature sensors are used which simultaneously take readings at the chosen VKTD and the operating voltage of the system, VOP (these temperature readings are referred to as TKTD,t and TOP,t, respectively). Low-voltage up-shifters [20] are used to convert the lower voltage component sensor output to the higher component sensor voltage to facilitate comparison. Each reading is compared with its previous sample, TKTD,t-1 and TOP,t-1, which are stored in buffers. The comparators are composed of the overflow circuit of a kill-zero carry lookahead adder, and indicate if Tt ≥ Tt-1 or Tt < Tt-1. The two comparator outputs are then passed into an XOR gate, which determines if the temperature dependence at VOP is the known dependence or if it is the opposite dependence.
Fig. 2. Schematic of temperature dependence sensor.
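The datapath of Fig. 2 (sample buffers, two comparators, and an XOR gate) can be modeled behaviorally as follows. This is a sketch under simplifying assumptions: the class and method names are hypothetical, the readings are treated as plain integers, and the level shifters and the adder-based comparator circuit are abstracted away.

```python
class DependenceSensor:
    """Behavioral model of the buffer, comparator, and XOR datapath."""

    def __init__(self):
        self.prev_ktd = None  # buffered previous reading at VKTD
        self.prev_op = None   # buffered previous reading at VOP

    @staticmethod
    def _comparator(now, prev):
        # Outputs 1 when T_t >= T_t-1 and 0 when T_t < T_t-1.
        return 1 if now >= prev else 0

    def sample(self, t_ktd, t_op):
        """Take one pair of readings.

        Returns 0 if both readings moved in the same direction (the
        dependence at VOP matches the known one), 1 if they moved in
        opposite directions, or None on the first sample, when there is
        no previous reading to compare against.
        """
        if self.prev_ktd is None:
            result = None
        else:
            result = (self._comparator(t_ktd, self.prev_ktd)
                      ^ self._comparator(t_op, self.prev_op))
        # Update the buffers for the next comparison.
        self.prev_ktd, self.prev_op = t_ktd, t_op
        return result


if __name__ == "__main__":
    sensor = DependenceSensor()
    for readings in [(100, 50), (98, 48), (96, 51)]:
        print(readings, sensor.sample(*readings))
```

Note that an unchanged reading is reported as Tt ≥ Tt-1 by the comparator, so this output alone cannot distinguish "no change" from "increase".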
As mentioned, if the temperature does not change (or the change is too small to affect the temperature sensor outputs), the comparator outputs will indicate Tt ≥ Tt-1 regardless of the actual temperature dependence. For example, if the temperature sensor resolution is 1°C and the temperature only changes by 0.5°C between readings, there may not be a change in the component sensor outputs, so no dependence decision can be made. To account for this possibility, a Valid signal is needed to tell the system if the current temperature dependence reading is meaningful (i.e. if the temperature has changed by an amount large enough to change both temperature sensor outputs). If a reading is not valid, then the temperature-induced delay change between samples was too small to detect, and the component sensor sample times should be increased before another measurement is performed. The Valid signal is generated from a bitwise XOR of TKTD,t and TKTD,t-1, and a bitwise XOR of TOP,t and TOP,t-1. Each of these vectors is reduced to a one-bit output with an OR tree (not shown), and the two bits are passed into an AND gate to determine if both sensor outputs have changed and the result is valid.

3.2. Component temperature sensors

A low-complexity component temperature sensor design [8] is used to reduce area and energy overhead, allowing the sensor to be replicated as necessary to create a comprehensive thermal map of a chip, as well as determining multiple temperature dependences in chips which may have multiple voltage islands. Low overhead is particularly important for temperature sensors; in the Power5 microprocessor, 24 temperature sensors were used to generate a comprehensive thermal map [21]. The component sensor design is shown in Fig. 3, and consists of a ring oscillator and a pulse counter with some modifications which will be discussed momentarily.
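For reference, the Valid signal of the dependence sensor (bitwise XOR of each reading with its previous sample, OR reduction, then AND) can be sketched in a few lines. The function names are hypothetical and the readings are modeled as non-negative integers standing in for the digital output vectors.

```python
def or_reduce(bits):
    """Model of the OR tree: 1 if any bit of the vector is set, else 0."""
    return 1 if bits != 0 else 0


def valid(t_ktd_now, t_ktd_prev, t_op_now, t_op_prev):
    """Return 1 only if both component sensor outputs changed."""
    ktd_changed = or_reduce(t_ktd_now ^ t_ktd_prev)  # bitwise XOR, then OR tree
    op_changed = or_reduce(t_op_now ^ t_op_prev)
    return ktd_changed & op_changed                   # final AND gate


if __name__ == "__main__":
    print(valid(0b0101, 0b0110, 0b0111, 0b1000))  # both readings changed
    print(valid(0b0101, 0b0101, 0b0111, 0b1000))  # VKTD reading unchanged
```

A system using this flag would discard the XOR decision whenever the flag is 0 and lengthen the sample time before trying again, as described in the text.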
The oscillation period (1/fosc) is converted to a number of oscillations by applying a fixed Enable pulse width (PW), and the number of oscillations is stored in the counter to produce the digital vector Out. The flip-flops used in the counter are C2MOS flip-flops, shown to have excellent energy properties over a wide range of supply voltages [22]. The temperature sensor is modified in two minor ways to improve its behavior: additional circuitry between the Enable input and the ring oscillator is added to eliminate a synchronization problem, and the sensor outputs are connected to transmission gates to prevent unnecessary toggling of the other units in the dependence sensor. The synchronization problem occurs because the oscillator frequency is temperature-dependent; the falling edge of Enable may result in a final pulse that is only one or two inverter delays long, causing a spurious output response. The spurious response is shown near -15°C in the top half of Fig. 4, which plots the first three output bits over the entire range of temperatures
examined.

Fig. 3. Schematic of component temperature sensor.

Fig. 4. Impact of temperature on oscillator output with and without enable correction circuit.

The Reset device is added to ensure that the oscillator is in the correct state when re-enabled for the next sample, and reduces short circuit power resulting from the floating output node of the transmission gate between samples. The grounded NMOS device ensures that leakage through the Reset device does not pull up the floating node between the end of the Enable pulse and the time at which the sample is read. The oscillator period has a nonlinear relationship with temperature, as shown by the increasing range of temperatures for each state in the lower half of Fig. 4 (for example, at the low temperature end the sensor output changes approximately every 2°C, while at higher temperatures the sensor output changes approximately every 4°C). The minimum resolution of the sensor is the largest range of temperatures with a single sensor output; in the lower half of Fig. 4, the minimum resolution is ~10°C; resolutions of