Uprating of a single inline memory module - IEEE Xplore


IEEE TRANSACTIONS ON COMPONENTS AND PACKAGING TECHNOLOGIES, VOL. 25, NO. 2, JUNE 2002

Uprating of a Single Inline Memory Module

Neeraj Pendsé, Member, IEEE, Dawn Thomas, Diganta Das, Member, IEEE, and Michael G. Pecht, Fellow, IEEE

Abstract: Uprating is the process of assessing the ability of a part to meet the functionality and performance requirements for the application in which the part is used outside the manufacturer-specified temperature range. Uprating is considered when there are no electronic parts rated to operate at the required application temperature and other alternatives are found to be technically incompatible or inadequate. This paper presents a case study of an uprating assessment of single inline memory modules. The focus of the study was to determine if the modules could operate at extended temperatures for extended periods of time. The results established the ability of the parts to function at high temperature for extended periods.

Index Terms: Memory testing, parameter conformance, parameter recharacterization, part rating, reliability, stress balancing, uprating.

I. INTRODUCTION

TODAY’S semiconductor parts are most often specified for use in the “commercial” 0 to 70 °C range and, to a lesser extent, in the “industrial” −40 to 85 °C operating temperature range. These temperature ranges generally satisfy the dominant semiconductor customers in the computer, telecommunications, and consumer electronic industries. However, there is also demand for parts rated beyond the “industrial” temperature range, primarily from the aerospace, military, oil and gas exploration, and automotive industries, but the demand is often not large enough to induce semiconductor part manufacturers to make these parts. The challenge is to determine what can be done if equipment manufacturers cannot find parts with documented temperature specifications that meet the required conditions.

Efforts are underway to determine methods to assess the capability of electronic parts to perform at extended temperature ranges. In particular, the CALCE Electronic Products and Systems Center at the University of Maryland has developed methods for uprating¹ [21], [25], and has helped the International Electrotechnical Commission (IEC) develop processes for a guide on methods of using electronic parts outside the manufacturer’s specified temperature range [5]. Previous CALCE work has also pointed out the difficulties in using junction-temperature-based uprating methodologies [24]. For a legal discussion of uprating, the reader is referred to [2], [23], [26].

This paper presents a case study of an uprating assessment of single in-line memory modules (SIMMs) at extended temperatures for extended periods. The SIMMs were rated for an ambient temperature range of 0 to 70 °C. In this study, diagnostic tests at temperatures averaging 152 °C ambient were conducted on the modules. The purpose was to determine if the modules were capable of performing outside their temperature specification for an extended period [22].

Manuscript received September 13, 2000; revised February 27, 2002. This work was supported by the CALCE Electronic Product and Systems Center, University of Maryland. This work was recommended for publication by Associate Editor D. N. Donahoe upon evaluation of the reviewers’ comments. N. Pendsé was with the CALCE Electronic Products and Systems Center, University of Maryland, College Park, MD 20742 USA. He is now with National Semiconductor, Santa Clara, CA 95053 USA (e-mail: [email protected]). D. Thomas, D. Das, and Michael G. Pecht are with the CALCE Electronic Products and Systems Center, University of Maryland, College Park, MD 20742 USA (e-mail: [email protected]). Publisher Item Identifier S 1521-3331(02)04736-0.

¹The term uprating was coined by Michael Pecht [21] to distinguish it from upscreening, which is a term used to describe the practice of attempting to create a part equivalent to a higher quality level by additional screening of a part (for example, screening a commercial part to a space qualification part). Uprating may be performed for environmental and operational factors other than temperature; this study is for operating temperature only.

Fig. 1. Functional block diagram of a DRAM (CAS: Column Address Strobe, RAS: Row Address Strobe, WR: Write, RD: Read).

II. BASICS OF DRAM

Random access memory (RAM) is used for temporary storage of programs and data while the programs are executing. The two main types of read–write computer memory are static random access memory (SRAM) and dynamic random access memory (DRAM). DRAM uses a smaller area per cell by employing a single transistor that charges a single capacitor to represent a “1” or “0,” and requires periodic refreshing (every several microseconds) due to capacitor self-discharge. This capacitor is charged/discharged to write/read the memory contents. As opposed to DRAM, SRAM employs two back-to-back connected transistors. Due to its smaller cell size, DRAM is cheaper to manufacture; however, SRAM is faster than DRAM. Therefore, DRAM is used extensively for main memory, with SRAM relegated to cache memory. A typical functional block diagram of a DRAM is shown in Fig. 1. The DRAM consists of a memory array; each “cell” in the array can be accessed by a row number and a column number for reading or writing.
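As a minimal sketch of the row and column access just described, the Python below splits a linear cell address into a row number and a column number and joins them back. The function names, the choice of upper bits for the row, and the 1024 by 1024 array size are illustrative assumptions, not details of the parts used in this study.

```python
from typing import Tuple


def split_address(address: int, row_bits: int, col_bits: int) -> Tuple[int, int]:
    """Split a linear cell address into (row, column) numbers.

    Assumption: the upper bits select the row and the lower bits the column.
    """
    total_cells = 1 << (row_bits + col_bits)
    if not 0 <= address < total_cells:
        raise ValueError(f"address {address} outside 0..{total_cells - 1}")
    row = address >> col_bits
    column = address & ((1 << col_bits) - 1)
    return row, column


def join_address(row: int, column: int, col_bits: int) -> int:
    """Rebuild the linear address from its row and column numbers."""
    return (row << col_bits) | column


# Hypothetical array of 1024 rows by 1024 columns (2**20 cells).
ROW_BITS, COL_BITS = 10, 10
row, column = split_address(0b0000000011_0000000101, ROW_BITS, COL_BITS)
print(row, column)  # 3 5
```

In this hypothetical array, each half of the address is 10 bits wide, so a part that receives the row and column numbers one after the other on shared pins would need 10 address pins rather than 20.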
The row and column numbers form the “address” of the cell location in the memory, and are specified on the address lines (as an n-bit binary number, on n address lines) of the memory IC. The data lines are used to read/write data to the cell being addressed by the address lines. The Read/Write signals tell the memory chip which operation is being performed. Typically in DRAMs, the address lines are multiplexed, i.e., the row address and the column address are given one after the other on the same set of pins. To signal to the DRAM which address is for rows and which is for columns, the signals RAS (row address strobe) and CAS (column address strobe) are used. This technique halves the required number of address lines [3], [6], [7], [10].

1521-3331/02$17.00 © 2002 IEEE

III. TEST MODULES

Single in-line memory modules (SIMMs) are memory card assemblies often used as extended memory in computers. The purpose of the SIMM is to provide additional real estate for the motherboard, while increasing the memory capacity of the computer. The current trend is the use of dual in-line modules called DIMMs. A typical SIMM with surface mounted memory (typically DRAM) attached to a printed wiring board with edge connectors is shown in Fig. 2. The SIMM is connected to the motherboard via a SIMM socket and an edge connector. Electrical contacts are in a single line, with opposing pins on either side of a module tied together to form one electrical contact [16].

The SIMMs used in this case study were assembled with plastic small outline J-leaded (SOJ) DRAMs from Micron and IBM (part numbers MT4C1024 and 11E1320PA-70, respectively), and some chip capacitors. The test module was a double-sided SIMM with 12 DRAM parts, six per side. There were eight IBM parts and four Micron parts. Both DRAM parts are rated for 0 to 70 °C ambient temperature operation. The modules as a whole are also rated for 0 to 70 °C operation [12].

Fig. 2. Single inline memory module.

IV. MEMORY FUNCTIONALITY AND TESTING

Failure of memory can lead to program or computer shutdown.
Causes of failures depend on such factors as component density, circuit layout, manufacturing method, materials and process defects, operational environment extremes, and aging effects [13], [14]. For a memory to be considered functional, it must be able to perform the following functions with the required voltage and timing performance, regardless of the contents of the other cells or previous memory access sequences:
1) store a 0 or a 1 into every cell of the memory;
2) change every cell from 0 to 1 as well as from 1 to 0;
3) read every cell correctly [13].
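The three functions listed above can be illustrated with a short march-style sequence run against a simulated memory. This is only a sketch under stated assumptions: the `SimulatedMemory` class, its stuck-at fault injection, and the `functional_test` routine are hypothetical and are not the tests used in this study. The sketch also says nothing about voltage, timing, or dependence on the contents of other cells.

```python
from typing import Dict, List, Optional


class SimulatedMemory:
    """Toy bit-wide memory; `stuck_at` maps an address to the value it is stuck at."""

    def __init__(self, size: int, stuck_at: Optional[Dict[int, int]] = None) -> None:
        self._cells = [0] * size
        self._stuck_at = dict(stuck_at or {})

    def __len__(self) -> int:
        return len(self._cells)

    def write(self, address: int, bit: int) -> None:
        self._cells[address] = bit

    def read(self, address: int) -> int:
        # A stuck cell ignores whatever was written to it.
        return self._stuck_at.get(address, self._cells[address])


def functional_test(memory: SimulatedMemory) -> List[int]:
    """Return the sorted addresses that fail to store, change, or read a bit."""
    failed = set()
    size = len(memory)

    for address in range(size):            # store a 0 in every cell
        memory.write(address, 0)
    for address in range(size):            # read the 0, then change 0 -> 1
        if memory.read(address) != 0:
            failed.add(address)
        memory.write(address, 1)
    for address in reversed(range(size)):  # read the 1, then change 1 -> 0
        if memory.read(address) != 1:
            failed.add(address)
        memory.write(address, 0)
    for address in range(size):            # confirm the 1 -> 0 change took effect
        if memory.read(address) != 0:
            failed.add(address)
    return sorted(failed)


print(functional_test(SimulatedMemory(16)))                   # []
print(functional_test(SimulatedMemory(16, stuck_at={3: 1})))  # [3]
```

Every cell is written with both values, exercised through both transitions, and read after each write, so a cell stuck at either value is reported.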

Fig. 3. Steady state temperature test setup.

Faults only cause failure if the memory cell containing the fault is accessed [17]. Categories of faults include stuck-at, transition, coupling, linked, retention, and pattern-sensitive. Faults are classified as permanent if they lead to catastrophic failures. Non-permanent faults are not expected to cause irreversible damage [20]. Stress on the memory depends on the on-chip current and voltage signals, which in turn depend on the activity the chip is performing. Chip activity depends on the input patterns and operating conditions.

Memory testing is performed at various stages in the production of the memory chip to identify defects in the chip. Test patterns are developed based on factors such as allocated test time, the degree of access to the internal circuitry, fault coverage, and cost. For practical purposes, a test to cover all possible faults is impossible. In fact, the development of test patterns over the years has shown that no single pattern can exercise a RAM thoroughly enough to detect all the failure modes [15]. Memories are therefore tested with several patterns. For example, a “walking one” pattern is used to detect shorts between adjacent memory cells. In the “walking one” pattern, all the memory locations are first initialized to zero. Then, a one is written to each location in turn. That location and all its surrounding locations are read to see if that one has “leaked” to the neighboring locations. Then that location is set to zero. The process is continued throughout the entire memory.

V. EXPERIMENTAL DESCRIPTION

The test platform consisted of a Compaq Presario 924 (Fig. 3). The side cover panel of the computer was removed for accessibility to the SIMMs. Minco thin film resistive heaters were used with two nonfunctional “dummy” SIMMs that were placed in the first and third SIMM slots. The SIMM under test was placed in the second SIMM slot, located between slots 1 and 3. An average temperature of 152 °C on the SIMM was obtained during this test.
Type T thermocouples (5 mil, 36 AWG), read with an Omega Type T thermocouple thermometer, were used to monitor the SIMM temperature. Thermocouples were placed on various components, underneath components, and on the surface of the SIMM board. A thermocouple was also used to measure the temperature of the computer motherboard. Type T thermocouples have an upper temperature limit of 371 °C (700 °F) [1].
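For reference, the “walking one” pattern described in Section IV can be sketched as follows. The grid model, the eight-neighbour definition of “surrounding locations,” and the way a short is simulated are all assumptions made for illustration. The diagnostic software used in this study is not documented at this level, and physical adjacency on a real chip need not match logical addresses.

```python
from typing import Dict, Iterable, List, Set, Tuple

Cell = Tuple[int, int]  # (row, column)


class GridMemory:
    """Toy 2-D bit array; `shorts` lists pairs of cells that are shorted together."""

    def __init__(self, rows: int, cols: int,
                 shorts: Iterable[Tuple[Cell, Cell]] = ()) -> None:
        self.rows, self.cols = rows, cols
        self._bits = [[0] * cols for _ in range(rows)]
        self._shorted: Dict[Cell, Set[Cell]] = {}
        for a, b in shorts:
            self._shorted.setdefault(a, set()).add(b)
            self._shorted.setdefault(b, set()).add(a)

    def write(self, cell: Cell, bit: int) -> None:
        # Assumption: a shorted partner always follows the written value.
        for r, c in {cell} | self._shorted.get(cell, set()):
            self._bits[r][c] = bit

    def read(self, cell: Cell) -> int:
        r, c = cell
        return self._bits[r][c]


def neighbours(memory: GridMemory, cell: Cell) -> List[Cell]:
    """Return the up-to-eight cells surrounding `cell` inside the array."""
    r, c = cell
    return [(r + dr, c + dc)
            for dr in (-1, 0, 1) for dc in (-1, 0, 1)
            if (dr, dc) != (0, 0)
            and 0 <= r + dr < memory.rows and 0 <= c + dc < memory.cols]


def walking_one(memory: GridMemory) -> List[Tuple[Cell, Cell]]:
    """Return (written cell, misbehaving cell) pairs found by the pattern."""
    cells = [(r, c) for r in range(memory.rows) for c in range(memory.cols)]
    for cell in cells:                      # initialize every location to zero
        memory.write(cell, 0)
    faults = []
    for cell in cells:
        memory.write(cell, 1)               # walk the one to this location
        if memory.read(cell) != 1:
            faults.append((cell, cell))     # the one did not stick
        for other in neighbours(memory, cell):
            if memory.read(other) != 0:
                faults.append((cell, other))  # the one leaked to a neighbour
        memory.write(cell, 0)               # restore zero before moving on
    return faults


print(walking_one(GridMemory(4, 4)))  # []
print(walking_one(GridMemory(4, 4, shorts=[((1, 1), (1, 2))])))
# [((1, 1), (1, 2)), ((1, 2), (1, 1))]
```

The simulated short is reported twice, once from each side, because the one leaks in both directions.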


There is a ±1 °C error introduced by the thermocouple² [18]. A 36 AWG wire was used to minimize extension wire error. In-situ functional tests were performed using the QAPlus/FE diagnostic software to assess whether the voltage and timing parameters met functional requirements during the operation of the computer. QAPlus/FE performs a series of memory tests that access each of the data bits in various patterns to detect faults on the memory chip. In this experiment, the software was executed continuously. Each time a test was performed, the temperature readings from the thermocouples were recorded.

²Information on thermocouple accuracy is obtained from the thermocouple manufacturer.

VI. TEST RESULTS, CONCLUSIONS, AND DISCUSSIONS

The modules were confirmed to operate for 2592 continuous hours (108 days) at an average temperature of 152 °C. Micron performs high temperature operating life (HTOL) qualification (integrity) tests on their DRAM parts at 125 °C for 1008 h. Thus, testing in this study was more severe than the integrity tests done by the manufacturer. The results show that the SIMM can work at temperatures outside those for which it is specified, and can do so for extended periods without failing. If the required application conditions and the expected failure mechanisms under those conditions are known, then these results can be translated to predict a lifetime for the SIMM in the application, using temperature dependent physics of failure models [8], [9], [11]. The reliability of the part in its intended application has to be assessed independent of whether uprating is performed or the uprating assessment is successful.

Part data sheets [4] provide absolute maximum ratings and recommended operating conditions. These two terms are used on almost all data sheets. The following observations can be made regarding the two ratings.
1) Absolute maximum ratings (AMR) are generally provided as a limit for the “reliable” use of parts.
2) Recommended operating conditions are the conditions within which electrical functionality and specifications of a part are guaranteed.

In other words, the part temperature ratings are set for electrical performance reasons as opposed to package or device reliability reasons. The part temperature ratings are designated in the recommended operating conditions for parts. The limits for reliable operation of parts are designated by the absolute maximum ratings, but use at AMR conditions does not imply catastrophic failure. Almost all data sheets contain some form of warning statement or disclaimer to discourage or prohibit use of the parts at or near absolute maximum ratings. The most common wordings used in the warning labels regarding AMR are given below. None of these warnings point to any catastrophic failure at the AMR conditions.
1) These are stress ratings only.
2) Stresses above these ratings can cause permanent damage to the parts.
3) Functional operation is not implied in these ranges.
4) Exposure to these conditions for extended periods may affect reliability and reduce useful life.
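The distinction between the two data sheet ratings can be made concrete with a small sketch that places a temperature in one of three regions. The class and function names are hypothetical. The 0 to 70 °C recommended range is the rating quoted in this paper, but the AMR limits below are placeholders and are not taken from the Micron or IBM data sheets, so the sketch says nothing about where the 152 °C test condition sits relative to the actual AMR of these parts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TemperatureRatings:
    recommended_min_c: float
    recommended_max_c: float
    absolute_min_c: float  # placeholder AMR limit, not from a data sheet
    absolute_max_c: float  # placeholder AMR limit, not from a data sheet


def classify(temperature_c: float, ratings: TemperatureRatings) -> str:
    """Place a temperature relative to the two kinds of data sheet rating."""
    if ratings.recommended_min_c <= temperature_c <= ratings.recommended_max_c:
        # Electrical functionality and specifications are guaranteed here.
        return "within recommended operating conditions"
    if ratings.absolute_min_c <= temperature_c <= ratings.absolute_max_c:
        # No manufacturer guarantee; this is where an uprating assessment applies.
        return "outside recommended operating conditions, within AMR"
    # Data sheet warnings state that stresses above AMR can cause permanent damage.
    return "beyond AMR"


ratings = TemperatureRatings(
    recommended_min_c=0.0,
    recommended_max_c=70.0,
    absolute_min_c=-55.0,   # hypothetical
    absolute_max_c=125.0,   # hypothetical
)

for temperature_c in (25.0, 100.0, 200.0):
    print(temperature_c, classify(temperature_c, ratings))
```

The middle region is the one of interest for uprating: the manufacturer does not guarantee electrical specifications there, so functionality has to be established by assessment, as in this case study.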

In this case study, the parts have a commercial temperature rating of 0 to 70 °C as the recommended operating condition, but the qualification and integrity monitor tests performed on the parts are based on military standards. This is not unexpected, because the qualification and integrity monitor tests for the parts do not have a direct relation with the operating temperature limits of the parts. The integrity tests measure the ability of the part to survive some “standard” load conditions, but there is no relation to the use conditions. In fact, it has been generally observed that the manufacturer’s part qualification process is not based on the part’s temperature ratings but rather on Mil-Std-883 type testing [19]. Only recently have some manufacturers begun planning to move toward multiple testing and qualification based on the use conditions [27]; these concepts are yet to be widely accepted.

The reliability of the part in its intended application has to be assessed independent of whether uprating is performed or the uprating assessment is successful. The test results from this case study do not prove or disprove the reliability of the part in all application conditions. The result presented here establishes the functionality of the part at high temperature and that the part can be uprated; in addition, the results provide input to the reliability assessment process.

REFERENCES

[1] ASTM, Manual on the Use of Thermocouples in Temperature Measurement. Philadelphia, PA: Amer. Soc. Testing Mater., 1974, p. 19.
[2] R. Biagini, M. Rowland, M. Jackson, and M. Pecht, “Tipping the scales in your favor when uprating,” IEEE Circuits Devices Mag., vol. 15, pp. 15–23, July 1999.
[3] L. Condra et al., “Terminology on use of electronic parts outside the manufacturer’s specified temperature ranges,” IEEE Trans. Comp., Packag., Manufact. Technol. A, vol. 21, pp. 355–356, Sept. 1999.
[4] D. Das, N. Pendse, M. Pecht, L. Condra, and C. Wilkinson, “Deciphering the deluge of data,” IEEE Circuits Devices Mag., vol. 16, pp. 26–34, Sept. 2000.
[5] IEC/PAS 62 240, edition 1, 2001-04, Use of Semiconductor Devices Outside Manufacturers’ Specified Temperature Ranges (Also being developed as GEIA 4900), 2001.
[6] M. Jackson, A. Mathur, M. Pecht, and R. Kendall, “Part manufacturer assessment process,” Qual. Rel. Eng. Int., vol. 15, pp. 457–468, 1999.
[7] M. Jackson, P. Sandborn, M. Pecht, C. Hemens-Davis, and P. Audette, “A risk informed methodology for parts selection and management,” Qual. Rel. Eng. Int., vol. 15, pp. 261–271, 1999.
[8] P. Lall, M. Pecht, and E. Hakim, Influence of Temperature on Microelectronics and System Reliability: A Physics of Failure Approach. New York: CRC, 1997.
[9] P. Lall, M. Pecht, and E. Hakim, “Characterization of functional relationship between temperature and microelectronic reliability,” Microelectron. Rel., vol. 35, no. 3, pp. 377–402, 1995.
[10] P. Lall, M. Pecht, and E. Hakim, Influence of Temperature on Microelectronics and System Reliability: A Physics of Failure Approach. Boca Raton, FL: CRC, 1997.
[11] M. Pecht, R. Radojcic, and G. Rao, Guidebook for Managing Silicon Chip Reliability. Washington, DC: CRC, 1999.
[12] N. Pendse and M. Pecht, “Parameter re-characterization case study: Electrical performance comparison of the military and commercial versions of an octal buffer,” in Future Circuits International. London, U.K.: Technology, 2000, vol. 6, pp. 63–67.
[13] V. N. Rayapati, “VLSI semiconductor random access memory functional testing,” Microelectron. Rel., vol. 30, no. 5, pp. 877–889, 1990.
[14] J. Rhea, “COTSCON attendees agree: Upscreening is here to stay,” Military Aerosp. Electron., p. 1, June 1999.
[15] A. K. Sharma, Semiconductor Memories: Technology, Testing, and Reliability. Piscataway, NJ: IEEE Press, 1997.
[16] SMART Modular Technologies. (1998, May) Memory industry terms & acronyms. [Online]. Available: www.smartm.com/knowledge/html/memoryterms.html.
[17] I. Sommerville, Software Engineering, 5th ed. Reading, MA: Addison-Wesley, 1995.
[18] A. Svab. (1998, July) Thermocouples, Thermistors & RTDs. Tech. Rep. [Online]. Available: www.sensorsci.com/thermocouples.htm.
[19] U.S. Department of Defense, Test Method Standards: Microcircuits, Mil-Std-883, 1996.
[20] A. J. van de Goor, Testing Semiconductor Memories: Theory and Practice. New York: Wiley, 1991.
[21] M. Wright et al., “Uprating electronic components for use outside their temperature specification limits,” IEEE Trans. Comp., Packag., Manufact. Technol. A, vol. 20, pp. 252–256, June 1997.
[22] Xilinx Inc., “The reliability data program, expanded version,” Tech. Rep., San Jose, CA, 1999.
[23] M. Pecht and R. Biagini, “The business, product liability and technical issues associated with using electronic parts outside the manufacturer’s specified temperature range,” in Proc. 7th Pan-Pacific Microelectron. Symp., Maui, HI, Feb. 5–7, 2002, pp. 391–398.
[24] L. Condra, D. Das, C. Wilkinson, N. Pendse, and M. Pecht, “Junction temperature considerations in evaluation parts for use outside manufacture-specified temperature ranges,” IEEE Trans. Comp. Packag. Technol., vol. 24, pp. 721–728, Dec. 2001.
[25] D. Das, N. Pendse, M. Pecht, and C. Wilkinson, “Parameter re-characterization: A method of thermal uprating,” IEEE Trans. Comp. Packag. Technol., vol. 24, pp. 729–737, Dec. 2001.
[26] D. W. Okey, “Products liability and uprating of electronic components,” Southern Methodist Univ. Sch. Law J. Air Law Commerce, 1999.
[27] Intel, “Knowledge based reliability evaluation of new package technologies utilizing use conditions,” Tech. Rep., Santa Clara, CA, Mar. 1999.

Neeraj Pendsé (M’99) received the B.E. degree in electrical engineering from Visvesvaraya Regional College of Engineering, Nagpur, India, in 1998 and the M.S. degree in mechanical engineering from University of Maryland, College Park (UMCP), in 1999. He was a Research Assistant at the CALCE Electronic Products and Systems Center, UMCP. He has been with National Semiconductor Corporation, Santa Clara, CA, since March 2000, where he works on high-speed interconnect design and electrical performance of packaging.


Dawn Thomas, photograph and biography not available at the time of publication.

Diganta Das (M’00) received the B.Tech. degree (with honors) from the Indian Institute of Technology and the Ph.D. degree in mechanical engineering from the University of Maryland, College Park (UMCP). He is a Research Associate in the CALCE Electronic Products and Systems Center, UMCP. His primary research interests are environmental and operational ratings of electronic parts, uprating, obsolescence prediction, and management, technology trends in the electronic parts, and their effects on the parts selection and management methodologies. He has published in the areas of electronic part uprating, operational environments of electronic parts, organized international conferences and workshops, and worked in international standards developments. He also provides services to scholarly journals and magazines.

Michael Pecht (F’92) received the B.S. degree in acoustics, the M.S. degree in engineering, and the M.S. and Ph.D. degrees in engineering mechanics, all from the University of Wisconsin, Madison. He is the Director of the CALCE Electronic Products and Systems Center, University of Maryland, College Park, and a Full Professor with a three-way joint appointment in mechanical engineering, engineering research, and systems research. He serves on the Board of Advisors for various companies and consults for the U.S. government, providing expertise in strategic planning in the area of electronics products development and marketing. Dr. Pecht is an ASME Fellow. He served as Chief Editor of the IEEE TRANSACTIONS ON RELIABILITY for eight years and on the Advisory Board of IEEE SPECTRUM. He is currently the Chief Editor for Microelectronics Reliability, an Associate Editor for the IEEE TRANSACTIONS ON COMPONENTS AND PACKAGING TECHNOLOGIES, and on the Advisory Board of the Journal of Electronics Manufacturing.
