Department of Electrical Engmeermg, Unwersity of Southern Cahfornta, Los Angeles, .... A semiconductor RAM organization at the functional level of description.
Functional Testing of Semiconductor Random Access Memories Magdy S. Abadir Department of Electrical Engmeermg, Unwersity of Southern Cahfornta, Los Angeles, California 90007
Hassan K. Reghbati Computmg Science Department, S~mon Fraser Universzty, Burnaby, British Columbia, Canada VSA 1S6
This paper presents an overview of the problem of testing semiconductor random access memories (RAMs). An important aspect of this test procedure is the detection of permanent faults that cause the memory to function incorrectly. Functional-level fault models are very useful for describing a wide variety of RAM faults. Several fault models are &scussed throughout the paper, including the stuck-at-0/1 faults, coupled-cell faults, and single-cell pattern-sensitive faults. Test procedures for these fault models are presented and their fault coverage and execution times are discussed. The paper is intended for the general computer scmnce audience and presupposes no background in the hardware testing area. Categories and Subject Descriptors: B.3.4 [Memory Structures]'. Reliability, Testing, and Fault Tolerance--test generatmn; B.7.3 [ I n t e g r a t e d Circuits]: Reliability and
Testing--test generatton General Terms: Algorithms, Rehabihty Additional Key Words and Phrases: Checking experiments, fault detection
INTRODUCTION
Recent developments in semiconductor memory technology have greatly increased the use of semiconductor random access memories (RAMs). These memories are usually large-scale-integration (LSI) devices and employ either bipolar or metal oxide semiconductor (MOS) techniques lINTEL 1975]. To detect the variety of faults that can occur within a memory chip, extensive testing of semiconductor memories is carried out by both users and manufacturers. As a result of the large number of cells involved, most of the standard faultdetection techniques [Muehldorf and Savkar 1981] are not suitable for memory testing. A RAM chip consists mainly of a memory cells array, an address decoder, a memory
address register (MAR), a memory data register (MDR), and read/write logic (sense amplifiers, write drivers, etc.). The general organization of a semiconductor RAM is given in Figure 1. Physical faults can occur in any part of the memory chip. The causes of failure in RAMs, though not yet fully understood, are known to depend on such factors as component density, circuit layout, and method of manufacture. In this paper, we limit ourselves to the fault-detection, rather than the fault-location, problem. In this way, testing for faults that have similar effects on the functional behavior of the memory system can be simplified by testing for one of these faults. We intend to survey the different methods for detecting functional faults in semiconductor RAMs and to compare these
Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the ACM copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Association for Computing Machinery. To copy otherwise, or to republish, requires a fee and/or specific permission. © 1983 ACM 0360-0300/83/0900-0175 $00.75 Computing Sutveys~VoL 15, No, 3, September 1983
176
•
M . S . Abadir and H. K. Reghbati
CONTENTS
procedure is supplied by the manufacturer. Because DC parametric testing is so implementation dependent, we do not discuss it further in this paper.
INTRODUCTION 1 MEMORY FAULT MODELS 2. T E S T I N G F O R S T U C K - A T F A U L T S 2.1 The M S C A N Test Procedure 2 2 The A T S Procedure 2.3 The M A T S Procedure 2 4 Comparative Study 3. T E S T I N G F O R C O U P L I N G F A U L T S 3.1 Commonly Used Test Procedures
AC parametric testing, or dynamic testing, measures AC parameters of the RAM to detect any malfunction. The parameters measured include memory access time, setup time, and hold time. Examples of malfunctions that can be detected by AC testing include [Breuer and Friedman 1976]
3 2 More Effective Procedures for the Detection of Coupling Faults 3 3 Comparative Study 4. TESTING FOR PATI~ERN-SENSITWE FAULTS 5 CONCLUDING REMARKS ACKNOWLEDGMENT REFERENCES
(1) Slow write-recovery: the memory may not produce the correct information at the specified access time when each read cycle is preceded by a write cycle. (2) Sleeping sickness: memory loses information in less than the specified hold time. 1
A
v
Although AC parametric test procedures are not considered in this paper, functional test procedures detect many AC malfuncmethods in terms of their time efficiency tions. For example, the stuck-at faults and fault coverage. We begin in Section 1 tests, when applied to the memory at the by describing the three most commonly maximum possible rate, may detect slow used fault models for s e m i c o n d u c t o r write-recovery faults. RAMs, concentrating on the stuck-at faults Functional testing is designed to detect model, the coupling faults model, and the permanent faults that cause the memory to pattern-sensitive faults model. function incorrectly.2 Generally speaking, a 1. MEMORY FAULT MODELS
A wide variety of internal faults can occur in semiconductor memories, causing various types of failures in the memory function. Test procedures necessary to detect these faults can be classified--as in any digital circuit test--into three classes: DC parametric testing, AC parametric testing, and functional testing [Abadir and Reghbati 1983; Akers 1980; Breuer and Friedman 1976; Muehldorf and Savkar 1981].
DC parar~etric testing, as the name suggests, measures the DC parameters in semiconductor RAMs. It checks for unacceptable output level, high power consumption, fanout capability, noise margins, rise and fall times in electrical signals, etc. Clearly, this kind of testing depends on the implementation details of the RAM under test, and usually the DC parametric test Computing Surveys, Vol. 15, No 3, September 1983
1 Metal oxide semmonductor (MOS) dynamic memories store their data via the charge across a capacitor, because of leakage this charge must be refreshed at fixed intervals, or the value in the cell will be lost The maximum period of time t h a t data can be stored without refreshing is called the hold time. 2 The contrast between software program testing and functional testing of digital systems is worth noting. In program testing, the main concern is achieving an acceptable assurance t h a t the programmer has put together his primitives (i.e., language constructs) correctly. Although functional testing of digital systems is similarly concerned with whether hardware primitives have been put together correctly, it is further concerned with the integrity of the primitives themselves. Moreover, while human factors and software complexity make program testing largely an art at present, the relative simplicity of digital systems {memories, in particular) makes functional testing more methodological. The program testing work t h a t is most analogous to functional testing is that of compiler validation [Hoyt 1977]. An area of special interest is test data selection techniques [Demillo et al. 1978].
Functional Testing of Semiconductor Random Access Membries
i!
177 I
I
Memory CellsArray
SenseAmplifiers& Write Drivers
MDR
]
Figure 1. A semiconductorRAM organizationat the functionallevelof description.
read/write RAM can be defined as functional if it is possible to store a 0 or a 1 into every cell of the memory, to change every cell from 0 to 1 as well as from 1 to 0, and to read every cell correctly when it stores a 0 as well as when it stores a 1, regardless of the contents of the remaining cells or the previous memory access sequence~ A functional test that will cover all the possible faults is impossible from the practical point of view, since the complexity of such a test will be on the order of 2n [Hayes 1975], where n is the number of cells in the memory (we have to check every cell for all possible states of the remaining cells). For example, assuming an access time of ~500 nanoseconds, testing a 1 Kbit RAM would take approximately 10293 seconds! Therefore, in order to develop any practically feasible test procedure, we should restrict ourselves to a subset of faults that are most likely to occur. This practice is known as "selecting a fault model." The following three fault models for RAMs are the most widely used ones (listed according to their degree of complexity): (1) Stuck-at Faults Model. One or more logic values in the memory system cannot be changed. For example, one or more cells are "stuck at" 1 or 0. Stuck-at faults are also useful for modeling faults in other parts of the memory system (see Figure 1). (2) Coupling Faults Model. There exist two or more cells that are coupled. A pair
of memory cells, i and j, are said to be coupled if a transition from x to y in one cell of the pair, say cell i, changes the state of the other cell, that is, cell j, from 0 to 1 or from 1 to 0. This, of course, does not necessarily imply that a similar transition in cell j will influence cell i in a similar manner. (3) Pattern-Sensitivity Faults (PSF) Model. The fault that alters the state of a memory cell as a result of certain patterns of zeros, ones, zero-to-one transitions, and/ or one-to-zero transitions in the other cells of the memory is called a PSF. Also, the fault that causes a read or a write operation of one cell of the memory to fail owing to certain patterns of zeros and ones in the other cells of the memory is called a PSF. General pattern-sensitivity tests are infeasible in practice [Hayes 1975]. However, some restricted, and realistic, models of PSFs allow for the generation of efficient test sequences [Hayes 1980; Suk and Reddy 1980]. It should be noted that coupling faults are a special case of PSF. There are many test procedures for RAMs. These procedures can be partitioned into three classes, according to the fault model that each procedure adopts. Procedures for testing stuck-at faults are simple and fast. Procedures for testing coupling faults are more complicated and usually require more time, but they detect all stuck-at faults, as well as all coupling Computing Survey6, Vol. 15, No. 3, September 1983
178
•
M.S.
Abadir and H. K. Reghbati
faults. Procedures for testing pattern-sensitivity faults are very complicated and they only consider some restricted case of PSFs. In the rest of the paper we describe different representatives from each class of procedure, their time complexity, and their degree of fault coverage. 2. TESTING FOR STUCK-AT FAULTS
This section describes different testing algorithms intended to detect stuck-at faults in read/write semiconductor RAMs. This kind of fault is independent of the memory state and can occur in any part of the memory system. Thus we have to consider stuck-at faults in the memory data register, memory address register, address decoder, the memory cells array, and the read/write logic. Note that fault detection is our main concern, not fault location, since the whole RAM is assumed to be on one chip. 2.1 The MSCAN Test Procedure
A trivial test procedure, called Memory Scan (MSCAN), is presented to illustrate some aspects of the complexity of testing a RAM [Breuer and Friedman 1976]. A word with address i is denoted W,. A write operation to W, is denoted as Write: W, (-- ~ where a is the all 0 word or the all 1 word. Similarly, a read operation of W, is denoted
memory array, in the memory data register, or in the read/write logic. Of course, such a test is of little value because at the end of this procedure we can only be sure that there is at least one word in the memory that is functionally correct. (A fault could have occurred in the decoder such that the same word is always referenced.) The complexity of the MSCAN test is 4n. 2.2 The ATS Procedure
Knaizuk and Hartmann developed a test algorithm called ATS (algorithmic test sequence) that detects any combination of stuck-at-0 or stuck-at-1 multiple faults in a RAM. In their first paper [Knaizuk and Hartmann 1977a] they introduced the ATS algorithm, showing that it detects any single stuck-at fault in a RAM. In their second paper [Knaizuk and Hartmann 1977b] they proved that the same algorithm detects all multiple stuck-at faults. A discussion of the ATS will be given in this section. Assume that the memory contains n words, each denoted as W, for 0 _ i _< n - 1, and that each word is m bits long. The memory words are grouped into three partitions, Go, G1, and G2, such that Go = {W, l i = 0 (modulo 3)}, G1 = {W,[i = 1 (modulo 3)}, G2 = {W, [ i = 2 (modulo 3)}.
as
The same notation for the read and write operations used in Section 2.1 is used here:
Read: W, (--a) where the value in parenthesis is the expected value of W,.
ATS Procedure
MSCAN Procedure For i - 0, 1, 2. . . . . n - 1 do: Write: W, .-- 0 Read: W, (=0) Write: Wi o- 1 Read: W, (=1). This test writes each word, first with a 0 and then with a 1, and verifies the writing with two read statements. The above test procedure will not detect any stuck-at fault in the memory address register or in the decoder. But on the assumption that those two are fault free, the test can detect any stuck-at fault in the Computing Surveys, Vol. 15, No. 3, September 1983
1. For all W, E G1 U G2 Write: W, *-- 0 "0 denotes the all 0 word" 2. For all W, ~ Go Write: W, (-- 1 "1 denotes the all 1 word"
3. For all W, E G1 Read: W, (=0) 4. For all W, E G1 Write: W, *- 1 5. For allW, E G2 Read: W, (=0) 6. For allWi E Go U G1 Read: W, (=1) 7. For all Wi ~ Go Write: W, (-- 0 Read: W, (=0)
Functional Testing of Semiconductor Random Access Memories
1
Go
2
3
4
5
Wrl
GI
:WrO
G2
Wr 0
6
R1
RO
Wrl
7
*
179
8
WrO, RO
R1
Wrl, R1
RO
Figure 2. ATS algorithm (Wr means write and R means read).
8. For all W, ~ G2 Write: W, * - 1 Read: W, (=1) END. The algorithm is shown in tabulated form in Figure 2. The RAM is assumed to consist of a memory address register {MAR), an address decoder, a memory data register (MDR), and a memory cells array. The decoder is assumed to be of "noncreative" design [Klayton 1971]; that is, a single fault within the decoder does not create a new memory address to be accessed 3 without also accessing the programmed address. Stuck-at faults occurring in other physical parts of the memory can be visualized as stuck-at faults in the above subsystems. Furthermore, stuck-at faults in the MAR are indistinguishable from those in the inputs of the decoder. Thus it is enough to only consider the three remaining subsystems {i.e., decoder, MDR, and memory cells array). To prove that the ATS detects all multiple stuck-at faults in the RAM, we first prove that it detects multiple faults in each subsystem, assuming that all other subsystems are fault free. Next, in Section 2.2.4, 3By accessing a memory address we mean reading from, or writmg to, the memorycells associated with that address If, owingto a fault, more than one word is accessedduring a read operation, the words will be ORed or ANDed together, depending on the RAM technology.
we show how ATS detects any combination of faults in the RAM.
2.2.1 DecoderFaults A noncreative decoder is an /-input, 2L
output device constructed of AND and NOT gates with no reconverging paths [Klayton 1971]. An example of a 4-input, 16-ouput decoder is given in Figure 3. We first consider the faults in the input of the decoder. Because we grouped the memory locations into three different partitions, no single stuck-at fault in the decoder input can map an address in one partition into another address in the same partition [Knaizuk and H a r t m a n n 1977a]. The following test sequence detects any fault in the input of the decoder [Ravi 1969]: 1. Write: W2 ( - 0 2. For every W, which is Hamming distance 1 apart from W2 Write: W, j , as shown in Figure 10a. (3) Eight cases are sufficient to demonstrate that Condition 3 is satisfied; these are shown in Figure 10b for i < j