
IEEE TRANSACTIONS ON COMPUTERS, VOL. 54, NO. 12, DECEMBER 2005

FITS: An Integrated ILP-Based Test Scheduling Environment

James Chin, Member, IEEE, and Mehrdad Nourani, Senior Member, IEEE

Abstract: We present a comprehensive and flexible test scheduling environment, called FITS, for testing core-based system-on-chips. Our environment prevents formation of hot spots during test. It also allows trade-off among test time, test access mechanism, power, and test controller/resource constraints. The basic strategy is to use power profile over application time and structural grids of nonembedded cores to find the best test schedule of their test pattern subsets while satisfying the constraints. As case studies, four integer linear programming formulations, corresponding to four power approximation models, are extensively analyzed. With proper setting of the weights and constraints, optimized results can be obtained quickly for each of the four power approximation models. Extensive experimental results are reported based on ISCAS '89 benchmarks and verify the efficiency and flexibility of the FITS environment.

Index Terms: Automatic test equipment, embedded core, grid, hot-spot, ILP formulation, power profile, system-on-chip, trade-off, test access mechanism, test schedule.

1 INTRODUCTION

1.1 Motivation

Semiconductor companies produce identical dies on a wafer. Each die can be packaged to a chip for sale, but needs to be tested first. Due to the fast advances in the semiconductor process technology, the number of transistors in a single die has dramatically increased in the past decade. Therefore, a system with several cores (big modules, also referred to as mega blocks) and a large number of transistors can be put together on a die to save cost. Merging several chips to a single chip also reduces the cost for a system. The reuse of cores has become a common practice which saves the development time for a new chip. SoC (System-on-Chip) was born under the above situation.

In spite of advantages, there are some drawbacks in SoC development. Let's elaborate on three such test difficulties.

First, the number of input/output (I/O) pins in a chip cannot keep pace with the growth of the number of cores inside a chip. To reduce the hardware production and test cost, it is preferred to reduce the number of pins of a chip. This, unfortunately, implies that the test access time will be longer due to the delay time and limited test paths from the primary input/output to the core under test. This is called the TAM (Test Access Mechanism) limitation. Examples of TAM limitations are core access pins or scan chains that deliver test data to/from cores. Therefore, the trade-off between the test time and TAM is unavoidable.

Second, the heat dissipation capability of a single die is limited by the package cost. Researchers have shown that heat dissipation during manufacturing testing of dies is quite crucial. This is due to the fact that test patterns can be highly uncorrelated and consume much higher power than in normal mode [1]. Therefore, the total heat generation (or power consumption) at a single time frame for a chip needs to be limited as well. There are several ways to reduce the power consumption, such as changing the test architecture [2], [3], manipulating the test vector [4], and optimizing the test schedule [5], [6]. The main strategy is to reduce the number of simultaneous switching of transistors (or gates). In our study, we concentrate on optimizing the test schedule with power, time, and other constraints.

Third, there are several ways to select the test hardware to facilitate accessing cores. Test bus with multiplexors and test rail with daisy chains are used for such a purpose [7]. A test schedule should be tuned properly based on available test resources. For example, ATE (Automatic Test Equipment) usually has limited off-chip resources. Similarly, in BIST (Built-In-Self-Test) applications, pattern generators and signature analyzers are considered on-chip resources whose availabilities are limited and should be considered in test planning. This is even more important for multisite testing, where the availability of test resources becomes the bottleneck.

The above difficulties motivated us to work on a test environment that can simultaneously deal with TAM, power, and test resource limitations during test. Such a flexible and comprehensive test schedule will obviously be more complicated and may require more time compared to those methods that consider only one or two metrics. On the other hand, it looks at multiple metrics globally and provides more flexibility for trade-off among various design metrics (e.g., power, time, area). We believe the overall time and efforts will still be less than multiphase ad hoc techniques that deal with the same design metrics one at a time.

. J. Chin is with Analog Devices Inc., 804 Woburn Street MS-613, Wilmington, MA 01887-3494. E-mail: [email protected].
. M. Nourani is with the Center for Integrated Circuits & Systems, The University of Texas at Dallas, Richardson, TX 75083-0688. E-mail: [email protected].

Manuscript received 28 June 2004; revised 25 Mar. 2005; accepted 14 July 2005; published online 14 Oct. 2005. For information on obtaining reprints of this article, please send e-mail to: [email protected], and reference IEEECS Log Number TC-0221-0604.

0018-9340/05/$20.00 © 2005 IEEE

Published by the IEEE Computer Society

1.2 Prior Work

There are three major methods to solve the test scheduling problem: 1) graph-based techniques, 2) bin-packing methods, and 3) ILP/MILP approaches. A survey of test scheduling methods is presented in [8]. We briefly review them here.

1.2.1 Graph-Based Techniques

The work in [9] presents a heuristic, based on the compatibility graph, to limit power dissipation during test. This graph algorithm uses the single peak power value model for cores under test. Two-value power profile and algorithms to reduce test time under power constraints are discussed in [10]. It employs the multiple peak power value model for a core under test. A power compatible algorithm combines the test compatible graph and the multiple power value model to generate the test schedule. A similar partitioning algorithm is used to facilitate insertion of the cores in the test scheduling process [11]. A graph-based dynamic partitioning algorithm is presented in [12] to deal with variable TAM width and solve the test scheduling problem. In this work, a single power value for a core test is used.

1.2.2 Bin-Packing Methods

A generic two-dimensional rectangle packing algorithm is discussed in [13]. A rectangle representation of core tests was presented in [14]. A more flexible bin-packing approach was shown in [15] for the test scheduling problem. A complete algorithm for test design with wrapper/TAM and power constraints was presented in [16]. The authors used an extended bin-packing algorithm which can deal with precedence, preemption, power, and concurrency constraints. The number of preemptions (the number of halts during testing a core) can be limited. The work presented in [16] did not consider the order in which cores are inserted by the bin-packing algorithm. This can be further explored as a heuristic to achieve the optimal solution. A 3D bin-packing algorithm is adopted to deal with TAM, time, and power constraints in test scheduling [17]. A heuristic algorithm is proposed to solve the problem based on the constraint-driven two-dimensional bin-packing algorithm [18]. These algorithms still use the single peak power value model.

1.2.3 ILP/MILP Approaches

An ILP (Integer Linear Programming) formulation is used in [19] to access cores in an SoC using the concept of bridges. In this work, the test time and test hardware cost can be minimized according to weight factors predefined by the user. A test scheduling method for core-based SoC using MILP formulation is presented in [20]. A method for determining preemptive and power-constrained schedules is discussed in [21]. This method uses an MILP (Mixed Integer Linear Programming) formulation which deals with fixed TAM and single peak power value model for a core. There are dedicated test resources (e.g., buses, pattern generators) for the internal (BIST) tests and the external (ATE) tests. Such resource limitations plus two TAM features, precedence and preemption, are considered in [21]. The precedence test imposes a partial order among the test sequences. A core with a preemptive test can be halted for a period of time and resumed later. This helps to lower the overall test time while satisfying the power and TAM constraints. Cores with nonpreemptive tests cannot be halted and their tests need to be entirely continuous. In [22], an MILP scheduling method is presented to do power-time trade-off based on multiple peak power values per core. Practically, in test mode, some areas may form hot-spots that can lower the chip's reliability and lifetime [1]. In [23], we presented an MILP scheduling method to do power-time trade-off based on power profiles of the cores over test pattern sets and area units, called grids. This paper showed how formation of hot-spots and test scheduling are related.

In addition to the above three categories, there are other heuristics to optimize test hardware, power, data compression, and test scheduling. In [24], an algorithm for the power-constrained scheduling problem is presented. To halt pattern transfer for reducing the power, a gated clock method is used in [2]. In [3], disabling the scan chain is programmed to reduce power. Similarly, in [4], minimizing the signal transitions in the scan chain can reduce power. Multiple scan chains are used for power minimization in [25]. The trade-offs in scan power and test data compression are studied in [26]. In [5], the authors show an approach to reduce SoC test data volume, scan power, and testing time. Larson et al. [6] minimized the test time and the TAM routing cost while considering test conflicts and power constraints. A test resource partitioning and optimization technique was presented in [27]. Test scheduling methods by using TAM utilization (e.g., parallelizing usage of resources) [28] or by employing efficient wrappers [29] have also been investigated. Zhao and Upadhyaya [30] used a modified shortest path algorithm to balance the resources and schedule the tests for cores.

1.3 Contribution and Paper Organization

The main contribution of our work is a comprehensive test scheduling methodology for core-based SoCs that allows the user to do a trade-off among power, TAM, and ATE resource constraints while avoiding formation of hot-spot areas during test. This methodology is integrated within a flexible ILP-based test scheduling (FITS) environment. As case studies, four ILP formulations are presented for four power approximation models (PAM). By setting the weight factors, we can control trade-off among time, power, TAM, and preemption times (the number of times that a core test can be halted). The number of preemptions (halts) is proportional to the configurable test hardware cost.

The strategy of using ILP for test scheduling is not new. However, there are a few techniques used in the FITS environment that distinguish our method. They include: 1) the partitioning rule based on power profile in space domain, 2) simultaneous optimization of time, TAM, power, and tester resources, and 3) a flexible way of incorporating various design/test choices and constraints within the formulation.

The rest of this paper is organized as follows: The simulation-based methods to obtain the power profile of cores over time and grids, the TAM models, and the ATE resource models are discussed in Section 2. The FITS environment is presented in Section 3. To show the flexibility of the FITS environment, four ILP formulations are examined in Section 4. We also explain the CAD tools and the sequence of employing them in that section. Experimental results are discussed in Section 5. Finally, concluding remarks are in Section 6.

2 MODELING POWER, TAM, AND TEST RESOURCES

This section introduces two power profiling techniques, four power approximation models (PAM), four TAM models, and two ATE resource models. Later, we will show how the proper mix of these models can be used for test scheduling using our FITS environment.

TABLE 1 Four Power Approximation Models (PAMs)

2.1 Modeling Power for Nonembedded Cores

There are at least two kinds of partitioning to profile the power consumption of a core and its subcircuits.

1. Partitions in Time Domain: Traditionally, a single peak power value is used to represent the power consumption of a core in a time window. When the test patterns are applied to a core, the switching activities are often used to estimate the power consumption. With the simulation frequency, supply power, and specification of the cell, the average and instantaneous power can be calculated. A power-time waveform can be reported by a simulation tool such as PrimePower of Synopsys [32]. We equally partition the time-intervals in the X-axis. Each time-interval is defined as a pattern set time step or, simply, time step for short. The peak power value for each time step can be represented on the Y-axis. Therefore, a core can have multiple power values (i.e., one per time step) during a test.

2. Partitions in Space Domain: A core structure can be partitioned into several subcircuits (or regions in layout when available). Here, we call them grids. A grid is a physical part of a core whose average or instantaneous power can be measured. A grid is a flexible notion of physical area (structure or layout) for which power profiling is possible. Therefore, a core can have several grids or just one grid. The size of a grid can be decided by a user based on information available. A grid can have a single or multiple peak power values. Similarly to time domain, using the power-time waveform for a grid, we can obtain multiple power values for a grid during a test.

We define four Power Approximation Models (PAMs) as shown in Table 1. Here, "S" and "M" stand for single and multiple peak power values, respectively. "C" means per Core and "G" means per Grid. Almost all researchers so far have used SC-PAM in the time domain only. To the best of our knowledge, only reference [10] employed MC-PAM using exactly two power values. One of our contributions here is to extend the test scheduling methodology to cover all four models that handle multiple power values across both time and space.

2.1.1 Power Profiling in Time Domain

Assume a test pattern set for Core $j$ has $n_j$ patterns. We equally partition the $n_j$ patterns into $m$ subsets, $S_{j,1}, \ldots, S_{j,k}, \ldots, S_{j,m}$, such that $\sum_{k=1}^{m} |S_{j,k}| = n_j$. Each subset has the same number of patterns, i.e., $|S_{j,k}| = |S| \;\; \forall k$. If the speeds of cores during test are different, we still use one parameter $|S|$, but it corresponds to a different number of test patterns for different cores. For example, suppose two cores run at different speeds and accessing them has no other restriction. The application time of $|S|$ patterns to a core running at frequency $f_1$ is, time-wise, equivalent to $\frac{f_2}{f_1}|S|$ patterns of the second core working at frequency $f_2$. Still, the basic test scheduling time step (time unit) will be equivalent to the time needed to apply $|S|$ test patterns. Throughout this paper, we refer to this time unit as pattern set time step. If $\frac{f_2}{f_1}$ is not an integer, there might be some idle cycles at the end of the last pattern set time step. This situation is similar to the drop of efficiency in a pipeline when the delay of stages is not well-balanced.

s386 is one of the sequential circuits in the ISCAS '89 benchmark set. Fig. 1 shows the instantaneous power consumption of the s386 circuit for $n = 48$ test patterns using the Synopsys toolset [32] and the Artisan-TSMC18 (0.18 µm feature size with $V_{DD} = 1.8$ Volt) library. The power profile, corresponding to subset size ($|S|$) of 16 (time partition of 16 sec), is shown. For this example, using the peak power value within each time partition, we need to integrate $\frac{n}{|S|} = 3$ numbers in the formulation, i.e., $\{746, 1277, 1040\}$ Watt.

Fig. 1. Power profiling over time for the s386 benchmark.
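The time-domain partitioning described above can be sketched in a few lines. This is only an illustration under stated assumptions: the helper names `peak_per_time_step` and `equivalent_patterns` are hypothetical, the trace is assumed to hold one power sample per test pattern, and the sample values are made up (they are not the s386 data).

```python
from typing import List


def peak_per_time_step(power_trace: List[float], subset_size: int) -> List[float]:
    """Split a per-pattern power trace into equal subsets and keep each subset's peak."""
    if subset_size <= 0:
        raise ValueError("subset_size must be positive")
    if len(power_trace) % subset_size != 0:
        raise ValueError("trace length must be a multiple of subset_size")
    return [
        max(power_trace[start:start + subset_size])
        for start in range(0, len(power_trace), subset_size)
    ]


def equivalent_patterns(subset_size: int, f1: float, f2: float) -> float:
    """Patterns a core at frequency f2 applies while a core at f1 applies subset_size patterns."""
    return subset_size * f2 / f1


# Hypothetical trace with n = 6 patterns and |S| = 2, giving n/|S| = 3 peak values.
trace = [3.0, 7.0, 5.0, 4.0, 9.0, 1.0]
peaks = peak_per_time_step(trace, 2)
```

With these inputs, `peaks` holds one value per pattern set time step, which is the form in which power constants enter the formulation for the multiple-value models.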

2.1.2 Power Profiling in Space Domain

A core can be divided into several grids (i.e., structural/physical units) by a user. The power-time waveform can be obtained for each grid. Some of the adjacent grids, which may even belong to different cores, can be grouped into an area of interest. One motivation of defining such areas is to avoid potential hot-spots, where peak power should be strictly limited. With the time domain partitioning method explained in Section 2.1.1, the peak power value in each partition set is used to represent the worst power consumption of that pattern set.

For example, suppose, for a two core SoC, Core 1 and Core 2 have three and two pattern sets, respectively. Recall that a grid is a structural partition whose power is a matter of concern. In the following, grids are considered in two situations: 1) Grid 1, which covers the entire core, and 2) Grid 2, which is a subcircuit of that core. Depending on the application and user's concern, any other similar definition of grids can be used. Using previous simulation tools, we find the power profile over grids. Fig. 2a and Fig. 2b summarize power data obtained for Core 1 and Core 2, respectively.

Fig. 2. Power profiling over grids and pattern sets. (a) Power for grids in Core_1 [µWatt]. (b) Power for grids in Core_2 [µWatt].

Ultimately, a user chooses an area of interest as a combination of grids to avoid formation of hot spots. Fig. 3 shows two typical selections (shown in dashed lines) when the power (or thermal) distribution across the layout is available. Fig. 3a and Fig. 3b correspond to selecting areas based on Grid 1 (entire cores) and Grid 2 (portion of cores), respectively.
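As a rough illustration of how an area of interest combines grid-level data, the sketch below adds up the peak power of every grid in an area for the pattern sets applied at one time step. Summing grid powers is an assumption made here for illustration, the function name `area_power` is hypothetical, and the numbers are invented rather than taken from Fig. 2.

```python
from typing import Dict, List, Tuple

Grid = Tuple[int, int]  # (core j, grid h)


def area_power(
    area_grids: List[Grid],
    grid_peak: Dict[Tuple[int, int, int], float],
    active: Dict[int, int],
) -> float:
    """Sum grid peak powers in one area for the pattern sets currently applied.

    area_grids: grids (j, h) that belong to the area of interest.
    grid_peak:  (j, h, k) -> peak power of Grid h of Core j under pattern set k.
    active:     core j -> pattern set k applied at this time step (absent = idle).
    """
    total = 0.0
    for core, grid in area_grids:
        subset = active.get(core)
        if subset is not None:  # idle cores contribute nothing in this sketch
            total += grid_peak[(core, grid, subset)]
    return total


# Hypothetical data: an area made of Grid 2 of Core 1 and Grid 2 of Core 2.
grid_peak = {(1, 2, 1): 30.0, (1, 2, 2): 40.0, (2, 2, 1): 25.0}
area = [(1, 2), (2, 2)]
both_active = area_power(area, grid_peak, {1: 2, 2: 1})
core_2_idle = area_power(area, grid_peak, {1: 2})
```

Comparing such a value against a per-area limit is one way to flag a potential hot-spot for a candidate schedule.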

Fig. 3. An example of area selection. (a) Based on Grid_1. (b) Based on Grid_2.

2.2 Modeling TAM

In addition to test hardware, the number of I/O pins or interconnects to access cores inside an SoC is also limited. In general, users can select a test bus which uses hardware multiplexors or a test rail which uses a software-based polling protocol. In either case, TAM (e.g., number of I/O pins, scan chains) in an SoC is limited. Note that TAM can be shared by cores and the overall TAM limit depends on the access limitations of all cores embedded inside an SoC. In our formulation, TAM is a flexible concept that represents the mechanism to access cores. Examples of TAM are a limited number of I/O pins, bit-width of test buses, number of scan chains, etc.

2.2.1 Constant versus Variable TAM for Each Core

If a core has constant TAM bit-width during test, then it fits our definition of a constant TAM model. Both the SC-PAM and SG-PAM models can only use the constant TAM in the FITS environment. It has moderate cost and can shorten the test time. Fig. 4a shows the concept of constant TAM. Note that the TAM bit-width of each core is fixed during test.

In a multiple scan chain design, a short chain will finish test earlier than a long chain. If the wrapper and TAM control is flexible, it can redirect the idle resources (e.g., pins, chains, etc.) to be used for testing another core. This utilization can be extended for scan-in, scan-out, and other signal deliveries during test. However, it requires changing the TAM bit-width for a core test on-the-fly. To the best of our knowledge, existing CAD tools do not support this feature at present and, thus, a lot of don't-care bits or patterns fill the short chain test for synchronization purposes.

If the TAM bit-width can be changed during a core test, then such a core fits the variable TAM model. Here, each core can have multiple TAM bit-widths during the test and the TAM bit-width changes dynamically. This may achieve the shortest test time. This model applies to the MC-PAM and MG-PAM models only. Fig. 4b shows a variable TAM scenario. If there are multiple TAM requirements (e.g., scan chains with different lengths) inside a core, usually the longest test will be considered the test time of the core. However, some TAM of the shorter tests can be switched earlier to other cores after their tests are completed. In this way, the overall test time may be shortened.

Fig. 4. Four TAM models. (a) Constant TAM. (b) Variable TAM. (c) Distributed TAM.

2.2.2 Distributed versus Nondistributed TAM

Nondistributed TAM means all TAMs can be shared by all cores at different time steps. The total number of TAM bits (constant or variable) will be split into $Q$ groups in the distributed TAM model. Each group of TAM will be used by the cores which have the same width of TAM. There is no connection between any two of the groups. Within each group, either the constant or variable (for MC-PAM and MG-PAM only) TAM model can be used. This hardware design has no flexibility and will take the longest test time, but, in general, can achieve the smallest test hardware cost. However, for each group, the number of TAM bits used at a single time step should be limited. Fig. 4c shows the distributed TAM with the two groups separated by a thick horizontal line.

2.3 Modeling ATE Resources

Test resources can be divided into on-chip test resources and off-chip ones. Typical on-chip test resources are BIST components such as pattern generators and signature analyzers. Off-chip test resources are assumed to exist inside an ATE. The selection of the ATE resources is controlled by some relays in ATE or hard wires on the test board, but still relay-controlled by ATE. Test software can control the relays and other ATE resources. In the FITS environment, we can plan for any complex control of ATE resources through careful test scheduling.

2.3.1 Constant ATE Resources

If the test resources are shared by several cores and need to be used for the entire test time of a core, then the constant ATE resource model is used. Table 2 shows an example of the flexible test resource assignments. Only Core 1 and Core 2 can be tested at the same time. Core 2 and Core 3 cannot be tested at the same time due to an insufficient number of resource Res_2. Core 1 and Core 3 cannot be tested at the same time due to resource Res_3 conflict.

TABLE 2 Constant ATE Resource Assignments

2.3.2 Variable ATE Resources

If the ATE resources are shared by several cores and need to be used for only some test time of a core, then the variable ATE resource model is used. Here, the overall test time is expected to be less than the constant ATE resource model. The control of test resource selection is mostly done by software, which makes this model flexible and efficient. This model is used for MC-PAM and MG-PAM only. Table 3 shows an example of variable ATE resources assignments. For example, as a requirement for test resource Res_3, Core 1 needs two units at time step $k_1$, three units at time step $k_2$, and one unit at time step $k_3$.
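The pairwise reasoning used for the constant ATE resource model of Section 2.3.1 can be sketched as a simple capacity check. The resource counts below are hypothetical values chosen only so that they reproduce the three outcomes stated in the text (they are not the entries of Table 2), and `can_test_together` is an illustrative helper name.

```python
from itertools import combinations
from typing import Dict, Iterable

# Hypothetical availability and per-core requirements (not the Table 2 data).
available: Dict[str, int] = {"Res_1": 2, "Res_2": 2, "Res_3": 1}
need: Dict[str, Dict[str, int]] = {
    "Core_1": {"Res_1": 1, "Res_3": 1},
    "Core_2": {"Res_1": 1, "Res_2": 1},
    "Core_3": {"Res_2": 2, "Res_3": 1},
}


def can_test_together(cores: Iterable[str]) -> bool:
    """True if the summed requirement of the cores fits every resource's capacity."""
    cores = list(cores)
    for resource, capacity in available.items():
        used = sum(need[core].get(resource, 0) for core in cores)
        if used > capacity:
            return False
    return True


# Enumerate all core pairs and keep those that can run concurrently.
compatible_pairs = [
    pair for pair in combinations(sorted(need), 2) if can_test_together(pair)
]
```

Under these assumed numbers, Core_2 and Core_3 fail on Res_2 and Core_1 and Core_3 fail on Res_3, leaving only the Core_1 and Core_2 pair.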

3 INTEGRATED ILP TEST SCHEDULING ENVIRONMENT

In this section, we discuss our flexible ILP-based test scheduling (FITS) environment which takes power, TAM, and test resource factors into account. Based on the choice of PAM (Power Approximation Model), TAM model, and ATE test resource model, a set of constants, variables, and constraints will be selected to form a formulation which is suitable for the user's specific problem. Specifically, the constants need to be supplied by users so that an executable ILP formulation can be generated by FITS. These ILP formulations guarantee the optimization of power, TAM, and test resources and are capable of taking various constraints into account. The price, however, is expected to be a reasonably long optimization time when the SoC includes a large number of cores or simultaneous optimization of too many factors is sought. We will elaborate on this in Section 5.
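The model choices that drive a formulation can be captured in a small data structure. The sketch below is a hypothetical encoding, not part of FITS: it assumes the restrictions stated in Section 2 (variable TAM and variable ATE resources are available for MC-PAM and MG-PAM only, and area constants apply to SG-PAM and MG-PAM) and leaves out the distributed versus nondistributed TAM choice.

```python
from dataclasses import dataclass
from itertools import product
from typing import List


@dataclass(frozen=True)
class ModelMix:
    pam: str  # "SC", "SG", "MC", or "MG"
    tam: str  # "constant" or "variable"
    ate: str  # "constant" or "variable"


def is_valid(mix: ModelMix) -> bool:
    """Variable TAM and variable ATE require multiple power values (MC or MG)."""
    multiple_values = mix.pam in ("MC", "MG")
    return multiple_values or (mix.tam == "constant" and mix.ate == "constant")


def needs_area_constants(mix: ModelMix) -> bool:
    """Grid-based models (SG, MG) need the area and grid constants."""
    return mix.pam in ("SG", "MG")


# Enumerate every combination and keep the ones allowed by the rules above.
all_mixes: List[ModelMix] = [
    ModelMix(pam, tam, ate)
    for pam, tam, ate in product(
        ("SC", "SG", "MC", "MG"), ("constant", "variable"), ("constant", "variable")
    )
]
valid_mixes = [mix for mix in all_mixes if is_valid(mix)]
```

A front end could use such a check to reject an inconsistent selection before any constants are collected.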

3.1 Constants

. Time Mandatory Part:

- $W_t$: Optimization weight for total test scheduling time
- $W_h$: Optimization weight for total number of halts
- $T^{MAX}$: Upper bound for the test set time steps required
- $N^{CORE}$: Total number of cores
- $N_j^{SET}$: Total number of test pattern subsets for Core $j$

. Area Optional Part:

- $N^{AREA}$: Total number of areas (SG-PAM and MG-PAM)
- $N_j^{GRID}$: Total number of grids in Core $j$ (SG-PAM and MG-PAM)
- $A_{i,j,h}$: Binary constant showing if Area $i$ contains Grid $h$ of Core $j$ (SG-PAM and MG-PAM)

. Power Mandatory Part:

- $W_p$: Optimization weight for total power
- $P^{PEAK}$: Maximum peak power allowed for SoC
- $P_i^{AREA}$: Maximum peak power allowed for Area $i$ (SG-PAM and MG-PAM)
- $P_j$: Peak power dissipation when subset $S_j$ tests Core $j$ (SC-PAM)
- $P_{j,h}$: Peak power dissipation at Grid $h$ of Core $j$ when tested with subset $S_j$ (SG-PAM)
- $P_{j,k}$: Peak power dissipation when subset $S_{j,k}$ tests Core $j$ (MC-PAM)
- $P_{j,h,k}$: Peak power dissipation at Grid $h$ of Core $j$ when tested with subset $S_{j,k}$ (MG-PAM)

TABLE 3 Variable ATE Resource Assignments


. TAM Resource Mandatory Part:

- $W_a$: Optimization weight for total TAM
- $N^{TAM}$: Total number of TAM bits available to test SoC
- $N_j^{TAM}$: Total number of TAM bits required to test Core $j$ when subset $S_j$ is applied
- $N_{j,k}^{TAM}$: Total number of TAM bits required to test Core $j$ when subset $S_{j,k}$ is applied (MC-PAM and MG-PAM)

. ATE Resource Optional Part:

- $R_r^{RES}$: Total number of available ATE resource $r$
- $R_{r,j}$: Total number of ATE resource $r$ required to test Core $j$ when subset $S_j$ is applied
- $R_{r,j,k}$: Total number of ATE resource $r$ required to test Core $j$ when subset $S_{j,k}$ is applied (MC-PAM and MG-PAM)

. Auxiliary Constants/Sets: To simplify writing formulas, we define an auxiliary constant and six sets as follows:

- $T^{MAX}$, the upper bound of test time, can be directly provided by users. It can also be estimated by the following formula:

$$T^{MAX} = \sum_{j=1}^{N^{CORE}} R_j \cdot N_j^{SET}.$$

For SC-PAM and MC-PAM, $R_j = 1$. For SG-PAM and MG-PAM, $R_j$ can be obtained from the $A_{i,j,h}$ constants:

$$R_j = \begin{cases} 1 & \text{if } \sum_{i=1}^{N^{AREA}} \sum_{h=1}^{N_j^{GRID}} A_{i,j,h} > 0 \\ 0 & \text{otherwise.} \end{cases}$$

If $T^{MAX}$ is too large, we can trade off accuracy with faster ILP search time by considering a larger size of test pattern subsets. $T^{MAX}$ can be entered manually by a user. In such a case, it should be a number smaller than the value of the above formula. Therefore, the number of constraints and variables will be small and a solution can be found quickly. A word of caution is that, if the number entered is too small, then there may be no feasible solution. Then, the user needs to enter a larger $T^{MAX}$ and rerun the formulation.

- Six subsets of cores are also defined based on the user's choice of their behaviors in the test mode. The rationale for defining these will be clarified later in Sections 3.3 and 3.4.

  $S_{Cores}$: Set of all cores in SoC, i.e., $S_{Cores} = \{\text{Core } 1, \text{Core } 2, \ldots, \text{Core } N^{CORE}\}$
  $S_p$: Set of preemptive cores whose test can be halted
  $S_{np}$: Set of nonpreemptive cores, i.e., $S_{Cores} - S_p$
  $S_c$: Set of concurrent core pairs whose test cannot overlap in any cycle
  $S_{nc}$: Set of nonconcurrent core pairs, i.e., $S_{Cores} - S_c$
  $S_d$: Set of precedent core pairs whose test must be in sequence.

3.2 Variables

. Time Mandatory Part:

- Integer variable $total\_time$ denotes the overall test scheduling time in a pattern set time step unit.

- We define the binary variable $c_{j,l}$. Briefly, $c_{j,l} = 1$ means that Core $j$ is under test at time $l$.

- To be able to set a core precedence and nonpreemption constraints on Core $j$ and also trace on scheduling time, we need to know when its testing begins ($b_j$) and when it ends ($e_j$). In other words, $b_j$ and $e_j$ are integer variables that denote the time step index at which Core $j$ receives the first and last test pattern sets, respectively.

- For MC-PAM and MG-PAM, we need an additional set of variables as defined below:

$$t_{j,k,l} = \begin{cases} 1 & \text{if Core } j \text{ is scheduled to receive its test data subset } S_{j,k} \text{ at time step } l \\ 0 & \text{otherwise.} \end{cases}$$

Variables $c_{j,l}$ and $t_{j,k,l}$ have the following relationship:

$$c_{j,l} = \sum_{k=1}^{N_j^{SET}} t_{j,k,l}.$$

.

.

SC-PAM and SG-PAM models require only cj;l . Power Mandatory Part: Integer variable pAREA i denotes the power consumption at Area i. For SCPAM and MC-PAM, there is only one area (the entire SoC). Therefore, the total power consumption . All power constants is pSoC , which is equal to pAREA 1 need to have the same unit and be only integers. Real power units (e.g., expressed in W att units) need to be normalized for power constants. All integer power constants can keep power variables as integers. If some power constants are real numbers, then the power variables will be real. In that case, the generated formulation will become MILP instead of ILP. In general, we prefer to keep the formulation ILP due to its faster convergence to solution. TAM Mandatory Part: Integer variable nT AM denotes the total number of TAM used. In general, the unit for TAM can be different depending on the applications. For example, TAM can indicate pin (when variable ATE pins are considered), bus bitwidth (when test buses are used) or scan chain (when

1604


-

-

Objective Function The objective function is to minimize a linear function of weighted test time, TAM width, power consumption, and preemption times: ! X T AM AREA Þ þ Wp pi ðWt total timeÞ þ ðWa n 0

X

þ @Wh

ð1Þ

All test pattern subsets for each core must be applied during the schedule: MAX TX

cj;l ¼ NjSET

8j:

ð2Þ

l¼1

-

i

1

DECEMBER 2005

cj;l 1 8j; l:

the SoC’s scan chains are decided). The TAM unit needs to be the same for all the cores.

3.3

VOL. 54, NO. 12,

$$\cdots \Big( \sum_{\text{Core } j \in S_p} (e_j - b_j) \Big).$$

For SC-PAM and MC-PAM, power consumption is represented by $W_p \, p^{SoC}$. Preemption (or preemptive) test means the test can be halted and resumed later. It makes the test time longer for each core, but can often achieve a shorter time for the entire SoC. Nonpreemption (or nonpreemptive) test means the test cannot be halted; therefore, the test time becomes shorter for each core, but longer for the entire SoC [16]. The number of preemptions (also called preemption times or halt times) is usually limited by the test hardware cost [16]. By minimizing the distance between $b_j$ and $e_j$, we also minimize the halt times (number of preemptions). The test hardware cost can be estimated as follows:

$$C \sum_{j=1}^{N^{CORE}} N_j^{TAM} \left( 1 + N_j^{Preemption} \right),$$

where $C$ is a cost constant and $N_j^{Preemption}$ is the halt times (number of preemptions) for Core $j$. In order to reduce the test hardware cost, the $N_j^{Preemption}$ parameter, which is related to configurable hardware cost, needs to be reduced. Therefore, $W_h$ (the weight of the test hardware cost) needs to be larger than all the other weights. The weights selected for $W_t$, $W_a$, $W_p$, and $W_h$ express how important, from the user's point of view, these factors are in the optimization process. Both $W_t$ and $W_h$ need to be greater than 0, but the others can optionally be zero. By changing these weights, we can explore different options and see the trade-off among various parameters. We will elaborate on weight selection in Section 5.6.

3.4 Constraints

We use five indices ($i$, $j$, $k$, $h$, and $l$) to define the variables. Their ranges are: $1 \le i \le N^{AREA}$, $1 \le j \le N^{CORE}$, $1 \le k \le N_j^{SET}$, $1 \le h \le N_j^{GRID}$ (for Core $j$), and $1 \le l \le T^{MAX}$. The constraints will be as follows:

Time Mandatory Part:

- Variables $c_{j,l}$ indicate that Core $j$ is tested at time step $l$.

- Each scheduling time step should be within the given bound:

$$\mathit{total\_time} \le T^{MAX}. \qquad (3)$$

- The assigned time step (e.g., $l \cdot c_{j,l}$) to test each core should stay within the overall scheduling time:

$$l \cdot c_{j,l} \le \mathit{total\_time} \quad \forall j, l. \qquad (4)$$

- For MC-PAM and MG-PAM, variables $t_{j,k,l}$ should be 0 or 1:

$$t_{j,k,l} \le 1 \quad \forall j, k, l. \qquad (5)$$

- For MC-PAM and MG-PAM, any test pattern subset must be scheduled only once:

$$\sum_{l=1}^{T^{MAX}} t_{j,k,l} = 1 \quad \forall j, k. \qquad (6)$$

- For MC-PAM and MG-PAM, $c_{j,l}$ and $t_{j,k,l}$ are related:

$$c_{j,l} = \sum_{k=1}^{N_j^{SET}} t_{j,k,l} \quad \forall j, l. \qquad (7)$$

- Variables $e_j$ and $b_j$ are bounded:

$$l \cdot c_{j,l} \le e_j \le \mathit{total\_time} \quad \forall j, l, \qquad (8)$$

$$1 \le b_j \le T^{MAX} - (T^{MAX} - l) \cdot c_{j,l} \quad \forall j, l. \qquad (9)$$

We also need to make sure that variables $b_j$ ($e_j$) get the maximum (minimum) of the scheduled time steps. To do this, we add this term in the objective function that independently maximizes (minimizes) these variables:

$$\sum_{j=1}^{N^{CORE}} (e_j - b_j).$$

If there is a nonpreemptive constraint for a core, then we do not need the above term for that core because $b_j$ ($e_j$) is already the maximum (minimum) by the nonpreemptive constraint.
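The effect of the bounds on $e_j$ and $b_j$ can be checked on a single core with a small Python sketch. It assumes a schedule given as a list of 0/1 flags (the values of $c_{j,l}$ for one core); the function names `begin_end` and `count_halts` are illustrative and not part of FITS. The sketch picks the smallest $e_j$ and the largest $b_j$ that the bounds permit, which is what minimizing $e_j - b_j$ drives the solver toward, and then counts the halts inside that window.

```python
def begin_end(schedule):
    """Tightest (b_j, e_j) for one core.

    schedule[l-1] is c_{j,l}: 1 if the core is tested at time step l, else 0.
    """
    if not any(schedule):
        raise ValueError("core is never scheduled")
    t_max = len(schedule)
    steps = list(enumerate(schedule, start=1))
    # e_j >= l * c_{j,l} for every l, so the smallest feasible e_j is the maximum.
    e = max(l * c for l, c in steps)
    # b_j <= T_MAX - (T_MAX - l) * c_{j,l} for every l, so the largest feasible
    # b_j is the minimum of the right-hand sides.
    b = min(t_max - (t_max - l) * c for l, c in steps)
    return b, e


def count_halts(schedule):
    """Number of times the test is halted between its first and last step."""
    b, e = begin_end(schedule)
    window = schedule[b - 1:e]
    # A halt starts wherever a tested step is followed by an idle step.
    return sum(1 for cur, nxt in zip(window, window[1:]) if cur == 1 and nxt == 0)


preemptive = [0, 1, 1, 0, 1, 0]      # tested at steps 2, 3, and 5
nonpreemptive = [0, 1, 1, 1, 0, 0]   # tested at steps 2, 3, and 4
print(begin_end(preemptive), count_halts(preemptive))
print(begin_end(nonpreemptive), count_halts(nonpreemptive))
```

With the same number of tested steps, the nonpreemptive schedule has the smaller window $e_j - b_j$ and no halts, which illustrates why shrinking the window also reduces the number of preemptions.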


.

GRID CORE Nj NX X

Time Optional Part: -

8Core p 2 Snp :

8Core p; Core q 2 Sd :

-

ð10Þ

ð12Þ

l tj;k;l
> > Pq¼1 > Q > nTq AM ¼ nT AM N T AM > > > Pq¼1 CORE > > < Nq CORE NjT AM cj;l nTq AM j¼N þ1 q1

> NqT AM > > > > > or PNqCORE > N T AM tj;k;l nTq AM CORE > j¼Nq1 þ1 j;k > > : NqT AM

8q; l 8q; l: ð20Þ

ATE Resource Optional Part:

- For the constant ATE model of each core, if there are limits on the ATE resources, they can also be incorporated within the constraints for shared resources only. Dedicated resources do not need this constraint. The total number of resource $r$ used in a single time step $l$ should be limited:

$$\sum_{j=1}^{N^{CORE}} R_{r,j} \, c_{j,l} \le R_r^{RES} \quad \forall r, l. \qquad (21)$$

- For the variable ATE model of each core, MC-PAM or MG-PAM should be used. The ATE resources can be shared for different pattern sets of cores. As an example, this can be done by switching the relays in ATE with the software control for off-chip cases. The total number of resource $r$ used in a single time step $l$ should be limited:

$$\sum_{j=1}^{N^{CORE}} \sum_{k=1}^{N_j^{SET}} R_{r,j,k} \, t_{j,k,l} \le R_r^{RES} \quad \forall r, l. \qquad (22)$$

4 CASE STUDIES AND EXPERIMENTAL PLATFORM

As we put many optimization metrics and design/test constraints into the formulations, the size (and, thus, the running time) naturally becomes large. However, we believe the overall time to find a solution is still much less than taking care of these metrics one by one in an ad hoc fashion. Moreover, in practical cases, depending on the user’s interest and/or available design/test budget, only a subset of these metrics and constraints is used. This makes the formulation size much smaller and the problem in the majority of cases converges to an optimum solution quite rapidly. To show the flexibility of our FITS environment, in this section, we present four ILP formulations based on various choices among four PAMs, two TAM models, and two ATE resource models. The experimental results of these four formulations and their running time will be discussed in Section 5.

4.1 Four ILP Formulations

These four ILP formulations deal with four PAM models outlined earlier in Table 1. They all use the same objective function, as discussed in Subsection 3.3. However, a subset of constants, variables, and constraints will be selected for each formulation. The definitions of constants, variables, objective, and constraints were explained in Section 3. The Appendix, which can be found on the Computer Society Digital Library at http://computer.org/tc/archives.htm, summarizes these four formulations and their key specifications in Table 4 and Figs. 5 and 6.

Fig. 7. Our simulation platform.

4.2 CAD Tools Used

The relationships among CAD tools used are shown in the flowchart of Fig. 7.

4.2.1 Synopsys Toolset

There are several tools in Synopsys [32] used in our experimentation.

- Design_analyzer optimizes and synthesizes the gate-level HDL (Hardware Description Language) description (i.e., VHDL and Verilog) of cores and SoC using the Artisan-TSMC library HDL files. Using this tool, multiple scan chains can automatically be inserted into the sequential cores.

- TetraMax is used as the fault simulator and ATPG (automatic test pattern generator).

- Vhdlan analyzes both the gate-level HDL file from design_analyzer and the test bench HDL file from TetraMax.

- Scirocco takes gate-level and test bench HDL files as input, simulates the circuit, and provides the switching information of nodes at the gate level.

- PrimePower takes the gate-level HDL file from design_analyzer, measures the switching activity of all the standard cells, and generates the power-time waveform and also the average power for all cores and subcircuits inside them. Although the total power covers both dynamic and leakage power, in the 0.18 µm technology library that we used, the dynamic power is dominant.

- JupiterXT performs placement, floorplanning, and routing optimization, and generates the layout based on the technology file (Mosis standard cell library in our work [31]).

- CosmosScope can plot all waveforms, including the instantaneous power-time waveforms of cores and their subcircuits.

TABLE 5 Specifications of Four ISCAS ’89 Benchmarks Used as Cores in Experimentation

Fig. 8. Instantaneous power for four ISCAS ’89 benchmarks.

4.2.2 Artisan-TSMC18 Library

This library from Mosis [31] is used as the main database for design_analyzer, TetraMax, PrimePower, and JupiterXT.

4.2.3 CPLEX

We used the CPLEX package from ILOG S. A., Inc. [33] to solve the ILP/MILP formulations. We wrote C programs to translate the formulations from input data files, in our textual format, to the lp format required by CPLEX. CPLEX allows real, integer, or mixed real-integer linear optimizations, along with some memory control commands for efficiency purposes. If the input power constants are all integers, then the solution will be integer and the formulation becomes integer linear programming (ILP). If any one of the power constants is real, then the solution will be real and the formulation becomes mixed real and integer linear programming (MILP).
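To give a feel for the translation step, the following Python sketch emits a toy model in the CPLEX LP text format. It is not the authors' C translator: the function `build_lp`, the variable naming scheme (`c_j_l`, `t_j_k_l`, `total_time`), and the constraint labels are hypothetical, and the model keeps only the time bound, the per-step bound $l \cdot c_{j,l} \le \mathit{total\_time}$, the "scheduled only once" rule, and the link between $c_{j,l}$ and $t_{j,k,l}$, with the time weight as the only objective term.

```python
def build_lp(n_sets, t_max, w_t=10**4):
    """Return LP-format text for a toy schedule model.

    n_sets[j-1] is the number of test pattern subsets of core j (hypothetical input).
    """
    steps = range(1, t_max + 1)
    lines = ["Minimize", f" obj: {w_t} total_time", "Subject To"]
    lines.append(f" bound: total_time <= {t_max}")
    binaries = []
    for j in range(1, len(n_sets) + 1):
        sets = range(1, n_sets[j - 1] + 1)
        for l in steps:
            binaries.append(f"c_{j}_{l}")
            # l * c_{j,l} <= total_time
            lines.append(f" step_{j}_{l}: {l} c_{j}_{l} - total_time <= 0")
            # c_{j,l} = sum over k of t_{j,k,l}
            terms = " - ".join(f"t_{j}_{k}_{l}" for k in sets)
            lines.append(f" link_{j}_{l}: c_{j}_{l} - {terms} = 0")
        for k in sets:
            binaries.extend(f"t_{j}_{k}_{l}" for l in steps)
            # each test pattern subset is scheduled exactly once
            once = " + ".join(f"t_{j}_{k}_{l}" for l in steps)
            lines.append(f" once_{j}_{k}: {once} = 1")
    lines += ["General", " total_time", "Binary"]
    lines += [" " + name for name in binaries]
    lines.append("End")
    return "\n".join(lines)


# Two cores: core 1 has two pattern subsets, core 2 has one; three time steps.
print(build_lp([2, 1], t_max=3))
```

The printed text can be saved to a `.lp` file and read by an LP-format solver. Declaring every variable as binary or general keeps the toy model a pure ILP; real-valued power variables would make it an MILP, as described above.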

5 EXPERIMENTAL RESULTS

5.1 Simulation Environment and Setup

The time and space-domain power profiles for each core are critically needed in our methodology. Unfortunately, details and gate-level descriptions of internal cores are not available for ITC ’02 benchmarks [34]. Therefore, we chose ISCAS ’89 benchmarks, whose gate-level HDLs are widely available, as cores. We then obtained power profiles using our experimental platform, explained in Section 4. Note that both combinational and sequential cores can be handled in the FITS environment as long as the test wrappers around combinational cores and flip-flops in sequential circuits can be combined through single or multiple scan chains.

We used four ISCAS ’89 benchmarks as cores, running at 200 MHz (clock cycle of 5 nanoseconds). Although this system is behaviorally fictitious, using these well-known benchmarks with known timing, area, and power characteristics makes it possible to compare results and repeat the experiments. To be able to show the flexibility provided in FITS, we assume no bottleneck exists on the core access mechanism and we can access multiple cores at the same time. We also assume the tester memory is large enough so that all the feasible results are acceptable. The key features of these four cores are summarized in Table 5.

These four cores and the complete system were described in VHDL. Then, synthesis, test pattern generation, simulation, power calculation, and profiling were performed using the design_analyzer, TetraMax, Scirocco, and PrimePower tools in Synopsys [32], respectively, using the Artisan-TSMC18 (0.18 µm feature size with $V_{dd} = 1.8$ V) library. The instantaneous power values for these cores, which become the basis of power profiling over time, are shown in Fig. 8, captured using CosmosScope in the Synopsys [32] environment. Finally, four ILP formulations were solved with different weights in the objective function.

The spatial layout information was obtained from JupiterXT in Synopsys [32]. The floorplan and placement were done by “Design Planning” of JupiterXT. The placement of core s1488 is shown in Fig. 9a. This core includes 14 flip-flops (big rectangles in the placement figure). The layout of these flip-flops (tagged as SDFFXL) is shown in Fig. 9b. For SG-PAM and MG-PAM, we assumed each core has multiple grids and two choices of areas are analyzed. The first area is the whole SoC and the second one is a possible hot-spot formed by one grid from each core. In the latter case, each selected grid covers a flip-flop and its corresponding interconnects. We chose these flip-flops as the power simulation indicated high switching activity and, thus, potential for hot-spot formation. The instantaneous power values of such flip-flops inside each core can be obtained during the power profiling step (see Section 4 for our CAD platform). Fig. 9c shows the instantaneous power values of DFF_2 (of type SDFFXL) inside core s1488. As for implementing an SoC, four ISCAS ’89 benchmarks, s386, s298, s1238, and s1488, are chosen as cores.

Fig. 9. Analyzing core s1488. (a) Core s1488 placement. (b) Layout of SDFFXL (flip-flop of interest). (c) Instantaneous power of SDFFXL.

Fig. 10. Floorplan of an SoC (called s_t4c) that combines s386, s298, s1238, and s1488 cores.


TABLE 6 Floorplan Data of Four ISCAS ’89 Benchmarks

There are flip-flops at the corners of these four cores. The top-level system is called s_t4c, whose snapshot of the floorplan in the JupiterXT interface is shown in Fig. 10. The width, height, physical area, and placement area of the four embedded cores are shown in Table 6. For SG-PAM and MG-PAM, we assumed each core has multiple grids and two choices of areas are analyzed. The first area is the whole SoC and the second one is a possible hot-spot formed by one grid from each core, almost in the middle of the floorplan. In the latter case, each selected grid covers a flip-flop and its corresponding interconnects. We chose these flip-flops as the power simulation indicated high switching activity and, thus, potential for hot-spot formation. To picture their relative sizes, the floorplan of our design and the area of interest are redrawn in Fig. 11. Note that Area_1 is the entire system and Area_2 is the area of interest (a potential local hot-spot equivalent to almost 2 percent of the design). Area_2 covers two flip-flops from s1238 and one flip-flop each from s1488, s386, and s298. We intentionally defined these two cases to show the FITS capability in controlling power globally and locally across an SoC.

Note that, in the FITS environment, any grid can be independently included in an area of interest. This makes selecting such areas fully under the user’s control. This selection would be more meaningful if power/heat estimation tools are used to identify potential hot-spots. Our formulation finds a test schedule such that the grid power stays within the limit. Our example is intended to show the flexibility of the formulation in area selection.

Various weights $(W_t, W_a, W_p, W_h)$ are listed in Tables 7, 8, and 9. The total time is expressed in pattern set time steps. Column TAM Width shows the maximum number of pins used. The $p_i^{AREA}$ column shows the maximum power of Area $i$ in µWatts. # of Halts shows the overall number of halts (in preemption cases) in the cores’ test schedules. The Run Time column represents the running time (in seconds) that CPLEX takes to converge to a solution.
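The idea of keeping the power of a user-selected area within a limit can be illustrated with a short Python sketch. It rests on assumptions that are not taken from the formulation itself: the power of an area at a time step is modeled as the sum of the peak powers of its grids for whichever test pattern subset is running on each core at that step, and all names and numbers below (`area_power`, `within_limit`, the grid labels, the power values) are made up for illustration.

```python
def area_power(area, grid_power, schedule):
    """Power of one area at each time step.

    area:       set of (core, grid) pairs that form the area of interest
    grid_power: grid_power[(core, grid)][k] = power of that grid while
                pattern subset k of the core is applied
    schedule:   schedule[core][l] = index k of the subset applied at step l,
                or None if the core is idle at that step
    """
    n_steps = len(next(iter(schedule.values())))
    profile = []
    for l in range(n_steps):
        total = 0
        for core, grid in area:
            k = schedule[core][l]
            if k is not None:
                total += grid_power[(core, grid)][k]
        profile.append(total)
    return profile


def within_limit(profile, limit):
    """True if the area power never exceeds the given limit."""
    return max(profile) <= limit


# Hypothetical two-grid area with made-up power values (arbitrary units).
area_2 = {("s1488", "g1"), ("s386", "g1")}
grid_power = {("s1488", "g1"): [300, 500], ("s386", "g1"): [200, 400]}

clash = {"s1488": [0, 1, None], "s386": [None, 1, 0]}
shifted = {"s1488": [0, 1, None], "s386": [1, None, 0]}

print(area_power(area_2, grid_power, clash), within_limit(area_power(area_2, grid_power, clash), 700))
print(area_power(area_2, grid_power, shifted), within_limit(area_power(area_2, grid_power, shifted), 700))
```

In the first schedule the two high-power subsets overlap and the area exceeds the limit; moving one subset to another time step brings the peak back to the limit without lengthening the schedule, which is the kind of rearrangement the optimizer searches for.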

Fig. 11. Defining two areas in our example.

TABLE 7 $W_a = W_p = W_h = 1$, $P_1^{AREA} = 15$ mWatts, $P_2^{AREA} = 1{,}200$ µWatts, $N^{TAM} = 80$

NA: Not applicable.


TABLE 8 $W_a = W_p = W_h = 1$, $W_t = 10^4$, $P_1^{AREA} = 10^4$ µWatts, $P_2^{AREA} = 700$ µWatts

NA: Not applicable.

TABLE 9 Nonpreemptive Tests: $W_h = 0$, $P_1^{AREA} = 15$ mWatts, $P_2^{AREA} = 1{,}200$ µWatts, $N^{TAM} = 80$

NA: Not applicable.

5.2 Results for Preemptive Test

The four formulations, shown in Figs. 5 and 6, are used with constraints (21) and (22) on external test resource limits included. We used the default preemptive formulations first, where the tests of the four cores are allowed to be halted for a period of time and then resumed. Adding constraint (10) can change a formulation from preemptive to nonpreemptive. Although constraint (19) is used in both the MC-PAM and MG-PAM problems, the TAM widths for each core are not changed during test. Constant $P_1^{AREA}$ is the same as $P^{PEAK}$ and variable $p_1^{AREA}$ is the same as $p^{SoC}$ in the tables. As shown in Table 7, the time weight heavily affects the runtime. In general, $(W_t, W_a, W_p, W_h) = (10^4, 1, 1, 1)$ produced a good result in a short time. The proper upper limits of TAM width and power values can be found in Table 7. Table 8 shows the results with a different number of TAM, when the weights and power limits are properly chosen from the results of Table 7. With smaller power and TAM limits, the runtime in Table 8 is much larger than that in Table 7.

5.3 Results for Nonpreemptive Test

The four ILP formulations in Section 5.2 with constraint (10) are used for nonpreemptive tests. Table 9 shows the results of different weights with nonpreemptive tests. Two values, 1 and $10^4$, are assigned to $W_t$, $W_a$, and $W_p$ in three extreme cases. Those three cases effectively minimize only time, TAM, or power. The choice of $(W_t, W_a, W_p) = (10^2, 10, 1)$ gives a balanced result among time, TAM, and power. Using different weights for the time, TAM, and power terms provides the flexibility and trade-off in our formulations.

5.4 Preemption versus Nonpreemption

$(W_t, W_a, W_p, W_h) = (10^4, 0, 0, 1)$ is used in Table 10. The runtime is shorter than the previous ones with $(W_t, W_a, W_p, W_h) = (10^4, 1, 1, 1)$ in Table 8 because there is no need to reduce the number of halts (preemption times) when optimizing the solution. If we do not want to halt the test of a core, then constraint (10) can be added to that core to make it a nonpreemptive test. Therefore, the $e_j - b_j$ term of Core $j$ in the objective will be removed. With nonpreemptive test for the four cores, the results of the four ILP formulations are shown in Table 10. The runtime now is two to three orders of magnitude faster than the preemptive test cases.

TABLE 10 $W_a = W_p = 0$, $W_h = 1$, $W_t = 10^4$, $P_1^{AREA} = 10^4$ µWatts, $P_2^{AREA} = 700$ µWatts, $n^{TAM} = 48$

NA: Not applicable.

TABLE 11 Runtime Results of ILP-MG when $W_h = 1$, $W_t = 10^4$, $P_1^{AREA} = 10^4$ µWatts, $P_2^{AREA} = 700$ µWatts, $N^{TAM} = 48$

5.5 Discussion on Runtime

Table 8 shows that the optimization runtime can be large. In general, this is a price to pay for such comprehensive optimization. However, for practical optimizations, this rarely happens. Additionally, there are still ways to significantly reduce the runtime. One way is to limit the search by observing the improvement of the result over iterations. More specifically, using the CPLEX tool, our ILP formulation first converges to a feasible solution very quickly. Then, CPLEX spends a lot of time optimizing the solution based on the weights. To make this point clear, we used ILP-MG as an example to show how we can monitor the improvement over iterations and stop the search when the improvement becomes negligible. A user can settle for a suboptimal solution or use the information to revise the objective and/or constraints and rerun CPLEX. CPLEX can be halted and resumed after checking the variables.

Column Gap in Table 11 shows the difference between the value of the objective function and the optimum value from the LP (Linear Programming) formulation, which is often a real number. Briefly,

$$Gap = \left| \frac{obj(ILP) - obj(LP)}{obj(LP)} \right|$$

and it is reported periodically on the screen while running CPLEX. Table 11 shows that, as time passes, Gap becomes smaller, meaning a new solution closer to the optimal is found. But, sometimes, finding the absolute optimum solution becomes too time-consuming even though there can be many high-quality (but suboptimal) solutions. We just stop the program and take a solution when Gap is very small. Therefore, all the solutions are acceptable, but the number and distribution of preemption time slots vary.

Another way to shorten the runtime is to put weights of zero in the objective function for those metrics that are not so critical. For example, if the TAM width and/or the power values are not critical and we are satisfied with their loose constraints (through (18) and/or (19)), then we can set $W_a = W_p = 0$, which makes the run about 10 times faster than with $W_a = W_p = 1$. Comparing runtimes in Table 10 and Table 11 clearly verifies this.
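The monitoring strategy described in this subsection can be sketched in Python as follows. The gap function follows the definition of Gap given in the text; the stopping rule (`should_stop`, with its `min_improvement` and `patience` parameters) is a hypothetical heuristic chosen for illustration, not a CPLEX feature and not necessarily the rule used by the authors.

```python
def relative_gap(obj_ilp, obj_lp):
    """Gap = |(obj(ILP) - obj(LP)) / obj(LP)|."""
    return abs((obj_ilp - obj_lp) / obj_lp)


def should_stop(gaps, min_improvement=0.01, patience=3):
    """Stop once the last `patience` reports improved Gap by less than min_improvement.

    gaps is the list of Gap values reported so far, oldest first.
    """
    if len(gaps) <= patience:
        return False  # not enough history to judge the trend
    return gaps[-patience - 1] - gaps[-1] < min_improvement


# Made-up sequence of periodically reported gaps.
history = [0.5, 0.2, 0.1, 0.099, 0.098, 0.097]
print(relative_gap(110, 100))       # incumbent 110 versus LP bound 100
print(should_stop(history[:4]))     # still improving quickly
print(should_stop(history))         # improvement has flattened out
```

A rule of this kind trades optimality for time: the schedule returned at the stopping point is feasible, but its number and placement of halts may differ from those of the true optimum.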

5.6 Comment on Weight Selection

The weights in the objective function imply the level of importance in the optimization process. Due to the complexity of the formulation and the wide range of sensitivity of the parameters, we cannot offer any fixed set of weights that always provides the best solutions. Moreover, internal treatment of these weights by the ILP solver tool also affects the quality of the final solution. In spite of such ambiguities, a few tries always provide enough information to tune the weights. In what follows, we summarize our findings in this respect.

In most of the test scheduling applications, time is the preferred metric to be minimized while all other parameters can be specified (or contained) by the user’s predefined constants and constraints. For the CPLEX tool, we have always obtained very good quality solutions and reasonable runtime using the following four categories of weights:

1. a very large number (e.g., weight of $10^2$ to $10^4$) for a metric that is critically important (e.g., time weight $W_t = 10^4$),
2. a mid-range number (e.g., weight of $10^1$ to $10^2$) for those metrics that are our primary concern,
3. a small number (e.g., weight of 1 to 10) for those metrics that are our secondary concern,
4. weight of zero for all parameters that are not critical and we can handle them through constraints.

Table 9 shows experiments with different permutations of these four categories of weights. In all cases, an optimum solution was found in reasonable time.
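How the weight categories steer the result can be seen with a small Python sketch. It assumes the objective is a plain weighted sum of total test time, TAM width, peak power, and number of halts; the exact terms of the FITS objective are defined in Section 3.3, and the candidate schedules, their metric values, and the names `Candidate`, `score`, and `best` are invented for illustration.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    name: str
    time: int    # total test time in time steps
    tam: int     # TAM width in pins
    power: int   # peak power (arbitrary units)
    halts: int   # number of preemptions


def score(c, w_t, w_a, w_p, w_h):
    """Weighted sum of the four metrics (assumed form of the objective)."""
    return w_t * c.time + w_a * c.tam + w_p * c.power + w_h * c.halts


def best(candidates, weights):
    """Candidate with the smallest weighted score."""
    return min(candidates, key=lambda c: score(c, *weights))


candidates = [
    Candidate("A", time=10, tam=80, power=15, halts=3),
    Candidate("B", time=14, tam=48, power=9, halts=0),
    Candidate("C", time=12, tam=64, power=12, halts=1),
]

print(best(candidates, (10**4, 1, 1, 1)).name)   # time is critically important
print(best(candidates, (1, 1, 10**4, 1)).name)   # power is critically important
```

A weight that is several orders of magnitude larger than the others effectively turns its metric into the primary objective, while the small weights only break ties among schedules that are equally good on that metric.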

6 CONCLUSION

There are a lot of bumps in the road to achieving efficient SoC testing. We present a flexible ILP-based test scheduling (FITS) environment and four ILP formulations as case studies. Our formulations simultaneously consider time, power, resource (TAM and ATE) constraints, and can minimize or take into account total test time, TAM width, power, and number of halts. The cornerstone in our methodology is power profiling in time and space domains. The ILP-MC and ILP-MG formulations are more flexible than the other two (ILP-SC and ILP-SG) because of considering the multiple peak power values across test pattern sets for each core. The objective function is a weighted mix of test time, TAM width, power, and preemption occurrence. By adjusting the weights of the above four terms, the optimized test schedule and the values of the four terms can be obtained through a flexible trade-off mechanism. For preemptive test cases, CPLEX can provide a feasible solution. But, it will take longer to find an optimum solution as it attempts to reduce the number of halts. In general, when nonpreemptive tests and/or single metric (e.g., test time) optimization options are chosen, CPLEX produces the results much faster. FITS is a very flexible environment giving many options for the user to choose from. The price to pay is the need to tune the weights by observation or heuristics and compromise on a suboptimal solution when the runtime becomes too long for very large systems or overly constrained choices.

ACKNOWLEDGMENTS

This work was supported in part by US National Science Foundation CAREER Award #CCR-0130513.

REFERENCES

[1] P. Chen, L. Wu, G. Zhang, and Z. Liu, “A Unified Compact Scalable Id Model for Hot Carrier Reliability Simulation,” Proc. IEEE 37th Ann. Int’l Reliability Physics Symp., pp. 243-248, 1999.
[2] Y. Bonhomme, P. Girard, L. Guiller, C. Landrault, and S. Pravossoudovitch, “A Gated Clock Scheme for Low Power Scan Testing of Logic ICs or Embedded Cores,” Proc. 10th Asian Test Symp., pp. 253-258, Nov. 2001.
[3] R. Sankaralingam and N. Touba, “Reducing Test Power during Test Using Programmable Scan Chain Disable,” Proc. IEEE Int’l Workshop Electronic Design, Test, and Applications, pp. 159-163, Jan. 2002.
[4] O. Sinanoglu, I. Bayraktaroglu, and A. Orailoglu, “Test Power Reduction through Minimization of Scan Chain Transitions,” Proc. 20th IEEE VLSI Test Symp., pp. 166-171, Apr. 2002.
[5] A. Chandra and K. Chakrabarty, “A Unified Approach to Reduce SOC Test Data Volume, Scan Power and Testing Time,” IEEE Trans. Computer-Aided Design of Integrated Circuits and Systems, vol. 22, no. 3, pp. 352-363, Mar. 2003.
[6] E. Larsson, K. Arvidsson, H. Fujiwara, and Z. Peng, “Integrated Test Scheduling, Test Parallelization and TAM Design,” Proc. 11th Asian Test Symp. (ATS ’02), pp. 397-404, Nov. 2002.
[7] E. Marinissen, R. Arendsen, G. Bos, H. Dingemanse, M. Lousberg, and C. Wouters, “A Structured and Scalable Mechanism for Test Access to Embedded Reusable Cores,” Proc. Int’l Test Conf. (ITC-98), pp. 284-293, 1998.
[8] V. Iyengar, K. Chakrabarty, and E.J. Marinissen, “Recent Advances in Test Planning for Modular Testing of Core-Based SOCs,” Proc. 11th Asian Test Symp. (ATS ’02), pp. 320-325, Nov. 2002.
[9] R. Chou, K. Saluja, and V. Agrawal, “Scheduling Tests for VLSI Systems under Power Constraints,” IEEE Trans. VLSI, vol. 5, no. 2, pp. 175-185, June 1997.
[10] P. Rosinger, B. Al-Hashimi, and N. Nicolici, “Power Profile Manipulation: A New Approach for Reducing Test Application Time under Power Constraints,” IEEE Trans. Computer-Aided Design of Integrated Circuits and Systems, vol. 21, no. 10, pp. 1217-1225, Oct. 2002.
[11] D. Zhao and S. Upadhyaya, “Adaptive Test Scheduling in SoC’s by Dynamic Partitioning,” Proc. 17th IEEE Int’l Symp. Defect and Fault Tolerance in VLSI Systems, pp. 334-342, Nov. 2002.
[12] D. Zhao and S. Upadhyaya, “Power Constrained Test Scheduling with Dynamically Varied TAM,” Proc. 21st VLSI Test Symp. (VTS ’03), pp. 273-278, Apr. 2003.
[13] E.G. Coffman Jr., M.R. Garey, D.S. Johnson, and R.E. Tarjan, “Performance Bounds for Level-Oriented Two-Dimensional Packing Algorithm,” SIAM J. Computing, vol. 9, pp. 809-826, 1980.
[14] Y. Huang, W.-T. Cheng, C.-C. Tsai, N. Mukherjee, O. Samman, Y. Zaidan, and S.M. Reddy, “Resource Allocation and Test Scheduling for Concurrent Test of Core-Based SoC Design,” Proc. Asian Test Symp., pp. 265-270, 2001.
[15] Y. Huang, W.-T. Cheng, C.-C. Tsai, N. Mukherjee, O. Samman, Y. Zaidan, and S.M. Reddy, “On Concurrent Test of Core-Based SoC Design,” J. Electronic Testing: Theory and Applications, vol. 18, Aug. 2002.
[16] V. Iyengar, K. Chakrabarty, and E.J. Marinissen, “Wrapper/TAM Co-Optimization, Constraint-Driven Test Scheduling, and Tester Data Volume Reduction for SOCs,” Proc. 39th Design Automation Conf., pp. 685-690, June 2002.
[17] Y. Huang, S.M. Reddy, W.-T. Cheng, P. Reuter, N. Mukherjee, C.-C. Tsai, O. Samman, and Y. Zaidan, “Optimal Core Wrapper Width Selection and SOC Test Scheduling Based on 3D Bin Packing Algorithm,” Proc. Int’l Test Conf., pp. 74-82, Oct. 2002.
[18] Y. Huang, W.-T. Cheng, C.-C. Tsai, N. Mukherjee, and S.M. Reddy, “Static Pin Mapping and SOC Test Scheduling for Cores with Multiple Test Sets,” Proc. Fourth Int’l Symp. Quality Electronic Design, pp. 99-104, Mar. 2003.
[19] M. Nourani and C. Papachristou, “An ILP Formulation to Optimize Test Access Mechanism in SoC Testing,” Proc. Int’l Test Conf. (ITC), pp. 902-910, Oct. 2000.
[20] K. Chakrabarty, “Test Scheduling for Core-Based Systems Using Mixed Integer Linear Programming,” IEEE Trans. Computer-Aided Design, vol. 19, pp. 1163-1174, Oct. 2000.
[21] V. Iyengar and K. Chakrabarty, “Precedence-Based, Preemptive, and Power-Constrained Test Scheduling for System-on-a-Chip,” Proc. VLSI Test Symp. (VTS), pp. 368-374, 2001.
[22] M. Nourani and J. Chin, “Power-Time Trade-Off in Test Scheduling for SoCs,” Proc. Int’l Conf. Computer Design, pp. 548-553, Oct. 2003.
[23] J. Chin and M. Nourani, “SoC Test Scheduling with Power-Time Tradeoff and Hot Spot Avoidance,” Proc. Design Automation and Test in Europe Conf. (DATE), pp. 710-711, Feb. 2004.
[24] V. Muresan, X. Wang, V. Muresan, and M. Vladutiu, “List Scheduling and Tree Growing Technique in Power-Constrained Block Test Scheduling,” Digest of Papers European Test Workshop, pp. 27-32, 2000.
[25] N. Nicolici and B. Al-Hashimi, “Multiple Scan Chains for Power Minimization during Test Application in Sequential Circuits,” IEEE Trans. Computers, vol. 51, no. 6, pp. 721-734, June 2002.
[26] P.M. Rosinger, P.T. Gonciari, B.M. Al-Hashimi, and N. Nicolici, “Analysing Trade-Offs in Scan Power and Test Data Compression for Systems-on-a-Chip,” IEE Proc. Computers and Digital Techniques, vol. 149, no. 4, pp. 188-196, July 2002.
[27] E. Larsson and H. Fujiwara, “Test Resource Partitioning and Optimization for SOC Designs,” Proc. 21st VLSI Test Symp., pp. 319-324, Apr. 2003.
[28] E. Larsson, K. Arvidsson, H. Fujiwara, and Z. Peng, “Integrated Test Scheduling, Test Parallelization and TAM Design,” Proc. 11th Asian Test Symp., pp. 397-404, Nov. 2002.
[29] J. Pouget, E. Larsson, Z. Peng, M. Flottes, and B. Rouzeyre, “An Efficient Approach to SoC Wrapper Design, TAM Configuration and Test Scheduling,” Proc. Eighth IEEE European Test Workshop, pp. 51-56, May 2003.
[30] D. Zhao and S. Upadhyaya, “A Resource Balancing Approach to SoC Test Scheduling,” Proc. 2003 Int’l Symp. Circuits and Systems (ISCAS ’03), vol. 5, pp. 525-528, May 2003.
[31] The Mosis Service, www.mosis.org, 2004.
[32] Synopsys Inc., “User Manuals for SYNOPSYS Version 2004,” 2004.
[33] ILOG S.A, Inc., “The User’s Manual for ILOG CPLEX 8.1,” 2003.
[34] http://www.extra.research.philips.com/itc02socbenchm, 2005.

Tzuchien (James) Chin (M’05) received the BS degree in electronic engineering from National Chiao Tung University, Hsinchu, Taiwan, in June 1985. He received two MS degrees in electrical engineering from National Tsing Hua University at Hsinchu, Taiwan (May 1989) and Texas Tech University in Lubbock, Texas (June 1993). He received the PhD degree in electrical engineering from the University of Texas at Dallas in May 2005. Dr. Chin worked at Texas Instruments from 1998 to 2003 as a mixed signal test engineer and is currently with Analog Devices, Inc. in Wilmington, Massachusetts. His current research interests include high electron mobility transistors, noise in VLSI chips, neural networks, and various system-on-chip test challenges. He is a member of the IEEE.


Mehrdad Nourani (S’91-M’94-SM’05) received the BSc and MSc degrees in electrical engineering from the University of Tehran, Tehran, Iran, in 1986 and the PhD degree in computer engineering from Case Western Reserve University, Cleveland, Ohio, in 1993. During the 1994 academic year, he was a postdoctoral fellow in the Department of Computer Engineering at Case Western Reserve University. He was with the Department of Electrical and Computer Engineering, University of Tehran from 1995 to 1998 and the Department of Electrical Engineering and Computer Science, Case Western Reserve University from 1998 to 1999. Since August 1999, he has been on the faculty of the University of Texas at Dallas, Richardson, where he is currently an associate professor of electrical engineering and a member of the Center for Integrated Circuits and Systems. His current research interests include design for testability, system-on-chip testing, signal integrity modeling and test, application specific processor architectures, packet processing devices, high-level synthesis, and low-power design methodologies. He has published more than 120 papers in journals and refereed conference proceedings. Dr. Nourani received the Texas Telecommunications Consortium Award (1999), The Clark Foundation Research Initiation Grant (2001), the US National Science Foundation Career Award (2002), and Cisco Systems Inc. URP Award (2004). He is a senior member of the IEEE and a member of the IEEE Computer Society and the ACM SIGDA.
