Accelerated Testing Obtaining Reliability ... - Iowa State University

Accelerated Testing Background

Accelerated Testing Obtaining Reliability Information Quickly

• Today’s manufacturers face strong pressure to:

Improve productivity, product field reliability, and overall quality using new technology. Develop newer, higher technology products in record time.

William Q. Meeker Department of Statistics and Center for Nondestructive Evaluation Iowa State University Ames, IA 50011

• Implies increased need for up-front testing of materials, components and systems. • Accelerated tests provide timely information for product design and development. • Users must be aware of potential pitfalls

1

2

Overview What is Reliability? • Different kinds of accelerated tests • R(t) = 1 − F (t)

• Example 1—Evaluation of an insulating structure

• The probability that a system, vehicle, machine, device, and so on, will perform its intended function under encountered operating conditions, for a specified period of time. • Quality over time

• Example 2—New-technology microelectronic logic device • Accelerated Degradation Tests • Importance of physics of failure and physical/chemical models (and sensitivity analysis)

• A powerful marketing tool

• Example 3—Microelectronic RF amplifier device

• An engineering discipline requiring support from

• Connecting with the field

Physics and chemistry

• Example 4—Appliance field reliability

Statistics

• Areas for further research

3

4

Breakdown Times in Minutes of a Mylar-Polyurethane Insulating Structure (from Kalkanis and Rosso 1989)

Some Applications of Accelerated Tests

• Assess component or material reliability or durability.

10

4

10

3

10

2

10

1

10

0

• Make design decisions to improve reliability or lower cost

• System test to simulate field-use at accelerated conditions. • Predict product field performance. • Identify and fix potential failure modes at system/subsystem level (HALT and STRIFE tests). • Screening (100% or audit) testing of manufactured product (e.g. ESS and burn-in).

Minutes

• Verify predictions produced with physical models (e.g. FEM)

•• • • • •

• •• • •• •

• • • • • •

• •• •• • • •• • • • •

-1

10

100

150

200

250

300

350 400

kV/mm

5

6

Plot of Inverse Power Relationship-Lognormal Model Fitted to the Mylar-Polyurethane Data (also Showing 361.4kV Data Omitted from the ML Estimation)

Inverse Power Relationship-Lognormal Model

The inverse power relationship-lognormal model is

Pr[T ≤ t; volt] = Φnor

log(t) − µ σ

10

5

10

4

10

3

10

2

10

1

••• • • •

Minutes

where

• µ = β0 + β1x, and • x = log(Voltage Stress).

10

• σ assumed to be constant.

• ••• •• •

• ••• •• •

• •• •• •

90%

• •• • • • •

0

-1

10

50

100

200

50% 10%

500

kV/mm 7

8

Lognormal Probability Plot of the Inverse Power Relationship-Lognormal Model Fitted to the Mylar-Polyurethane Data

Methods of Acceleration Three fundamentally different methods of accelerating a reliability test:

.99 .98 .95

• Increase the use-rate of the product (e.g., test a toaster 400 times/day). Higher use rate reduces test time.

.9

Proportion Failing

.8 .7

• Use elevated temperature or humidity to increase rate of failure-causing chemical/physical process.

.6 .5 .4 .3

• Increase stress (e.g., voltage or pressure) to make degrading units fail more quickly.

.2 .1 .05

Use a physical/chemical (preferable) or empirical model relating degradation or lifetime at use conditions.

.02 219.0

.01 10

0

1

10

157.1

122.4

10

50 kV/mm

100.3

2

10

3

10

4

10

5

Minutes

9

Interval ALT Data for a New-Technology IC Device

10

New-Technology Integrated Circuit Device ALT Data

• Tests run at 150, 175, 200, 250, and 300◦C.

• Failures had been found only at the two higher temperatures. • After early failures at 250 and 300◦C, there was some concern that no failures would be observed at 175◦C before decision time. • Thus the 200◦C test was started later than the others.

11

Hours

• Developers interested in estimating activation energy of the suspected failure mode and the long-life reliability.

10

7

10

6

10

5

10

4

10

3

10

2

x x x

100

150

200

250

x x x

300

350

Degrees C

12

The Arrhenius-Lognormal Regression Model Elevated Temperature Acceleration of Chemical Reaction Rates The Arrhenius-lognormal regression model is

• The Arrhenius model Reaction Rate, R(temp), is

−Ea −Ea × 11605 = γ0 exp kB (temp ◦C + 273.15) temp K where temp K = temp ◦C+273.15 is temperature in Kelvin and

Pr[T ≤ t; temp] = Φnor

R(temp) = γ0 exp

kB = 1/11605 is Boltzmann’s constant in units of electron volts per K. The reaction activation energy, Ea, and γ0 are characteristics of the product or material being tested. • The reaction rate Acceleration Factor is

log(t) − µ σ

where

• µ = β0 + β1x, • x = 11605/(temp K) = 11605/(temp ◦C + 273.15)

R(temp) R(tempU ) 11605 11605 = exp Ea − tempU K temp K

AF(temp, tempU , Ea ) =

• and β1 = Ea is the activation energy

• When temp > tempU , AF (temp, tempU , Ea) > 1.

• σ is constant

13

Arrhenius Plot Showing ALT Data and the Arrhenius-Lognormal Model ML Estimation Results for the New-Technology IC Device.

14

Lognormal Probability Plot Showing the Arrhenius-Lognormal Model ML Estimation Results for the New-Technology IC Device

.95

10

7

10

6

10

5

10

4

10

3

.9

.7

Proportion Failing

Hours

.8 .6 .5 .4 .3 .2 .1 .05 .02

10

x x x

x x x

2

100

150

200

250

300

.01 .005 .002 50%

.0005

10% 1%

300 Deg C

.0001

350

10

2

250

200 3

10

175 10

4

150

100 10

5

10

6

10

7

Hours

Degrees C on Arrhenius scale 15

16

Lognormal Probability Plot Showing the Arrhenius-Lognormal Model ML Estimation Results for the New-Technology IC Device with Given Ea = .8 Pitfall 4: Masked Failure Mode .95 .9

• Accelerated test may focus on one known failure mode, masking another!

.8

Proportion Failing

.7 .6 .5 .4 .3

• Masked failure modes may be the first one to show up in the field.

.2 .1 .05 .02 .01 .005

• Masked failure modes could dominate in the field.

.002 .0005 300 Deg C

.0001 10

2

250

200 3

10

175 10

4

150

100 10

5

10

6

10

7

Hours

17

18

Unmasked Failure Mode with Lower Activation Energy

6

10

6

10

5

10

5

10

4

10

4

10

10%

10

10

3

3

Mode 1

2

10

2

10

Hours

Hours

Possible results for a typical temperature-accelerated failure mode on an IC device

Mode 2

10% 10%

10

10

1

40

60

80

100

120

1

40

140

60

80

100

120

140

Degrees C

Degrees C

19

20

Percent Increase in Resistance Over Time for Carbon-Film Resistors (Shiomi and Yanagisawa 1979)

Advantages of Using Degradation Data Instead of Time-to-Failure Data

• Degradation is natural response for some tests. 10.0

173 Degrees C

Percent Increase

5.0

• Can be more informative than time-to-failure data. (Reduction to failure-time data loses information)

133 Degrees C

• Useful reliability inferences even with 0 failures.

1.0

83 Degrees C 0.5

• More justification and credibility for extrapolation. (Modeling closer to physics-of-failure) 0

2000

4000

6000

8000

10000

Hours 21

22

Percent Increase in Operating Current for GaAs Lasers Tested at 80◦C

Limitations of Degradation Data

10 0

• Analyses more complicated; requires statistical methods not yet widely available. (Modern computing capabilities should help here)

5

• Substantial measurement error can diminish the information in degradation data.

Percent Increase in Operating Current

• Obtaining degradation data may have an effect on future product degradation (e.g., taking apart a motor to measure wear).

15

• Degradation data may be difficult or impossible to obtain (e.g., destructive measurements).

• Degradation level may not correlate well with failure.

0

1000

2000

3000

4000

Hours

23

24

Device-B Power Drop Accelerated Degradation Test Results at 150◦C, 195◦C, and 237◦C (Use conditions 80◦C)

Arrhenius Model Temperature Effect on Chemical Degradation

A1 k1 -A2 and the rate equations for this reaction are

0.0

Power drop in dB

-0.2

• • •• ••• • • • • •

••• • •• • •• • ••• • • •• • ••• •• •• • • • • • •• •• •

-0.4

• • • •• • • • •• • • •• • • • •

-0.6

-0.8

-1.0

• •

-1.2

-1.4

•• •• •• • •• • • • • • ••• • ••• • • •• •• • • • ••• • • • • • • • • • • ••• •• • • • • •• • • • • • • • • • • • • •• •• • •

••• ••• •• • • • •• • ••• • •• •• • • • 150 Degrees C • •• • •• • • • •• • •• ••• • •• •• •• •• • • • • • • • • • • • • •• • • • • • • • •• • • • • • • • • • • • • • • • • • •• • • • • • • • • • • • • • • •• •• • • • • • • •• •• • • • • • • • • • •• • • •• •• • • • • • •• • • •• •• • • • •• •• • ••• • • • • • • • • • • • • •• •• • • • • • • • • • • • • • • • • • • • 195 Degrees C • • • • • • ••• • • • • • • • • • • • • • • • • •• • • 237 Degrees C • • • • ••

0

1000

2000

•• • • • • •• • • • •• ••• •

dA1 = −k1A1 dt Solving these gives

• • • • • • • • • •• • • • •

and

dA2 = k1A1, dt

k1 > 0.

(1)

A1(t) = A1(0) exp(−k1t) A2(t) = A2(0) + A1(0)[1 − exp(−k1t)] where A1(0) and A2(0) are initial conditions. The Arrhenius model describing the effect that temperature has on the rate of a simple first-order chemical reaction is

3000

k1 = γ0 exp

4000

−Ea kB × (temp + 273.15)

Hours 25

26

Lognormal-Arrhenius Model Fit to the Device-B Time-to-Failure Data with Degradation Model Estimates

.99

What Do Accelerated Test Results Tell Us About Field Reliability? Need information on:

.98 .95

Proportion Failing

.9 .8

• Effects of acceleration (e.g., cycling rate).

.7 .6 .5 .4 .3

• Distribution of use-rates in actual use.

.2 .1

• Distribution of environmental conditions (e.g., stress spectra distributions).

.05 .02 .01 .005 237 Degrees C

195

150 Degrees C

80 Degrees C

.001 10^1

10^2

10^3

10^4

These factors may be given or, in some situations, inferred from the available data.

10^5

Hours

27

28

Establish a Transfer Function Relating Laboratory Tests and Field Performance

Component-A Laboratory Test Cycles to Failure

• Carefully compare laboratory tests results and field failures.

Same failure mechanisms operating in laboratory tests? Same factors (environmental noises) exciting the failure mechanisms? Identify laboratory/field discrepancies to improve test procedures. Seek understanding of reasons for lack of agreement. • Find a model (transfer function) to relate laboratory test to field use. • Understanding the relationship between the laboratory test results and product field reliability will provide stronger basis for using future laboratory tests to predict field performance.

29

0

10000

20000

30000

40000

50000

Cycles

30

Appliance Use-Rate Distribution (discretized lognormal distribution)

Example Use-Rate Model

• Life of a component in cycles of use, has a distribution

0.15

FC (c) = P (C ≤ c) = Φ

log(c) − µ σ

0.05

0.10

• Actual use-rate has a distribution given by the proportion of users πi (i = 1, . . . , k) that use the appliance at constant rate Ri, where ki=1 πi = 1. • Then the failure probability as a function of time is

0.0

FT (t; θ ) = P (T ≤ t) =

k

πi Φ

i=1

log (t) − µi σ

where θ = (µ1, . . . , µk , σ) and µi = µ − log(Ri).

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 Relative Frequency of Appliance Uses per Week

31

32

Predicted Field Reliability of Component-A as a Weighted Average of Lognormal Distributions

Predicted Field Reliability of Component-A as a Weighted Average of Weibull Distributions

.999 .98 .9

.95 .9

.7 .5

.8 .7

.3 .2

.6 .5

Fraction Failing

Probability

.1 .05 .02 .01 .005 .003

.4 .3 .2 .1 .05 .02

.001

.01

.0005

.005 .0002

.002 .001

.0001 50

100

200

500

1000

2000

5000

50

100

200

Mon Apr 10 14:01:32 CDT 2000

Weeks of Service

500

1000

Weeks of Service

2000

5000

Fri Mar 23 21:27:13 CST 2001

33

Fitted Use-Rate Model for the Wear Failure Mode

34

Field Variability Lognormal f (r; ηR, σR ) Density for the Wear Failure Mode (unloaded cycles relative to field days of use)

Lab: subset AccWear Appliance B Wear Failure Mode ALT data Field: subset Field Appliance B Wear Failure Mode .95 Laboratory Field

.9

Lab: subset AccWear Appliance B Wear Failure Mode ALT data Field: subset Field Appliance B Wear Failure Mode

Fraction Failing

.8 .7 .6 .5 .4 .3 .2 .1 .05 .02 .01 .005 .002 .0005 .00005 .00001 2

5

10

20

50

100

Lab time: Test Cycle

200

500

Field time: Weeks

1000

2000

5000

Thu May 10 22:23:56 CDT 2001

0.01

0.05

0.20

1.00

5.00

20.00

Thu May 10 22:24:23 CDT 2001

Test Cycles per Week 35

36

Simulation of a Proposed Accelerated Life Test Plan

Planning Accelerated Tests

• Most basic ideas of traditional DOE still hold

Temp= 78,98,120 n= 155,60,84 centime= 183,183,183 parameters= -16.7330, 0.7265, 0.6000

10

5

10

4

10

3

10

2

10

1

Log time quantiles at 50 Degrees C Average( 0.1 quantile)= 8.014 SD( 0.1 quantile)= 0.4632 Average( 0.5 quantile)= 9.138 SD( 0.5 quantile)= 0.5116 Average(Ea)= 0.7266 SD(Ea)= 0.08594

• In censored accelerated life tests (failure time is response) allocate more test units to low acceleration factor level than high acceleration factor levels.

• Consider including some tests at the use conditions. • Use simulation to investigate properties of alternative ALT plans.

Days

• Limit, as much as possible the amount of extrapolation used.

10

10%

0

Results based on 500 simulations Lines shown for 50 simulations

40

60

80

100

120

140

160

Degrees C

37

38

Concluding Remarks

Areas for Further Research

• Physical/statistical models for failure acceleration

• Accelerated Testing can be valuable tool when used carefully

• Methods for sensitivity analysis when empirical models must be used

• There is no magic in Accelerated Testing

• Prediction of service life in complicated environments • Physical/statistical models the field environment • Bayesian methods for analysis and planning (especially adaptive test plans) • Accelerated degradation test planning • Degradation analysis and planning with coarse (e.g. ordered categorical and censored) data. • Physical comparison of lab and filed failures to validate testing methods

• Cross-disciplinary teams are needed to deal effectively with all issues

Product/reliability/design engineers to identify productuse profiles, environmental considerations, potential failure modes or weaknesses that need to be evaluated, etc. Experts in materials and the chemistry/physics of failure to help in the understanding of an suggest/develop appropriate models for acceleration of particular failure modes. Statisticians to help with stochastic modeling, plan tests, fit models, and to help quantify uncertainty in results. • Users of Accelerated Testing must beware of pitfalls

39

40

References References • D. Byrne, J. Quinlan, Robust function for attaining high reliability at low cost, 1993 Proceedings Annual Reliability and Maintainability Symposium, 1993, pp 183-191. • L. W. Condra, Reliability Improvement with Design of Experiments, 1993, New York: Marcel Dekker, Inc. • M. Hamada, Using statistically designed experiments to improve reliability and to achieve robust reliability, IEEE Transactions on Reliability R-44, 1995 June. • M. Hamada, Analysis of experiments for reliability improvement and robust reliability, in Recent Advances in Life-Testing and Reliability, 1995, N. Balakrishnan, editor, Boca Raton: CRC Press. • Meeker, W.Q. and Hamada, M. (1995), Statistical Tools for the Rapid Development & Evaluation of High-Reliability Products, IEEE Transactions on Reliability R-44, 187-198. 41

• Meeker, W.Q. and Escobar, L.A. (1998a), Statistical Methods for Reliability Data. John Wiley and Sons, Inc. • Meeker, W.Q. and Escobar, L.A. (1998b), Pitfalls of Accelerated Testing. , IEEE Transactions on Reliability R-47, 114-118. • W. Nelson, Accelerated Testing: Statistical Models, Test Plans, and Data Analyses, 1990, New York: John Wiley & Sons, Inc. • G. Taguchi, System of Experimental Design, 1987; White Plains, NY: Unipub/Kraus International Publications. • T. S. Tseng, M. Hamada, C. H. Chiao, (1995), Using degradation data from a factorial experiment to improve fluorescent lamp reliability, Journal of Quality Technology, 363-369.

42