KEYNOTE PRESENTATION

Prognostics and Health Monitoring of Electronic Systems

Pradeep Lall (1), Ryan Lowe (1), Kai Goebel (2)

(1) Auburn University, Department of Mechanical Engineering, NSF Center for Advanced Vehicle and Extreme Environment Electronics (CAVE3), Auburn, AL 36849
(2) NASA Ames Research Center, Moffett Field, CA 94035

Tele: +1 (334) 844-3424
E-mail: [email protected]

978-1-4577-0106-1/11/$26.00 ©2011 IEEE
2011 12th Int. Conf. on Thermal, Mechanical and Multiphysics Simulation and Experiments in Microelectronics and Microsystems, EuroSimE 2011

Abstract

Structural damage to BGA interconnects incurred during vibration testing has been monitored in the pre-failure space using resistance spectroscopy based state space vectors, rate of change of the state variable, and acceleration of the state variable. The technique is intended for condition monitoring in high reliability applications where the knowledge of impending failure is critical and the risks in terms of loss-of-functionality are too high to bear. Future state of the system has been estimated based on a second order Kalman Filter model and a Bayesian Framework. The measured state variable has been related to the underlying interconnect damage in the form of inelastic strain energy density. Performance of the prognostication health management algorithm during the vibration test has been quantified using performance evaluation metrics. The methodology has been demonstrated on leadfree area-array electronic assemblies subjected to vibration. Model predictions have been correlated with experimental data. The presented approach is applicable to functional systems where corner interconnects in area-array packages may be often redundant. Prognostic metrics including α-λ metric, sample standard deviation, mean square error, mean absolute percentage error, average bias, relative accuracy, and cumulative relative accuracy have been used to assess the performance of the damage proxies. The presented approach enables the estimation of residual life based on level of risk averseness.

1. Introduction

Health management in high reliability electronics applications primarily focuses on damage diagnosis involving built in self test (BIST) to monitor for failure [Steininger 1999, Harris 2002, Hashempour 2004, Suthar 2006]. Damage diagnosis typically focuses on reactive failure detection and provides limited to no insight into the system reliability and residual life. Previously damage initiation, damage progression, and residual life in the pre-failure space has been correlated with microstructural damage based proxies, feature vectors based on time, spectral and joint time-frequency characteristics of electronics [Lall 2004a-d, 2005a-b, 2006a-f, 2007a-e, 2008a-f]. Precise resistance measurements based on the resistance spectroscopy method has been used to monitor interconnects for damage and prognosticate failure [Lall 2009a,b, Constable 1992, 2001].

Avionics systems require ultra-high reliability to fulfill critical roles in autonomous aircraft control and navigation, flight path prediction and tracking, and self separation. Complex electrical power systems (EPS) which broadly comprise of energy generation, energy storage, power distribution, and power management, have a major impact on the operational availability, and reliability of electronic systems. Technology trends in evolution of avionics systems point towards more electric aircraft [Downes 2007] and the prevalent use of power semiconductor devices in future aircraft and space platforms. Advanced health management techniques for electrical power systems and avionics systems are required to meet the safety, reliability, maintainability, and supportability requirements of aeronautics and space systems. Current health management techniques in EPS and avionic systems provide very-limited or no-visibility into health of power electronics, packaging to predict impending failures [McCann 2005, Marko 1996, Schauz 1996, Shiroishi 1997].

Maintenance has evolved over the years from corrective maintenance to performing time-based preventive maintenance. Future improvements in reduction of system downtime require emphasis on early detection of degradation mechanisms. Incentive for development of prognostics and health management methodologies has been provided by need for reduction in operation and maintenance process costs [Jarrell 2002]. Advances in sensor technology and failure analysis have catalyzed a broadening of application scope for prognostication systems to include large electromechanical systems such as aircraft, helicopters, ships, power plants, and many industrial operations. Current PHM application areas include, fatigue crack damage in mechanical structures such as those in aircraft [Munns 2000], surface ships [Baldwin 2002], civil infrastructure [Chang 2003], railway structures [Barke 2005] and power plants [Jarrell 2002].

Kalman filtering is a recursive algorithm that estimates the true state of a system based on noisy measurements [Kalman 1960, Zarchan 2000]. Previously, the Kalman Filter has been used for navigation [Bar-Shalom 2001], economic forecasting [Solomou 1998], and online system identification [Banyasz 1992]. Typical navigation examples include tracking [Herring 1974], ground navigation [Bevly 2007], altitude and heading reference [Hayward 1997], auto pilots [Gueler 1989], dynamic positioning [Balchen 1980], GPS/INS/IMU guidance [Kim 2003]. Application domains include GPS, missiles, satellites, aircraft, air traffic control, and ships. The ability of a Kalman filter to smooth noisy data measurements is utilized in gyros, accelerometers, radars, and odometers. Prognostication of failure using Kalman filtering has been demonstrated in steel bands and aircraft power generators [Batzel 2009, Swanson 2000, 2001]. Numerous applications in prognostics also exist for algorithms using more advanced filtering algorithms, known as particle filters. The state of charge of a battery was estimated and remaining useful life was predicted in [Saha 2009a,b]. Use of Kalman Filtering for prognostication of electronic reliability based on the underlying damage mechanics is new.

In this paper, a prognostic and health monitoring capability for electrical components based on changes in resistance has been presented. Failure modeling of BGA interconnects is combined with Kalman filtering for plastic strain state estimation and a Bayesian framework for PHM.

2. Test-Vehicle

Two test boards were used for experimental measurements. Test board A, was manufactured in four different configurations. The boards have daisy-chained ceramic packages with 400 I/O each. Each test board has 8-packages on one side of the test board. All packages on a test-assembly have the same interconnect type. Interconnect technologies studied include, copper-reinforced solder column grid array (CCGA), Eutectic tin-lead solder (63Sn37Pb), high lead solder (Sn10Pb90) and SAC305 solder (Sn3Ag0.5Cu). Table 1 shows the package parameters for test board A. A representative board with each package numbered U1 - U8 is shown in Figure 1.

Figure 1: Test Board A

Figure 2: Interconnect array configuration for Test Board-B. (Panels: 27 mm, 676 I/O PBGA; 15 mm, 196 I/O PBGA; 10 mm, 280 I/O Flex BGA; 10 mm, 144 I/O Tape Array; 7 mm, 84 I/O CABGA; 6 mm, 64 I/O Tape Array.)

Table 1: Package Architectures for Test Board A (one value column per board configuration)

Parameter            Values for the four configurations
Length (mm)          21        21        21        21
Width (mm)           21        21        21        21
Thickness (mm)       2.4       2.4       2.4       2.4
I/O                  400       400       400       400
Pitch (mm)           1         1         1         1
Ball Dia (mm)        0.6       0.6       0.6       0.6
Joint Height (mm)    2         0.6       0.6       0.6

Test Board-B includes package architectures such as, plastic ball-grid arrays, chip-array ball-grid arrays, tape-array ball-grid arrays, and flex-substrate ball-grid arrays (Figure 3). The experimental matrix has ball counts in the range of 64 to 676 I/O, pitch sizes are in the range of 0.5 mm to 1 mm, and package sizes are in the range of 6 mm to 27 mm (Table 2). The package parameters of this board are shown in Table 2. Representative samples of the test board B are shown in Figure 3.

Figure 3: Test Board-B

Table 2: Package Architectures used for Test Board B.

phase dependency has been eliminated by using a second phase-sensitive detector. The signal has been multiplied

Figure 14: Repeatability of phase shift measurement on ...

Figure 15 shows the confidence value based on phase shift measurements of a package versus number of shock ... input frequency. In general it has been noticed that the correlation between the degradation in confidence value and increase in resistance is better at higher input frequencies.

Figure 16: Phase shift as a lead indicator of failure (Test Board-B, PBGA676, 127 kHz)

Figure 18: Raw resistance data (x-axis: Time [Hrs]). The data used as an input data vector is shown in the brackets.

The failure criteria for resistance change outlined in JESD22-B103, and IPC-SM-785 for the number, duration, and severity of intermittent events is used as the definition of failure. It should be noted that the smaller step increases of 0.05 Ω during the first 90 minutes of the test are experimental noise which can be reproduced by motion of the system connections during shock and vibration. Resistance data two-hours after the initiation of the test till failure has been studied for the construction of feature vector for identification of impending failure.

Figure 19: Zoomed view of resistance data between 2 hrs and failure

A subset of the resistance data has been used since field data will often involve electronic assemblies with accrued damage and not involve pristine assemblies. Figure 19 shows a zoomed view of the input data highlighting the experimental noise between two hours and failure. The experimental noise is due in part to the challenges with overcoming the variance in contact resistance in the presence of transient dynamic motion in shock or steady-state vibration. Step changes in the resistance data can be seen at 2.8 and 4.9 hours respectively. However, the distinctive increase of about 25 mΩ during the vibration test is easily discernable even in the presence of experimental noise. Immediately before failure the resistance increase is approximately exponential in nature. The change in resistance is attributed to change in geometry, since the resistivity of the solder interconnect is expected to stay constant. Change in trace geometry is the basis of operation for traditional strain gages and can be explained in a cylindrical conductor by

$R = \rho L / A,$

where $R$ is the resistance of the conductor, $\rho$ is the material property resistivity, $L$ is length and $A$ is the cross sectional area. By logarithmically differentiating both sides, and assuming linear elastic properties a relation between strain and resistance can be derived as

$dR = R_0 \varepsilon_a (1 + 2\nu),$

where $dR$ is the change in resistance, $R_0$ is the initial resistance, $\varepsilon_a$ is the elastic axial strain and $\nu$ is the Poisson ratio. Since the material properties and geometry of a solder ball are non-linear a finite element simulation was used to map the change in resistance of an interconnect to the state of plastic strain that the interconnect was feeling. The simulation was implemented in ANSYS(TM) Version 12 using Anand's Viscoplasticity and VISCO107 elements. The Anand's constants used for the simulation are shown in Table 3.

Table 3: Anand's Constants for SAC305

Constant    Value
s0          45.9 MPa
Q/k         7460 1/K
A           5.87e6 1/sec
ξ           2
m           0.0942
h0          9350 MPa
n           0.015
a           1.5
ŝ           58.3 MPa

Table 4 shows the dimensional parameters for the undeformed geometry of a typical solder ball based on the manufacturer's data sheet. Previous studies have shown that tensile stress in the out-of-plane z-direction is the primary stress during the shock test in the solder interconnects [Darveaux 2006, Chong 2006]. The solder interconnect deformation during the shock test was simulated using non-linear finite elements by constraining the solder interconnect along the bottom of the joint, and applying a displacement load on the top (Figure 20).

Table 4: Undeformed geometry of solder ball

Parameter                                         Specification
Solder ball diameter (mm)                         0.63
Solder ball land (mm, board and package)          0.45
Solder ball height after reflow (mm)              0.48

Figure 20: Constraints on solder ball for FEM simulation

Figure 21: Meshed model of solder ball

Resistance of the solder interconnect was computed by converting the VISCO107 elements to SOLID5 elements after intermediate steps in the deformation. A steady state conductance simulation was run using the deformed geometry after each sub-step. Using the built in macro command GMATRIX the conductance of the solder ball in the deformed state could be calculated. The conductance is the inverse of the resistance. The meshed geometry before deformation can be seen in Figure 21, while the deformed geometry can be seen in Figure 22. Deformation was applied to the solder joint at a specified strain rate of 1 sec^-1 typical of a shock test. An example of this mapping is shown in Figure 23.
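The linear-elastic strain-gage relation quoted in this section can be recovered with a short first-order derivation. The sketch below is illustrative only and rests on assumptions not spelled out in the text: a uniform cylindrical conductor of radius $r$ (a symbol introduced here), constant resistivity, and small elastic strains so that the transverse strain is $-\nu \varepsilon_a$.

```latex
\begin{align}
R &= \frac{\rho L}{A}, \qquad A = \pi r^{2} \\
\ln R &= \ln \rho + \ln L - \ln A \\
\frac{dR}{R} &= \frac{d\rho}{\rho} + \frac{dL}{L} - \frac{dA}{A} \\
\frac{d\rho}{\rho} &= 0, \qquad
\frac{dL}{L} = \varepsilon_a, \qquad
\frac{dA}{A} = 2\,\frac{dr}{r} = -2\nu\,\varepsilon_a \\
\frac{dR}{R_0} &= \varepsilon_a + 2\nu\,\varepsilon_a
\;\;\Longrightarrow\;\;
dR = R_0\,\varepsilon_a\,(1 + 2\nu)
\end{align}
```

Because solder deforms plastically and the ball geometry is not cylindrical, this closed form only motivates the trend; the finite element mapping described above is what relates resistance to plastic strain.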

account for changes in resistance of the entire package. This was achieved by assuming that every interconnect feels the same strain. Therefore the critical resistance is multiplied by the number of I/O in the package, i.e. 676 for the PBGA 676, to obtain the overall critical resistance value (676 × 5×10^-5 Ω = 3.38×10^-2 Ω).
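The package-level threshold arithmetic above can be checked with a few lines of Python. This is a minimal sketch under the text's stated assumption that every interconnect in the daisy chain feels the same strain; the function name and argument names are hypothetical and chosen only for illustration.

```python
def package_critical_resistance(per_joint_critical_ohm: float, n_io: int) -> float:
    """Scale a per-interconnect critical resistance change to the whole package.

    Assumes (as the text does) that all n_io interconnects are in series
    and experience the same strain, so their resistance changes add.
    """
    if n_io <= 0:
        raise ValueError("n_io must be a positive interconnect count")
    return per_joint_critical_ohm * n_io


# Values quoted in the text for the PBGA 676 package.
pbga676_threshold_ohm = package_critical_resistance(5e-5, 676)
```

The same helper can be reused for the other packages on Test Board-B by substituting their I/O counts, provided a per-joint critical value is available for each.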

7. PHM Framework

The strain-resistance relationships have been used to correlate the measured feature vector with the underlying damage state of the system. Feature vectors monitoring system damage have been constructed based on the sensor output (Figure 24). The feature vector is an input into the PHM algorithms.

Figure 24: Flowchart for PHM framework (Feature Vector → PHM Algorithm)

Figure 22: Deformed and undeformed geometry of solder ball

Figure 45: Variation in the sum of beta calculation for variations in the tunable process noise parameter (x-axis: Scale Factor)

The sensitivity study shows that underestimating the critical value of state variable can severely hurt the performance of the PHM algorithm. A physics-based understanding of the degradation mechanism and its relationship to system performance is critical for implementation of the PHM algorithm. Cumulative beta of the process is less sensitive to process noise and therefore was varied over a number of orders of magnitude. An incorrect selection of either critical threshold for state variable or the process noise will have an adverse effect on the performance of the PHM algorithm.

12. Risk-Based Decision Making

The practical result of predicting RUL is to make critical decisions. In the Bayesian framework used in this paper decisions about future use and replacement of a component can be justified using statistics. In an ultra-high reliability system, a critical decision is whether to replace a component. Assume that it takes 1-hour to order and receive a replacement component from the warehouse. Given the level of mission criticality of the application the maximum acceptable probability of failure may be allowed at no higher than 1%. The following calculation can be made to determine when to order the replacement part and schedule downtime for maintenance.

$t_{order} = RUL_{prediction} - 2.576\,\sigma_{RUL} - t_{leadtime}$    (21)

Where $\sigma_{RUL}$ is the standard deviation of the remaining useful life, and $t_{leadtime}$ is the lead time for receiving the component after placement of the order. This equation implemented on the data for the vibration test is shown in Figure 46. The 2.576 parameter indicates the t-statistic for 99% confidence (it equals the two-sided standard normal value).

Figure 46: Time to order replacement component calculation vs time (x-axis: Time [Hr])

13. Summary and Conclusions

A framework for prognostication of area-array electronics has been developed based on state-space vectors from resistance spectroscopy measurements, Kalman Filtering and Bayesian PHM framework. The measured state variable has been related to the underlying damage state by correlating the resistance change to the plastic strain accrued in interconnects using non-linear finite element analysis. The strain-resistance relationship has been used to define the critical resistance failure threshold for the component. The Kalman filter was used to estimate the state variable, rate of change of the state variable, acceleration of the state variable and construct a feature vector. The estimated state-space parameters were used to extrapolate the feature vector into the future and predict the time-to-failure at which the feature vector will cross the failure threshold.
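The estimation and extrapolation steps summarized above can be sketched in Python with NumPy. This is a minimal illustration, not the authors' implementation: the constant-acceleration transition matrix, the diagonal process noise `q`, the measurement noise `r`, the initial covariance, and the function names are all assumptions introduced here, since this excerpt does not give the filter matrices.

```python
import numpy as np


def kalman_second_order(z, dt, q=1e-6, r=1e-4):
    """Estimate [value, rate, acceleration] of a scalar signal z sampled every dt.

    Assumed model: constant-acceleration kinematics, scalar measurement of
    the value only, Q = q*I and R = r (illustrative tuning values).
    Returns an array of shape (len(z), 3) with the filtered state per sample.
    """
    F = np.array([[1.0, dt, 0.5 * dt ** 2],
                  [0.0, 1.0, dt],
                  [0.0, 0.0, 1.0]])          # state transition
    H = np.array([[1.0, 0.0, 0.0]])           # only the value is measured
    Q = q * np.eye(3)
    R = np.array([[r]])
    x = np.array([float(z[0]), 0.0, 0.0])     # start at first reading, at rest
    P = np.eye(3)
    states = []
    for zk in z:
        # Predict one step ahead.
        x = F @ x
        P = F @ P @ F.T + Q
        # Correct with the new measurement.
        innovation = zk - H @ x
        S = H @ P @ H.T + R
        K = P @ H.T @ np.linalg.inv(S)
        x = x + K @ innovation
        P = (np.eye(3) - K @ H) @ P
        states.append(x.copy())
    return np.array(states)


def time_to_threshold(state, threshold):
    """Smallest t >= 0 at which value + rate*t + 0.5*accel*t^2 reaches threshold.

    Returns 0.0 if the value is already at or above the threshold and
    float('inf') if the extrapolated trajectory never reaches it.
    """
    value, rate, accel = state
    c = value - threshold
    if c >= 0:
        return 0.0
    if abs(accel) < 1e-12:
        return -c / rate if rate > 0 else float("inf")
    roots = np.roots([0.5 * accel, rate, c])
    positive = [root.real for root in roots
                if abs(root.imag) < 1e-9 and root.real > 0]
    return min(positive) if positive else float("inf")
```

In use, the last row returned by `kalman_second_order` for a resistance trace would be passed to `time_to_threshold` together with the critical resistance threshold to obtain a remaining-useful-life estimate in the same time units as `dt`. The tuning values matter: the sensitivity discussion above notes that a poor choice of process noise or threshold degrades the prediction.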

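Equation (21) from the risk-based decision section reduces to one line of arithmetic. The sketch below is a hedged illustration: the function and parameter names are hypothetical, the default multiplier 2.576 is the value quoted in the text, and all times are assumed to share one unit (hours here).

```python
def time_to_order(rul_prediction: float, sigma_rul: float,
                  lead_time: float, multiplier: float = 2.576) -> float:
    """Evaluate Eq. (21): t_order = RUL_prediction - multiplier*sigma_RUL - t_leadtime.

    A result at or below zero means the replacement should already
    have been ordered.
    """
    if sigma_rul < 0 or lead_time < 0:
        raise ValueError("sigma_rul and lead_time must be non-negative")
    return rul_prediction - multiplier * sigma_rul - lead_time


def should_order_now(rul_prediction: float, sigma_rul: float,
                     lead_time: float, multiplier: float = 2.576) -> bool:
    """True when the conservative RUL bound no longer covers the lead time."""
    return time_to_order(rul_prediction, sigma_rul, lead_time, multiplier) <= 0.0
```

For example, with a hypothetical predicted RUL of 10 hours, a standard deviation of 1 hour, and the 1-hour lead time assumed in the text, the order would need to be placed within about 6.4 hours. A more risk-averse operator could pass a larger multiplier.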