
USING PROBABILISTIC METHODS TO DEFINE RELIABILITY REQUIREMENTS FOR HIGH POWER INVERTERS

Russell W. Morris¹ and John M. Fife²
¹ The Boeing Company, Seattle, Washington
² PV Powered, Inc., Bend, Oregon

ABSTRACT

The solar energy industry has reached the point where utilities are employing large multi-megawatt photovoltaic projects to provide power to the utility grid in a variety of configurations. The ability to deliver or connect solar energy to a commercial power grid with high efficiency and over a broad spectrum of environments is critical to the success of the solar utility industry. Reliability and availability analyses aid in evaluating the potential profitability of a project and in optimizing a power plant design by capturing the real-life attributes of the components and subsystems. This paper presents a methodology for modeling system-level power production as a function of time based on concurrent reliability simulation of individual subsystems such as power inverters. Each subsystem model is also time-dependent and depends on understanding the probabilistic nature of the actual failure rates of the components and on endurance testing or field data for the subsystems. The power inverter is used as an example of a detailed time-dependent predictive reliability model. The reliability and availability of the power inverter are essential for continuing improvement of solar power plant efficiency and cost effectiveness over the life of the installation. The authors discuss some of the common failure probability distributions, their application to components, and how these affect such areas as maintenance intervals and the number of expected spares needed. These factors in turn affect the Levelized Cost of Energy (LCOE) of the system and the potential for profitable operation.

Keywords: Reliability, Maintainability, Availability, Spares, Probabilistic, Distributions, Weibull, Normal, Exponential, LCOE, Levelized Cost of Energy

INTRODUCTION

Solar-based photovoltaic (PV) power plant development represents a major shift in the emphasis on reliability and maintainability in commercial power production, one that requires new ways of looking at Levelized Cost of Energy (LCOE) as well as the design of the systems. Boeing and PV Powered have collaborated to develop a multi-megawatt solar photovoltaic power plant under the DOE Solar America Initiative. They are engaged in analysis of plant performance, mandatory reliability, and economics as part of the system-level PV plant design to achieve an acceptable LCOE.

Current power generation in the form of hydroelectric, nuclear, coal and oil fired plants typically deals with major component quantities on the order of 5 or 6 generators in the multi-megawatt arena. Solar power production, however, deals with components that number in the thousands. A 10 MW field, for example, may have more than 800,000 solar cells, 6000 solar arrays and sun trackers, and more than 60 inverters. These very large numbers require the application of statistical analyses and failure distribution characteristics to better project maintenance events, spares requirements and work force demands. Accurate assessments and projections of this kind are important for the solar power industry to achieve the $0.15/kWh needed to be comparable to conventional sources of electric power.

Large-scale photovoltaic power plants employ multiple arrays, inverters, and other balance-of-system components arranged modularly such that single point failures cannot, by themselves or through a propagating fault, cause immediate shutdown of the entire field. Instead, upon failure of a single subsystem, the plant typically suffers only a small fractional loss in power output. This greatly affects the system design for fault management, repair, and the systems architecture needed to maintain power generation.
Although a large percentage of the cost of solar power is associated with the purchase of the PV modules and balance of system, equipment reliability and maintainability are exceptionally important factors and should be given prime consideration when designing a PV plant. Understanding and projecting or predicting the reliability of the system

Reliability of Photovoltaic Cells, Modules, Components, and Systems II, edited by Neelkanth G. Dhere, John H. Wohlgemuth, Dan T. Ton, Proc. of SPIE Vol. 7412, 74120G · © 2009 SPIE · CCC code: 0277-786X/09/$18 · doi: 10.1117/12.826528 Proc. of SPIE Vol. 7412 74120G-1

and when spares are used to reduce downtime of the item under repair, and minimizing the labor needed to perform maintenance, will further reduce the life-cycle cost of solar-based systems. During the course of analysis of a photovoltaic (PV) system, if failure probability is modeled based on a constant hazard rate represented by a Mean-Time-To-Failure (MTTF), spares and maintenance expectations would cause the initial buy of spares to be exceptionally large. The number of annual spares, Ns, is estimated by:

$$N_s = N \lambda t \qquad (1)$$

where N is the number of units in the installed field, λ is the failure rate,

$$\lambda = \frac{1}{\mathrm{MTTF}} \qquad (2)$$

and t is the annual operating time.

For a 6000-unit field, an annual operating time of 3,600 hours, and an MTTF of 30,000 hours, this simple formula results in a spares purchase of roughly 700 units a year. If one assumes a unit-installed cost of $3,000, then the spares cost of just this one unit type for the first year is roughly $2.1M. In reality, most failures do not follow a constant hazard rate model. Instead, wear-out distributions are more realistic, resulting in very few failures in the first year or years. Therefore, the constant hazard rate assumption results in over-prediction of the number of spares required in the first year. Instead, time-dependent analysis of the installed field is necessary to capture its unique infant mortality and wear-out curves, and the effects of changing repair and maintenance rates.
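The constant-hazard-rate estimate of Eqs. 1 and 2 can be checked with a few lines of Python. This is only a minimal sketch: the function name `annual_spares` is hypothetical, and the inputs are the field size, operating time, MTTF, and unit cost quoted in the text. Evaluated exactly, Eq. 1 gives 720 units, which the text rounds to roughly 700.

```python
def annual_spares(n_units: int, mttf_hours: float, annual_hours: float) -> float:
    """Expected annual spares under a constant hazard rate (Eqs. 1 and 2)."""
    failure_rate = 1.0 / mttf_hours                # Eq. 2: lambda = 1 / MTTF
    return n_units * failure_rate * annual_hours   # Eq. 1: Ns = N * lambda * t


# Values quoted in the text: 6000 units, MTTF = 30,000 h, 3,600 h/yr, $3,000 per unit.
spares = annual_spares(n_units=6000, mttf_hours=30_000.0, annual_hours=3_600.0)
first_year_cost = spares * 3_000.0
print(f"{spares:.0f} spares per year, about ${first_year_cost / 1e6:.2f}M")
```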

[Fig. 1. Expected annual maintenance events for Normal vs. exponential distributed failures. Inputs: MTBF = 30,000; CV = 0.20; field population = 6,000. Vertical axis: number of maintenance events (0 to 2500). Horizontal axis: years (0 to 20). Series: 1st failure population, 2nd failure population, exponential estimate.]

Fig. 1 above compares the aforementioned constant hazard rate (exponential) model with a time-dependent failure probability model. The Normal distribution predicts the more common wear-out failures, of which there are almost none until year 4. As equipment is repaired or replaced, the subsequent maintenance events move to the right in time, resulting in the cumulative expected annual maintenance shown. It is clear that, for this type of failure mode (temperature), using time-dependent probabilistic methods will result in much better planning for the maintenance and spares needed to maintain high availability. From a top-level perspective, these types of cost-timing issues can strongly affect the LCOE proposed for a PV-based system.

In approaching reliability analysis of a large-scale photovoltaic power plant, it is also useful to review historic trends in PV plant reliability. Although there are a number of known failure modes for cells, modules, and interconnects [1], as well as those for solar trackers and array drive systems, a disproportionate number of subsystem failures have involved


the power inverter [2]. This, coupled with the fact that a single inverter failure may cause a significant fractional loss of power compared to a single module, for example, suggests designers should pay careful attention to inverter reliability.
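The contrast drawn in Fig. 1 between a constant hazard rate and a Normal wear-out model can be sketched numerically. The following is an illustration under stated assumptions, not the authors' calculation: it uses the MTBF (30,000 hours), coefficient of variation (0.20), and field population (6,000) listed with Fig. 1, assumes 3,600 operating hours per year (borrowed from the spares example), and counts only the first failure of each unit, with no renewals. All function names are hypothetical.

```python
import math


def normal_cdf(t: float, mean: float, sd: float) -> float:
    """CDF of a Normal distribution, written with the error function."""
    return 0.5 * (1.0 + math.erf((t - mean) / (sd * math.sqrt(2.0))))


def first_failures_per_year(n_units: int, mtbf_h: float, cv: float,
                            hours_per_year: float, years: int) -> list[float]:
    """Expected first failures in each year for Normal-distributed wear-out."""
    sd = cv * mtbf_h
    counts = []
    for year in range(1, years + 1):
        t_start = (year - 1) * hours_per_year
        t_end = year * hours_per_year
        share = normal_cdf(t_end, mtbf_h, sd) - normal_cdf(t_start, mtbf_h, sd)
        counts.append(n_units * share)
    return counts


def exponential_failures_per_year(n_units: int, mtbf_h: float,
                                  hours_per_year: float) -> float:
    """Constant-hazard estimate: the same count every year (Eq. 1)."""
    return n_units * hours_per_year / mtbf_h


normal_counts = first_failures_per_year(6000, 30_000.0, 0.20, 3_600.0, 20)
flat_count = exponential_failures_per_year(6000, 30_000.0, 3_600.0)
for year, count in enumerate(normal_counts, start=1):
    print(f"year {year:2d}: normal {count:7.1f}   exponential {flat_count:6.1f}")
```

Under these assumptions the Normal model predicts only a handful of failures in the first three years and a peak near the mean life, whereas the exponential model predicts the same number every year from the start.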

TOP-LEVEL SYSTEM ANALYSIS METHODOLOGY

Time-dependent probabilistic methods were used to assess the reliability of a high power inverter in a multi-megawatt PV application. The approach was one of establishing functional operation in known environmental extremes, taking measurements, and then assessing the reliability. The authors propose using component failure distributions derived from field data or reliability testing to model the field reliability, and extending the model to field logistics to identify the spares and repairs required to keep the field operable and available. Many components in a field are mechanical in nature, such as solar tracking mechanisms and fans (in the inverters), and they tend to follow a Normal failure probability distribution because their principal failure mechanisms are related to wear-out or fatigue. The initial work was done assuming that, for mechanical items, the Normal or Gaussian distribution is appropriate, where the Normal distribution CDF is:

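$$F(\mu, \sigma, t) = \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{t} \exp\left(-\frac{z^2}{2}\right) d\tau, \qquad z = \frac{\tau - \mu}{\sigma} \qquad (4)$$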

For the non-mechanical items, field data suggest a time dependency other than Normal, one that fits a 2-parameter Weibull distribution given as:

$$F(t; \alpha, \beta) = 1 - \exp\left[-\left(\frac{t}{\alpha}\right)^{\beta}\right] \qquad (5)$$

where α is the characteristic life (taken here as the MTTF), t is the operating time, and β > 1.

After performing reliability testing and checking available field data, the 2-parameter Weibull was found to best model current systems and to allow the greatest flexibility in modeling a wide variety of failure modes. In addition, 2-parameter Weibull distributions are satisfactory for approximating Normal-distribution mechanical wear-out and failures of solar tracker drive circuitry and the peak-power-trackers on the arrays by using a β somewhere between 3 and 5.
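To illustrate how the shape parameter β changes the behavior of Eq. 5, the short sketch below evaluates the Weibull CDF for a constant-hazard case (β = 1) and for a wear-out case with β in the 3 to 5 range mentioned above. The function names and the sample value α = 30,000 hours are illustrative assumptions. The sketch also reports the exact Weibull mean, α·Γ(1 + 1/β), which equals α only when β = 1 and stays within roughly 10% of α for β between 3 and 5.

```python
import math


def weibull_cdf(t: float, alpha: float, beta: float) -> float:
    """Two-parameter Weibull CDF, Eq. 5: F = 1 - exp[-(t/alpha)^beta]."""
    if t <= 0.0:
        return 0.0
    return 1.0 - math.exp(-((t / alpha) ** beta))


def weibull_mean(alpha: float, beta: float) -> float:
    """Exact mean of a two-parameter Weibull: alpha * Gamma(1 + 1/beta)."""
    return alpha * math.gamma(1.0 + 1.0 / beta)


alpha_h = 30_000.0  # illustrative characteristic life in hours
for beta in (1.0, 3.5, 5.0):
    early = weibull_cdf(0.5 * alpha_h, alpha_h, beta)
    mean_h = weibull_mean(alpha_h, beta)
    print(f"beta = {beta}: F(alpha/2) = {early:.3f}, mean life = {mean_h:.0f} h")
```

With β = 1 a large share of units has already failed at half the characteristic life, while for β of 3.5 or 5 the early failure fraction is small, which is the wear-out behavior that lets the Weibull stand in for a Normal distribution.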

EFFECT OF MAINTENANCE ON AVAILABILITY

A specific consideration in the design of large solar power plants is the large (latitude-dependent) amount of down time during nighttime hours and when weather prevents the field from generating significant power. This makes maintenance much easier to do, and there are few repairs that take more than a couple of hours. The Inherent Availability (Ai) of the field is:

$$A_i = \frac{\mathrm{MTTF}}{\mathrm{MTTR} + \mathrm{MTTF}} \qquad (6)$$

for inherent design attributes, and the Operational Availability (Ao) is

$$A_o = \frac{\mathrm{Uptime}}{\mathrm{Uptime} + \mathrm{Downtime}} \qquad (7)$$

where Uptime is the total time that the unit is operable, and Downtime is the total time that the unit cannot produce power per specification, ignoring other factors such as weather.


Operational Availability (Ao) may be significantly improved by performing maintenance during non-operational times, such as nighttime or periods of thick overcast that prevent power generation.
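A small numerical sketch of Eqs. 6 and 7 shows why shifting repairs into non-generating hours matters. The failure count, repair duration, and generating hours below are hypothetical values chosen only for illustration, and the function names are not from the paper.

```python
def inherent_availability(mttf_h: float, mttr_h: float) -> float:
    """Eq. 6: Ai = MTTF / (MTTR + MTTF)."""
    return mttf_h / (mttr_h + mttf_h)


def operational_availability(uptime_h: float, downtime_h: float) -> float:
    """Eq. 7: Ao = Uptime / (Uptime + Downtime)."""
    return uptime_h / (uptime_h + downtime_h)


# Hypothetical year: 3,500 generating hours, 3 failures, 4 hours per repair.
generating_hours = 3_500.0
repair_hours = 3 * 4.0

# Case 1: every repair interrupts generation.
ao_daytime = operational_availability(generating_hours - repair_hours, repair_hours)

# Case 2: every repair is deferred to night, so no generating time is lost.
ao_night = operational_availability(generating_hours, 0.0)

print(f"Ai for MTTF = 30,000 h and MTTR = 4 h: {inherent_availability(30_000.0, 4.0):.5f}")
print(f"Ao with daytime repairs: {ao_daytime:.5f}")
print(f"Ao with night repairs:   {ao_night:.5f}")
```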

ENVIRONMENTAL EFFECTS

The environmental stresses PV systems endure (temperature, humidity, corrosives, salt, hail, ice, rain, dust, etc.) are also a strong function of geographic location. Therefore, the first step in this analysis is to obtain and understand the environmental stress at a target geographic location [9] over the planned period of life for the field (Table 1).

Table 1. Environmental assessment for a Southwest region of the U.S. (courtesy wunderground.com)

Year   Max Temp   Days Temp   Low Temp   Days Windspeed   Max Gust      Rain   Tstorm, Fog
       (°F)       >100 °F     (°F)       >20              Speed (mph)   Days   or Snow
2007   118        91          14         160              60            27     0
2006   114        100         28         167              54            30     0
2005   117        78          28         148              52            27     18
2004   111        88          24         147              54            26     9
2003   115        105         28         141              53            27     0
2002   113        89          21         146              64            24     4
2001   115        90          28         136              51            30     7

For the case of thermal stress, many subsystem component temperatures do not necessarily track ambient conditions. Solar heating, self-heating, conduction, and convection (both passive and active) can, for example, add to or significantly influence the component temperatures. Therefore, a comprehensive thermal model must be developed that produces realistic component temperatures as a function of time. This amounts to a complete time-dependent thermal simulation of the critical components. For example, if an inverter has an active cooling system, a thermal simulation may be constructed that takes into account forced convection and its effect on component temperatures given a time-dependent ambient temperature profile. One approach is to use a custom Matlab™ program to solve the heat transfer equations (convection, conduction, and heating rate) given an input file containing a set of component properties, thermal interaction parameters, and cooling control law parameters.

There is also an important secondary purpose for time-dependent thermal modeling. It allows the cooling control system of the equipment to be simulated during the design phase so that its performance can be assessed and optimized for a variety of geographic locations.
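The kind of time-dependent thermal simulation described above can be sketched with a single lumped thermal mass and a thermostat-controlled fan. This is not the authors' Matlab program: it is a minimal Python illustration in which every parameter (heat capacity, dissipation, conductances, fan set points, the sinusoidal ambient profile, and the daylight-only power profile) is a hypothetical placeholder. It integrates C dT/dt = P(t) - UA (T - T_amb(t)) with an explicit Euler step.

```python
import math


def ambient_temp_c(hour_of_day: float) -> float:
    """Hypothetical sinusoidal ambient profile in degC, peaking at 15:00."""
    return 35.0 + 10.0 * math.cos(2.0 * math.pi * (hour_of_day - 15.0) / 24.0)


def simulate_component_temp(
    days: int = 2,
    dt_s: float = 60.0,
    heat_capacity_j_per_k: float = 20_000.0,
    dissipation_w: float = 300.0,
    ua_passive_w_per_k: float = 5.0,
    ua_fan_w_per_k: float = 15.0,
    fan_on_c: float = 70.0,
    fan_off_c: float = 60.0,
) -> list[tuple[float, float]]:
    """Return (elapsed hours, component temperature in degC) samples."""
    temp_c = ambient_temp_c(0.0)
    fan_running = False
    history = []
    steps = int(days * 24 * 3600 / dt_s)
    for i in range(steps):
        hour = (i * dt_s / 3600.0) % 24.0
        # Assumed power profile: heat is dissipated only during daylight.
        power_w = dissipation_w if 6.0 <= hour < 18.0 else 0.0
        # Cooling control law: thermostat with hysteresis.
        if temp_c > fan_on_c:
            fan_running = True
        elif temp_c < fan_off_c:
            fan_running = False
        ua = ua_fan_w_per_k if fan_running else ua_passive_w_per_k
        net_heat_w = power_w - ua * (temp_c - ambient_temp_c(hour))
        temp_c += net_heat_w / heat_capacity_j_per_k * dt_s
        history.append((i * dt_s / 3600.0, temp_c))
    return history


history = simulate_component_temp()
peak_c = max(temp for _, temp in history)
print(f"peak component temperature: {peak_c:.1f} degC")
```

Changing the fan set points or conductances and rerunning the loop is one simple way to explore a cooling control law for a different ambient profile.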

TIME-DEPENDENT PROBABILISTIC RELIABILITY SIMULATION

The second step of this analysis involves calculating subsystem reliability while taking into account the component stresses. Life-stress relationships are used to accomplish this. This paper focuses on the effects of temperature on power inverter components. It was found that, for some inverter components, the Arrhenius-Weibull relationship, which combines the Weibull distribution with an Arrhenius dependence for characteristic life, fits electronic components very well. This relationship is given by:

$$F(t, T) = 1 - \exp\left[-\left(\frac{t}{\alpha(T)}\right)^{\beta}\right], \quad t > 0 \qquad (8)$$

where the characteristic life, α = MTTF, is given by

$$\alpha(T) = \alpha(T_0) \exp\left[\frac{A_e}{k}\left(\frac{1}{T} - \frac{1}{T_0}\right)\right] \qquad (9)$$

Above, β is the Weibull shape parameter, Ae is the activation energy, k is Boltzmann’s constant, T is a constant absolute temperature in kelvin, and T0 is the reference temperature at which α(T0) is specified. It is important to note that this approach also works with a wide variety of life-stress relationships. Others, particularly the Inverse-Power-Law-Weibull, are useful for modeling different types of stresses. For example, the Coffin-Manson relationship, commonly used to model low-cycle thermal fatigue, takes an inverse-power form with respect to the temperature excursion.

Basic life-stress relationships like the one presented here generally assume the stresses on the components to be constant. In reality, thermal stress on equipment usually changes with time, as is easily verified by monitoring component temperatures during normal operation. PV Powered equipment is directly exposed to the environment and sees especially high daily and seasonal variation in thermal stress. Therefore, an average stress value is very difficult to calculate for any given geographic location. The present method captures this time-varying stress for a given geographic location by using the aforementioned life-stress relationships in conjunction with a cumulative exposure (damage) model [5]. Using a Weibull-based life-stress relationship in a cumulative exposure construct gives

$$F(t_1, T(t)) = 1 - \exp\left[-\varepsilon(t_1)^{\beta}\right], \quad t_1 > 0 \qquad (10)$$

where

$$\varepsilon(t_1) = \int_{0}^{t_1} \frac{dt}{\alpha(T(t))} \qquad (11)$$
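Eqs. 9 through 11 can be evaluated numerically by summing dt / α(T(t)) over a sampled temperature history. The sketch below is an illustration rather than the authors' Matlab code: the function names are hypothetical, the integral is approximated by a simple rectangle rule, and the daily temperature profile is invented. The life-stress parameters (β = 5, α(T0) = 1.0e5 hr, T0 = 40 degC, Ae = 0.3 eV) are those listed for Comp_10_FM1 in Table 2.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann's constant in eV/K


def characteristic_life(temp_c: float, alpha0_h: float, t0_c: float, ae_ev: float) -> float:
    """Eq. 9: Arrhenius scaling of the Weibull characteristic life."""
    temp_k = temp_c + 273.15
    t0_k = t0_c + 273.15
    return alpha0_h * math.exp((ae_ev / BOLTZMANN_EV_PER_K) * (1.0 / temp_k - 1.0 / t0_k))


def cumulative_exposure(temps_c: list[float], dt_h: float,
                        alpha0_h: float, t0_c: float, ae_ev: float) -> float:
    """Eq. 11 by the rectangle rule: sum of dt / alpha(T) over the samples."""
    return sum(dt_h / characteristic_life(t, alpha0_h, t0_c, ae_ev) for t in temps_c)


def failure_probability(exposure: float, beta: float) -> float:
    """Eq. 10: F = 1 - exp(-exposure^beta)."""
    return 1.0 - math.exp(-(exposure ** beta))


# Hypothetical hourly component temperatures for one day (degC).
daily_profile_c = [45.0 + 20.0 * math.sin(2.0 * math.pi * (h - 9.0) / 24.0) for h in range(24)]

# Exposure is additive, so a repeating day can be scaled by the number of days.
exposure_per_day = cumulative_exposure(daily_profile_c, 1.0, alpha0_h=1.0e5, t0_c=40.0, ae_ev=0.3)
for years in (5, 10, 20):
    exposure = exposure_per_day * 365.0 * years
    print(f"{years:2d} yr: F = {failure_probability(exposure, beta=5.0):.4f}")
```

A useful sanity check is the constant-temperature case: holding T = T0 for α(T0) hours gives an exposure of 1 and therefore F = 1 - exp(-1), which matches Eq. 8.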

To calculate availability (A) for the subsystem, a number of methods may be used. One of the most common is Monte Carlo. In the Monte Carlo method, F(t1, T(t)) is sampled using a uniform random variable to determine a time-to-failure (TTF) for the first sample. Then, the time-to-repair (TTR) is calculated in a similar manner and added to the total time. The routine is repeated until the simulation time is reached and there is a sufficiently large number of samples. Operational Availability can then be calculated as the uptime divided by the total simulation time [6], which contains both uptime and downtime (Eq. 7). Note that this requires the model to contain non-design elements such as the time to reach the field, delays in getting parts, and the potential for performing maintenance at night or at other times when the power station is off the grid.

It is notable that the life-stress / cumulative exposure method presented here is also useful for analyzing accelerated test data. In particular, laboratory step stress testing can be a very time-efficient accelerated test method for high-value equipment such as large inverters. To simulate a step stress accelerated life test with this methodology, simply repeat this process with the appropriate thermal profile as an input.
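The Monte Carlo procedure described above can be sketched as follows. To keep the example short it makes several simplifying assumptions that are not from the paper: the stress is constant, so the characteristic life α is fixed; each repair restores the unit to as-good-as-new; and the repair time is a fixed value rather than a sampled one. Times to failure are drawn by inverting the Weibull CDF, t = α(-ln(1 - u))^(1/β). Under time-varying stress one would instead solve ε(t1) = (-ln(1 - u))^(1/β) for t1 using Eq. 11. The function names are hypothetical, and the example parameters are those of Comp_12_FM1 in Table 2 with a 48-hour repair.

```python
import math
import random


def sample_weibull_ttf(alpha_h: float, beta: float, rng: random.Random) -> float:
    """Inverse-transform sample of a Weibull time-to-failure."""
    u = rng.random()  # uniform on [0, 1)
    return alpha_h * (-math.log(1.0 - u)) ** (1.0 / beta)


def simulate_availability(alpha_h: float, beta: float, repair_h: float,
                          horizon_h: float, n_runs: int, seed: int = 0) -> float:
    """Average fraction of the horizon spent operable (Eq. 7)."""
    rng = random.Random(seed)
    total_uptime_h = 0.0
    for _ in range(n_runs):
        clock_h = 0.0
        while clock_h < horizon_h:
            ttf_h = sample_weibull_ttf(alpha_h, beta, rng)
            # Count uptime only up to the end of the simulated horizon.
            total_uptime_h += min(ttf_h, horizon_h - clock_h)
            clock_h += ttf_h + repair_h  # the repair interval is downtime
    return total_uptime_h / (n_runs * horizon_h)


twenty_years_h = 20 * 8760.0
availability = simulate_availability(alpha_h=2.0e5, beta=1.0, repair_h=48.0,
                                     horizon_h=twenty_years_h, n_runs=2000)
print(f"estimated availability: {availability:.5f}")
```

Logistics delays or night-time repair windows of the kind mentioned in the text could be added by replacing the fixed `repair_h` with a sampled or schedule-dependent downtime.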

GROUND RULES AND ASSUMPTIONS

Although the approach presented in this paper is very useful when applied to wear-out failures, it does not address issues with unpredictable infant mortality (early failures), manufacturing-induced failures, or failures that result from doing maintenance. In addition, phenomena such as lightning effects are specifically dependent upon each design and are not considered in this analysis, although they can easily be added within the framework as constant-hazard-rate (β=1) failure modes.

The major enabler for the application of probabilistic methods is field data and/or developmental reliability test data. Systems must be “designed for reliability”: reliability must be designed in, and sufficient reliability data are needed to establish the appropriate model for the equipment. In addition, no provisions are made for the impact of scheduled or preventive maintenance in the analysis presented here, although it is easily incorporated.

INVERTER ANALYSIS EXAMPLE

As an example of this analysis process, a hypothetical solar power inverter with active cooling control will be used. The inverter power profile is based on a fixed-mount installation. The assumed geographic location is Needles, CA.


[Fig. 2. First two phases of subsystem (inverter) reliability analysis for a limited case considering only temperature stresses. Blue boxes represent simulation outputs. Panels shown: LOCATION DATA (average hourly temperature for Needles, CA, in degC by month versus UTC time in hours; latitude; altitude); SUBSYSTEM THERMALS (heat transfer coefficients, active thermal control parameters); OTHER CONFIGURATION DATA (crystalline / thin film, VOC_max, fixed / 1-axis / 2-axis); TIME-DEPENDENT STRESS (THERMAL) SIMULATION; COMPONENT TEMPERATURES (July, Needles, CA, in degC versus UTC time in minutes); COMPONENT LIFE-STRESS RELATIONSHIP fcomponent(t,T,ΔT); REPAIR TIME DISTRIBUTIONS mcomponent(t); MAINTENANCE SCHEDULE; TIME-DEPENDENT PROBABILISTIC RELIABILITY SIMULATION; SIMULATION OUTPUTS (component reliability, subsystem reliability, subsystem availability), including cumulative failure probability versus time (yr) for Needles, CA.]


Based on data gathered during preliminary design and test/evaluation, the component temperature versus time is calculated for each of the 12 months of the year using the Matlab time-dependent thermal simulation previously described. The resulting inverter component temperatures for an average day in July are shown in Fig. 3.

[Fig. 3. Simulated inverter component temperatures in July in Needles, California, assuming a fixed-mount array. Vertical axis: temperature (degC), 20 to 120. Horizontal axis: time (minutes), 240 to 1440. Legend entries recovered: Component_1 to Component_7 and Component_10 to Component_12.]

The abrupt changes in temperature during the morning startup (UTC time ~800 minutes) are in response to the power-on transient and the activation of the cooling control system as component temperature set points are reached.

For the second step of the analysis process, the reliability simulation, three failure modes of the inverter were modeled. The failure modes are associated with components 1, 10, and 12. For each failure mode, and based on general characteristics known for the parts, an Arrhenius-Weibull life-stress model is assumed. The model parameters for each of these are given in Table 2.

Table 2. Arrhenius-Weibull life-stress parameters for the inverter reliability example.

Failure Mode    β    α(T0) (hr)   T0 (degC)   Ae (eV)
Comp_1_FM1      1    1.0e6        100         1.0
Comp_10_FM1     5    1.0e5        40          0.3
Comp_12_FM1     1    2.0e5        48          1.0

Two failure modes in this example (Comp_1_FM1 and Comp_12_FM1) are assumed to have a random failure distribution (β=1). Comp_10_FM1, on the other hand, has distinct wear-out properties. A Matlab program was written to calculate the cumulative exposure integral and generate the cumulative failure probability. The results are shown in Fig. 4 for a mission life of 20 years.

Next, a Monte Carlo approach, again programmed in Matlab, was used to estimate the average downtime due to failure of each component. A uniform maintenance time distribution of 48 hours was assumed for simplicity, and no scheduled maintenance is performed. (For PV systems, preventive and scheduled maintenance can often be performed during times that the plant is not generating, thus contributing to operating cost but not availability.) Fig. 5 shows the average annual unexpected downtime for the example inverter broken down by failure mode, as well as the total expected average downtime of the inverter subsystem, which is simply the sum of the component downtimes.
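The text does not state how the failure-mode curves are combined into the inverter-level curve of Fig. 4. One common choice, shown here purely as an assumption, is a series model with independent failure modes, in which the inverter survives only if every failure mode survives: F_inverter = 1 - Π(1 - F_i). The function name is hypothetical.

```python
def combine_failure_modes(mode_probabilities: list[float]) -> float:
    """Series-system failure probability, assuming independent failure modes."""
    survival = 1.0
    for probability in mode_probabilities:
        survival *= 1.0 - probability  # the inverter survives only if every mode survives
    return 1.0 - survival


# Illustrative (not computed) failure probabilities for three modes at one point in time.
print(f"{combine_failure_modes([0.02, 0.10, 0.05]):.4f}")
```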


[Fig. 4. Example calculation of cumulative failure rates for three inverter failure modes and the overall inverter subsystem. Vertical axis: cumulative failure probability, 0 to 1. Horizontal axis: time (yr), 0 to 20. Series: Comp_1_FM1, Comp_10_FM1, Comp_12_FM1, INVERTER.]

[Fig. 5. Average downtime per year due to inverter failures calculated for the example inverter. Vertical axis: downtime (hr), 0 to 12. Horizontal axis: time (yr), 0 to 20. Series: Comp_1_FM1, Comp_10_FM1, Comp_12_FM1, TOTAL.]

[Fig. 6. Availability of the example inverter based on downtime due to failure. Vertical axis: availability, 0.9988 to 1. Horizontal axis: time (yr), 0 to 20.]

Availability is then calculated as a function of time based on the downtime and is shown in Fig. 6. Note that there is a distinct initial reduction in availability corresponding to the mean wear-out time of Comp_10_FM1. This suggests that preventive maintenance activities (involving much lower downtime) may be employed to improve overall availability.

SYSTEM-LEVEL ANALYSIS

The above analysis is an example for one subsystem. A multi-megawatt PV plant would be expected to have multiple inverters, many modules, and other components. The same analysis presented above may be applied to each of those subsystems. Classical methods of system-level reliability analysis may then be used to estimate the overall performance of the system, taking into account downtime due to failures and other causes of subsystem unavailability.

The method applied above in the inverter example may also be applied to a large system and to any geographic location where stress (temperature) data are available. By performing the same analysis for two other geographic locations, a comparison of relative inverter availability may be made. Table 3 shows the 20-year average availability for Needles and two very different climates (San Jose, California and Bozeman, Montana) computed using the same method as above.

Once the subsystem availabilities are known, the effect of subsystem failure on energy harvest can be determined. As an example, take a 10 MW peak PV power plant with 5 MW average output and 3500 hours of generating time per year. Assuming the inverter is a dominant cost element, the annual energy loss is estimated as the product of the inverter unavailability, the average power output, and the number of generating hours per year. Multiplying the energy loss by an average cost of electricity of $0.05 per kWh yields the annual cost of downtime due to inverter failures. These values are shown in Table 3.

The results from the type of analysis presented here can be used to compare various inverter reliability scenarios and determine their impact on system reliability, availability, maintenance, and spares costs. Such a trade study is then used to define specific reliability requirements for inverters that allow the overall system goals of the PV plant to be met.

Table 3. Average predicted inverter availability and resulting MWh lost annually for a hypothetical 10 MW PV power plant.

Location               Avg. Avail.   MWh Lost Annually   $ Lost Annually
Needles, California    0.9933        117                 $6,000
San Jose, California   0.9956        77                  $4,000
Bozeman, Montana       0.9971        51                  $3,000
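The energy-loss arithmetic behind Table 3 can be reproduced directly from the quantities given in the text: unavailability times 5 MW average output times 3500 generating hours, priced at $0.05 per kWh. The function names below are hypothetical. The computed energy losses round to the MWh values in Table 3, and the computed costs are consistent with the table's dollar figures, which appear to be rounded up to the nearest thousand dollars.

```python
def annual_energy_loss_mwh(availability: float, avg_power_mw: float,
                           generating_hours: float) -> float:
    """Energy lost per year: unavailability x average power x generating hours."""
    return (1.0 - availability) * avg_power_mw * generating_hours


def annual_cost_usd(loss_mwh: float, price_usd_per_kwh: float) -> float:
    """Cost of the lost energy, converting MWh to kWh."""
    return loss_mwh * 1000.0 * price_usd_per_kwh


table_3_availability = {
    "Needles, California": 0.9933,
    "San Jose, California": 0.9956,
    "Bozeman, Montana": 0.9971,
}

for location, availability in table_3_availability.items():
    loss = annual_energy_loss_mwh(availability, avg_power_mw=5.0, generating_hours=3500.0)
    cost = annual_cost_usd(loss, price_usd_per_kwh=0.05)
    print(f"{location}: {loss:.1f} MWh lost, about ${cost:,.0f} per year")
```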

CONCLUSIONS

The analysis method presented here has an important advantage over other reliability analyses in that it takes into account time-varying stresses, such as daily and hourly changes in equipment temperature, and adjusts reliability estimates accordingly. For solar power plants, which are typically exposed to the environment, this is a powerful approach for producing more relevant and accurate failure rates across a wide variety of environments and for properly assessing the maintenance and spares costs affected by operating in such environments. Once this method is performed for each subsystem in a PV plant, the resulting reliability predictions can be combined to assess system-level performance, taking into account equipment downtime. These system-level insights are helpful in plant design, when planning equipment maintenance, and when budgeting for spares.

REFERENCES

[1] Dhere, N. G., "Reliability of PV modules and balance-of-system components," Conference Record of the Thirty-first IEEE Photovoltaic Specialists Conference, pp. 1570-1576, 3-7 Jan. 2005.
[2] Laukamp, H., ed., "Reliability Study of Grid Connected PV Systems: Field Experience and Recommended Design Practice," Task 7 Report IEA-PVPS T7-08: 2002, International Energy Agency, Fraunhofer Institut für Solare Energiesysteme, Freiburg, Germany, March 2002.
[3] SEMI E10-0304E, Specification for Definition and Measurement of Equipment Reliability, Availability, and Maintainability (RAM), Semiconductor Equipment and Materials International, 3081 Zanker Road, San Jose, CA 95134.
[4] "IEEE standard definitions for use in reporting electric generating unit reliability, availability, and productivity," IEEE Std 762-2006 (Revision of IEEE Std 762-1987), pp. C1-66, March 15, 2007.
[5] Nelson, W. B., Accelerated Testing (Statistical Models, Test Plans, and Data Analysis), Wiley-Interscience, 2004.
[6] Carazas, F. and Souza, G., "Availability Analysis of Gas Turbines Used in Power Plants," Int. J. of Thermodynamics, ISSN 1301-9724, Vol. 12 (No. 1), March 2009.
[7] Fife, John M. and Morris, Russell W., "System Availability Analysis for a Multi-Megawatt Photovoltaic Power Plant," PVSC Conference, June 18, 2009.
[8] Rice, John, Mathematical Statistics and Data Analysis (Second ed.), Duxbury Press, 1995.
[9] Weather Data, Table 1, courtesy wunderground.com, used with permission, http://www.wunderground.com/US/Region/Southwest/Fronts.html

