JOURNAL OF MATERIALS SCIENCE LETTERS 5 (1986) 611-614

Estimation of Weibull parameters using a weight function

BILL BERGMAN
Department of Physical Metallurgy and Ceramics, Royal Institute of Technology, S-100 44 Stockholm, Sweden
The distribution of fracture stresses of brittle materials, e.g. ceramics, is commonly described by Weibull statistics. The Weibull cumulative distribution function [1] is based on the "weakest-link hypothesis" and is given by

$$P = 1 - \exp\left[-\left(\frac{\sigma - \sigma_u}{\sigma_0}\right)^m\right] \qquad (1)$$
where $P$ is the fracture probability, $\sigma$ is the fracture strength, $\sigma_u$ is the "location parameter" or "threshold stress", i.e. a stress below which fracture cannot occur, $\sigma_0$ is the scaling or normalizing parameter and $m$ is the Weibull modulus. This expression can be used for fracture initiated by volume or surface flaws but, of course, the scaling parameter will, for a given material, be different in these two cases. Equation 1 is valid for both uniformly and nonuniformly loaded components. As has been argued by Trustrum and Jayatilaka [2], $\sigma_u$ should be set equal to zero in order to obtain reliable safety factors for design. In the following we shall assume $\sigma_u$ to be equal to zero and write the Weibull function as

$$P = 1 - \exp\left[-\left(\frac{\sigma}{\sigma_0}\right)^m\right] \qquad (2)$$
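As an illustration of Equations 1 and 2, the following minimal Python sketch evaluates the fracture probability for a given stress. The function name `weibull_cdf` and its argument names are assumptions made for this example and do not come from the original work.

```python
import math


def weibull_cdf(stress, sigma_0, m, sigma_u=0.0):
    """Fracture probability P from Equation 1 (Equation 2 when sigma_u = 0)."""
    if stress <= sigma_u:
        # Below the threshold stress no fracture can occur.
        return 0.0
    return 1.0 - math.exp(-(((stress - sigma_u) / sigma_0) ** m))


# At stress = sigma_0 the two-parameter form gives P = 1 - 1/e for any m.
print(round(weibull_cdf(1.0, sigma_0=1.0, m=10.0), 4))  # 0.6321
```

Setting `sigma_u` to zero, as is done in the remainder of the text, reduces the three-parameter form to the two-parameter form.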
(Through a simple linear transformation the three-parameter function, Equation 1, is obtained from the two-parameter function [2], so the results obtained later do not lose anything in generality.)

In order to obtain reliable fracture statistics, $\sigma_u$, $\sigma_0$ and $m$ have to be determined from measurements of fracture stresses obtained, for example, in a four-point bend test. These parameters can be evaluated in different ways. The most common method is linear regression analysis, chosen for its simplicity over direct curve fitting or maximum likelihood. Few authors have been concerned with the merits of the different methods; examples are Heavens and Murgatroyd [3], Trustrum and Jayatilaka [2] and Bergman [4, 5]. The first authors used real strength testing, whereas the three latter references used Monte Carlo simulations. The true merits of different methods are most easily obtained by using data from Monte Carlo simulations.

One problem with the linear regression analysis is that each datum point is given the same weight. It has been shown that this assumption is erroneous [4]. If a linear regression analysis is to be performed correctly, a weight function should be used. The simplest weight function is the step function, which was used by Kamiya and Kamigaito [6]. In an earlier work [4] the present author derived an analytical expression for the appropriate weight function by using the theory of propagation of errors. It is the intention of this work to study the possible advantages of this weight function by using Monte Carlo simulations.

0261-8028/86 $03.00 + .12 © 1986 Chapman and Hall Ltd.

When strength testing a ductile or brittle material we in fact take a random sample from an, in principle, infinite population of specimens. When testing a limited number of specimens we should, of course, not expect to obtain an exact description of the parent population. The consequence is that each sample will have unique $m$ and $\sigma_0$ values, differing from the true values [4, 7]. This behaviour is well known in the statistics literature, see e.g. [8], where it has been demonstrated by drawing random samples from a Gaussian distribution. However, by drawing many samples, i.e. repeating the strength testing with many different samples, we can obtain a distribution of $m$ and $\sigma_0$ values. Hence, by statistics we can obtain an estimate of these parameters.

Besides these natural complications there is the problem of how to describe the distribution of the fracture stresses obtained in an optimal way. It should be pointed out that this distribution is the key to obtaining the fracture probability for a given stress. The intriguing question is what fracture probability expression to assign to each of the fracture stresses ranked in ascending order. The most widely used expression is, regrettably,

$$P_i = \frac{i}{n + 1} \qquad (3)$$

where $n$ is the sample size, i.e. the number of specimens tested, and $P_i$ is the fracture probability of the $i$th ranked specimen. This expression has, however, been shown to give a biased estimate of the Weibull modulus [2, 5]. A more efficient estimator is

$$P_i = \frac{i - 0.5}{n} \qquad (4)$$
which has been shown to give the least-biased modulus of four more or less common estimators [5] and should therefore be preferred. The classic way of obtaining $m$ and $\sigma_0$ is by linearizing Equation 2 as

$$\ln \ln\left(\frac{1}{1 - P}\right) = m \ln \sigma - m \ln \sigma_0 \qquad (5)$$
Then a linear regression analysis is applied to Equation 5. In fact, this method was used for investigating the statistical properties of the estimators in Equations 3 and 4 [5].
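The classic procedure just described can be sketched in a few lines of Python: the strengths are ranked, each rank is assigned a probability from Equation 3 or Equation 4, and an ordinary (constant-weight) least-squares line is fitted to Equation 5. The function names `rank_probabilities` and `fit_weibull_unweighted` are hypothetical and are used only for this illustration.

```python
import math


def rank_probabilities(n, use_half_offset=True):
    """P_i for ranks i = 1..n: (i - 0.5)/n (Equation 4) or i/(n + 1) (Equation 3)."""
    if use_half_offset:
        return [(i - 0.5) / n for i in range(1, n + 1)]
    return [i / (n + 1) for i in range(1, n + 1)]


def fit_weibull_unweighted(strengths, use_half_offset=True):
    """Constant-weight least-squares fit of Equation 5; returns (m, sigma_0)."""
    ranked = sorted(strengths)
    n = len(ranked)
    probs = rank_probabilities(n, use_half_offset)
    x = [math.log(s) for s in ranked]                         # ln(sigma)
    y = [math.log(math.log(1.0 / (1.0 - p))) for p in probs]  # ln ln[1/(1 - P)]
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    slope = sxy / sxx                     # the slope of Equation 5 is m
    intercept = y_mean - slope * x_mean   # the intercept is -m ln(sigma_0)
    return slope, math.exp(-intercept / slope)


# Noise-free check: strengths placed exactly on the m = 10, sigma_0 = 1 curve
# are recovered up to rounding.
probs = rank_probabilities(20)
exact = [(-math.log(1.0 - p)) ** (1.0 / 10.0) for p in probs]
print(fit_weibull_unweighted(exact))
```

For real or simulated data the recovered modulus scatters around the true value, which is the subject of the remainder of the text.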
As discussed in [4], this simplified use of the method of least squares is by no means justified on theoretical grounds. Bevington ([9], p. 187) was quoted as follows: "the method of least squares is built on the hypothesis that the optimum description of a set of data is one which minimizes the weighted sum of squares of deviations of the data $Y_i$ from the fitting function $Y(x_i)$. This sum is characterized by the variance of the fit, which is an estimate of the variance of the data". The weighted sum is given by

$$\kappa^2 = \sum_{i=1}^{n} \frac{1}{S_i^2}\left[Y_i - Y(x_i)\right]^2 \qquad (6)$$
where $S_i$ is the uncertainty (standard deviation) of the measurement of the dependent parameter $Y_i$ ([9], p. 98). The basic assumption is that the uncertainty in the dependent variable is considerably greater than that of the independent variable. It is, however, possible, when the uncertainty of $x_i$ is greater than or equal to that of $Y_i$, to combine these uncertainties and assign them to the dependent quantity. If the uncertainty in $Y_i$ is independent of the value of $Y_i$, e.g. constant or randomly distributed, the fitting function is often easily solved. This is one of the most common assumptions, and it is seldom or never explicitly discussed, at least not in papers dealing with the linear regression analysis of the two-parameter Weibull function [4]. This assumption has been discussed to some extent in a previous work [4], and in the following we shall discuss it in more detail. Writing Equation 2 in the linearized form shown in Equation 5 is equivalent to the following equation, which is linear in the parameters to be determined,

$$y = a + bx \qquad (7)$$
where $y \equiv \ln\{\ln[1/(1 - P)]\}$, $a = -m \ln \sigma_0$, $b = m$ and $x \equiv \ln \sigma$. It is by no means obvious or true that the uncertainty, $S_i$, of $Y_i$ is independent of its magnitude (or that the uncertainty of $\ln \sigma$ does not depend on the stress level). It is more natural to assume the uncertainty of $P$ (or $\sigma$) to be constant or random for any probability (or stress) value. To solve the problem by using these uncertainties we can use the theory of propagation of errors and relate the uncertainty of $Y_i$, i.e. $S_i$, which is not constant or random, to the uncertainty of $P_i$, which is more likely to be constant and which we denote by $S_P$. From Bevington ([9], p. 182) we obtain

$$S_i^2 = \left(\frac{\partial Y_i}{\partial P_i}\right)^2 S_P^2 \qquad (8)$$

which gives

$$S_i^2 = \frac{S_P^2}{\left[(1 - P_i)\ln(1 - P_i)\right]^2} \qquad (9)$$

The weighted sum which should be minimized for the underlying assumptions is

$$\kappa^2 = \sum_i \left[(1 - P_i)\ln(1 - P_i)\right]^2 \left\{\ln \ln\left[\frac{1}{1 - P_i}\right] - \ln \ln\left[\frac{1}{1 - P(\sigma_i)}\right]\right\}^2 \qquad (10)$$

The weight factor $[(1 - P_i)\ln(1 - P_i)]^2$ varies strongly with $P_i$, as is seen in Fig. 1. The weight factor has been scaled to give a maximum value of one and this occurs at $P_i \approx 0.63$. It is now obvious that the common assumption of a constant or random uncertainty, which is the basis for the least squares analysis of Equation 5, should overestimate the effects of data at low fracture probability. However, when doing a strength test or a Monte Carlo simulation it often occurs by chance that the sample does not include the number of low fracture probability specimens it "should" have. In general we cannot therefore say that the Weibull modulus is too high. The asymmetric nature of the weight function could lead us to assume that an asymmetrical omission of data at low and high $P$ should give us a truer result. But for a given sample we cannot know whether this will improve the interpretation of the data or not [4].

Figure 1 The weight function as a function of failure probability (horizontal axis: fracture probability).

In the previous work [4] it was not discussed whether Equation 10 can be minimized to give analytical expressions for the parameters $m$ and $\sigma_0$. In the following we shall see how these can be evaluated. To simplify the expressions, we set $[(1 - P_i)\ln(1 - P_i)]^2$ equal to $W_i$ and by using Equation 7 we can write Equation 10 as

$$\kappa^2 = \sum_i W_i (Y_i - a - b x_i)^2 \qquad (11)$$

The minimum value of $\kappa^2$ is as usual obtained by differentiating Equation 11 with respect to $a$ and $b$, i.e.

$$\frac{\partial \kappa^2}{\partial a} = -2 \sum_i W_i (Y_i - a - b x_i), \qquad \frac{\partial \kappa^2}{\partial b} = -2 \sum_i W_i x_i (Y_i - a - b x_i) \qquad (12)$$

By putting $\partial \kappa^2/\partial a = \partial \kappa^2/\partial b = 0$ and rearranging we obtain

$$b = \frac{\left(\sum x_i Y_i W_i\right)\left(\sum W_i\right) - \left(\sum x_i W_i\right)\left(\sum Y_i W_i\right)}{\sum W_i \sum x_i^2 W_i - \left(\sum x_i W_i\right)^2}, \qquad a = \frac{\sum Y_i W_i - b \sum x_i W_i}{\sum W_i} \qquad (13)$$
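The closed-form solution in Equation 13 is short enough to be written out directly. The sketch below is one possible Python transcription, assuming the strengths are already ranked and paired with their probability estimates; the names `weibull_weight` and `fit_weibull_weighted` are assumptions made for this example. The weight is left unscaled, since a constant factor cancels in Equation 13.

```python
import math


def weibull_weight(p):
    """Unscaled weight W_i = [(1 - P_i) ln(1 - P_i)]^2 from Equation 10."""
    return ((1.0 - p) * math.log(1.0 - p)) ** 2


def fit_weibull_weighted(ranked_strengths, probs):
    """Weighted least-squares fit using Equation 13; returns (m, sigma_0)."""
    x = [math.log(s) for s in ranked_strengths]               # ln(sigma)
    y = [math.log(math.log(1.0 / (1.0 - p))) for p in probs]  # ln ln[1/(1 - P)]
    w = [weibull_weight(p) for p in probs]
    sw = sum(w)
    sxw = sum(wi * xi for wi, xi in zip(w, x))
    syw = sum(wi * yi for wi, yi in zip(w, y))
    sxyw = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    sxxw = sum(wi * xi * xi for wi, xi in zip(w, x))
    b = (sxyw * sw - sxw * syw) / (sw * sxxw - sxw ** 2)
    a = (syw - b * sxw) / sw
    # From Equation 7: b = m and a = -m ln(sigma_0).
    return b, math.exp(-a / b)


# Noise-free check with m = 8 and sigma_0 = 2; both are recovered up to rounding.
n = 10
probs = [(i - 0.5) / n for i in range(1, n + 1)]
strengths = [2.0 * (-math.log(1.0 - p)) ** (1.0 / 8.0) for p in probs]
print(fit_weibull_weighted(strengths, probs))
```

The unscaled weight reaches its largest value, $e^{-2}$, at $P = 1 - 1/e \approx 0.63$, which is where the scaled curve in Fig. 1 equals one.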
We have now obtained a simple way to evaluate $m$ and $\sigma_0$, and the evaluation can be performed without using complex programs. The calculation can easily be performed by a programmable desk calculator.

TABLE I Estimated means and standard deviations of m for different estimators and two weight functions at different sample sizes n

Columns 2 to 5: P_i = i/(n + 1). Columns 6 to 9: P_i = (i - 0.5)/n.

n    m̄_var/m  S_m/m̄_var  m̄_con/m         S_m/m̄_con      m̄_var/m  S_m/m̄_var  m̄_con/m         S_m/m̄_con      1/n^(1/2)
100  0.9818   0.095      0.9516          0.101          0.9976   0.096      0.9967          0.103          0.10
50   0.9633   0.135      0.9275 (0.927)  0.149 (0.149)  0.9953   0.143      1.003 (0.998)   0.149 (0.143)  0.141
40   0.9546   0.149      0.9193 (0.918)  0.166 (0.167)  0.9924   0.153      1.002 (1.002)   0.162 (0.166)  0.158
30   0.9366   0.170      (0.908)         (0.189)        0.9897   0.178      1.003 (1.006)   0.186 (0.186)  0.183
20   0.9171   0.200      (0.890)         (0.240)        0.9895   0.220      (1.011)         (0.230)        0.224
10   0.8646   0.305      (0.869)         (0.333)        1.0148   0.332      (1.062)         (0.330)        0.316

m̄_var and m̄_con are the mean values of m obtained for the weight function and the constant weight factor, respectively. The values in parentheses are from [5]. The last column shows the theoretical coefficient of variation.

The simulation procedure used in this work has been described in an earlier work [5] and will therefore only be briefly discussed. The procedure is as follows. We start by rewriting Equation 2 as

$$\sigma = \sigma_0 \left[\ln\left(\frac{1}{1 - P}\right)\right]^{1/m} \qquad (14)$$

If we regard a large specimen population with a prescribed $m$ value it is seen that random strength values can be obtained from Equation 14 provided random numbers between 0 and 1 are substituted for the fracture probability $P$. A computer program was written, which used random numbers to obtain strength values $\sigma_1, \sigma_2, \ldots, \sigma_n$ for a given $m$ value. The strength values were then ranked in ascending order and fracture probabilities were calculated from Equations 3 and 4, respectively. These ordered strength values were then analysed by a least squares analysis of the linearized expression in Equation 5. The minimization was performed by using a constant weight factor and a variable weight factor, given by Equation 10, respectively. The specific $m$ value obtained for each sample was stored. This procedure was repeated for each combination of estimator and weight factor expression between 1000 and 4000 times depending on the sample size. The generated random samples were of size $n$ = 10, 20, 30, 40, 50 and 100. The random strengths were generated from a distribution with $m = 10$ and $\sigma_0 = 1$. The mean value of $m$, $\bar{m}$, and the standard deviation of $m$, $S_m$, are obtained from
$$\bar{m} = \frac{1}{k}\sum_{i=1}^{k} m_i, \qquad S_m^2 = \frac{1}{k - 1}\sum_{i=1}^{k} (m_i - \bar{m})^2 \qquad (15)$$

where $k$ is the number of samples and $m_i$ is the $m$ value of sample $i$. In Table I the results of the Monte Carlo simulation are shown. The listed values are $\bar{m}/m$ and $S_m/\bar{m}$. For an unbiased estimate $\bar{m}/m$ is expected to be close to unity. It has been shown [10] that the coefficient of variation of $m$, i.e. $S_m/\bar{m}$, is given by $1/n^{1/2}$. In the last column of this table, $1/n^{1/2}$ is given for the different sample sizes and, as is seen, this gives a good description of the experimental data. We can also see that there are just slight differences between the different coefficients of variation of $m$ for a given sample size. From this point of view it should not matter what
estimator or weight factor expression one chooses. However, if we look at the bias effect we can conclude that $P_i = (i - 0.5)/n$ gives a much smaller bias than $P_i = i/(n + 1)$ does. The use of a variable weight factor somewhat improves the data for $P_i = i/(n + 1)$. However, for $P_i = (i - 0.5)/n$ there is no significant improvement by using a variable weight factor instead of a constant weight factor.

It should be pointed out that when we do a Monte Carlo simulation, as described above, the obtained strength values contain no uncertainties. All the uncertainty lies in our estimate of $P$. A reasonable estimate of the probable error of $P$ should be $0.5/n$. It is therefore obvious that when analysing small sample sizes the uncertainty of each individual $P_i$ can become quite big and result in an appreciable uncertainty of the weight factor. This argument should also be true for real strength testing, provided the errors introduced by, for example, specimen preparation and experimental work can be kept sufficiently small. As also found in this work, the experimental $\sigma_0$ values are always close to one, with a coefficient of variation of less than 2%.

By using computer-generated random samples, a more fundamental understanding of the fracture statistics can be obtained. For the estimation of the Weibull modulus two probability expressions, $P_i = i/(n + 1)$ and $P_i = (i - 0.5)/n$, have been used. Linear regression analysis of the linearized Weibull function (Equation 5) was performed using a constant weight factor equal to one and a weight function, respectively. It was shown for $P_i = i/(n + 1)$ that a weight function gave a somewhat smaller bias than the constant weight factor. For $P_i = (i - 0.5)/n$ there was no significant improvement by using a weight function instead of a constant weight factor. For all the experimental conditions it was found that the coefficient of variation of the experimentally obtained $m$ values was given by $1/n^{1/2}$.
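The simulation procedure described above (sampling from Equation 14, ranking, fitting Equation 5 with or without the weight function, and summarizing with Equation 15) can be sketched as follows. This is a minimal illustration and not the original program: the function names, the use of Python's `random` module, the fixed seed and the small number of repetitions are all assumptions made for this example, so the numbers it prints will not reproduce Table I exactly.

```python
import math
import random


def sample_strengths(n, m, sigma_0, rng):
    """Draw n strengths by substituting random P values into Equation 14."""
    strengths = []
    while len(strengths) < n:
        p = rng.random()
        if p > 0.0:  # p = 0 would give zero strength and break ln(sigma)
            strengths.append(sigma_0 * (-math.log(1.0 - p)) ** (1.0 / m))
    return strengths


def fit_modulus(strengths, half_offset, weighted):
    """Estimate m from Equation 13; constant weights reduce it to ordinary least squares."""
    ranked = sorted(strengths)
    n = len(ranked)
    probs = [(i - 0.5) / n if half_offset else i / (n + 1) for i in range(1, n + 1)]
    x = [math.log(s) for s in ranked]
    y = [math.log(-math.log(1.0 - p)) for p in probs]
    w = [((1.0 - p) * math.log(1.0 - p)) ** 2 if weighted else 1.0 for p in probs]
    sw = sum(w)
    sxw = sum(wi * xi for wi, xi in zip(w, x))
    syw = sum(wi * yi for wi, yi in zip(w, y))
    sxyw = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    sxxw = sum(wi * xi * xi for wi, xi in zip(w, x))
    return (sxyw * sw - sxw * syw) / (sw * sxxw - sxw ** 2)


def simulate(n, k, half_offset, weighted, m_true=10.0, seed=0):
    """Return (mean m / true m, S_m / mean m) over k samples of size n (Equation 15)."""
    rng = random.Random(seed)
    ms = [fit_modulus(sample_strengths(n, m_true, 1.0, rng), half_offset, weighted)
          for _ in range(k)]
    m_mean = sum(ms) / k
    s_m = math.sqrt(sum((mi - m_mean) ** 2 for mi in ms) / (k - 1))
    return m_mean / m_true, s_m / m_mean


# Compare the four estimator / weight combinations for n = 20.
for half_offset in (False, True):
    for weighted in (False, True):
        print(half_offset, weighted, simulate(20, 200, half_offset, weighted))
```

Because every call uses the same seed, all four combinations are evaluated on the same random samples, which makes the comparison of bias between them less noisy than independent runs would be.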
References
1. W. WEIBULL, Ingenjörsvetenskapsakademien, Handlingar 151 (1939) 1.
2. K. TRUSTRUM and A. De S. JAYATILAKA, J. Mater. Sci. 14 (1979) 1080.
3. J. W. HEAVENS and P. N. MURGATROYD, J. Am. Ceram. Soc. 53 (1970) 503.
4. B. BERGMAN, "On the Analysis of Brittle Fracture Statistics", TRITA-MAC-0256, April 1985, Materials Centre, Royal Institute of Technology, S-100 44 Stockholm, Sweden.
5. Idem, J. Mater. Sci. Lett. 3 (1984) 689.
6. N. KAMIYA and O. KAMIGAITO, J. Mater. Sci. 19 (1984) 4021.
7. B. BERGMAN, J. Mater. Sci. Lett. 4 (1985) 1143.
8. G. J. HAHN and S. S. SHAPIRO, "Statistical Models in Engineering" (Wiley, New York, 1976) p. 260.
9. P. R. BEVINGTON, "Data Reduction and Error Analysis for the Physical Sciences" (McGraw-Hill, New York, 1978).
10. J. RITTER, N. BANDYOPADHYAY and K. JAKUS, Am. Ceram. Bull. 60 (1979) 788.

Received 1 November and accepted 25 November 1985