on Guality, Reliability & Engineening Statistics. SYSTEM AND. BAYESIAN.
RELIABILITY. Essays in Honor of. Professon Richard E. Barlow on His TOth
Birthday.
gt a t i s t i c s y E n g i n e e n i nS o n G u a l i t y ,R e l i a b i l i t &
SYSTEMAND BAYESIAN RELIABILITY Essaysin Honorof ProfessonRichardE. Barlow on His TOthBirthday
Editol's
Yu Hayakawa Víctoria [.]niaersity of Wellington, Neu Zealand
Telbalrony Food and. Drug Adminístration, USA
Min Xie National Uníuersíty of Singapore, Singapore
Publishedby World Scientific PublishingCo. ttp. Ltd. P O Box 128,FanerRoad,Singapole912805 USAffice: Srnte18, 1060Main Steer, River Edge,NJ 07661 UK offtce: 57 SheltonStrcet,CoventGàrden,London WC2H 9HE
British Library Caúaloguing-in-hrblication Data, A cataloguerecord for this book is availablefiom the British Librarv.
SYSTEM AND BAYESIAN RELIABILITY Copyright @2001 by Wodd Scientific publishingCo. pte. Ltd. All rights reserved.Thisboolç or parts thzreof,may not be reproducedin anyÍorm or by anymeans, electronicor mechnnical,includingphotocopying,recordingor any informationstorage'andretrieval systemnow kno'renor to be invented,without witten permissionfrom the publisher.
For photocopying of material in this volume, please pay a copying fee through úe copyright clearancecenter,lnc.,222 RosewoodDrive, Danvers,MAolgza,uS,q,. In this Ãe permissionto photocopy is not required from the publisher.
rsBN 981-02-4865-2
Printedin Singaporeby World Scienüficprinters
CHAPTER 17 A \MEIBULL \MEAROUT TEST: FULL BAYESIAN APPROACH
Telba Zalkind Irony Food anil Drug Ad.m'í.nistration 1350 Piccard Dr., HFZ-il12, Roclati,lle, MD 20850, U.S.A. E - m ail : tzi @cd.rh.f do. gou
Marcelo Lauretto*
and Carlos Alberto de Bragança PereiraÏ
Au. Afrânio Peinoto 188, CEP 05507-000 Butantan São Paulo, SP - Brasil * O-mail: lauretto@ suprernurn. corn, E- m ail : I cpereira@ime. usp. br
Julio Michael Stern Deparhnent of Mathematics and, Statistics, SUNY - Binghamton Binghamton NY 13902-6000, U.S.A. E-mail:
[email protected]
The F\rll Bayesia.n Significance Test (FBST) for precise hypotheses is presented, with some applications relevant to reliability theory. The FBST is an alternative to significance tests or, equivalently, to p-ualue.s. In the FBST we compute the evidence of the precise hypothesis. This evidence is the probability of the complement of a credible set "tangent" to the sub-manifold (of the para,rreter space) that defines the null hypothesis. We use the FBST in an application requiring a quality control of used components, based on remaining life statistics.
1. Introduction The Full Bayesian Significa,nce Test (FBST) is presented in Pereira and Stern (1999b) as a coherent Bayesian significance test. The FBST is intuitive and has a geometric characterization. It can be easily implemented us-
287
T. Z. Irony, M. Lauretto, C. A. B. Pereira and J. M. Stern
ing modern numerical optimization and integration techniques. The method is "Full" Bayesian and is based on the analysis of credible sets. By Full we mean that we need only the knowledge of the parameter space represented by its posterior distribution. The FBST needs no additional assumption, like a positive probabiÌity for the precise hypothesis, that generates the Lindley's paradox effect. The FBST regards likelihoods as the proper means for representing statistical information, a principle stated by Royall (1997) to simplify and unify statistical analysis. Another important aspect of the FBST is its consistency with the "benefit of the doubt" juridical principle. These remarks will be understood in the sequel. Significance tests are regarded as procedures for measuring the consistency of data with a null hypothesis, Cox (1977) and Kempthorne and Folks (1971). p-ualues are a tail area under the null hypothesis, calculated in the sample space, not in the parameter space where the hypothesis is formuÌated. Bayesian significance tests defined in the literature, like Bayes Factor or the posterior probability of the null hypothesis, consider the p-ualue as a measure of evidence of the null hypothesis and present alternative Bayesian measuresof evidence, Aitkin (1991), Berger and Delampady (1987), Berger et aI. (1997),Irony and Pereira (1986, 1995), Pereira and Wechsler (1993), Sellke eú al. (1999). As pointed out in Cox (7977), the first difficulty to define the p-ualue is the way the sample space is ordered under the null hypothesis. Pereira and Wechsler (1993) suggests a p-ualue that aÌways regards the alternative hypothesis. One can find a great deal of objections agaist each of these measures of evidence. The most important argument against Bayesian tests for precise hypothesis is presented by Lindley (1957). The literature is full of objections to the classicalp-ualue. The book by Royall (1997) and its review by Vieland et aI. (1998) presents interesting and relevant arguments motivating statisticians to start thinking about new methods of measuring evidence. In a more philosophical terms, Carnap (1962), de Finetti (1939), Good (1983) and Popper (1989) discuss, in a great detail, the concept of evidence. 2. Motivation In order to illustrate the FBST we discus a well known problem. Given a sample from a normal distribution with unknown parameters, we want to test if the standard deviation is equal to a constant. The hypothesis
WeibuII Wearout Test: Full Bayesian Approach
cis a straight line. we have a precise hypothesis since it is defined by o: a manifold (surface) of dimension (one) strictly smaller than the dimension of the parameter space (two). It can be shown that the conjugate family for the Normal Distribution is a family of bivariate distributions, where the conditional distribution of the mean, p,, for a fixed precisiot, P : lf o2, is normal' and the marginal distribution of the precision, p, is gamma, DeGroot (1970), Lindley (1978). 'We use the standard improper priors, uniform on ] oo, *oo[ for p', and 1f p on ]0, +m[ for p, in order to get a fair comparison with p-values, DeGroot (1970). Hence we have the parameter space, hypothesis and posterior joint distribution: O : { ( t r ,p ) e R x à + } ,
Oo: {jr,P)€ OIP:
s}
f 0r, plr) o Jp erp(-np(p, m)2lz)erp(-bp)p"-'
n : l r t . . . r n l, o : +
, rrl::0"t,
b:liA,-*)2
Figure 1 shows the plot of some ,"t"i:"..,,.rn"' of the posterior density function, including the level curve tangent to the hypothesis manifold. At the tangency point, á*, the posterior density attains its maximum, .f*, otr the hypothesis. The interior of the tangent level curve, ?*, includes all points with posterior density greater than /*, i.e. it is the highest probability density set tangent to the hypothesis. The posterior probability of.7", rc*, gives an indication of inconsistency between the posterior and the hypothesis: Small values'of rc* indicate that the hypothesis traverses high density regions, favoring the hypothesis. - 1 - rc* as the measure of evidence (for the Therefore we define Eu(H) precise hypothesis). : 10 and In Figure 1 we test c : 1 with n: 16 observations of mean rn standard deviation s :7.02, 1.1, and 1.5. we present the FBST evidence, Eu, and the standard yz-test, chi2' It is clear that this example is only an illustration: there is no need of new methods to test the standard deviation of a normal distribution. However, efiÊcient numerical optimization and integration computer proglams' make it straightforward to extend the FBST to more complex structures. In sections 6 and 7 we present an important application involving the Weibull
T. Z. Irong, M. Louretto, C. A. B. Pereúa and J. M. Stern n=16 m=1O c=l s1.5O evid=O.Oí chi2=O.OO
n=16 m=1O è1 s'1.1O evid=O.66 chi2=O.4O
s=1.O2 n=16 m=1O eí chi2=O.68 evid=O.8g
Í.1 1 1.8
í.6
o.a 1.4
o.7 1.2
t\
o.6
ii\ì
ìí
;í"1ï
o.4
íui
o.2 o.2 ío
í1
12
o.1 10 íí 9 posterioÍ mean, lr
9
lo
11
Fig. 1. Tangent and other Highest Probability Density Sets
distribution, requiring a quality control test for usedcomponents'basedon remaining life data. This problem appearsin engineeringas well as biological and pharmacologicalapplications.The FBST is exact and performswell even for small samplesand low frequencies.In the next section we give a more formal definition of the FBST. 3. The Evidence Calculus Considerthe random variableD that, when observed,producesthe data d. The statistical spaceis representedby the triplet (E, ^, O) where E is the samplespace,the set of possiblevaluesof d, A is the family of measurable subsetsof E a,ndO is the parameter space.We define now a prior model (@,B,zra),which is a probability spacedefinedover @. Note that in this model Pr{Al0} has to be O measurable.As usual, after observingdata d, we obtain the posteriorprobability model (O,B,n6), where zrais the conditional probability measureon B given the observedsample point, d. In this paper we restrict ourselvesto the casewhere the functions zrahas a
Wei,buII Wnrout Test: FuII Bayesian Approach
probability density function /. To define our procedure we should concentrate onÌy on the posterior we define T, as the subset of the paprobability space (O, B,ra).First, rameter space where the posterior density is greater than g.
Te:{0eoll(d)>e} The credibilíty of.T, is its posterior probability
r c : r a ( T e ) :I r p l d ) d n : I r r ( t l o ) o t Jo JTv
where ïr(") -- /(r) if Í@) > I and zero otherwise. Now, we define /* as the maximum of the posterior density over the null hypothesis, attained at the argument á*,
0"€arsruB*/(d) , f*: Í(0*) and define T* : TÍ- as the set "tangent" to the null hypothesis, fI, whose credibility is n*. The measure of evidence vre propose in this article is the complement of the probability of the set ?a*. That is, the evidence of the null hypothesis is Ea(H):L-K*
or 1-Td(T*)
If the probability of the set T* is "large", it means that the null set is in a region of low probability and the evidence in the data is against the null hypothesis. On the other hand, if the probability of ?* is "small", then the null hypothesis is in a region of high probability and the evidence in the data is in its favor. In the next section we give an operational construction of the FBST. and Integration 4. Numerical Optimization 'We restrict the parameter space' O, to be always a subset of Rn, and the hypothesis is defined as a further restricted subset Oo C O Ç -R'. Usually, Os is defined by vector valued inequality and equality constraints:
O o: { d e O l e ( d )< 0 ^ h ( 9 ) : 0 } . Since we are working with precise hypotheses, we have at least one equality constraint, hence dirn(An) < dim(@). Let /(á) be the posterior probability density fgnction, as defined in the last section.
The computation of the evidence measure defined in the last section performed is in two steps, a numerical optimization step, and a numerical integration step. The numerical optimization step consists of finding an argument d* that maximizes the posterior density /(á) under the null hypothesis. The numerical integration step consists of integrating the posterior density over the region where it is greater than /(á-). That is, o Numerical Optimization step:
0 * e a r" g r y q x 'f /') t g : f . - - f @ . ) 0€Ooo Numerical Integration step:
^* : I re@ld)do JE where fo(r) : f @) if f (r) ) 9 and zero otherwise. Efficient computational algorithms are available, for local and global optimization as well as for numerical integration, Bazaraa et aI. (1ggg), Horst eú al. (1995),Luenberger(1984),Nocedala.ndWright (1999),Pinter (1996),Krommer and Ueberhuber(1998),and Sloan and Joe (1994). Computer codes for severalsuch algorithms can be found at software libraries as ACM, GSL and NAG, or at internet sites as www.ornl.gouand, www-rocq.inria.fr. We notice that the method usedto obtain ?* and to calculaterc*can be used under generalconditions. Our purpose,however,is to discussprecise hypothesistesting,i.e. dim(@s)< dim(@),underabsolutecontinuityof the posterior probability model, the casefor which most solutionspresentedin the literature are controversial. 5. Weibull Distribution The two parameterWeibull probability density,reliabiìity (or survival probability) and hazard functions, for a failure time ú ) 0, given the shape,and characteristiclife (or scale)parameters,B ) 0, and 7 ) 0, are:
u(tl p,7) : (BtÊ-rltB) "*p(-(t/ìP) r(tl P,t) : erp(-(tlìB) z(tl p,t) = uOlrO : BtF-t1r0
Weibull Wearout Test: Full Bayesian Approach The mean and variance of a Weibull
293
variate are given by:
p :11(r + Llp) oz: f çr(r+ 210+ r2(1+ tl il) By alteringthe parameter,B,W(t I B,7) takesa variety of shapes,Dodson(1994).Somevaluesof shapeparameterare important specialcases:for 0 : L,I,7 is the exponentialdistribution; for B : 2, W is the Rayleigh distribution; for B - 2.5, W approximatesthe lognormal distribution; for Ê : 3.6,1I/ approximatesthe normal distributionl and for 0 : 5.0, W approximates the peakednormal distribution. The flexibility of the \Meibull distribution makesit very usefulfor empirical modeling,speciallyin quality control and reliability. The regions P < 1, 0 : l, and B > 1 correspond to decreasing,constant and increasinghazard rates. These three regions are also known as infant mortality, memoryless,and wearout failures. 7 is approximately the 63rd percentile of the life time, regardlessof the shape parameter. The \ü'eibull also has important theoretical properties. If n i'i.d. random variableshave Weibull distribution, Xi - w(tl|,7), then the first failure is a Weibull variate with characteristic life lfnr/p, i.e. X1r,,1 of the Weibull w(tl P,.,rlnL/01. This kind of propertyallowsa characterization as a limiting life distribution in the context of extremevalue theory, Barlow and Prochan(1975). The affine transformation ú : ú' + o leadsto the three parametertmncated Weibull distribution. A location (or threshold) parameter, o ) 0 representsbeginning observationof a (truncated) Weibull variate at ú : 0, after it has already survived the period [-o,0[. The three parametertmncated Weibull is given by:
w ( t l a ,g , t ) : @(t + Qe -r1 rae; rp (-((t+ a) 11) B) lr ( "1g,t) r(t I a,g,t) : erp(-((t+ a)I :)B)I r(" Ig,t)
6. Display Panels 'We were faced with the problem of testing the wearout of a lot of used display panels.A panel displays 1,2to 18 characters.Each characteris displayed as a 5 x 8 matrix of pixels, and each pixel is made of 2 (RG) or 3
T. Z. Irong, M. Lauretto, C. A. B. Pereim and J. M. Stern
(RGB) individual color elements, (like a light emitting diode or gas plasma device). A panel fails when the first individual color element fails. The construction characteristics of a display panel makes the \Meibull distribution ((burned specially well suited to model its life time. The color elements are int' at the production process, so we assume they are not at the infant mortality region, i.e. we assume the \Meibull's shape parameter to be greater than one, with wearout or increasing hazard rates. The panels in question were purchased as used components, taken from surplus machines. The dealer informed the machines had been operated for a given time, and also informed the mean life of the panels at those machines. Only working panels were acquired. The acquired panels were installed as components on machines of a different type. The use intensity of the panels at each type of machine corresponds to a different time scale, so mean lifes are not directly comparable. The shape parameter however is an intrinsic characteristic of the paneÌ. The used time over mean life ratio, p : a/F, is adimensional, and can therefore be used as an intrinsic measure of wearout. We have recorded the time to failure, or times of withdrawal with no failure, of the panels at the new machines, and want to use this data to corroborate (or not) the wearout information provided by the surplus equipment dealer. 7. The Model- -. The problem described at the preceding sections can be tested using the FBST, with parameter space, hypothesis and posterior joint density:
O : { ( o ,0 , 1 ) e ] 0 , o ox] [ 1 , o ox] [ 0 , * [ ] O s: { ( a ,0 , 1 ) e O l a : p p ( 0 , 1 ) } nm
f (o,g,r I D) o |Ir(tul o,0,t) fIr(tt I o,g,t) where the data D are of withdrawal with no At the optimization the log-likelihood, /l( withdrawals,
all the recorded failure times, ta ) 0, and the times failure, ti ) 0. step it is better, for numerical stability, to maximize ). Given a sample with n recorded failures and m
wh:1st1B1+@- 1)log(ú; +o) -Élos(r)- ((ú,+ o)lìB + ("lt)p
Weibuil Wearout Test: FuII Bayesian Approach
rtj:-((tj+41ìB+("1ìB nn
fI:Dr[*DrIi i:r
j:L
the hypothesisbeing representedby the constraint h ( o ,Ê , 7 ) : p 1 t ( I + I l 0 - a : 0 The gradientsof /l( ) and h( ) a.nalyticalexpressions,to be given to the optimizer,are: d,wl:
Í@ - L)l(t+ o) - ((t+ oòlìPBl(t + a) + (alfiBBla, + 4lt) + @lìB rog(al1) , rlÊ +tosft+ o) - los(?)- ((t + a)ll7 log((ú -Llt + (t+ a)lìBllt - @lìPïltl d,rl :
Pl(t + o) + (olúB01", [ -((t + a)1.òB -((t + log((ú + 4lú + @lìP rog(alì , ")lìB ((t + o)lìB olt , -@lìB Blt l d,h:
+ U0 I 9 2 ,P t ( + r r l0 ] [ - 1 , - p r r '$ + U r D r Q For gamma and digamma functions efficient algorithms see spanier and Oldham(1987). 8. Numerical ExamPle Table 1 displays45 failure times (in years),plus 5 withdrawals, for a small lot of 50 panels, in a 3.5 years long experiment. The panels have supposedly been used, prior to acquisition, for 30% of its mean life, i'e' we want to test p : 0.3.In general,someprior distribution of the shapeparameter is neededto stabilize the model. Knowing color elements' life time to be approximatelynormal, we considerB € [3.0,4.0].Table 2 displays the evidenceof somevaluesof p. The maximum likelihood estimatesof the - 3'54;so the estimates Weilbull'sparametersarea : I'25, P: 3'28 and I p - 0'3 with the hypothesis I.r:3.L7 and,p:0.39. The FBST corroborates an evidenceof.98%.
T. Z. Irong, M. Lauretto, C. A. B. Pereim and J. M. Stern
296
Table 1. Failure times and withdrawals in years,n: 0.01 t.2t 1,.7t 2.30 2.96
0.19 t.22 L.75 2.30 2.98
0.51 I.24 L.77 2.4L 3.19
0.57 1.48 L.79 2.44 s.25
0.70 1.54 1.88 2.57 3.31
Table 2.
p Evid
0.05 0.04
9. Final
0.L0 0.I4
0.2O 0.46
0.73 1.59 1.90 2.6t +1.19
0.75 1.61 1.93 2.62 +3.50
45,m:5
0.75 1.61 2.Ot 2.72 +3.50
1.11 1.62 2.16 2.76 +3.50
1.16 t.62 2.18 2.84 +3.50
Evidence for some values of p
0.30 0.98
0.40 1.00
0.50 0.98
0.60 0.84
0.70 0.47
0.80 0.2t
0.90 0.01
Remarks
The theory presented in this paper, grew oì.rt of the necessity of the authors' activities in the role of audit, control or certification agents, Pereira and Stern (1999a). These activities made the authors (sometimes painfully) aware of the benefit of the doubt juridical principle, or safe harbor liability rule. This kind of principle establishes that there is no liability as long as there is a reasonable basis for belief, effectively pÌacing the burden of proof on the plaintiff, who, in a lawsuit, must prove false a defendant's misstatement. Such a rule also prevents the plaintiff from making any assumption not explicitly stated by the defendant, or tacitly implied by existing law or regulation. The use of an a priori point mass on the null hypothesis, as on standard Bayesian tests, can be regarded as such an ad hoc assumption. As audit, control or certification agents, the authors had to check compliance with given requirements and specifications, formulated as precise hypotheses on contingency tables, In Pereira et al. (7999b) we describe several applications based on contingency tables, comparing the use of FBST with standard Bayesian and Classical tests. The applications presented in this paper are very similar in spirit, but we are not aware of any standard exact test in the literature. The implementation of FBST is immediate and trivial, as long as good numerical optimization and integration programs are at hand. In the applications in this paper, as well in those in Pereira et aI. (I999b), it is desirable or necessary to use a test with the following characteristics: a a
Be formulated directly in the original parameter space. Take into account the full geometry of the null hypothesis as a manifold (surface) imbedded in the whole parameter space.
WeibultWearoutTest:FuIl BagesianApproach
297
o Have an intrinsically geometric definition, independent of any nongeometric aspect, like the particular parameterization of the (manifold representing the) null hypothesis being used. r Be consistent with the benefit of the doubt juridical principle (or safe harbor liability ruÌe), i.e. consider in the "most favorable way" the claim stated by the hYPothesis. o consider onÌy the observed sample, allowing no ad hoc artifice (that could lead to judicial contention), like a positive prior probability distribution on the precise hypothesis. o consider the alternative hypothesis in equal standing with the null hypothesis, in the sense that increasing sampìe size should make the test converge to the right (accept/reject) decision' o Give an intuitive and simple measure of significance for the null hypothesis, ideally, a probability in the parameter space' FBST has all these theoretical characteristics, and straightforward (computational) implementation. Moreover, as shown in Madruga et aI' (200L), the FBST is also in perfect harmony with the Bayesian decision theory of Rubin (1987), in the sensethat there are specific loss functions which render the FBST. We remark that the evidence calculus defining the FBST takes place entirely in the parameter space where the prior was assessedby the sci((original" parameter space, although entist, Lindley (1933). we call it the acknowledging that the parameterization choice for the statistical model semantics is somewhat arbitrary. We also acknowledge that the FBST is not invariant under general change of parameterïzation' The FBST is in sharp contrast with the traditional schemes for dimensional reduction, like the elimination of so called "nuisancett parameters' ,,reduced" models the hypothesis is projected into a single point, In these greatly simplifying several procedures. Problems with the traditional approu"h are presented in Pereira and Lindley (19s7). The traditional reduclion o, projection schemes are also incompatible with the benefit of doubt principle, as stated earlier' In fact, preserving the original parameter space' in its full dimension, is the key for the intrinsic regularization mechanism of the FBST, when it is used in the context of model selection, Pereira and stern (2000,2001). of course, there is a price to be paid for working with the original parameter space, in its full dimension: A considerable computational work
T. Z. Irony, M. Lauretto, C. A. B. Perei,m anil J. M. Stern
load. But computational difficulties can be overcome with the used of efficient continuous optimization and numerical integration algorithms. Large problems can also benefit from program vectorization and parallelization techniques. Dedicated vectorized or paralÌel machines may be expensive and not always available, but most of the algorithms needed can benefit from asynchronous and coarse grain parallelism, a resource easily available, although rarely used, on any PC or workstation network through MPI, Portable Parallel Programming Message-Passing Interface, or similar distributed processing environments, Wilson and Lu (1996). Finally we notice that statements like (íincrease sample size to reject (accept) the hypothesis" made by many users of frequentist (standard Bayesian) tests, do not hold for the FBST. Increasing the sample size makes the FBST converge to the Boolean truth indicator of hypothesis being tested. In this sense, the FBST has good acceptancefrejection symmetry, even if the safe harbor rule prevents this symmetry from being perfect, introducing an ofset for small samples. We believe that the existence of a precise hypothesis test with the FBST's symmetry properties has important consequencesin knowledge theory, given the role played by the completely asymmetric standard statistical tests in some epistemological systems, Carnap (1962), Popper (1989).
References 1. 2. 3.
4. 5. 6. 7. 8. 9.
Aitkin, M. (1991).PosteriorBayesFactors.J R Statist Soc B 53(1), 111r42. Barlow, R.E. and Prochan, F. (1988) Statistical Theory ol Reliability and Life Testing.NY: Holt, Rinehart and Winston Inc. Barlow, RE and Prochan, F (1988) Life Distributions Models and Incomplete Data. In: Hanìlboolçof Statistics 7: Quali,ty Control and,Reliability. (Krishnaiah, PR and Rao, CR eds.) Amsterdam: North-Holand pp 225-250. Ba,zaraa,M.S., Sherali,H.D., and Shetty, C.M. (1993) NonliearProgramming: Theory and,Algorithms. NY: Wiley. Berger, J.O. and M Delampady,M. (1987). Testing precisehypothesis. ,9úatisücal Science3, 315-352. Berger, J.O., Boukai, B. and Wang, Y. (1997). Unified frequentist and Bayesiantesting of a precisehypothesis. Staüstical Science3, 315-352. BratÌey, P., Fox B.L. and Schrage,L. A Guid,e to Si,mulaüon SpringerVerlag, 1987. Dodson, B. (1994). Weibull Analgsis. Milwaukee: ASQC Quality Press. Ca.rnap, R. (1962). Logical Founilations of Probability. Univ. of Chicago
Weibutt Wea,routTest: FuII Bayesian Approach
Press. 1 0 . Cox, D.R. (Lg77). The role of significancetests. Scand'J Statist 4, 49-70 1 1 . Deiroot, ú.U. lroZo;. Optirnal StatisticalDecisions.NY: McGraw-Hill' 12. Evans, M. (1997). BayesianInferenceProceduresDerived via the concept of Relative Surprise. Communicationsin Statistics 26, 1L25-LL43' Galassi,M., Davies J., Theiler, J., Gough, B', Priedhorsky,R', Jungman, G. and Booth, M. (1999). GSL - GNtl Scientific Library ReferenceManual Y-r.5. WWW: lanl.gov. L4, Gomez,C. (1999). Engineeringand scientific computing with scilab. Berlin: Birkhâuser. 15. Good, I.J. (1983). Good thinking: The foundations ol probability and' its applications.University of Minnesota Press' 16. Fìnnetti, B. de (1989). Decisõ,o.in EnciclopédiaEINAUDL Romano, R. (edtr.). O Porto: ImprensaNacional' L7. Èor.t, R., PardalosP.M. and Thoai N.V. (1995). Introiluction to Global Optimization Boston: Kluwer. pro1g. Irãny, T.Z. and Pereira,C.A.B. (1986).Exact test for equality of two portions: FisherxBayes.J StatistComp I Simulation25,93-Ll4' 19. irony,T.Z. and PereiraC'A.3. (1995)'BayesianHypothesisTest:Using surface integrals to distribute prior information among hypotheses.Resenhas 2,27-46. 20. Iiempthorne,O. and Folks,L. (1971).Probabilitg,Staüstics,anil Data AnaIysis. Ames: Iowa State U.Press. 2L. ifuuth, D.E. The Art of Computer Prograrnming, uol 2 Seminumerical L996' Wesley, Algorithms. Addison 22. KÃmmer, A.R. and UeberhuberC.W. (1998). ComputationalIntegrat'íon. Philadelphia: SIAM' 24. L'Ecuyei, P. Efficient and Portable combined Pseudo-Random Number Generators. Commun. ACM' L988' 25. Lepage, G.P (1978). vEGAS: An Adaptive Multidimensional Integration Program. J. Comput. Phgs. 27, L92-205' 26. LehLann, E.t. (1986). Testing Statisti'calHgpothesis'NY: Wiley' 27. Lindley, D.V. (1957).A StatisticalParadox.Biometrika 44, t87_|92, 28. Lindley, D.V. (1978)' The BayèsianApproach' Scand J Statist 6, l-26' 29. Luenberger, D.G. (1984)' Linear and Nonlinear Programming' Reading: Addison-WesleY' 30. Ma.druga,M.R. Esteves'L.G. and Wechsler,S' (2001)' On the Bayesianity of Pereira-SternTests' To appear in TEST' 3L. Nocedal,J. and wright, s. (1999).Numerical opti,mization NY: Springer. g2. pereira,ô.A.8. and LindÌey,D.V. (1987).ExamplesQuestioningthe use of Partial Likelihood. The Statistician 36, t5-20' 33. Pereira,C.A.B.and Wechsler,S.(L993)' On the Concept of p-ualue'Braz J Prob Staüst 7, L59-L77. 34. Pereira,C.A.B.Stern,J.M. (1999a)'A Dynamic SoftwareCertification and
T. Z. Irony, M. Lauretto,C. A. B. Pereiraand,J. M. Stern
35.
37. 38.
39. 40. 4L.
42. 43. 44. 45.
47.
Verification Procedure. Proc. ISAS-99 - International Conferenceon Infornxation SystemsAnalysis anil SynthesisII, 426-435. Pereira,C.A.B.Stern,J.M.(1999b).Evidenceand Credibility: F\rll Bayesian SignificanceTest for PreciseHypotheses.Entropg 1, 69-80. Pereira,C.A.B.Stern,J.M. (2000).Intri;rrcicRegularizationin MoiLelSelect'íon us'i,ngthe Full Bagesian SignificanceTesú.Technical Report RT-MAC2000-6,Dept. of Computer Science,University of Sao Paulo. Pereira,C.A.B.Stern,J.M.(2001).Model Selection:F\rll Bayesia.n Approach. To appear in Environmetrics. Pintér, J.D. (1996). Global Optimizationin Action. Continuousand Lipschi.tzOptirnizati,on:Algorithms, Implementations and.Applications. Boston: Kluwer. Popper, K.R. (1989). Conjecturesand,Refutations: The Growth of Scientific Knowledge.London: Routledge. Royall, R. (1997). Statistical Euid,ene.:A Likelihood,Parad,igm.London: Chapman & Hall. Rubin, H. (1987). A Weak System of Axioms for ,,Rational,' Behavior and the Non-Separabilityof Utility from Púor. Statistics and Decisions5, 47-58. Sellke, T., Bayarri, M.J. and Berger, J. (1999). Calibration of p-values for Testing PreciseNull Hypotheses..I^9D,9 Discu,ssionPaper gg-15. Sloan, LR. and Joe, S. (1994). Lattice Methods for Mult'iple Integration. Oxford: Oxford University Press. Spanier,J. and Oldham, K.B. (1987).An Atlas of Functions.NY: HemispherePubÌishing. Vieland, V.J. and Hodge, S.E. (1993). Book Reviews: Statistical Euidence by R Royall (1997).Am J Hum Genet63,,283-289. Wichmann, B.A. and HilÌ, I.D. (1982).An Efficient and Portable PseudoRandom Number Generator.AppI. Stat., g1, 188-190. Wilson,G.V. and Lu,P. (1996). Parallel Programming Using C++. Cambridge: MIT Press.