European Journal of Operational Research 147 (2002) 548–576 www.elsevier.com/locate/dsw

Stochastics and Statistics

A preventive maintenance policy with sequential checking procedure for a Markov deteriorating system

Sophie Bloch-Mercier *

Equipe d'Analyse et de Mathématiques Appliquées, Université de Marne-la-Vallée, Cité Descartes, 5 Boulevard Descartes, Champs-sur-Marne, 77454 Marne-la-Vallée Cedex 2, France

Received 22 November 1999; accepted 7 September 2001

Abstract

We consider a repairable system subject to a continuous-time Markovian deterioration while running, which leads to failure. The deterioration degree is measured on a finite discrete scale; repairs follow general distributions; failures are instantaneously detected. This system is submitted to a preventive maintenance policy with a sequential checking procedure: the up-states are divided into two parts, the "good" up-states and the "degraded" up-states. Instantaneous (and perfect) inspections are then performed on the running system: when it is found in a degraded up-state, it is stopped to be maintained (for a random duration that depends on the degradation degree of the system); when it is found in a good up-state, it is left as it is. The next inspection epoch is then chosen randomly and depends on the degradation degree of the system at the time of inspection. We compute the long-run availability of the maintained system and give sufficient conditions for the preventive maintenance policy to improve the long-run availability. We study the optimization of the long-run availability with respect to the distributions of the inter-inspection intervals: we show that under specific assumptions (which often hold), optimal distributions are non-random. Numerical examples are studied. © 2002 Elsevier Science B.V. All rights reserved.

Keywords: Reliability; Maintenance; Optimal sequential checking procedure; Markov renewal theory; Long-run availability

1. Introduction

Let us consider a repairable system subject to deterioration while running, which leads to failure. A repair is then begun, which may be long or costly because of the failure. A natural idea, already much studied in the literature, is to try to prevent some failures of the system by preventively maintaining it. Surveys on maintenance policies for deteriorating systems may for example be found in [20,23,26] or [27]. One may also consult [25], where relations between mathematical modelling and applications are discussed.

* Tel.: +33-1-6095-7540; fax: +33-1-6095-7545. E-mail address: [email protected] (S. Bloch-Mercier).

0377-2217/02/$ - see front matter © 2002 Elsevier Science B.V. All rights reserved. PII: S0377-2217(01)00310-1


Here, we assume that the initial (or unmaintained) system evolves according to a continuous-time Markov process with a finite state space as long as it is running, whereas repairs follow general distributions. Such a model is motivated by the fact that failure rates may often be considered constant in industry, which is generally false for repair rates (see [2] for instance). Typically, our model corresponds to a system formed of components with constant failure rates, which cannot be repaired while the system is running. They may, however, be repaired (or replaced) during a stop of the system (due to failure or preventive maintenance) with general repair rates. Of course, any configuration composed of components with constant failure rates and constant repair rates is also covered by our model.

Such a system is submitted to the following preventive maintenance policy: we assume that no continuous monitoring is performed on the system, so that the state of the system is not continuously known. However, failures are instantaneously detected (and repairs begun). Also, it is possible to know the degradation state of the running system (among a finite number of possible up-states) through perfect instantaneous inspections, which do not degrade the system. The up-states are divided into two parts, the "good" up-states and the "degraded" up-states. Inspections are performed on the running system in the following way: when the system is found in a degraded up-state, it is stopped to be maintained (for a random duration that depends on the degradation degree of the system). When it is found in a good up-state, it is left as it is. The next inspection epoch is then chosen randomly and depends on the degradation degree of the system at the time of inspection. Consequently, the system is submitted to what Barlow et al. [4] called a sequential checking procedure.
If we look at the surveys on maintenance policies already quoted, we can see that, contrary to ours, the most frequently studied inspection schedules are periodic, namely such that intervals between inspections are equal, or follow the same distribution when random [1,5,15,19,22,28,29]. Our sequential schedule is more general. The added flexibility of our model is therefore expected to allow for lower cost or higher availability (at least for optimal policies).

Sequential checking procedures have already been studied in early papers by Savage [24] or Barlow et al. [3,4]. However, their models differ greatly from ours: a failure is only detected by inspection; there is a single up-state; no preventive maintenance is performed. (Their problem actually is to minimize the cost due to the time elapsed between system failure and its detection, and due to the inspections.) In another early paper [18], Klein studied an inspection model with a finite state space, where the time to the next inspection is also determined at each inspection just as in our model, generalizing works of Derman [12]. However, he is concerned with a discrete time scale (the system evolves according to a Markov chain). Also, failures are only discovered by inspections. A model closer to ours may be found in [16], where the initial system evolves according to a semi-Markov process with a finite state space. However, when discovered in a degraded up-state by inspection, the system is then continuously monitored up to failure, whereas in our model a preventive maintenance action takes place. Also, replacements are instantaneous. Note that, contrary to the present work, this last assumption is very frequent ([5,13,15,19,21], etc.). Some continuous-time models with a continuous degradation degree have also been studied since the 1980s [1,14,22,29]. Here again, replacements are often considered instantaneous.
We finally come to a paper by Cocozza-Thivent [11], presenting the same model as ours. The criterion used in her paper is the long-run expected cost per unit time, whereas we present here the study of the long-run availability. We have also studied the same cost as Cocozza-Thivent, which may be found in [7]. The numerical results (for the long-run availability as well as for the cost) have been compared and are identical. The methods used in the present paper and in [11] differ greatly, allowing us to get a much simpler formulation for the long-run availability (and for the cost), with simpler proofs too (though they are still quite long). Also, we may note that, because of the method, inter-inspection intervals are not allowed to be non-random in [11], whereas no restriction is made here. This is of importance, for we show here that under "reasonable" assumptions, non-random distributions are optimal among all the possible distributions for the inter-inspection intervals. Such a result had already been noticed numerically by Cocozza-Thivent under the same assumptions, with non-random distributions obtained as limits of Gamma distributions.

Such types of results are of course quite usual since Barlow and Proschan's works [2]. They are quite clear when the studied criterion solely depends on the mean inter-inspection intervals. Also, they are usually easy to get when optimization is carried out with respect to a single distribution ([6] for instance). Here, the long-run availability actually depends on the distributions of the inter-inspection intervals (and not only on their means) and it is optimized with respect to several distributions, making the result more delicate. (Such types of optimizations may nevertheless be found in [2] for models different from ours.)

Another addition to [11] is that conditions are obtained here for the preventive maintenance policy to improve the long-run availability: it is shown that, under reasonable assumptions, if the durations of the preventive maintenance actions are not too long on average, then it is worthwhile to maintain the system. Upper bounds are provided for the mean durations of maintenance actions. Those bounds are very easy to compute and their numerical accuracy has been checked on different examples.

This paper is organized as follows. We begin by specifying our notations and assumptions (Section 2). We go on with the computation of the long-run availability of the maintained system (Section 3). We then give sufficient conditions for the preventive maintenance policy to improve the long-run availability (Section 4) and we study the optimization of the long-run availability with respect to the inter-inspection distributions (Section 5). We finally study some numerical examples (Section 6).

2. Description of the system – assumptions and notations

We assume that the system has a finite state space, each state corresponding to a possible degradation degree of the system. Let $1, 2, \ldots, m$ be the up-states and $m+1, \ldots, m+p$ be the down-states. (We can imagine, for instance, that states $1$ to $m+p$ are ranked according to their increasing degradation degree.) For simplicity of statement, we assume that the system is up at the beginning. It then evolves in time according to a Markov process as long as it is running and almost surely breaks down after a finite time: if $T$ denotes the first on-period of the system, we have $P_i(T < +\infty) = 1$ for any $1 \le i \le m$, where $P_i$ is the conditional probability given that the system started from state $i$.

In case of failure, a repair is instantaneously begun, with a random duration that is independent of the evolution of the system before failure. If the system is in down-state $m+k$, the repair duration is distributed as a random variable $R_{m+k}$ with finite mean $E(R_{m+k})$. After repair, the system starts again in an up-state which is assumed to be independent of the previous evolution of the system (and consequently, independent of the down-state at the time of repair). For $1 \le i \le m$, the probability for the system to start again in state $i$ after any repair is denoted by $D_R(i)$. After any repair, the system evolves in time according to the same Markov process as from the beginning. This Markov process (which describes the evolution of the system from the origin up to its first failure) is denoted by $(X_t)$:

$$X_t = \begin{cases} \text{state of the system} & \text{if } t < T, \\ m+k & \text{if } t \ge T, \end{cases}$$

with $m+k$ the down-state entered at time $T$. (The down-states have been made absorbing.) Let $A = (a_{i,j})_{1 \le i,j \le m+p}$ and $(P_t(i,j))_{1 \le i,j \le m+p}$, respectively, be its (infinitesimal) generator and its transition kernel ($P_t(i,j) = P_i(X_t = j)$). Matrix $A$ is subdivided as follows:

$$A = \begin{pmatrix} A_1 & A_2 \\ 0 & 0 \end{pmatrix},$$


where $A_1$ is the north-west $m \times m$ truncation of $A$ ($A_1 = (a_{i,j})_{1 \le i,j \le m}$) and $A_2$ is the sub-matrix of $A$ formed by the failure rates ($A_2 = (a_{i,j})_{1 \le i \le m,\; m+1 \le j \le m+p}$). We will also need the $m \times m$ square matrix $\eta = (\eta_{i,j})_{1 \le i,j \le m}$ such that

$$\eta_{i,j} = \int_0^{+\infty} P_t(i,j)\,dt \quad \text{for any } 1 \le i,j \le m.$$

Symbol $\eta_{i,j}$ represents the mean time spent in state $j$ before failure, given that $X_0 = i$. It is known that $\eta = -A_1^{-1}$ (see Theorem 4.25 of [17], for instance).

This system is called the initial system (or unmaintained system), as opposed to the maintained system, which is submitted to the following preventive maintenance policy.

Let $\rho_1, \rho_2, \ldots, \rho_m$ be $m$ probability distributions on $\mathbb{R}_+$: symbols $\rho_1, \rho_2, \ldots, \rho_m$ represent the distributions of the inter-inspection intervals. (It is assumed that $0 < \int_0^{+\infty} t\,\rho_i(dt) < +\infty$ for any $1 \le i \le m$.) Let $q$ be a fixed integer ($1 \le q \le m-1$): symbols $1, \ldots, q$ represent the "good" up-states whereas $q+1, \ldots, m$ represent the "degraded" up-states. Let $M_{q+1}, M_{q+2}, \ldots, M_m$ be $m-q$ non-negative independent random variables with general distributions: they represent the durations of the preventive maintenance actions. (We assume that $M_{q+1}, M_{q+2}, \ldots, M_m$ have finite expectations.)

Now let $S_0 = 0$ and let $S_1$ be a random variable independent of the evolution of the system, with distribution $\rho_{X_0}$ (we recall that $X_0 \in \{1, \ldots, m\}$). The system is instantaneously inspected at times $S_1, S_2, \ldots, S_n, \ldots$ recursively defined in the following way: for $n \in \mathbb{N}^*$,
• If $X_{S_n} \in \{1, \ldots, q\}$, the system is in a "good working" state. We do nothing but choose the time to the next inspection: the next inspection will take place at time $S_{n+1} = S_n + U^{(n)}$, where $U^{(n)}$ is a random variable with distribution $\rho_{X_{S_n}}$, independent of the evolution of the system before $S_n$. (The random variable $U^{(n)}$ only depends on the state $X_{S_n}$ of the system at the inspection time $S_n$.)
• If $X_{S_n} \in \{q+1, \ldots, m\}$, the system is in a "degraded" up-state. The system is stopped and a maintenance action is instantaneously begun. This maintenance action has a random duration with the same distribution as the random variable $M_{X_{S_n}}$. It is independent of the previous evolution of the system. (Here again, $M_{X_{S_n}}$ only depends on $X_{S_n}$.)
During this maintenance action, the system is in the so-called "maintenance-state", denoted by $\mu_{X_{S_n}}$. After a maintenance action, the system starts again in an up-state which is independent of the previous evolution of the system (just as after repair): the system starts again in the up-state $i$ with probability $D_M(i)$ (for any $1 \le i \le m$).
• If a failure happens before the inspection at time $S_n$, a repair is begun, in the same way as for the initial system.

After a down period (for repair or preventive maintenance), a new sequence of inspections is begun, recursively defined in the same way as from the beginning (see Fig. 1). The process describing the evolution of the maintained system is denoted by $(Z_t)_{t \ge 0}$, with values in $\{1, \ldots, m+p\} \cup \{\mu_{q+1}, \ldots, \mu_m\}$.

2.1. Matrix notations

• For any $n \in \mathbb{N}^*$, $I_n$ is the $n \times n$ identity matrix.
• For any $k, n \in \mathbb{N}^*$, $\bar{0}_{k,n}$ is the $k \times n$ matrix of zeros.
• For $x \in \{0, 1\}$ and $n \in \mathbb{N}^*$, $\bar{x}_n$ is the $n \times 1$ column vector of $x$'s.
• For any matrix $z$, symbol $z(\cdot, j)$ represents the $j$th column of $z$.
• For $x_1, \ldots, x_m \in \mathbb{R}$, $\mathrm{diag}(x_1, \ldots, x_m)$ is the diagonal $m \times m$ matrix with $x_1, \ldots, x_m$ as diagonal coefficients.
• $D_R = (D_R(1), \ldots, D_R(m))$ and $D_M = (D_M(1), \ldots, D_M(m))$.


Fig. 1. The preventive maintenance policy.

•
$$E(M) = \begin{pmatrix} \bar{0}_q \\ E(M_{q+1}) \\ E(M_{q+2}) \\ \vdots \\ E(M_m) \end{pmatrix} \quad \text{and} \quad E(R) = \begin{pmatrix} E(R_{m+1}) \\ E(R_{m+2}) \\ \vdots \\ E(R_{m+p}) \end{pmatrix}.$$

• Symbol $\beta = (\beta_{i,j})_{1 \le i,j \le m}$ is the $m \times m$ square matrix such that
$$\beta_{i,j} = P_i(Z_{S_1} = j \cap T > S_1) = P_i(X_{S_1} = j) \quad \text{for any } 1 \le i,j \le m.$$
Symbol $\beta_{i,j}$ represents the probability for the system to be in state $j$ at the time of the first inspection (without any failure before inspection) given that the system started in state $i$.
• Symbols $\beta^{q,q}$, $\beta^{q,m-q}$, $\beta^{m-q,q}$ and $\beta^{m-q,m-q}$ are sub-matrices of $\beta$ such that
$$\beta^{q,q} = (\beta_{i,j})_{1 \le i,j \le q}, \quad \beta^{q,m-q} = (\beta_{i,j})_{1 \le i \le q,\; q+1 \le j \le m}, \quad \beta^{m-q,q} = (\beta_{i,j})_{q+1 \le i \le m,\; 1 \le j \le q}, \quad \beta^{m-q,m-q} = (\beta_{i,j})_{q+1 \le i,j \le m}$$
or, equivalently,
$$\beta = \begin{pmatrix} \beta^{q,q} & \beta^{q,m-q} \\ \beta^{m-q,q} & \beta^{m-q,m-q} \end{pmatrix}.$$

We end this section with a technical result, which leads to an additional notation. Its proof may be found in Appendix A.

Lemma 1.
1. The matrix $I_q - \beta^{q,q}$ is non-singular.
2. If $x$ and $y$ are two $m \times 1$ column vectors, then the $m$ equalities
$$x(i) = \sum_{j=1}^{q} x(j)\,\beta_{i,j} + y(i) \quad \text{for any } 1 \le i \le m \qquad (1)$$
are equivalent to $x = By$, where matrix $B$ is such that
$$B = \begin{pmatrix} (I_q - \beta^{q,q})^{-1} & \bar{0}_{q,m-q} \\ \beta^{m-q,q}\,(I_q - \beta^{q,q})^{-1} & I_{m-q} \end{pmatrix}. \qquad (2)$$
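The policy of Section 2 can also be explored by direct simulation. The following is a minimal Monte Carlo sketch, not the method of the paper. Its assumptions: states are numbered from 0 (up-states `0..m-1`, good up-states `0..q-1`); inspection intervals are deterministic (`c[i]` for state `i`); repair and maintenance durations are drawn from exponential distributions with the prescribed means (any distribution with these means could be substituted); the first cycle starts according to `D_R`. The function name `simulate_availability` and all parameter names are hypothetical.

```python
import random

def simulate_availability(A1, A2, c, q, D_R, D_M, ER, EM, n_cycles, seed=0):
    """Estimate the long-run availability over n_cycles restart-to-restart cycles."""
    rng = random.Random(seed)
    m = len(A1)
    up = down = 0.0
    start = D_R  # assumption: the first cycle starts like a restart after repair
    for _ in range(n_cycles):
        state = rng.choices(range(m), weights=start)[0]
        to_inspection = c[state]  # interval chosen from the state at the (re)start
        while True:
            sojourn = rng.expovariate(-A1[state][state])
            if sojourn >= to_inspection:
                # the inspection takes place before the next jump
                up += to_inspection
                if state >= q:  # degraded up-state: stop and maintain
                    down += rng.expovariate(1.0 / EM[state])
                    start = D_M
                    break
                # good up-state: schedule the next inspection; the sojourn can be
                # redrawn because exponential holding times are memoryless
                to_inspection = c[state]
                continue
            up += sojourn
            to_inspection -= sojourn
            # jump rates towards the other up-states, then towards the down-states
            rates = A1[state][:state] + [0.0] + A1[state][state + 1:] + A2[state]
            nxt = rng.choices(range(len(rates)), weights=rates)[0]
            if nxt >= m:  # failure: detected at once, repair begins
                down += rng.expovariate(1.0 / ER[nxt - m])
                start = D_R
                break
            state = nxt
    return up / (up + down)

# Hypothetical test case: two up-states, one down-state, inspection every 0.5.
estimate = simulate_availability(
    A1=[[-3.0, 3.0], [0.0, -2.0]], A2=[[0.0], [2.0]], c=[0.5, 0.5], q=1,
    D_R=[1.0, 0.0], D_M=[1.0, 0.0], ER=[0.02], EM=[0.0, 0.002],
    n_cycles=20000, seed=1)
print(estimate)
```

Such an estimate is only a sanity check for closed-form expressions of the long-run availability; its accuracy depends on the number of simulated cycles.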

3. Computation of the long-run availability

Let $A_\infty$ be the long-run availability of the maintained system (if it exists). We recall that $A_\infty$ is the probability that the system is up in the long run, namely,
$$A_\infty = \lim_{t \to +\infty} \sum_{k=1}^{m} P(Z_t = k).$$
The problem then is to show the existence of a limiting distribution for the process $(Z_t)$ when $t$ goes to infinity, and to compute it.

3.1. Asymptotic distribution for the semi-regenerative process $(Z_t)$

The key to studying the existence of an asymptotic distribution for the process $(Z_t)$ is to note that the later evolution of the maintained system after a new start following a down period (for repair or preventive maintenance) only depends on the up-state in which the system starts again after the down period. As a consequence, the process $(Z_t)$ is a so-called semi-regenerative process: the successive up-states visited by the system after a down period form a Markov chain and the successive times of such new starts delimit independent cycles, given the Markov chain. From general theorems of Markov renewal theory ([9] or [10]), we then derive that a limiting distribution exists for the process $(Z_t)$ and that it corresponds to the mean distribution of the process $(Z_t)$ on a cycle. We get the following result.

Theorem 2. When $t \to +\infty$, the distribution of $Z_t$ tends to the distribution $\pi$ such that:
$$\pi(j) = \kappa\, D_{MR}\, B (I_m - \beta)\, \eta(\cdot, j) \quad \text{for any } 1 \le j \le m,$$
$$\pi(m+k) = \kappa\, E(R_{m+k})\, D_{MR}\, B (I_m - \beta)\, \eta\, A_2(\cdot, k) \quad \text{for any } 1 \le k \le p,$$
$$\pi(\mu_j) = \kappa\, E(M_j)\, D_{MR}\, B\, \beta(\cdot, j) \quad \text{for any } q+1 \le j \le m,$$
where
$$D_{MR} = \left( D_M B (I_m - \beta)\, \bar{1}_m \right) D_R + \left( D_R B \beta \begin{pmatrix} \bar{0}_q \\ \bar{1}_{m-q} \end{pmatrix} \right) D_M$$
and $\kappa$ is a normalizing constant such that
$$\kappa = \left[ D_{MR}\, B \left( (I_m - \beta)\, \eta\, \bar{1}_m + (I_m - \beta)\, \eta A_2 E(R) + \beta E(M) \right) \right]^{-1}.$$
The proof of this result may be found in Appendix A.


(To understand such a result, we may note from the proof that the row vector $D_{MR}$ corresponds, up to a normalizing constant, to the stationary distribution, say $\nu$, of the Markov chain formed by the up-states visited after down periods: $\nu = \text{constant} \times D_{MR}$.)

It is now easy to derive the existence and the value of the long-run availability $A_\infty$. We first give an expression for $A_\infty$ in the general case and then indicate a few simplifications under commonly made additional assumptions.

3.2. The general case

Theorem 3. The long-run availability of the maintained system exists and is
$$A_\infty = \frac{1}{1 + a_\infty}$$
with
$$a_\infty = \frac{D_{MR}\, B \left( (I_m - \beta)\, \eta A_2 E(R) + \beta E(M) \right)}{D_{MR}\, B (I_m - \beta)\, \eta\, \bar{1}_m}, \qquad (3)$$
where $D_{MR}$ has been defined in Theorem 2.

Proof. Using Theorem 2, we clearly have
$$A_\infty = \sum_{k=1}^{m} \pi(k) = \frac{D_{MR}\, B (I_m - \beta)\, \eta\, \bar{1}_m}{D_{MR}\, B \left( (I_m - \beta)\, \eta\, \bar{1}_m + (I_m - \beta)\, \eta A_2 E(R) + \beta E(M) \right)} = \frac{1}{1 + a_\infty},$$
where $a_\infty$ is given by (3).
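As an illustration of formula (3), the sketch below evaluates $A_\infty$ in the smallest non-trivial configuration. Everything specific here is an assumption made for the example and is not part of the statement above: two up-states ($m = 2$, $q = 1$), one down-state, a transition from state 1 to state 2 at rate `a` and from state 2 to failure at rate `b` (with `a != b`), restarts always in state 1 ($D_M = D_R = (1, 0)$), and a deterministic inspection interval `c`. Under these assumptions $D_{MR} B$ is proportional to the first row of $B$, and (3) reduces to the ratio coded below. The reduction was worked out by hand and should be read as a sketch; the function name is hypothetical.

```python
import math

def availability_two_up_states(c, a, b, ER, EM2):
    """A_inf for m = 2, q = 1, D_M = D_R = (1, 0), one down-state, inspection every c."""
    # beta_{1,1}: still in state 1 at the inspection time c
    beta11 = math.exp(-a * c)
    # beta_{1,2}: in state 2 at time c, having left state 1 without failing (a != b)
    beta12 = a * (math.exp(-b * c) - math.exp(-a * c)) / (a - b)
    # mean up times of the unmaintained system from states 1 and 2 (rows of eta summed)
    mut1, mut2 = 1.0 / a + 1.0 / b, 1.0 / b
    # numerator and denominator of (3), up to the common factor 1 / (1 - beta11)
    mean_down = ER * (1.0 - beta11 - beta12) + EM2 * beta12
    mean_up = (1.0 - beta11) * mut1 - beta12 * mut2
    return 1.0 / (1.0 + mean_down / mean_up)

# Hypothetical numerical values: a = 3, b = 2, E(R) = 0.02, E(M_2) = 0.002.
for c in (0.1, 0.5, 1.0, 2.0):
    print(c, availability_two_up_states(c, 3.0, 2.0, 0.02, 0.002))
```

With these values the unmaintained system has availability $1/(1 + 0.02 \cdot 6/5) = 1/1.024$, which gives a reference point for the printed values.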



Remark 1. From the proof of Theorem 2 (see Appendix A), it may be seen that symbol $a_\infty$ represents the Mean Down Time on a cycle of the semi-regenerative process $(Z_t)$ divided by the Mean Up Time on the same cycle, as usual. Besides, we may note that the long-run availability depends on the durations of the maintenance actions only through their means, as expected. On the contrary, it depends on the distributions of the inter-inspection intervals $\rho_1, \ldots, \rho_m$ (through $\beta$ and $B$) and not only on their means.

3.3. A few particular cases

3.3.1. Case $D_M = D_R$

It is commonly assumed in the literature that repairs and maintenance actions correspond to replacements of the system by a new one, with the same characteristics as the initial one. In that case, if the perfect working state is denoted by 1, this assumption means that $D_M = D_R = (1, 0, \ldots, 0)$. More generally, if $D_M = D_R$, with common value denoted by $D$, we may note that $D_{MR}$ may be simplified as $D_{MR} = D$. Indeed, we may get this result by computation using the definition of $D_{MR}$ and (A.9) (see Appendix A). Also, it is easy to see that the stationary distribution $\nu$ of the underlying Markov chain here is $\nu = D$. As $\nu$ is proportional to $D_{MR}$, we easily derive that $D_{MR} = \text{constant} \times D$.

3.3.2. Case where $E(R_{m+k})$ is independent of $k$ ($1 \le k \le p$)

We assume here that the duration of repair is independent of the degradation state of the system. For example, one may imagine that the duration of the repair itself is negligible compared with the waiting time for the repairman to arrive in case of failure, or compared with the time necessary to restart the system after failure. Also, one may imagine that the durations of repair for the components necessary to the proper working of the system are long compared with the others. In any of those cases, we may assume that $E(R_{m+k})$ is independent of $k$ ($1 \le k \le p$). (Note that the case of a single down-state is a special case of this situation.) Under such an assumption, let $E(R)$ also denote the common scalar value $E(R_{m+k})$, for any $1 \le k \le p$. The term $D_{MR} B (I_m - \beta) \eta A_2 E(R)$ in $a_\infty$ (where $E(R)$ is the vector) now becomes
$$D_{MR}\, B (I_m - \beta)\, \eta A_2 E(R) = E(R)\, D_{MR}\, B (I_m - \beta)\, \eta A_2\, \bar{1}_p,$$
where $E(R)$ on the right-hand side is the scalar. Besides, as $A$ is a generator, we know that $\sum_{j=1}^{m+p} a_{i,j} = 0$ for any $1 \le i \le m$. This may be written as $\sum_{j=1}^{m} a_{i,j} + \sum_{k=1}^{p} a_{i,m+k} = 0$ for any $1 \le i \le m$ or, equivalently, $A_1 \bar{1}_m + A_2 \bar{1}_p = \bar{0}_m$. We derive
$$\eta A_2\, \bar{1}_p = -A_1^{-1} A_2\, \bar{1}_p = \bar{1}_m \qquad (4)$$
and the term $D_{MR} B (I_m - \beta) \eta A_2 E(R)$ now becomes
$$D_{MR}\, B (I_m - \beta)\, \eta A_2 E(R) = E(R) \times D_{MR}\, B (I_m - \beta)\, \bar{1}_m.$$

3.3.3. Case where $E(M_j)$ is independent of $j$ ($q+1 \le j \le m$)

In the same way as for repairs, the durations of the maintenance actions may also be independent of the degradation state of the system when they begin. We may then assume that $E(M_j) = E(M)$ (a scalar) for any $q+1 \le j \le m$. (Note that the case where maintenance actions are only performed on the most degraded up-state, case $q = m-1$, is a special case of this situation.) In that case, the term $D_{MR} B \beta E(M)$ in $a_\infty$ now becomes
$$D_{MR}\, B \beta\, E(M) = E(M) \times D_{MR}\, B \beta \begin{pmatrix} \bar{0}_q \\ \bar{1}_{m-q} \end{pmatrix}.$$

Remark 2. The different situations just described do not exclude one another, and the corresponding simplifications may be used simultaneously.

We end this section with a few indications for the numerical computation of the long-run availability.

3.4. Numerical computation of the long-run availability

We have seen in Theorem 3 that the long-run availability may be expressed in terms of the vectors $D_M$, $D_R$, $E(R)$ and $E(M)$ and of the matrices $A_2$, $\eta$, $B$ and $\beta$. Vectors $D_M$, $D_R$, $E(R)$ and $E(M)$ are given. Besides, let us recall that $\eta = -A_1^{-1}$. Then we only have to compute matrix $\beta$, from which we easily derive matrix $B$ (see (2)). Let us recall that, for $1 \le i,j \le m$, we have
$$\beta_{i,j} = P_i(X_{S_1} = j) = \int_0^{+\infty} P_i(X_t = j)\, d\rho_i(t) = \int_0^{+\infty} P_t(i,j)\, d\rho_i(t),$$

where $(P_t)$ is the transition semi-group of the Markov process $(X_t)$. The point then is to compute the semi-group $(P_t)$. Many different methods may be found in the literature (see e.g. [9,10] or [17]). We chose here to use the exponential form of the semi-group:
$$P_t(i,j) = e^{tA}(i,j) = e^{tA_1}(i,j) \quad \text{for any } 1 \le i,j \le m.$$
We also assume that matrix $A_1$ can be reduced to a diagonal form. We get:

Lemma 4. If matrix $A_1$ can be reduced to the diagonal form $A_1 = P D P^{-1}$, where $D = \mathrm{diag}(\lambda_1, \ldots, \lambda_m)$, then we have
$$\beta = \sum_{i=1}^{m} e_i\, P\, \mathrm{diag}\big(\tilde{\rho}_i(-\lambda_1), \ldots, \tilde{\rho}_i(-\lambda_m)\big)\, P^{-1}, \qquad (5)$$
where for any $i \in \{1, \ldots, m\}$,
• $e_i$ is the $m \times m$ square matrix composed of zeros apart from the $(i,i)$ element, which is 1,
• $\tilde{\rho}_i$ is the Laplace transform of $\rho_i$ ($\tilde{\rho}_i(s) = \int_0^{+\infty} e^{-ts}\, d\rho_i(t)$, for any $s > 0$).

Proof. For any $1 \le i,j \le m$:
$$\beta_{i,j} = \left( \int_0^{+\infty} e^{tA_1}\, d\rho_i(t) \right)(i,j) = \left( P \int_0^{+\infty} e^{tD}\, d\rho_i(t)\, P^{-1} \right)(i,j) = \left( P \int_0^{+\infty} \mathrm{diag}(e^{t\lambda_1}, \ldots, e^{t\lambda_m})\, d\rho_i(t)\, P^{-1} \right)(i,j) = \left( P\, \mathrm{diag}\big(\tilde{\rho}_i(-\lambda_1), \ldots, \tilde{\rho}_i(-\lambda_m)\big)\, P^{-1} \right)(i,j).$$
We can derive that the $i$th row of $\beta$ is the $i$th row of $P\, \mathrm{diag}(\tilde{\rho}_i(-\lambda_1), \ldots, \tilde{\rho}_i(-\lambda_m))\, P^{-1}$, which can be written as $e_i \beta = e_i P\, \mathrm{diag}(\tilde{\rho}_i(-\lambda_1), \ldots, \tilde{\rho}_i(-\lambda_m))\, P^{-1}$. By adding those $m$ equations, we get the result.

Some numerical examples may be found in Section 6, using (5) to compute $\beta$.

4. A sufficient condition for the preventive maintenance policy to improve the long-run availability

We have just seen, in the previous section, how to compute the long-run availability of the maintained system, from a theoretical as well as from a numerical point of view. A natural question now is whether the preventive policy does improve the long-run availability. More precisely, under which conditions is it worthwhile to maintain the system? We show here that:
• if the new starts of the system after preventive maintenance are at least as good as after repair (or equivalently, if the probability vector $D_M$ is "at least as good" as the probability vector $D_R$, in a sense to be specified),
• if the up-states $q+1, \ldots, m$ (from which maintenance actions are performed) are "more degraded" (to be specified) than the mean state in which the system starts again after repair (controlled by the probability vector $D_R$),
• if the maintenance actions are not "too" long on average (upper bounds are provided for their mean durations),
then the preventive maintenance policy improves the long-run availability.
To show this result, we first give a new expression for the long-run availability, which is more convenient for our study; to that end, we need some new notations.

Let us notice that the process describing the evolution of the initial system, with no truncation at failure time, is a semi-regenerative process, just as the process $(Z_t)$ describing the maintained system. (Indeed, the later evolution of the initial system after repair only depends on the up-state in which it starts again after repair.) Then, let $MUT_i$ and $MDT_i$, respectively, be the mean up time and the mean down time on a cycle of this process starting from state $i$ ($1 \le i \le m$). Also, let $MUT$ and $MDT$ be the $m \times 1$ column vectors of the $MUT_i$'s and of the $MDT_i$'s ($1 \le i \le m$). One can easily check that $MUT = \eta\, \bar{1}_m$ and $MDT = \eta A_2 E(R)$. Now, let $A_\infty^{ini}$ be the long-run availability of the initial system. We have
$$A_\infty^{ini} = \frac{1}{1 + a_\infty^{ini}} \quad \text{with} \quad a_\infty^{ini} = \frac{D_R\, MDT}{D_R\, MUT}. \qquad (6)$$
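The quantities entering (6) only require solving two linear systems: since $\eta = -A_1^{-1}$, the vector $MUT = \eta \bar{1}_m$ solves $A_1\, MUT = -\bar{1}_m$ and $MDT = \eta A_2 E(R)$ solves $A_1\, MDT = -A_2 E(R)$. The following sketch does this with a small hand-written Gauss-Jordan solver (chosen only to keep the example dependency-free). The helper names and the test generator, a chain with rates 4, 3, 2 and a single down-state, are assumptions made for illustration.

```python
def solve(M, rhs):
    """Solve M x = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(M)
    aug = [list(map(float, M[i])) + [float(rhs[i])] for i in range(n)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(n):
            if r != col:
                f = aug[r][col] / aug[col][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

def initial_availability(A1, A2, ER, D_R):
    """Return (MUT, MDT, A_ini) for the unmaintained system, following (6)."""
    m = len(A1)
    mut = solve(A1, [-1.0] * m)                     # A1 MUT = -1_m
    a2_er = [sum(a * r for a, r in zip(row, ER)) for row in A2]
    mdt = solve(A1, [-v for v in a2_er])            # A1 MDT = -A2 E(R)
    a_ini = (sum(d * x for d, x in zip(D_R, mdt))
             / sum(d * x for d, x in zip(D_R, mut)))
    return mut, mdt, 1.0 / (1.0 + a_ini)

# Hypothetical test case: three up-states left at rates 4, 3, 2, one down-state.
A1 = [[-4.0, 4.0, 0.0], [0.0, -3.0, 3.0], [0.0, 0.0, -2.0]]
A2 = [[0.0], [0.0], [2.0]]
mut, mdt, a_ini = initial_availability(A1, A2, ER=[0.02], D_R=[1.0, 0.0, 0.0])
print(mut, mdt, a_ini)
```

For this chain the mean up times are simply sums of mean sojourn times ($1/4 + 1/3 + 1/2$ from the first state), which gives an easy check on the solver.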

For any vector $x$ with $m$ components, symbols $x^q$ and $x^{m-q}$ now, respectively, represent the sub-vectors of $x$ formed with the first $q$ and the last $m-q$ components of $x$. The shape of $x^q$ and $x^{m-q}$ (row or column) is the same as the shape of $x$. Besides, we set
$$K = (I_q - \beta^{q,q})^{-1} \beta^{q,m-q} \quad \text{and} \quad L = \beta^{m-q,q} K + \beta^{m-q,m-q}.$$
We can now give our new formulation for $a_\infty$.

Lemma 5. With the above notations, $a_\infty$ may now be written as
$$a_\infty = \frac{D_{MR}\, MDT + \left( D_{MR}^{q} K + D_{MR}^{m-q} L \right) \left( E(M)^{m-q} - MDT^{m-q} \right)}{D_{MR}\, MUT - \left( D_{MR}^{q} K + D_{MR}^{m-q} L \right) MUT^{m-q}}. \qquad (7)$$

This result follows from easy computations. Its proof is found in Appendix A.

We now give the main result of this section. To that end, symbol $A_\infty^{ini}(U)$ represents the long-run availability of the initial system when new starts after repair are controlled by the probability row vector $U$ (on $\{1, \ldots, m\}$) instead of $D_R$. Namely,
$$A_\infty^{ini}(U) = \frac{1}{1 + a_\infty^{ini}(U)} \quad \text{with} \quad a_\infty^{ini}(U) = \frac{U \times MDT}{U \times MUT}$$
(see (6)) and $A_\infty^{ini}(D_R) = A_\infty^{ini}$.

Theorem 6. Let us assume that
$$A_\infty^{ini}(D_M) \ge A_\infty^{ini}(D_R) \qquad (H_1)$$
and
$$A_\infty^{ini}(\delta_k) \le A_\infty^{ini}(D_R) \quad \text{for any } q+1 \le k \le m, \qquad (H_2)$$
where $\delta_k$ represents the Dirac measure in $k$ (and corresponds to a new start in state $k$). Then if
$$E(M_k) \le MDT_k - a_\infty^{ini}(D_R) \times MUT_k \quad \text{for any } q+1 \le k \le m, \qquad (8)$$
the preventive maintenance policy improves the long-run availability.

The proof of this result is found in Appendix A.

Remark 3. Assumption $(H_1)$ means that, if the restart of the initial system after repair is controlled by $D_M$ instead of $D_R$, then the long-run availability of the initial system is not decreased. In other words, the "mean" restart state after a maintenance action is "at least as good" as after repair. Note that assumption $(H_1)$ is true when $D_M$ is equal to $D_R$, which is frequent.


As for assumption $(H_2)$, it means that states $q+1$ to $m$ are more degraded than the "mean" state of the system after repair (controlled by $D_R$). Inequality (8) provides us with upper bounds for the mean durations of the maintenance actions under which the long-run availability is improved by the maintenance policy. Note that those conditions are independent of $\beta$ and $B$, and consequently of $\rho_1, \ldots, \rho_m$. They are therefore very easy to check. Moreover, when true, they mean that the maintenance policy improves the long-run availability for any inter-inspection intervals.

Before testing the accuracy of the numerical bounds for the mean durations of the maintenance actions provided by Theorem 6 (which is done in Section 6), we now turn to the optimization of the maintenance policy.

5. Optimization of the preventive maintenance policy, case $D_M = D_R$

Our problem here is to find the optimal inter-inspection distributions, namely those for which the long-run availability is optimal. In this section, we deal with this problem under specific assumptions: we assume that the system starts again in the same way after a maintenance action as after a repair: $D_M = D_R = D$. Besides, we also assume that $A^{q,q}$ is upper triangular, where $A^{q,q}$ is the north-west $q \times q$ truncation of matrix $A$. This last assumption means that, if states $1$ to $q$ are ranked according to their increasing degradation degree, the system may only go "worse" as long as it is in $\{1, \ldots, q\}$: it is degrading while running (at least as long as it is in $\{1, \ldots, q\}$). Those assumptions are often true.

Under these assumptions (the same as Cocozza-Thivent in [11], see Section 1), we show that in order to optimize the long-run availability with respect to the inter-inspection distributions, it is enough to limit the study to non-random (or deterministic) inter-inspection intervals.

To indicate the dependence on the distributions $\rho_1, \rho_2, \ldots, \rho_m$, symbols $A_\infty$ and $a_\infty$ are now, respectively, denoted by $A_\infty(\rho_1, \rho_2, \ldots, \rho_m)$ and $a_\infty(\rho_1, \rho_2, \ldots, \rho_m)$. We recall that $A_\infty^{ini}$ is the long-run availability of the initial system.

Theorem 7. Under the assumptions $D_M = D_R = D$ and $A^{q,q}$ upper triangular:
1. The following two assertions are equivalent:
$$\text{There exist some distributions } \rho_1^0, \rho_2^0, \ldots, \rho_m^0 \text{ such that } A_\infty(\rho_1^0, \rho_2^0, \ldots, \rho_m^0) > A_\infty^{ini} \qquad (H_3)$$
$$\Updownarrow$$
$$\text{There exist } c_1^0, c_2^0, \ldots, c_m^0 > 0 \text{ such that } A_\infty(\delta_{c_1^0}, \delta_{c_2^0}, \ldots, \delta_{c_m^0}) > A_\infty^{ini}. \qquad (H_4)$$
2. Under assumption $(H_3)$ or $(H_4)$, there exist $c_1^{opt}, c_2^{opt}, \ldots, c_m^{opt}$ such that
$$A_\infty(\delta_{c_1^{opt}}, \delta_{c_2^{opt}}, \ldots, \delta_{c_m^{opt}}) \ge A_\infty(\rho_1, \rho_2, \ldots, \rho_m) \quad \text{for any distributions } \rho_1, \rho_2, \ldots, \rho_m.$$

The proof is found in Appendix A.

We now know that, under the assumptions $D_M = D_R = D$ and $A^{q,q}$ upper triangular, the optimization of the preventive maintenance policy with respect to the inter-inspection distributions may be restricted to the deterministic ones. This result is very useful. Indeed, from a practical point of view, it is much easier to look for the optimal distributions among the deterministic ones than among all possible distributions. Also, from a technical point of view, it is simpler for the repair staff to know exactly when the system has to be inspected.

We now give some examples illustrating our study.


6. Examples

For each of the three following examples, we first test the advisability of maintaining the system with Theorem 6. (Assumptions $(H_1)$ and $(H_2)$ are here always true.) To that end, let $x(i) = MDT_i - a_\infty^{ini}(D_R) \times MUT_i$ for $2 \le i \le m$. We then know that if $E(M_i) \le x(i)$ for any $q+1 \le i \le m$ (with $1 \le q \le n-2$), the preventive maintenance policy improves the long-run availability for any inter-inspection distributions (see Remark 3). In the opposite situation, Theorem 6 does not allow us to conclude.

After that test, we look for the optimal long-run availability of the maintained system, namely we look for the inter-inspection distributions such that the long-run availability is optimal, and we compare it to the long-run availability of the initial system. The optimizations have all been done with the tools of MATLAB. Note that, as $D_M$ and $D_R$ have been chosen to be probabilities on $\{1, \ldots, q\}$ in the different examples, the long-run availability only depends on $\rho_1, \ldots, \rho_q$ (instead of $\rho_1, \ldots, \rho_m$), which leads us to optimize only with respect to $\rho_1, \ldots, \rho_q$.

6.1. Example 1

This example is the same as the first example of [11]. The initial system is a "$k$ out of $n$ system". It is composed of $n$ identical components with constant failure rate $\lambda$, which cannot be repaired while the system is working ($1 \le k \le n$). The system is up if and only if at least $k$ components are working. For $i \in \{1, \ldots, n\}$, let $i$ be the state where exactly $i-1$ components are down. There are $m = n-k+1$ up-states and one single down-state. The associated matrices $A_1$ and $A_2$ are as follows:
$$A_1 = \begin{pmatrix} -n\lambda & n\lambda & 0 & \cdots & 0 \\ 0 & -(n-1)\lambda & (n-1)\lambda & \ddots & \vdots \\ \vdots & \ddots & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & -(k+1)\lambda & (k+1)\lambda \\ 0 & \cdots & \cdots & 0 & -k\lambda \end{pmatrix} \quad \text{and} \quad A_2 = \begin{pmatrix} 0 \\ 0 \\ \vdots \\ 0 \\ k\lambda \end{pmatrix}.$$
The repairs and maintenance actions put the system back into the new state (i.e. state "1"), so that we have $D_M = D_R = D = (1, 0, \ldots, 0)$. With the simplifications of Sections 3.3.1 and 3.3.2, $a_\infty$ may here be written as
$$a_\infty = \frac{D B \left( E(R)\, (I_m - \beta)\, \bar{1}_m + \beta E(M) \right)}{D B (I_m - \beta)\, \eta\, \bar{1}_m}.$$
We use the same numerical values as Cocozza-Thivent in [11]. Numerical results are similar. (They are not given in her paper.) We take $k = 2$, $\lambda = 1$, $E(R) = 1/50$, $E(M_j) = j/1000$ for any $q+1 \le j \le m$.

We first check the advisability of our preventive maintenance policy with the help of Theorem 6 in Table 1. We can derive that, except for the four bold cases, the preventive maintenance policy improves the asymptotic availability. We cannot conclude in the bold cases without computing the long-run availability.

We now look for the optimal long-run availability. According to Theorem 7, we here limit ourselves to deterministic inter-inspection intervals. For each $3 \le n \le 9$, Table 2 provides us with the long-run availability of the initial system ($A_\infty^{ini}$) and, for each $1 \le q \le n-2$, it gives the optimal long-run availability and its argument $(c_1^{opt}, c_2^{opt}, \ldots, c_q^{opt})$ just below, italicized. Note that not all of the $(c_1^{opt}, c_2^{opt}, \ldots, c_q^{opt})$ are given because the long-run availability sometimes deviates very slowly from its maximum, so that a very large part of $(\mathbb{R}_+)^q$ gives the same long-run availability (with the chosen precision).


Table 1
Test for the mean durations of the maintenance actions, Example 1. Each cell reads x(i) / E(M_i); an asterisk marks the cases with E(M_i) > x(i) (the bold cases referred to in the text).

 n    i = 2             i = 3             i = 4            i = 5            i = 6            i = 7            i = 8
 3    0.008 / 0.002
 4    0.0046 / 0.002    0.0108 / 0.003
 5    0.0031 / 0.002    0.0070 / 0.003    0.0122 / 0.004
 6    0.0023 / 0.002    0.0051 / 0.003    0.0085 / 0.004   0.0131 / 0.005
 7    0.0018 / 0.002*   0.0039 / 0.003    0.0064 / 0.004   0.0095 / 0.005   0.0137 / 0.006
 8    0.0015 / 0.002*   0.0031 / 0.003    0.0051 / 0.004   0.0074 / 0.005   0.0103 / 0.006   0.0142 / 0.007
 9    0.0012 / 0.002*   0.0026 / 0.003*   0.0041 / 0.004   0.0060 / 0.005   0.0082 / 0.006   0.0109 / 0.007   0.0145 / 0.008
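The test of Theorem 6 used for Table 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the MATLAB code used by the author: the function names are hypothetical, $\mathrm{MDT}_i$ is taken equal to $E(R)$ (single down-state), and $a_\infty^{ini}(D_R)$ is read as $E(R)/\mathrm{MUT}_1$ because repairs bring the system back to state 1. With $k = 2$, $\lambda = 1$ and $E(R) = 1/50$, this reading gives back the $x(i)$ column of Table 1 and the $A_\infty^{ini}$ column of Table 2.

```python
def mean_up_times(n, k, lam):
    """MUT_i for the up-states i = 1..m (m = n - k + 1) of Example 1.

    In state i, i - 1 components are down, so the next failure occurs at
    rate (n - i + 1) * lam; components are not repaired while running.
    """
    m = n - k + 1
    rates = [(n - i + 1) * lam for i in range(1, m + 1)]
    return [sum(1.0 / r for r in rates[i:]) for i in range(m)]


def initial_availability(n, k, lam, mean_repair):
    """Long-run availability of the unmaintained system: MUT_1 / (MUT_1 + E(R))."""
    mut = mean_up_times(n, k, lam)
    return mut[0] / (mut[0] + mean_repair)


def theorem6_bounds(n, k, lam, mean_repair):
    """x(i) = MDT_i - a_ini * MUT_i for i = 2..m (assumed reading: MDT_i = E(R))."""
    mut = mean_up_times(n, k, lam)
    a_ini = mean_repair / mut[0]          # assumption: a_ini = E(R) / MUT_1
    return {i + 1: mean_repair - a_ini * mut[i] for i in range(1, len(mut))}


# Values of Example 1: k = 2, lambda = 1, E(R) = 1/50, E(M_i) = i/1000.
for n in range(3, 10):
    bounds = theorem6_bounds(n, 2, 1.0, 1 / 50)
    inconclusive = [i for i, x in bounds.items() if i / 1000 > x]
    print(n, round(initial_availability(n, 2, 1.0, 1 / 50), 4), inconclusive)
```

The list printed last on each line collects the indices i for which $E(M_i) > x(i)$, that is, the cases where Theorem 6 alone does not allow a conclusion.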

Table 2
Initial and optimal long-run availability, Example 1

 n    A_ini     q = 1    q = 2    q = 3    q = 4    q = 5    q = 6    q = 7
 3    0.9766    0.9940
 4    0.9819    0.9924   0.9949
 5    0.9847    0.9922   0.9935   0.9949
 6    0.9864    0.9923   0.9928   0.9936   0.9948
 7    0.9876    0.9923   0.9925   0.9929   0.9935   0.9945
 8    0.9885    0.9923   0.9923   0.9924   0.9928   0.9934   0.9943
 9    0.9892    0.9921   0.9922   0.9922   0.9923   0.9926   0.9931   0.9940

Optimal arguments $(c_1^{opt}, \dots, c_q^{opt})$, where they are given:

 n = 4:  q = 1: (0.0852)
 n = 5:  q = 1: (0.3020);  q = 2: (0.0848, 0.0680)
 n = 6:  q = 1: (0.4861);  q = 2: (0.3524, 0.2881)
 n = 7:  q = 1: (0.6447);  q = 2: (0.5916, 0.4978);  q = 3: (0.4225, 0.3673, 0.3004)
 n = 8:  q = 1: (0.7957);  q = 2: (0.7801, 0.6772);  q = 3: (0.7110, 0.6255, 0.5271);  q = 4: (0.5093, 0.4589, 0.3986, 0.3260)
 n = 9:  q = 1: (0.9487);  q = 2: (0.9453, 0.8440);  q = 3: (0.9249, 0.8296, 0.7223);  q = 4: (0.8456, 0.7646, 0.6734, 0.5686);  q = 5: (0.6148, 0.5665, 0.5093, 0.4420, 0.3615)

We can see from Table 2 that the preventive maintenance policy improves the long-run availability in each case. We can therefore conclude that Theorem 6 provides sufficient but not necessary conditions for the preventive maintenance policy to improve the long-run availability. For each n, the long-run availability is optimal when the system is stopped to be maintained only when it is in the last up-state before failure (case $q = m-1 = n-2$).
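The long-run availability of Example 1 under deterministic inter-inspection intervals can be evaluated with a short Python sketch. It is written under several assumptions that are not stated in this section and should be checked against the rest of the paper: $A_\infty = 1/(1+a_\infty)$ (consistent with the equivalence $A_\infty^{opt} \ge A_\infty \iff a_\infty^{opt} \le a_\infty$ used in the appendix), $g = -A_1^{-1}$ so that $g\,\mathbf{1}^m$ is the vector of mean up times, $E(M_j) = 0$ on the good up-states, and row i of $\beta$ equal to row i of $e^{c_i A_1}$. All function names are hypothetical. The matrix exponential is computed by uniformization, and $D B w$ is obtained by back substitution, which is valid here because $\beta^{q,q}$ is upper triangular.

```python
import math


def up_state_rates(n, k, lam):
    """Failure rate out of up-state i = 1..m (m = n - k + 1): (n - i + 1) * lam."""
    return [(n - i + 1) * lam for i in range(1, n - k + 2)]


def transition_row(rates, i, t):
    """Row i (0-based) of exp(t * A1) for the pure-death chain, by uniformization."""
    m, big = len(rates), max(rates)
    v = [0.0] * m
    v[i] = 1.0
    row = [0.0] * m
    weight = math.exp(-big * t)            # Poisson(big * t) probability of 0 events
    for step in range(60 + int(4 * big * t)):
        row = [r + weight * x for r, x in zip(row, v)]
        nxt = [x * (1.0 - rates[j] / big) for j, x in enumerate(v)]
        for j in range(m - 1):             # mass leaving the last up-state is a failure
            nxt[j + 1] += v[j] * rates[j] / big
        v = nxt
        weight *= big * t / (step + 1)
    return row


def long_run_availability(n, k, lam, mean_repair, mean_maint, q, c):
    """A_inf for D = (1, 0, ..., 0) and deterministic intervals c[0..q-1] > 0."""
    rates = up_state_rates(n, k, lam)
    m = len(rates)
    mut = [sum(1.0 / r for r in rates[i:]) for i in range(m)]    # g 1^m (assumed g = -A1^{-1})
    beta = [transition_row(rates, i, c[i]) for i in range(q)]    # rows 1..q of beta
    em = [0.0] * q + [mean_maint(j) for j in range(q + 1, m + 1)]
    # Rows 1..q of E(R)(I - beta)1^m + beta E(M) and of (I - beta) g 1^m.
    num = [mean_repair * (1.0 - sum(b)) + sum(x * e for x, e in zip(b, em)) for b in beta]
    den = [mut[i] - sum(x * t for x, t in zip(beta[i], mut)) for i in range(q)]

    def first_entry(w):
        # First entry of (I_q - beta^{q,q})^{-1} w, by back substitution.
        x = [0.0] * q
        for i in reversed(range(q)):
            tail = sum(beta[i][l] * x[l] for l in range(i + 1, q))
            x[i] = (w[i] + tail) / (1.0 - beta[i][i])
        return x[0]

    a_inf = first_entry(num) / first_entry(den)
    return 1.0 / (1.0 + a_inf)


# n = 4, q = 1 with the interval reported in Table 2.
value = long_run_availability(4, 2, 1.0, 1 / 50, lambda j: j / 1000, 1, [0.0852])
print(round(value, 4))
```

Under these assumptions the function can be handed to any numerical optimizer over $(c_1, \dots, c_q)$; by Theorem 7, restricting the search to deterministic intervals loses nothing in this example.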


By proceeding to computations, we have also noticed that the long-run availability depends all the more on $c_i$ (with $1 \le i \le q$) as index i is small. Namely, for q "big enough", $c_q$ hardly has any influence on $A_\infty$ whereas $A_\infty$ varies quickly with $c_1$.

Finally, for each $(n, q)$, the optimal inter-inspection intervals $c_1^{opt}, c_2^{opt}, \dots, c_q^{opt}$ are decreasing ($c_1^{opt} > c_2^{opt} > \cdots > c_q^{opt}$), as might have been expected: the worse the state of the system, the sooner the system has to be inspected.

6.2. Example 2

Just as in Example 1, a "k out of n" structure is considered here. The only difference with the previous example is that components may now be repaired while the system is running, with a constant repair rate $\mu$. We take the same numerical values as in Example 1, with $\mu = 2$ in addition. The repairs and maintenance actions here again put the system back to the new state. Matrix $A_2$ is the same as in Example 1. Matrix $A_1$ now is

\[
A_1 = \begin{pmatrix}
-n\lambda & n\lambda & 0 & \cdots & 0 & 0 \\
\mu & -\mu-(n-1)\lambda & (n-1)\lambda & \ddots & 0 & 0 \\
0 & 2\mu & \ddots & \ddots & \vdots & \vdots \\
\vdots & \ddots & \ddots & \ddots & (k+2)\lambda & 0 \\
0 & \cdots & 0 & (n-k-1)\mu & -(n-k-1)\mu-(k+1)\lambda & (k+1)\lambda \\
0 & \cdots & 0 & 0 & (n-k)\mu & -(n-k)\mu-k\lambda
\end{pmatrix}.
\]

The results for the test of Theorem 6 are given in Table 3. Bold results correspond to cases for which we cannot yet conclude on the advisability of maintaining the system.

Table 3
Test for the mean durations of the maintenance actions, Example 2. Each cell reads x(i) / E(M_i); an asterisk marks the cases with E(M_i) > x(i) (the bold cases referred to in the text).

 n    i = 2            i = 3            i = 4            i = 5            i = 6            i = 7            i = 8
 3    0.0571 / 0.02
 4    0.0333 / 0.02    0.1000 / 0.03
 5    0.0188 / 0.02*   0.0518 / 0.03    0.1271 / 0.04
 6    0.0101 / 0.02*   0.0262 / 0.03*   0.0573 / 0.04    0.1398 / 0.05
 7    0.0051 / 0.02*   0.0126 / 0.03*   0.0258 / 0.04*   0.0544 / 0.05    0.1424 / 0.06
 8    0.0024 / 0.02*   0.0058 / 0.03*   0.0113 / 0.04*   0.0217 / 0.05*   0.0474 / 0.06*   0.1392 / 0.07
 9    0.0011 / 0.02*   0.0026 / 0.03*   0.0048 / 0.04*   0.0087 / 0.05*   0.0168 / 0.06*   0.0396 / 0.07*   0.1341 / 0.08

We now look for the optimal long-run availability. Unlike in Example 1, the assumptions of Theorem 7 are not valid here, for $A^{q,q}$ is not upper triangular any more. Consequently, we cannot limit ourselves to deterministic inter-inspection intervals. As we cannot consider all the possible distributions, we assume that the distributions of the inter-inspection intervals belong to the large class of Gamma distributions. We then use the optimization tools of MATLAB to find the best parameters of those Gamma distributions. We find that those parameters correspond to very small standard deviations, so that the best inter-inspection intervals seem here again to be deterministic (though the assumptions of Theorem 7 are not valid). The results are given in Table 4.

Note that the three numerical remarks from Example 1, on the lack of precision on $(c_1^{opt}, c_2^{opt}, \dots, c_q^{opt})$, on the respective influence of the different $c_i$ for $1 \le i \le q$, and on the decreasing nature of $(c_1^{opt}, c_2^{opt}, \dots, c_q^{opt})$, are still valid here (as well as for the next example). Also, for each n, the asymptotic availability is here again optimal when the system is stopped to be maintained only when it is in the last up-state just before failure (case $q = m-1 = n-2$).

Unlike in Example 1, the preventive maintenance policy does not always improve the asymptotic availability. For instance, for $n = 7$, it is better not to stop the system for a maintenance operation when it is in state 2, 3 or 4. By comparing Tables 3 and 4, we can also see that the sufficient condition given by Theorem 6 provides rather good numerical bounds for the maximal mean durations of the maintenance actions: if the mean durations of the maintenance actions are below or of the same order as the bounds, then it is useful to maintain; if they are clearly higher, it is useless (and even detrimental).

6.3. Example 3

Here again, a "k out of n" structure is considered, with components that cannot be repaired while the system is running. The maintenance actions still put the system back to the perfect working state ($D_M = (1, 0, \dots, 0)$) but repairs are not always complete: we take $D_R = \delta_q$ for $1 \le q \le 5$. (If we want the preventive maintenance policy to improve the system, we need $D_R$ to be better than $\delta_{q+1}, \delta_{q+2}, \dots, \delta_m$, see Theorem 6.) Consequently, we are in the case where $D_R \ne D_M$, except for $q = 1$, so that the assumptions of Theorem 7 are not valid. We take:

\[
n = 7; \quad k = 2; \quad \lambda = 1; \quad E(R) = \frac{1}{5} - \frac{q}{100} \quad\text{and}\quad E(M_j) = \frac{j}{100} \ \text{ for } q+1 \le j \le m = 6.
\]

Table 4
Initial and optimal long-run availability, Example 2

 n    A_ini     q = 1    q = 2    q = 3    q = 4    q = 5    q = 6    q = 7
 3    0.8537    0.9434
 4    0.8824    0.9326   0.9615
 5    0.9140    0.9372   0.9515   0.9712
 6    0.9431    0.9470   0.9519   0.9627   0.9789
 7    0.9658    A_ini    A_ini    A_ini    […]      0.9853
 8    0.9812    A_ini    A_ini    A_ini    A_ini    A_ini    0.9904
 9    0.9903    A_ini    A_ini    A_ini    A_ini    A_ini    A_ini    0.9942

Optimal arguments $(c_1^{opt}, \dots, c_q^{opt})$, where they are given:

 n = 5:  q = 1: (0.3561);  q = 2: (0.1138, 0.0894)
 n = 6:  q = 1: (0.8585);  q = 2: (0.5296, 0.4075);  q = 3: (0.1623, 0.1379, 0.1069);  q = 4: (0.0090, 0.0016, 0.0001, …)
 n = 7:  q = 5: (0.04, 0.01, 0.0003, 0.0001, < 10^{-4})
 n = 8:  q = 6: (0.1, ...)

[…] 0 for any $1 \le i \le m$, sequence $(S_n)$ almost surely tends to $+\infty$, so that $\lim_{n \to +\infty} P_i(S_n < T) = P_i(T = +\infty) = 0$ by assumption. We conclude that $\left((\beta^{q,q})^n\right)$ tends to $0^{q,q}$, so that $I_q - \beta^{q,q}$ is non-singular.

2. Let $x^q$ and $y^q$ ($x^{m-q}$ and $y^{m-q}$), respectively, be the column vectors formed with the q first ($m-q$ last) components of x and y. Then equalities (1) for $1 \le i \le q$ may be written as $x^q = \beta^{q,q} x^q + y^q$ or, equivalently,

\[
x^q = (I_q - \beta^{q,q})^{-1} y^q. \qquad (A.1)
\]

For $q+1 \le i \le m$, equalities (1) may be written as $x^{m-q} = \beta^{m-q,q} x^q + y^{m-q}$. By substituting $x^q$, we now get

\[
x^{m-q} = \beta^{m-q,q} (I_q - \beta^{q,q})^{-1} y^q + y^{m-q}. \qquad (A.2)
\]

Equalities (A.1) and (A.2) are then equivalent to $x = By$.
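The block solution (A.1)–(A.2) can be checked numerically on a small case. The following Python sketch is only an illustration: the matrix `beta` and the vector `y` are arbitrary made-up values (with a sub-stochastic $\beta^{q,q}$ block), and the helper names are hypothetical. It compares $x = By$, computed from (A.1) and (A.2), with the limit of the fixed-point iteration $x_i \leftarrow \sum_{l \le q} \beta_{i,l} x_l + y_i$, which converges because $(\beta^{q,q})^n$ tends to $0^{q,q}$.

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [aug[i][n] for i in range(n)]


def apply_B(beta, q, y):
    """x = B y through (A.1) for the q first entries and (A.2) for the others."""
    m = len(y)
    iq_minus = [[(1.0 if i == j else 0.0) - beta[i][j] for j in range(q)]
                for i in range(q)]
    xq = solve_linear(iq_minus, y[:q])                                   # (A.1)
    tail = [sum(beta[i][l] * xq[l] for l in range(q)) + y[i]
            for i in range(q, m)]                                        # (A.2)
    return xq + tail


def iterate_fixed_point(beta, q, y, sweeps=500):
    """Iterate x_i <- sum_{l <= q} beta[i][l] * x_l + y_i, starting from 0."""
    x = [0.0] * len(y)
    for _ in range(sweeps):
        x = [sum(beta[i][l] * x[l] for l in range(q)) + y[i]
             for i in range(len(y))]
    return x


# Arbitrary illustrative data: m = 3, q = 2.
beta = [[0.5, 0.3, 0.1], [0.2, 0.4, 0.3], [0.1, 0.2, 0.5]]
y = [1.0, 2.0, 3.0]
x_block = apply_B(beta, 2, y)
x_iter = iterate_fixed_point(beta, 2, y)
print(x_block, x_iter)
```

Only the q first columns of `beta` enter the computation, which reflects the fact that B depends on $\beta$ through $\beta^{q,q}$ and $\beta^{m-q,q}$ only.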


Proof of Theorem 2.

1. Applying the Markov renewal theory. Let $T_n$ be the nth new start of the system after a down period (for preventive maintenance or repair). Let $Y_n$ be the state of the system at time $T_n$. We assume that $T_0 = 0$ and we denote by C the set of the up-states in which the system may restart after a down period: $C = \{ i \in \{1, \dots, m\} \text{ such that } D_R(i) + D_M(i) > 0 \}$. Then $(Y_n, T_n)$ is a Markov renewal process (with state space $C \times \mathbb{R}_+$) and $(Z_t)$ is a semi-regenerative process associated with the embedded Markov renewal process $(Y, T)$. $(Y_n)$ clearly is a recurrent irreducible Markov chain on C and it is easy to check that $(Y, T)$ is a non-arithmetic process. Then general theorems provide us with the existence of a limiting distribution for $(Z_t)$ ([9] or [10]). More precisely, let $\nu$ be the stationary distribution of the Markov chain $(Y_n)$. If $\sum_{k \in C} \nu_k E_k(T_1) < +\infty$, we know that

\[
\lim_{t \to +\infty} P(Z_t = \eta) = \pi(\eta)
\quad\text{with}\quad
\pi(\eta) = \frac{\sum_{k \in C} \nu_k\, E_k\!\left( \int_0^{T_1} I_{\{Z_s = \eta\}}\, ds \right)}{\sum_{k \in C} \nu_k\, E_k(T_1)} \qquad (A.3)
\]

for any $\eta \in \{1, \dots, m+p\} \cup \{\mu_{q+1}, \dots, \mu_m\}$. To compute the stationary distribution $\nu$ (point 5), we need to compute the transition matrix $(P_{i,j})$ of the Markov chain $(Y_n)$ (point 4). To that end, we have to compute the probability for a cycle of $(Z_t)$ to end with a failure in state $m+k$ (point 2) or with a maintenance action in state $\mu_k$ (point 3), given that the cycle began in state i.

2. Computation of $P_i(Z_{T_1^-} = m+k)$ for $1 \le k \le p$. Let us recall that T is the first on-period of the initial system. We have

\[
\begin{aligned}
P_i(Z_{T_1^-} = m+k) &= P_i(Z_{T_1^-} = m+k \cap T > S_1) + P_i(Z_{T_1^-} = m+k \cap T \le S_1) \\
&= \sum_{l=1}^{q} P_i(Z_{T_1^-} = m+k \mid Z_{S_1} = l \cap T > S_1)\, P_i(Z_{S_1} = l \cap T > S_1) + P_i(X_T = m+k \cap T \le S_1). \qquad (A.4)
\end{aligned}
\]

(Indeed, if $Z_{S_1} \in \{q+1, \dots, m\}$, a preventive maintenance action is begun so that the cycle cannot end with a repair in state $m+k$.) Let us first notice that, in those different expressions, we may write $T \le S_1$ or $T < S_1$ indifferently. Indeed, as the initial system behaves according to a Markov process as long as it is running, the random variable T admits a density with respect to Lebesgue measure, and we may derive from the independence assumptions that $P_i(T = S_1) = 0$. Let us now look at the different terms in (A.4). As the system evolves according to a Markov process as long as it is running, we clearly have $P_i(Z_{T_1^-} = m+k \mid Z_{S_1} = l \cap T > S_1) = P_l(Z_{T_1^-} = m+k)$, and this is the key of this proof. Besides,

\[
P_i(Z_{S_1} = l \cap T > S_1) = P_i(X_{S_1} = l) = \beta_{i,l} \quad \text{for any } 1 \le l \le q. \qquad (A.5)
\]


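Also, the last term in (A.4) is

\[
\begin{aligned}
P_i(X_T = m+k \cap T \le S_1) &= \int_0^{\infty} P_i(X_T = m+k \cap T \le s)\, \rho_i(ds) \\
&= \int_0^{\infty} \sum_{j=1}^{m} P_i(X_T = m+k \cap X_{T^-} = j \cap T \le s)\, \rho_i(ds) \\
&= \int_0^{\infty} \sum_{j=1}^{m} \int_0^{s} P_t(i,j)\, a_{j,m+k}\, dt\, \rho_i(ds).
\end{aligned}
\]

Besides, note that

\[
\frac{d}{dt} P_t(i,j) = (P_t A)(i,j) = (P_t A_1)(i,j) \quad \text{for any } 1 \le i, j \le m.
\]

As $A_1$ is non-singular and as $g = -A_1^{-1}$, we now derive

\[
\begin{aligned}
P_i(X_T = m+k \cap T \le S_1) &= \int_0^{\infty} \sum_{j=1}^{m} \left( (P_s - I_m) A_1^{-1} \right)(i,j)\, a_{j,m+k}\, \rho_i(ds) \\
&= - \sum_{j=1}^{m} \left[ \left( \int_0^{\infty} P_s\, \rho_i(ds) - I_m \right) g \right](i,j)\, a_{j,m+k}.
\end{aligned}
\]

By noting that $\int_0^{\infty} P_s(i, \cdot)\, \rho_i(ds) = \beta(i, \cdot)$, we now get

\[
P_i(X_T = m+k \cap T \le S_1) = - \sum_{j=1}^{m} \left[ (\beta - I_m)\, g \right](i,j)\, a_{j,m+k} = \left[ (I_m - \beta)\, g A_2 \right](i,k).
\]

By substituting this expression in (A.4) and using (A.5), we now get

\[
P_i(Z_{T_1^-} = m+k) = \sum_{l=1}^{q} P_l(Z_{T_1^-} = m+k)\, \beta_{i,l} + \left[ (I_m - \beta)\, g A_2 \right](i,k).
\]

According to Lemma 1, we now know that the column vector $[P_i(Z_{T_1^-} = m+k)]_{1 \le i \le m}$ may be written as

\[
\left[ P_i(Z_{T_1^-} = m+k) \right]_{1 \le i \le m} = B (I_m - \beta)\, g A_2(\cdot, k). \qquad (A.6)
\]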

3. Computation of $P_i(Z_{T_1^-} = \mu_k)$ for $q+1 \le k \le m$. This is done in a similar way as in the previous point:

\[
\begin{aligned}
P_i(Z_{T_1^-} = \mu_k) &= \sum_{l=1}^{q} P_i(Z_{T_1^-} = \mu_k \mid X_{S_1} = l)\, P_i(X_{S_1} = l) + \underbrace{P_i(Z_{T_1^-} = \mu_k \mid X_{S_1} = k)}_{=1}\, P_i(X_{S_1} = k) \\
&= \sum_{l=1}^{q} P_l(Z_{T_1^-} = \mu_k)\, \beta_{i,l} + \beta_{i,k}.
\end{aligned}
\]

We derive from Lemma 1 that the column vector $[P_i(Z_{T_1^-} = \mu_k)]_{1 \le i \le m}$ is

\[
\left[ P_i(Z_{T_1^-} = \mu_k) \right]_{1 \le i \le m} = B\, \beta(\cdot, k). \qquad (A.7)
\]


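4. Computation of the transition matrix $(P_{i,j})$ of the Markov chain $(Y_n)$. Let $1 \le i, j \le m$. We have

\[
\begin{aligned}
P_{i,j} = P_i(Y_1 = j) &= \sum_{k=1}^{p} P_i(Y_1 = j \mid Z_{T_1^-} = m+k)\, P_i(Z_{T_1^-} = m+k) + \sum_{k=q+1}^{m} P_i(Y_1 = j \mid Z_{T_1^-} = \mu_k)\, P_i(Z_{T_1^-} = \mu_k) \\
&= D_R(j) \sum_{k=1}^{p} P_i(Z_{T_1^-} = m+k) + D_M(j) \sum_{k=q+1}^{m} P_i(Z_{T_1^-} = \mu_k), \qquad (A.8)
\end{aligned}
\]

because of the assumptions on new starts after a down period for repair or preventive maintenance. By substituting $P_i(Z_{T_1^-} = m+k)$ and $P_i(Z_{T_1^-} = \mu_k)$ with their values (see (A.6) and (A.7)), we now get

\[
P_{i,j} = D_R(j) \left( B (I_m - \beta)\, g A_2 \mathbf{1}^p \right)(i) + D_M(j) \left( B \beta \begin{pmatrix} \mathbf{0}^q \\ \mathbf{1}^{m-q} \end{pmatrix} \right)(i)
\]

for any $1 \le i, j \le m$. From (4), we know that $g A_2 \mathbf{1}^p = \mathbf{1}^m$, so that matrix $P = (P_{i,j})$ may now be written as

\[
P = B (I_m - \beta) \mathbf{1}^m D_R + B \beta \begin{pmatrix} \mathbf{0}^q \\ \mathbf{1}^{m-q} \end{pmatrix} D_M.
\]

5. Computation of the stationary distribution $\nu$ of the Markov chain $(Y_n)$. Let $\nu = (\nu_1, \nu_2, \dots, \nu_m)$. Then we have $\nu = \nu P$. By substituting P with its value (see the previous point), we get

\[
\nu = \underbrace{\left( \nu B (I_m - \beta) \mathbf{1}^m \right)}_{\in \mathbb{R}} D_R + \underbrace{\left( \nu B \beta \begin{pmatrix} \mathbf{0}^q \\ \mathbf{1}^{m-q} \end{pmatrix} \right)}_{\in \mathbb{R}} D_M.
\]

Let x be the real $x = \nu B (I_m - \beta) \mathbf{1}^m$. According to (A.6) and (A.7), we now get

\[
\begin{aligned}
B \beta \begin{pmatrix} \mathbf{0}^q \\ \mathbf{1}^{m-q} \end{pmatrix}
&= \left[ \sum_{k=q+1}^{m} P_i(Z_{T_1^-} = \mu_k) \right]_{1 \le i \le m}
= \mathbf{1}^m - \left[ \sum_{k=1}^{p} P_i(Z_{T_1^-} = m+k) \right]_{1 \le i \le m} \\
&= \mathbf{1}^m - B (I_m - \beta)\, g A_2 \mathbf{1}^p = \mathbf{1}^m - B (I_m - \beta) \mathbf{1}^m \qquad (A.9)
\end{aligned}
\]

because $g A_2 \mathbf{1}^p = \mathbf{1}^m$ (see (4)). Consequently

\[
\nu B \beta \begin{pmatrix} \mathbf{0}^q \\ \mathbf{1}^{m-q} \end{pmatrix} = \nu \mathbf{1}^m - \nu B (I_m - \beta) \mathbf{1}^m = 1 - x
\]

and $\nu$ may be written as

\[
\nu = x D_R + (1 - x) D_M. \qquad (A.10)
\]

By substituting $\nu$ in x, we get

\[
x = \nu B (I_m - \beta) \mathbf{1}^m = x \underbrace{D_R B (I_m - \beta) \mathbf{1}^m}_{\in \mathbb{R}} + (1 - x) \underbrace{D_M B (I_m - \beta) \mathbf{1}^m}_{\in \mathbb{R}}
\]

and

\[
x = \frac{D_M B (I_m - \beta) \mathbf{1}^m}{1 - D_R B (I_m - \beta) \mathbf{1}^m + D_M B (I_m - \beta) \mathbf{1}^m} \qquad (A.11)
\]

(note that the denominator is positive).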
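Formulas (A.10) and (A.11) can be checked numerically on a toy case. In the Python sketch below, the vector `u` plays the role of $B(I_m - \beta)\mathbf{1}^m$, that is, the probabilities that a cycle started in each up-state ends with a failure; its values, like those of `d_r` and `d_m`, are arbitrary illustrative numbers and the function names are hypothetical. The transition matrix is built in the rank-two form $P = u D_R + (\mathbf{1}^m - u) D_M$ given by point 4 together with (A.9), and the code verifies that $\nu = x D_R + (1-x) D_M$ satisfies $\nu = \nu P$.

```python
def stationary_from_formula(u, d_r, d_m):
    """nu = x * D_R + (1 - x) * D_M, with x given by (A.11)."""
    dr_u = sum(a * b for a, b in zip(d_r, u))      # D_R B (I_m - beta) 1^m
    dm_u = sum(a * b for a, b in zip(d_m, u))      # D_M B (I_m - beta) 1^m
    x = dm_u / (1.0 - dr_u + dm_u)
    nu = [x * r + (1.0 - x) * mm for r, mm in zip(d_r, d_m)]
    return nu, x


def transition_matrix(u, d_r, d_m):
    """P = u D_R + (1^m - u) D_M: restart by D_R after a failure, by D_M otherwise."""
    return [[ui * r + (1.0 - ui) * mm for r, mm in zip(d_r, d_m)] for ui in u]


# Arbitrary illustrative data with m = 3 up-states.
u = [0.2, 0.5, 0.9]          # hypothetical probabilities of ending a cycle with a failure
d_r = [0.0, 1.0, 0.0]        # repairs restart the system in state 2
d_m = [1.0, 0.0, 0.0]        # maintenance actions restart it in state 1

nu, x = stationary_from_formula(u, d_r, d_m)
P = transition_matrix(u, d_r, d_m)
nu_P = [sum(nu[i] * P[i][j] for i in range(3)) for j in range(3)]
print(nu, nu_P)
```

With these numbers $x = 0.2 / (1 - 0.5 + 0.2) = 2/7$, so that $\nu = (5/7, 2/7, 0)$.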


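We now derive from (A.10) and (A.11):

\[
\begin{aligned}
\nu &= \frac{D_M B (I_m - \beta) \mathbf{1}^m}{1 - D_R B (I_m - \beta) \mathbf{1}^m + D_M B (I_m - \beta) \mathbf{1}^m}\, D_R + \left( 1 - \frac{D_M B (I_m - \beta) \mathbf{1}^m}{1 - D_R B (I_m - \beta) \mathbf{1}^m + D_M B (I_m - \beta) \mathbf{1}^m} \right) D_M \\
&= \text{constant} \times \left[ \left( D_M B (I_m - \beta) \mathbf{1}^m \right) D_R + \left( 1 - D_R B (I_m - \beta) \mathbf{1}^m \right) D_M \right] \\
&= \text{constant} \times \left[ \left( D_M B (I_m - \beta) \mathbf{1}^m \right) D_R + D_R \left( \mathbf{1}^m - B (I_m - \beta) \mathbf{1}^m \right) D_M \right] \quad (\text{because } 1 = D_R \mathbf{1}^m) \\
&= \text{constant} \times \left[ \left( D_M B (I_m - \beta) \mathbf{1}^m \right) D_R + \left( D_R B \beta \begin{pmatrix} \mathbf{0}^q \\ \mathbf{1}^{m-q} \end{pmatrix} \right) D_M \right] \quad (\text{see (A.9)}) \\
&= \text{constant} \times D_{MR}, \qquad (A.12)
\end{aligned}
\]

where $D_{MR}$ has been defined in the statement of Theorem 2 and $\text{constant} = \left[ 1 - D_R B (I_m - \beta) \mathbf{1}^m + D_M B (I_m - \beta) \mathbf{1}^m \right]^{-1}$.

6. Computation of $E_i\!\left( \int_0^{T_1} I_{\{Z_s = \eta\}}\, ds \right)$ for $\eta \in \{1, \dots, m+p\} \cup \{\mu_{q+1}, \dots, \mu_m\}$. Let us first note that, for $\eta = m+k$ with $1 \le k \le p$, we clearly have

\[
E_i\!\left( \int_0^{T_1} I_{\{Z_s = m+k\}}\, ds \right) = E(R_{m+k}) \times P_i(Z_{T_1^-} = m+k) = E(R_{m+k}) \times \left( B (I_m - \beta)\, g A_2 \right)(i, m+k) \quad (\text{see (A.6)}).
\]

In the same way, for $\eta = \mu_k$ with $q+1 \le k \le m$, we have

\[
E_i\!\left( \int_0^{T_1} I_{\{Z_s = \mu_k\}}\, ds \right) = E(M_k) \times (B \beta)(i, k) \quad (\text{see (A.7)}).
\]

Let us now compute $E_i\!\left( \int_0^{T_1} I_{\{Z_s = j\}}\, ds \right)$ for $j \in \{1, \dots, m\}$. (Symbol j is here substituted for $\eta$.) Let us first split $E_i\!\left( \int_0^{T_1} I_{\{Z_s = j\}}\, ds \right)$ according to what happens at time $S_1$:

\[
E_i\!\left( \int_0^{T_1} I_{\{Z_s = j\}}\, ds \right) = \sum_{k=1}^{q} E_i\!\left( \int_0^{T_1} I_{\{Z_s = j\}}\, ds \times I_{\{Z_{S_1} = k \,\cap\, S_1 \cdots\}} \right) \cdots
\]

[…] 0. (A.24)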


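Let $A_\infty^{opt} = A_\infty(\delta_{c_1^{opt}}, \delta_{c_2^{opt}}, \dots, \delta_{c_m^{opt}})$ and let $(P_i)$ be the following statement (with $0 \le i \le m$):

\[
A_\infty^{opt} \ge A_\infty(\delta_{c_1}, \delta_{c_2}, \dots, \delta_{c_i}, \rho_{i+1}, \dots, \rho_m) \quad \text{for any } c_1, \dots, c_i > 0 \text{ and for any distributions } \rho_{i+1}, \dots, \rho_m. \qquad (P_i)
\]

According to (A.24), $(P_m)$ is true. We have to show that $(P_0)$ is true. Let us first study the inequality in $(P_i)$, for $0 \le i \le m$. Using formula (3) for $a_\infty$ and $D_{MR} = D$ (see Section 3.3.1), we get

\[
\begin{aligned}
A_\infty^{opt} \ge A_\infty(\delta_{c_1}, \dots, \delta_{c_i}, \rho_{i+1}, \dots, \rho_m)
&\iff a_\infty^{opt} \le a_\infty(\delta_{c_1}, \dots, \delta_{c_i}, \rho_{i+1}, \dots, \rho_m) \\
&\iff a_\infty^{opt} \le \frac{D B \left( (I_m - \beta)\, g A_2 E(R) + \beta E(M) \right)}{D B (I_m - \beta)\, g \mathbf{1}^m} \\
&\iff a_\infty^{opt}\, D B (I_m - \beta)\, g \mathbf{1}^m \le D B (I_m - \beta)\, g A_2 E(R) + D B \beta E(M) \\
&\iff a_\infty^{opt}\, D B (I_m - \beta)\, g \mathbf{1}^m \le D B (I_m - \beta)\, g A_2 E(R) + D B (-I_m + \beta + I_m) E(M) \\
&\iff D B (I_m - \beta) \left( a_\infty^{opt}\, g \mathbf{1}^m - g A_2 E(R) + E(M) \right) \le D B\, E(M). \qquad (A.25)
\end{aligned}
\]

Moreover

\[
B (I_m - \beta) = \begin{pmatrix} I_q & -K \\ 0^{m-q,q} & -\beta^{m-q,q} K + I_{m-q} - \beta^{m-q,m-q} \end{pmatrix} \quad (\text{see (A.17)}),
\]

where we recall that $K = (I_q - \beta^{q,q})^{-1} \beta^{q,m-q}$.

Now let symbol $(\cdot)^{k,n}$ denote any $k \times n$ matrix independent of $\rho_1, \rho_2, \dots, \rho_m$. Also, let symbol $\leftrightarrow$ mean "equivalent to an expression of the following shape". We get

\[
\begin{aligned}
& A_\infty^{opt} \ge A_\infty(\delta_{c_1}, \delta_{c_2}, \dots, \delta_{c_i}, \rho_{i+1}, \dots, \rho_m) \\
&\leftrightarrow \left( (\cdot)^{1,q}, (\cdot)^{1,m-q} \right) \begin{pmatrix} I_q & -K \\ 0^{m-q,q} & -\beta^{m-q,q} K + I_{m-q} - \beta^{m-q,m-q} \end{pmatrix} (\cdot)^{m,1}
\le \left( (\cdot)^{1,q}, (\cdot)^{1,m-q} \right) \begin{pmatrix} (I_q - \beta^{q,q})^{-1} & 0^{q,m-q} \\ \beta^{m-q,q} (I_q - \beta^{q,q})^{-1} & I_{m-q} \end{pmatrix} \begin{pmatrix} \mathbf{0}^q \\ (\cdot)^{m-q,1} \end{pmatrix} \\
&\leftrightarrow \left[ (\cdot)^{1,q},\ \left( (\cdot)^{1,q} + (\cdot)^{1,m-q} \beta^{m-q,q} \right) K + (\cdot)^{1,m-q} + (\cdot)^{1,m-q} \beta^{m-q,m-q} \right] \begin{pmatrix} (\cdot)^{q,1} \\ (\cdot)^{m-q,1} \end{pmatrix}
\le \left( (\cdot)^{1,q}, (\cdot)^{1,m-q} \right) \begin{pmatrix} \mathbf{0}^q \\ (\cdot)^{m-q,1} \end{pmatrix} \\
&\leftrightarrow (\cdot)^{1,1} + \left[ (\cdot)^{1,q} + (\cdot)^{1,m-q} \beta^{m-q,q} \right] K (\cdot)^{m-q,1} + (\cdot)^{1,1} + (\cdot)^{1,m-q} \beta^{m-q,m-q} (\cdot)^{m-q,1} \le (\cdot)^{1,1} \\
&\leftrightarrow \left[ (\cdot)^{1,q} + (\cdot)^{1,m-q} \beta^{m-q,q} \right] K (\cdot)^{m-q,1} + (\cdot)^{1,m-q} \beta^{m-q,m-q} (\cdot)^{m-q,1} \le (\cdot)^{1,1}. \qquad (A.26)
\end{aligned}
\]

Let us now recall that $\beta_{i,j} = P_i(X_{S_1} = j)$, so that only the ith row of $\beta$ depends on distribution $\rho_i$. Then, matrix $K = (I_q - \beta^{q,q})^{-1} \beta^{q,m-q}$ only depends on $\rho_1, \rho_2, \dots, \rho_q$, whereas $\beta^{m-q,q}$ and $\beta^{m-q,m-q}$ only depend on $\rho_{q+1}, \dots, \rho_m$.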


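Now, let us write again inequality $(P_m)$ (which we know to be true) and let us specify the dependence of the different matrices on $\delta_{c_1}, \delta_{c_2}, \dots, \delta_{c_m}$:

\[
\begin{aligned}
& A_\infty^{opt} \ge A_\infty(\delta_{c_1}, \delta_{c_2}, \dots, \delta_{c_m}) \\
&\leftrightarrow \left[ (\cdot)^{1,q} + (\cdot)^{1,m-q} \beta^{m-q,q}(\delta_{c_{q+1}}, \dots, \delta_{c_m}) \right] K(\delta_{c_1}, \delta_{c_2}, \dots, \delta_{c_q}) (\cdot)^{m-q,1} + (\cdot)^{1,m-q} \beta^{m-q,m-q}(\delta_{c_{q+1}}, \dots, \delta_{c_m}) (\cdot)^{m-q,1} \le (\cdot)^{1,1}.
\end{aligned}
\]

It is now easy to see that it is possible to integrate this inequality with respect to the (general) distributions $\rho_{q+1}, \dots, \rho_m$, so that $(P_q)$ is true.

Let us now show $(P_i)$ recursively downwards for $1 \le i \le q$. Let us assume that $(P_i)$ is true for some fixed $1 \le i \le q$. We want to show that $(P_{i-1})$ is true. Let $X^{k,n}$ be any $k \times n$ matrix independent of $c_i$ ($k, n \in \mathbb{N}^*$). According to (A.26), statement $(P_i)$ may be written as

\[
X^{1,q} K(\delta_{c_1}, \delta_{c_2}, \dots, \delta_{c_i}, \rho_{i+1}, \dots, \rho_q) X^{m-q,1} \le X^{1,1}
\]

for any $(\delta_{c_1}, \delta_{c_2}, \dots, \delta_{c_i}, \rho_{i+1}, \dots, \rho_q)$. We have to show that it is possible to integrate this inequality with respect to $c_i$. To that end, we study the dependence of $K(\delta_{c_1}, \delta_{c_2}, \dots, \delta_{c_i}, \rho_{i+1}, \dots, \rho_q)$ with respect to the ith row of $\beta$. As $A^{q,q}$ is upper triangular and as $1 \le i \le q$, matrix $I_q - \beta^{q,q}$ is of the following shape:

\[
I_q - \beta^{q,q} = \begin{pmatrix} I_i - \beta^{i,i} & -\beta^{i,q-i} \\ 0^{q-i,i} & I_{q-i} - \beta^{q-i,q-i} \end{pmatrix},
\]

where $\beta^{i,i}$, $\beta^{i,q-i}$ and $\beta^{q-i,q-i}$ are the following sub-matrices of $\beta^{q,q}$:

\[
\beta^{q,q} = \begin{pmatrix} \beta^{i,i} & \beta^{i,q-i} \\ \beta^{q-i,i} = 0^{q-i,i} & \beta^{q-i,q-i} \end{pmatrix}.
\]

We derive that $(I_q - \beta^{q,q})^{-1}$ may be written as

\[
(I_q - \beta^{q,q})^{-1} = \begin{pmatrix} (I_i - \beta^{i,i})^{-1} & (I_i - \beta^{i,i})^{-1} \beta^{i,q-i} (I_{q-i} - \beta^{q-i,q-i})^{-1} \\ 0^{q-i,i} & (I_{q-i} - \beta^{q-i,q-i})^{-1} \end{pmatrix},
\]

with […], where symbol $\simeq$ means "is equal to an expression of the following shape" and where matrix $X^{i-1,i-1}$ is upper triangular.

Let us recall that $\beta_{i,i} \ne 1$, for $\{1, \dots, m\}$ is a non-absorbing set, so that $(I_i - \beta^{i,i})^{-1}$ may be written as

\[
(I_i - \beta^{i,i})^{-1} \simeq \left( X^{i,i-1},\ \frac{1}{1 - \beta_{i,i}}\, X^{i,1} \right).
\]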


We now have […] and […] $\le (1 - \beta_{i,i}) X^{1,1}$, where $\beta_{i,j} = P_i(X_{c_i} = j)$ for any $j \in \{1, \dots, m\}$. It is now easy to see that it is possible to integrate this inequality with respect to $\rho_i$ and get the same inequality where $P_i(X_{S_1} = j)$ is now substituted for $\beta_{i,j} = P_i(X_{c_i} = j)$ and $S_1$, given $X_0 = i$, has $\rho_i$ for distribution. In other words, $(P_{i-1})$ is true. Consequently, $(P_0)$ is true and we have shown that the conclusion of point 2 is valid under assumption $(H_4)$.

We may show in the same way that, if $A_\infty(\delta_{c_1}, \delta_{c_2}, \dots, \delta_{c_m}) \le A_\infty^{ini}$ for any $c_1, c_2, \dots, c_m > 0$, then $A_\infty(\rho_1, \rho_2, \dots, \rho_m) \le A_\infty^{ini}$ for any distributions $\rho_1, \rho_2, \dots, \rho_m$. The contrapositive statement means that assumption $(H_3)$ implies assumption $(H_4)$. As $(H_4)$ clearly implies $(H_3)$, those two assumptions are then equivalent.


References

[1] M. Abdel-Hameed, Inspection and maintenance policies of devices subject to deterioration, Advances in Applied Probability 19 (1987) 917–931.
[2] R.E. Barlow, F. Proschan, Mathematical Theory of Reliability, Classics in Applied Mathematics, SIAM, Philadelphia, 1996 (first edition, 1965).
[3] R.E. Barlow, L.C. Hunter, Mathematical models for system reliability, The Sylvania Technologist V. XIII (1 and 2) (1960).
[4] R.E. Barlow, L.C. Hunter, F. Proschan, Optimal checking procedures, Journal of the Society for Industrial and Applied Mathematics 11 (4) (1963) 1078–1095.
[5] C. Bérenguer, E. Châtelet, A. Grall, Reliability valuation of systems subjects to partial renewals for preventive maintenance, in: Proceedings of ESREL'97, June 17–20, 97, Lisbon, Portugal, vol. 3, 1997, pp. 1744–1767.
[6] S. Bloch-Mercier, Stationary availability of a semi-Markov system with random maintenance, Applied Stochastic Models in Business and Industry 16 (2000) 219–234.
[7] S. Bloch-Mercier, Modèles et optimisation de politiques de maintenance de systèmes, PhD Thesis, Université de Marne-la-Vallée (in French), December 2000. Available from http://www-math.univ-mlv.fr/math/recherche.html.
[8] P.G. Ciarlet, Introduction à l'analyse numérique matricielle et à l'optimisation, Masson, Paris, 1994.
[9] E. Çinlar, Introduction to Stochastic Processes, Prentice-Hall, Englewood Cliffs, NJ, 1975.
[10] C. Cocozza-Thivent, Processus stochastiques et fiabilité des systèmes, Mathématiques et Applications (28) (1997).
[11] C. Cocozza-Thivent, A model for a dynamic preventive maintenance policy, Journal of Applied Mathematics and Stochastic Analysis 13 (4) (2000) 321–346.
[12] C. Derman, On optimal replacement rules when changes of state are Markovian, in: B. Richard (Ed.), Optimal Decision Processes, 1963, pp. 201–212 (The RAND Corporation, R-396-PR).
[13] L. Dieulle, Reliability of a system with Poisson inspection times, Journal of Applied Probability 36 (4) (2000) 1140–1155.
[14] L. Dieulle, C. Bérenguer, A. Grall, M. Roussignol, Un modèle de maintenance conditionnelle continu, in: XXXIIe Journées de Statistique, Fès, Maroc, May 15–19, 2000, pp. 318–321.
[15] V. Kalashnikov, M. Roussignol, Reliability of a system with regular inspection times, Journal of Mathematical Sciences 81 (5) (1996) 2937–2950.
[16] Z. Kander, Inspection policies for deteriorating equipment characterized by N quality levels, Naval Research Logistics Quarterly 25 (1978) 243–255.
[17] M. Kijima, Markov Processes for Stochastic Modelling, Chapman & Hall, London, 1997.
[18] M. Klein, Inspection–maintenance–replacement schedules under Markovian deterioration, Management Science 9 (1) (1962) 25–32.
[19] H. Luss, Maintenance policies when deterioration can be observed by inspection, Operations Research 24 (1976) 359–366.
[20] J.J. McCall, Maintenance policies for stochastically failing equipment. A survey, Management Science, Series A 11 (1965) 493–524.
[21] T. Nakagawa, Periodic and sequential preventive maintenance policies, Journal of Applied Probability 23 (1986) 536–542.
[22] J.M. van Noortwijk, H.E. Klatter, Optimal inspection decisions for the block mats of the Eastern-Scheldt barrier, Reliability Engineering and System Safety 65 (1999) 203–211.
[23] W.P. Pierskalla, J.A. Voelker, A survey of maintenance models: the control and surveillance of deteriorating systems, Naval Research Logistics 23 (1976) 353–388.
[24] I.R. Savage, Cycling, Naval Research Logistics Quarterly 3 (1956) 163–175.
[25] P.A. Scarf, On the application of mathematical models in maintenance, European Journal of Operational Research 99 (3) (1997) 493–506.
[26] Y.S. Sherif, M.L. Smith, Optimal maintenance models for systems subject to failure – A review, Naval Research Logistics 28 (1981) 47–74.
[27] C. Valdez-Flores, R.M. Feldman, A survey of preventive maintenance models for stochastically deteriorating single-unit systems, Naval Research Logistics 36 (4) (1989) 419–446.
[28] J.K. Vaurio, Availability and cost functions for periodically inspected preventively maintained units, Reliability Engineering and System Safety 63 (1999) 133–140.
[29] D. Zuckerman, Inspection and replacement policies, Journal of Applied Probability 17 (1980) 168–177.