VoIP Availability and Service Reliability through Software Rejuvenation Policies

V. P. Koutras, Department of Financial and Management Engineering, University of the Aegean, Chios, Greece GR82100

A. N. Platis, Department of Financial and Management Engineering, University of the Aegean, Chios, Greece GR82100

Abstract Voice over Internet Protocol (VoIP) is becoming an evolutionary technology in telecommunications. This paper focuses on the resources that are allocated for VoIP calls. We model resource allocation in a VoIP system and the resource degradation that occurs when new demands for resources arrive. To counteract resource degradation and improve availability and service reliability, we propose performing software rejuvenation. Moreover, the rate of resource allocation in such a system can be significantly affected by the length of time for which resources are allocated to serve VoIP calls. Hence, we model software rejuvenation in a VoIP system with a semi-Markov process in order to capture the effects of the time spent in resource-degraded states. Through stochastic analysis of the system, an optimal rejuvenation policy that maximizes the system's availability is proposed, and the corresponding reliability levels, in terms of Mean Time To Failure, are derived.

1. Introduction

Voice over Internet Protocol (VoIP) is the technology of providing voice and telephone services over the internet. In recent years VoIP has become an evolutionary technology in the telecommunication area. Its main advantage over traditional circuit-switched telephony is the lower cost of a call, especially for long-distance calls. By reducing the cost of telephone calls, VoIP has become a popular way of making calls among internet users. Furthermore, recent advances in computer hardware and software have made VoIP easily accessible to more users and companies that aim to reduce the cost of their phone calls. As a result, the number of VoIP users has increased significantly over the last years. Like any other new technology, VoIP has drawbacks that telecommunication companies wish to eliminate or mitigate as soon as possible in order to provide service equivalent to that of circuit-switched telecommunication companies and organizations. Hence, considerable research effort has been devoted, by both companies and the academic community, to counteracting these drawbacks. The main problem concerning VoIP services is the Quality of Service (QoS) that such a technology can provide [18]. It is not the only one: packet loss [19], jitter, latency [11], and degrading voice quality [6] also appear in VoIP services. In addition, VoIP experiences service quality degradation due to resource exhaustion at the service provider. The more call requests arrive at a VoIP system, the more resources are requested in order for the calls to be terminated. Hence, when the resource demands of the system users increase significantly, the VoIP

2nd International Conference on Dependability of Computer Systems(DepCoS-RELCOMEX'07) 0-7695-2850-3/07 $20.00 © 2007 Authorized licensed use limited to: Jyvaskylan Ammattikorkeakoulu. Downloaded on October 16, 2008 at 05:30 from IEEE Xplore. Restrictions apply.

provider may run out of resources. In that case, when a call request arrives, the provider either cannot serve it or, if the request is served, the quality of the ongoing calls may be affected [12]. In detail, an amount of resources is allocated when a VoIP call is initiated [18]. When a new call arrives, an additional amount of resources is allocated, leading to resource exhaustion when too many calls are served. Unfortunately, the resource allocation for each VoIP call is not the only cause of resource consumption. VoIP providers usually install background applications that run simultaneously on their VoIP server. These applications also allocate resources, reducing the level of free resources of the system. Techniques such as the Resource Reservation Protocol and similar ones have been developed to counteract resource exhaustion [3], [7]. Another approach to counteracting resource exhaustion in a VoIP system is software rejuvenation [12]. Software rejuvenation, first introduced by Huang et al. in [10], is a technique that can be applied periodically to counteract software aging. Software aging occurs when error conditions accumulate in continuously running systems, causing resource degradation that can even lead to system failures [16], [8], [9]. When software rejuvenation is triggered on a computer system, the running software is stopped, its internal state is cleaned, and it is then restarted [10]. The innovation introduced in this paper, extending the study made in [12], consists in considering that resource exhaustion depends on the time that the system spends in each degradation state. The amount of resources that a call or a number of calls consumes can increase depending on call duration or on other actions such as additional packet transmissions. Hence, the time until the system enters a new degradation state is not exponentially distributed, and consequently a semi-Markov process (SMP) is used to model the VoIP system. The aim of our work is to derive the steady-state availability of the VoIP system and to propose rejuvenation policies indicating when to perform rejuvenation. Furthermore, VoIP service reliability is studied by determining the Mean Time To Failure (MTTF) and how it is affected by the optimal rejuvenation policy that maximizes VoIP service availability.

2. Software rejuvenation modeling

2.1. VoIP model description

When a request for resources arrives at the VoIP system in order to serve a new call, resources are allocated as a call set-up process is initiated. When many resource requests arrive at the VoIP server and a large number of calls are initiated, the system experiences resource degradation. At this phase the system enters state 2, as presented in Figure 1, after an exponentially distributed time. At this point, the remaining resources have to be checked in order to determine whether the system needs to be rejuvenated to release resources, or whether it can still serve new calls without call quality degradation. This check is performed at state $C_1$. Because the time to trigger rejuvenation is in practice a fixed duration, the distribution of the time that the system needs to enter the check state can be taken to be the unit step function. Depending on the status of the remaining resources, the system either returns to state 2, where new calls arrive, or enters the rejuvenation state R, in each case after an exponentially distributed time. When the rejuvenation state is entered, a set of actions has to be triggered. Rejuvenation in the system presented can also terminate some of the background applications running on the VoIP server,


according to priorities that have been assigned to them by the system administrator.

Figure 1. VoIP software rejuvenation model (state-transition diagram with states 1, 2, 3, 4, $C_1$, $C_2$ and R, and transitions labeled $F_1$, $F_2$, $F_3$, $F_4$, $F_C$, $F_{C1}$, $F_{C2}$, $F_{REP1}$, $F_{REP2}$ and $F_{REJ}$)

When the system returns to state 2, the same call set-up procedure is initiated, resulting in a higher level of resource degradation and hence in entering state 3. Because the resource degradation depends on the time that the system has spent in state 2, the time that it needs to enter state 3 is non-exponentially distributed. As the time spent in state 2 increases, the amount of consumed resources increases and hence the rate of entering state 3 also increases. Consequently, the time to enter state 3 can be described by an increasing failure rate (IFR) function. When state 3 is reached, the VoIP system has to be checked once more. Similarly, the system enters the control state $C_2$ and either moves to the rejuvenation state, if the level of free resources has fallen below a predefined threshold, or returns to state 3 and continues to accept new requests for resources. If the system enters the rejuvenation state, software rejuvenation is performed, which cleans the system's internal state, and the system returns to the robust state 1. In contrast, when the reply from state $C_2$ indicates that the system can accept new requests, the system may experience further resource degradation and finally enter the failure state 4 after a non-exponential time, characterized again by an increasing failure rate function. In this case, all services have failed and the system has to be repaired in order to restore the total amount of resources, returning to state 1 after an exponentially distributed time.

The VoIP system's behavior is presented in Figure 1, where state 1 is the robust state of the system. States 2 and 3 indicate the levels of resource degradation, and states $C_1$ and $C_2$ are the check states corresponding to states 2 and 3. State R is the rejuvenation state, which the system enters when rejuvenation actions need to be triggered, and state 4 is the failure state. When the system is in state 1, the process enters state 2 when a transition with exponential distribution $F_1(t)$ occurs. From state 2 it may enter the check state $C_1$ with a non-exponential distribution $F_C$, or the degraded state 3 with a non-exponential distribution $F_2$. According to the reply of the checking procedure for state 2, the system will enter the rejuvenation state R with an exponential distribution $F_{C1}$, or it will return to state 2 with an exponential distribution $F_{REP1}$ if the check indicates that new calls can arrive at the system. Correspondingly, from state 3 the process may enter the check state $C_2$ following the non-exponential distribution $F_C$, or the failure state 4 with a non-exponential distribution $F_3$. On leaving check state $C_2$, it will either enter the rejuvenation state with an exponential distribution $F_{C2}$ or return to state 3 again


with an exponential distribution $F_{REP2}$. Finally, after the appropriate repair actions, the system will leave state 4 and enter the robust state 1 with an exponential distribution $F_4$.

2.2. Semi-Markov analysis

The semi-Markov analysis of the model described above is carried out with the so-called two-stage method [15], [14], [17]. This method requires the kernel matrix K(t) [17], [13], with rows and columns ordered as 1, 2, 3, $C_1$, $C_2$, R, 4:

$$
K(t) = \begin{pmatrix}
0 & k_{12} & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & k_{23} & k_{2C_1} & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & k_{3C_2} & 0 & k_{34} \\
0 & k_{C_1 2} & 0 & 0 & 0 & k_{C_1 R} & 0 \\
0 & 0 & k_{C_2 3} & 0 & 0 & k_{C_2 R} & 0 \\
k_{R1} & 0 & 0 & 0 & 0 & 0 & 0 \\
k_{41} & 0 & 0 & 0 & 0 & 0 & 0
\end{pmatrix} \tag{1}
$$

where $k_{ij}(t) = \Pr\{Y_1 = j, T_1 \le t \mid Y_0 = i\}$, $i, j \in E$, $E = \{1, 2, 3, C_1, C_2, R, 4\}$ is the state space of the model, and $\{(Y_n, T_n), n \ge 0\}$ is the underlying Markov renewal sequence of random variables [17]. Thus $k_{ij}(t)$ is the probability that, given that the SMP has entered state i, the next transition occurs within time t and takes the process to state j.

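Denoting by $F_i(t)$ the cdf of any transition from state $i \in E$ and by $\bar F_i(t) = 1 - F_i(t)$ its complementary cdf, the elements of the kernel matrix K(t) are given as follows:

$$
\begin{aligned}
& k_{12} = F_1(t), \quad k_{23} = \int_0^t \bar F_C(x)\,dF_2(x), \quad k_{2C_1} = \int_0^t \bar F_2(x)\,dF_C(x), \\
& k_{34} = \int_0^t \bar F_C(x)\,dF_3(x), \quad k_{3C_2} = \int_0^t \bar F_3(x)\,dF_C(x), \\
& k_{C_1 2} = \int_0^t \bar F_{C1}(x)\,dF_{REP1}(x), \quad k_{C_1 R} = \int_0^t \bar F_{REP1}(x)\,dF_{C1}(x), \\
& k_{C_2 3} = \int_0^t \bar F_{C2}(x)\,dF_{REP2}(x), \quad k_{C_2 R} = \int_0^t \bar F_{REP2}(x)\,dF_{C2}(x), \\
& k_{R1} = F_{REJ}(t), \quad k_{41} = F_4(t)
\end{aligned} \tag{3}
$$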
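As an illustration of how the kernel elements behave when the check time is deterministic, the Python sketch below evaluates $k_{23}(t)$ and $k_{2C_1}(t)$ for $F_C(t) = u(t - r)$. In that case the Stieltjes integrals collapse to $k_{23}(t) = F_2(\min(t, r))$ and $k_{2C_1}(t) = \bar F_2(r)$ for $t \ge r$ (zero otherwise). The function names are hypothetical, and the Weibull form $F(t) = 1 - e^{-(\lambda t)^{\alpha}}$ is an assumption, since the paper does not state its parameterization; the defaults λ = 0.78 and α = 2 are the values used in the paper's numerical example.

```python
import math


def weibull_cdf(t, lam=0.78, alpha=2.0):
    """Assumed Weibull form F(t) = 1 - exp(-(lam*t)**alpha) for t >= 0."""
    if t <= 0.0:
        return 0.0
    return 1.0 - math.exp(-((lam * t) ** alpha))


def k_23(t, r, cdf=weibull_cdf):
    """k_23(t): the 2 -> 3 degradation occurs by time t and before the check at r."""
    return cdf(min(t, r))


def k_2C1(t, r, cdf=weibull_cdf):
    """k_2C1(t): the check fires at the fixed time r <= t while still in state 2."""
    return 1.0 - cdf(r) if t >= r else 0.0


if __name__ == "__main__":
    r = 0.5  # time to trigger rejuvenation, in days (illustrative value)
    for t in (0.25, 0.5, 5.0):
        print(t, k_23(t, r), k_2C1(t, r))
```

For large t the two values sum to 1, which is the row-sum condition that the limiting matrix of the embedded chain has to satisfy for state 2.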

Notice that the time to trigger rejuvenation is a fixed duration, and hence its cdf is $F_C(t) = u(t - r)$, where $u(t)$ is the unit step function and r is the time to trigger rejuvenation [5]. Based on the two-stage method of SMP analysis, the one-step transition probability matrix of the embedded Markov chain (EMC) is derived as $P = \lim_{t \to \infty} K(t)$ [13]:

$$
P = \begin{pmatrix}
0 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & F_2(r) & \bar F(r) & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & \bar F(r) & 0 & F_3(r) \\
0 & p & 0 & 0 & 0 & 1-p & 0 \\
0 & 0 & q & 0 & 0 & 1-q & 0 \\
1 & 0 & 0 & 0 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 & 0 & 0 & 0
\end{pmatrix} \tag{2}
$$

where $\bar F(r) = 1 - F_2(r) = 1 - F_3(r)$ is the complementary cdf evaluated at r, that is, the probability of the transition from states 2 and 3 to the control states $C_1$ and $C_2$ respectively. Parameter p is the probability that, when the system enters the control state $C_1$, its resource degradation is at a satisfactory level and the system is able to serve new resource requests. Hence, p is the probability of the transition from state $C_1$ back to state 2. Correspondingly, q is the probability that the system returns from state $C_2$ to state 3. In the following the


distribution functions of the transitions from state 2 to state 3 and from state 3 to state 4 are assumed identical ($F_2(t) = F_3(t) = F(t)$). The vector v of the steady-state probabilities of the EMC is derived by solving the system of linear equations $v = vP$, $\sum_{i \in E} v_i = 1$, which gives:

$$
\begin{aligned}
& v_1 = \frac{(1 - p\bar F(r))(1 - q\bar F(r))}{D}, \quad v_2 = \frac{1 - q\bar F(r)}{D}, \quad v_3 = \frac{F(r)}{D}, \quad v_4 = \frac{F^2(r)}{D}, \\
& v_{C_1} = \frac{\bar F(r)(1 - q\bar F(r))}{D}, \quad v_{C_2} = \frac{F(r)\bar F(r)}{D}, \quad v_R = \frac{(1 - p\bar F(r))(1 - q\bar F(r)) - F^2(r)}{D}
\end{aligned} \tag{4}
$$

where $D = 2(1 - p\bar F(r))(1 - q\bar F(r)) + (1 + \bar F(r))(1 - q\bar F(r) + F(r))$ is the normalizing constant, equal to the sum of the seven numerators. According to the literature on SMPs [15], [10], [4], the steady-state probability of state i of the SMP is given by equation (5):

$$
\pi_i = \frac{v_i h_i}{\sum_{j \in E} v_j h_j}, \quad i \in E \tag{5}
$$

where $h_i$ is the mean sojourn time of the process in state i. In the present study the distribution of the time spent in states 2 and 3, denoted by F(t), is assumed to be a Weibull distribution with shape parameter $\alpha > 1$, which makes it an IFR distribution. Moreover, the mean sojourn times in the check states are assumed to be zero, as they are negligible compared with the remaining sojourn times. The other sojourn times are constant [14], [17]:

$$
\begin{aligned}
h_1 &= \int_0^\infty \bar F_1(t)\,dt = \gamma_1, \\
h_2 &= \int_0^\infty \bar F_C(t)\bar F_2(t)\,dt = \int_0^\infty [1 - u(t-r)]\,\bar F_2(t)\,dt = \int_0^r \bar F(t)\,dt, \\
h_3 &= \int_0^\infty \bar F_C(t)\bar F_3(t)\,dt = \int_0^\infty [1 - u(t-r)]\,\bar F_3(t)\,dt = \int_0^r \bar F(t)\,dt, \\
h_R &= \int_0^\infty \bar F_{REJ}(t)\,dt = \gamma_R, \\
h_4 &= \int_0^\infty \bar F_4(t)\,dt = \gamma_4, \\
h_{C_1} &= 0, \quad h_{C_2} = 0
\end{aligned} \tag{6}
$$
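The chain of computations in this section (embedded chain, its stationary vector, mean sojourn times, SMP probabilities) can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the function names are hypothetical, the Weibull form $\bar F(t) = e^{-(\lambda t)^{\alpha}}$ is assumed because the paper does not give its parameterization, and the stationary vector is normalized by the sum of the numerators of equation (4) rather than by a hard-coded constant. The state order is 1, 2, 3, C1, C2, R, 4 throughout.

```python
import math


def emc_matrix(F, p, q):
    """One-step matrix P of equation (2); state order 1, 2, 3, C1, C2, R, 4."""
    Fb = 1.0 - F
    return [
        [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0],
        [0.0, 0.0, F, Fb, 0.0, 0.0, 0.0],
        [0.0, 0.0, 0.0, 0.0, Fb, 0.0, F],
        [0.0, p, 0.0, 0.0, 0.0, 1.0 - p, 0.0],
        [0.0, 0.0, q, 0.0, 0.0, 1.0 - q, 0.0],
        [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
        [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    ]


def emc_stationary(F, p, q):
    """Numerators of equation (4), normalized so that the entries sum to 1."""
    Fb = 1.0 - F
    a = (1.0 - p * Fb) * (1.0 - q * Fb)
    weights = [a, 1.0 - q * Fb, F, Fb * (1.0 - q * Fb), F * Fb, a - F * F, F * F]
    total = sum(weights)
    return [w / total for w in weights]


def sojourn_degraded(r, lam, alpha, n=2000):
    """h_2 = h_3 = integral_0^r exp(-(lam*t)**alpha) dt, composite Simpson (n even)."""
    step = r / n
    surv = lambda t: math.exp(-((lam * t) ** alpha))
    acc = surv(0.0) + surv(r)
    for k in range(1, n):
        acc += (4.0 if k % 2 else 2.0) * surv(k * step)
    return acc * step / 3.0


def smp_probabilities(r, p, q, g1, gR, g4, lam, alpha):
    """Equation (5): pi_i is proportional to v_i * h_i."""
    F = 1.0 - math.exp(-((lam * r) ** alpha))
    v = emc_stationary(F, p, q)
    h = sojourn_degraded(r, lam, alpha)
    sojourn = [g1, h, h, 0.0, 0.0, gR, g4]  # check states have zero sojourn time
    weighted = [vi * hi for vi, hi in zip(v, sojourn)]
    total = sum(weighted)
    return [w / total for w in weighted]


if __name__ == "__main__":
    # r = 0.5 days is an illustrative choice; the rest are the paper's example values.
    print(smp_probabilities(0.5, 0.8, 0.2, 1.0, 1.0 / 12.0, 1.0 / 3.0, 0.78, 2.0))
```

A quick sanity check is to multiply the vector returned by `emc_stationary` by the matrix from `emc_matrix` and confirm that it is reproduced, which is exactly the condition $v = vP$.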

3. Optimizing VoIP service availability

The aim of the present work is to propose a rejuvenation policy that increases the asymptotic service availability of the VoIP system described above. The optimal policy consists in deriving the time to perform software rejuvenation that maximizes the system's availability.

3.1. VoIP service availability computations

Firstly, the asymptotic service availability $A_V = \sum_{i \in U} \pi_i$ has to be derived. Set U is the subset of the state space E that contains the up states of the system. Its complement is subset D, which contains the down states. Specifically, $U = \{1, 2, 3\}$, $D = \{C_1, C_2, R, 4\}$ and $U \cup D = E$, $U \ne \emptyset$, $U \ne E$. Writing $h = h_2 = h_3$ and $h_C = h_{C_1} = h_{C_2}$, the steady-state probabilities of the SMP obtained from equations (5) and (6) provide the steady-state service availability:

$$
A_V = \pi_1 + \pi_2 + \pi_3 = \frac{v_1 h_1 + (v_2 + v_3)h}{v_1 h_1 + (v_2 + v_3)h + (v_{C_1} + v_{C_2})h_C + v_R h_R + v_4 h_4} \tag{7}
$$


and it is a function of the time to trigger rejuvenation r, since the probabilities p and q and the constant mean sojourn times are determined by the characteristics of the VoIP system studied.

3.2. Optimal rejuvenation policy

So far, the VoIP service availability $A_V(r)$ has been determined. The aim is to determine the optimal r that maximizes the system's availability. The corresponding time to trigger rejuvenation is the solution of the optimization problem $\max A_V(r)$, $r > 0$. Adopting this solution as the rejuvenation policy optimizes VoIP service availability: software rejuvenation is performed when the system experiences resource exhaustion and either no more new calls can be served or the quality of the ongoing calls is significantly degraded.
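The optimization problem above can be explored numerically. The following Python sketch evaluates $A_V(r)$ from equation (7) and searches a grid of r values for the maximizer. It is a sketch under assumptions, not the authors' procedure: the grid search, the function names, the midpoint integration rule and the Weibull form $\bar F(t) = e^{-(\lambda t)^{\alpha}}$ are all choices made here for illustration. The default parameter values are those of the paper's numerical example (p, q, γ1, γR, γ4, λ, α), with times in days.

```python
import math


def availability(r, p=0.8, q=0.2, g1=1.0, gR=1.0 / 12.0, g4=1.0 / 3.0,
                 lam=0.78, alpha=2.0, n=400):
    """A_V(r) of equation (7); the Weibull form used here is an assumption."""
    F = 1.0 - math.exp(-((lam * r) ** alpha))
    Fb = 1.0 - F
    a = (1.0 - p * Fb) * (1.0 - q * Fb)
    # Unnormalized EMC probabilities; the constant D cancels in the ratio.
    v1, v2, v3, vR, v4 = a, 1.0 - q * Fb, F, a - F * F, F * F
    # h = integral_0^r exp(-(lam*t)**alpha) dt by the composite midpoint rule.
    step = r / n
    h = step * sum(math.exp(-((lam * (k + 0.5) * step) ** alpha)) for k in range(n))
    up = v1 * g1 + (v2 + v3) * h
    down = vR * gR + v4 * g4  # check states contribute nothing because h_C = 0
    return up / (up + down)


def best_trigger_time(r_max=1.0, points=200, **params):
    """Grid search for r* in (0, r_max]; a numerical stand-in for max A_V(r)."""
    grid = [r_max * (k + 1) / points for k in range(points)]
    return max(grid, key=lambda r: availability(r, **params))


if __name__ == "__main__":
    r_star = best_trigger_time()
    print("r* =", r_star, "A_V(r*) =", availability(r_star))
```

A grid search is used only because it needs no derivative information and no external library; a one-dimensional optimizer could replace it. As r approaches 0 the sketch tends to $\gamma_1 / (\gamma_1 + \gamma_R)$, since the process then alternates only between the robust state and rejuvenation.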

4. VoIP service reliability

In order to study the reliability of the presented VoIP system, state 4 of Figure 1 is first assumed to be an absorbing state. In other words, there is no repair action in the case that the system runs out of resources and eventually fails. Hence the state space E of the model is partitioned into two new subsets, $T = \{1, 2, 3, C_1, C_2, R\}$ and $A = \{4\}$, containing the transient and the absorbing states respectively. In this case the corresponding one-step transition probability matrix P′ of the EMC, restricted to the transient states, is [2]:

$$
P' = \begin{pmatrix}
0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & F(r) & \bar F(r) & 0 & 0 \\
0 & 0 & 0 & 0 & \bar F(r) & 0 \\
0 & p & 0 & 0 & 0 & 1-p \\
0 & 0 & q & 0 & 0 & 1-q \\
1 & 0 & 0 & 0 & 0 & 0
\end{pmatrix} \tag{8}
$$

Using the approach introduced in [15], [1], the service reliability of the VoIP system, in terms of MTTF, can be computed according to equation (9):

$$
MTTF(r) = \sum_{i \in T} N_i h_i \tag{9}
$$

where $h_i$ is the mean sojourn time of state $i \in T$, given by equation (6), and $N_i$ denotes the average number of times that state $i \in T$ is visited before the EMC is absorbed. These quantities are obtained by solving the system of equations:

$$
N_i = p'_i + \sum_{j \in T} N_j p'_{ji}, \quad i \in T \tag{10}
$$

with $p'_i$ denoting the probability that the EMC starts in state i and $p'_{ji}$ the (j, i) element of matrix P′. As can be seen from equation (9), the MTTF is a function of the time to trigger rejuvenation r. Hence, as the optimal time to perform software rejuvenation has already been derived in Section 3, substituting the optimal r into equation (9) provides the corresponding value of the MTTF. Consequently, we have not only determined the optimal time to trigger rejuvenation but can also determine the corresponding MTTF as a service reliability measure for the VoIP system.
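Equations (9) and (10) amount to solving one small linear system. The Python sketch below does this for the transient states in the order 1, 2, 3, C1, C2, R, assuming that the chain starts in state 1 (so $p'_1 = 1$), which the paper does not state explicitly. The function names, the hand-written Gaussian elimination, the midpoint integration rule and the Weibull form $\bar F(t) = e^{-(\lambda t)^{\alpha}}$ are likewise assumptions made for illustration.

```python
import math


def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(M[i][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for i in range(col + 1, n):
            factor = M[i][col] / M[col][col]
            for j in range(col, n + 1):
                M[i][j] -= factor * M[col][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x


def expected_visits(F, p, q, start=0):
    """N_i of equation (10) for T = (1, 2, 3, C1, C2, R); start=0 means state 1."""
    Fb = 1.0 - F
    P = [
        [0.0, 1.0, 0.0, 0.0, 0.0, 0.0],
        [0.0, 0.0, F, Fb, 0.0, 0.0],
        [0.0, 0.0, 0.0, 0.0, Fb, 0.0],
        [0.0, p, 0.0, 0.0, 0.0, 1.0 - p],
        [0.0, 0.0, q, 0.0, 0.0, 1.0 - q],
        [1.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    ]
    n = len(P)
    # The row-vector system N (I - P') = p' is solved as (I - P')^T N^T = p'^T.
    A = [[(1.0 if i == j else 0.0) - P[j][i] for j in range(n)] for i in range(n)]
    b = [1.0 if i == start else 0.0 for i in range(n)]
    return solve_linear(A, b)


def mttf(r, p=0.8, q=0.2, g1=1.0, gR=1.0 / 12.0, lam=0.78, alpha=2.0, n=400):
    """MTTF(r) of equation (9); requires r > 0 so that absorption is possible."""
    F = 1.0 - math.exp(-((lam * r) ** alpha))
    N = expected_visits(F, p, q)
    step = r / n
    h = step * sum(math.exp(-((lam * (k + 0.5) * step) ** alpha)) for k in range(n))
    sojourn = [g1, h, h, 0.0, 0.0, gR]  # check states have zero sojourn time
    return sum(Ni * hi for Ni, hi in zip(N, sojourn))


if __name__ == "__main__":
    print("MTTF(0.5) in days:", mttf(0.5))
```

One useful check on the solution is that the expected number of visits to state 3 equals $1 / F(r)$, because every visit to state 3 ends in failure with probability $F(r)$ and absorption happens exactly once.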

5. Numerical results

In order to illustrate our study, a numerical example based on experimental data is presented. To derive the VoIP service availability function of equation (7), we assume the parameter values shown in Table 1. The probability p of returning to state 2 after the resource check in $C_1$ is greater than the probability q of returning to state 3 after the check in $C_2$, because the level of resource exhaustion in state 3 is higher than that in state 2, as more calls are being served at that time. As far as the Weibull distribution of the time spent in states 2 and 3 is concerned, the shape parameter is chosen to be α = 2, as the distribution has to be IFR, and the remaining distribution parameter is assumed to be λ = 0.78 days^-1. Furthermore, the time that the system spends in the check states $C_1$ and $C_2$ is much shorter than all the other sojourn times and can be taken as 0 without loss of generality.

Table 1. Model parameters
_________________________________________________
Parameter        Value
_________________________________________________
p                0.8
q                0.2
γ1               1 day
γR               1/12 days
γ4               1/3 days
_________________________________________________

Figure 2. VoIP service availability vs. time to trigger rejuvenation (x-axis: time r to trigger rejuvenation, 0 to 1; y-axis: service availability)

Figure 3. VoIP service availability for mean sojourn time γ1 values (curves for γ1 = 3, 2, 1 and 1/2; x-axis: time r to trigger rejuvenation, 0 to 1; y-axis: service availability)

In Figure 2, the service availability function is shown for the parameter values of Table 1 and r values varying from 0 (no rejuvenation) to 1, indicating rejuvenation every day. Moreover, in Figure 3 the service availability function is depicted for different values of γ1. The mean sojourn time γ1 is examined because it reflects VoIP traffic and consequently can affect r* and $A_V(r^*)$. Figure 3 shows that higher levels of availability are achieved when the process spends more time in state 1, because resource exhaustion and transitions to the down states are delayed.


By solving the optimization problem of max AV(r), 0