Threshold and Reservation Based Call Admission Control Policies for Multiservice Resource-Sharing Systems

Jian Ni, Danny H. K. Tsang, Sekhar Tatikonda, Brahim Bensaou

Departments (as listed): Computer Science; Electrical Engineering; Electrical Engineering; Electrical & Electronic Engineering
Institutions (as listed): Yale University, New Haven, CT 06520, USA; HKUST, Clear Water Bay, Hong Kong, PRC

Abstract— Many communications and networking systems can be modelled as resource-sharing systems with multiple classes of calls. Call admission control (CAC) is an essential component of such systems. For most practical systems it is prohibitively difficult to compute the optimal CAC policy that optimizes certain performance metrics because of the ‘curse of dimensionality’. In this paper we study two families of structured CAC policies: threshold and reservation policies. These policies are easy to implement and have good performance in practice. However, since the number of structured policies grows exponentially with the number of call classes and the capacity of the system, finding the optimal structured policies is a complex unsolved problem. In this paper efficient search algorithms are proposed to find the coordinate optimal structured policies among all structured policies. Through extensive numerical experiments we show that the search algorithms converge quickly and work for systems with large capacity and many call classes. In addition, the returned structured policies have optimal or near-optimal performance, and outperform those structured policies with parameters chosen based on simple heuristics. Keywords—Resource sharing, call admission control, threshold policies, reservation policies, combinatorial optimization.

I. INTRODUCTION

We consider a general model of resource-sharing systems with multiple classes of calls. The model has $C$ units of resources and is shared by $K$ different classes of calls. Class-$k$ calls arrive according to a Poisson process with rate $\lambda_k$. When a call arrives, the system either accepts or blocks the call based on a certain call admission control (CAC) policy. A class-$k$ call, once accepted, occupies $b_k$ units of resources, and the $b_k$ units of resources will be released simultaneously when the call finishes after an exponentially distributed service time with mean $1/\mu_k$ (extensions to general service time distributions will also be discussed). It is a loss system with no waiting room. This model is also referred to as a stochastic knapsack [6], [15], [17]. Many communications and networking systems can be modelled as such multiservice resource-sharing systems. CAC is an essential component of these systems. It determines how the system resources should be shared among the different call classes in order to achieve good performance.

Example 1: In (virtual) circuit-switched networks, an access link with a certain amount of bandwidth needs to accommodate different service types including data, audio, video, and many emerging multimedia applications, each with heterogeneous bandwidth requirements and traffic statistics. The goal of CAC here is, for example, to maximize the throughput of the link.

Example 2: In mobile cellular systems, at each cell, a base station with a limited number of channels (frequency bands, time slots, or codes) needs to handle both new calls originating in this cell and handoff calls coming from neighboring cells. Normally handoff calls have higher priority over new calls because one does not want to terminate an ongoing call. The goal of CAC here is, for example, to minimize the new/handoff call blocking probabilities while giving priority to handoff calls.

Example 3: In multimedia content delivery networks (CDNs), a CDN server with a certain amount of streaming bandwidth needs to serve user requests for different types of multimedia objects that have different session times and streaming bandwidth requirements. The CDN server collects a revenue (the amount is determined by the type of the requested object) from each accepted user. The goal of CAC here is, for example, to maximize the revenue rate of the CDN server.

This resource-sharing model also has numerous instances in many computer and social systems [5], [11].

The simplest CAC policy, known as complete sharing (CS), is to accept a call whenever there are sufficient resources, and to block the call otherwise. The CS policy is easy to implement, and performs well if the traffic demand is light or there is only one call class. However, in today's communications and networking systems traffic demands are unpredictable and most often the system operates close to its full-capacity regime. In this regime the CS policy may lead to poor resource utilization.
In addition, service differentiation is becoming an increasingly desirable feature for many systems. For example, some users may want to pay more to get better quality of service. If each accepted call contributes a class-based revenue, then the CS policy may lead to a poor revenue rate for the system.

It is desirable to implement a CAC policy that optimizes certain performance metrics like call blocking probabilities, throughput, or revenue rate of the system (these performance metrics are defined in the long term or at the steady state). The system can be modelled as a Markov decision process (MDP) and MDP tools can be applied to analyze and compute the optimal CAC policy [3], [11], [16]. However, the optimal CAC policy is in general complicated [3], [16]. Furthermore, for most practical systems it is prohibitively difficult to compute the optimal CAC policy using any MDP algorithm. Therefore, we are motivated to study two families of structured CAC policies: threshold and reservation policies. These policies are easy to implement and have good performance in practice.

Under a threshold policy [5], [6], [15], [17], each class $k$ is associated with a threshold parameter $t_k$. A class-$k$ call is accepted upon arrival if and only if there are sufficient resources and the number of class-$k$ calls in the system does not exceed $t_k$ after acceptance. The system state under a threshold policy has a product-form stationary distribution and its performance can be evaluated through efficient convolution algorithms [19]. However, since the number of threshold policies grows exponentially with the number of call classes and the system capacity, finding the optimal threshold policy is a complex combinatorial optimization problem. In this paper we propose an iterative coordinate search algorithm to find a coordinate optimal threshold policy among all threshold policies. We prove the convergence of the algorithm. Through extensive numerical experiments we show that the algorithm converges quickly and the returned threshold policy has near-optimal performance.

Under a reservation policy, each class $k$ is associated with a reservation parameter $r_k$. A class-$k$ call is accepted upon arrival if and only if the system has $r_k$ or more free resources (reserved for other classes) after acceptance.
The reservation policies are also known as trunk reservation in the literature of telephone/circuit-switched networks [2], [9], [12], and guard channel in the literature of mobile cellular systems [7], [14], [20]. For homogeneous cases in which all classes have the same resource requirement and mean service time, i.e., $b_k = b$ and $\mu_k = \mu$, the optimal CAC policy is a reservation policy [3], [9], [11]. For these cases the system state under a reservation policy can be modelled as a one-dimensional birth-death process. Direct calculation of its steady-state probabilities faces numerical overflow problems. We develop a set of fast recursive formulas that can evaluate the steady-state probabilities and the system performance under any reservation policy for systems with very large capacity and an arbitrary number of call classes. However, since the number of reservation policies grows exponentially with the number of call classes and the system capacity, finding the optimal reservation policy is a complex combinatorial optimization problem. Again, the proposed iterative coordinate search algorithm can be applied to find an ordered coordinate optimal reservation policy among all reservation policies. Through extensive numerical experiments we show that the algorithm converges quickly and the returned reservation policy has optimal or near-optimal performance.

For heterogeneous cases in which calls of different classes have heterogeneous resource requirements and mean service times, we propose a uniformization technique to convert a heterogeneous case into a homogeneous case and then apply the proposed search algorithm to find a good reservation policy.

We also compare the performance of the threshold policy and the reservation policy returned by the search algorithms. In terms of revenue rate, for homogeneous cases the returned reservation policy always yields a higher revenue rate, while for heterogeneous cases there is no strict winner. In terms of robustness, we find that reservation policies perform much better than threshold policies when real traffic demands vary.

We organize the paper as follows. Section II introduces the model and the optimization problem. In Sections III and IV efficient search algorithms are developed to find the optimal or near-optimal structured policies among all threshold policies and reservation policies respectively. In Section V we evaluate the performance of the search algorithms in terms of their convergence rate and the quality of the returned structured policy through extensive experiments. We conclude the paper and present some extensions in Section VI.

II. THE MODEL AND THE OPTIMIZATION PROBLEM

Let $\mathbf{n} = (n_1, n_2, \dots, n_K)$ denote the state of the system, where $n_k \in \mathbb{Z}_+$ is the number of class-$k$ calls that are currently in the system. Let $\Omega_{cs} \triangleq \{\mathbf{n} : \mathbf{n} \cdot \mathbf{b}^T = \sum_{k=1}^{K} n_k b_k \le C\}$ be the set of all feasible system states, where $\mathbf{b} = (b_1, b_2, \dots, b_K)$. The admissible state set of a CAC policy includes all states the system may enter under that policy. Let $B_k$ be the call blocking probability of class-$k$ calls. The system revenue rate $g$ is defined as the average revenue collected by all ongoing calls in the system per unit time. There are two revenue collecting models. In the first model, when a class-$k$ call is accepted, a lump-sum revenue $R_k$ is collected.
In the second model, a class-$k$ call generates revenue at rate $R_k'$ per unit time when the call is in progress. By Little's Theorem, these two models are stochastically exchangeable via the relation $R_k = R_k'/\mu_k$. Since $g = \sum_{k=1}^{K} \lambda_k (1 - B_k) R_k$, maximizing the system revenue rate is equivalent to minimizing the weighted summation of the call blocking probabilities. If we let $R_k = b_k/\mu_k$, then maximizing the system revenue rate is equivalent to maximizing the system throughput. Therefore, without loss of generality, we use the lump-sum revenue collecting model and choose the system revenue rate $g$ as the optimality criterion, in which $R_k = \gamma_k b_k/\mu_k$. Here $\gamma_k$ is the revenue coefficient of class-$k$ calls, which can be interpreted as the revenue generated by one unit of resource per unit time when the resource is used by a class-$k$ call. Classes with higher revenue coefficients are more profitable.

Since the state and action sets are finite, and under the assumptions of Poisson arrivals and bounded revenues, we only need to consider stationary Markov CAC policies in order to find the optimal CAC policy [13]. In addition, in this paper we only consider deterministic stationary CAC policies that generate a unichain [13]. A discussion of multichain CAC policies is out of the scope of this paper.

Let $d(\mathbf{n}) = (a_1(\mathbf{n}), a_2(\mathbf{n}), \dots, a_K(\mathbf{n}))$ be the decision rule when the system is in state $\mathbf{n}$. $a_k(\mathbf{n})$ takes value 0 or 1 if a class-$k$ call is blocked or accepted, respectively, when the system is in state $\mathbf{n}$. Each policy $d$ generates a controlled Markov reward process. Let $\Omega_d$ denote the admissible state set including all recurrent states. For each state $\mathbf{n} \in \Omega_d$, let $\pi(\mathbf{n})$ be the steady-state probability that the system is in state $\mathbf{n}$. Set $\pi(\mathbf{n}) = 0$ for all states $\mathbf{n} \notin \Omega_d$. The global balance equations of the system are:

$$\sum_{k=1}^{K} \left\{ \pi(\mathbf{n} - \mathbf{e}_k)\, \lambda_k\, a_k(\mathbf{n} - \mathbf{e}_k) + \pi(\mathbf{n} + \mathbf{e}_k)\,(n_k + 1)\, \mu_k \right\} = \pi(\mathbf{n}) \sum_{k=1}^{K} \left\{ \lambda_k\, a_k(\mathbf{n}) + n_k \mu_k \right\}, \quad \text{for all } \mathbf{n} \in \Omega_d, \tag{1}$$

$$\sum_{\mathbf{n} \in \Omega_d} \pi(\mathbf{n}) = 1. \tag{2}$$

$\mathbf{e}_k$ is the unit vector with the $k$-th element being 1 and other elements being 0. Once the equations are solved and the steady-state probabilities are known, performance metrics of interest, such as the call blocking probabilities and the system revenue rate, can be directly evaluated.

One brute-force approach to finding the optimal CAC policy is to search among all CAC policies, which is obviously not practical. By modelling the system as a Markov decision process, MDP tools can be applied to analyze and compute the optimal CAC policy [3], [11], [16]. The authors of [16] applied both the value iteration algorithm and the linear programming algorithm (see [13] for a full description of the algorithms) to compute the optimal CAC policy and observed that the value iteration algorithm requires less CPU time. We have implemented the value iteration algorithm. Using it we can compute the optimal CAC policy for small $K$ and $C$ (e.g., $K = 2$, $C = 100$, or $K = 4$, $C = 20$). But for large $C$ and especially large $K$, it faces the 'curse of dimensionality' because the number of states grows exponentially with $K$ and $C$. Hence it is desirable to establish the structural properties of the optimal CAC policy that enable efficient computation and easy implementation. Unfortunately, the optimal CAC policy in general does not have a simple structure (except for homogeneous cases) [3], [16]. Therefore, instead of finding the optimal CAC policy, one may want to find a near-optimal CAC policy among policies that have simple structures, if the structured policy is easy to compute and implement, and if its performance is close to that of the optimal CAC policy.

III. THRESHOLD BASED CAC POLICIES

Let $d(\mathbf{n}) = (a_1(\mathbf{n}), a_2(\mathbf{n}), \dots, a_K(\mathbf{n}))$ be the decision rule when the system is in state $\mathbf{n}$. Under a threshold policy with parameters $t_1, t_2, \dots, t_K$, $a_k(\mathbf{n}) = 1$ if and only if $\mathbf{n} \cdot \mathbf{b}^T + b_k \le C$ and $n_k + 1 \le t_k$.
Threshold policies are a rich family of structured policies that include the CS policy and the complete partitioning policies (under which the system with capacity $C$ is partitioned into $K$ subsystems with capacities $C_1, \dots, C_K$ such that $\sum_{k=1}^{K} C_k = C$, and subsystem $k$ is dedicated to class-$k$ calls, i.e., $a_k(\mathbf{n}) = 1$ if and only if $b_k (n_k + 1) \le C_k$).

A. Evaluate the Performance of a Threshold Policy

The threshold policies form a subset of the coordinate convex policies [5], [15], [17]. A CAC policy with associated admissible state set $\Omega$ is said to be coordinate convex if
1) $\Omega \subseteq \Omega_{cs}$ is a nonempty set that satisfies: for all $\mathbf{n} \in \Omega$, if $n_k > 0$ then $\mathbf{n} - \mathbf{e}_k \in \Omega$.
2) A class-$k$ call is accepted if and only if the system state remains in $\Omega$ after acceptance, i.e., $a_k(\mathbf{n}) = 1$ if and only if $\mathbf{n} + \mathbf{e}_k \in \Omega$.

It is clear that a threshold policy with parameters $t_1, t_2, \dots, t_K$ is coordinate convex, with the associated admissible state set $\Omega_t = \{\mathbf{n} : \mathbf{n} \cdot \mathbf{b}^T \le C;\ 0 \le n_k \le t_k \text{ for } 1 \le k \le K\}$.

Note that the CS policy is also coordinate convex, with the associated admissible state set $\Omega_{cs}$. Under a coordinate convex policy the induced Markov process is time reversible and its stationary distribution has a product form, as summarized in the following theorem [10], [15].

THEOREM 3.1. If the system is operated under a coordinate convex CAC policy with associated admissible state set $\Omega$, then the induced Markov process is time reversible and its steady-state probabilities are given by:

$$\pi(\mathbf{n}) = \frac{1}{G} \prod_{k=1}^{K} \frac{\rho_k^{n_k}}{n_k!}, \quad \mathbf{n} \in \Omega, \tag{3}$$

where

$$G \triangleq \sum_{\mathbf{n} \in \Omega} \prod_{k=1}^{K} \frac{\rho_k^{n_k}}{n_k!}, \qquad \rho_k \triangleq \frac{\lambda_k}{\mu_k}. \tag{4}$$
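As an illustration of Theorem 3.1, the product form (3)-(4) can be evaluated by enumerating the admissible state set of a threshold policy. The Python sketch below is not part of the original paper: the function name `evaluate_threshold` and its argument layout are assumptions made here, and the blocking probability of class $k$ is obtained, via the PASTA property of Poisson arrivals, as the stationary mass of the states $\mathbf{n}$ with $\mathbf{n} + \mathbf{e}_k \notin \Omega_t$. It is only meant for small $C$ and $K$.

```python
from itertools import product
from math import factorial


def evaluate_threshold(C, b, lam, mu, R, t):
    """Brute-force evaluation of a threshold policy via the product form (3)-(4).

    C: capacity; b, lam, mu, R: per-class resource need, arrival rate,
    service rate and lump-sum revenue; t: threshold vector.
    Returns (B, g): per-class blocking probabilities and the revenue rate.
    Hypothetical helper for illustration; enumerates every admissible state.
    """
    K = len(b)
    rho = [lam[k] / mu[k] for k in range(K)]
    # Admissible set Omega_t: n . b <= C and 0 <= n_k <= t_k.
    states = [n for n in product(*(range(tk + 1) for tk in t))
              if sum(nk * bk for nk, bk in zip(n, b)) <= C]
    admissible = set(states)
    # Unnormalized product-form weight of each state, eq. (3) without 1/G.
    weight = {}
    for n in states:
        w = 1.0
        for k in range(K):
            w *= rho[k] ** n[k] / factorial(n[k])
        weight[n] = w
    G = sum(weight.values())  # normalization constant, eq. (4)
    B = []
    for k in range(K):
        # A class-k arrival is blocked in state n iff n + e_k is not admissible.
        blocked = sum(w for n, w in weight.items()
                      if n[:k] + (n[k] + 1,) + n[k + 1:] not in admissible)
        B.append(blocked / G)
    g = sum(lam[k] * (1.0 - B[k]) * R[k] for k in range(K))
    return B, g
```

For a single class with $C = 2$ and $\rho_1 = 1$ this reduces to the Erlang-B formula, which gives a quick sanity check of the sketch.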

Direct calculation of (3)-(4) is not efficient since it requires enumerating all states in $\Omega$. It also faces numerical overflow problems for large $C$ when calculating the normalization constant $G$. The convolution algorithm with binary tree implementation proposed in [19] can be applied to evaluate the system performance under any threshold policy, with a computational complexity of $O(C^2 K \log K)$.

The insensitivity property [4], [8], [17] states that (3)-(4) still hold if the exponential service time distributions are replaced by arbitrary distributions with finite means. Exponential service times are good assumptions for many applications but not for all. For example, for video streaming and live streaming applications, the session times are more or less deterministic. The insensitivity property enables us to extend the results here to arbitrary service time distributions.

B. Find the Optimal or Near-Optimal Threshold Policy

Given other parameters $C$, $K$, $(\gamma_k, b_k, \lambda_k, \mu_k)_{k=1,\dots,K}$, the system revenue rate $g$ is a function of the threshold vector $\mathbf{t} \triangleq (t_1, t_2, \dots, t_K) \in \mathbf{T} \triangleq T_1 \times T_2 \times \cdots \times T_K$, where $T_k \triangleq \{0, 1, \dots, N_k\}$ and $N_k \triangleq \lfloor C/b_k \rfloor$. Since $|\mathbf{T}| = \prod_{k=1}^{K} |T_k| = \prod_{k=1}^{K} (N_k + 1)$, there exist $\prod_{k=1}^{K} (N_k + 1)$, or $O(C^K)$, different threshold policies. Finding the optimal threshold policy using brute-force search becomes intractable for large $C$ and especially large $K$. We propose an iterative coordinate search algorithm to find a coordinate optimal threshold policy among all threshold policies.

DEFINITION 3.1. A threshold vector $\mathbf{t} = (t_1, t_2, \dots, t_K) \in \mathbf{T}$ is a coordinate optimal point if it satisfies the following coordinate optimal property:

$$g(t_1, \dots, t_k, \dots, t_K) \ge g(t_1, \dots, \xi, \dots, t_K) \tag{5}$$

for all $\xi \in T_k = \{0, 1, \dots, N_k\}$, $k = 1, \dots, K$. A threshold policy with threshold vector $\mathbf{t}$ is a coordinate optimal threshold policy if $\mathbf{t}$ is a coordinate optimal point.

Note that if we model the problem of finding the optimal threshold policy as a $K$-player cooperative game, in which each class $k$ chooses its threshold parameter $t_k$ and all classes have the same payoff function $g(t_1, t_2, \dots, t_K)$, then a coordinate optimal point that satisfies (5) is a Nash equilibrium of the cooperative game.

THEOREM 3.2. Assume that the call arrival rates are finite and the revenues are bounded. Then there exists at least one coordinate optimal point in $\mathbf{T}$.

PROOF: $|\mathbf{T}| = \prod_{k=1}^{K} (N_k + 1)$, so $\mathbf{T}$ is a finite set. For each $\mathbf{t} \in \mathbf{T}$, since the call blocking probabilities are numbers between 0 and 1, we have

$$0 \le g(\mathbf{t}) = \sum_{k=1}^{K} \lambda_k R_k (1 - B_k) \le \sum_{k=1}^{K} \lambda_k R_k < \infty.$$

Hence $\{g(\mathbf{t}) : \mathbf{t} \in \mathbf{T}\}$ is a finite bounded set, and there exists a $\mathbf{t}^* \in \mathbf{T}$ such that $g(\mathbf{t}^*) = \max\{g(\mathbf{t}) : \mathbf{t} \in \mathbf{T}\}$, i.e., $\mathbf{t}^*$ is the threshold vector of the optimal threshold policy that achieves the highest revenue rate. Clearly $\mathbf{t}^*$ satisfies the coordinate optimal property (5) and thus is a coordinate optimal point. □

We order the classes by their revenue coefficients so that $\gamma_1 \ge \gamma_2 \ge \dots \ge \gamma_K$. If $\gamma_k = \gamma_{k+1}$, then classes $k$ and $k + 1$ are sorted according to $\mathbf{b}$ such that $b_k \ge b_{k+1}$.

Algorithm ICSA THD: ITERATIVE COORDINATE SEARCH ALGORITHM FOR THRESHOLD POLICIES

1. Choose an initial threshold vector $\mathbf{t}^0$. Set $m = 1$.
2. For $k = 1, 2, \dots, K$, search along the $k$-th coordinate set $T_k$ in sequence to find $t_k^m$ such that:

$$t_k^m = \arg\max_{\xi \in T_k} g(t_1^m, \dots, t_{k-1}^m, \xi, t_{k+1}^{m-1}, \dots, t_K^{m-1}). \tag{6}$$

(In a situation where more than one $\xi \in T_k$ achieves the same maximum $g$, the largest $\xi$ is chosen to break the tie.)
3. If $\mathbf{t}^m \triangleq (t_1^m, t_2^m, \dots, t_K^m) = \mathbf{t}^{m-1}$, stop. Otherwise, increase $m$ by 1 and go to step 2.

THEOREM 3.3. Assume that the call arrival rates are finite and the revenues are bounded. Then ICSA THD converges in a finite number of iterations and the returned threshold policy is a coordinate optimal threshold policy.

PROOF: Define $\mathbf{x}_k^m = (t_1^m, \dots, t_k^m, t_{k+1}^{m-1}, \dots, t_K^{m-1})$ for $0 \le k \le K$. At the $m$-th iteration, from (6) we have:

$$g(\mathbf{t}^{m-1}) = g(\mathbf{x}_0^m) \le g(\mathbf{x}_1^m) \le \cdots \le g(\mathbf{x}_K^m) = g(\mathbf{t}^m). \tag{7}$$

The stopping criterion implies that if ICSA THD does not stop at iteration $m$, then from (7) we have

$$g(\mathbf{t}^0) < g(\mathbf{t}^1) < \cdots < g(\mathbf{t}^m). \tag{8}$$

For any $\mathbf{t} \in \mathbf{T}$, define $l_k(\mathbf{t}) = \{(t_1, \dots, t_{k-1}, \xi, t_{k+1}, \dots, t_K) : \xi \in T_k\}$, which represents a line along the $k$-th coordinate of $\mathbf{T}$. Let $L_k = \{l_k(\mathbf{t}) : t_i \in T_i,\ i \ne k\}$ be the set that includes all the lines along the $k$-th coordinate of $\mathbf{T}$. At the $m$-th iteration, ICSA THD searches along exactly one line $l_k(\mathbf{x}_{k-1}^m)$ in $L_k$ for $1 \le k \le K$ in sequence. By the definitions of $\mathbf{x}_k^m$ and $l_k(\mathbf{t})$, and from (6), we have:

$$\mathbf{x}_k^m = \arg\max_{\mathbf{t} \in l_k(\mathbf{x}_{k-1}^m)} g(\mathbf{t}). \tag{9}$$

Assume, towards a contradiction, that ICSA THD searches the same line in some $L_k$ more than once without stopping immediately, i.e., assume that ICSA THD does not converge at iteration $m$, but there exist some $k$ and $i, j \le m$, $i \ne j$, such that $l_k(\mathbf{x}_{k-1}^i) = l_k(\mathbf{x}_{k-1}^j)$. Since $t_k^m$ in (6) is uniquely determined, so is $\mathbf{x}_k^m$ in (9). Hence $l_k(\mathbf{x}_{k-1}^i) = l_k(\mathbf{x}_{k-1}^j) \Rightarrow \mathbf{x}_k^i = \mathbf{x}_k^j \Rightarrow l_{k+1}(\mathbf{x}_k^i) = l_{k+1}(\mathbf{x}_k^j)$. Continuing the induction, we finally have $\mathbf{x}_K^i = \mathbf{x}_K^j \Rightarrow g(\mathbf{t}^i) = g(\mathbf{x}_K^i) = g(\mathbf{x}_K^j) = g(\mathbf{t}^j)$, which contradicts (8). Therefore, during every iteration ICSA THD searches a distinct line in $L_k$ for $1 \le k \le K$ before it converges. Since $|L_k| = \prod_{i=1, i \ne k}^{K} (N_i + 1)$, we conclude that ICSA THD converges in a finite number of iterations which is bounded by $\min\{|L_k| : 1 \le k \le K\}$. When ICSA THD converges at iteration $M$, i.e., $\mathbf{t}^M = \mathbf{t}^{M-1}$, then from (6) it is clear that $\mathbf{t}^M$ is a coordinate optimal point and the returned policy is a coordinate optimal threshold policy. □

Let $M_t$ be the number of iterations required by ICSA THD. During each iteration, ICSA THD searches $\sum_{k=1}^{K} (N_k + 1)$, or $O(CK)$, different threshold policies. Let $T_t$ be the computational complexity of evaluating the performance of a threshold policy. Hence the computational complexity of ICSA THD is $O(M_t C K T_t)$.

Theorem 3.2 shows that the optimal threshold vector is a coordinate optimal point. In addition, if it is the unique coordinate optimal point, then ICSA THD will always return the optimal threshold policy. In general, since ICSA THD is a local search algorithm [1], two important issues need to be investigated: 1) the convergence rate of the algorithm; 2) the quality of the returned threshold policy. One would also expect that the initial threshold vector $\mathbf{t}^0$ we choose may affect the performance of ICSA THD. These will be investigated in Section V.

We observed that the threshold policy returned by ICSA THD may exclusively block some low-profit classes to maximize the system revenue rate, i.e., $t_k = 0$ for some class $k$. In order to prevent such monopolism, we can modify $T_k$ in Definition 3.1 and ICSA THD to be the set $\{\underline{t}_k, \underline{t}_k + 1, \dots, N_k\}$ for some lower bound $\underline{t}_k > 0$. Clearly the statements of Theorem 3.2 and Theorem 3.3 still hold. In addition, the returned threshold policy satisfies $t_k \ge \underline{t}_k$, i.e., every class has some share of the resources.

IV. RESERVATION BASED CAC POLICIES

Let $d(\mathbf{n}) = (a_1(\mathbf{n}), a_2(\mathbf{n}), \dots, a_K(\mathbf{n}))$ be the decision rule when the system is in state $\mathbf{n}$. Under a reservation policy with parameters $r_1, r_2, \dots, r_K$, $a_k(\mathbf{n}) = 1$ if and only if $\mathbf{n} \cdot \mathbf{b}^T + b_k \le C - r_k$.

In general, a reservation policy is not coordinate convex. As an example, suppose $C = 5$, $\mathbf{b} = (4, 1)$. Under the reservation policy with parameters $r_1 = 0$ and $r_2 = 1$, the admissible state set is $\Omega_r = \{(0, 0), (0, 1), (0, 2), (0, 3), (0, 4), (1, 0), (1, 1)\}$. Although $(1, 0) + \mathbf{e}_2 = (1, 1) \in \Omega_r$, the system in state $(1, 0)$ will not accept a class-2 call because $r_2 = 1$. This violates the second condition of being a coordinate convex policy. Indeed, for the induced Markov process the transition rate from state $(1, 0)$ to $(1, 1)$ is zero, while the transition rate from $(1, 1)$ to $(1, 0)$ is positive since we cannot stop call departures. This implies that the induced Markov process is not time reversible and does not have a product-form stationary distribution.

Fig. 1. The birth-death process under a reservation policy.

A. Homogeneous Cases

We order the classes by their revenue coefficients so that $\gamma_1 \ge \gamma_2 \ge \dots \ge \gamma_K$.
For homogeneous cases, from Bellman's optimality equations [13], it can be shown that the optimal CAC policy is a reservation policy with ordered parameters $0 = r_1^* \le r_2^* \le \dots \le r_K^* \le C$ (see [3], [11]). Therefore, we only need to search among reservation policies to find the optimal CAC policy. However, those theoretical results do not provide any practical method to determine the optimal reservation parameters.

1) Evaluate the Performance of a Reservation Policy: For homogeneous cases, since all classes have the same resource requirement (without loss of generality we assume $b_k = 1$) and mean service time, once a call is accepted, its class becomes irrelevant to the future evolution of the system. Hence it suffices to choose $n$ as the system state when there are $n$ calls (of any class) in the system. Under a reservation policy with parameters $0 = r_1 \le r_2 \le \dots \le r_K \le C$, the system state can be modelled as a one-dimensional birth-death process as shown in Fig. 1, where $\lambda(k) \triangleq \sum_{i=1}^{k} \lambda_i$. In any state $n$, the death rate (call departure rate) is $n\mu$. In states $0, 1, \dots, C - r_K - 1$, the system accepts calls of all classes so the birth rate is $\lambda(K)$. When the system enters state $C - r_K$, it does not accept class-$K$ calls so the birth rate is $\lambda(K - 1)$. Birth rates in other states are similarly calculated.

Let $\rho_k = \lambda_k/\mu$ and $\bar{\rho}_k = \sum_{j=1}^{k} \rho_j$ for $1 \le k \le K$. Let $r_{K+1}$ denote the system capacity, i.e., $r_{K+1} = C$. Let $\pi_C(n)$ be the steady-state probability that the system (with capacity $C$) is in state $n$. Based on the state-transition-rate diagram in Fig. 1, the steady-state probabilities of the birth-death process are:

$$\pi_C(0) = G(C)^{-1} = \left( 1 + \sum_{k=1}^{K} \sum_{n=C-r_{k+1}+1}^{C-r_k} \frac{\bar{\rho}_k^{\,n-C+r_{k+1}} \prod_{i=k+1}^{K} \bar{\rho}_i^{\,r_{i+1}-r_i}}{n!} \right)^{-1} \tag{10}$$

For $k = K, K - 1, \dots, 1$ and $C - r_{k+1} + 1 \le n \le C - r_k$,

$$\pi_C(n) = \frac{1}{G(C)} \cdot \frac{\bar{\rho}_k^{\,n-C+r_{k+1}} \prod_{i=k+1}^{K} \bar{\rho}_i^{\,r_{i+1}-r_i}}{n!} \tag{11}$$
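To make the index bookkeeping in (10)-(11) concrete, the following Python sketch evaluates the two formulas literally, segment by segment. It is an illustration added here rather than code from the paper: the name `reservation_probs_direct` and the 0-based list layout (`r[k-1]` holds $r_k$, `rho[k-1]` holds $\rho_k$) are assumptions. Because it uses plain floating-point powers and factorials, it is only suitable for small $C$.

```python
from math import factorial


def reservation_probs_direct(C, r, rho):
    """Steady-state probabilities pi_C(0..C) from eqs. (10)-(11), b_k = 1.

    r:   [r_1, ..., r_K] with 0 = r_1 <= r_2 <= ... <= r_K <= C
    rho: [rho_1, ..., rho_K] with rho_k = lambda_k / mu
    Hypothetical helper for illustration; direct (non-recursive) evaluation.
    """
    K = len(r)
    r_ext = list(r) + [C]  # r_ext[k] holds r_{k+1}; r_{K+1} = C
    rho_bar = [sum(rho[:k + 1]) for k in range(K)]  # rho_bar[k-1] = rho_bar_k
    w = [0.0] * (C + 1)
    w[0] = 1.0  # unnormalized weight of the empty state
    for k in range(K, 0, -1):  # k = K, K-1, ..., 1
        # prod_{i=k+1}^{K} rho_bar_i^(r_{i+1} - r_i): fully traversed segments
        tail = 1.0
        for i in range(k + 1, K + 1):
            tail *= rho_bar[i - 1] ** (r_ext[i] - r_ext[i - 1])
        # states n in [C - r_{k+1} + 1, C - r_k] are entered at birth rate lambda(k)
        for n in range(C - r_ext[k] + 1, C - r_ext[k - 1] + 1):
            w[n] = rho_bar[k - 1] ** (n - C + r_ext[k]) * tail / factorial(n)
    G = sum(w)  # normalization constant G(C)
    return [x / G for x in w]
```

With $K = 2$, $C = 2$, $r = (0, 1)$ and $\rho_1 = \rho_2 = 1$, the birth rates are $2\mu$ in state 0 and $\mu$ in state 1, so the unnormalized weights are $1, 2, 1$ and the sketch returns $(0.25, 0.5, 0.25)$.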

However, for large $C$, direct calculation faces numerical overflow problems when calculating the normalization constant $G(C)$. We have developed the following set of fast recursive formulas (for systems with arbitrary $K$ and $C$) to solve this problem.

When a system with capacity $C$ is operated under a reservation policy with parameters $0 = r_1 \le r_2 \le \dots \le r_K \le C$, from (11) we have:

$$\pi_C(C) = \frac{\bar{\rho}_1^{\,r_2} \prod_{i=2}^{K} \bar{\rho}_i^{\,r_{i+1}-r_i}}{G(C)\, C!} \tag{12}$$

Similarly, for a system with capacity $C - 1$ and reservation parameters $0 = r_1 \le r_2 - 1 \le \dots \le r_K - 1 \le C - 1$:

$$\pi_{C-1}(C-1) = \frac{\bar{\rho}_1^{\,r_2-1} \prod_{i=2}^{K} \bar{\rho}_i^{\,r_{i+1}-r_i}}{G(C-1)\,(C-1)!} \tag{13}$$

Comparing (12) and (13), and using (10), we have:

$$\pi_C(C) = \frac{\pi_{C-1}(C-1)}{\pi_{C-1}(C-1) + C/\bar{\rho}_1} \tag{14}$$

(14) holds as long as $r_2 > 0$. When $r_2$ decreases to 0, the system accepts both class-1 and class-2 calls whenever there are available resources. Now the recursive equation becomes:

$$\pi_n(n) = \frac{\pi_{n-1}(n-1)}{\pi_{n-1}(n-1) + n/\bar{\rho}_2} \tag{15}$$

(15) holds as long as $r_3 > 0$. In general, for every $n$ in $[C - r_{k+1} + 1, C - r_k]$, $k = K, K - 1, \dots, 1$, we derive the following set of recursive formulas:

$$\pi_n(n) = \frac{\pi_{n-1}(n-1)}{\pi_{n-1}(n-1) + n/\bar{\rho}_k} \tag{16}$$

with $\pi_0(0) = 1$. The following procedure can be applied to evaluate the system performance under any reservation policy.

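Procedure PE RSV: PERFORMANCE EVALUATION OF RESERVATION POLICIES

 1  $\pi(0) \leftarrow 1$
 2  for $k \leftarrow K$ to $1$
 3      for $n \leftarrow C - r_{k+1} + 1$ to $C - r_k$
 4          $\pi(n) \leftarrow \pi(n-1) / (\pi(n-1) + n/\bar{\rho}_k)$
 5  $\log G \leftarrow \sum_{i=1}^{K} (r_{i+1} - r_i) \log(\bar{\rho}_i) - \log(\pi(C)) - \log(C!)$
 6  $\pi(0) \leftarrow \exp(-\log G)$
 7  for $k \leftarrow K$ to $1$
 8      for $n \leftarrow C - r_{k+1} + 1$ to $C - r_k$
 9          $\log \pi(n) \leftarrow -\log G + (n - C + r_{k+1}) \log(\bar{\rho}_k) + \sum_{i=k+1}^{K} (r_{i+1} - r_i) \log(\bar{\rho}_i) - \log(n!)$
10          $\pi(n) \leftarrow \exp(\log \pi(n))$
11  for $k \leftarrow 1$ to $K$
12      $B_k \leftarrow \sum_{n=C-r_k}^{C} \pi(n)$
13  $g \leftarrow \sum_{i=1}^{K} \lambda_i (1 - B_i) R_i$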

Steps 1-4 compute πC (C) based on (16). The recursive computation greatly reduces the numerical overflow problem. Steps 5-10 compute the normalization constant (based on (12)) and the steady-state probabilities (based on (10) and (11)). Note that the logarithmic transformation converts the product into a sum and further reduces the numerical overflow problem. Steps 11-13 compute the call blocking probabilities and the system revenue rate. The computational complexity of Procedure PE RSV is O(C). Note that r1 = 0 assures that the birth-death process is ergodic. Hence it is time reversible [18] and has product-form steady-state probabilities (10)-(11). Again, the insensitivity property of product-form results enables us to extend the results here to arbitrary service time distributions.
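A possible Python rendering of Procedure PE RSV is sketched below. It is an illustration under stated assumptions, not the authors' implementation: the function name `pe_rsv` and the 0-based lists are choices made here, `math.lgamma(n + 1)` is used for $\log(n!)$, and the inner sum of step 9 is accumulated incrementally in the variable `tail` so that the overall cost stays linear in $C$. The sketch assumes $\lambda_1 > 0$ so that every $\log \bar{\rho}_k$ is defined.

```python
from math import exp, lgamma, log


def pe_rsv(C, r, lam, mu, R):
    """Evaluate a reservation policy for a homogeneous system (b_k = 1).

    r: [r_1, ..., r_K] with 0 = r_1 <= ... <= r_K <= C; lam: arrival rates;
    mu: common service rate; R: lump-sum revenues.
    Returns (pi, B, g). Hypothetical helper mirroring steps 1-13 of PE RSV.
    """
    K = len(r)
    r_ext = list(r) + [C]  # r_ext[k] holds r_{k+1}; r_{K+1} = C
    rho_bar, acc = [], 0.0
    for k in range(K):
        acc += lam[k] / mu
        rho_bar.append(acc)  # rho_bar[k-1] = rho_1 + ... + rho_k

    # Steps 1-4: recursion (16); p ends as pi_C(C).
    p = 1.0
    for k in range(K, 0, -1):
        for n in range(C - r_ext[k] + 1, C - r_ext[k - 1] + 1):
            p = p / (p + n / rho_bar[k - 1])

    # Step 5: log of the normalization constant, from (12).
    log_G = (sum((r_ext[i] - r_ext[i - 1]) * log(rho_bar[i - 1])
                 for i in range(1, K + 1))
             - log(p) - lgamma(C + 1))

    # Steps 6-10: steady-state probabilities in the log domain, from (10)-(11).
    pi = [0.0] * (C + 1)
    pi[0] = exp(-log_G)
    tail = 0.0  # running value of sum_{i=k+1}^{K} (r_{i+1} - r_i) log(rho_bar_i)
    for k in range(K, 0, -1):
        for n in range(C - r_ext[k] + 1, C - r_ext[k - 1] + 1):
            pi[n] = exp(-log_G + (n - C + r_ext[k]) * log(rho_bar[k - 1])
                        + tail - lgamma(n + 1))
        tail += (r_ext[k] - r_ext[k - 1]) * log(rho_bar[k - 1])

    # Steps 11-13: class k is blocked in states C - r_k, ..., C.
    B = [sum(pi[C - r[k]:]) for k in range(K)]
    g = sum(lam[k] * (1.0 - B[k]) * R[k] for k in range(K))
    return pi, B, g
```

For example, `pe_rsv(2, [0, 1], [1.0, 1.0], 1.0, [1.0, 1.0])` corresponds to a two-class system in which class 2 is admitted only when the system is empty.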

2) Find the Optimal or Near-Optimal Reservation Policy: Given other parameters $C$, $K$, $\mu$, and $(\gamma_k, \lambda_k)_{k=1,\dots,K}$, the system revenue rate $g$ is a function of the reservation vector $\mathbf{r} \triangleq (r_1, r_2, \dots, r_K) \in \mathbf{R} \triangleq \{0, 1, \dots, C\}^K$, which can be evaluated via Procedure PE RSV. Since $|\mathbf{R}| = (C + 1)^K$, there exist $O(C^K)$ different reservation policies. Finding the optimal reservation policy using brute-force search becomes intractable for large $C$ and especially large $K$. With minor modifications, the iterative coordinate search algorithm proposed for threshold policies can be applied to find an ordered coordinate optimal reservation policy among all reservation policies.

DEFINITION 4.1. A reservation vector $\mathbf{r} = (r_1, r_2, \dots, r_K) \in \mathbf{R}$ is an ordered coordinate optimal point if it satisfies the following two properties:

$$r_0 \triangleq 0 = r_1 \le r_2 \le \dots \le r_K \le r_{K+1} = C \tag{17}$$

$$g(r_1, \dots, r_k, \dots, r_K) \ge g(r_1, \dots, \xi, \dots, r_K) \tag{18}$$

for all $\xi$ in $[r_{k-1}, r_{k+1}]$, $k = 1, \dots, K$. A reservation policy with reservation vector $\mathbf{r}$ is an ordered coordinate optimal reservation policy if $\mathbf{r}$ is an ordered coordinate optimal point.

THEOREM 4.1. Assume that the call arrival rates are finite and the revenues are bounded. Then for homogeneous cases there exists at least one ordered coordinate optimal point in $\mathbf{R}$.

PROOF: The proof is similar to that for Theorem 3.2. $\{g(\mathbf{r}) : \mathbf{r} \in \mathbf{R}\}$ is a finite bounded set, hence there exists an $\mathbf{r}^* \in \mathbf{R}$ such that $g(\mathbf{r}^*) = \max\{g(\mathbf{r}) : \mathbf{r} \in \mathbf{R}\}$, i.e., $\mathbf{r}^*$ is the reservation vector of the optimal reservation policy, which is also the optimal CAC policy for homogeneous cases. So $\mathbf{r}^*$ satisfies both (17) and (18), and is an ordered coordinate optimal point. □

Algorithm ICSA RSV: ITERATIVE COORDINATE SEARCH ALGORITHM FOR RESERVATION POLICIES

1. Choose an initial reservation vector $\mathbf{r}^0$ that satisfies (17). Set $m = 1$.
2. Set $r_1^m = 0$, $r_{K+1}^m = C$. For $k = K, K - 1, \dots, 2$, search in $[r_{k-1}^{m-1}, r_{k+1}^m]$ to find $r_k^m$ such that:

$$r_k^m = \arg\max_{\xi \in [r_{k-1}^{m-1},\, r_{k+1}^m]} g(r_1^{m-1}, \dots, r_{k-1}^{m-1}, \xi, r_{k+1}^m, \dots, r_K^m). \tag{19}$$

(In a situation where more than one $\xi \in [r_{k-1}^{m-1}, r_{k+1}^m]$ achieves the same maximum $g$, the smallest $\xi$ is chosen to break the tie.)
3. If $\mathbf{r}^m \triangleq (r_1^m, r_2^m, \dots, r_K^m) = \mathbf{r}^{m-1}$, stop. Otherwise, increase $m$ by 1 and go to step 2.
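The three steps of ICSA RSV can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' code: the names `revenue_rate` and `icsa_rsv`, the default start $\mathbf{r}^0 = \mathbf{0}$, and the `max_iter` safeguard are all introduced here. To keep the block self-contained, $g(\mathbf{r})$ is evaluated with a plain birth-death weight recursion instead of the log-domain procedure, so the sketch is only meant for moderate $C$.

```python
def revenue_rate(C, r, lam, mu, R):
    """Revenue rate g(r) of a homogeneous system (b_k = 1) under reservation r."""
    K = len(r)
    w = [1.0]  # unnormalized birth-death weights, w[n] proportional to pi_C(n)
    for n in range(1, C + 1):
        # class k is admitted in state n - 1 iff n <= C - r_k
        birth = sum(lam[k] for k in range(K) if n <= C - r[k])
        w.append(w[-1] * birth / (n * mu))
    G = sum(w)
    B = [sum(w[C - r[k]:]) / G for k in range(K)]  # blocked in states >= C - r_k
    return sum(lam[k] * (1.0 - B[k]) * R[k] for k in range(K))


def icsa_rsv(C, lam, mu, R, r0=None, max_iter=1000):
    """Coordinate search over ordered reservation vectors (steps 1-3).

    Classes are assumed sorted by decreasing revenue coefficient.
    Returns (r, m): the final vector and the number of iterations used.
    """
    K = len(lam)
    r = list(r0) if r0 is not None else [0] * K  # step 1; must satisfy (17)
    for m in range(1, max_iter + 1):
        prev = list(r)
        r[0] = 0  # r_1^m = 0
        for k in range(K - 1, 0, -1):  # classes K, K-1, ..., 2 (0-based index k)
            lo = r[k - 1]                      # r_{k-1}^{m-1}, not yet updated
            hi = r[k + 1] if k + 1 < K else C  # r_{k+1}^m, with r_{K+1} = C
            best_xi, best_g = lo, None
            for xi in range(lo, hi + 1):
                g = revenue_rate(C, r[:k] + [xi] + r[k + 1:], lam, mu, R)
                if best_g is None or g > best_g:  # strict '>' keeps smallest xi
                    best_xi, best_g = xi, g
            r[k] = best_xi
        if r == prev:  # step 3: stopping criterion
            return r, m
    return r, max_iter
```

For $K = 2$ the search interval for $r_2$ is always $[0, C]$, so the first iteration already performs an exhaustive search over $r_2$ and the second one only confirms the stopping criterion.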

THEOREM 4.2. Assume that the call arrival rates are finite and the revenues are bounded. Then ICSA RSV converges in a finite number of iterations and the returned reservation policy is an ordered coordinate optimal reservation policy.

PROOF: The proof is similar to that for Theorem 3.3. □

Let $M_r$ be the number of iterations required by ICSA RSV. During each iteration, ICSA RSV searches at most $(C + 1)(K - 1)$ different reservation policies. Let $T_r$ be the computational complexity of evaluating the performance of a reservation policy. Hence the computational complexity of ICSA RSV is $O(M_r C K T_r)$.

For $K = 2$, it is clear that ICSA RSV always returns the optimal reservation policy (in two iterations). In general, since ICSA RSV is a local search algorithm [1], two important issues need to be investigated: 1) the convergence rate of the algorithm; 2) the quality of the returned reservation policy. One would also expect that the initial reservation vector $\mathbf{r}^0$ we choose may affect the performance of ICSA RSV. These will be investigated in Section V.

B. Heterogeneous Cases: A Uniformization Technique

For heterogeneous cases in which calls of different classes have heterogeneous resource requirements and mean service times, a multidimensional birth-death process is required to model the system state under a reservation policy. In order to evaluate the system performance, we need to solve the global balance equations (1)-(2), which requires excessive computational effort.

We propose a uniformization technique that converts a heterogeneous case into a homogeneous case as follows. For a class-$k$ call that requires $b_k$ units of resources and has mean service time $1/\mu_k$, we 'divide' the call into $b_k/\mu_k$ individual and independent class-$k'$ calls, each of which requires $b_k' = 1$ unit of resource and has mean service time $1/\mu_k' = 1$. In the converted model, the arrival rate of class-$k'$ calls is $\lambda_k' = \lambda_k b_k/\mu_k$ and the revenue coefficient of class $k'$ is still $\gamma_k$. In the converted model, the new $K$ classes of calls have the same resource requirement $b' = 1$ and mean service time $1/\mu' = 1$.
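The parameter mapping of the uniformization technique is small enough to be written out directly. The helper below is a hypothetical sketch (the name `uniformize` and its return layout are assumptions made here). The lump-sum revenue of the converted classes is taken as $R_k' = \gamma_k b'/\mu' = \gamma_k$, which follows from the relation $R_k = \gamma_k b_k/\mu_k$ of Section II with $b' = 1$ and $\mu' = 1$.

```python
def uniformize(b, lam, mu, gamma):
    """Convert a heterogeneous system into the homogeneous surrogate model.

    b, lam, mu, gamma: per-class resource requirement, arrival rate,
    service rate and revenue coefficient of the original model.
    Returns (lam_conv, mu_conv, R_conv) for the converted model, in which
    every class has b' = 1 and mu' = 1. Hypothetical helper for illustration.
    """
    # lambda'_k = lambda_k * b_k / mu_k keeps the offered resource load unchanged.
    lam_conv = [lam_k * b_k / mu_k for lam_k, b_k, mu_k in zip(lam, b, mu)]
    mu_conv = 1.0
    # R'_k = gamma_k * b' / mu' = gamma_k (derived from R_k = gamma_k b_k / mu_k).
    R_conv = list(gamma)
    return lam_conv, mu_conv, R_conv
```

Note that the offered load in resource units, $\lambda_k b_k/\mu_k$, is the same in both models, although the two models are not stochastically equivalent.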
Then we can apply ICSA RSV to find a reservation policy for the converted model, and use it for the original model. Since the converted model and the original model are not stochastically equivalent, the returned reservation policy is not necessarily an ordered coordinate optimal reservation policy for the original model. We will evaluate its performance in Section V.

V. NUMERICAL EXPERIMENTS

In this section we investigate the following questions through extensive numerical experiments:
1. How good is ICSA THD in terms of its convergence rate and the quality of the returned threshold policy?
2. How good is ICSA RSV in terms of its convergence rate and the quality of the returned reservation policy?
3. Which one is better, the threshold policy returned by ICSA THD or the reservation policy returned by ICSA RSV?

The results are shown in Tables I-IV. As a comparison, for all cases we evaluate the performance of the CS policy.

We scale the revenue rate of the CS policy to be 100.0. The revenue rates of other policies are scaled accordingly, so that the percentage improvements of other policies over the CS policy are clearly shown.

Let $\eta_k = (\lambda_k b_k/\mu_k)/C$ be the (normalized) traffic demand of class-$k$ calls, and $\eta = \sum_{k=1}^{K} \eta_k$ be the (normalized) total traffic demand of all calls. For any given $\eta$, we consider the following four traffic distributions among different classes:
(1) UNIF: $\eta_k = \bar{\eta} \triangleq \eta/K$ for all $k$.
(2) HH: $\eta_k = \frac{3}{2}\bar{\eta}$ for $1 \le k \le K/2$, $\eta_k = \frac{1}{2}\bar{\eta}$ for $K/2 + 1 \le k \le K$, i.e., higher-profit classes have higher traffic demands.
(3) HL: $\eta_k = \frac{1}{2}\bar{\eta}$ for $1 \le k \le K/2$, $\eta_k = \frac{3}{2}\bar{\eta}$ for $K/2 + 1 \le k \le K$, i.e., higher-profit classes have lower traffic demands.
(4) RAND: $\eta_k$ is a random number uniformly chosen in $[\frac{1}{2}\bar{\eta}, \frac{3}{2}\bar{\eta}]$ for all $k$.

For each set of K and C, we have done experiments over a wide range of $\eta$. In general, higher $\eta$ and/or larger differences between the revenue coefficients result in greater improvement of the returned structured policies over the CS policy. Here we only list the results for $\eta = 1.0$ (critically loaded cases) and $\eta = 1.6$ (overloaded cases). The revenue coefficients are within the same order of magnitude.

A. Performance of ICSA THD

The initial threshold vector $t^0$ of ICSA THD is chosen as follows:
(1) Initial Threshold Vector 1 (ITV1): $t^0_k = 0$ for all $k$.
(2) ITV2: $t^0_k = N_k$ for all $k$; this is the CS policy.
(3) ITV3: $t^0_k$ is a random number uniformly chosen among numbers between 0 and $N_k$, for all $k$.
(4) ITV4: $t^0$ is chosen according to the heuristics proposed in [6].

1) Convergence Rate: We found that for all cases ICSA THD converges very fast, with the number of iterations $M_t$ on the order of K. For small K (K = 2 or 4), we found that $M_t$ and the returned threshold policy are insensitive to the initial threshold vectors (ITV1 to ITV4). Therefore, in Table I and Table II we show only one set of results $(M_t, t, g)$ for each case, obtained by ICSA THD with ITV1.
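The four traffic distributions defined at the start of this section can be generated as in the following sketch. The function names `traffic_demands` and `arrival_rates` are hypothetical, K is assumed to be even for HH and HL, and the RAND demands are left un-normalized because the text does not say whether they are rescaled to sum to $\eta$.

```python
import random
from typing import List, Optional, Sequence


def traffic_demands(eta: float, K: int, dist: str,
                    rng: Optional[random.Random] = None) -> List[float]:
    """Return [eta_1, ..., eta_K] for UNIF, HH, HL or RAND.

    Classes are assumed to be indexed from highest profit (class 1)
    to lowest profit (class K), as in the tables.
    """
    eta_bar = eta / K
    half = K // 2  # assumption: K is even for HH and HL
    if dist == "UNIF":
        return [eta_bar] * K
    if dist == "HH":
        return [1.5 * eta_bar] * half + [0.5 * eta_bar] * (K - half)
    if dist == "HL":
        return [0.5 * eta_bar] * half + [1.5 * eta_bar] * (K - half)
    if dist == "RAND":
        rng = rng or random.Random()
        # Not renormalized: the sum may differ from eta.
        return [rng.uniform(0.5 * eta_bar, 1.5 * eta_bar) for _ in range(K)]
    raise ValueError(f"unknown traffic distribution: {dist}")


def arrival_rates(etas: Sequence[float], b: Sequence[float],
                  mu: Sequence[float], C: float) -> List[float]:
    """Invert eta_k = (lambda_k * b_k / mu_k) / C to recover lambda_k."""
    return [e * C * m / bk for e, bk, m in zip(etas, b, mu)]


if __name__ == "__main__":
    etas = traffic_demands(1.6, 2, "HL")
    print(etas, arrival_rates(etas, [1, 1], [1, 1], 100))
```

For K = 2 and $\eta = 1.6$, the HL distribution gives demands of approximately (0.4, 1.2), which is one of the demand pairs used in Table IV.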
For large K (K = 16), we found that the returned threshold policy may differ with different initial threshold vectors, but all returned policies have the same revenue rate, as shown in Table III.

In our experiments the computational complexity of evaluating the performance of a threshold policy is $T_t = O(C^2 K \log K)$, due to the convolution algorithm with binary tree implementation [19]. In practice $M_t$ is on the order of K. Hence the practical computational complexity of ICSA THD is $O(M_t C K T_t) = O(C^3 K^3 \log K)$, and it can be applied to systems with large C and K. For example, for K = 16 and C = 100, ICSA THD converges in 1-2 minutes on average, running on a PC with a Pentium III 933 MHz CPU.

2) Quality of the Returned Threshold Policy: For small K and C, e.g., K = 2, C = 100 (Table I) and K = 4, C = 20 (Table II), we are able to find the optimal threshold policy using brute-force search. It is rather striking that for all of the cases we studied the coordinate optimal threshold policy returned by ICSA THD is exactly the optimal threshold policy. For these cases we can also find the optimal CAC policy using the value iteration algorithm. We found that the revenue rate of the optimal threshold policy is very close to that of the optimal CAC policy.

For large K and C, e.g., K = 16 and C = 100 (Table III), it is infeasible to find the optimal threshold policy using brute-force search. We compare the performance of the returned threshold policy with the restricted complete sharing (RCS) policy (a specific threshold policy with heuristically chosen parameters) proposed in [6]. For all cases the threshold policy returned by ICSA THD outperforms the RCS policy. Indeed, if we choose the initial threshold vector using a good heuristic threshold policy, then ICSA THD will return a threshold policy that performs at least as well as the heuristic threshold policy.

B. Performance of ICSA RSV

The initial reservation vector $r^0$ of ICSA RSV is chosen as follows:
(1) Initial Reservation Vector 1 (IRV1): $r^0_k = 0$ for all $k$; this is the CS policy.
(2) IRV2: $r^0_1 = 0$, $r^0_k = C$ for $k > 1$.
(3) IRV3: $r^0_1 = 0$, $r^0_k$ is a random number uniformly chosen among numbers between 0 and $r^0_{k+1}$, for $k = K, \ldots, 2$ ($r^0_{K+1} = C$).

1) Convergence Rate: As with ICSA THD, we found that for all cases ICSA RSV converges very fast, with the number of iterations $M_r$ on the order of K. For small K (K = 2 or 4), we found that $M_r$ and the returned reservation policy are insensitive to the initial reservation vectors (IRV1 to IRV3), so in Table I and Table II we show only one set of results $(M_r, r, g)$ for each case, obtained by ICSA RSV with IRV1. For large K (K = 16), we found that ICSA RSV converges faster when starting from IRV1 or IRV3 than when starting from IRV2. The returned reservation policy may differ with different initial reservation vectors, but all returned policies have the same revenue rate, as shown in Table III.
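The three initial reservation vectors defined above for ICSA RSV can be constructed as in the sketch below. The function name `initial_reservation_vector` is hypothetical, and the sketch assumes that reservation parameters are integers and that "between 0 and $r^0_{k+1}$" includes both endpoints; the text does not state either detail.

```python
import random
from typing import List, Optional


def initial_reservation_vector(kind: str, K: int, C: int,
                               rng: Optional[random.Random] = None) -> List[int]:
    """Return [r_1^0, ..., r_K^0]; list index 0 holds class 1."""
    if kind == "IRV1":
        # No reservation at all: this is the CS policy.
        return [0] * K
    if kind == "IRV2":
        return [0] + [C] * (K - 1)
    if kind == "IRV3":
        rng = rng or random.Random()
        r = [0] * K
        upper = C  # plays the role of r_{K+1}^0 = C
        # Draw r_K^0 first, then r_{K-1}^0, ..., down to r_2^0.
        for idx in range(K - 1, 0, -1):
            r[idx] = rng.randint(0, upper)  # assumption: inclusive integer draw
            upper = r[idx]
        return r
    raise ValueError(f"unknown initial reservation vector: {kind}")


if __name__ == "__main__":
    for kind in ("IRV1", "IRV2", "IRV3"):
        print(kind, initial_reservation_vector(kind, K=4, C=20,
                                               rng=random.Random(1)))
```

Drawing the IRV3 entries from class K downward guarantees a nondecreasing vector with $r^0_1 = 0$ and $r^0_K \le C$, so the random starting point is itself an ordered reservation vector.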
In our experiments the computational complexity of evaluating the performance of a reservation policy is $T_r = O(C)$, due to Procedure PE RSV. In practice $M_r$ is on the order of K. Hence the practical computational complexity of ICSA RSV is $O(M_r C K T_r) = O(C^2 K^2)$, and it can be applied to systems with large C and K. For example, for K = 16 and C = 100, ICSA RSV converges in 1-2 seconds on average, running on a PC with a Pentium III 933 MHz CPU.

2) Quality of the Returned Reservation Policy: For small K and C, e.g., K = 2, C = 100 (Table I) and K = 4, C = 20 (Table II), we can find the optimal CAC policy using the value iteration algorithm. For homogeneous cases the optimal CAC policy coincides with the optimal reservation policy. It is rather striking that for all the homogeneous cases we studied the reservation policy returned by ICSA RSV is exactly the optimal CAC policy.

For large K and C, e.g., K = 16 and C = 100 (Table III), it is infeasible to find the optimal CAC policy using the value iteration algorithm. We compare the performance of the returned reservation policy with two heuristic reservation policies. We first determine the number $\tilde{K}$ such that $\sum_{k=1}^{\tilde{K}} \lambda_k b_k/\mu_k \le C < \sum_{k=1}^{\tilde{K}+1} \lambda_k b_k/\mu_k$. Heuristic Reservation Policy 1 (HRP1) sets its reservation parameters as follows:
$r_k = 0$ for $1 \le k \le \tilde{K}$; $r_k = 5\%\,C$ for $\tilde{K} + 1 \le k \le K$.
Heuristic Reservation Policy 2 (HRP2) sets its reservation parameters as follows:
$r_k = 0$ for $1 \le k \le \tilde{K}$; $r_k = 10\%\,C$ for $\tilde{K} + 1 \le k \le K$.
We found that for all cases the reservation policy returned by ICSA RSV outperforms the two heuristic reservation policies. Indeed, if we choose the initial reservation vector using a good heuristic reservation policy, then ICSA RSV will return a reservation policy that performs at least as well as the heuristic reservation policy.

For a heterogeneous case, we first convert it into a homogeneous case using the uniformization technique. We then apply ICSA RSV to find a reservation policy. For small K and C, the revenue rate of the original case under the returned reservation policy can be evaluated by solving the global balance equations (1)-(2) using MATLAB. As shown in Tables I and II, when high-profit classes require more resources (larger $b$) than low-profit classes, the uniformization technique causes some performance loss of the returned reservation policy when compared with the optimal CAC policy. On the other hand, when high-profit classes require fewer resources than low-profit classes, the revenue rate of the returned reservation policy is very close to that of the optimal CAC policy, which implies that in this case the uniformization technique provides a very good approximation. For large K and C, solving (1)-(2) requires excessive computational effort. Hence in Table III we list only the results of threshold policies for heterogeneous cases.

C. Threshold vs. Reservation

We first compare the threshold policy returned by ICSA THD and the reservation policy returned by ICSA RSV in terms of their revenue rate. As shown in the tables, for all homogeneous cases the returned reservation policy always yields a higher revenue rate. For heterogeneous cases, however, there is no strict winner: when high-profit classes require more resources than low-profit classes, the returned threshold policy yields a higher revenue rate; when high-profit classes require fewer resources, the returned reservation policy yields a slightly higher or lower revenue rate.

Another important performance metric of any CAC policy is robustness, i.e., how well a policy performs when the real traffic demands vary. We compare threshold and reservation policies in terms of their robustness as follows. Given a priori knowledge of the traffic demands $(\eta_1^0, \eta_2^0)$, we determine the a priori threshold policy with parameters $(t_1^0, t_2^0)$ returned by ICSA THD and the a priori reservation policy with parameters $(r_1^0, r_2^0)$ returned by ICSA RSV. Next we let the real traffic demands vary over some range. We then calculate the system revenue rate of the a priori threshold policy and of the a priori reservation policy respectively. As a comparison, we also calculate the revenue rate $g^*$ of the true optimal CAC policy. Again, these values are scaled according to the revenue rate of the CS policy.

As shown in Table IV, we found that for most cases the a priori reservation policy performs much better than the a priori threshold policy when real traffic demands vary. In addition, the revenue rate of the a priori reservation policy is very close to that of the true optimal CAC policy. In contrast, the performance of the a priori threshold policy is very sensitive to the real traffic demands, and there often exists a large gap between its revenue rate and that of the true optimal CAC policy. We did similar tests for K = 4 and K = 16. Again we found that reservation policies are much more robust than threshold policies when real traffic demands vary.

VI. CONCLUSIONS AND EXTENSIONS

In this paper we studied CAC for multiservice resource-sharing systems. For most practical systems computing the optimal CAC policy is prohibitively difficult because of the ‘curse of dimensionality’. Therefore we were motivated to study two families of structured CAC policies: threshold and reservation policies. These policies are easy to implement and have good performance. However, it is difficult to find the optimal structured policies among all structured policies. In practice the parameters of these structured CAC policies are chosen based on simple heuristics.

The main contributions of this paper are the fast recursive formulas developed to evaluate the system performance under any reservation policy, and the iterative coordinate search algorithms proposed to determine the parameters of the structured policies, with the objective of maximizing the system revenue rate. We prove the convergence of the search algorithms. Through extensive numerical experiments we show that the search algorithms converge quickly and work for systems with large capacity and many call classes.
In addition, the returned structured policies have optimal or near-optimal performance, and outperform those structured policies with parameters chosen based on simple heuristics.

For real systems the call arrivals are not necessarily stationary and the traffic demands may not be known beforehand. In practice, adaptive schemes (as in [20]) can be designed to estimate the call arrival rates (traffic demands), and the system can dynamically adjust the parameters of the structured CAC policy using the search algorithms developed in this paper. The high efficiency of the search algorithms makes such online optimization possible, even for systems with large capacity and many call classes.

There are many extensions of this single-system with single-resource (SSSR) model. First, we can extend a single system to a network of systems. For example, in telecommunication networks, instead of considering CAC for a single access link, we can consider CAC and routing for the whole network. Second, we can extend a single-resource model to a multi-resource model. For example, in mobile wireless networks, in addition to channels, power is also an important resource. A common approach to studying these extended models is to decompose a complex model into a collection of cascade or parallel SSSR models. Under the assumption that these SSSR models are statistically independent, each with a reduced offered load, the search algorithms developed in this paper can be applied to each one of the SSSR models.

ACKNOWLEDGMENT

The authors would like to thank the anonymous reviewers for their helpful comments.

REFERENCES

[1] E. Aarts and J. K. Lenstra (eds.), Local Search in Combinatorial Optimization, John Wiley & Sons, 1997.
[2] J. M. Akinpelu, “The Overload Performance of Engineering Networks with Nonhierarchical and Hierarchical Routing,” AT&T Technical Journal, Vol. 63, pp. 1261-1281, 1984.
[3] E. Altman, T. Jimenez, G. Koole, “On Optimal Call Admission Control in a Resource-Sharing System,” IEEE Transactions on Communications, Vol. 49, No. 9, pp. 1659-1668, Sep. 2001.
[4] D. Y. Burman, J. P. Lehoczky, Y. Lim, “Insensitivity of Blocking Probabilities in a Circuit-Switching Network,” Journal of Applied Probability, Vol. 21, pp. 850-859, 1984.
[5] G. J. Foschini, B. Gopinath, J. F. Hayes, “Optimum Allocation of Servers to Two Types of Competing Customers,” IEEE Transactions on Communications, Vol. 29, No. 7, pp. 1051-1055, July 1981.
[6] A. Gavious, Z. Rosberg, “A Restricted Complete Sharing Policy for a Stochastic Knapsack Problem in B-ISDN,” IEEE Transactions on Communications, Vol. 42, No. 7, pp. 2375-2379, July 1994.
[7] D. Hong, S. S. Rappaport, “Traffic Model and Performance Analysis for Cellular Mobile Radio Telephone Systems with Prioritized and Nonprioritized Handoff Procedures,” IEEE Transactions on Vehicular Technology, Vol. 35, pp. 77-99, Aug. 1986.
[8] J. S. Kaufman, “Blocking in a Shared Resource Environment,” IEEE Transactions on Communications, Vol. 29, No. 10, pp. 1474-1481, Oct. 1981.
[9] F. P. Kelly, “Routing and Capacity Allocation in Networks with Trunk Reservation,” Mathematics of Operations Research, Vol. 15, No. 4, pp. 771-792, 1990.
[10] F. P. Kelly, Reversibility and Stochastic Networks, Wiley, Chichester, 1979.
[11] B. L. Miller, “A Queueing Reward System with Several Customer Classes,” Management Science, Vol. 16, pp. 234-245, 1969.
[12] T. Oda, Y. Watanabe, “Optimal Trunk Reservation for a Group with Multislot Traffic Streams,” IEEE Transactions on Communications, Vol. 38, No. 7, pp. 1078-1084, July 1990.
[13] M. L. Puterman, Markov Decision Processes: Discrete Stochastic Dynamic Programming, Wiley, New York, 1994.
[14] R. Ramjee, P. Nagarajan and D. Towsley, “On Optimal Call Admission Control in Cellular Networks,” Proc. IEEE INFOCOM’96, pp. 43-50, Mar. 1996.
[15] K. W. Ross and D. H. K. Tsang, “The Stochastic Knapsack Problem,” IEEE Transactions on Communications, Vol. 37, No. 7, pp. 740-747, July 1989.
[16] K. W. Ross and D. H. K. Tsang, “Optimal Circuit Access Policies in an ISDN Environment: A Markov Decision Approach,” IEEE Transactions on Communications, Vol. 37, No. 9, pp. 934-939, Sep. 1989.
[17] K. W. Ross, Multiservice Loss Models for Broadband Telecommunication Networks, Springer, 1995.
[18] S. M. Ross, Stochastic Processes, Second Edition, Wiley, 1996.
[19] D. H. K. Tsang and K. W. Ross, “Algorithms to Determine Exact Blocking Probabilities for Multirate Tree Networks,” IEEE Transactions on Communications, Vol. 38, No. 8, pp. 1266-1271, Aug. 1990.
[20] O. T. W. Yu and V. C. M. Leung, “Adaptive Resource Allocation for Prioritized Call Admission over an ATM-Based Wireless PCN,” IEEE Journal on Selected Areas in Communications, Vol. 15, No. 7, pp. 1208-1225, Sep. 1997.

TABLE I
NUMERICAL EXPERIMENTS FOR K = 2, C = 100

Homogeneous Cases: $(\gamma_1, \gamma_2) = (5, 1)$, $(b_1, b_2) = (1, 1)$, $(\mu_1, \mu_2) = (1, 1)$
Column groups: Parameters; CS; Threshold Policy (ICSA THD, Brute-Force Search); Reservation Policy (ICSA RSV); Optimal CAC Policy (Value Iteration)

η | Traffic Distribution | CS: g | ICSA THD: $M_t$ | ICSA THD: $(t_1 b_1, t_2 b_2)$ | ICSA THD: g | Brute-Force Search: $(t_1 b_1, t_2 b_2)$ | Brute-Force Search: g | ICSA RSV: $M_r$ | ICSA RSV: $(r_1, r_2)$ | ICSA RSV: g | Value Iteration: $(r_1, r_2)$ | Value Iteration: $g^*$
1.0 | UNIF | 277.3→100.0 | 2 | (100,41) | 102.5 | (100,41) | 102.5 | 2 | (0,4) | 104.3 | (0,4) | 104.3
1.0 | HH | 369.7→100.0 | 2 | (100,12) | 103.1 | (100,12) | 103.1 | 2 | (0,9) | 104.3 | (0,9) | 104.3
1.0 | HL | 184.9→100.0 | 2 | (100,71) | 101.4 | (100,71) | 101.4 | 2 | (0,2) | 103.2 | (0,2) | 103.2
1.0 | RAND | 260.1→100.0 | 2 | (100,45) | 104.1 | (100,45) | 104.1 | 2 | (0,4) | 106.3 | (0,4) | 106.3
1.6 | UNIF | 295.4→100.0 | 2 | (100,6) | 135.6 | (100,6) | 135.6 | 2 | (0,13) | 137.2 | (0,13) | 137.2
1.6 | HH | 393.8→100.0 | 2 | (100,0) | 122.5 | (100,0) | 122.5 | 2 | (0,75) | 122.5 | (0,75) | 122.5
1.6 | HL | 196.9→100.0 | 2 | (100,51) | 125.0 | (100,51) | 125.0 | 2 | (0,4) | 128.9 | (0,4) | 128.9
1.6 | RAND | 287.9→100.0 | 2 | (100,5) | 140.4 | (100,5) | 140.4 | 2 | (0,14) | 142.0 | (0,14) | 142.0

Heterogeneous Cases: $(\gamma_1, \gamma_2) = (5, 1)$, $\eta = 1.6$, Traffic Distribution: RAND
Column groups: Parameters; CS; Threshold Policy (ICSA THD, Brute-Force Search); Reservation Policy (Uniformization/ICSA RSV); Optimal CAC Policy (Value Iteration)

$(b_1, b_2)$ | $(\mu_1, \mu_2)$ | CS: g | ICSA THD: $M_t$ | ICSA THD: $(t_1 b_1, t_2 b_2)^1$ | ICSA THD: g | Brute-Force Search: $(t_1 b_1, t_2 b_2)$ | Brute-Force Search: g | Uniformization/ICSA RSV: $M_r$ | Uniformization/ICSA RSV: $(r_1, r_2)$ | Uniformization/ICSA RSV: g | Value Iteration: $g^*$
(5,1) | (5,1) | 175.6→100.0 | 2 | (100,0) | 225.8 | (100,0) | 225.8 | 2 | (0,19) | 204.1 | 225.8
(5,1) | (1,1) | 317.6→100.0 | 2 | (100,0) | 133.4 | (100,0) | 133.4 | 2 | (0,41) | 132.6 | 133.4
(5,1) | (1,5) | 311.8→100.0 | 2 | (100,0) | 138.2 | (100,0) | 138.2 | 2 | (0,54) | 138.2 | 138.2
(1,1) | (5,1) | 218.7→100.0 | 2 | (100,48) | 117.5 | (100,48) | 117.5 | 2 | (0,4) | 117.5 | 119.0
(1,1) | (1,5) | 370.4→100.0 | 2 | (100,0) | 128.6 | (100,0) | 128.6 | 2 | (0,58) | 128.6 | 128.7
(1,5) | (5,1) | 319.7→100.0 | 2 | (100,20) | 111.1 | (100,20) | 111.1 | 2 | (0,9) | 110.6 | 111.5
(1,5) | (1,1) | 443.9→100.0 | 2 | (100,0) | 108.3 | (100,0) | 108.3 | 2 | (0,96) | 108.3 | 108.3
(1,5) | (1,5) | 285.1→100.0 | 2 | (100,35) | 108.5 | (100,35) | 108.5 | 2 | (0,7) | 110.9 | 111.4

1. We show $t_k b_k$ for each class in order to compare the threshold effects for classes with different resource requirements.

TABLE II
NUMERICAL EXPERIMENTS FOR K = 4, C = 20

Homogeneous Cases: $(\gamma_1, \gamma_2, \gamma_3, \gamma_4) = (8, 4, 2, 1)$, $(b_1, b_2, b_3, b_4) = (1, 1, 1, 1)$, $(\mu_1, \mu_2, \mu_3, \mu_4) = (1, 1, 1, 1)$
Column groups: Parameters; CS; Threshold Policy (ICSA THD, Brute-Force Search); Reservation Policy (ICSA RSV); Optimal CAC Policy (Value Iteration)

η | Traffic Distribution | CS: g | ICSA THD: $M_t$ | ICSA THD: $(t_1 b_1, \ldots, t_4 b_4)$ | ICSA THD: g | Brute-Force Search: $(t_1 b_1, \ldots, t_4 b_4)$ | Brute-Force Search: g | ICSA RSV: $M_r$ | ICSA RSV: $(r_1, \ldots, r_4)$ | ICSA RSV: g | Value Iteration: $(r_1, \ldots, r_4)$ | Value Iteration: $g^*$
1.0 | UNIF | 63.1→100.0 | 2 | (20,20,10,0) | 105.9 | (20,20,10,0) | 105.9 | 3 | (0,0,2,5) | 107.5 | (0,0,2,5) | 107.5
1.0 | HH | 82.0→100.0 | 3 | (20,16,2,0) | 105.1 | (20,16,2,0) | 105.1 | 3 | (0,1,4,9) | 106.4 | (0,1,4,9) | 106.4
1.0 | HL | 44.2→100.0 | 3 | (20,20,13,4) | 103.4 | (20,20,13,4) | 103.4 | 2 | (0,0,1,3) | 106.2 | (0,0,1,3) | 106.2
1.0 | RAND | 65.6→100.0 | 2 | (20,20,9,0) | 107.9 | (20,20,9,0) | 107.9 | 2 | (0,0,2,6) | 109.5 | (0,0,2,6) | 109.5
1.6 | UNIF | 70.3→100.0 | 2 | (20,16,0,0) | 127.7 | (20,16,0,0) | 127.7 | 3 | (0,1,5,16) | 130.1 | (0,1,5,16) | 130.1
1.6 | HH | 91.4→100.0 | 2 | (20,8,0,0) | 118.8 | (20,8,0,0) | 118.8 | 3 | (0,2,11,20) | 122.3 | (0,2,11,20) | 122.3
1.6 | HL | 49.2→100.0 | 2 | (20,20,11,0) | 124.5 | (20,20,11,0) | 124.5 | 3 | (0,0,2,8) | 127.9 | (0,0,2,8) | 127.9
1.6 | RAND | 70.3→100.0 | 3 | (20,14,2,0) | 126.8 | (20,14,2,0) | 126.8 | 3 | (0,1,4,11) | 129.2 | (0,1,4,11) | 129.2

Heterogeneous Cases: $(\gamma_1, \gamma_2, \gamma_3, \gamma_4) = (8, 4, 2, 1)$, $\eta = 1.6$, Traffic Distribution: RAND
Column groups: Parameters; CS; Threshold Policy (ICSA THD, Brute-Force Search); Reservation Policy (Uniformization/ICSA RSV); Optimal CAC Policy (Value Iteration)

$b$ | $\mu$ | CS: g | ICSA THD: $M_t$ | ICSA THD: $(t_1 b_1, \ldots, t_4 b_4)$ | ICSA THD: g | Brute-Force Search: $(t_1 b_1, \ldots, t_4 b_4)$ | Brute-Force Search: g | Uniformization/ICSA RSV: $M_r$ | Uniformization/ICSA RSV: $(r_1, \ldots, r_4)$ | Uniformization/ICSA RSV: g | Value Iteration: $g^*$
(5,4,2,1) | (5,4,2,1) | 47.4→100.0 | 2 | (20,8,0,0) | 160.2 | (20,8,0,0) | 160.2 | 3 | (0,1,4,11) | 152.9 | 161.5
(5,4,2,1) | (1,1,1,1) | 51.0→100.0 | 2 | (20,8,0,0) | 160.7 | (20,8,0,0) | 160.7 | 3 | (0,1,6,17) | 152.5 | 163.9
(5,4,2,1) | (1,2,4,5) | 46.1→100.0 | 2 | (20,4,0,0) | 181.8 | (20,4,0,0) | 181.8 | 3 | (0,2,11,20) | 179.3 | 192.5
(1,2,4,5) | (5,4,2,1) | 53.3→100.0 | 2 | (20,20,12,0) | 105.4 | (20,20,12,0) | 105.4 | 3 | (0,0,1,5) | 104.6 | 105.5
(1,2,4,5) | (1,1,1,1) | 61.5→100.0 | 2 | (20,20,8,0) | 108.1 | (20,20,8,0) | 108.1 | 3 | (0,0,3,9) | 109.3 | 109.4
(1,2,4,5) | (1,2,4,5) | 70.7→100.0 | 2 | (20,20,0,0) | 107.1 | (20,20,0,0) | 107.1 | 3 | (0,1,5,17) | 108.3 | 108.6

TABLE III
NUMERICAL EXPERIMENTS FOR K = 16, C = 100

Homogeneous Cases: $\gamma = (48, 44, 40, 36, 32, 28, 24, 20, 16, 8, 4, 2, 1, 0.5, 0.25, 0.125)$, $b_k = 1$, $\mu_k = 1$
Column groups: Parameters; CS; Threshold Policy (ICSA THD with ITV1-ITV4, RCS); Reservation Policy (ICSA RSV with IRV1-IRV3, HRP1 5%, HRP2 10%)

η | Traffic Distribution | CS: g | ITV1: g($M_t$) | ITV2: g($M_t$) | ITV3: g($M_t$) | ITV4: g($M_t$) | RCS: g | IRV1: g($M_r$) | IRV2: g($M_r$) | IRV3: g($M_r$) | HRP1 (5%): g | HRP2 (10%): g
1.0 | UNIF | 1755.4→100.0 | 107.5(3) | 107.5(7)*$^1$ | 107.5(3) | 107.5(7)* | 101.3 | 107.7(5) | 107.7(16) | 107.7(5) | 100.0 | 100.0
1.0 | HH | 2449.0→100.0 | 106.6(3) | 106.6(5)* | 106.6(3) | 106.6(5)* | 101.0 | 106.9(5) | 106.9(17)* | 106.9(5) | 100.0 | 100.0
1.0 | HL | 1061.9→100.0 | 107.5(4) | 107.5(9)* | 107.5(4) | 107.5(7) | 101.6 | 107.8(4) | 107.8(19) | 107.8(4) | 100.0 | 100.0
1.0 | RAND | 1738.7→100.0 | 104.4(5) | 104.4(6)* | 104.4(5) | 104.4(6)* | 100.0 | 104.6(4) | 104.6(18) | 104.6(4) | 100.0 | 100.0
1.6 | UNIF | 1870.0→100.0 | 149.9(4) | 149.9(5) | 149.9(4) | 149.9(5) | 147.6 | 150.9(6) | 150.9(12)* | 150.9(6) | 143.3 | 136.7
1.6 | HH | 2608.8→100.0 | 129.8(4) | 129.8(5) | 129.8(4) | 129.8(4) | 129.7 | 131.7(4) | 131.7(13)* | 131.7(4) | 127.8 | 124.6
1.6 | HL | 1131.1→100.0 | 155.6(4) | 155.6(7) | 155.6(4) | 155.6(8) | 150.1 | 156.7(4) | 156.7(19)* | 156.7(4) | 152.2 | 145.3
1.6 | RAND | 1827.6→100.0 | 152.1(3) | 152.1(4) | 152.1(3) | 152.1(5) | 147.7 | 153.0(5) | 153.0(14)* | 153.0(5) | 148.3 | 141.4

Heterogeneous Cases: $\gamma = (48, 44, 40, 36, 32, 28, 24, 20, 16, 8, 4, 2, 1, 0.5, 0.25, 0.125)$, $\eta = 1.6$, Traffic Distribution: RAND
Column groups: Parameters; CS; Threshold Policy (ICSA THD with ITV1-ITV4, RCS)

$b$ | $\mu$ | CS: g | ITV1: g($M_t$) | ITV2: g($M_t$) | ITV3: g($M_t$) | ITV4: g($M_t$) | RCS: g
(20,20,15,15,10,10,8,8,5,5,4,4,2,2,1,1) | (20,20,15,15,10,10,8,8,5,5,4,4,2,2,1,1) | 788.9→100.0 | 262.5(3) | 262.5(4) | 262.5(3) | 262.5(4) | 249.8
(20,20,15,15,10,10,8,8,5,5,4,4,2,2,1,1) | (1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1) | 852.8→100.0 | 242.1(3) | 242.1(4) | 242.1(3) | 242.1(4) | 228.3
(20,20,15,15,10,10,8,8,5,5,4,4,2,2,1,1) | (1,1,2,2,4,4,5,5,8,8,10,10,15,15,20,20) | 925.5→100.0 | 220.9(3) | 220.9(4) | 220.9(3) | 220.9(4) | 214.5
(1,1,2,2,4,4,5,5,8,8,10,10,15,15,20,20) | (20,20,15,15,10,10,8,8,5,5,4,4,2,2,1,1) | 2273.8→100.0 | 111.8(4) | 111.8(4) | 111.8(4) | 111.8(3) | 110.6
(1,1,2,2,4,4,5,5,8,8,10,10,15,15,20,20) | (1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1) | 2293.5→100.0 | 112.5(4) | 112.5(3) | 112.5(4) | 112.5(4) | 111.1
(1,1,2,2,4,4,5,5,8,8,10,10,15,15,20,20) | (1,1,2,2,4,4,5,5,8,8,10,10,15,15,20,20) | 2250.3→100.0 | 112.9(4) | 112.9(4) | 112.9(4) | 112.9(3) | 111.9

1. ‘*’ means that the returned threshold policy (or reservation policy) with this initial threshold vector (or reservation vector) is different from the returned threshold policy (or reservation policy) with ITV1 (or IRV1).

TABLE IV
ROBUSTNESS COMPARISON OF THRESHOLD AND RESERVATION POLICIES
K = 2, C = 100, $(\gamma_1, \gamma_2) = (5, 1)$, $(b_1, b_2) = (1, 1)$, $(\mu_1, \mu_2) = (1, 1)$

A priori traffic demands $(\eta_1^0, \eta_2^0) = (0.8, 0.8)$
A priori policy | Real $(\eta_1, \eta_2) = (0.4, 1.2)$ | Real $(\eta_1, \eta_2) = (1.2, 0.4)$ | Real $(\eta_1, \eta_2) = (0.6, 0.6)$ | Real $(\eta_1, \eta_2) = (1.0, 1.0)$
Threshold, $(t_1^0, t_2^0) = (100, 6)$ | g = 104.6, $g^*$ = 128.9 | g = 117.5, $g^*$ = 122.5 | g = 105.7, $g^*$ = 114.2 | g = 119.3, $g^*$ = 155.6
Reservation, $(r_1^0, r_2^0) = (0, 13)$ | g = 125.3, $g^*$ = 128.9 | g = 122.2, $g^*$ = 122.5 | g = 112.9, $g^*$ = 114.2 | g = 154.5, $g^*$ = 155.6

A priori traffic demands $(\eta_1^0, \eta_2^0) = (0.4, 1.2)$
A priori policy | Real $(\eta_1, \eta_2) = (0.8, 0.8)$ | Real $(\eta_1, \eta_2) = (1.2, 0.4)$ | Real $(\eta_1, \eta_2) = (0.6, 0.6)$ | Real $(\eta_1, \eta_2) = (1.0, 1.0)$
Threshold, $(t_1^0, t_2^0) = (100, 51)$ | g = 103.4, $g^*$ = 137.2 | g = 100.0, $g^*$ = 122.5 | g = 102.4, $g^*$ = 114.2 | g = 103.7, $g^*$ = 155.6
Reservation, $(r_1^0, r_2^0) = (0, 4)$ | g = 131.4, $g^*$ = 137.2 | g = 117.6, $g^*$ = 122.5 | g = 113.5, $g^*$ = 114.2 | g = 144.0, $g^*$ = 155.6

A priori traffic demands $(\eta_1^0, \eta_2^0) = (1.2, 0.4)$
A priori policy | Real $(\eta_1, \eta_2) = (0.8, 0.8)$ | Real $(\eta_1, \eta_2) = (0.4, 1.2)$ | Real $(\eta_1, \eta_2) = (0.6, 0.6)$ | Real $(\eta_1, \eta_2) = (1.0, 1.0)$
Threshold, $(t_1^0, t_2^0) = (100, 0)$ | g = 134.9, $g^*$ = 137.2 | g = 101.6, $g^*$ = 128.9 | g = 103.7, $g^*$ = 114.2 | g = 155.6, $g^*$ = 155.6
Reservation, $(r_1^0, r_2^0) = (0, 75)$ | g = 134.9, $g^*$ = 137.2 | g = 101.6, $g^*$ = 128.9 | g = 103.7, $g^*$ = 114.2 | g = 155.6, $g^*$ = 155.6
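One way to read the robustness results in Table IV is to express each entry as a percentage gap to the optimal revenue rate, $100\,(g^* - g)/g^*$. The sketch below does this for the Table IV values; the gap measure, the dictionary `TABLE_IV`, and the function names are illustrative choices made for this example and are not part of the paper's method.

```python
from typing import Dict, List, Tuple

# Table IV entries, keyed by the a priori traffic demands.
# Each tuple is (threshold g, reservation g, optimal g*) for one real demand.
TABLE_IV: Dict[Tuple[float, float], List[Tuple[float, float, float]]] = {
    (0.8, 0.8): [(104.6, 125.3, 128.9), (117.5, 122.2, 122.5),
                 (105.7, 112.9, 114.2), (119.3, 154.5, 155.6)],
    (0.4, 1.2): [(103.4, 131.4, 137.2), (100.0, 117.6, 122.5),
                 (102.4, 113.5, 114.2), (103.7, 144.0, 155.6)],
    (1.2, 0.4): [(134.9, 134.9, 137.2), (101.6, 101.6, 128.9),
                 (103.7, 103.7, 114.2), (155.6, 155.6, 155.6)],
}


def gap_percent(g: float, g_star: float) -> float:
    """Relative shortfall of a policy against the optimal CAC policy."""
    return 100.0 * (g_star - g) / g_star


def mean_gaps(rows: List[Tuple[float, float, float]]) -> Tuple[float, float]:
    """Average gap of the threshold and reservation policies over one block."""
    thd = [gap_percent(t, s) for t, _, s in rows]
    rsv = [gap_percent(r, s) for _, r, s in rows]
    return sum(thd) / len(thd), sum(rsv) / len(rsv)


if __name__ == "__main__":
    for a_priori, rows in TABLE_IV.items():
        thd_gap, rsv_gap = mean_gaps(rows)
        print(f"a priori {a_priori}: threshold {thd_gap:.1f}%, "
              f"reservation {rsv_gap:.1f}%")
```

Averaging over the four real demand points is only one possible summary; a worst-case gap per block would serve equally well for comparing the two policy families.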