Dynamic Assignment of Flexible Service Resources

Yalçın Akçay
College of Administrative Sciences and Economics, Koç University, Istanbul, 34450, Turkey

Anant Balakrishnan
McCombs School of Business, University of Texas at Austin, Austin, TX 78712, USA

Susan H. Xu
Smeal College of Business, Penn State University, University Park, PA 16802, USA

Revised: June 2009

Abstract

Resource flexibility is an important tool for firms to better match capacity with demand so as to increase revenues and improve service levels. However, in service contexts that require dynamically deciding whether to accept incoming jobs and what resource to assign to each accepted job, harnessing the benefits of flexibility requires using effective methods for making these operational decisions. Motivated by the resource deployment decisions facing a professional service firm in the workplace training industry, we address the dynamic job acceptance and resource assignment problem for systems with general resource flexibility structure, i.e., with multiple resource types that can each perform different overlapping subsets of job types. We first show that, for systems containing specialized resources for individual job types and a versatile resource type that can perform all job types, the exact policy uses a threshold rule. With more general flexibility structures, since the associated stochastic dynamic program is intractable, we develop and test three optimization-based approximate policies. Our extensive computational tests show that one of the methods, which we call the Bottleneck Capacity Reservation policy, is remarkably effective in generating near-optimal solutions over a wide range of problem scenarios. We also consider a model variant that requires dynamic job acceptance decisions but permits deferring resource assignment decisions until the end of the horizon. For this model, we discuss an adaptation of our approximate policy, establish the effectiveness of this policy, and assess the value of postponing assignment decisions.

1. Introduction

Resource flexibility, the ability of manufacturing or service resources to perform many different job types, is an important tool for firms to cope with the uncertainties in the mix and volume of demand. For given total capacity, higher flexibility permits meeting more of the demand by adjusting the deployment of the flexible resources. This ability to match demand with available capacity is critical in today’s highly competitive and dynamic environment in which firms must offer a wide variety of products and services while simultaneously ensuring high service levels and good resource utilization. The operations management literature on flexibility largely emphasizes strategic issues and insights such as the role of flexibility in handling uncertainties and the level of flexibility needed to maximize expected profits or minimize lost sales. These models simplify the operational issues of how much of each product’s demand to serve and how to deploy the available resources by assuming that these decisions are made only after observing all demand. In service settings, however, demand acceptance and resource assignment decisions must be made as demand arises, without complete information; so, harnessing the benefits of flexibility requires effective operational policies. As Maister (1997) notes, service firms often treat these decisions as routine scheduling functions and tend to underestimate their impact; however, they are critical to firm profitability since the dynamic resource deployment decisions largely govern revenue generation and resource utilization. This paper is motivated by the dynamic demand selection and resource assignment decisions facing a workplace training firm that uses instructors with varying levels of expertise to offer different types of training programs. The goal of this research is to develop effective methods to support deployment decisions for flexible resources in such settings.
The workplace training industry has grown significantly in recent years, driven by the competitive imperative for firms to continuously upgrade the knowledge and skills of their employees in order to cope with rapid technological changes and business trends such as outsourcing and globalization of markets and competition. According to the 2007 state of the industry report by the American Society for Training and Development, training expenditures of large organizations increased from 1.8% of total annual payroll costs in 1997 to 2.33% in 2006, with an average annual training expenditure of over $1,000 per employee (Paradise 2007). Outsourced training accounts for around 28% of total training expenditures, with face-to-face delivery continuing as the dominant (over 70%) form of training. These pressing corporate training needs have led to the emergence of several workplace training companies that provide on-site training for client corporations by offering short-term educational programs on various topics such as leadership, quality, negotiation, and global management. To offer these different programs, training firms employ instructors with varying capabilities: some instructors are qualified to teach only one particular type of training program,
whereas others are experienced enough to teach multiple types of programs. At the strategic level, the training firm must decide what types of training programs to offer, how to price these programs, and how many flexible and specialized instructors to employ. Given these choices, we study how the firm should dynamically respond to incoming requests for training on a particular date in order to maximize expected profits. Soon after receiving a client program request for this date, the firm must first decide whether to accept or decline this job; if accepted, the firm must also decide which among the available and qualified instructors or resources to assign to the job. These job acceptance and resource assignment decisions, constituting the dynamic resource deployment problem, must incorporate the following core tradeoff. By accepting the current request, the firm foregoes the opportunity to use the resource for a more profitable future request. On the other hand, if the firm declines the current request, it faces the risk of having to later assign the resource to a less profitable program or leaving the resource idle on the training date. With flexible resources, the problem is even more challenging due to the complex interrelationships between the acceptance and assignment decisions across job and resource types. The optimal decision depends on factors such as the number and flexibility of available resources, relative profitability of different job types, and the volume, mix, and variability of future demand. Past research on flexibility has focused primarily on long-term capacity planning and optimal investment strategies for flexible resources. Fine and Freund (1990), Jordan and Graves (1995), Van Mieghem (1998), Netessine et al. (2002), Bish and Wang (2004) and others study the value of flexibility as a hedge against uncertainty. 
Using stylized models, they develop insights or methods to determine the optimal level of investment in flexible and specialized capacities for manufacturing firms that face uncertainty in the volume and mix of demand. Typically, these models represent the capacity planning problem as a profit maximizing two-stage stochastic program in which the first stage selects the capacity levels and the second stage allocates the realized demand to the available capacity. Since these models focus on the higher-level capacity planning decisions, they assess the value of each candidate capacity portfolio by assuming that, at the second stage, the firm can optimally select the demand to serve and the resources to assign after observing all of the demand. For service operations contexts in which job acceptance and resource assignment decisions must be made soon after each job arrives, this “perfect demand information” assumption overestimates the true profit impact of capacity choices. Our work seeks to develop methods for dynamic resource deployment without full knowledge of demand, but assuming given capacity and flexibility levels. Like the problem we study, revenue management (e.g., Talluri and Van Ryzin 2004) also deals with dynamic decisions on using resources whose output cannot be inventoried. However, our flexible resource deployment problem differs in two important ways from traditional revenue management applications in service industries such as airlines, hotels, and car rental agencies (e.g., Smith et al. 1992, Cross 1997, Geraghty and Johnson 1997, Sen and Zhang 1999). First, in these revenue management contexts, the resource structure is quite specialized, eliminating or simplifying the resource assignment decisions. For instance, many airline seat allocation models assume a single versatile resource type (e.g., economy class seats) that can accommodate all demand types (e.g., fare classes), and so focus solely on demand acceptance decisions. Similarly, the stochastic knapsack problem (e.g., Ross 1995, Kleywegt and Papastavrou 1998), applicable to production and transportation contexts, focuses on acceptance decisions for jobs (with varying sizes and revenues) using a single resource type. In other applications, resources have a hierarchical structure that permits full downward resource substitution (e.g., a premium car can be downgraded to satisfy a request for a standard or compact car). In contrast, the resource flexibility structure in training firms is much more complex. Resources are not identical, nor do they have a complete ordering in terms of their flexibility, because different resource types can have arbitrary overlaps in their service capabilities. Second, in some revenue management contexts, firms can regulate demand by changing prices dynamically, whereas pricing decisions for training services are strategic (e.g., based on competitive factors or long-term client relationships) and therefore fixed during the operational decision horizon. For call center management, another context that entails dynamic service decisions, researchers have studied skills-based routing policies to assign inbound calls to multi-skilled agents who can handle different call types (Gans, Koole, and Mandelbaum 2003 and Aksin, Armony, and Mehrotra 2007 provide comprehensive surveys of call center models).
However, call centers are also quite different from our problem context. First, incoming calls require service as soon as possible, and queue up until the next available agent is assigned to the call. The time to process each call is stochastic, and agents can handle multiple calls in sequence during each time interval. In contrast, for our problem setting, each accepted job requires service on the specified training date at the end of the horizon, and each resource can handle only one job. Second, most skills-based routing models focus on resource assignment and do not simultaneously consider call admission, whereas our problem requires joint consideration of job acceptance and resource assignment decisions. Third, call center models focus on minimizing expected waiting time or customer abandonment rate, while our objective is to maximize the expected profit. Queuing models provide a natural representation of call centers, and stochastic dynamic programming is a standard approach for determining optimal routing policies in Markov queues. Because the state space is prohibitively large for practical problems, researchers have only studied optimal call-routing policies for simplified systems with two call types and the following three canonical designs (Garnett and Mandelbaum 2001): V-design systems consisting of a single pool of cross-trained agents (e.g., Gans and Zhou 2002, Bhulai and Koole
2003, Armony and Maglaras 2004a and 2004b), N-design systems containing two pools of resources, one dedicated to one of the job types and the other capable of handling both job types (e.g., Xu, Righter, and Shanthikumar 1992), and M-design systems having two pools of specialized resources and a common pool of flexible agents (e.g., Ormeci 2004). For systems with more than two call types and more complex resource flexibility structures, researchers have proposed various heuristic methods, such as static priority policies (Stanford and Grassman 2000 and Shumsky 2004) and schemes based on age factors for different call types (Perry and Nilsson 1992). The process of booking training programs also differs from contexts such as airline reservations and call centers because it is a business-to-business interaction. For instance, clients seeking training programs are willing to wait a few days for a response, permitting the training firm to decide on pending requests periodically rather than responding instantaneously to each request. Further, the general resource flexibility structure necessitates simultaneous consideration of job acceptance and resource assignment decisions (to ensure that a set of accepted jobs is feasible, given the number and mix of available resources). In Section 2, we formulate the flexible resource deployment problem as a profit-maximizing discrete-time stochastic dynamic program, and discuss a deterministic model to generate upper bounds on the optimal profits. We focus on situations that require the firm to decide both job acceptance and resource assignments in each period, but later discuss a variant in which the resource assignment decisions can be deferred until the end of the decision horizon. For either case, since the state space is very large, finding the exact policy by solving the associated Bellman equations is impractical. 
Section 3 analyzes the special case of a system containing specialized resources for each job type and a versatile resource type that can perform all job types, with no more than one job arriving in each period. For this setting, we show that the exact policy uses a threshold rule and identify some principles underlying this policy. In Section 4 we develop three approximate policies for job acceptance and resource assignment for general resource flexibility structures and job arrival processes. The first two policies, called Deterministic Capacity Allocation and Nested Capacity Reservation, explicitly account for the resource flexibility structure and demand stochasticity, respectively, while the third approach, called Bottleneck Capacity Reservation, jointly considers both dimensions. Section 5 reports the results of our extensive computational tests to assess the effectiveness and robustness of these policies under various operating environments, such as different levels of resource availability and flexibility. We benchmark the performance of the approximate policies against both a first-come first-served approach and the exact policy or perfect-information solution (which provides an upper bound on exact profit). Our computational results, for 25 different problem scenarios, with numerous replications for each, consistently confirm that our dynamic policies are very effective. For problem instances that we could also solve exactly, the profit from our best policy deviates from optimality by only 1.28% on average (and by at most 2.22%), compared to 18.79% for the first-come first-served policy. Section 6 discusses the model variant in which acceptance decisions must be made in each period but resource assignments can be deferred until the end of the decision horizon. We present the modified problem formulation for this variant, demonstrate how to adapt one of our approximate policies, and report computational results both to demonstrate the effectiveness of this policy and to assess the value of deferring assignments. Section 7 concludes the paper.

2. Problem Definition and Formulation

Consider a firm that offers several different service types using resources with varying levels of flexibility. In the workplace training context, the service types correspond to different training programs that the firm offers, and the resources are instructors who can teach one or more of these programs. Client requests for training programs arrive randomly, each specifying the type of program needed and the date on which the program is to be offered. We focus on the demand acceptance and resource assignment decisions for program requests for a particular date in the future. We refer to requests for training on this service date as jobs; the decision horizon is the time interval during which clients can request training on this date. Every accepted job requires one resource with the appropriate capabilities, and a resource cannot be assigned to more than one job. Typically, each client requesting a training program expects a prompt, but not necessarily immediate, response from the training firm on whether the firm is willing to conduct the program. Depending on the response times that clients expect, the training firm may be able to accumulate a few requests before deciding which programs to accept. Thus, unlike real-time business-to-customer interactions such as airline reservation systems or call centers that may require real-time policies, the process of booking training programs is a business-to-business interaction for which a discrete-time framework is appropriate.¹ This framework has the advantage of providing some latitude for modeling different situations. For instance, by making the period length small enough so that no more than one request arrives in each period, we can model immediate acceptance decisions for each individual request, whereas increasing the period length permits us to assess the benefits of demand accumulation.
We can also vary the period length within the decision horizon to represent, for instance, quicker response requirements as the program offering date approaches. Turning to the frequency and timing of the resource assignment decisions, workplace training often involves some customization of the training program to the client firm. So, client firms as well as instructors prefer to have as much lead time as possible before the program date in order to prepare for the engagement. In this situation, the training firm makes acceptance and assignment decisions concurrently, i.e., when the firm accepts a program request it also assigns a specific instructor to the job rather than deferring the assignment decisions until the end of the decision horizon. Accordingly, our initial model incorporates dynamic decisions for both job acceptance and resource assignment; in Section 6, we address the model variant in which assignment decisions may be deferred until the end of the decision horizon. As we explain later, deferring the assignments does not reduce problem difficulty since making dynamic job acceptance decisions requires simultaneously developing a resource allocation plan, albeit tentative, to ensure that the acceptance decisions are feasible. The firm’s goal is to maximize the expected total net revenue over the decision horizon. In making the job acceptance and resource assignment decisions in each period, the firm must consider the probability distribution of future demand, the availability of instructors and their flexibility, and the profitability of different types of programs. We next introduce our notation and formulate the decision problem as a stochastic dynamic program.

¹ Even for problem contexts that require real-time response, the research literature, including papers on revenue management, often uses discrete-time model formulations for tractability and convenience (see, for instance, Talluri and van Ryzin 2004 and references therein).

2.1 Stochastic dynamic programming formulation

Given the set of available resources and their capabilities, and the demand distribution and profit margin for each job type, the flexible resource deployment problem entails deciding which incoming jobs (service requests) to accept in each period before the service date and what resource to assign to each accepted job in order to maximize the total expected profit. We model this problem as a stochastic dynamic program in which stages correspond to periods of the finite decision horizon.
Let $J = \{1, 2, \ldots, m\}$ be the set of m different job types offered by the firm. Each accepted job of type $j \in J$ yields a profit margin of $\pi_j$ to the training firm. We index the job types in decreasing order of profitability, i.e., $\pi_1 \ge \pi_2 \ge \cdots \ge \pi_m$. Consistent with revenue management models that seek to maximize profits using a given set of resources, we assume that the profit margin of a job is the same regardless of the resource type assigned to this job. (Later, we selectively discuss how some of our results and methods extend to problems with resource-dependent profit contributions.) Let $R = \{1, 2, \ldots, l\}$ denote the set of l different resource types that the firm employs, each capable of performing a specified subset of job types. For each resource type $r \in R$, let $J_r \subseteq J$ denote the subset of job types that resource type r can perform. Conversely, for each job type $j \in J$, let $R_j \subseteq R$ be the subset of resource types that can perform type-j jobs. The resource capability sets $J_r$ represent the firm’s resource flexibility structure. We refer to any resource type that can perform only one job type as a specialized resource; at the other extreme, a flexible resource that can perform all job types is called a versatile resource. We are primarily concerned with solving problems with general resource
flexibility structures containing some flexible resource types with overlapping but non-dominated capabilities, but we do consider certain special flexibility structures to characterize the optimal policy and highlight the challenges of solving problems with general flexibility structure. The definition of job and resource types, with the associated capability sets, permits modeling a wide variety of situations. For instance, if profit margins vary by customer class or if certain customers have specific resource preferences, we can capture these features by defining separate job types for such customers. Similarly, although price is not a decision variable, we can incorporate different pre-determined price categories (analogous to fare classes in airline reservations) for the same service by introducing one job type corresponding to each price. We can also include additional costs (e.g., goodwill loss) for not accepting jobs. Specifically, if αj denotes the penalty cost for rejecting a type-j job, we can account for this cost by redefining the parameter πj for type-j jobs as the profit margin for this job type plus the penalty αj. We consider dynamic decisions over a decision horizon consisting of T periods, indexed backwards from t = T to 1. Period T denotes the start of the decision horizon, and period 1 is the last period for customer requests prior to the service date. As noted earlier, the choice of period length, which can vary during the decision horizon, depends on the desired customer response lead time and determines the extent to which demands can be accumulated before making acceptance and assignment decisions. In the limit, by choosing very short period lengths, we can model situations in which the firm must make acceptance and assignment decisions as soon as each job arrives. 
For r ∈ R and t = 1, 2, …, T, let nrt be the number of available (unassigned) type-r resources at the start of period t, and define nt = {n1t, n2t, …, nlt} as the resource availability vector at time t. The firm’s initial resource capacity, at the start of the decision horizon, is nT = {n1T, n2T, …, nlT}. Let Djt be the random demand for type-j jobs in period t; Dt = {D1t, D2t, …, Dmt} is the period-t demand vector. Let dt = {d1t, d2t, …, dmt} be the vector of realized demands djt for type-j jobs in period t. P(Dt = dt) is the joint probability that the demand in period t is d t . We do not assume any particular functional form for the demand distribution (demand can even be non-stationary, arrive in batches, or correlated among job types), but require demands across time periods to be independent. The stochastic dynamic programming formulation of the flexible resource deployment problem consists of T stages, one for each period t = 1, 2, …, T. The resource availability vector nt = {nrt} and realized demand vector dt ={djt} describe the state of the system in period t. Define ut(nt, dt) as the maximum expected profit-to-go at stage t for the state (nt , dt ) . Then, we can express the value function vt(nt) associated with having nt resources at the start of period t, before observing the actual demand in this period, as follows:

$$v_t(n_t) = \sum_{d_t} u_t(n_t, d_t)\, P(D_t = d_t). \tag{2.1}$$

The profit-to-go $u_t(n_t, d_t)$ depends on the job acceptance and resource assignment decisions in period t, after observing the actual demand $d_t$, and the value function for the remaining unassigned resources at the end of the period. For each $j \in J$ and $r \in R_j$, let $x_{jrt}$ be the number of type-r resources assigned to type-j jobs in period t after observing demand. These assignment decisions must satisfy: (i) the demand constraints $\sum_{r \in R_j} x_{jrt} \le d_{jt}$ for all $j \in J$, specifying that we cannot assign more resources to any job type than the actual demand for that job type in period t, and (ii) the resource availability constraints $\sum_{j \in J_r} x_{jrt} \le n_{rt}$ for all $r \in R$, ensuring that we do not assign more resources than available for any resource type. Let

$$FA_t(n_t, d_t) = \Big\{ x_{jrt} \ge 0 \ \ \forall j \in J,\ r \in R_j \ :\ \sum_{r \in R_j} x_{jrt} \le d_{jt} \ \ \forall j \in J,\ \ \sum_{j \in J_r} x_{jrt} \le n_{rt} \ \ \forall r \in R \Big\}$$

denote the action set containing all the feasible assignments for resources in period t, given the state $(n_t, d_t)$. Then, using $e_k$ to denote the kth unit vector, we can express the maximum profit-to-go function $u_t(n_t, d_t)$, for all periods t = 1, 2, …, T, as follows:

$$u_t(n_t, d_t) = \max_{\{x_{jrt}\} \in FA_t(n_t, d_t)} \Big\{ \sum_{j \in J} \sum_{r \in R_j} \pi_j x_{jrt} + v_{t-1}\Big(n_t - \sum_{j \in J} \sum_{r \in R_j} x_{jrt}\, e_r\Big) \Big\}. \tag{2.2}$$
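As an illustration only, the recursion defined by (2.1) and (2.2) can be sketched by brute-force enumeration on a very small instance. The instance below (two job types, two resource types, an identical demand distribution in every period) and all names in the code are hypothetical choices made for this sketch, not data or methods from the paper; job and resource types are indexed from 0, and unassigned resources are assumed to be worth nothing once no periods remain.

```python
from functools import lru_cache
from itertools import product

# Hypothetical toy instance (illustrative values, not taken from the paper).
PROFIT = (3.0, 1.0)        # pi_j for job types j = 0, 1
CAN_DO = ((0, 1), (1,))    # J_r: resource type 0 is versatile, type 1 only does job type 1
# Per-period demand distribution P(D_t = d), assumed identical in every period.
DEMAND = (((0, 0), 0.2), ((1, 0), 0.3), ((0, 1), 0.5))
PAIRS = [(j, r) for r, jobs in enumerate(CAN_DO) for j in jobs]


def used(x, r):
    """Number of type-r resources consumed by assignment x."""
    return sum(c for (_, r2), c in x.items() if r2 == r)


def feasible_assignments(n, d):
    """Yield every x = {(j, r): count} in the action set FA_t(n, d)."""
    ranges = [range(min(d[j], n[r]) + 1) for j, r in PAIRS]
    for counts in product(*ranges):
        x = dict(zip(PAIRS, counts))
        demand_ok = all(
            sum(c for (j2, _), c in x.items() if j2 == j) <= d[j]
            for j in range(len(d)))
        supply_ok = all(used(x, r) <= n[r] for r in range(len(n)))
        if demand_ok and supply_ok:
            yield x


def u(t, n, d):
    """Profit-to-go (2.2) after observing demand d with t periods left."""
    best = float("-inf")
    for x in feasible_assignments(n, d):
        reward = sum(PROFIT[j] * c for (j, _), c in x.items())
        left = tuple(n[r] - used(x, r) for r in range(len(n)))
        best = max(best, reward + v(t - 1, left))
    return best


@lru_cache(maxsize=None)
def v(t, n):
    """Value function (2.1); v(0, n) = 0 for leftover resources."""
    if t == 0:
        return 0.0
    return sum(p * u(t, n, d) for d, p in DEMAND)


# Expected profit with one versatile resource and two periods remaining.
print(v(2, (1, 0)))
```

In this toy instance, with two periods left and a single versatile resource, the recursion rejects an arriving low-profit job (index 1), because its margin of 1 is below the value of keeping the resource for the final period. The enumeration is exponential in the number of job and resource types, so the sketch is only practical for tiny instances.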

That is, the profit-to-go $u_t(n_t, d_t)$ is the sum of the rewards collected in period t plus the maximum expected profit over the remaining time horizon. Given the firm’s starting resource capacity $n_T$, the maximum expected profit over the decision horizon is $v_T(n_T)$. Computing this value entails recursively applying (2.2) from t = 1 to t = T, with the boundary condition $v_0(n_0) = 0$ for all $n_0 \le n_T$, specifying that the value of any unassigned resources left over at the end of the horizon is zero. Finding the exact policy, i.e., the profit-maximizing dynamic policy for optimal job acceptance and resource assignment decisions in each period, is computationally intractable because the state space grows exponentially with the number of jobs, resources, and time periods. Next, we consider a deterministic version of the problem that provides upper bounds on the maximum expected profit for the stochastic problem and also underlies approximate policies.

2.2 Deterministic acceptance and resource assignment problem

Suppose, at period t, we know with certainty the cumulative future demand $cd_{jt}$ of each job type $j \in J$ over the remaining t periods, and let $cd_t = \{cd_{1t}, cd_{2t}, \ldots, cd_{mt}\}$. Then, given the set of available resources $n_t$, the resource deployment decision reduces to the following deterministic demand acceptance and resource allocation problem, with decision variables $y_{jr}$ denoting the total number of type-r resources assigned to type-j jobs in the remaining t periods.

$$U_t(n_t, cd_t) = \max \sum_{j \in J} \sum_{r \in R_j} \pi_j y_{jr} \tag{2.3}$$

subject to:

$$\sum_{r \in R_j} y_{jr} \le cd_{jt} \quad \text{for all } j \in J, \tag{2.4}$$

$$\sum_{j \in J_r} y_{jr} \le n_{rt} \quad \text{for all } r \in R, \text{ and} \tag{2.5}$$

$$y_{jr} \ge 0 \quad \text{for all } j \in J \text{ and } r \in R_j. \tag{2.6}$$
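For intuition, problem (2.3) – (2.6) can be solved on a tiny instance by simply enumerating integer allocations. The sketch below is not the solution method discussed in the paper; the function name `solve_dd`, the 0-based indexing, and the example data are all hypothetical, and the enumeration assumes integer demands and capacities.

```python
from itertools import product


def solve_dd(profit, can_do, n, cd):
    """Brute-force (2.3)-(2.6) over integer y; returns (best value, best y).

    profit[j] : margin pi_j of job type j
    can_do[r] : tuple of job types J_r that resource type r can perform
    n[r]      : available units of resource type r
    cd[j]     : known cumulative demand for job type j
    """
    pairs = [(j, r) for r, jobs in enumerate(can_do) for j in jobs]
    ranges = [range(min(cd[j], n[r]) + 1) for j, r in pairs]
    best_value, best_y = 0.0, dict.fromkeys(pairs, 0)
    for counts in product(*ranges):
        y = dict(zip(pairs, counts))
        # Demand constraints (2.4).
        if any(sum(c for (j, _), c in y.items() if j == k) > cd[k]
               for k in range(len(cd))):
            continue
        # Resource availability constraints (2.5).
        if any(sum(c for (_, r), c in y.items() if r == k) > n[k]
               for k in range(len(n))):
            continue
        value = sum(profit[j] * c for (j, _), c in y.items())
        if value > best_value:
            best_value, best_y = value, y
    return best_value, best_y


# Hypothetical example: three job types, three overlapping resource types.
value, y = solve_dd(profit=(3.0, 2.0, 1.0),
                    can_do=((0, 1), (1, 2), (2,)),
                    n=(1, 1, 1),
                    cd=(1, 2, 1))
print(value, y)
```

In the example, the second unit of demand for job type 1 cannot be served because the only other resource type able to perform it is already committed to the more profitable job type 0, which is the kind of interaction between acceptance and assignment that the model captures.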

Given the solution $y = \{y_{jr}\}$ to this problem, we can determine the job acceptance and resource assignments $x_{jrt'}$, for each period t' = t, …, 1, in many different ways (e.g., prefer accepting jobs that arrived earlier, and assign the allocated resource types in arbitrary order) to satisfy, for all $j \in J$, the demand requirements $\sum_{r \in R_j} x_{jrt'} \le d_{jt'}$ and resource allocations $\sum_{t'=t}^{1} x_{jrt'} = y_{jr}$ for all $r \in R_j$.

Since this model assumes that future demand is known, we refer to it as the Deterministic demand selection and resource assignment (DD) model. We can transform the optimization problem (2.3) – (2.6) into a single-commodity minimum cost network flow problem defined over a network containing m job nodes, l resource nodes, and an additional source node. Since the minimum cost network flow model has an integer optimal solution if demands and capacities are integer-valued, we do not need to explicitly impose integrality restrictions on the y-variables.

The DD problem is easy to solve for certain special resource flexibility structures. Consider, for instance, a system containing m resource types, indexed as r = 1, 2, …, m, with resource type r capable of performing job type j = r and all less profitable (higher-indexed) job types, i.e., with $J_r = \{r, r+1, \ldots, m\}$. Thus, lower-indexed resource types are more flexible than higher-indexed resource types. We refer to this special resource flexibility configuration as the Downward flexibility structure. For this structure, a greedy procedure that sequentially considers job types in order of increasing index j (decreasing profitability), and accepts as many jobs as possible using the remaining resources yields the optimal solution to the DD problem. In contrast, with general flexibility structure, since resources do not have a complete ordering in terms of their flexibility, making demand acceptance and resource assignment decisions even with known demand requires solving the linear program (2.3) – (2.6). The difficulty is compounded when we consider dynamic decisions with unknown future demand.

For the stochastic problem, the DD model, applied using appropriate point estimates of demand, may serve to guide resource allocation decisions for approximate policies. Moreover, as we discuss next, this model is also useful for generating upper bounds on the optimal value of the stochastic problem. First, suppose we solve this model for every possible realization $cd_t$ of the stochastic cumulative demand vector $CD_t = \{CD_{1t}, CD_{2t}, \ldots, CD_{mt}\}$, and obtain the corresponding optimal value $U_t(n_t, cd_t)$. Multiplying this value by the probability of realizing the cumulative demand vector $cd_t$ and summing over all demand scenarios gives the expected perfect-information value $E[U_t(n_t, CD_t)]$. This value is an upper bound on the expected profit $v_t(n_t)$. Next, for each $j \in J$, suppose we replace $cd_{jt}$ in the demand constraint (2.4) with the expected value of the cumulative demand $E(CD_{jt})$ for type-j jobs from period t to the end of the decision horizon. We refer to this version of the DD model as the expected demand model. Let $U_t^{LP}(n_t, E(CD_t))$ be the optimal value of its linear programming relaxation. The following proposition relates this value to the expected perfect-information value $E[U_t(n_t, CD_t)]$ and the profit $v_t(n_t)$ for the exact dynamic policy.

Proposition 1. $U_t^{LP}(n_t, E(CD_t)) \ge E[U_t(n_t, CD_t)] \ge v_t(n_t)$.

Proof. The optimal value $U_t(n_t, cd_t)$ of the linear program (2.3) – (2.6) is concave in $cd_t$. Therefore, applying Jensen’s inequality, we get $U_t^{LP}(n_t, E(CD_t)) \ge E[U_t(n_t, CD_t)]$. Moreover, given the actual demand realization vector $cd_t$, the optimal value $U_t(n_t, cd_t)$ equals or exceeds the profit obtained using any dynamic policy (with unknown demand). Hence, the expected perfect-information value is no less than the expected profit using the exact dynamic policy, i.e., $E[U_t(n_t, CD_t)] \ge v_t(n_t)$. ∎

Proposition 1 states that the optimal value of the expected demand model is an upper bound on the expected perfect-information value, which in turn overestimates the expected profit of the exact dynamic policy. Since finding the exact policy is intractable, Proposition 1 provides a means to assess the effectiveness of approximate policies by comparing their expected profits to either of these upper bounds. Bertsimas and Popescu (2003) developed similar bounds for a network revenue management problem. Next, we analyze the structure of the exact dynamic policy for a special case, and later (in Section 4) develop approximate policies for the general problem.
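A small worked instance may help make the chain of inequalities in Proposition 1 concrete. The numbers below are hypothetical and chosen only for illustration: a single job type with margin $\pi$, a single resource type with one available unit ($n = 1$), and one remaining period in which the (batch) demand $CD$ equals 0 or 2 with probability 1/2 each. In this case the DD value reduces to $U(n, cd) = \pi \min\{n, cd\}$.

```latex
\begin{align*}
U^{LP}\big(n, E(CD)\big) &= \pi \min\{1,\ E(CD)\} = \pi \min\{1, 1\} = \pi, \\
E\big[U(n, CD)\big] &= \tfrac{1}{2}\,\pi \min\{1, 0\} + \tfrac{1}{2}\,\pi \min\{1, 2\} = \tfrac{\pi}{2}, \\
v(n) &= \pi \, P(CD \ge 1) = \tfrac{\pi}{2}.
\end{align*}
```

So $\pi \ge \pi/2 \ge \pi/2$, consistent with the proposition. The first inequality is strict because $\min\{n, \cdot\}$ is concave and demand is variable; the second holds with equality here only because, with a single job type, accepting any arriving job is optimal and foreknowledge of demand has no value.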

3. Analysis of Exact Policy with Specialized and Versatile Resources

Consider a system containing m specialized resource types, one for each job type, plus a versatile resource type that can perform all job types. We refer to this configuration of resource capabilities as the Star flexibility structure; it generalizes the M-design structure (for two job types) that researchers have previously studied in other contexts (e.g., Ormeci 2004). For a system with this resource structure, we characterize the exact policy for the problem with unit job arrivals, i.e., with at most one job arrival in each period t. For j = 1, 2, …, m, let $p_j$ denote the probability that a type-j job arrives in a period; the likelihood of having no demand in a period is $p_0 = 1 - \sum_{j=1}^{m} p_j$. In the limiting case with an infinitesimal period length, this demand model corresponds to job arrivals generated by independent Poisson processes for the m job types. Researchers have used analogous discrete time demand models with unit arrivals for other revenue and resource management settings (e.g., Talluri and van Ryzin 2004, Kleywegt and Papastavrou 1998). For the Star flexibility structure, we index the versatile resource type as r = (m + 1) and the specialized resource types from 1 to m such that, for r = 1, 2, …, m, resource type r is capable of performing only job type j = r. For notational convenience, we omit the subscript t for the resource state, letting $n = \{n_r\}$ denote the vector of available resources at the start of period t. If a type-j job arrives in period t, we abbreviate the state of the system as (n, j). Let $u_t(n, j)$ be the maximum


expected profit-to-go in this state. The maximum expected profit vt(n) before a request (if any) arrives in period t is:

$$v_t(n) = \sum_{j=1}^{m} p_j\, u_t(n, j) + p_0\, v_{t-1}(n). \tag{3.1}$$

If a type-j job arrives in period t, the following equation defines the profit-to-go function $u_t(n, j)$:

$$u_t(n, j) = \begin{cases} \pi_j + v_{t-1}(n - e_j) & \text{if } n_j > 0, \\ \pi_1 + v_{t-1}(n - e_{m+1}) & \text{if } j = 1,\ n_1 = 0 \text{ and } n_{m+1} > 0, \\ \max\{\pi_j + v_{t-1}(n - e_{m+1}),\ v_{t-1}(n)\} & \text{if } 1 < j \le m,\ n_j = 0 \text{ and } n_{m+1} > 0, \text{ and} \\ v_{t-1}(n) & \text{if } n_j = n_{m+1} = 0. \end{cases} \tag{3.2}$$
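For small instances, the recursion (3.1) and (3.2) can be evaluated directly by memoized recursion. The following is a minimal sketch, not the authors' implementation: it assumes a terminal value of zero when no periods remain, uses 0-based job indices (index 0 is the most profitable job type), and stores the versatile resource count in the last position of the state tuple. All function names are hypothetical.

```python
from functools import lru_cache


def make_star_dp(profits, probs):
    """Build v(t, n) for the Star structure; n = (n_1, ..., n_m, n_{m+1})."""
    m = len(profits)
    p0 = 1.0 - sum(probs)  # probability of no arrival in a period

    def minus(n, r):
        """State after consuming one unit of resource r."""
        return n[:r] + (n[r] - 1,) + n[r + 1:]

    def u(t, n, j):
        """Profit-to-go after a type-j arrival, following the cases of (3.2)."""
        if n[j] > 0:                      # specialized resource available
            return profits[j] + v(t - 1, minus(n, j))
        if n[m] > 0:                      # only versatile resources can serve j
            accept = profits[j] + v(t - 1, minus(n, m))
            return accept if j == 0 else max(accept, v(t - 1, n))
        return v(t - 1, n)                # nothing available: reject

    @lru_cache(maxsize=None)
    def v(t, n):
        """Expected profit with t periods to go, following (3.1)."""
        if t == 0:
            return 0.0                    # assumed terminal condition
        return p0 * v(t - 1, n) + sum(probs[j] * u(t, n, j) for j in range(m))

    return v


# One job type, one specialized unit, arrival probability 0.5 per period.
v = make_star_dp(profits=(10.0,), probs=(0.5,))
print(v(1, (1, 0)), v(2, (1, 0)))  # 5.0 7.5
```

The state space grows with the product of the resource counts, so this direct evaluation is only practical for small systems.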

Equation (3.2) states that, for any incoming job type j, if a type-j specialized resource is available, the exact policy accepts the job and assigns a specialized resource. Otherwise, if j = 1 (i.e., the arriving job is the most profitable type) and a type-(m+1) versatile resource is available, then the policy accepts the job and assigns a versatile resource to it. For j > 1, if the type-j resource is exhausted (i.e., $n_j = 0$) but versatile resources are available, the exact policy accepts the type-j job and assigns a versatile resource to it only if $\pi_j + v_{t-1}(n - e_{m+1}) \ge v_{t-1}(n)$. Define $\Delta_t(n) = v_t(n + e_{m+1}) - v_t(n)$ as the marginal value (before observing the demand in period t) of adding one more versatile resource to the resource state n. Therefore, when $n_j = 0$ for j > 1, the exact policy assigns a versatile resource to an incoming type-j job only if $\pi_j \ge \Delta_{t-1}(n - e_{m+1})$, i.e., this job's profit margin $\pi_j$ is at least the marginal value of a versatile resource at resource state $(n - e_{m+1})$ in period (t – 1).

Next, we develop some structural properties of the profit-to-go function. First, we can show that $v_t(n)$ is increasing in $n_r$, for r = 1, 2, …, m, m+1, and decreasing in t. Further, the value function satisfies the following second order properties.

Lemma 2. For the Star flexibility structure with unit job arrivals, (1) $\Delta_t(n) = v_t(n + e_{m+1}) - v_t(n)$ is non-increasing in $n_r$, for any r = 1, …, m, m+1; (2) $\Delta_t(n)$ is non-decreasing in t; and, (3) $\Delta_t(n + e_r) \ge \Delta_t(n + e_{m+1})$ for any r = 1, …, m.

Proof. See Appendix.

The results of Lemma 2 have intuitive appeal. Parts (1) and (2) show that the value of an additional versatile resource is lower when more resources are available, i.e., in systems with abundant capacity, or as the end of the decision horizon approaches. Part (3) implies that an additional versatile resource is worth more when the state contains one more specialized resource than when it contains one more versatile resource.

Let $n^s = (n_1, n_2, \ldots, n_m)$ be the availability vector of specialized resources in period t; the system resource availability vector is $n = (n^s, n_{m+1})$. For any state n with $n_j = 0$, for some j > 1, and for t = 1, 2, …, T, we define a threshold function $F_t^j(n^s) = \min\{n_{m+1} : \Delta_{t-1}(n - e_{m+1}) \le \pi_j\}$. Since the exact policy accepts a type-j job only if $\pi_j \ge \Delta_{t-1}(n - e_{m+1})$, for $n_j = 0$ and j > 1, the definition of $F_t^j(n^s)$ implies that, if type-j resources are exhausted, the exact policy assigns a versatile resource to an incoming type-j job only if $n_{m+1} \ge F_t^j(n^s)$. The next proposition summarizes the properties of this state-dependent threshold function $F_t^j(n^s)$.

Proposition 3. For the Star flexibility structure with unit job arrivals, let n be the state of the system in period t, with $n_j = 0$ for j > 1. (1) $F_t^j(n^s)$ is non-increasing in $n_k$, for any specialized resource type k ≠ j and fixed t; (2) $F_t^j(n^s)$ is non-decreasing in t, for any specialized resource vector $n^s$; and, (3) if the exact policy accepts a type-j job in state $(n + e_k, j)$ in period t, then it must also accept this job in state $(n + e_{m+1}, j)$ in period t, for any specialized resource type k and period t.

Proof. See Appendix.
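The threshold function defined above can be tabulated numerically for small systems by scanning the versatile resource count until the marginal value of a versatile resource drops to the job's profit margin. The sketch below re-implements the recursion (3.1) and (3.2) so that it is self-contained; it assumes a zero terminal value, 0-based job indices, a search that starts at one versatile unit, and a hypothetical search cap `max_versatile`.

```python
from functools import lru_cache


def star_value_function(profits, probs):
    """v(t, n) for the Star structure; the last entry of n is the versatile count."""
    m = len(profits)
    p0 = 1.0 - sum(probs)

    def use(n, r):
        return n[:r] + (n[r] - 1,) + n[r + 1:]

    @lru_cache(maxsize=None)
    def v(t, n):
        if t == 0:
            return 0.0  # assumed terminal condition
        total = p0 * v(t - 1, n)
        for j in range(m):
            if n[j] > 0:
                u = profits[j] + v(t - 1, use(n, j))
            elif n[m] > 0:
                take = profits[j] + v(t - 1, use(n, m))
                u = take if j == 0 else max(take, v(t - 1, n))
            else:
                u = v(t - 1, n)
            total += probs[j] * u
        return total

    return v


def threshold(v, t, j, n_s, profits, max_versatile=50):
    """Smallest versatile count k >= 1 with Delta_{t-1}(n - e_{m+1}) <= pi_j."""
    for k in range(1, max_versatile + 1):
        delta = v(t - 1, tuple(n_s) + (k,)) - v(t - 1, tuple(n_s) + (k - 1,))
        if delta <= profits[j]:
            return k
    return None  # no threshold found within the search cap


profits, probs = (10.0, 1.0), (0.5, 0.4)
v = star_value_function(profits, probs)
# Job index 1 is the less profitable type; both specialized pools are empty.
print(threshold(v, 1, 1, (0, 0), profits), threshold(v, 2, 1, (0, 0), profits))  # 1 2
```

In the last period nothing is worth reserving, so one versatile unit suffices; with two periods to go, the policy in this example holds back the last versatile unit for a possible high-margin arrival.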

Since $F_t^j(n^s)$ is non-increasing in $n_k$, k ≠ j, for fixed t, and non-decreasing in t for fixed $n^s$, the exact policy relaxes the type-j job acceptance criterion when the number of specialized resources for other job types increases or as the end of the decision horizon approaches. Moreover, the exact policy accepts a type-j job more readily in states with a higher proportion of versatile resources, i.e., the threshold is lower in state $(n + e_{m+1})$ than in state $(n + e_k)$, for any specialized resource type k. When there are only two job types, $F_t^2(n_1, 0)$ specifies a switching curve, or a state-dependent threshold function, that is non-increasing in $n_1$ and non-decreasing in t. Results analogous to Proposition 3's threshold characterization of the exact policy also arise in other dynamic decision settings. For instance, Lautenbacher and Stidham (1999) prove the optimality of the threshold structure for acceptance decisions in the single-leg airline yield management problem. More generally, our threshold policy and its monotone structure are consistent with policies for other dynamic decision contexts that are characterized by state-dependent critical numbers or switching curves (e.g., Puterman 2005). However, no one has previously characterized exact policies for the flexible resource deployment problem with the Star resource structure, and previous results on threshold policies do not directly apply or extend to our problem.

Our results regarding the threshold structure of the exact policy also apply to the following generalization of the model. Suppose, for each job type j, the profit margin depends on the resource type (specialized or versatile) assigned to that job type. Let $\pi_{j,j}$ and $\pi_{j,m+1}$ denote, respectively, a type-j job's profit margin if we assign a specialized or versatile resource to this job. We index the job types so that $\pi_{1,m+1} \ge \pi_{2,m+1} \ge \ldots \ge \pi_{m,m+1}$. Then, Lemma 2 and Proposition 3 remain valid if $\pi_{j,j} \ge \pi_{j,m+1}$, for all j, i.e., if we prefer using a specialized resource, if available, rather than a versatile resource for each job type. The analysis of dynamic resource deployment with the Star resource structure provides some useful policy insights that we can use to address the general flexible resource deployment problem.


The time and state-dependent threshold value $F_t^j(n^s)$ represents the amount of versatile capacity that we wish to reserve for future arrivals of jobs that are more profitable than type-j jobs; we accept an arriving type-j job only if the number of available versatile resources meets or exceeds this threshold. We later extend this approach to problems with general flexibility structure by first determining how many resources to reserve for future jobs of each type, and then deciding which specific resource types to hold. The development of the exact policy for the Star resource structure has also highlighted the following characteristics of a “good” dynamic resource deployment policy that we later adapt. The policy: (i) uses less-flexible (e.g., specialized) resources first before using more-flexible resources; (ii) favors not using flexible resources for less profitable job types at the beginning of the time horizon, or when the specialized capacity for more profitable job types is low; and, (iii) tends to preserve flexible resources for future use when flexible resource capacity is tight.

4. Approximate Dynamic Resource Deployment Policies

For dynamic flexible resource deployment problems with general resource flexibility structure, since the associated stochastic dynamic program (2.1) and (2.2) is difficult to solve even for moderate-sized problems, we focus on developing good approximate policies. Problems with general flexibility configurations are significantly more difficult than those with special flexibility structures, such as Star or Downward flexibility, due to the lack of a clear hierarchy among resources in terms of their flexibility. For instance, with Downward flexibility, lower-indexed resource types (using the resource indexing scheme of Section 2.2) are more flexible since they can perform all the job types that any higher-indexed resource type can perform. Therefore, for each accepted job, it is optimal to assign the highest-indexed available resource type that can perform this job. With general flexibility structures, we do not have such a complete ordering of resource types. Moreover, the value of a resource depends on the tightness of capacity for the different job types that this resource can handle. For instance, if the anticipated demand for type-3 jobs is high (and there are no available specialized resources for this job type) whereas demand for type-2 jobs is low, then a flexible resource that can perform type-1 and type-3 jobs is more valuable than one that can perform type-1 and type-2 jobs even though type-2 jobs have higher unit profits than type-3 jobs. Further, due to the arbitrary overlaps in resource capabilities and resource “chaining” effects that we discuss later, using a particular resource for an accepted job affects the remaining capacities not only for the job types that this resource can perform but also for other job types.
That is, rather than a single job type having tight capacity or a single resource type being the bottleneck, a group or set of resource types can constitute the bottleneck for an associated set of job types; further, these sets can vary from period to period depending on the demand realizations. Alternative ways to account for these complex interactions lead to different approximation approaches.


We propose three approximate policies (Deterministic Capacity Allocation, Nested Capacity Reservation, and Bottleneck Capacity Reservation) that differ in the principles and methods they use to make job acceptance and resource assignment decisions after observing the demand in each period. Analogous to the threshold criterion of the exact policy for the Star flexibility structure with unit job arrivals, our approximate policies first determine the desired level of capacity to be reserved for future demand. Since we have a choice of alternative flexible resource types that can perform each job type, we must also develop a tentative plan for the future allocation of resources to job types. Our three approaches differ in the methods they use for these two decisions. The Deterministic Capacity Allocation policy accounts well for the general flexibility structure, but ignores randomness in future demand. The Nested Capacity Reservation policy, on the other hand, incorporates demand stochasticity to reserve capacities in nested fashion, but does not fully account for the arbitrary overlaps in resource capabilities. The Bottleneck Capacity Reservation policy takes into account both demand stochasticity and flexibility structure. The first two policies employ linear programs to assess the match between demand (current and future) and available capacity in order to guide job acceptance decisions, and to choose an appropriate resource to perform each job accepted in the current period. In contrast, the Bottleneck Capacity Reservation method simultaneously considers job acceptance and resource assignment decisions. We next motivate and develop the three policies; Section 5 presents computational results to assess the effectiveness of these policies.

4.1 Preliminaries and notation

To streamline the presentation of our dynamic resource assignment policies, we introduce some additional notation and terminology. For each job type j ∈ J, let $s_j$ be the index of the specialized resource type that can perform only this job type, and let $RF_j = R_j \setminus \{s_j\}$ be the index set of flexible resource types that can perform type-j jobs. $RF = R \setminus \{s_j : j \in J\}$ is the index set of all flexible resource types that can each perform more than one job type. As we noted in Section 3, whenever specialized type-$s_j$ resources are available, it is optimal to accept and assign arriving type-j jobs to these resources. Accordingly, after observing the demand in every period, each of our policies first assigns any available specialized resources to the incoming jobs. So, if $d_{jt}$ denotes the actual demand for type-j jobs and $n_{s_j t}$ is the number of available type-$s_j$ resources at the start of period t, then this initial “Assign and update specialized resources” step accepts and assigns $x^*_{j s_j t} = \min\{d_{jt}, n_{s_j t}\}$ units of the incoming type-j jobs to specialized type-$s_j$ resources, and updates $n_{s_j, t-1} \leftarrow n_{s_j t} - x^*_{j s_j t}$ for every job type j ∈ J. Let $d^+_{jt} = d_{jt} - x^*_{j s_j t}$ denote the remaining or residual demand for type-j jobs in period t. With $CD_{jt}$ denoting the (random) cumulative demand for type-j jobs from period t onwards, define $CD^+_{jt} = \max(0, CD_{jt} - n_{s_j t})$ as the residual cumulative demand from period t onwards after we use up all the available type-$s_j$ specialized resources.
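The “Assign and update specialized resources” step and the residual quantities defined above amount to a few elementwise operations. A minimal sketch follows, assuming demands and resource counts are held in dictionaries keyed by job type and that the cumulative demand distribution is supplied as a probability mass function; the function names and data layout are hypothetical.

```python
def assign_specialized(demand, specialized):
    """Step 0: serve each job type from its own specialized pool first.

    demand[j]      -- observed type-j demand d_jt in the current period
    specialized[j] -- available specialized resources n_{s_j t} for job type j
    Returns (accepted x*, residual demand d+, remaining specialized resources).
    """
    accepted, residual, remaining = {}, {}, {}
    for j, d in demand.items():
        available = specialized.get(j, 0)
        x = min(d, available)
        accepted[j] = x
        residual[j] = d - x            # d+_jt, left for flexible resources
        remaining[j] = available - x   # carried forward to the next period
    return accepted, residual, remaining


def expected_residual_cumulative(pmf, n_specialized):
    """E[max(0, CD - n)] for a cumulative demand pmf given as {value: probability}."""
    return sum(p * max(0, cd - n_specialized) for cd, p in pmf.items())


accepted, residual, remaining = assign_specialized({1: 3, 2: 1}, {1: 2, 2: 4})
print(accepted, residual, remaining)
print(expected_residual_cumulative({0: 0.5, 4: 0.5}, 3))  # 0.5
```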


4.2 Deterministic Capacity Allocation (DCA) policy

The Deterministic Capacity Allocation (DCA) policy addresses the challenge of determining which subset of resource types will jointly constrain future job acceptances by using the expected demand model of Section 2.2 to plan a tentative allocation of resource types to current and future demand. This linear program captures well the capacity interactions among resource types within a general resource flexibility structure, but does not consider demand stochasticity. However, from Proposition 1 we know that the optimal value of the expected demand model is an upper bound on the true value function for the stochastic problem. For other stochastic dynamic programming problems, researchers (e.g., Bertsekas and Tsitsiklis 1998, Bertsimas and Popescu 2003) have previously proposed and successfully applied a similar tactic of replacing the true value function with a deterministic approximation obtained by assuming fixed values for the stochastic variables.

For the dynamic flexible resource deployment problem, the DCA policy consists of two main steps in each period t. In the first step, we apply the deterministic optimization model, using the number of residual jobs on hand $d^+_{jt}$ plus the expected value of residual cumulative demand $E(CD^+_{j,t-1})$ from the next period (t – 1) onwards as type-j demand. The solution to this profit-maximizing problem provides a proposed mix of job types to be accepted using the available resources. Since this model combines current and future demands and does not distinguish between using alternative resource types, we employ a second optimization step to decide how many of the current jobs to accept and which resources to assign to these accepted jobs. This latter model uses shadow price information from the previous solution to identify and preserve “valuable” resources for future periods while accepting the same total number of jobs as the resource allocation plan from the first step. A formal description of the DCA policy follows.

Deterministic Capacity Allocation (DCA) policy:

Step 0: Assign and update specialized resources.

Step 1: Solve the following Demand Selection Problem [DSP], with decision variables $y_{jr}$ denoting the number of type-j jobs assigned to type-r flexible resources in current and future periods.

[DSP]

$$\max \sum_{j \in J} \sum_{r \in RF_j} \pi_j\, y_{jr} \tag{4.1}$$

subject to:

$$\sum_{r \in RF_j} y_{jr} \le d^+_{jt} + E(CD^+_{j,t-1}) \quad \text{for all } j \in J, \tag{4.2}$$

$$\sum_{j \in J_r} y_{jr} \le n_{rt} \quad \text{for all } r \in RF, \text{ and} \tag{4.3}$$

$$y_{jr} \ge 0 \quad \text{for all } j \in J \text{ and } r \in RF_j. \tag{4.4}$$

Let $\{y^*_{jr}\}$ be the optimal solution to this linear program, and define $\xi_j = \sum_{r \in RF_j} y^*_{jr}$ as the total number of type-j jobs that this solution accepts and assigns to flexible resources from period t onwards. Let $\sigma_{rt}$ be the optimal shadow price of resource constraint (4.3) for each resource type r ∈ RF.


Step 2: Solve the following Resource Assignment Problem [RAP], using decision variables $x_{jrt}$ and $w_{jr}$ for the number of type-r resources to assign to current and future type-j jobs, respectively, for all j ∈ J and r ∈ $RF_j$.

[RAP]

$$\max \sum_{j \in J} \sum_{r \in RF_j} (\pi_j - \sigma_{rt})\, x_{jrt} \tag{4.5}$$

subject to:

$$\sum_{r \in RF_j} x_{jrt} \le d^+_{jt} \quad \text{for all } j \in J, \tag{4.6}$$

$$\sum_{r \in RF_j} w_{jr} \le E(CD^+_{j,t-1}) \quad \text{for all } j \in J, \tag{4.7}$$

$$\sum_{j \in J_r} (x_{jrt} + w_{jr}) \le n_{rt} \quad \text{for all } r \in RF, \tag{4.8}$$

$$\sum_{r \in RF_j} (x_{jrt} + w_{jr}) = \xi_j \quad \text{for all } j \in J, \text{ and} \tag{4.9}$$

$$x_{jrt},\ w_{jr} \ge 0 \quad \text{for all } j \in J \text{ and } r \in RF_j. \tag{4.10}$$

Let $\{x^*_{jrt}\}$ denote the optimal x-values for this linear program. For each job type j ∈ J and resource type r ∈ $RF_j$, assign $x^*_{jrt}$ current type-j jobs to type-r resources, and update the number of available resources for the next period, i.e., set $n_{r,t-1} = n_{rt} - \sum_{j \in J_r} x^*_{jrt}$, for every flexible resource type r ∈ RF.

The resource assignment model's objective function (4.5) maximizes the net profit from jobs accepted this period, where we define the net profit of assigning a type-r resource to a type-j job as the profit margin $\pi_j$ of the job minus the shadow price $\sigma_{rt}$ of the resource. Constraints (4.6) and (4.7) specify that we cannot assign more than $d^+_{jt}$ and $E(CD^+_{j,t-1})$ resources, respectively, to current and future jobs of each type. Constraints (4.8) are resource capacity constraints, while constraints (4.9) ensure that the total number of accepted jobs (for current and future demands) for each job type j equals the value $\xi_j$ chosen by the optimal solution to model [DSP]. The model judiciously assigns resources to current jobs, preserving resource types with high value (shadow price) for future use.

For a problem with m job types and l resource types, we can represent both problem [DSP] and problem [RAP] as minimum cost network flow problems defined over appropriate networks with O(m+l) nodes and O(ml) arcs. The assignment of resources to current jobs in Steps 0 and 2 requires O(ml) effort. The overall complexity of the DCA method is governed by the time to solve the network flow problems, which is $O((ml)^2 (m+l) \log(m+l))$ per period (Orlin 1997). For problems in which job profitability varies with the resource assigned to the job, we can readily adapt the DCA method by replacing the parameters $\pi_j$ in the objective functions (4.1) and (4.5) of the optimization models [DSP] and [RAP] with resource-dependent profit margins.
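Since [DSP] can be cast as a minimum cost network flow problem, as noted above, a small instance can be solved without a linear programming library. The sketch below is one illustrative way to do so and is not the method of Orlin (1997): it builds a network with a source, one node per job type, one node per flexible resource type, and a sink, and then applies a plain successive-shortest-path scheme with Bellman-Ford. It returns only the optimal profit and the flows $y_{jr}$, not the shadow prices $\sigma_{rt}$. The instance data and all names are hypothetical; `demand[j]` stands for the right-hand side of (4.2).

```python
def solve_dsp(profit, demand, capacity, can_do):
    """Max-profit flow for a [DSP]-style instance via successive shortest paths."""
    jobs, resources = list(profit), list(capacity)
    source, sink = 0, 1
    job_node = {j: 2 + i for i, j in enumerate(jobs)}
    res_node = {r: 2 + len(jobs) + i for i, r in enumerate(resources)}
    n_nodes = 2 + len(jobs) + len(resources)
    graph = [[] for _ in range(n_nodes)]  # edge: [to, capacity, cost, reverse index]

    def add_edge(u, v, cap, cost):
        graph[u].append([v, cap, cost, len(graph[v])])
        graph[v].append([u, 0, -cost, len(graph[u]) - 1])
        return graph[u][-1]  # the forward edge

    arcs = {}
    for j in jobs:
        add_edge(source, job_node[j], demand[j], -profit[j])  # negative cost = profit
    for r in resources:
        add_edge(res_node[r], sink, capacity[r], 0)
        for j in can_do[r]:
            arcs[(j, r)] = add_edge(job_node[j], res_node[r], float("inf"), 0)

    total = 0
    while True:
        # Bellman-Ford shortest path on the residual network.
        dist = [float("inf")] * n_nodes
        prev = [None] * n_nodes
        dist[source] = 0
        for _ in range(n_nodes - 1):
            changed = False
            for u in range(n_nodes):
                if dist[u] == float("inf"):
                    continue
                for idx, (v, cap, cost, _rev) in enumerate(graph[u]):
                    if cap > 0 and dist[u] + cost < dist[v]:
                        dist[v], prev[v], changed = dist[u] + cost, (u, idx), True
            if not changed:
                break
        if prev[sink] is None or dist[sink] >= 0:
            break  # no remaining path that adds profit
        # Find the bottleneck along the path, then push that much flow.
        flow, v = float("inf"), sink
        while v != source:
            u, idx = prev[v]
            flow = min(flow, graph[u][idx][1])
            v = u
        v = sink
        while v != source:
            u, idx = prev[v]
            edge = graph[u][idx]
            edge[1] -= flow
            graph[v][edge[3]][1] += flow
            v = u
        total += -dist[sink] * flow

    # Flow on a job-to-resource arc equals the capacity of its reverse edge.
    y = {key: graph[e[0]][e[3]][1] for key, e in arcs.items()}
    return total, y


profit = {1: 10, 2: 6, 3: 4}
demand = {1: 3, 2: 4, 3: 5}
capacity = {"A": 4, "B": 3}
can_do = {"A": [1, 2], "B": [2, 3]}
print(solve_dsp(profit, demand, capacity, can_do))
```

In this instance the seven available resource units go to the three type-1 jobs and four type-2 jobs, and no capacity is planned for type-3 jobs.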

4.3 Nested Capacity Reservation (NCR) policy

For the Star flexibility structure that we analyzed in Section 3, the exact policy accepts an incoming type-j job only if the number of available flexible resources at any period t exceeds a state-dependent threshold value. This value represents the amount of versatile capacity we wish to reserve to meet the future demand for all the job types that are more profitable than type-j jobs. The Nested Capacity Reservation (NCR) policy extends this approach to general resource structures. Specifically, for each job type j, we first determine the desired total capacity (of all capable resource types) to be reserved for future demand (after period t) of all job types that are more profitable than type-j jobs, and then determine which specific resource types to reserve. We refer to this capacity reservation scheme as a nested policy since, instead of reserving capacities separately for each job type j, it pools the capacity needed for the more-profitable job types 1, 2, …, (j – 1). Determining the exact threshold function, i.e., optimal capacity reservations, for this nested scheme is an intractable problem. We, therefore, use an approximate newsvendor-like method to determine the capacity reservations, and then solve an optimization problem to decide which specific resource types to reserve for future demand.

The capacity reservation choice depends on the tradeoff between accepting a current job to capture its profit versus preserving the resource for possible future use to perform a more profitable job. Since this tradeoff is analogous to the balance between overage and underage costs in the traditional newsvendor problem, we use a critical fractile approach to determine the desired capacity reservations. Researchers have previously used the newsvendor principle to develop approximate policies for other revenue management problems. For instance, to allocate a common pool of seats (single resource type) among multiple fare classes (job types), Belobaba (1989) proposed setting the booking limit for each fare class using a newsvendor model. To motivate our capacity reservation method, we first explore the capacity reservation decision for a two-period problem with two job types, and three corresponding resource types: specialized type-1 and type-2 resources for job types 1 and 2, respectively, and versatile type-3 resources.
In the first period (t = 2), after observing the demands for type-1 and type-2 jobs, the optimal policy accepts as many type-1 jobs as possible using available type-1 and type-3 resources, and assigns any available specialized type-2 resources to the incoming type-2 jobs. Let $n_3 > 0$ be the number of remaining versatile type-3 resources available after these initial assignments. (To simplify the notation, we omit the time index for the state variables.) We wish to determine how many of these $n_3$ resources to reserve for future type-1 jobs; this decision will determine how many of the current residual type-2 jobs to accept and assign to versatile resources (in the first period). Let $d_2^+ > 0$ denote the residual type-2 jobs after assigning specialized type-2 resources. (If $d_2^+ = 0$, then all $n_3$ versatile resources are carried forward to the next period.) To focus on the tradeoff between using available versatile resources to process current type-2 jobs versus future type-1 jobs, we consider the most adverse situation in which only type-1 jobs arrive in the second period (t = 1). Let $D_1$ denote the unknown demand for type-1 jobs in the second period, and define $D_1^+ = (D_1 - n_1)^+$ as the residual type-1 demand after assigning the $n_1$ specialized type-1 resources carried forward from the first period. For our analysis, we assume that $D_1^+$ is a continuous random variable, and let $G_1^+(\cdot)$ be its cumulative distribution function. If we decide to reserve Q versatile resources for use in the second period, with $n_3 - d_2^+ \le Q \le n_3$, then we can accept $(n_3 - Q)$ of the residual type-2 jobs in the first period and assign versatile resources to these accepted jobs. Thus, the maximum expected profit-to-go after observing demand in the first period but before accepting any residual type-2 jobs is:

$$u_2(n_1, 0, n_3; 0, d_2^+) = \max_{n_3 - d_2^+ \le Q \le n_3} \left\{ \pi_2 (n_3 - Q) + \pi_1 E[\min(n_1, D_1)] + \pi_1 E[\min(Q, (D_1 - n_1)^+)] \right\} \tag{4.11}$$

The unconstrained (without resource limits) optimal quantity $Q_1$ to maximize the expected profit satisfies the classical newsvendor-type condition $G_1^+(Q_1) = (\pi_1 - \pi_2)/\pi_1$, or

$$Q_1 = \operatorname*{Argmin}_{Q} \left\{ P\big((D_1 - n_1)^+ \ge Q\big) \le \pi_2/\pi_1 \right\}. \tag{4.12}$$
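Expression (4.12) can be evaluated numerically once a demand distribution is specified. The sketch below makes two assumptions that are not in the text: demand is discrete and given as a probability mass function, and (4.12) is read as the smallest integer Q whose tail probability does not exceed $\pi_2/\pi_1$ (with discrete demand this reading can differ slightly from the continuous-demand optimum). The final function restricts the target to the feasible interval $n_3 - d_2^+ \le Q \le n_3$ from (4.11). All names are hypothetical.

```python
def residual_pmf(pmf, n_specialized):
    """Distribution of (D - n)+ given the pmf of D as {value: probability}."""
    out = {}
    for d, p in pmf.items():
        r = max(0, d - n_specialized)
        out[r] = out.get(r, 0.0) + p
    return out


def tail(pmf, q):
    """P(X >= q) for a pmf given as {value: probability}."""
    return sum(p for x, p in pmf.items() if x >= q)


def reservation_target(pmf, ratio):
    """Smallest integer Q with P(X >= Q) <= ratio (discrete reading of (4.12))."""
    q = 0
    while tail(pmf, q) > ratio:
        q += 1
    return q


def reserve(q_target, n3, d2_residual):
    """Restrict the target to the feasible range n3 - d2+ <= Q <= n3."""
    lower = n3 - d2_residual
    return lower if q_target < lower else min(q_target, n3)


d1_pmf = {0: 0.2, 1: 0.3, 2: 0.3, 3: 0.2}   # hypothetical type-1 demand
q1 = reservation_target(residual_pmf(d1_pmf, n_specialized=1), ratio=4 / 10)
print(q1, reserve(q1, n3=3, d2_residual=2), reserve(q1, n3=5, d2_residual=1))  # 2 2 4
```

In the last call only one residual type-2 job is on hand, so at least four of the five versatile units are carried forward regardless of the target.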

Accounting for the resource constraints $n_3 - d_2^+ \le Q \le n_3$, the optimal number of versatile resources to reserve for type-1 jobs in the second period is $(n_3 - d_2^+)$ if $Q_1 < n_3 - d_2^+$, and is $\min\{Q_1, n_3\}$ otherwise.

Since extending this characterization of the exact reservation policy to a system with more than two job types and with general resource flexibility structure is analytically intractable, we adapt the approach using some approximations. First, in each period t, we treat the demand during the remaining (t – 1) periods as an aggregate demand that occurs in a single future period. Second, analogous to the previous assumption that only type-1 jobs arrive in the second period, we only consider the future demand for type-j and more profitable jobs when deciding the capacity reservation for these jobs. Finally, for deciding whether to accept a current type-(j+1) job, instead of making a separate capacity reservation for each job type that is more profitable than job type (j+1), we determine a combined reservation for all the more-profitable job types k = 1, 2, …, j, effectively treating these job types together as a composite job type with an appropriate weighted average profit parameter, as discussed below. (For airline seat reservations, researchers have distinguished between analogous partitioned and nested booking limits, except that these limits refer to allocation of versatile or downward flexible resources whereas we are concerned with determining the desired total capacity, over multiple resource types with arbitrary flexibility structure, to allocate for various job types.) For a multi-class newsvendor problem, Sen and Zhang (1999) showed numerically that reserving capacity separately for each demand class is inferior to using a combined reservation for all more-profitable demand classes.
Our preliminary computational experiments (not reported here) also confirmed that, for the flexible resource deployment problem, our combined reservation approach yields higher profits than a separate reservation for each job type. For j = 1, 2, …, m – 1, let Qj be the desired number of resources that we wish to reserve for future arrivals of jobs of type-1 through type-j. We now discuss how to adapt expression (4.12) (for Q1 in


the two job-type problem) to compute $Q_j$ in the m job-type problem. First, since $Q_j$ will be used to decide whether to accept a current type-(j+1) job, the “overage” cost or opportunity cost of over-reserving capacity for future jobs is the profit margin $\pi_{j+1}$ of the current job. So, we replace $\pi_2$ in (4.12) with $\pi_{j+1}$. Next, for the two job-type problem, the value $\pi_1$ in expression (4.12) represents the unit profit margin of the more-profitable job for which we are reserving capacity. In the context of our approximation, we replace this value with the following demand-weighted average profit for all the job types that are more profitable than job type (j+1):

$$\bar{\pi}_j = \frac{\sum_{k=1}^{j} \pi_k\, E(CD^+_{k,t-1})}{\sum_{k=1}^{j} E(CD^+_{k,t-1})}. \tag{4.13}$$

For k = 1, 2, …, j, expression (4.13) uses the expected future residual demand of type-k jobs from period (t – 1) onwards as the weight for the type-k job's profit margin $\pi_k$. Finally, instead of the uncertain future residual demand $(D_1 - n_1)^+$ of the single more-profitable job type (i.e., type 1) in expression (4.12), we aggregate the future residual demands for all the job types k = 1, 2, …, j that are more profitable than job type (j+1). Using these principles, we compute $Q_j$ in period t as:

$$Q_j = \operatorname*{Argmin}_{Q} \left\{ P\Big(\sum_{k=1}^{j} CD^+_{k,t-1} \ge Q\Big) \le \pi_{j+1}/\bar{\pi}_j \right\} \quad \text{for all } j = 1, 2, \ldots, m-1. \tag{4.14}$$
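The targets in (4.14) can be computed by combining the demand-weighted average profit (4.13) with the tail of the aggregated residual demand. The sketch below assumes discrete residual cumulative demands that are independent across job types, so that the aggregate distribution is a convolution; this independence is an assumption made here only to keep the example short, and the text does not state it. As before, (4.14) is read as the smallest integer Q satisfying the inequality, and all names are hypothetical.

```python
def convolve(pmf_a, pmf_b):
    """Distribution of the sum of two independent discrete random variables."""
    out = {}
    for a, pa in pmf_a.items():
        for b, pb in pmf_b.items():
            out[a + b] = out.get(a + b, 0.0) + pa * pb
    return out


def mean(pmf):
    return sum(x * p for x, p in pmf.items())


def tail(pmf, q):
    return sum(p for x, p in pmf.items() if x >= q)


def ncr_targets(profits, residual_pmfs):
    """Aggregate reservation targets Q_1, ..., Q_{m-1}.

    profits       -- pi_1 >= ... >= pi_m
    residual_pmfs -- pmfs of the residual cumulative demands for job types 1..m-1
    """
    targets = []
    aggregate = {0: 1.0}
    weighted, total_mean = 0.0, 0.0
    for j in range(len(profits) - 1):
        aggregate = convolve(aggregate, residual_pmfs[j])
        mu = mean(residual_pmfs[j])
        weighted += profits[j] * mu
        total_mean += mu
        if total_mean == 0:
            targets.append(0)           # no expected future demand to protect
            continue
        avg_profit = weighted / total_mean      # expression (4.13)
        ratio = profits[j + 1] / avg_profit     # right-hand side of (4.14)
        q = 0
        while tail(aggregate, q) > ratio:
            q += 1
        targets.append(q)
    return targets


profits = [10.0, 6.0, 3.0]
pmfs = [{0: 0.5, 1: 0.5}, {0: 0.5, 2: 0.5}]   # hypothetical residual demands
print(ncr_targets(profits, pmfs))  # [1, 3]
```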

In the context of airline reservations, the expected marginal seat revenue (EMSR-b) heuristic (Belobaba 1989) uses the same approach to determine protection levels for controlling the usage of a single resource type. Since expression (4.14) does not consider the currently available resource capacities, we may not have as many resources as we would like to reserve. We therefore refer to $Q_j$ as the aggregate capacity reservation target for type-j and more profitable jobs. For the two job-type problem, since we have only a single flexible resource type (i.e., type 3), with $n_3$ available units, we could easily determine the actual reservation as either $n_3 - d_2^+$ or $\min\{Q_1, n_3\}$. With general flexibility structure, we have many different resource types that can perform some or all of the job types k = 1, 2, …, j. So, we face the additional challenge of deciding how many of each specific resource type to reserve for future jobs in order to meet the aggregate capacity reservation target $Q_j$. To account for the actual available capacities, we employ a capacity reservation linear program [CRP] that simultaneously considers the aggregate capacity targets for all j, and attempts to achieve reservations as close as possible to (but not exceeding) the target values, giving preference to reservations for more profitable job types. The NCR policy first solves this linear program to determine the number of jobs of each type to accept in the current period, and later (in the second step) selects the specific resources to assign to the current accepted jobs based upon the optimal resource shadow prices for [CRP]. A formal statement of the procedure follows:

Nested Capacity Reservation (NCR) policy:

Step 0: Assign and update specialized resources.

Step 1: Determine the target capacity reservation $Q_j$ using expression (4.14). Solve the following profit-maximizing linear program [CRP], with decision variables $z_{jr}$ and $w_{jr}$ representing, respectively, the number of type-r resources to assign to current and future type-j jobs, for all j ∈ J and r ∈ $RF_j$.

[CRP]

$$\max \sum_{j \in J} \sum_{r \in RF_j} \pi_j\, z_{jr} + \sum_{j \in J} \sum_{r \in RF_j} \pi_j\, w_{jr} \tag{4.15}$$

subject to

$$\sum_{k=1}^{j} \sum_{r \in RF_k} w_{kr} \le Q_j \quad \text{for all } j \in J \setminus \{m\}, \tag{4.16}$$

$$\sum_{r \in RF_j} z_{jr} \le d^+_{jt} \quad \text{for all } j \in J, \tag{4.17}$$

$$\sum_{j \in J_r} (z_{jr} + w_{jr}) \le n_{rt} \quad \text{for all } r \in RF, \text{ and} \tag{4.18}$$

$$z_{jr},\ w_{jr} \ge 0 \quad \text{for all } j \in J \text{ and } r \in RF_j. \tag{4.19}$$

Let $\{z^*_{jr}, w^*_{jr}\}_{j \in J}$ be the optimal solution to this linear program.

Step 2: Set $u_{jt} = \min\{d^+_{jt}, \sum_{r \in RF_j} (z^*_{jr} + w^*_{jr})\}$ as the number of type-j jobs to accept in this period. For every flexible resource type r ∈ RF, let $\sigma_{rt}$ denote the shadow price of the corresponding resource constraint (4.18). For j = 1, 2, …, m, assign resources r ∈ $RF_j$ in increasing order of resource values $\sigma_{rt}$ to a total of $u_{jt}$ current jobs, updating the number of remaining (unused) resources after each assignment.

Unlike the DCA algorithm, which uses the expected future residual demand of each job type j to guide planned resource allocations for future demand, model [CRP] reserves resources based on the target aggregate capacity reservation values $Q_j$. The $w_{jr}$ values represent the planned (tentative) allocation of type-r resources for future type-j jobs. Since the objective function includes the profits from these allocations, the optimal solution will favor reserving capacity (as much as possible, using the currently available mix of resources) for more profitable job types, subject to the nested reservation limits $Q_j$. Model [CRP] can have alternate optimal solutions that differ in their allocation of capacity between current and future type-j jobs, for each j. So, in Step 2, we first set the number of current type-j jobs to be accepted, $u_{jt}$, equal to the smaller of the current residual demand or the total number of resources assigned by the [CRP] solution to type-j jobs, and then assign less valuable resources, based on resource shadow prices, to these accepted jobs. As in the DCA method, we can represent the optimization model [CRP] as an equivalent minimum cost network flow problem on a network with O(m+l) nodes and O(ml) arcs, which can be solved in $O((ml)^2 (m+l) \log(m+l))$ time per period. Again, the assignment of resources to current jobs in Steps 0 and 2 requires O(ml) effort in total, and so the NCR method has the same computational complexity as the DCA method.

4.4 Bottleneck Capacity Reservation (BCR) policy

Unlike the DCA policy, the NCR policy considers demand stochasticity when determining the capacity reservation targets Qj to decide whether to accept current jobs. However, by using nested

targets, the approach ignores the actual flexibility structure, effectively assuming that a unit of capacity that is not assigned to a current type-j job can be used to process any of the more-profitable job types k = 1, 2, …, or j–1. But, depending upon the resource structure, the acceptance decision for the type-j job may have no impact on the available capacity for some of these more-profitable job types, in which case we should not consider the future demand for such jobs when making acceptance decisions for type-j jobs. Ideally, when considering the use of a type-r resource for a current type-j job, we must examine whether this assignment can create or exacerbate a future capacity “bottleneck” for any subset of more-profitable job types. With general flexibility structures, each job type can be performed by multiple alternative resource types that have arbitrary overlapping capabilities. Hence, bottlenecks are not associated with individual job or resource types, but are rather defined by groups or subsets of job types and the corresponding resource types that can perform any job type in the subset. As we explain below, our Bottleneck Capacity Reservation (BCR) policy makes simultaneous job acceptance and assignment decisions based on capacity reservation targets for such bottlenecks. The BCR policy considers job types j in order of decreasing profitability and resource types r ∈

RFj in order of increasing value (to be defined later). To decide whether to assign a type-r resource to a current type-j job, the method first identifies subsets S of more-profitable job types for which capacity can become a future bottleneck due to this r-to-j assignment. Such subsets include not only the more-profitable job types that the type-r resource can perform, but also other job types that are indirectly affected by the r-to-j assignment due to job-resource chaining effects induced by resource flexibility. (Jordan and Graves 1995 introduced chaining in the context of flexible manufacturing; Graves and Tomlin 2003, Hopp et al. 2004, and others have since extended and applied this concept to various contexts.) We defer discussion on how to identify such subsets in order to first define what we mean by a bottleneck.

For any subset of job types S ⊆ J, let $RF_S = \cup_{k \in S} RF_k$ denote the subset of flexible resource types that can perform one or more job types in S. Following our approach for the NCR policy, for any subset S, with πk > πj for all k ∈ S, we define a capacity reservation value QS,j as the number of resources that we would like to reserve for future demand of job types in S instead of assigning them to current type-j jobs. We say that the job subset S is a bottleneck relative to job type j if the total number of currently available resources to process jobs in S is less than or equal to QS,j, i.e., if $\sum_{r \in RF_S} n_{rt} \le Q_{S,j}$. To compute QS,j, we apply the previous critical fractile approach by first defining the following weighted profit parameter for job types in S, with the expected future residual demand for each job type in S as the weight for that job type’s profit margin:

$\pi_S = \sum_{k \in S} \pi_k E(CD^+_{k,t-1}) \big/ \sum_{k \in S} E(CD^+_{k,t-1})$. (4.20)

Then, as in expression (4.14), we define:

$Q_{S,j} = \operatorname{Argmin}_Q \{ P(\sum_{k \in S} CD^+_{k,t-1} \ge Q) \le \pi_j / \pi_S \}$. (4.21)
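The reservation target in (4.20) and (4.21) can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the aggregate future residual demand of the job types in S is treated as Poisson with mean equal to the sum of the individual expected demands (the paper does not prescribe this distribution for CD+), and the names `poisson_tail` and `reservation_target` are hypothetical.

```python
import math

def poisson_tail(mean, q):
    """Return P(D >= q) for D ~ Poisson(mean) and integer q."""
    if q <= 0:
        return 1.0
    term = math.exp(-mean)      # P(D = 0)
    cdf = 0.0
    for i in range(q):          # accumulate P(D <= q - 1)
        cdf += term
        term *= mean / (i + 1)
    return max(0.0, 1.0 - cdf)

def reservation_target(profits, means, subset, pi_j):
    """Smallest Q with P(demand of subset >= Q) <= pi_j / pi_S.

    profits[k]: profit margin of job type k.
    means[k]:   expected future residual demand of job type k
                (assumed Poisson here, which is an illustrative choice).
    """
    total_mean = sum(means[k] for k in subset)
    if total_mean == 0:
        return 0
    # Demand-weighted profit of the subset, as in (4.20).
    pi_s = sum(profits[k] * means[k] for k in subset) / total_mean
    ratio = pi_j / pi_s
    q = 0
    while poisson_tail(total_mean, q) > ratio:
        q += 1
    return q

if __name__ == "__main__":
    profits = {1: 4.0, 2: 2.0}
    means = {1: 3.0, 2: 1.0}
    print(reservation_target(profits, means, [1, 2], pi_j=1.0))
```

Because the tail probability decreases in Q, a simple upward search suffices; a less profitable current job type (smaller πj) yields a larger reservation target.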

To assess the possible impact of assigning a type-r resource to a current type-j job on the available capacity for more profitable job types in the future, we should consider all job subsets S satisfying two conditions: S contains only job types k < j that are more profitable than job type j, and r ∈ RFS. We refer to such subsets as -subsets. If any such subset S is a bottleneck, i.e., if $\sum_{r \in RF_S} n_{rt} \le Q_{S,j}$, then we should not assign a type-r resource to the type-j job. If none of the -subsets are bottlenecks, then resource type r is a candidate for assignment to a current type-j job; in this case, we refer to resource type r as a non-bottleneck resource.

Rather than enumerating every possible subset S of job types 1, 2, …, j–1, and then checking if r ∈ RFS, we reduce the enumeration effort using the following approach. Consider a flexibility graph GF(j) containing one node for every currently available flexible resource type r ∈ RF with nrt > 0, and one node for every job type k < j. The graph contains edge (r', k) if k ∈ Jr', i.e., if type-r' resources can perform type-k jobs. We say that a type-k job is reachable from a type-r resource if this flexibility graph contains at least one path from the resource node r to the job node k. Observe that, even if the type-r resource cannot directly perform a type-k job, having one extra unit of a type-r resource permits accepting an extra unit of any type-k job that is reachable from r through a process of successive resource substitution for the intermediate job types in the path connecting r to k in GF(j). With this construction, each -subset S is a subset of the reachable jobs from the resource node r.

Next, we discuss a resource valuation metric that the BCR policy uses to select from among alternative resource types that are available for assignment to a current type-j job. Define

$\theta_j = \min\{E(CD^+_{j,t-1}) + d^+_{jt}, \sum_{r \in RF_j} n_{rt}\} \big/ \sum_{r \in RF_j} n_{rt}$ as an indicator for the tightness of capacity for type-j jobs. Observe that 0 ≤ θj ≤ 1 (if at least one currently available resource can perform job type j), with higher values of θj indicating tighter capacity for type-j jobs. We now define the value of a type-r resource as:

$\tau_r = \sum_{j \in J_r} \pi_j \theta_j \left( \prod_{j' \in J_r,\, j' < j} (1 - \theta_{j'}) \right)$ for all r ∈ RF. (4.22)
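As a small worked illustration of the tightness indicator θj and the resource value (4.22), the following Python sketch evaluates both for a toy system. The function names and the dictionary-based data layout are assumptions made for this example; job types are indexed in decreasing order of profit, as in the text.

```python
def tightness(j, exp_future, current, n, can_do):
    """theta_j: demand for type j (capped at capacity) over capacity for type j.

    exp_future[j]: expected future residual demand of job type j.
    current[j]:    current residual demand of job type j.
    n[r]:          currently available units of resource type r.
    can_do[r]:     set of job types that resource type r can perform.
    """
    capacity = sum(units for r, units in n.items() if j in can_do[r])
    if capacity == 0:
        raise ValueError("theta_j is undefined without available capacity")
    return min(exp_future[j] + current[j], capacity) / capacity

def resource_value(r, profits, theta, can_do):
    """tau_r from (4.22): sum over job types j in J_r of
    pi_j * theta_j * prod over j' in J_r with j' < j of (1 - theta_j')."""
    value = 0.0
    not_used_earlier = 1.0          # running product of (1 - theta_j')
    for j in sorted(can_do[r]):     # lower index means more profitable
        value += profits[j] * theta[j] * not_used_earlier
        not_used_earlier *= 1.0 - theta[j]
    return value

if __name__ == "__main__":
    can_do = {"A": {1, 2}, "B": {2}}
    n = {"A": 4, "B": 4}
    profits = {1: 2.0, 2: 1.0}
    exp_future = {1: 1.0, 2: 2.0}
    current = {1: 1, 2: 2}
    theta = {j: tightness(j, exp_future, current, n, can_do) for j in profits}
    for r in can_do:
        print(r, resource_value(r, profits, theta, can_do))
```

In this toy case the more flexible resource type A (which performs both job types) receives a higher value than the type B resource, which performs only job type 2.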

Roughly, interpreting θj as the likelihood that a type-j job will be assigned a type-r flexible resource, the resource value τr represents the expected profit generated by this resource. This metric accounts for both the profit margins of the job types that a type-r resource can perform and the capacity tightness for each of these job types. Moreover, it assigns higher value to resources that are more flexible: for any pair of resource types r and r' with Jr' ⊂ Jr, the definition (4.22) of relative value ensures that τr > τr', as desired.

Returning to the overall BCR policy, at each period t, after assigning specialized resources, the method considers job types j = 1, 2, …, m in decreasing profit sequence. For each j, the method first attempts to assign resource types r ∈ RFj for which the type-j job is most profitable. Let $RP_j = \{r \in RF_j : j = \min_{j' \in J_r} j'\}$ denote these preferred resource types for type-j jobs. If RPj contains more than one resource type, the method assigns resources in increasing order of the resource value τr. When all resources in RPj are exhausted, the method identifies all non-bottleneck resource types Mj ⊆ RFj \ RPj that can process job type j. The BCR policy considers these resource types r in decreasing order of slack capacity (i.e., smallest excess capacity over all subsets S of reachable jobs that are more profitable than type j), and assigns slack resources until they are exhausted or all type-j jobs have been accepted. A formal description of the method follows:

Bottleneck Capacity Reservation (BCR) policy:

Step 0: Assign and update specialized resources. Compute the value τr of each resource type r ∈ RF.

Step 1: For j = 1, 2, …, m,
• For each preferred resource type r ∈ RPj, in increasing order of value τr: accept $x^*_{jrt} = \min\{d^+_{jt}, n_{rt}\}$ jobs and assign type-r resources to these jobs; update $n_{rt} \leftarrow n_{rt} - x^*_{jrt}$ and $d^+_{jt} \leftarrow d^+_{jt} - x^*_{jrt}$.

Step 2: For j = 1, 2, …, m, set Mj = ∅. Mj represents the set of non-bottleneck resources for job type j.

Step 2.1: For every resource type r ∈ RFj \ RPj with nrt > 0,
• Construct the flexibility graph GF(j), and determine the job types j' < j that are reachable from resource type r. Let Ljr denote the set of these job types.
• For every job subset S ⊆ Ljr, compute QS,j using expression (4.21), and let $\varepsilon_S = (\sum_{r' \in RF_S} n_{r't} - Q_{S,j})^+$ denote the excess capacity for subset S.
Let $b_{jr} = \min_{S \subseteq L_{jr}} \{\varepsilon_S\}$ be the slack for resource r. If bjr > 0, update $M_j \leftarrow M_j \cup \{r\}$.

Step 2.2: If Mj is empty, go to the next job type j in Step 2. Otherwise, let $r' = \operatorname{Argmax}_{r \in M_j} b_{jr}$.

Step 2.3: Assign $x^*_{jr't} = \min\{e_{jt}, b_{jr'}\}$ units of type-r' resources to type-j jobs. Update the remaining current demand ($d^+_{jt} \leftarrow d^+_{jt} - x^*_{jr't}$) and resource levels ($n_{r't} \leftarrow n_{r't} - x^*_{jr't}$). If $d^+_{jt} = 0$, go to the next job type in Step 2. Otherwise, update the list of non-bottleneck resources ($M_j \leftarrow M_j \setminus \{r'\}$), and return to Step 2.2. If j = m, set $n_{r,t-1} = n_{rt}$ for all r ∈ RF.

By considering all possible subsets of more-profitable (and reachable) job types at each iteration j of Step 2, the BCR policy generalizes the NCR policy (which considers only one “nested” subset consisting of all job types that are more profitable than type j). In the worst case, Step 2 of the BCR algorithm must enumerate all reachable job subsets for every pair of job and resource types, and hence the complexity of the algorithm is $O(2^m m l)$ per period. Although the BCR policy must assess the capacity-demand match for many more job subsets than the NCR policy, in our computational tests the BCR policy required only a modest amount of additional computational time compared to the NCR and DCA policies. Note that the bottleneck set can change dynamically depending on the demand realizations and acceptance/assignment decisions in each period. The BCR policy accounts for this “shifting” bottleneck phenomenon by recomputing the bottleneck set in every period.

We conclude this discussion of approximate policies by noting that, although we have presented each policy in the discrete-time setting, these methods also apply when firms need to make job acceptance and assignment decisions as soon as each individual job arrives. Moreover, as we illustrate in Section 6, we can also adapt the methods to situations where acceptance decisions are dynamic but resource assignments can be deferred until the end of the decision horizon. Next, we describe our computational experiments and results to evaluate the performance of the three policies.
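The slack computation in Step 2.1 of the BCR policy can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation (which was written in C): the reservation targets QS,j are supplied through a hypothetical callable `q_target(S, j)`, only non-empty subsets are enumerated, and returning infinity when no more-profitable job type is reachable is an assumption of this sketch.

```python
from collections import deque
from itertools import combinations

def reachable_jobs(r, j, n, can_do):
    """Job types k < j reachable from resource r in the flexibility graph G_F(j).

    n[q]: available units of resource type q (only types with n[q] > 0 are nodes).
    can_do[q]: set of job types that resource type q can perform.
    """
    available = [q for q in can_do if n.get(q, 0) > 0]
    seen_resources, seen_jobs = {r}, set()
    queue = deque([r])
    while queue:
        res = queue.popleft()
        for k in can_do[res]:
            if k < j and k not in seen_jobs:
                seen_jobs.add(k)
                # Move on to every available resource that can also do job k.
                for other in available:
                    if k in can_do[other] and other not in seen_resources:
                        seen_resources.add(other)
                        queue.append(other)
    return seen_jobs

def slack(r, j, n, can_do, q_target):
    """b_jr: smallest excess capacity over non-empty subsets S of reachable jobs."""
    jobs = sorted(reachable_jobs(r, j, n, can_do))
    best = float("inf")  # assumption: no reachable job means no reservation limit
    for size in range(1, len(jobs) + 1):
        for subset in combinations(jobs, size):
            # Capacity of RF_S: all available resources that can do a job in S.
            capacity = sum(
                n[q] for q in can_do
                if n.get(q, 0) > 0 and any(k in can_do[q] for k in subset)
            )
            excess = max(0, capacity - q_target(subset, j))
            best = min(best, excess)
    return best

if __name__ == "__main__":
    can_do = {"A": {1, 3}, "B": {1, 2}, "C": {2, 3}}
    n = {"A": 2, "B": 3, "C": 1}
    print(slack("A", 3, n, can_do, lambda S, j: 2 * len(S)))
```

In the toy system, resource type A cannot perform job type 2 directly, yet job type 2 is reachable from A through resource type B, so the subset {2} and the subset {1, 2} both enter the slack calculation.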

5. Computational Results

To assess the relative effectiveness of the DCA, NCR, and BCR policies, we implemented and applied them to a wide variety of test problems. We measure the effectiveness of each method in terms of the profit that it generates for each problem instance relative to either the profit for the exact policy (if solving the associated dynamic program (2.1) and (2.2) is not very computationally intensive) or the expected perfect-information value (which is an upper bound on the exact profits) for the same problem instance. Since problem difficulty and policy performance depend on various key problem characteristics such as the resource flexibility structure and relative tightness of capacity, we implemented a random problem generator that permits us to systematically vary these characteristics, and designed a comprehensive set of computational experiments to address questions such as the following:

♦ How effective are the approximate policies, i.e., do they generate near-optimal job acceptance and resource assignment decisions?
♦ Is policy performance robust to variations in problem characteristics?
♦ Does one of the three policies consistently outperform the others?
♦ Can we develop insights on drivers of problem difficulty and decision effectiveness?

Next, we outline our problem generation approach and experimental design. Sections 5.2 and 5.3 discuss computational results, confirming the effectiveness and robustness of our approximate policies to changes in various problem parameters.

5.1 Problem generation and experimental design

We test the performance of different policies over many different problem scenarios, obtained by varying six key problem characteristics that can influence problem difficulty.

1. Number of job types. Problems with more job types are harder to solve both due to larger problem dimensions and more complex job acceptance and resource assignment tradeoffs. For our computations, we consider problems with three to five job types.

2. Resource flexibility structure. For a problem with m job types, we can have up to $2^m - 1$ resource types. If different resource types have large overlap in terms of their capabilities, then the number of available resource choices for each job increases, making the problem more challenging. We expect problems with intermediate levels of flexibility to be more difficult than the two extreme configurations containing either only specialized resources or only versatile resources. Iravani, Van Oyen, and Sims (2005) propose indices to characterize the level of flexibility of manufacturing and service systems and measure their ability to respond to variability. Since acceptance and assignment decisions are trivial when the system contains specialized resources, we do not include any specialized resources in our test problems. To understand the influence of flexibility structure, we consider the following canonical structures:

• k-Chain. For 1 < k < m, this structure consists of m resource types that can each perform k “consecutive” (with wrap-around) job types, i.e., for r = 1, 2, …, m, resource type r can perform job types r, (r+1) mod m, …, (r+k−1) mod m. Chains permit sequential reallocation of capacity among different job types, and so are effective for handling variability in the mix of demand across job types. Jordan and Graves (1995) showed that a 2-Chain structure meets almost as much total demand as a system in which all resources are versatile. Chou et al. (2008) and others have since studied similar desirable properties of k-Chains, for k > 2.
• k-All. For 1 < k < m, this structure contains m!/(k!(m−k)!) resource types, one corresponding to each combination of k job types out of the m job types.
• Complete. This structure contains $2^m - 1 - m$ resource types, one for each subset of two or more job types out of the m job types.
• Versatile. In this structure, all resources are versatile.
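The four canonical structures listed above can be generated with a few lines of Python. This is an illustrative sketch only: each resource type is represented by the set of job types it can perform, the function names are hypothetical, and job types are labeled 1, …, m with the wrap-around written so that labels stay in that range (an indexing choice made for this example).

```python
from itertools import combinations

def k_chain(m, k):
    """m resource types; type r performs k consecutive job types (with wrap-around)."""
    return [frozenset((r - 1 + i) % m + 1 for i in range(k)) for r in range(1, m + 1)]

def k_all(m, k):
    """One resource type per combination of k job types out of m."""
    return [frozenset(c) for c in combinations(range(1, m + 1), k)]

def complete(m):
    """One resource type per subset of two or more job types."""
    return [s for k in range(2, m + 1) for s in k_all(m, k)]

def versatile(m):
    """A single resource type that performs every job type."""
    return [frozenset(range(1, m + 1))]

if __name__ == "__main__":
    for name, structure in [("2-Chain", k_chain(4, 2)), ("2-All", k_all(4, 2)),
                            ("Complete", complete(4)), ("Versatile", versatile(4))]:
        print(name, [sorted(s) for s in structure])
```

The counts match the formulas in the list: `k_all(m, k)` returns m!/(k!(m−k)!) types and `complete(m)` returns 2^m − 1 − m types.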

The Versatile structure has the highest level of flexibility, and so will yield the largest profit. The relative profitability of the other structures depends on the number of resources of each type as well as the relative demand for different job types. With equal distribution of capacities and demand across resource and job types, the k-All structure is superior to the k-Chain structure. 3. Decision frequency. For a given length of the decision horizon, increasing the decision frequency corresponds to making each period shorter so as to respond quicker to customer requests. Since resource deployment decisions must now be made with less demand information, expected profits will decline as decision frequency increases. We consider problem scenarios with 5 to 60 decision periods, keeping the horizon length and demand rate constant. 4. Capacity tightness. With ample resources, we can accept most, if not all, jobs. At the other extreme, when capacity is very tight, all but the most profitable jobs must be rejected. We, therefore, expect dynamic resource deployment problems with intermediate levels of capacity

tightness to be most difficult. We define the capacity tightness index η as the ratio of the total number of resources initially available to the total expected demand (over all job types) during the horizon. For our experiments, we randomly select η from a user-specified interval [ηmin, ηmax]. We consider scenarios with moderately tight and loose capacities.

5. Relative profitability of job types. When some job types are much more profitable than others, a “wrong” acceptance or resource assignment decision can significantly decrease the relative profit performance of any approximate policy. To study this effect, we vary the ratio of profits for successive job types. For j = 1, 2, …, m–1, let $\gamma_j = \pi_j / \pi_{j+1} > 1$ denote the reward ratio for type-j jobs relative to type-(j+1) jobs. We select γj randomly from a user-specified interval [γmin, γmax]. By varying γmin and γmax, we can increase or decrease the relative profit margins of job types. For our experiments, we consider three ranges of reward ratios to capture small, moderate, and large differences in profit margins.

6. Demand distribution. As demand variability increases, the performance of the approximate policies may deteriorate. To permit proper profit comparisons across scenarios, we fix the total expected demand for all job types during the entire decision horizon; let κ denote this expected value. In our base case, arrivals of different job types follow independent Poisson processes, with mean λj = κ/(mT) for all j ∈ J in each period t = 1, 2, …, T. To study the effect of demand variability on policy performance, we also consider certain discrete demand distributions with small, medium, and large variances (relative to the mean). Further, we consider both balanced and unbalanced expected demand across job types.

The preceding six key problem characteristics, and their associated parameters, provide a convenient framework for specifying problem scenarios. We refer to each combination of settings for the six dimensions as a scenario. In all, our computational study considers 25 different problem scenarios. For each scenario, we generate 10,000 random problem instances, and average the relative profits over these instances to quantify the performance of each policy. To generate a particular problem instance for a given scenario, we first randomly select a value for the capacity tightness index η from [ηmin, ηmax], and set the total number of resources equal to ηκ, where κ is the total expected demand of all job types over the decision horizon. Each of these ηκ resources is equally likely to be designated as one of the l resource types in the chosen flexibility structure.
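The capacity-generation step just described can be sketched in Python as below. Two details are assumptions of this sketch rather than statements from the text: η is drawn uniformly from [ηmin, ηmax], and ηκ is rounded to the nearest integer to obtain a resource count. The function name is hypothetical.

```python
import random

def generate_resources(kappa, eta_min, eta_max, num_types, rng):
    """Draw a capacity tightness index and split the resources among the types.

    kappa:     total expected demand over the horizon.
    num_types: number l of resource types in the chosen flexibility structure.
    rng:       a random.Random instance (passed in for reproducibility).
    """
    eta = rng.uniform(eta_min, eta_max)   # assumed uniform on the interval
    total = round(eta * kappa)            # assumed rounding of eta * kappa
    counts = [0] * num_types
    for _ in range(total):
        # Each resource is equally likely to be any of the resource types.
        counts[rng.randrange(num_types)] += 1
    return eta, counts

if __name__ == "__main__":
    eta, counts = generate_resources(60, 0.6, 0.9, 3, random.Random(0))
    print(round(eta, 3), counts)
```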
To determine the profit margin for each job type, we normalize πm = 1; then, for j = m–1, …, 1, we randomly select γj from the interval [γmin, γmax], and set πj = γj πj+1. Finally, we generate the number of arrivals of each job type in every period t of the decision horizon using the chosen demand distribution. We set κ = 60 for all scenarios, and assume that demand is stationary.

For every problem instance, we applied the DCA, NCR, and BCR policies and determined the perfect-information solution by solving the DD model described in Section 2.2. For problem instances with three and four job types, we also computed the exact policy (for the problem with general flexibility structure and multiple job arrivals in each period) by solving the dynamic program (2.1) and (2.2) using backward induction. Finally, to provide a benchmark for comparing the profits of our optimization-based flexible resource deployment policies, we applied the following First-come First-served (FCFS) policy to every problem instance. In each period t, this policy considers job types j in order of decreasing profit, accepting as many of the arriving type-j jobs as possible using available resources, i.e., the policy accepts $\min\{d_{jt}, \sum_{r \in R_j} n_{rt}\}$ jobs, where nrt is the number of currently available type-r resources. To assign specific resources to accepted type-j jobs, the FCFS method favors using resources r ∈ Rj that are less flexible, i.e., that can perform fewer job types (Chevalier and Van den Schrieck 2008 applied this rule for skills-based routing in a call center). When there are ties, i.e., if two or more available resource types in Rj can perform the same number of job types, the method prefers assigning resources that perform less-profitable job types. Specifically, for each resource type r, let Ar be the characteristic 0-1 m-vector with element arj = 1 if resource type r can perform job type j, and 0 otherwise. Then, among all the available resource types that can perform the same number of job types, the FCFS method assigns resources in lexicographically increasing order of Ar.

We implemented the problem generator and various resource deployment policies in the C programming language, using CPLEX 8.1 for the optimization subroutines. We assess the economic value of each of our approximate policies by comparing their profits to those obtained by the FCFS policy. The profit of the exact policy or the perfect-information upper bound serves to validate the quality of the approximate solutions. We divide our computational results into two sets. In the first set, we focus on determining whether our policies are effective and identifying the best among the three approximate methods. We find that the BCR policy consistently generates near-optimal profits and outperforms the other policies. Our second set of computations tests the robustness of the approximate policies’ performance to variations in problem characteristics such as resource flexibility, capacity tightness, demand variability and balance, relative profitability of jobs, and decision frequency.

5.2 Effectiveness of the approximate policies

Our first set of computations seeks to address three main questions: which among our three approximate policies performs best, what benefit do these policies provide relative to the FCFS policy, and does the best policy generate near-optimal profits? For this purpose, we consider problem scenarios with three to five job types, and various flexibility structures. For problems with three and four job types, we are able to determine the exact policy using dynamic programming, whereas for five job-type problems, we use the perfect-information (PI) upper bound to assess the quality of the

approximate solutions. We fix the number of decision periods at T = 10, consider moderately tight resource capacities (η ∈ [0.6, 0.9]) and moderate reward ratios (γ ∈ [1.5, 2.5]) across job types, and simulate Poisson arrivals with equal expected demand across job types. Such scenarios with balanced demand across job types and equal expected resource capacity for each job type are likely to be the most challenging since all job types have equally tight capacities and contend for flexible resources. Table 1 summarizes the average optimality gaps between the approximate and exact solution values for problems with three and four job types for various flexibility structures. With three job types (m = 3), the 2-All flexibility structure is the same as the 2-Chain structure. For problems with more than three job types, to avoid spreading the resources among too many resource types, we do not consider the Complete or (m–1)-All flexibility structures. We measure the performance of each approximate policy for any scenario as the gap (averaged over 10,000 instances) between the profits of the exact and approximate policies expressed as a percentage of the exact policy’s profit. The results in Table 1 show that all three policies generate good solutions, with optimality gaps of less than 4% for every scenario. (For each scenario, Table 1 shows the smallest gap among the approximate policies in bold.) Among our three policies, the NCR policy is uniformly superior to the DCA policy, but the BCR policy consistently outperforms both the DCA and NCR policies in every problem scenario, generating profits that are within 1.7 % of the exact policy’s profits for every scenario. The average gap for the BCR policy, relative to the exact policy, is less than 1.5% for both three and four job types. 
This near-optimal performance of the BCR policy (confirmed by our later experiments) underscores the importance of considering both demand stochasticity and shifting resource bottlenecks to decide job acceptance and resource assignments.

The FCFS, DCA, NCR, and BCR columns report (Exact – Approx.)/Exact %.

| No. of job types m | Resource structure | FCFS | DCA | NCR | BCR | % of FCFS-to-Exact gap closed by BCR | (PI − Exact)/Exact % |
|---|---|---|---|---|---|---|---|
| 3 | 2-CHAIN | 14.23 | 1.92 | 1.46 | 1.05 | 92.62 | 3.23 |
| 3 | COMPLETE | 16.52 | 2.58 | 2.14 | 1.59 | 90.38 | 1.42 |
| 3 | VERSATILE | 18.77 | 2.88 | 1.67 | 1.66 | 91.13 | 1.27 |
| 3 | Average % | 16.51 | 2.46 | 1.76 | 1.43 | 91.88 | 1.97 |
| 4 | 2-CHAIN | 25.77 | 3.92 | 2.13 | 1.70 | 93.40 | 5.15 |
| 4 | 2-ALL | 26.86 | 3.84 | 1.96 | 1.46 | 94.56 | 4.13 |
| 4 | 3-CHAIN | 28.74 | 3.40 | 1.92 | 1.61 | 94.40 | 2.66 |
| 4 | VERSATILE | 28.14 | 3.94 | 1.06 | 1.06 | 96.23 | 1.64 |
| 4 | Average % | 27.38 | 3.78 | 1.77 | 1.46 | 94.65 | 3.40 |

Table 1: Policy effectiveness for three and four job-type scenarios
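The two performance measures in Table 1 can be reproduced with a short calculation. The sketch below assumes that the "gap closed" column equals the reduction in the optimality gap relative to the FCFS gap, which is consistent with the individual scenario rows of Table 1 (small differences can arise because the table entries are rounded); the function names are hypothetical.

```python
def optimality_gap(exact_profit, approx_profit):
    """(Exact - Approx.) / Exact, expressed as a percentage."""
    return 100.0 * (exact_profit - approx_profit) / exact_profit

def gap_closed(fcfs_gap, policy_gap):
    """Share of the FCFS-to-Exact gap removed by a policy, as a percentage
    (assumed definition, inferred from the table rows)."""
    return 100.0 * (fcfs_gap - policy_gap) / fcfs_gap

if __name__ == "__main__":
    # 2-CHAIN structure with three job types: FCFS gap 14.23%, BCR gap 1.05%.
    print(round(gap_closed(14.23, 1.05), 2))
    # 2-CHAIN structure with four job types: FCFS gap 25.77%, BCR gap 1.70%.
    print(round(gap_closed(25.77, 1.70), 2))
```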

Comparing the gaps for our approximate policies with those for the FCFS policy reinforces the importance of implementing an optimization-based policy to guide dynamic resource deployment decisions. The profits generated by the FCFS policy are 14.2% to 28.7% lower than those of the exact policy, and the BCR policy closes over 90% of this gap. The last column of Table 1 shows the

percentage gaps between the perfect-information (PI) solution values and exact profits. The PI values overestimate the exact profits by as much as 5%, with the PI upper bound becoming weaker as the number of job types increases from three to four. As resource flexibility increases (e.g., comparing the results for 2-Chain and Versatile resource structures), the PI-to-Exact gaps decrease, implying that the exact policy is able to achieve almost as much profit as decisions based on perfect information.

Table 2 reports the results for five job-type problem scenarios. Since finding the exact policy for these problems is computationally prohibitive, we use the PI value as the benchmark to assess the relative performance of our three approximate policies and the FCFS policy. As before, for all scenarios, the BCR policy provides the highest average profit among the three approximate policies. The percentage gaps in Table 2 are notably higher than those in Table 1 because we measure these gaps relative to the PI upper bound rather than the exact value. And, as we noted previously, this upper bound appears to become looser as the number of job types increases. Assuming that the PI-to-Exact gap increases by 1 to 2% when the number of job types increases from four to five (as it does when the number of job types increases from three to four), the results in Table 2 suggest that the actual optimality gaps (relative to the exact policy’s profits) of our approximate policies should be comparable to those in Table 1. We also note that, as the problem size increases, the resource structure becomes more complex, vastly increasing the number of decision options and hence the possibility of sub-optimal choices at each stage. This increased complexity adversely affects performance, particularly for the DCA policy.

All columns report (PI – Approx.)/PI %.

| Resource structure | FCFS | DCA | NCR | BCR |
|---|---|---|---|---|
| 2-CHAIN | 37.50 | 10.13 | 7.93 | 7.29 |
| 2-ALL | 39.16 | 8.54 | 6.95 | 6.55 |
| 3-CHAIN | 40.76 | 7.77 | 6.31 | 6.35 |
| 3-ALL | 40.66 | 7.21 | 4.87 | 4.64 |
| 4-CHAIN | 39.16 | 6.18 | 4.33 | 4.24 |
| VERSATILE | 41.81 | 5.34 | 4.16 | 4.16 |
| Average % | 39.84 | 7.53 | 5.76 | 5.54 |

Table 2: Policy effectiveness (relative to PI upper bound) for five job-type scenarios

5.3 Robustness of policy performance: Impact of key problem dimensions

Having established the effectiveness of our approximate policies, we now study their robustness with respect to the key problem dimensions. For this purpose, we consider variations around a base scenario with three job types, Complete flexibility structure, Poisson arrivals with balanced demand, moderately tight capacity availability η (∈ [0.6, 0.9]), moderate reward ratios γ (∈ [1.5, 2.5]), and ten periods in the decision horizon. We vary the base scenario, one dimension at a time, to assess the effect on policy performance of capacity tightness, demand characteristics, relative profit margins, and decision frequency. Figure 1 (a-e) displays the percentage gaps, averaged over 10,000 problem

instances, between the approximate policies’ profits and the exact value for each scenario variant. The results for the base scenario are displayed as grey bars.

[Figure 1 panels, each plotting the average Approximate-to-Exact percentage gap for the DCA, NCR, and BCR policies: (a) Capacity tightness (tight and loose capacity); (b) Demand variance (low, moderate, and high variance); (c) Unbalanced demand (balanced, type-1 heavy, type-2 heavy, and type-3 heavy demand); (d) Reward ratio (small, moderate, and large reward ratio); (e) Decision frequency (T = 5, 10, 20, and 60).]

Figure 1: Robustness of policy performance

As before, across all the 12 additional scenarios covered by these experiments, the results of Figure 1 show the same consistent pattern of relative performance as in Section 5.2: the BCR policy is superior to the NCR policy, which in turn outperforms the DCA policy. For the 12 new scenarios, the average BCR-to-Exact gap is only 1.19%, validating the robustness of the BCR policy. We next discuss the results for each of the five scenario variants.

Figure 1(a) compares the gaps for the base case, with moderately tight capacities, and a scenario with moderately loose capacities (η ∈ [0.9, 1.2]). As capacity increases, more incoming jobs can be accepted, and so we expect the approximate policies to achieve profits closer to the exact profits. Figure 1(a) confirms this performance improvement (reduction in gaps) for the loose capacity scenario. The FCFS-to-Exact gap also reduces (to 5.8%) when capacity becomes looser, implying that using our optimization-based policy is more important when capacity is tight.

Figures 1(b) and 1(c) address two aspects of the demand process that can affect policy performance: variability in demand and the mix of demand across job types. For the base scenario, with the Poisson demand distribution, the variance of demand equals the mean. To assess the impact of demand variance, we consider three discrete demand distributions that have the same mean as the base scenario (two jobs per period per job type), but different variances. Specifically, we consider scenarios with low variance (1.2), medium variance (2), and high variance (3) using, respectively, a triangular distribution, uniform distribution, and bi-modal distribution weighted at the extremes. Higher demand variability increases the chances of making erroneous job acceptance and resource assignment decisions. Figure 1(b) confirms this effect, showing that the approximate policies have lower gaps when demand has smaller variation.

Turning to the demand mix, our base model assumes that every job type has the same expected demand per period. We are interested in assessing the effect of having unequal expected demand across job types. Varying the relative demand, while keeping the expected number of resources the same for all resource types, introduces variations in capacity availability across job types. We expect this variation to reduce problem difficulty (compared to balanced demand), and so improve the performance of the approximate policies. We consider three unbalanced demand scenarios, one corresponding to each job type j having higher demand than the other two job types. For j = 1, 2, and 3, we define the type-j-heavy demand scenario as one in which the expected arrival rate per period is 3 for type-j jobs and 1.5 for the other two job types (so the total expected demand across all job types is the same as the base scenario). Figure 1(c) confirms our intuition that scenarios with balanced demand are more difficult to solve. The average Approximate-to-Exact percentage gaps for the unbalanced problems are significantly lower than those for the balanced problems for all three approximate policies, with the gaps being lowest when the least profitable job type (type 3) has high demand. The FCFS policy has an average optimality gap of 7.97% over the three unbalanced demand scenarios. So, our optimization-based policies continue to be valuable for generating higher profits, although less so than with balanced demand.
Figure 1(d) compares the performance of the approximate policies as the relative profitability of job types, captured by the reward ratios, changes. With small reward ratios (γ ∈ [1.5, 2.5]), job types do not differ much in their profit margins compared to the base scenario with moderate reward ratios, and so the impact of wrong acceptance or assignment decisions on overall profits is lower. In this case, we expect the approximate policies to yield profits that are closer to the exact profits. At the other extreme, with large reward ratios (γ ∈ [2, 3]), wrong acceptance decisions can have significant profit impact. The results in Figure 1(d) validate this hypothesis; the gaps increase as the reward ratio increases. Notably, however, the BCR policy’s performance is robust to changes in the reward ratio, with the gap remaining at less than 1.65% even with high reward ratios. In contrast, the FCFS policy has an average gap of 11.9% for small reward ratios, whereas this gap more than doubles to 27.2% for large reward ratios.

Finally, we study the effect of decision frequency on policy performance. Increasing the decision frequency (by increasing the number of decision periods T in the horizon) permits the firm to be more prompt in its responsiveness to customer requests, but reduces the opportunity to accumulate requests before making acceptance and assignment decisions. To contrast with the base setting of 10 periods,
we considered three alternative scenarios with 5, 20, and 60 decision periods, respectively, while keeping the total expected demand in the horizon fixed. As the results in Figure 1(e) show, increasing the number of periods, and hence decision frequency, increases the percentage gap, but only minimally for the BCR policy. The DCA policy’s optimality gap more than triples when the decision frequency increases from 10 to 60 (from 2.6% to 8.2%), whereas the BCR policy is much more robust (its gap increases from 1.6% to just 2.2%). The FCFS-to-Exact gap increases from 10.9% to 29.3% when the decision frequency increases from 5 to 60.

This study also permits us to assess the “cost” of making decisions with less information as decision frequency increases. Figure 2 shows the normalized average profit for the exact, BCR, and FCFS policies as the number of periods T increases from 5 to 60. (We normalize the profits so that the average exact profit for the base case, with T = 10, is 100.) The figure also displays the PI value; when T = 1 (i.e., single decision period at the end of the horizon), both the exact and BCR policies achieve this value. As expected, the profits of all three policies decrease as decision frequency increases; the figure shows that profit decreases at a decreasing rate with T. Compared to the base case of ten decision periods, doubling the decision frequency (to T = 20) reduces the exact profits by 1.8%, while halving the frequency (to T = 5) raises profit by 1%. The FCFS policy’s performance deteriorates significantly as the number of periods grows; its profit gap relative to the exact policy increases from 16.5% to 29.3% when T increases from 10 to 60, whereas the BCR policy’s gap is relatively flat.

Figure 2: Profit impact of decision frequency (normalized average profit, with base-case exact profit = 100, plotted against the number of decision periods T for the Exact, BCR, and FCFS policies and the PI value)

To summarize, in this section, we have covered a wide range of problem scenarios to capture many different operating conditions of a firm. Remarkably, the BCR policy always achieves profits that are within 2.2% of the exact policy’s profits, and in many cases attains optimality gaps well below 1%. For problems with three or four job types (for which we have the exact profits), over the 19 corresponding problem scenarios (covered in Table 1 and Figure 1), the BCR policy has an average optimality gap of 1.3% (relative to the exact profits), while the NCR and DCA policies have gaps of 1.7% and 3.2%, respectively. The FCFS policy, on the other hand, has an average optimality gap of 18.8% over the same set of problem scenarios. Our experiments have confirmed that contextual features such as moderately tight capacities, high reward ratios, and balanced demand increase problem difficulty.

6. Deferred Resource Assignments

Thus far, we have addressed the situation when both job acceptance and resource assignment decisions must be made dynamically, i.e., in each period, the firm must not only decide which incoming jobs to accept, but also which resource to assign to each accepted job. This model is appropriate for the workplace learning context since clients (and instructors) prefer to know which instructor will conduct their training program soon after their request is accepted in order to begin working on any program customization, plan logistical arrangements, and so on. Other contexts may require only prompt (i.e., dynamic) acceptance decisions, but may not require immediate resource commitments. That is, in each period, the firm only needs to decide which incoming jobs to accept, and can defer the assignment of a specific resource to each accepted job until the end of the horizon (before performing the service). Having the option to defer the resource assignment decisions can increase profits (compared to dynamic assignment decisions) since this option enlarges, albeit modestly, the feasible acceptance region in later periods (footnote 2). We note that the profit with deferred assignment cannot exceed the perfect-information value, i.e., the profit when both acceptance and assignment decisions can be deferred until the end of the horizon.

In this section, we seek to address the following related questions. Can we readily adapt our previous dynamic acceptance and assignment policies to the dynamic acceptance-deferred assignment model? How effective and robust is the performance of these modified policies for the revised model? What is the economic value (i.e., incremental profits) of deferring assignments compared to dynamic assignments, and how does this value depend on the problem characteristics? To explore these issues, we first (in Section 6.1) formally define the revised model by discussing its stochastic dynamic programming formulation. We then illustrate (in Section 6.2) how to adapt the approximate policies to handle deferred assignments by describing modifications to one policy, the BCR policy, and report (in Section 6.3) computational results for various problem scenarios comparing the profit performance of this modified policy with the profits generated by the exact policy with and without deferred assignment.

Footnote 2: In general, we can consider a spectrum of “partially” deferred assignment models, parameterized by a deferral period td > 1, in which assignment decisions for jobs accepted before period td (i.e., for t > td) can be deferred until td, but the firm must simultaneously decide both acceptance and assignment for t < td. As td becomes smaller, profits increase. We focus on the limiting case with td = 1 in order to assess the maximum possible benefits due to assignment deferrals.

6.1 Model formulation for dynamic acceptance-deferred assignment problem

To formulate the deferred assignment model, we first note that, since resources are not committed until the end of the horizon, the state of the system in each period must specify the number of jobs of each type accepted until that period (instead of the number of remaining unassigned resources for the dynamic assignment model). For t = T, …, 1 and j ∈ J, let ζjt denote the cumulative accepted demand for type-j jobs up to and including period t, and let ζt = {ζ1t, ζ2t, …, ζmt}. By definition, ζT+1 = 0 and ζ1 is the vector of total acceptances over the entire planning horizon. As before, dt denotes the vector of realized demands in period t. Then, the state of the system at the beginning of period t is (ζt+1, dt). In each period t, we must ensure that the number of new type-j jobs accepted in this period, say, gjt for all j ∈ J, is feasible, i.e., the available (initial) capacity must be adequate to process these and previously accepted jobs. For special resource structures, maintaining feasibility is easy. For instance, if all resources are versatile, then we only need to ensure that the total number of jobs of all types accepted thus far does not exceed the total number of resources. However, with general resource flexibility structures, the feasibility check is not trivial. Specifically, although resource assignments are not finalized until the end of the horizon, the firm must develop a “tentative” plan for resource assignments in each period to ensure that it has enough resources to process the jobs accepted thus far. For this purpose, we introduce auxiliary decision variables hjrt, for all j ∈ J and r ∈ Rj, denoting the number of type-r resources that are tentatively assigned to the type-j jobs accepted at and before period t.
Then, the feasible region of acceptance decisions in period t for the deferred assignment (DA) model is defined as follows:
\[
FA_t^{DA}(\zeta_{t+1}, d_t) = \Big\{ g_{jt} :\ g_{jt} \le d_{jt},\ \ \sum_{r \in R_j} h_{jrt} = \zeta_{j,t+1} + g_{jt}\ \ \forall j \in J,\ \ \sum_{j \in J_r} h_{jrt} \le n_{rT}\ \ \forall r \in R,\ \ h_{jrt} \ge 0\ \ \forall j \in J,\ r \in R_j \Big\}.
\]

The constraints ensure that: (i) the number gjt of type-j jobs accepted in period t does not exceed the demand djt observed in the period; (ii) the total number of type-j jobs tentatively assigned to resources equals the number previously accepted (ζj,t+1) plus acceptances in the current period t; and (iii) the total number of jobs tentatively assigned to type-r resources does not exceed the initial capacity nrT. Now, let $u_t^{DA}(\zeta_{t+1}, d_t)$ be the maximum expected profit-to-go in period t at the current state (ζt+1, dt). Given that the firm has accepted ζt+1 jobs before period t, the value function is
\[
v_t^{DA}(\zeta_{t+1}) = \sum_{d_t} u_t^{DA}(\zeta_{t+1}, d_t)\, P(D_t = d_t),
\]
where P(Dt = dt) is the joint probability that the demand in period t is dt. We can express the profit-to-go $u_t^{DA}(\zeta_{t+1}, d_t)$ as:
\[
u_t^{DA}(\zeta_{t+1}, d_t) = \max_{\{g_{jt}\} \in FA_t^{DA}(\zeta_{t+1}, d_t)} \Big\{ \sum_{j \in J} \pi_j g_{jt} + \sum_{d_{t-1}} u_{t-1}^{DA}(\zeta_{t+1} + g_t, d_{t-1})\, P(D_{t-1} = d_{t-1}) \Big\}. \tag{6.1}
\]
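The recursion (6.1) can be sketched on a toy instance as follows. This is a minimal illustration rather than the authors' implementation, and it rests on several assumptions that are not in the text: demand is independent across job types and identically distributed across periods, `t` counts the periods remaining, and feasibility is tested with a subset (Hall-type, max-flow/min-cut) condition instead of explicitly building the tentative assignment variables hjrt. The function names `is_feasible` and `exact_da_value` are hypothetical.

```python
from functools import lru_cache
from itertools import combinations, product


def is_feasible(zeta, capacity, can_do):
    """True if accepted counts zeta[j] can all be tentatively assigned.

    capacity[r] is the initial number of type-r resources and can_do[j] is the
    set of resource types able to perform job type j. An assignment exists iff
    every subset of job types fits within the resources usable by that subset.
    """
    jobs = range(len(zeta))
    for size in range(1, len(zeta) + 1):
        for subset in combinations(jobs, size):
            usable = set().union(*(can_do[j] for j in subset))
            if sum(zeta[j] for j in subset) > sum(capacity[r] for r in usable):
                return False
    return True


def exact_da_value(periods, capacity, can_do, reward, demand_pmf):
    """Expected profit of the deferred-assignment DP with `periods` periods.

    demand_pmf[j] maps each per-period demand level of job type j to its
    probability (assumed i.i.d. over periods and independent across types).
    """
    m = len(reward)
    outcomes = []  # joint demand vectors with their probabilities
    for combo in product(*(pmf.items() for pmf in demand_pmf)):
        prob = 1.0
        for _, q in combo:
            prob *= q
        outcomes.append((tuple(level for level, _ in combo), prob))

    @lru_cache(maxsize=None)
    def v(t, zeta):
        # zeta = cumulative acceptances so far; t = periods still to come.
        if t == 0:
            return 0.0
        total = 0.0
        for d, prob in outcomes:
            best = float("-inf")
            for g in product(*(range(dj + 1) for dj in d)):  # g_j <= d_j
                new_zeta = tuple(z + x for z, x in zip(zeta, g))
                if not is_feasible(new_zeta, capacity, can_do):
                    continue
                profit = sum(reward[j] * g[j] for j in range(m))
                best = max(best, profit + v(t - 1, new_zeta))
            total += prob * best
        return total

    return v(periods, (0,) * m)


# One job type, one resource unit, demand of 0 or 1 with equal probability.
print(exact_da_value(2, [1], [{0}], [10.0], [{0: 0.5, 1: 0.5}]))
```

Rejecting every job (g = 0) is always feasible, so the inner maximum is well defined. The enumeration grows exponentially with the number of job types and demand levels, which is consistent with the text's motivation for approximate policies.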

In the last period (t = 1), after determining ζ1, we can decide the final assignment of resources to accepted jobs by solving a simple assignment problem. We refer to the solution to the stochastic dynamic program (6.1), with deferred assignment, as the ExactDA policy.

6.2 BCR policy with deferred resource assignments

We can modify all three approximate policies of Section 4 to handle deferred rather than dynamic resource assignments. To illustrate the needed changes, we modify the BCR policy to handle deferred assignments and report computational results using this modified policy, which we call the BCRDA policy. The BCR policy’s strength lies in its ability to dynamically identify resource and job subsets that together constitute capacity bottlenecks, taking into account the stochasticity of future demand and the flexibility structure of currently available resources. These bottlenecks then guide decisions on which jobs among the new arrivals in this period to accept. However, while the method accepts new jobs only if the system has adequate capacity to process these jobs, it requires as input the “remaining” resources of each type after assigning to previously accepted jobs to ensure feasibility of these acceptance decisions. For the context with deferred assignments, we have the opportunity to reallocate (tentatively) the resources among accepted jobs in each period. We, therefore, augment the original BCR policy (Section 4.4) with a third step, after all acceptance decisions in a period, to reassign resources and determine the remaining resources for new assignments in the next period. Specifically, at the end of Step 2 of the previous BCR procedure for period t, the total number of newly accepted type-j jobs in this period is $g_{jt} = \sum_{r \in RF_j} x^*_{jrt}$. Hence, the total number of accepted type-j jobs up to and including this period is $\zeta_{jt} = \zeta_{j,t+1} + g_{jt}$ for all job types j ∈ J. We then apply the following Step 3 to tentatively assign resources to these accepted jobs, starting with assignment of specialized resource types (Step 3.1) and assignment to preferred resource types (Step 3.2). For this discussion, recall that, for each job type j, sj, RFj, and RPj refer respectively to the index of the specialized resource type, set of flexible resource types, and set of preferred resource types (for which job type j is the most profitable). We use $\zeta'_{jt}$ and $n'_{rT}$ to denote the remaining number of accepted type-j jobs and remaining number of type-r resources as we assign specialized and preferred resources.

Step 3 for BCRDA policy:

Step 3.1: For j = 1, 2, …, m, assign as many type-j jobs to specialized resources as possible, i.e., set $\zeta'_{jt} = \zeta_{jt} - \min(n_{s_j,T}, \zeta_{jt})$ and $n'_{s_j,T} = n_{s_j,T} - \min(n_{s_j,T}, \zeta_{jt})$.

Step 3.2: For j = 1, 2, …, m, assign preferred resources. That is, for resource types r ∈ RPj, in lexicographically increasing order, set $n'_{rT} = n_{rT} - \min\{n_{rT}, \zeta'_{jt}\}$ and update $\zeta'_{jt} \leftarrow \zeta'_{jt} - \min\{n_{rT}, \zeta'_{jt}\}$.

Step 3.3: Let $J' = \{ j \in J : \zeta'_{jt} > 0 \}$ be the set of job types with remaining accepted demand, and $RF'_j = \{ r \in RF_j : n'_{rT} > 0 \}$ be the set of flexible resource types with remaining resources capable of performing job type j ∈ J'. Solve the following Resource Reallocation Problem [RRP], with decision variables hjrt denoting the number of type-r resources assigned to remaining accepted type-j jobs, for j ∈ J' and r ∈ RF'j.

[RRP]

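\[
\text{Minimize} \quad \sum_{j \in J'} \sum_{r \in RF'_j} \tau_r\, h_{jrt} \tag{6.2}
\]
subject to
\[
\sum_{r \in RF'_j} h_{jrt} = \zeta'_{jt} \quad \text{for all } j \in J', \tag{6.3}
\]
\[
\sum_{j \in J_r \cap J'} h_{jrt} \le n'_{rT} \quad \text{for all } r \in \cup_{j \in J'} RF'_j, \text{ and} \tag{6.4}
\]
\[
h_{jrt} \ge 0 \quad \text{for all } j \in J' \text{ and } r \in RF'_j. \tag{6.5}
\]
Set $n_{r,t-1} = n'_{rT} - \sum_{j \in J_r \cap J'} h_{jrt}$ if $r \in RF'_j$ for some j ∈ J', and $n_{r,t-1} = n'_{rT}$ otherwise.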
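Steps 3.1 and 3.3 can be sketched as follows on a tiny instance. This is an illustrative sketch under stated assumptions, not the authors' procedure: Step 3.2 (preferred resources) is omitted, [RRP] is solved by brute-force enumeration of integer plans rather than by a linear programming or network flow solver, and the names `assign_specialized` and `solve_rrp_by_enumeration`, the resource labels, and the τ values are all hypothetical.

```python
from itertools import product


def assign_specialized(zeta, n, specialized):
    """Step 3.1: cover each job type with its own specialized resource first.

    zeta[j] = accepted type-j jobs, n[r] = initial type-r resources,
    specialized[j] = label of the resource type dedicated to job type j.
    Returns the remaining jobs and the remaining resources.
    """
    zeta_rem, n_rem = list(zeta), dict(n)
    for j, s in enumerate(specialized):
        used = min(n_rem[s], zeta_rem[j])
        zeta_rem[j] -= used
        n_rem[s] -= used
    return zeta_rem, n_rem


def solve_rrp_by_enumeration(zeta_rem, n_rem, flexible, tau):
    """Step 3.3 on a tiny instance: enumerate integer plans h[(j, r)].

    flexible[j] = flexible resource types able to perform job type j,
    tau[r] = value of one type-r resource. Returns (cost, plan), or None when
    the remaining jobs cannot all be covered.
    """
    pairs = [(j, r) for j, types in enumerate(flexible) for r in sorted(types)
             if zeta_rem[j] > 0 and n_rem[r] > 0]
    ranges = [range(min(zeta_rem[j], n_rem[r]) + 1) for j, r in pairs]
    best = None
    for h in product(*ranges):
        covered = [0] * len(zeta_rem)
        used = dict.fromkeys(n_rem, 0)
        for (j, r), amount in zip(pairs, h):
            covered[j] += amount
            used[r] += amount
        if covered != list(zeta_rem):  # constraint (6.3): cover all remaining jobs
            continue
        if any(used[r] > n_rem[r] for r in used):  # constraint (6.4): capacity
            continue
        cost = sum(tau[r] * amount for (_, r), amount in zip(pairs, h))  # (6.2)
        if best is None or cost < best[0]:
            best = (cost, dict(zip(pairs, h)))
    return best


# Two job types, each with one specialized unit, plus two flexible types.
n = {"S1": 1, "S2": 1, "F1": 1, "F2": 2}
zeta_rem, n_rem = assign_specialized([2, 2], n, ["S1", "S2"])
result = solve_rrp_by_enumeration(
    zeta_rem, n_rem, [{"F1", "F2"}, {"F1", "F2"}], {"F1": 3.0, "F2": 1.0})
print(zeta_rem, n_rem, result)
```

In this example the cheaper flexible type absorbs both remaining jobs, leaving the more valuable flexible unit free for future acceptances, which is the intent of the minimum-value objective.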

The [RRP] model assigns remaining flexible resources to the remaining accepted demand (after assignment of specialized and preferred resources) so as to minimize the value τr of the resources used. The objective function (6.2) minimizes this value, while constraints (6.3) and (6.4) specify, respectively, that resources must be assigned to all the remaining accepted demand for each job type, and the number of resources assigned must not exceed the number of remaining resources for each resource type r. The BCRDA policy consists of applying Steps 1 and 2 of the BCR policy (Section 4.4) followed by the above Step 3 at each period t = T, T–1, …, 1. Next, we discuss computational results using this policy for the flexible resource deployment problem with deferred assignments.

6.3 Computational results for deferred assignment model

To assess the value of deferring resource assignment decisions and validate the effectiveness of the approximate policy, we implemented the ExactDA and BCRDA policies, and applied them to test problems obtained using our problem generation framework of Section 5. We expect the ExactDA policy’s profit to be higher than that of the previous exact policy with dynamic assignments but lower than the perfect-information (PI) solution value. We first explore these profit gaps for the different scenarios introduced in Section 5.3. As before, we use the three job-type problem with Complete flexibility structure, Poisson arrivals, ten-period decision horizon, and moderate capacity tightness and reward ratios as the base scenario. We consider variants of this base model to study how the flexibility level, capacity tightness, demand variance, demand mix, relative profitability, and decision frequency affect the benefits of deferred assignments. Table 3 compares the average gap (for 10,000 instances) between the PI value and exact profits with the gap between the ExactDA and exact profits for various scenarios; both gaps are expressed as percentages of the exact profits. Since the PI solution assumes that both acceptance and assignment decisions can be made after observing the demand for all T periods, the PI-to-Exact gap measures the percentage improvement in profit if we can defer both job acceptance and resource assignment decisions to the end of the horizon. On the other hand, the ExactDA-to-Exact gap represents the profit improvement when just resource assignments are deferred to the end. On average, over all scenarios, deferring the assignment decision
increases profit by only 1% compared to the 2.1% increase when both acceptance and assignment decisions are deferred. However, deferring assignments can be attractive in certain scenarios. Specifically, when demand has high variance, reward ratios are large, or decision frequency is high, deferring resource assignment decisions captures more than 50% of the total profit increase that a firm can realize by deferring both acceptance and assignment decisions. This finding is not surprising since dynamic decisions are likely to reduce profits the most under these three extreme scenarios.

Scenario                  (PI - Exact) / Exact, %    (ExactDA - Exact) / Exact, %
1. Flexibility Level
   2-CHAIN                3.23                       1.13
   COMPLETE               1.42                       0.89
   VERSATILE              1.27                       0.00
2. Capacity
   Tight                  1.42                       0.89
   Loose                  1.05                       0.88
3. Demand Balance
   Balanced               1.42                       0.89
   Type-1 heavy           3.91                       0.92
   Type-2 heavy           1.91                       0.57
   Type-3 heavy           1.43                       0.55
4. Demand Variance
   Low                    1.75                       1.31
   Moderate               2.11                       0.97
   High                   3.25                       2.01
5. Reward Ratio
   Small                  1.30                       0.85
   Moderate               1.42                       0.89
   Large                  3.22                       2.09
6. Number of Periods
   T = 5                  0.46                       0.35
   T = 10                 1.42                       0.89
   T = 20                 3.26                       1.74

Table 3: Benefits of deferred resource assignment
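As a worked illustration of the statement that deferring assignments alone captures more than 50% of the total deferral benefit in the three scenarios singled out above, the ratio of the two Table 3 columns can be computed directly. The symbol ρ is introduced here only for this illustration, and the figures assume the column reading of Table 3 in which each scenario's second value is its ExactDA-to-Exact gap.

```latex
\[
\rho = \frac{\text{ExactDA} - \text{Exact}}{\text{PI} - \text{Exact}}, \qquad
\rho_{\text{high variance}} = \frac{2.01}{3.25} \approx 0.62, \qquad
\rho_{\text{large reward ratio}} = \frac{2.09}{3.22} \approx 0.65, \qquad
\rho_{T = 20} = \frac{1.74}{3.26} \approx 0.53 .
\]
```

Because both gaps are expressed as percentages of the same exact profit, the normalizing denominator cancels in the ratio.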

Next, we test the effectiveness of the BCRDA policy, in terms of its near-optimality, for scenarios with three and four job types. Table 4 presents the average gap between the ExactDA and BCRDA profits, as a percentage of the ExactDA profits, for various flexibility structures. To facilitate comparison with the performance of the original BCR method (without deferred assignment), the last column of Table 4 reports the corresponding Exact-to-BCR gaps (previously reported in Section 5.2). The average ExactDA-to-BCRDA gap over all scenarios is only 1.3%, which is slightly lower than the previous average Exact-to-BCR optimality gap of 1.4% for the original BCR policy applied to the same problem scenarios but with dynamic assignments. These results suggest that the BCR policy continues to be effective in generating near-optimal solutions when adapted to problems with deferred assignments.

No. of job types m    Resource structure    (ExactDA - BCRDA) / ExactDA, %
3                     2-CHAIN               1.25
3                     COMPLETE              1.76
3                     VERSATILE             1.15
4                     2-CHAIN               1.15
4                     2-ALL                 1.81
4                     3-CHAIN               1.42
4                     VERSATILE             0.74

Table 4: BCRDA policy effectiveness for three and four job-type scenarios


7. Conclusions

Employing a mix of flexible and specialized resources permits firms to broaden their service offerings while maintaining high resource utilization and reducing lost sales. Prior work on the benefits of flexibility largely assumes that resource assignments can be deferred until all the demand is realized, but this assumption may not apply to service firms that must dynamically decide whether to accept incoming jobs, and what resource to assign to accepted jobs. Since these decisions greatly influence the profitability of the enterprise, realizing the potential benefits of flexibility requires a principled approach for dynamic resource deployment. To our knowledge, this paper is among the first to study the dynamic demand selection and resource assignment problem for general resource flexibility structures. Our analysis of the special case with unit arrivals and the Star resource structure revealed that the exact policy has a threshold structure. Our three approximate policies provide alternative ways to compute and incorporate threshold principles for more general settings. The DCA policy accounts for the capacity interactions with general flexibility structures, but uses expected values as demand estimates. The NCR method accounts for randomness in demands, using a newsvendor-type approach to determine aggregate capacity reservations, but simplifies the resource structure and interactions among job and resource types. The BCR policy combines the strengths of the previous policies by identifying subsets of resources that are bottlenecks and appropriately reserving capacities. These policies are readily adaptable to immediate decision-making, as each job arrives, as well as to contexts that permit deferred assignment decisions.
Our extensive computational results demonstrate that the BCR and NCR policies are very effective in generating near-optimal solutions over a wide range of problem settings, both with dynamic assignments and deferred assignments. The BCR policy’s exceptional performance suggests that contexts with general resource flexibility structures require accounting for both demand stochasticity and shifting bottlenecks. The results also show that using simplistic rules such as first come, first served can result in significant profit degradation. Our study of the impact of decision frequency on profit performance reveals that assuming perfect demand information for resource deployment, as in most capacity planning models, can lead to considerable overestimation of profits for service contexts that require dynamic and frequent decisions with incomplete information. This observation raises the interesting opportunity for future research on more refined flexible capacity planning models that explicitly incorporate dynamic or multi-period resource deployment decisions to assess the profitability of candidate capacity investment decisions.


References

Aksin, Z., M. Armony, and V. Mehrotra. 2007. The modern call-center: A multi-disciplinary perspective on operations management research, Production and Operations Management 16 665-688.
Armony, M., and C. Maglaras. 2004a. Contact centers with a call-back option and real-time delay information, Operations Research 52 527-545.
Armony, M., and C. Maglaras. 2004b. On customer contact centers with a call-back option: Customer decisions, routing rules and system design, Operations Research 52 271-292.
Belobaba, P. P. 1989. Application of a probabilistic decision model to airline seat inventory control, Operations Research 37 183-197.
Bertsekas, D. P., and J. N. Tsitsiklis. 1998. Neuro-Dynamic Programming, Athena Scientific, Belmont, MA.
Bertsimas, D., and I. Popescu. 2003. Revenue management in a dynamic network environment, Transportation Science 37 257-277.
Bhulai, S., and G. Koole. 2003. A queueing model for call blending in call centers, IEEE Transactions on Automatic Control 48 1434-1438.
Bish, E. K., and Q. Wang. 2004. Optimal investment strategies for flexible resources, considering pricing and correlated demands, Operations Research 52 954-965.
Chevalier, P., and J. Van den Schrieck. 2008. Optimizing the staffing and routing of small-size hierarchical call centers, Production and Operations Management 17 306-319.
Cross, R. G. 1997. Revenue Management: Hard-Core Tactics for Market Domination, Broadway Books, New York, NY.
Fine, C., and R. Freund. 1990. Optimal investment in product-flexible manufacturing capacity, Management Science 36 449-466.
Gans, N., G. Koole, and A. Mandelbaum. 2003. Telephone call centers: Tutorial, review, and research prospects, Manufacturing & Service Operations Management 5 79-141.
Gans, N., and Y. P. Zhou. 2002. Managing learning and turnover in employee staffing, Operations Research 50 991-1006.
Garnett, O., and A. Mandelbaum. 2001. An introduction to skills-based routing and its operational complexities, Teaching note, Technion, Israel.
Geraghty, M. K., and E. Johnson. 1997. Revenue management saves National Car Rental, Interfaces 27 107-127.
Graves, S. C., and B. Tomlin. 2003. Process flexibility in supply chains, Management Science 49 907-919.
Hopp, W. J., E. Tekin, and M. P. Van Oyen. 2004. Benefits of skill chaining in serial production lines with cross-trained workers, Management Science 50 83-98.
Iravani, S. M., M. P. Van Oyen, and K. T. Sims. 2005. Structural flexibility: A new perspective on the design of manufacturing and service operations, Management Science 51 151-166.
Jordan, W. C., and S. C. Graves. 1995. Principles on the benefits of manufacturing process flexibility, Management Science 41 577-594.
Kleywegt, A. J., and J. D. Papastavrou. 1998. The dynamic and stochastic knapsack problem, Operations Research 46 17-35.
Maister, D. H. 1997. Managing the Professional Service Firm, Simon and Schuster, New York, NY.
Netessine, S., G. Dobson, and R. A. Shumsky. 2002. Flexible service capacity, Operations Research 50 375-388.
Orlin, J. B. 1997. A polynomial time primal network simplex algorithm for minimum cost flows, Mathematical Programming 78 109-129.
Ormeci, E. L. 2004. Dynamic admission control in a call center with one shared and two dedicated service facilities, IEEE Transactions on Automatic Control 49 1157-1161.
Ross, S. M. 1996. Stochastic Processes, Second edition, John Wiley and Sons, New York, NY.
Paradise, A. 2007. State of the Industry Report, American Society for Training and Development, Alexandria, VA.
Perry, M., and A. Nilsson. 1992. Performance modeling of automatic call distributors: Assignable grade of service staffing, Proceedings of the XIV International Switching Symposium, 294-298.
Puterman, M. L. 2005. Markov Decision Processes: Discrete Stochastic Dynamic Programming, Second edition, John Wiley and Sons, Hoboken, NJ.
Sen, A., and A. Zhang. 1999. The newsboy problem with multiple demand classes, IIE Transactions 31 431-444.
Shumsky, R. A. 2004. Approximation and analysis of a queueing system with flexible and specialized servers, OR Spectrum 26 307-330.
Smith, B. C., J. F. Leimkuhler, and R. M. Darrow. 1992. Yield management at American Airlines, Interfaces 22 8-31.
Stanford, D. A., and W. K. Grassman. 2000. Bilingual server call centres, in Analysis of communication networks: Call centres, traffic and performance, D. R. McDonald and S. R. E. Turner (eds.), Fields Institute Communications 28 31-48.
Talluri, K., and G. J. Van Ryzin. 2004. Theory and Practice of Revenue Management, Kluwer Academic Publishers, Norwell, MA.
Van Mieghem, J. A. 1998. Investment strategies for flexible resources, Management Science 44 1071-1078.
Xu, S. H., R. Righter, and J. G. Shanthikumar. 1992. Optimal dynamic assignment of customers to heterogeneous servers in parallel, Operations Research 40 1126-1138.
Lautenbacher, C. J., and S. Stidham. 1999. The underlying Markov decision process in the single-leg airline yield-management problem, Transportation Science 33 133-146.


Appendix

Proof of Lemma 2. In this proof, we use the index r = j to represent the special-purpose resource type that can only perform type j jobs, j = 1, 2, …, m, and the index r = m+1 to represent the versatile resource type that can perform all job types.

Part (1): We use induction on t to prove these results. For t = 1,
\[
\Delta_1(n) = v_1(n + e_{m+1}) - v_1(n) =
\begin{cases}
\sum_{j} p_j \pi_j - \sum_{j: n_j > 0} p_j \pi_j = \sum_{j: n_j = 0} p_j \pi_j & \text{if } n_{m+1} = 0, \\
0 & \text{if } n_{m+1} > 0.
\end{cases}
\tag{A.1}
\]
Clearly, $\Delta_1(n) - \Delta_1(n + e_j) = p_j \pi_j$ if $n_j = 0$ and $n_{m+1} = 0$, and 0 otherwise. That is, $\Delta_1(n) - \Delta_1(n + e_j) \ge 0$, for r = j = 1, 2, …, m. For the versatile resource type r = m+1, because $\Delta_1(n + e_{m+1}) = 0$, we have $\Delta_1(n) - \Delta_1(n + e_{m+1}) = \Delta_1(n) \ge 0$. Therefore, part (1) holds for t = 1, with r = 1, 2, …, m+1. Now, suppose part (1) holds when there are fewer than t periods remaining. Conditioning on the demand in period t ≥ 2, we have:
\[
\Delta_t(n) = v_t(n + e_{m+1}) - v_t(n) =
\begin{cases}
\sum_{j: n_j > 0} p_j \Delta_{t-1}(n - e_j) + \sum_{j: n_j = 0} p_j \big( \max\{\pi_j + v_{t-1}(n),\ v_{t-1}(n + e_{m+1})\} - v_{t-1}(n) \big) + p_0 \Delta_{t-1}(n) & \text{if } n_{m+1} = 0, \\[4pt]
\sum_{j: n_j > 0} p_j \Delta_{t-1}(n - e_j) + \sum_{j: n_j = 0} p_j \big( \max\{\pi_j + v_{t-1}(n),\ v_{t-1}(n + e_{m+1})\} - \max\{\pi_j + v_{t-1}(n - e_{m+1}),\ v_{t-1}(n)\} \big) + p_0 \Delta_{t-1}(n) & \text{if } n_{m+1} > 0.
\end{cases}
\]

Note that, for j = 1, if $n_j = 0$, then $\max\{\pi_1 + v_{t-1}(n),\ v_{t-1}(n + e_{m+1})\} = \pi_1 + v_{t-1}(n)$ and $\max\{\pi_1 + v_{t-1}(n - e_{m+1}),\ v_{t-1}(n)\} = \pi_1 + v_{t-1}(n - e_{m+1})$. Since
\[
\max\{\pi_j + v_{t-1}(n),\ v_{t-1}(n + e_{m+1})\} - v_{t-1}(n) = \max\{\pi_j,\ \Delta_{t-1}(n)\},
\]
and
\[
\max\{\pi_j + v_{t-1}(n),\ v_{t-1}(n + e_{m+1})\} - \max\{\pi_j + v_{t-1}(n - e_{m+1}),\ v_{t-1}(n)\}
= \max\{\pi_j,\ \Delta_{t-1}(n)\} - \max\{\pi_j - \Delta_{t-1}(n - e_{m+1}),\ 0\}
= \max\{\pi_j,\ \Delta_{t-1}(n)\} + \min\{\Delta_{t-1}(n - e_{m+1}) - \pi_j,\ 0\},
\]
we can write $\Delta_t(n)$, t ≥ 2, as:
\[
\Delta_t(n) = \sum_{j: n_j > 0} p_j \Delta_{t-1}(n - e_j) + \sum_{j: n_j = 0} p_j \max\{\pi_j,\ \Delta_{t-1}(n)\} + p_0 \Delta_{t-1}(n) \quad \text{if } n_{m+1} = 0, \tag{A.2}
\]
\[
\Delta_t(n) = \sum_{j: n_j > 0} p_j \Delta_{t-1}(n - e_j) + \sum_{j: n_j = 0} p_j \big( \max\{\pi_j,\ \Delta_{t-1}(n)\} + \min\{\Delta_{t-1}(n - e_{m+1}) - \pi_j,\ 0\} \big) + p_0 \Delta_{t-1}(n) \quad \text{if } n_{m+1} > 0. \tag{A.3}
\]
To prove $\Delta_t(n)$ is non-increasing in $n_j$, for j = 1, 2, …, m, and t ≥ 2, note that our hypothesis for part (1) ensures that both the terms $\max\{\pi_j, \Delta_{t-1}(n)\}$ and $\min\{\Delta_{t-1}(n - e_{m+1}) - \pi_j, 0\}$ are non-increasing in $n_j$, which ensures that both (A.2) and (A.3) are non-increasing in $n_j$ for fixed $n_{m+1}$. (Again, for j = 1, if $n_j = 0$, then $\max\{\pi_1, \Delta_{t-1}(n)\} = \pi_1$ and $\min\{\Delta_{t-1}(n - e_{m+1}) - \pi_1, 0\} = \Delta_{t-1}(n - e_{m+1}) - \pi_1$, and both terms are non-increasing in $n_1$.) To prove that $\Delta_t(n)$ is decreasing in $n_{m+1}$, for t ≥ 2, we consider two cases.


Case 1. $n_{m+1} = 0$: In this case, the difference between (A.2) and (A.3) is:
\[
\Delta_t(n) - \Delta_t(n + e_{m+1}) = \sum_{j: n_j > 0} p_j \big[ \Delta_{t-1}(n - e_j) - \Delta_{t-1}(n - e_j + e_{m+1}) \big]
+ \sum_{j: n_j = 0} p_j \big[ \max\{\pi_j, \Delta_{t-1}(n)\} - \max\{\pi_j, \Delta_{t-1}(n + e_{m+1})\} - \min\{\Delta_{t-1}(n) - \pi_j, 0\} \big]
+ p_0 \big[ \Delta_{t-1}(n) - \Delta_{t-1}(n + e_{m+1}) \big].
\]
Using our hypothesis for part (1) with r = m+1, which states that $\Delta_{t-1}(n)$ is decreasing at $n_{m+1} = 0$, we see that the value within each square bracket in the above expression is nonnegative.

Case 2. $n_{m+1} > 0$: The first and third terms on the RHS of (A.3) are non-increasing in $n_{m+1}$, by our hypothesis for part (1) with r = m+1. Further, the same hypothesis also ensures that both the terms $\max\{\pi_j, \Delta_{t-1}(n)\}$ and $\min\{\Delta_{t-1}(n - e_{m+1}) - \pi_j, 0\}$ are non-increasing in $n_{m+1}$, which implies that the second term in (A.3) is also non-increasing in $n_{m+1}$. This completes the proof for part (1).

Part (2): We again consider two cases.

Case 1. $n_{m+1} = 0$: In this case, the difference between (A.2) with t = 2 and (A.1) with t = 1 is:
\[
\Delta_2(n) - \Delta_1(n) = \sum_{j: n_j > 0} p_j \big[ \Delta_1(n - e_j) - \Delta_1(n) \big] + \sum_{j: n_j = 0} p_j \big[ \max\{\pi_j, \Delta_1(n)\} - \Delta_1(n) \big].
\]
We can easily see that both terms in the above summation are non-negative. For t ≥ 2, $\Delta_t(n)$ is defined by (A.2) for $n_{m+1} = 0$. Now, using our hypothesis for part (2), each term in (A.2) is non-decreasing in t.

Case 2. $n_{m+1} > 0$: In this case, $\Delta_2(n) - \Delta_1(n) \ge 0$ holds trivially since $\Delta_2(n) \ge 0$ and $\Delta_1(n) = 0$. For t ≥ 2, $\Delta_t(n)$ is defined by (A.3) for $n_{m+1} > 0$. Now, using our hypothesis for part (2), we notice that each term in (A.3) is non-decreasing in t. This proves part (2).

Part (3): From the proof of part (1) with t = 1, we have $\Delta_1(n + e_j) \ge \Delta_1(n + e_{m+1}) = 0$. Therefore the statement holds trivially for t = 1. Assuming part (3) is true when there are fewer than t periods remaining, we now show that it holds when there are t periods remaining. We consider four cases.

Case 1. $n_{m+1} = 0$ and $n_j = 0$: Using (A.2) for $\Delta_t(n + e_j)$ and (A.3) for $\Delta_t(n + e_{m+1})$, we obtain:
\[
\begin{aligned}
\Delta_t(n + e_j) - \Delta_t(n + e_{m+1}) ={}& \sum_{l: n_l > 0} p_l \big[ \Delta_{t-1}(n - e_l + e_j) - \Delta_{t-1}(n - e_l + e_{m+1}) \big] \\
&+ p_j \big[ \Delta_{t-1}(n) - \max\{\pi_j, \Delta_{t-1}(n + e_{m+1})\} - \min\{\Delta_{t-1}(n) - \pi_j, 0\} \big] \\
&+ \sum_{l \ne j: n_l = 0} p_l \big[ \max\{\pi_l, \Delta_{t-1}(n + e_j)\} - \max\{\pi_l, \Delta_{t-1}(n + e_{m+1})\} - \min\{\Delta_{t-1}(n) - \pi_l, 0\} \big] \\
&+ p_0 \big[ \Delta_{t-1}(n + e_j) - \Delta_{t-1}(n + e_{m+1}) \big].
\end{aligned}
\tag{A.4}
\]
The first, third, and fourth terms in the above expression are non-negative due to our hypothesis for part (3). The second term can be re-written as:
\[
\Delta_{t-1}(n) - \max\{\pi_j, \Delta_{t-1}(n + e_{m+1})\} - \min\{\Delta_{t-1}(n) - \pi_j, 0\} = \max\{\pi_j, \Delta_{t-1}(n)\} - \max\{\pi_j, \Delta_{t-1}(n + e_{m+1})\},
\]
which is non-negative due to part (1) for r = m+1.


Case 2. $n_{m+1} = 0$ and $n_j > 0$: The difference function $\Delta_t(n + e_j) - \Delta_t(n + e_{m+1})$ can be expressed similarly as in (A.4), except that the second term becomes $p_j \big[ \Delta_{t-1}(n) - \Delta_{t-1}(n - e_j + e_{m+1}) \big]$, which is non-negative by our hypothesis for part (3).

Case 3. $n_{m+1} > 0$ and $n_j = 0$: From (A.2) and (A.3) we obtain:
\[
\begin{aligned}
\Delta_t(n + e_j) - \Delta_t(n + e_{m+1}) ={}& \sum_{l: n_l > 0} p_l \big[ \Delta_{t-1}(n - e_l + e_j) - \Delta_{t-1}(n - e_l + e_{m+1}) \big] \\
&+ p_j \big[ \Delta_{t-1}(n) - \max\{\pi_j, \Delta_{t-1}(n + e_{m+1})\} - \min\{\Delta_{t-1}(n) - \pi_j, 0\} \big] \\
&+ \sum_{l \ne j: n_l = 0} p_l \big[ \max\{\pi_l, \Delta_{t-1}(n + e_j)\} + \min\{\Delta_{t-1}(n + e_j - e_{m+1}) - \pi_l, 0\} \\
&\qquad\qquad\quad - \max\{\pi_l, \Delta_{t-1}(n + e_{m+1})\} - \min\{\Delta_{t-1}(n) - \pi_l, 0\} \big] \\
&+ p_0 \big[ \Delta_{t-1}(n + e_j) - \Delta_{t-1}(n + e_{m+1}) \big].
\end{aligned}
\tag{A.5}
\]
As we discussed for (A.4), the first, second and fourth terms of the above expression are non-negative. The third term is also non-negative because
\[
\Delta_{t-1}(n + e_j) \ge \Delta_{t-1}(n + e_{m+1}) \ \Rightarrow\ \max\{\pi_l, \Delta_{t-1}(n + e_j)\} \ge \max\{\pi_l, \Delta_{t-1}(n + e_{m+1})\},
\]
\[
\Delta_{t-1}(n + e_j - e_{m+1}) \ge \Delta_{t-1}(n) \ \Rightarrow\ \min\{\Delta_{t-1}(n + e_j - e_{m+1}) - \pi_l, 0\} \ge \min\{\Delta_{t-1}(n) - \pi_l, 0\}.
\]
Therefore, (A.5) is non-negative.

Case 4. $n_{m+1} > 0$ and $n_j > 0$: The difference function $\Delta_t(n + e_j) - \Delta_t(n + e_{m+1})$ is the same as (A.5), except that the second term becomes $p_j \big[ \Delta_{t-1}(n) - \Delta_{t-1}(n - e_j + e_{m+1}) \big]$, which is non-negative by our hypothesis for part (3). ♦
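The properties argued in this proof can be checked numerically on a small Star instance. The sketch below is only an illustration under stated assumptions: the three parts of Lemma 2 are taken as they are read from the proof above (Δt(n) non-increasing in every resource count, non-decreasing in t, and Δt(n + ej) ≥ Δt(n + em+1)), the value function uses the same conditioning as in (A.2) and (A.3) (a specialized resource is used whenever one is available), and the reward and probability values are hypothetical. The names `make_delta` and `check_lemma2` are not from the paper.

```python
from functools import lru_cache
from itertools import product


def make_delta(rewards, probs):
    """Return delta(t, n) = v_t(n + e_{m+1}) - v_t(n) for a Star structure.

    rewards[j] and probs[j] are the reward and per-period arrival probability
    of job type j (at most one arrival per period). n is a tuple whose first m
    entries count specialized resources and whose last entry counts versatile
    resources.
    """
    m = len(rewards)
    p0 = 1.0 - sum(probs)  # probability of no arrival in a period

    @lru_cache(maxsize=None)
    def v(t, n):
        if t == 0:
            return 0.0
        total = p0 * v(t - 1, n)
        for j in range(m):
            if n[j] > 0:  # a specialized resource is available: use it
                after = n[:j] + (n[j] - 1,) + n[j + 1:]
                value = rewards[j] + v(t - 1, after)
            elif n[m] > 0:  # accept with a versatile resource, or reject
                after = n[:m] + (n[m] - 1,)
                value = max(rewards[j] + v(t - 1, after), v(t - 1, n))
            else:  # no capable resource: the job is lost
                value = v(t - 1, n)
            total += probs[j] * value
        return total

    def delta(t, n):
        return v(t, n[:m] + (n[m] + 1,)) - v(t, n)

    return delta


def check_lemma2(delta, m, horizon, max_units, tol=1e-9):
    """Check parts (1)-(3) on all states with at most max_units per type."""
    for t in range(1, horizon + 1):
        for n in product(range(max_units + 1), repeat=m + 1):
            for r in range(m + 1):  # part (1): non-increasing in every n_r
                bumped = n[:r] + (n[r] + 1,) + n[r + 1:]
                if delta(t, bumped) > delta(t, n) + tol:
                    return False
            if t > 1 and delta(t, n) < delta(t - 1, n) - tol:  # part (2)
                return False
            plus_versatile = n[:m] + (n[m] + 1,)
            for j in range(m):  # part (3)
                plus_special = n[:j] + (n[j] + 1,) + n[j + 1:]
                if delta(t, plus_special) < delta(t, plus_versatile) - tol:
                    return False
    return True


# Hypothetical instance: two job types, rewards 10 and 4.
delta = make_delta((10.0, 4.0), (0.3, 0.4))
print(delta(1, (0, 0, 0)), check_lemma2(delta, m=2, horizon=4, max_units=2))
```

A numerical check of this kind does not replace the induction argument; it only guards against transcription errors in the recursions.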

Proof of Proposition 3. Recall that $F_t^j(n_s) = \min\{ n_{m+1} : \Delta_{t-1}(n - e_{m+1}) \le \pi_j \}$ for any state n with $n_j = 0$, j = 2, 3, …, m, and t = 1, 2, …, T. Since, by parts (1) and (2) of Lemma 2, $\Delta_{t-1}(n - e_{m+1})$ is non-increasing in $n_k$, k ≠ j, and non-decreasing in t, parts (1) and (2) of Proposition 3 follow immediately. Also, by part (3) of Lemma 2, if $\pi_j \ge \Delta_t(n + e_k)$ for $n_j = 0$ and k ≠ j, then $\pi_j \ge \Delta_t(n + e_{m+1})$, which implies part (3) of Proposition 3. ♦
