Optimal Processing Policies for an e-Commerce Web Server Yong Tan • Vijay S. Mookerjee • Kamran Moinzadeh University of Washington Business School, Box 353200, Seattle, Washington 98195-3200, USA School of Management, University of Texas at Dallas, JO4.624, Richardson, Texas 75803-0688, USA University of Washington Business School, Box 353200, Seattle, Washington 98195-3200, USA
[email protected] •
[email protected] •
[email protected]
T
he explosive growth in online shopping has provided online retailers impressive opportunities to extend revenue and profit. However, retailers may lose considerable online business from slow response times at electronic shopping sites. Although increasing server capacity may improve response time, the resources needed to do so are clearly not free. In this study, we propose a scheme that can improve a server’s performance with no more than its current capacity. This scheme is based on priority processing, where the priority assigned to a customer depends on the potential revenue that is likely to be generated from the customer. The results for single-period analysis show that the benefit from priority processing increases as the server becomes more heavily utilized. We have also modeled a multiperiod version of the problem, where the demand in a period depends on the quality of service (QoS) that buyers receive in the previous period. In the multiperiod problem, both the server capacity and the processing policy in each period are determined optimally. Here, it may be optimal for the retailer to sacrifice profit and increase QoS in initial periods in order to increase demand (and revenue) in later periods. Thus, the processing policy of the server evolves from an emphasis on QoS in initial periods, to one on profit in subsequent periods. (Electronic Commerce; Priority Processing; Queues; Reneging )
1.
Introduction
The last few years have observed an explosive growth in online business. This growth is likely to be sustained since the number of Internet users around the world continues to grow. The Computer Industry Almanac (2002) has reported that by the year 2004, 945 million people around the world will have Internet access, that is, 150 per 1000 people worldwide, and 232 people per 1000 by year’s end 2007. More significantly, Abdelmessih et al. (2001) have estimated that online retailing will grow to $168 billion by year 2005. While the above predictions may have become less credible due to the recent economic downturn, it is generally believed that retailers seeking to benefit from the huge online market must design the online experience with a view to establish long-term 0899-1499/03/0000/001 1526-5528 electronic ISSN
relationships with online buyers. This is because meeting and exceeding the expectations of buyers has an effect far beyond the immediate sale; a positive online experience with a specific retailer could go a long way toward securing a future stream of revenues. In this study, we consider the problem of congestion in e-commerce servers. Our concern is specifically with slow response time. A significant statistic here is one provided by Forrester Research Inc. (Green 1999): 66% of all electronic shopping carts are abandoned before a sale is completed. While the abandoning of a cart may occur for reasons that are not related to response time, it is quite clear that slow response time is indeed a significant cause (Mardesich 1999). Jupiter Communications (Green 1999) conducted a INFORMS Journal on Computing © 2003 INFORMS Vol. 00, No. 0, Xxxxx 2003, pp. 1–14
TAN, MOOKERJEE, AND MOINZADEH Optimal Processing Policies for an e-Commerce Web Server
detailed survey using 1000 online shoppers to study congestion problems in e-commerce servers. The survey revealed that 28% of the respondents found sites too slow and left, never to return. An obvious solution to the above congestion problem is to increase server capacity. An interesting case in point here is the e-merchant “800.com,” that sells consumer electronics (Green 1999). After its launch in October 1998, the company tried to launch a promotion that offered three movies or three CDs for $1. The promotion worked, perhaps too well. On November 27, 1998 the company’s site was swamped and virtually shut down as hundreds of thousands of customers flocked to the site to obtain cheap movies and CDs. Based on this experience, 800.com now relies on 50 rather than 16 main servers to handle site traffic! However, while adding more servers should clearly alleviate congestion problems, there is no easy formula to determine the optimal number of computers. Furthermore, adding capacity is obviously not free and should be done only if the additional capacity can be economically justified. 1.1. Problem Description Figure 1 depicts the high-level structure and function of a typical e-commerce website. The most important aspect of Figure 1 is the separation between pre-checkout and checkout activities. The web server (which in reality could be a collection of individual computers) handles front-end tasks and is dedicated for pre-checkout activities, such as browsing, comparing products, searching, and placing items in the shopping cart. On the other hand, the transaction server handles checkout activities such as credit-card processing, order confirmation, order tracking, etc. Web/Search Server
Transaction Server Purchase
Request
Search Only Early Quit Figure 1
2
Simplified e-Commerce Site Structure
While both pre-checkout and checkout processing are clearly important to customer satisfaction, our focus in this study is on pre-checkout processing. During the pre-checkout phase, customers should be more sensitive to response time and could quit if the response time is not acceptable. On the other hand, shoppers are less likely to quit during the checkout phase because they have invested more in the shopping process. For example, customers are less likely to quit after they have entered sensitive information such as a credit-card number or have provided personal information such as a shipping address or a home telephone number. We therefore consider the web server to be more of a service bottleneck (at least as far as loss of immediate revenue is concerned) and develop a model to optimize its performance. This paper advances an approach to manage the web sever of an e-commerce site using a priority processing scheme. The following are the four main elements of the priority scheme: 1. Customers enter the shopping site and are associated with an expected predicted purchase value h. 2. The purchase value h is mapped to a priority weight gh. 3. Each customer is assigned a time slice based on the priority weight gh. 4. The priority weight gh remains constant during the pre-checkout session. The pre-checkout session is the interval between the time at which the customer enters the site and the time at which the pre-checkout phase ends. The pre-checkout phase can end in one of three ways: the customer can proceed to checkout, the customer may decide not to purchase and leave the system, or the customer may renege because of poor response time. The goal of this paper is to derive the optimal weights gh to maximize performance. We consider two performance measures: profit (or revenue) per unit time and percentage of buyers lost. While profit is an obvious performance measure, we also consider the percentage of buyers lost as a measure of quality of service (QoS). The QoS achieved captures longterm effects of a priority scheme, i.e., beyond the immediate loss of revenue as captured by the profit measure. QoS can be considered analogous to throughput level, a measure commonly used to measure the performance of a computer network.
INFORMS Journal on Computing/Vol. 00, No. 0, Xxxxx 2003
TAN, MOOKERJEE, AND MOINZADEH Optimal Processing Policies for an e-Commerce Web Server
1.2. Contributions and Results Our analysis in this study shows that when profit is the goal of the processing policy, the amount of processing power assigned to a customer depends on two considerations: (a) the potential revenue that would be generated if it was successfully executed, and (b) the probability that the potential purchase will be lost because the customer’s delay tolerance is exceeded. On the other hand, when QoS is the goal, the amount of processing power assigned to a customer depends only on the probability that the customer will be lost, i.e., more processing power is allocated to customers that are more likely to be lost. We also study a multiperiod version of the servercongestion problem, where both the server capacity and the processing policy in each period are optimally determined. In the multiperiod version, we consider QoS externalities; demand in a given period depends on the QoS achieved in the previous period. The multiperiod problem is solved using dynamic programming. The multiperiod study provides guidelines on how to adjust the order-processing policy over time. For example, the analysis reveals that e-retailers should initially pay more attention to QoS. However, as the customer base of an e-retailer matures, a profitoriented order processing policy should be pursued. We also provide a variety of analytical and numerical results that prescribe how the processing power of a bottleneck server could be best utilized. The rest of the paper is organized as follows. In §2 we review related research. Sections 3 and 4 are dedicated to the analysis of optimal server operating policies. A single-period model is presented in §3 whereas §4 extends the single-period model to a multiperiod one. Numerical results are presented and practical implications are discussed. Section 5 provides a summary of results and offers directions for future research.
2.
Related Work
Our study draws from three research areas: timeshared systems, reneging in queuing theory, and differentiated (or prioritized) services. The concept of time sharing has been widely adopted in scheduling computer processing (Kleinrock 1964, 1967, 1976;
INFORMS Journal on Computing/Vol. 00, No. 0, Xxxxx 2003
Stallings 1997). In time sharing, processing power can be divided in two ways. The first method, often called “round robin,” is a true time-sharing policy. Each process gets a slice (quantum) of time during which the full capacity of the processor is dedicated to the process. The second method is to divide processing power on a full-time but part-capacity basis. This method, referred to as “Processor-Sharing,” represents the continuous limit of time sharing where the time slice approaches zero. Kleinrock (1964, 1967) obtained the expected response time (time spent in the system, waiting time plus required processing time) for exponentially distributed arrival and departure (time-shared M/M/1). This result was later found to hold for M/G/1 processor-sharing systems (Sakata et al. 1971, O’Donovan 1974). In time-shared queues, the waiting time is proportional to the processing time attained; suggesting that such systems are more “fair” than sequential queues since the waiting time in sequential systems does not depend on the actual amount of processing incurred on a job. Kleinrock (1976) recommended time-shared scheduling for interactive environments, such as a web server, which is the focus of this study. The second stream of related research deals with customer reneging in a queuing system. Reneging (and balking) has been studied in the past 40 years and is still an active topic within the queuing area. Barrer (1957a, b) was the first to investigate the impacts of customer impatience on queues. It was assumed that customers would leave the system after either a constant or an exponentially distributed random time. Ancker and Gafarian (1962, 1963a, b) provided more systematic studies on M/M/1 systems with an exponentially distributed reneging time. Rao (1968) extended the results to reneging in M/G/1 systems. Later studies provided extensions to more general queuing systems, for example a GI/G/1 model (Stanford 1979). Recently, Whitt (1999) studied a reneging problem in which customers are informed about anticipated delays. Finally, we draw upon the area of priority processing or differentiated services. Kleinrock (1967) introduced the notion of priority processing for computer processors, which is adopted in this study. He derived the expected waiting time for different priority
3
TAN, MOOKERJEE, AND MOINZADEH Optimal Processing Policies for an e-Commerce Web Server
classes with exponentially distributed processing time. O’Donovan (1974) showed that the result holds for M/G/1 processor-sharing systems. In computer networking, the vision of providing differentiated services has been around for well over a decade (Turner 1986). Future packet networks will likely support a full array of functionalities using differentiated services (Aiello et al. 2000). In addition to research on how differentiated services may be implemented, other studies have focused on economic issues; for example, pricing for providing differentiated services in communication networks as well as in other settings (Marchand 1974, Gupta et al. 1996) has been studied.
3.
The Model
In this section, we set up the basic, single-period model. The e-retailer acquires capacity for a given level of demand and implements a priority scheme for order processing for the period under consideration. In §4, we introduce a multiperiod problem where both the processing capacity and the priority scheme can change from one period to another. 3.1. Assumptions and Preliminaries There are four assumptions that we make in this model. These assumptions are: (1) static purchase value, (2) Poisson arrival and exponential processing time, (3) value-dependent delay tolerance, and (4) round-robin processing. These assumptions are discussed below. 3.1.1. Static Purchase Value. We assume that the predicted purchase value associated with each customer remains unchanged throughout the course of shopping. One way to predict purchase value is to use the customer’s historical data, such as an average value (or a moving average). After each purchase, the data can be modified and a new average value can be calculated for use for the next visit. For a first-time buyer, a typical value, such as the market average for all new buyers can be used. The predicted purchase value of a customer (denoted by h) follows a value distribution, f h. In a more sophisticated model, the predicted purchase value can vary “dynamically,” for example, by
4
using information about the value of items in the shopping cart. While a dynamic approach could offer more accurate value prediction, it would come at the cost of extra computational overhead. Furthermore, dynamically updating the purchase value could lead to strategic behavior on the part of the customers; e.g., customers could put high-value items in the cart to obtain higher priority and drop these items just before checkout. For the above reasons, we have chosen to use a static purchase value in the model. 3.1.2. Poisson Arrival and Exponential Processing Time. The total arrival rate of customers is per unit time. Moe and Fader (2001) show empirically, using clickstream data, that the Poisson process best describes customers’ arrival to a website. Each customer requires a certain amount of processing time to complete pre-checkout activities. Unless otherwise stated, we will consider only the pre-checkout phase in the rest of this paper. We assume that the processing time required for a customer with value h is exponentially distributed with an average time h, which is a function of the value. Therefore, the overall processing-time distribution for all customers is described by the hyper-exponential distribution, or a linear combination of exponential distributions with different means. Exponential processing time is a good approximation for time-shared systems where it has been shown (Sakata et al. 1971, O’Donovan 1974, Burman 1981) that, given Poisson arrivals, several important results (such as mean delays) depend only on the mean processing time and not on the distribution. 3.1.3. Value-Dependent Delay Tolerance. Impatient customers may quit shopping if the time spent in processing exceeds their tolerance for waiting. We assume the tolerance level of a customer to be dependent on the purchase value; specifically, customers with purchase value h will be willing to wait for a random time that is exponentially distributed (Ancker and Gafarian 1962) with a mean wh. The impatience of an on-line shopper can be better understood using an analogy from the “express lane” in a traditional grocery store. Typically, store express lanes provide faster service to customers with relatively fewer items. The implicit assumption here is
INFORMS Journal on Computing/Vol. 00, No. 0, Xxxxx 2003
TAN, MOOKERJEE, AND MOINZADEH Optimal Processing Policies for an e-Commerce Web Server
that customers with fewer items may be less tolerant of delays. We note that the model in this paper does not depend on whether delay tolerance increases or decreases with the purchase value. 3.1.4. Round-Robin Processing. Similar to scheduling a computer processor, server capacity is shared by customers in the server. The sharing scheme in a typical e-commerce server is based upon round-robin time sharing. Specifically, a request generated by a customer is given a slice of processing time, say Q, when it enters the processing unit of the server. It exits from the system if it finishes the desired processing during this allotted time. Otherwise, it goes back to the end of the queue and waits for its next turn. Round-robin scheduling is better because sequential processing may waste server capacity while waiting for a client’s response. One major advantage of round-robin processing in the context of e-commerce servers is that it permits interactive use of the server by many users simultaneously (Kleinrock 1976). During the pre-checkout phase, it is not unusual that customers generate frequent requests for small computational demands that require rapid response. Due to these reasons and the fact that prominent commercial e-commerce server software (Apostolopoulos 1999) uses round-robin processing, our model is also based on a round-robin processing scheme. 3.2. Priority Processing Scheme A priority processing scheme can be achieved by assigning a weight gk gk ≥ 0 for customers in priority class k so they receive processing time gk Q. Such a scheme was first proposed by Kleinrock (1967). In the e-commerce context, the priority assigned to a customer would be determined by the predicted purchase value. In this study, we limit ourselves to the case where there are an infinite number of classes. With infinitely many classes, the processing weight gk becomes continuous and is a function gh of h. Offering discrete classes may be considered if the delay tolerance wh and the processing requirements h are insensitive to h for certain ranges of h. While a discrete priority scheme may offer some implementation advantages, deploying a continuous priority scheme is not significantly more difficult; all one has to do
INFORMS Journal on Computing/Vol. 00, No. 0, Xxxxx 2003
is compute the priority weight gh for a given value of h. A model for discrete classes has been discussed elsewhere (Tan 2000). The effect of reneging in a priority-based timesharing system can be summarized in the following proposition. Proposition 1. The loss function density (defined as the number of customers lost per unit time per unit value), lh, for customers with value h, is lh =
hgh + 1 f h hgh + 1 + whgh
where gh must satisfy h∈H
whhgh + 1gh f h dh = 1 hgh + 1 + whgh
(1)
and H is the set of all possible values. The proof of Proposition 1 is presented in the Online Supplement to this paper on the journal’s website, where the proofs for all the propositions, corollaries, and lemmas can be found. The result of this proposition will be used later to calculate the total lost revenue, and consequently the optimal processing weight gh will be determined. The effect of increasing server capacity is to reduce the required processing time: h =
0 h C
(2)
where 0 h is the processing time for a standard unit of capacity, and C is the server capacity. A more powerful server has a larger value of C, and therefore requests from customers can be processed faster. There is a cost associated with acquiring capacity. We assume a linear cost, C, where is the unit cost (normalized to per unit time). The linear form can be justified as in most cases capacity can be additive, for example, if more servers are added. We ignore the fixed cost, as it does not affect our results. In the next section, we introduce a multiperiod version of this model where capacity choices can be made at the beginning of each period. In this section, however, we have assumed that capacity can be chosen only once, at the beginning of the period.
5
TAN, MOOKERJEE, AND MOINZADEH Optimal Processing Policies for an e-Commerce Web Server
3.3. Profit-Focused Policy The total expected revenue (per unit time) is S= hf h − lh dh
(3)
h∈H
A retailer’s objective is to choose capacity and processing weight to maximize expected profit per unit time. Explicitly, we define an e-retailer’s problem (Problem 1):
to solving for and C ∗ . It is quite straightforward to find numerical values of and C ∗ . This involves two well-behaved coupled equations. A small number of iterations will yield very accurate results. Corollary 1. The optimal capacity C ∗ satisfies ≤ C ∗ ≤ S ∗
S ∗ is the revenue when the optimal policy is adopted. The first inequality indicates that a customer max gh C will receive processing if the potential purchase can gh C recover the cost of capacity (necessary condition). The whCghh system can, however, afford to admit some customers = f h dh − C 0 h∈H hgh + 1 + whCgh with values that are below the cost of capacity. The second inequality guarantees a nonnegative profit; repsubject to resenting the individual rationality condition for the wh 0 hgh + 1gh retailer. 1= f h dh 0 h∈H hgh + 1 + whCgh In Corollary 2 we provide two results concerning the optimal priority weights, gh: gh ≥ 0 ∀h ∈ H (4) Note that (4) is identical to (1) with capacity C explicitly expressed. The solution to the above profitmaximizing problem is: Proposition 2. The optimal processing weight allocation is 1 h wh 1/2 1 − 1 1 + − 1 1 + h h 1+wh/h ∗ S g h = h∈H
0 h H S
where is the Lagrangian multiplier associated with (4), and the serviceable set H S is defined as
h H S = h ≥ h ∈ H h The optimal capacity is C∗ =
whhg ∗ h + 12 g ∗ h f h dh h∈H hg ∗ h + 1 + whg ∗ h
Proposition 2 indicates that there is a revenuerealization threshold defined by h/h ≥ ; only customers with revenue realization above this threshold receive processing capacity. The revenue-realization threshold is the rate at which revenue is realized. The value of can be obtained by substituting the expressions of g ∗ h and C ∗ in (4). The problem is reduced
6
Corollary 2. For the optimal processing-weight allocation gh, g ∗ h i. > 0; h/h g ∗ h ii. < 0 wh/h The first result can be interpreted to mean that customers with a higher rate of revenue realization (i.e., the h/h ratio) receive more processing time. The second result relates the priority weights to customers’ patience levels (i.e., the ratio wh/h): customers with a higher value of this ratio (more patient buyers) can tolerate more delay and hence receive less processing time. 3.3.1. Numerical Illustration. Figure 2 shows the optimal processing weight g ∗ h for linear mean processing time and waiting tolerance: h = 07 + 03h and wh = 1 + 05h. The value distribution is uniform, namely, f h = 1 for h ∈ 0 1. Unless otherwise stated, we use these forms of h, wh, and f h for numerical calculations. As can be seen from Figure 2, g ∗ h increases with h since the ratio h/h is an increasing function of h, while wh/h decreases with h. However, in Figure 3, with nonlinear forms of h = 03 + 07h3 and wh = 05 + 08h2 , g ∗ h is not monotonic. It turns out that the ratio h/h
INFORMS Journal on Computing/Vol. 00, No. 0, Xxxxx 2003
TAN, MOOKERJEE, AND MOINZADEH Optimal Processing Policies for an e-Commerce Web Server
0.8
λ =3 λ =7
g * (h )
g * (h )
0.6 0.4
1.2
++
1
−+
0.8
+− −−
0.6 0.4
0.2
0.2 0
0 0
0.2
0.4
0.6
0.8
0
1
0.2
0.4
Figure 2
Optimal Policy for Linear Processing Time and Waiting Tolerance
0.4 0.3
g* (h )
Figure 4
0.2 0.1
1
Optimal Policy for Decreasing or Increasing Linear Processing Time and Waiting Tolerance
3.3.2. Bursty Traffic. Figures 6 and 7 demonstrate the superiority of the priority policy in the presence of traffic fluctuations. Let the priority scheme be determined using a predicted mean demand of = 5. This policy should suffer when the actual mean demand is = 10. In Figure 6, we show the optimal policy (for = 10) and the inaccurate policy (based on = 5). As expected, Figure 6 clearly shows the optimal policy to be more discriminating than the inaccurate policy. However, Figure 7 shows that the inaccurate policy still outperforms a non-discriminating (uniform) policy in most regions of the actual demand. Only when the actual demand is low, over-discrimination hurts revenue. On the other hand the (inaccurate) priority policy is better suited for upswings in demand. 60%
++
50%
−+
40%
+− −−
30% 20% 10% 0%
0 0
0.2
0.4
0.6
0.8
0
1
Optimal Policy for Non-Linear Processing Time and Waiting Tolerance
INFORMS Journal on Computing/Vol. 00, No. 0, Xxxxx 2003
2
4
6
8
10
λ
h Figure 3
0.8
e-retailer realizes the most gains from the priority scheme.
Percent Revenue Increase
peaks around the value h ≈ 06. This example shows the importance of gathering information on customer purchasing behavior (e.g., the delay tolerance); a higher value h alone does not warrant higher priority. We also examined other forms of processing time and waiting tolerance. Specifically, we considered h = 07 + 03h or h = 1 − 03h for processing times, and wh = 1 + 05h or wh = 15 − 05h for waiting tolerance. The functional forms are such that the average (over h) processing times (or waiting tolerances) is the same for the increasing and decreasing forms. We depict four combinations, denoted in Figures 4 and 5 as + +, − +, + −, and − −. The value distribution is uniform between 0 and 1. The case where higher-value customers take less processing time and are more impatient (case − −) is the most discriminating. This is also the case where the
0.6
h
h
Figure 5
Percentage Revenue Increase for Decreasing or Increasing Linear Processing Time and Waiting Tolerance
7
TAN, MOOKERJEE, AND MOINZADEH Optimal Processing Policies for an e-Commerce Web Server
0.3 Inaccurate Optimal
0.2
0.2
0.1
0.1
0.05
0
0 0
0.2
0.4
0.6
0.8
1
0
h Inaccurate Policy Compared to Optimal Policy
Figure 8
3.3.3. Optimal Capacity. In this section we examine the behavior of the optimal capacity with changes in different parameters. When total demand increases, the expected profit can also be increased by adding more capacity. At the same time, the server should become more discriminating by raising the threshold below which orders will be rejected. These results are formally presented in Corollary 3.
Normalized Revenue ( S / λ )
Corollary 3. The following results hold with change in demand, : ∗ i. > 0; ∗ C ii. > 0; iii. > 0 if ∗ > a0 ∗ C C
0.25 Inaccurate
0.2
Uniform 0.15 0.1 0.05 0 0
Figure 7
8
0.4 0.3
0.15
Figure 6
Optimal Capacity Constant Capacity
hc
g * (h )
0.25
0.5
2
4
6
Actual Demand (λ )
8
10
Normalized Revenue for Inaccurate and Uniform Policies
2
4
λ
6
8
10
Value Threshold for Serviceable Customers
where the expression for the coefficient a0 is provided in the proof in the Online Supplement to this paper on the journal’s website. The value threshold hc can be obtained by solving hc = 0 hc C ∗ Figure 8 depicts the value threshold hc as a function of demand, with the same parameter setting as the one used to generate Figure 2. If capacity is fixed, the e-retailer has to drop more low-value customers so that higher-value customers are more likely to complete their shopping. Otherwise, the capacity will be increased optimally with demand, without much increase in the value threshold. Corollary 4. The following results hold with change in capacity cost, : ∗ < 0; i. ∗ C ii. < 0; iii. > 0 C ∗ It is intuitive that when the cost of capacity increases, the expected profit and optimal capacity will decrease. As the cost of capacity increases, the server becomes more discriminating by raising the threshold. 3.4. QoS-Focused Policy In this section, we consider a policy that focuses on QoS. As mentioned earlier, QoS should be interpreted
INFORMS Journal on Computing/Vol. 00, No. 0, Xxxxx 2003
TAN, MOOKERJEE, AND MOINZADEH Optimal Processing Policies for an e-Commerce Web Server
0.4
h∈H
The processing-time threshold c can be found by substituting the above expression into (1). It is obvious from the above equation that customers who require more processing will be assigned less priority. More patient buyers (with higher wh/h ratio) also receive less processing time. The value threshold hc is given by hc = c . Assuming h increases with the value h, customers with value above the threshold hc will not receive any processing capacity. We next compare three policies: profit-focused, QoS-focused, and uniform. The uniform policy is a non-discriminating policy, where gh is a constant. Here, linear forms of h = 07 + 03h and wh = 1 + 05h are used, and the capacity C is set to be 1. Figure 9 clearly shows the difference between profitfocused and QoS-oriented policies. QoS policies favor low-value customers, as orders from such buyers consume less processing time. Figure 10 plots the expected revenue (or profit since the capacity is fixed). The QoS-focused policy is the worst performer because it favors low-value orders (with shorter processing time). In Figure 10 we show that the relative advantage of the profit-focused policy over the other two policies increases at higher levels of demand (higher ). Figure 11 plots the loss ratio of customers, L/, a measure for the QoS. The QoS-focused policy is most effective in the middle range of demand, since L/ converges to 0 as → 0,
INFORMS Journal on Computing/Vol. 00, No. 0, Xxxxx 2003
0.3 0.2 0.1 0 0
0.2
0.4
0.6
0.8
1
h Figure 9
Optimal Processing Weight for Different Policies
0.8 0.6
Revenue (S * )
where lh is the loss-function density, given in Proposition 1. The optimal processing allocation can be obtained as 1 1 + wh/h
c wh 1/2 · 1+ −1 1+ −1 g ∗ h = h h h < c
0 h ≥ c
Profit-focused QoS-focused Uniform
0.5
0.4 Profit-focused QoS-focused Uniform
0.2 0 0
Figure 10
2
4
λ
6
8
10
Optimal Revenue for Profit-Focused, QoS-Oriented, and Uniform Policies
1.0 0.9
L /λ
gh
0.6
g *(h )
as a throughput-level measure, namely, the number of lost customers per unit time (regardless of value). With QoS as the performance objective, the e-retailers’ problem becomes min L ≡ lh dh (5)
0.8 0.7
Profit-focused QoS-focused Uniform
0.6 0.5 0 Figure 11
2
4
λ
6
8
10
QoS for Profit-Focused, QoS-Oriented, and Uniform Policies
9
TAN, MOOKERJEE, AND MOINZADEH Optimal Processing Policies for an e-Commerce Web Server
and to 1 as → , for all policies. For this example, the uniform policy is very close to the QoS-focused policy (Figure 9) and gives a similar performance in the loss ratio.
4.
Multiperiod Model
In §3, we presented various policies that attempt to optimize an e-retailer’s profit in a single period. However, retailers often value more than immediate profit. This consideration is not without merit since lost customers rarely come back. This suggests that during initial periods, the value of orders should be paid less importance to build a solid customer base. In this section, we propose a multiperiod model that includes feedback on the QoS. Specifically, the QoS that customers receive in a period affects the demand in the next period; e-retailers are allowed to increase capacity to match the demand. We add an index j for the period to the notation used in the previous section. We start with the lossfunction density lj h in the jth period. The expression for lj h is the same as the one shown in Proposition 1 with subscripts for the period j wherever necessary. The expected number of buyers lost per unit time in the jth period, Lj , is calculated using (5) and lj h. We model the demand in period j + 1 as follows: j+1 = Lj · pj + j − Lj · rj
(6)
where, pj is the probability that unsatisfied buyers return and rj is the rate of growth in demand due to successful purchases, for example through “word of mouth”. Now (6) can also be expressed in terms of the expected percentage of buyers lost, Lj /j , a measure of QoS. In the multiperiod model, e-retailers are allowed to adjust capacity in each period. Capacity expansion is a well-studied problem in production and inventory settings (Luss 1982, Rao 1976). In our model, we assume that the capacity can be added only at the beginning of each period, and that the cost of acquiring capacity is proportional to the capacity added, namely, j Cj − Cj−1 , where Cj is the capacity and j is the marginal capacity cost in the jth period. Here we do not consider equipment replacement
10
(Nair and Hopp 1992) or scale economies of capacity adjustment (Rajagopalan 1998). Similar to (2), we have j h = j0 h/Cj . Since hardware cost typically decreases with time, the coefficient j is assumed to decrease from period to period. In addition, we include an operational (or maintenance) cost linear in capacity, i.e., m Cj , for each period. Without loss of generality, we assume that demand becomes stationary in the N th period. Then e-retailers have N opportunities to adjust capacity, after which the capacity remains unchanged from the N th period onwards. The discounted profit is max
gh C
j−1 j =
j=1
N
j−1 Sj − j Cj − Cj−1 − m Cj
j=1
+
N S − m CN 1− N
where is the discount factor; and gh and C are the processing weight and capacity vectors. The profit, j = SN , for j ≥ N + 1, because the operating policy and capacity remain unchanged from the N th period onwards. The discounted profit can be represented by a finite-horizon formulation, Nj=1 j−1 j , where N is redefined as N =
S N − m C N − N CN − CN −1 1−
We can solve this multiperiod problem using the method of dynamic programming (Dreyfus and Law 1977), more specifically, backward induction. Let us define j j Cj−1 ≡ max
gl h Cl
N
l−j l l Cl−1 gl h Cl
l=j
where gl h and Cl , for l ≥ j, are the decision variables. l is the demand for the lth period, which can be obtained recursively using the demand-generation model (6). j j Cj−1 is the maximum discounted profit starting from the jth period. It depends on the demand arriving in this period, j , and the capacity carried over, Cj−1 . This allows us to write the recursive relation j j Cj−1 ≡ max j j Cj−1 gj h Cj gj h Cj
+ · j+1 j+1 Cj
(7)
INFORMS Journal on Computing/Vol. 00, No. 0, Xxxxx 2003
TAN, MOOKERJEE, AND MOINZADEH Optimal Processing Policies for an e-Commerce Web Server
where 0 0 is the objective function. We first introduce the following lemma.
1
j Cj−1
= j
gj*(h )/gj*(1)
0.8
Lemma 1. The effect of carry-over capacity on the maximum discounted profit is
0.6
0.2
This result is intuitive since more existing capacity reduces the cost of capacity in the current period, and consequently increases the discounted profit. Now we characterize the optimal policies.
where the serviceable value set HjS is HjS = h j h ≥ j h ∈ Hj , and j+1 h 1 rj − pj + j h j+1 j h
(8)
The optimal capacity is Cj∗ =
j j m + j − j+1 + j ·
h∈Hj
wj hj hgj∗ h + 12 gj∗ h j hgj∗ h + 1 + wj hgj∗ h
fj h dh
where, j , is a Lagrangian multiplier to ensure that capacity is non-decreasing. For periods j ≥ N , the processing weight and capacity are given by Proposition 2 with the effective unit capacity cost, m + 1 − N . As displayed in (8), this policy seems to combine the profit-focused and QoS-oriented policies for the single-period problem; the factor j h is a linear combination of two ratios, h/h and 1/h. The relative weight of this combination is determined by the discount factor , and two parameters rj and pj that determine the demand growth. This controls the evolution of the operating policy from a QoS-focus at
INFORMS Journal on Computing/Vol. 00, No. 0, Xxxxx 2003
0 0
0.2
0.4
0.6
0.8
1
h Normalized Optimal Processing Weights for = 03
Figure 12
the beginning to a profit focus when the demand is steady. In the following, we present some numerical demonstrations. Given the results described in Proposition 3, in each period we numerically evaluate j and Cj∗ , making use of Corollary 5. Corollary 5. The following recursive relation holds: j j
= j
h∈Hj
wj hgj∗ h2 fj h dh + pj
j+1 j+1
The computation procedure is as follows. We start from the N th period where the policy in this period is described by Proposition 2. The recursion in Corollary 5 is then applied in the previous period to obtain the optimal policy in period N − 1, using Proposition 3. Figures 12 and 13 plot the optimal processing allocations for N = 3, rj = 2, pj = 02, j0 h = 07 + 03h, 1.2 1
g *j (h) /g j* (1)
Proposition 3. The optimal processing-weight allocation in the jth period j < N is 1 h/j h 1+w j
j h wj h 1/2 ∗ · 1+ −1 1+ −1 gj h = j j h h ∈ HjS
0 h HjS
j h =
Period 1 Period 2 Period 3+
0.4
0.8 0.6 Period 1 Period 2 Period 3+
0.4 0.2 0 0
0.2
0.4
0.6
0.8
1
h Figure 13
Normalized Optimal Processing Weights for = 07
11
TAN, MOOKERJEE, AND MOINZADEH Optimal Processing Policies for an e-Commerce Web Server
8 6 4 2
π j*
wj h = 1 + 05h, 1 = 04, 2 = 035, 3 = 03, m = 0, and uniform value distribution for 0 ≤ h ≤ 1. The processing allocation is purely profit-focused from the third period onwards. e-retailers with higher are less discriminating, as is evident from a lower value threshold hc . A higher value of the discount factor indicates that the e-retailer is more concerned about the long term. There is heavier capacity investment in the initial period to accommodate more customers. As increases, the policies in the initial periods become less discriminating, or more QoS-oriented, due to the influence of future demand growth. Figure 14 shows the optimal capacity increments in three periods. We assume the capacity remains unchanged from the third period onwards. e-retailers with low values of invest in capacity in the initial period and maintain a roughly constant capacity level, even with the decreasing cost of capacity. They focus on profit making even in the initial periods. This results in poor QoS, and therefore there is no growth in demand. e-retailers with higher invest heavily in the initial period and adopt a QoS-oriented policy to build future demand. The second period sees a small or even zero increase in capacity. Partly, the demand growth has yet to take full effect so the slightly increased or the same capacity can sustain the same level of QoS. Also, the acquiring of capacity can be postponed to the third period that has a lower capacity cost. Recall that the effective unit capacity cost is m + 1 − N . It decreases with discount factor , so we expect the optimal capacity increment in the third period to increase with .
-4 -6 0 Figure 15
0.4
δ
0.6
0.8
1
Optimal Profit in Each Period
Figure 15 shows that the profits in the second, third, and fourth periods increase with the discount factor. This is because of the increased customer base (or demand). In the second period, releasing capacity gives a slightly faster increase of profit with higher . The difference between the third period and periods from the fourth onwards is the cost of capacity. It is apparent that e-retailers with longer horizons (higher ) suffer more in the initial periods from heavy investments in capacity that are aimed at improving QoS. Figures 16 and 17 depict priority polices when the capacity is fixed C = 1; that is, no capacity is acquired after the initial period. When the discount factor is low = 03, the first period is the most important one. Therefore, a more discriminating policy is adopted to make more profit. The remaining periods (from second and on) are roughly identical,
0.8
g j* (h) /g j*(1)
12
∆ Cj*
0.2
1 Period 1 Period 2 Period 3
15
9 6
0.6 Period 1 Period 2 Period 3+
0.4 0.2
3 0
0 0
12
Period 1 Period 2 Period 3 Period 4+
-2
18
Figure 14
0
0.2
0.4
δ
0.6
0.8
Optimal Capacity Increment in Each Period
1
0
0.2
0.4
0.6
0.8
1
h Figure 16
Normalized Optimal Processing Weights for = 03
INFORMS Journal on Computing/Vol. 00, No. 0, Xxxxx 2003
TAN, MOOKERJEE, AND MOINZADEH Optimal Processing Policies for an e-Commerce Web Server
5.
1
g j*(h )/g j*(1)
0.8 0.6 Period 1 Period 2 Period 3+
0.4 0.2 0 0
0.2
0.4
0.6
0.8
1
h Normalized Optimal Processing Weights for = 07
Figure 17
and less discriminating compared to the first one. However, when the discount factor is high = 07, in order to maximize demand for future periods, the second period is less discriminating. Compared to the case where the capacity can be adjusted, the policy here is more “short-sighted.” In other words, it is nondiscriminating only in the second period and not in the first period. How much extra capacity would be needed when using a uniform policy to achieve the profit of a priority scheme with fixed capacity C = 1? It is difficult to match profits period by period. Instead, we try to find a set of capacity increments that must be applied to the uniform policy such that its total discounted profit is equal to that of the priority policy. Figure 18 shows the capacity increment in each period needed for the uniform policy. For most values of , we observe significant increases in the capacity. 2.5 Period 1 Period 2 Period 3
2
The authors thank the Area Editor (Ramayya Krishnan), the Associate Editor, and the two anonymous reviewers for their valuable comments and suggestions. This paper has also benefited greatly from the careful reading and editing of the Editor-in-Chief (David Kelton). Yong Tan acknowledges with gratitude research support in the form of a grant from the Ford Motor Company.
∆ Cj*
1 0.5 0
Figure 18
0.2
0.4
δ
0.6
0.8
We have proposed a processing scheme that improves performance of an e-commerce site. The scheme is based on priority order processing, where the priority of an order depends on the potential value of the order. We presented the model and the results for a single period. It is shown that the benefit from priority processing increases as the server becomes busier, i.e., the demand increases for the same capacity. We have also modeled a multiperiod problem, where the demand in a period depends on QoS that buyers received in the previous period. The retailer usually loses money in the first period in order to provide better service and growth in future demand. The operating policy of the server evolves from QoS-focused in initial periods to a profitfocused in subsequent periods. In multiperiod problem, the server capacity in each period is determined optimally. To implement the policies studied in this paper, it is necessary to examine the modeling assumptions, like using clickstream data. Such data will also enable us to derive functional forms for required service time h and waiting tolerance wh. We have assumed that customer’s value h is accurately assessed. It will be useful to examine the sensitivity of priority processing with respect to errors in estimating customer value. Another interesting extension would be analysis of dynamic order processing policies that allow the priority of an order to vary during the course of shopping, especially in response to a more accurate prediction of the final value and the delay tolerance. Acknowledgments
1.5
0
Conclusions and Future Research
1
Capacity Advantage of Priority Processing
INFORMS Journal on Computing/Vol. 00, No. 0, Xxxxx 2003
References Abdelmessih, N., M. Silverstein, P. Stanger. 2001. The next chapter in business-to-consumer e-commerce: Advantage incumbent. Report, The Boston Consulting Group, Boston, MA.
13
TAN, MOOKERJEE, AND MOINZADEH Optimal Processing Policies for an e-Commerce Web Server
Aiello, W. A., Y. Mansour, S. Rajagopolan, A. Rosen. 2000. Competitive queue policies for differentiated services. Proc. IEEE INFOCOM 2 431–440. Ancker, C. J., A. V. Gafarian. 1962. Queuing with impatient customers who leave at random. J. Indust. Engrg. 13 84–90. Ancker, C. J., A. V. Gafarian. 1963a. Some queuing problems with balking and reneging—I. Oper. Res. 11 88–100. Ancker, C. J., A. V. Gafarian. 1963b. Some queuing problems with balking and reneging—II. Oper. Res. 11 928–937. Apostolopoulos, N., ed. 1999. Professional Site Server 3.0. Wrox Press, Birmingham, UK. Barrer, D. Y. 1957a. Queuing with impatient customers and indifferent clerks. Oper. Res. 5 644–649. Barrer, D. Y. 1957b. Queuing with impatient customers and ordered service. Oper. Res. 5 650–656. Burman, D. 1981. Insensitivity in queueing systems. Adv. Appl. Probab. 13 846–859. Computer Industry Almanac Inc. 2002. Internet users will top 1 billion in 2005. Press release, March 21. http://www.c-i-a.com/ pr032102.htm. Dreyfus, S. E., A. M. Law. 1977. The Art and Theory of Dynamic Programming. Academic Press, New York. Green, H. 1999. The great yuletide shakeout. Bus. Week (November 1) 18–28. Gupta, A., D. O. Stahl, A. B. Whinston. 1996. An economic approach to network computing with priority classes. J. Organ. Comput. Electr. Commerce 6 71–95. Kleinrock, L. 1964. Analysis of a time-shared processor. Naval Res. Logist. Quart. 11 59–73. Kleinrock, L. 1967. Time-shared systems: A theoretical treatment. J. Association Comput. Machinery 14 242–261. Kleinrock, L. 1976. Queueing Systems: Volume 2, Computer Applications. Wiley, New York.
Luss, H. 1982. Operations research and capacity expansion problems: A survey. Oper. Res. 30 907–947. Marchand, M. G. 1974. Priority pricing. Management Sci. 20 1131–1140. Mardesich, J. 1999. The web is no shopper’s paradise. Fortune 149 188–198. Moe, W. W., P. S. Fader. 2001. Capturing evolving visit behavior in clickstream data. Wharton School Working paper, University of Pennsylvania, Philadelphia, PA. Nair, S., W. J. Hopp 1992. A model for equipment replacement due to technological obsolescence. Eur. J. Oper. Res. 63 207–221. O’Donovan, T. M. 1974. Direct solutions of M/G/1 processorsharing models. Oper. Res. 22 1232–1235. Rajagopalan, S. 1998. Capacity expansion and equipment replacement: A unified approach. Oper. Res. 46 846–857. Rao, M. R. 1976. Optimal capacity expansion with inventory. Oper. Res. 24 291–300. Rao, S. S. 1968. Queuing with balking and reneging in M/G/1 systems. Metrika 12 173–188. Sakata, M., S. Noguchi, J. Oizumi. 1971. An analysis of the M/G/1 queue under round-robin scheduling. Oper. Res. 19 371–385. Stallings, W. 1997. Operating Systems: Internals and Design Principles. Prentice Hall, Upper Saddle River, New Jersey. Stanford, R. 1979. Reneging phenomena in single channel queues. Math. Oper. Res. 4 162–178. Tan, Y. 2000. Value-based design of electronic commerce servers. Doctoral dissertation, Department of Management Science, University of Washington, Seattle, WA. Turner, J. S. 1986. New directions in communications. IEEE Comm. Magazine 24 8–15. Whitt, W. 1999. Improving service by informing customers about anticipated delays. Management Sci. 45 192–207.
Accepted by Ramayya Krishnan; submitted May 2002; revised December 2002, April 2003; accepted May 2003.
14
INFORMS Journal on Computing/Vol. 00, No. 0, Xxxxx 2003
Summary of Comments on IJOC0044 Page: 13 Sequence number: 1 Author: Date: 12/16/2003 12:53:47 PM Type: Note Edits OK?
Page: 14 Sequence number: 1 Author: Date: 12/16/2003 12:55:51 PM Type: Note Can you update?