A Fully Polynomial Time Approximation Scheme for Single-Item Stochastic Inventory Control with Discrete Demand

N. Halman*    D. Klabjan*    M. Mostagir*    J. Orlin†    D. Simchi-Levi*‡

* Research supported by ONR Contracts N00014-95-1-0232 and N00014-01-1-0146, NSF Contracts DMI-0085683 and DMI-0245352, and the NASA interplanetary supply chain and logistics architectures project.
† Research supported in part by ONR grant N00014-98-1-0317.
‡ Research also supported by NSF Contract CMMI-0758069.
N. Halman, M. Mostagir, J. Orlin and D. Simchi-Levi: Massachusetts Institute of Technology, Cambridge, MA. E-mail: {halman,mostagir,jorlin,dslevi}@mit.edu. D. Klabjan: Northwestern University, Evanston, IL. E-mail: [email protected]
Abstract

The single-item stochastic inventory control problem is to find an inventory replenishment policy in the presence of independent discrete stochastic demands under periodic review and a finite time horizon. In this paper we formally prove that finding an optimal policy for this problem is intractable and, for any ε > 0, we design an algorithm with running time polynomial in the size of the problem input and in 1/ε that finds a policy whose value is within a factor (1 + ε) of the value of an optimal policy.
1 Introduction
The standard single-item stochastic inventory control problem is to find replenishment quantities in each time period that minimize the expected procurement and holding/backlogging cost. We assume dynamic replenishment over a finite number of time periods. We assume also that the holding cost, which includes a potential penalty for backlogging, is convex. In addition, the procurement cost is convex and nondecreasing. In the case of linear procurement costs, it is well known that the base-stock policy is optimal, see e.g., [SCB05, Zip00]. This policy specifies a time-dependent number called the base-stock level, and it places an order that brings the inventory level up to the base-stock level. If the inventory level is above the base-stock level, then no order is placed. While this theory shows the existence of base-stock levels, it does not provide an efficient way to compute these levels; even in the linear procurement cost case, the computation of the optimal base-stock levels is a nontrivial task. It is for this reason that we resort to approximation algorithms. Our main result is a fully polynomial time approximation scheme (FPTAS) for the aforementioned stochastic inventory control problem. A minimization problem has an FPTAS if for every ε > 0 and for every instance I we have

    A(I) ≤ (1 + ε) opt(I),                                        (1)
where opt(I) is the optimal value and A(I) is the value returned by the approximation algorithm A. The running time of algorithm A must be polynomial in 1/ε and the size of the problem input. We assume that the demands are discrete, independent random variables, but not necessarily identically distributed. We also assume that the distributions are known in advance and that there is zero lead time. Although the policy structure in the general convex nondecreasing procurement cost case is not known, our FPTAS works in this case as well. The standard dynamic programming approach gives a pseudo-polynomial algorithm: its running time grows polynomially with the maximum demand value, which is exponential in the input size. This dynamic
program considers an exponential number of different inventory levels. To circumvent this, in each time period we carefully select only a subset of possible inventory levels and compute an approximation of the optimal value function on this subset. At all other inventory levels, the approximate value function is interpolated. Our approach differs from other FPTASs, which rely on scaling and rounding the data to reduce the state space. We note that Woeginger's general framework for transforming a DP into an FPTAS [Woe00] does not cover our problem. First of all, his approach does not consider cases in which the action space is exponentially large. Second, he does not consider applications to stochastic optimization problems. Methodologically, perhaps the technique most closely related to ours is that of [DGV08], who also rely on an approximation function combined with linear interpolation (see Section 6, proof of Lemma 9 in their work). In addition to the standard inventory control problem, we also show how to obtain an FPTAS from our framework for many extensions: the capacitated and discounted versions, the lost sales model, the model where disposal of excess inventory at a cost is allowed, and a model where only a non-exact evaluation of the cost functions is available. We also provide a reduction that shows the NP-hardness of the stochastic problem with linear procurement and holding costs. The main methodological contribution of this work is the development of FPTASs using approximate, efficiently representable value functions. While this concept is widely used in computational dynamic programming algorithms, to the best of our knowledge, we are the first to use this approach to develop FPTASs. Another key contribution is the hardness proof of the basic stochastic inventory control problem. The manuscript is structured as follows. After fixing notation in Section 2, we introduce and state the stochastic inventory control problem in Section 3. In Section 4, we show that this problem is NP-hard. The framework of approximation sets and functions is given in Section 5. In Section 6, we present the FPTAS and its analysis. In Section 7, we provide FPTASs for several extensions of the basic inventory control problem. We conclude the introduction with a literature review.
Literature review  The stochastic inventory control problem is one of the most widely studied problems in inventory theory. The so-called (s, S) policy, which results from the economies of scale in procurement, is a widely used policy in practice. We refer the reader to the books [SCB05, Por02, Zip00] for an in-depth coverage of the topic. For finite time horizon inventory control problems with linear cost functions, the base-stock levels can be computed in pseudo-polynomial time recursively from the optimality equations. For infinite horizon problems, the optimality equation still holds. [ZF91] and [FZ84b] propose algorithms for finding an optimal policy for the infinite horizon inventory control problem. The multi-echelon problem, [FZ84a], and a variant of inventory control with fixed cost for backlogging, [RM00], have also been considered, although no efficient approximation algorithms have been developed for these problems. The deterministic capacitated lot-sizing problem was solved with FPTASs by [VW01] and [SO95]. However, their approach does not appear to extend to the stochastic lot-sizing problem. Recently there has been growing interest in approximation algorithms for stochastic problems. [SS06] provide an FPTAS for a broad class of linear 2-stage stochastic recourse problems, even in the case of exponentially many second stage variables. Their algorithm is based on the standard convex programming formulation. They solve the linear programming relaxation by the ellipsoid algorithm and then round the solution to the closest integral vector. The multi-stage setting is considered in [SS05]. A 2-approximation algorithm for stochastic inventory control is presented in [LPRS07]. They assume stochastic lead times with no order crossing, possibly correlated demand distributions, and the "no speculative cost assumption", which is equivalent to assuming that there is no ordering cost. The first two assumptions are more general than our assumptions, whereas their latter assumption is more restrictive. They study the so-called dual-balancing policy, which is not a base-stock policy. An extension to the capacitated version is given in [LRST08]. In [LRS07], the authors use sampling to analyze the inventory control problem with convex cost functions. Their model is more general than ours since they do not assume an explicit knowledge of the demand distribution. The algorithm is based on sampling, and they provide approximate base-stock levels. The number of samples required to achieve an ε-approximation is upper
bounded by 1/ε, the number of time periods, and (h + b)²/min{b, h}², where h is the stationary per-unit holding cost and b is the stationary per-unit backlog penalty. Note that this bound is not polynomial in the input size, and therefore their algorithm is possibly exponential in the size of the instance and in 1/ε. In addition, their algorithm is not deterministic. The approaches in [SS06] and [LRS07] are significantly different. While both use sampling, the former relies heavily on optimization techniques and the latter exploits the underlying nature of stochastic inventory problems. In contrast to these two approaches, we do not use sampling, and our approach is more combinatorial in nature.
2 Notation
Let R, Z, Z_+, N denote the set of real numbers, integers, nonnegative integers, and positive integers, respectively. For any pair A, B of integers with −∞ < A < B < ∞, let [A, B] = {A, A + 1, . . . , B} denote the set of integers between A and B. For any function f over [A, B] we denote by f^max the maximal value f achieves over [A, B]. Let U = B − A + 1 be the number of points in the domain, and let Ū = f^max be the maximal value f achieves on the domain. For any subset D′ ⊆ [A, B], a function f is called unimodal over D′ if there exists x* ∈ D′ such that f is nonincreasing over [A, x*] and nondecreasing over [x*, B]. The value x* is called an arg min of f. For any subset D′ ⊆ [A, B], we define the piecewise linear extension of f induced by D′ as the continuous function obtained by making f linear between successive values of D′. For any subset D′ ⊆ [A, B], a function f on [A, B] is called convex over D′ if its piecewise linear extension induced by D′ is convex. For any function f on [A, B] and subset D′ ⊆ [A, B], we define the convex extension of f induced by D′ to be the piecewise linear extension of f induced by D′′ ⊆ D′, where D′′ is the projection of the convex hull of {(x, f(x)) | x ∈ D′} onto the x-axis. The base-two logarithm of z is denoted by log z.
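The convex extension defined above is used later (Section 6) to keep the approximate value functions convex. A minimal Python sketch of it (our own illustrative code; the function name convex_extension and the clamping outside [min D′, max D′] are our choices, not the paper's) computes the lower convex hull of {(x, f(x)) : x ∈ D′} and interpolates linearly between hull vertices:

    def convex_extension(f, D_prime):
        """Convex extension of f induced by the subset D_prime: the piecewise
        linear function through the lower convex hull of {(x, f(x))}."""
        pts = [(x, f(x)) for x in sorted(D_prime)]

        # Lower convex hull by a standard monotone-chain sweep.
        hull = []
        for p in pts:
            while len(hull) >= 2:
                (x1, y1), (x2, y2) = hull[-2], hull[-1]
                # Drop the middle point if it lies on or above the chord.
                if (y2 - y1) * (p[0] - x1) >= (p[1] - y1) * (x2 - x1):
                    hull.pop()
                else:
                    break
            hull.append(p)

        def ext(x):
            # Linear interpolation between consecutive hull vertices
            # (queries outside the range are clamped, a simplification).
            if x <= hull[0][0]:
                return hull[0][1]
            for (x1, y1), (x2, y2) in zip(hull, hull[1:]):
                if x1 <= x <= x2:
                    return y1 + (y2 - y1) * (x - x1) / (x2 - x1)
            return hull[-1][1]

        return ext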
3 Problem statement
Let T be the length of the planning horizon. At the beginning of a time period, we observe the inventory level and a replenishment decision is made. If we place an order, it arrives immediately, i.e., we assume there is no lead time. Backlogging is represented as a negative inventory level. A single convex holding cost function is used to model both backlogging and positive inventory. The holding cost is accounted for at the end of the time period. For each time period t = 1, ..., T we define:

x_t : procurement quantity in time period t;
I_t : inventory level at the beginning of time period t;
Ī_t : inventory level at the end of period t (i.e., I_t = Ī_{t−1});
c_t(x_t) : procurement cost in time period t, given an order of size x_t;
h_t(Ī_t) : holding cost in time period t, given inventory level Ī_t.

For each time period t = 1, ..., T, we assume that there is an oracle that computes the functions c_t, h_t, and that there is a discrete random variable D_t describing the demand in time period t. For each D_t, we are given n_t, the number of different values it admits with positive probability, and the demand values d_{t,1} < ... < d_{t,n_t}. Moreover, we are also given positive integers q_{t,1}, ..., q_{t,n_t} such that

    Prob[D_t = d_{t,i}] = q_{t,i} / Σ_{j=1}^{n_t} q_{t,j}.
The random variables D_t are assumed to be independent for different t. We define for every t = 1, ..., T and i = 1, ..., n_t the following values:

p_{t,i} = Prob[D_t = d_{t,i}] : probability that there is a demand of d_{t,i} units in time period t;
n* = max_t n_t : maximum number of different values D_t can take over the entire time horizon;
d* = max_t d_{t,n_t} : maximum demand over the entire time horizon;
D* = Σ_{t=1}^T d_{t,n_t} : maximum total demand over the entire time horizon;
Q_t = Σ_{j=1}^{n_t} q_{t,j} : a common denominator of all the probabilities in time period t;
M_t = Π_{j=t}^T Q_j : a common denominator of all the probabilities in all time periods following time period t − 1;
M_{T+1} = 1.

We make the following assumptions.

Assumption 3.1. All demand, procurement and inventory levels are integral. Moreover, the demand and procurement levels are nonnegative.

Assumption 3.2. The procurement cost function c_t is nondecreasing, nonnegative and convex over Z_+ for every t = 1, ..., T.

Assumption 3.3. The holding cost function h_t is nonnegative and convex over Z for every t = 1, ..., T.

Assumption 3.4. All cost functions can be evaluated in polynomial time at any value in their domain.

Note that the binary input size of the problem is bounded below by

    Ω(T + n* + log d* + log max_t {c_t(D*), h_t(D*), h_t(−D*)}).
The last term takes into account the space needed for storing the value of c_t or h_t at D* (see more details about this aspect in Section 6.1). The objective is to minimize the total expected cost. The problem can be formulated as finding a policy x_t = x_t(I_t) for t = 1, ..., T that realizes

    z* = min_{x_t} E_D( Σ_{t=1}^T [c_t(x_t) + h_t(I_t + x_t − D_t)] ),

subject to the system dynamics

    I_{t+1} = I_t + x_t − D_t,    t = 1, ..., T.
The action space requirement is x_t ∈ Z_+ for t = 1, ..., T, and the initial state is I_1 = 0. Note that in the above context we have

    E_D( Σ_{t=1}^T [c_t(x_t) + h_t(I_t + x_t − D_t)] ) = Σ_{t=1}^T [ c_t(x_t) + Σ_{j=1}^{n_t} p_{t,j} h_t(I_t + x_t − d_{t,j}) ].

The standard optimality equation is presented later in Section 6.1.
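As a small illustration of this input encoding and of the expectation above, the following sketch (our own illustrative code, not part of the paper's formulation) builds one period's demand distribution from the integer weights q_{t,j} and evaluates the expected cost of a fixed decision x_t:

    from fractions import Fraction

    def demand_distribution(values, weights):
        """Demand values d_{t,1} < ... < d_{t,n_t} with positive integer
        weights q_{t,j}; probabilities are p_{t,j} = q_{t,j} / Q_t."""
        Q_t = sum(weights)
        return [(d, Fraction(q, Q_t)) for d, q in zip(values, weights)]

    def expected_period_cost(c_t, h_t, I_t, x_t, dist):
        """c_t(x_t) + sum_j p_{t,j} * h_t(I_t + x_t - d_{t,j}),
        i.e. one term of the objective for a fixed decision x_t."""
        return c_t(x_t) + sum(p * h_t(I_t + x_t - d) for d, p in dist)

    # Example: demands 1, 3, 4 with weights 2, 1, 1 (probabilities 1/2, 1/4, 1/4).
    dist = demand_distribution([1, 3, 4], [2, 1, 1])
    cost = expected_period_cost(lambda x: 2 * x,        # linear ordering cost
                                lambda I: abs(I),       # holding = backlog = 1 per unit
                                I_t=0, x_t=3, dist=dist)
    print(cost)   # 2*3 + (1/2)*2 + (1/4)*0 + (1/4)*1 = 29/4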
4 Hardness result
In this section we show that the single-item stochastic inventory control problem is #P-hard (and thus NP-hard) even in the case of linear procurement and holding costs. We make a transformation from a problem concerning the evaluation of convolutions of discrete random variables, which in turn is proved #P-hard by a transformation from the Kth largest subset problem (see problem SP20 on page 225 of [GJ79] for its NP-hardness and the same reference for its #P-hardness).

Problem: Kth largest subset
Instance: A finite set A = {a_1, ..., a_n} of nonnegative integers and positive integers K ≤ 2^n and B.
Question: Are there K or more distinct subsets A′ ⊆ A for which Σ_{a∈A′} a ≤ B?
Remark: This problem is named the "Kth largest subset problem", but it actually looks for the Kth smallest subset of A. In the counting version of the problem, one is asked to count the number of distinct subsets of A whose sum is at most B.

Problem: Evaluating the cdf of convolutions of discrete random variables
Instance: Discrete random variables X_1, . . . , X_n given by values a_{i,j} and probabilities p_{i,j} = Prob(X_i = a_{i,j}) for i = 1, . . . , n and j = 1, . . . , m, and values q and Q.
Question: Is Prob(Σ_{i=1}^n X_i ≤ Q) ≥ q?

Theorem 4.1. The problem of evaluating the cdf of convolutions of discrete random variables is #P-hard even in the case that m = 2 and p_{i,j} = 1/2 for all i = 1, . . . , n and j = 1, 2.

Proof. Let A = {a_1, . . . , a_n}, K, and B be the instance of the Kth largest subset problem. For each i = 1, . . . , n, let X_i be the random variable that equals a_i with probability 1/2 and equals 0 with probability 1/2. Let Q = B and let q = K/2^n. Then there are K or more distinct subsets of A with sum at most B if and only if Prob(Σ_{i=1}^n X_i ≤ Q) ≥ q.

We next establish the #P-hardness of the single-item stochastic inventory problem.

Theorem 4.2. The single-item stochastic inventory control problem is #P-hard even in the case that all costs are linear.

Proof. Let X_1, . . . , X_n, a_1, . . . , a_n for i = 1, . . . , n and j = 1, 2 (so m = 2), and values q and Q be the input for the problem of evaluating the cdf of convolutions of discrete random variables, where X_i takes value a_i with probability 1/2 and takes value 0 with probability 1/2. This problem is #P-hard as established in the proof of Theorem 4.1. We create an instance of the single-item stochastic inventory control problem as follows. Let A = Σ_{i=1}^n a_i. The demand in period i is D_i = X_i. The cost of production in period 1 is c_1(x) = (1 − q)x. The cost of production in every other period i ≠ 1 is c_i(x) = Ax. (This gives fractional costs; one can scale by 2^n to get integral costs.) The holding costs are 0 in each period, and the backorder costs are 0 in each period except for period n, in which h_n(x) = −qx for x < 0. We now claim that the answer to the problem of evaluating the cdf is "yes" if and only if there is a feasible solution to the inventory control problem with expected cost at most A − qQ. To see that, we first note that any optimal policy will consist of ordering only in the first period; any unsatisfied demand in later periods will be handled through backordering. This instance in turn is equivalent to a newsvendor problem in which the cost c_m of ordering one unit too many is 1 − q, and the cost c_ℓ of ordering one unit too few is q. Therefore, the optimal decision is to produce the minimum amount S such that Prob(D ≤ S) ≥ c_ℓ/(c_ℓ + c_m) = q, where D is a random variable describing the total demand (see, e.g., [SCB05], Section 8.2.1). The optimal order level is thus the minimum value x* such that Prob(Σ_{i=1}^n X_i ≤ x*) ≥ q. But calculating x* is polynomially equivalent to evaluating the original cdf problem.

Remark. If costs are linear, then an optimal policy is an "order up to" policy: there are values b_1, . . . , b_n, and the optimal policy is to order up to b_j in period j if the inventory falls below b_j. Now suppose that b′_1, . . . , b′_n is a proposed policy. (In the lingo of complexity analysis, b′ is a certificate.) Note that even evaluating the expected cost of this policy is #P-hard.
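The correspondence underlying Theorem 4.1 can be checked by brute force on small instances: each subset A′ ⊆ A corresponds to the demand realization in which X_i = a_i exactly for i ∈ A′, and every realization has probability 1/2^n. The sketch below (our own illustrative code) enumerates these realizations; its 2^n running time is, of course, precisely what the hardness result says cannot be avoided in general.

    from itertools import product

    def cdf_of_convolution(a, Q):
        """Prob(sum_i X_i <= Q) where X_i = a[i] or 0, each with probability 1/2.
        Brute force over all 2^n realizations."""
        n = len(a)
        favorable = sum(1 for choice in product([0, 1], repeat=n)
                        if sum(x * ai for x, ai in zip(choice, a)) <= Q)
        return favorable / 2 ** n

    # Kth largest subset instance: A = {2, 3, 5}, B = 5, K = 5.
    a, B, K = [2, 3, 5], 5, 5
    # Subsets with sum <= 5: {}, {2}, {3}, {5}, {2,3}  ->  5 subsets.
    print(cdf_of_convolution(a, B) >= K / 2 ** len(a))   # True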
We note that many #P-hard enumeration problems have a fully polynomial randomized approximation scheme (FPRAS), including the following: counting Hamiltonian cycles in dense graphs [DFJ98], counting knapsack solutions [Dye03], counting Eulerian orientations of a directed graph [MW95], counting perfect matchings in a bipartite graph [JS89], and computing the permanent [JSV04]. To the best of our knowledge, the only deterministic FPTAS for a #P-hard problem known to date and published in the literature is the recent FPTAS of [Wei06] for counting independent sets in sparse graphs. (Counting independent sets of graphs of maximum degree 4 is known to be #P-complete, Theorem 3.1.5 in [Rot96].)
5 K-approximation sets and functions
A K-approximation algorithm for a minimization problem guarantees its output to be no more than K times the optimal solution. In this section we define K-approximation functions and K-approximation sets.

Definition 5.1. Let K ≥ 1 and let f : D → R_+ be a function. We say that f̃ : D → R is a K-approximation function of f (K-approximation of f, in short) if for all x ∈ D we have f(x) ≤ f̃(x) ≤ Kf(x).

The following proposition follows directly from the definition of K-approximation functions.

Proposition 5.2. Let K > 1, let f_1, f_2 : D → R_+ be functions over domain D, let f̃_1, f̃_2 : D → R_+ be K-approximations of f_1, f_2, respectively, let g : D → D, and let α, β ∈ R_+. The following properties hold:
1. α + βf̃_1 is a K-approximation of α + βf_1;
2. f̃_1 + f̃_2 is a K-approximation of f_1 + f_2;
3. f̃_1(g) is a K-approximation of f_1(g);
4. f̃_3(y) := min_{x∈D} {f̃_1(x) + f̃_2(x + y)} is a K-approximation of f_3(y) := min_{x∈D} {f_1(x) + f_2(x + y)}.

In order to get a polynomial time approximation scheme, we consider only a subset of all possible optimal inventory values, whose cardinality is polynomially bounded by the input size. Of course this can only be done by sacrificing accuracy in the final solution. We use the following definition.

Definition 5.3. Let K > 1 and let f : [L, U] → Z_+ be a monotone function. A K-approximation set of f is an ordered set S = {i_1 < ... < i_r} of integers satisfying the following two properties:
1. L, U ∈ S ⊆ {L, . . . , U};
2. for each k = 1, ..., r − 1, if i_{k+1} > i_k + 1, then f(i_k)/K ≤ f(i_{k+1}) ≤ Kf(i_k).
The canonical K-approximation set of a monotonically nondecreasing function is defined as follows: i_1 ← L; for k ≥ 1, if i_k < U, then i_{k+1} ← max{i_k + 1, max{x | f(x) ≤ Kf(i_k) and x ≤ U}}. The canonical K-approximation set of a monotonically nonincreasing function is defined analogously, except that i_{k+1} ← max{i_k + 1, max{x | f(x) ≥ f(i_k)/K and x ≤ U}}. One can determine each element of the canonical K-approximation set in O(log(U − L)) time by using binary search. We have just proved the following lemma.

Lemma 5.4. Let f : [L, U] → Z_+ be a monotone function. For every K > 1 there exists a K-approximation set S of f of cardinality O(log_K f^max). Furthermore, it takes O((1 + t_f) log_K f^max log(U − L)) time to construct this set, where t_f is the time needed to evaluate f.

We use K-approximation sets to construct approximation functions in the following way.

Definition 5.5. Let K > 1 and let f : [L, U] → Z_+ be a monotone function. Let S be a K-approximation set of f. A function f̂ defined as follows is called the approximation of f corresponding to S. For any integer L ≤ x ≤ U and successive elements i_k, i_{k+1} ∈ S with i_k < x ≤ i_{k+1}, let

    f̂(x) := f(x)                        if x ∈ S;
    f̂(x) := max{f(i_k), f(i_{k+1})}     otherwise.

Note that if we calculate the values of f on S in advance and store them in a sorted array of pairs (x, f(x)), then any query for the value of f̂(x), for any x, can be answered in O(log |S|) = O(log log_K f^max) time. This is done by performing binary search over S to find the consecutive elements i_k, i_{k+1} ∈ S such that i_k < x ≤ i_{k+1}.
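A direct Python rendering of the canonical construction and of Definition 5.5 for a nondecreasing f is sketched below (our own illustrative code; the function names are not from the paper). Each jump is located by binary search for the largest x with f(x) ≤ K·f(i_k), so the construction performs O(log(U − L)) evaluations of f per element of S, and queries to f̂ are answered by binary search over S.

    from bisect import bisect_left

    def canonical_apx_set(f, L, U, K):
        """Canonical K-approximation set of a nondecreasing f : [L, U] -> Z_+."""
        S = [L]
        while S[-1] < U:
            i_k, bound = S[-1], K * f(S[-1])
            lo, hi = i_k + 1, U            # largest x <= U with f(x) <= bound
            while lo < hi:
                mid = (lo + hi + 1) // 2
                if f(mid) <= bound:
                    lo = mid
                else:
                    hi = mid - 1
            S.append(lo if f(lo) <= bound else i_k + 1)
        return S

    def step_approximation(f, S):
        """The approximation f_hat of f corresponding to S (Definition 5.5)."""
        vals = [f(i) for i in S]           # precomputed once, |S| evaluations
        def f_hat(x):
            j = bisect_left(S, x)          # binary search over S
            if S[j] == x:
                return vals[j]
            return max(vals[j - 1], vals[j])
        return f_hat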
The following proposition follows directly from the above definitions.

Proposition 5.6. Let K > 1, let f : [L, U] → Z_+ be a nondecreasing function, and let S be a K-approximation set of f. If f̂ is the approximation of f corresponding to S, then f̂ is a nonnegative nondecreasing integer-valued step K-approximation function of f.

We extend the definition of K-approximation sets to nonnegative unimodal functions in the following way.

Definition 5.7. Let K ≥ 1 and let f : [L, U] → Z_+ be a unimodal function with minimum value occurring at J. A set S is a K-approximation set of f if it can be expressed as the union S = S_1 ∪ S_2 of a K-approximation set S_1 of the nonincreasing restriction of f to [L, J] and a K-approximation set S_2 of the nondecreasing restriction of f to [J, U]. A function f̂ is called the approximation of f corresponding to S if it merges the approximations of f corresponding to S_1 and S_2.

Definition 5.5 carries over to convex functions of one variable defined on an integer domain, since every such function is unimodal. Suppose now that f attains its minimum at x* (if x* is not unique, we set x* = min{arg min_{x∈[L,U]} f(x)} to be the smallest such minimizer). Since f is convex, finding x* can be done efficiently by standard binary search. Therefore, Lemma 5.4 and Proposition 5.6 hold for such functions as well. We summarize the results of this section as follows.

Theorem 5.8. Let f : [L, U] → Z_+ be a unimodal function with a given x* = arg min f. For every K > 1 there exists a K-approximation set S for f of cardinality O(log_K f^max) that can be constructed in O(t_f log_K f^max log(U − L)) time, where t_f is the time needed to evaluate f. Furthermore, the approximation f̂ of f corresponding to S is a nonnegative integer-valued unimodal step K-approximation function of f, whose query time is O(log log_K f^max).

We denote by ApxSet(K, f, x*) an algorithm that calculates a canonical K-approximation set for a unimodal f with a given x* = arg min f.
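The minimizer x* that ApxSet needs for a convex f can be found with O(log(U − L)) evaluations by binary searching on the sign of the forward difference f(x + 1) − f(x); a short sketch (ours, under the paper's convention of returning the smallest minimizer):

    def argmin_convex(f, L, U):
        """Smallest minimizer of a convex function f on the integers [L, U].
        The forward difference f(x+1) - f(x) is nondecreasing: it is negative
        strictly left of the smallest minimizer and nonnegative from there on."""
        lo, hi = L, U
        while lo < hi:
            mid = (lo + hi) // 2
            if f(mid + 1) - f(mid) < 0:   # still descending: minimizer is to the right
                lo = mid + 1
            else:                         # flat or ascending: smallest minimizer <= mid
                hi = mid
        return lo

    # Example: f(x) = (x - 7)^2 on [0, 100] has its minimum at x* = 7.
    print(argmin_convex(lambda x: (x - 7) ** 2, 0, 100))   # 7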
6 An FPTAS
In this section we develop an FPTAS for the inventory control problem defined in Section 3.
6.1 Preliminaries
Let g_t(I_t) denote the optimal total expected cost for periods t, ..., T, starting in period t with an inventory of I_t. Therefore, our goal is to calculate z* = g_1(0). Let r_t(Ī_t) be the expected cost for periods t, ..., T if the inventory at the end of period t is Ī_t. It follows that for t = 1, ..., T we have

    r_t(Ī_t) = h_t(Ī_t) + g_{t+1}(Ī_t),                                          (2)

and

    g_t(I_t) = min_{x_t ∈ Z_+} { c_t(x_t) + Σ_{j=1}^{n_t} p_{t,j} r_t(I_t + x_t − d_{t,j}) },      (3)

where g_{T+1}(y) = 0 for any y. Let us define an auxiliary function y_t as follows:

    y_t(z) = Σ_{j=1}^{n_t} p_{t,j} r_t(z − d_{t,j}),

so y_t(z) is the expected total cost for periods t, ..., T after a procurement decision has been made in period t and the procurement plus the inventory level is z. We note that

    g_t(I_t) = min_{x_t ∈ Z_+} { c_t(x_t) + y_t(I_t + x_t) }.                    (4)
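Recursions (2)–(4) translate directly into the standard pseudo-polynomial dynamic program mentioned in Section 3. The sketch below (our own illustrative code, with 0-based period indices) computes z* = g_1(0) exactly by memoized recursion; it is the exponential-in-input-size baseline that the FPTAS approximates.

    from fractions import Fraction
    from functools import lru_cache

    def exact_dp(T, c, h, dist, D_star):
        """Exact z* = g_1(0) via recursions (2)-(4).  dist[t] is a list of
        (d_{t,j}, p_{t,j}) pairs, e.g. as produced by demand_distribution above.
        Pseudo-polynomial: the number of inventory levels visited is O(D*),
        which is exponential in the binary size of the input."""

        @lru_cache(maxsize=None)
        def g(t, I):                                 # g_t(I), with g_{T+1} = 0
            if t == T:
                return Fraction(0)
            def y(z):                                # y_t(z) = sum_j p_{t,j} r_t(z - d_{t,j})
                return sum(p * (h[t](z - d) + g(t + 1, z - d))   # r_t = h_t + g_{t+1}, eq. (2)
                           for d, p in dist[t])
            # equation (4): optimize over the order quantity x_t >= 0
            return min(c[t](x) + y(I + x) for x in range(0, D_star - I + 1))

        return g(0, 0)                               # z* = g_1(0)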
Let us also define R_t = M_{t+1} r_t, Y_t = M_t y_t, G_t = M_t g_t. We state several basic properties of the functions r_t, y_t, g_t, R_t, Y_t and G_t.

Proposition 6.1. For every t = 1, ..., T, the functions r_t, g_t and y_t are convex over Z.

Proof. We prove this by backward induction. We consider first the base case of t = T. Since g_{T+1} = 0 we get that r_T = h_T, which is convex by assumption. Since the convexity of r_T(z) induces the convexity of r_T(z + d) for every constant d, and since a convex combination of convex functions is convex, we get that y_T is convex. We next prove that g_T is convex. From the definition of convex functions over discrete domains (see Section 2), it suffices to show that 2g(I) ≤ g(I + 1) + g(I − 1) for all I. (For convenience we drop the subscript T.) Consider now a fixed value of I. Choose x′ and x′′ so that

    g(I − 1) = c(x′) + y(x′ + I − 1);    g(I + 1) = c(x′′) + y(x′′ + I + 1).

Case 1: x′ = x′′. Then 2g(I) ≤ 2c(x′) + 2y(x′ + I) ≤ 2c(x′) + y(x′ + I − 1) + y(x′ + I + 1) = g(I − 1) + g(I + 1).

Case 2: x′ ≥ x′′ + 1. Then 2g(I) ≤ c(x′ − 1) + y(x′ − 1 + I) + c(x′′ + 1) + y(x′′ + 1 + I). Therefore, 2g(I) − g(I + 1) − g(I − 1) ≤ c(x′ − 1) + c(x′′ + 1) − c(x′) − c(x′′), which is nonpositive because x′ > x′′ and c is convex.

Case 3: x′ ≤ x′′ − 1. Then 2g(I) ≤ c(x′ + 1) + y(x′ + 1 + I) + c(x′′ − 1) + y(x′′ − 1 + I). Therefore, 2g(I) − g(I + 1) − g(I − 1) ≤ c(x′ + 1) + c(x′′ − 1) − c(x′) − c(x′′) + y(x′ + 1 + I) + y(x′′ − 1 + I) − y(x′ + I − 1) − y(x′′ + I + 1) ≤ 0. The latter inequality holds by the convexity of c and y and the fact that x′ ≤ x′′ − 1.

Thus g_T is convex over Z. Let us assume by induction that r_{t+1}, g_{t+1} and y_{t+1} are convex over Z. We need to prove that r_t, g_t and y_t are convex over Z as well. By assumption h_t is convex over Z, and by the induction hypothesis so is g_{t+1}. Hence, r_t is convex over Z as the sum of two such functions. The rest of the proof for this case is similar to the base case.

We next show that all the values of g_t(·), r_t(·) and y_t(·) over all inventory levels and time periods are rational numbers, and we bound the least common multiple of their denominators.

Proposition 6.2. For every t = 1, ..., T, the functions M_{t+1} r_t, M_t g_t, M_t y_t, R_t, Y_t and G_t are nonnegative, integer valued and convex over Z.

Proof. The proof for M_{t+1} r_t and M_t g_t is by induction. Since r_T ≡ h_T and g_{T+1} ≡ 0, r_T is an integer-valued function. Considering g_T we have

    g_T(I_T) = (1/Q_T) min_{x_T} { Q_T c_T(x_T) + Σ_{j=1}^{n_T} q_{T,j} r_T(I_T + x_T − d_{T,j}) }.
Since c_T and r_T are integer-valued functions, Q_T g_T is an integer-valued function as well. Assuming by induction that the statement holds for t = k + 1, we get immediately from (2) that the statement holds for r_k as well. From (3) we have

    g_k(I_k) = (1/Q_k) min_{x_k} { Q_k c_k(x_k) + Σ_{j=1}^{n_k} q_{k,j} r_k(I_k + x_k − d_{k,j}) }.
Since c_k and M_{k+1} r_k are both integer-valued functions, by the induction hypothesis, M_k g_k is an integer-valued function as well. We conclude the proof for M_{t+1} r_t and M_t g_t by applying Proposition 6.1. The proof for M_t y_t is due to the fact that y_t is a convex combination of several r_t's and that Q_t is a common denominator of all the probabilities in time period t. The integrality and convexity of R_t, Y_t and G_t follow immediately.

The inventory I_t at the beginning of time period t following an optimal policy satisfies

    −D* ≤ −Σ_{j=1}^{t−1} d_{j,n_j} ≤ I_t ≤ D* − Σ_{j=1}^{t−1} d_{j,1} ≤ D*

for every t = 1, ..., T. (To see this, note that the lower bound holds for any policy since Σ_{j=1}^{t−1} (x_j − D_j) ≥ −Σ_{j=1}^{t−1} d_{j,n_j}. The upper bound holds for any optimal policy, since any such policy will order at most a total of D* over the time horizon. Moreover, for every time period t, the inventory I_t + x_t after the procurement decision has been made satisfies −D* ≤ −Σ_{j=1}^{t−1} d_{j,n_j} ≤ I_t + x_t ≤ D* − Σ_{j=1}^{t−1} d_{j,1} ≤ D*.) Hence, x_t can be restricted to take values between 0 and D*. The running time for computing the values of g_t and r_t by dynamic programming for all possible optimal inventory levels and for every period t = 1, ..., T is therefore O(n* T D*²), i.e., pseudo-polynomial in the input size. By the discussion above, we restrict without loss of generality the domain of r_t and g_t to be [−D*, ..., D*] and let L = −D*, U = D*.

We conclude this section by giving several properties of approximation functions.

Proposition 6.3. Let K > 1 and let f be an integer-valued function. If f′ is a (general) K-approximation of f, then ⌊f′⌋ is an integer-valued K-approximation of f.

This proposition is due to f ≤ ⌊f′⌋ ≤ f′ ≤ Kf, where the first inequality derives from the integrality of f and the fact that f′ is a K-approximation of f.

Proposition 6.4. Let K > 1 and let f be an integer-valued function. If f̌ is a convex K-approximation of f with arg min f̌ = x*, then ⌊f̌⌋ is a unimodal integer-valued K-approximation of f with the same arg min.

Proof. The monotonicity of the floor function coupled with the convexity of f̌ implies that ⌊f̌⌋ is a unimodal function that is minimized at x*. The function ⌊f̌⌋ is an integer-valued K-approximation of f due to the previous proposition.

Proposition 6.5. Let K > 1, let f be a convex function, let S be a K-approximation set of f, and let f̂ be the approximation of f corresponding to S. Then the convex extension of f̂ induced by S is a convex K-approximation of f.

Proof. This proposition is true because the convex extension of f̂ induced by S is the greatest convex function which does not lie above f̂, and because f itself is a convex function (i.e., the convex extension of f̂ induced by S is "sandwiched" between f̂ from above and f from below).
6.2 Algorithm
In this section we give a formal description of our approximation scheme for r_t, g_t, and y_t for t = 1, ..., T. The outline of the proof of correctness of the algorithm goes as follows. Since g_{T+1} ≡ 0 and h_T is given explicitly, we can calculate r_T ≡ h_T by a single query to h_T and calculate y_T by performing n_T queries to h_T. Since c_T(x) and y_T(z + x) are convex functions in x for any fixed z, calculating g_T(z) is done by performing binary search in the action space [−D* − z, D* − z] to find an action x that minimizes c_T(x) + y_T(z + x). This results in log U queries to c_T and y_T. As for calculating g_t(z), in order not to perform overall O(log^{T+1−t} U) queries, which is exponential in the input size, our algorithm will efficiently compute a compressed approximation of g_t, so that a query to it will cost only O(log log_K Ū) time, for any t.
 1: Function FPTAS(ε)
 2:   Let K := 1 + ε/(2T) and Ǧ_{T+1} := 0
 3:   for t := T downto 1 do
 4:     Let Ř_t := M_{t+1} h_t + Ǧ_{t+1}
 5:     Let Y̌_t(z) := Q_t Σ_{j=1}^{n_t} p_{t,j} Ř_t(z − d_{t,j})
 6:     Let G′_t(z) := min_{x ∈ [−D*−z, D*−z]} { M_t c_t(x) + Y̌_t(z + x) }
 7:     Let x*_t := arg min G′_t and Ḡ_t := ⌊G′_t⌋
 8:     Let S_t := ApxSet(K, Ḡ_t, x*_t) and let Ĝ_t be its corresponding K-approximation function
 9:     Let Ǧ_t be the convex extension of Ĝ_t induced by S_t
10:   end for
11:   Return Ǧ_1(0)/M_1

Algorithm 1: An FPTAS.
The algorithm proceeds backwards from t = T down to t = 1. In the (T − t + 1)-th iteration, given a previously calculated approximation of g_{t+1}, the algorithm approximates r_t, y_t, and g_t. The relative error of these approximations is only slightly worse than that of the given approximation of g_{t+1}, and the accumulated error throughout the execution of the algorithm is kept under control.
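For concreteness, here is a compact Python sketch of Algorithm 1 (ours, not the authors' implementation). It reuses the helper sketches convex_extension, canonical_apx_set, step_approximation and argmin_convex given in Sections 2 and 5, uses 0-based periods, uses floating-point instead of exact integer arithmetic, restricts orders to be nonnegative as in the basic model, and clamps out-of-range queries at the endpoints of [−D*, D*]; all of these are simplifications relative to the paper.

    from math import floor

    def fptas(eps, T, c, h, dist, D_star):
        """dist[t] = [(d_{t,j}, q_{t,j}), ...] with positive integer weights
        q_{t,j} (not probabilities); c[t], h[t] are the cost oracles."""
        K = 1 + eps / (2 * T)                         # Step 2
        L, U = -D_star, D_star
        Msuf = [1] * (T + 1)                          # Msuf[t] = Q_t * ... * Q_T
        for t in reversed(range(T)):
            Msuf[t] = sum(q for _, q in dist[t]) * Msuf[t + 1]

        G_check = lambda z: 0                         # \check{G}_{T+1} := 0
        for t in reversed(range(T)):                  # Step 3
            R_check = lambda z, Gc=G_check: Msuf[t + 1] * h[t](z) + Gc(z)          # Step 4
            Y_check = lambda z, Rc=R_check: sum(q * Rc(z - d) for d, q in dist[t]) # Step 5

            def G_prime(z, Yc=Y_check):               # Step 6, binary search in x
                # nonnegative orders only, as in the basic model of Section 3
                obj = lambda x: Msuf[t] * c[t](x) + Yc(z + x)
                return obj(argmin_convex(obj, 0, U - z))

            x_star = argmin_convex(G_prime, L, U)     # Step 7
            G_bar = lambda z, Gp=G_prime: floor(Gp(z))
            # Step 8: approximation set of the unimodal G_bar; the nonincreasing
            # piece [L, x_star] is handled by reflection, which yields a valid
            # (if not literally canonical) K-approximation set.
            S1 = [-z for z in canonical_apx_set(lambda z: G_bar(-z), -x_star, -L, K)]
            S2 = canonical_apx_set(G_bar, x_star, U, K)
            S = sorted(set(S1 + S2))
            G_hat = step_approximation(G_bar, S)
            G_check = convex_extension(G_hat, S)      # Step 9
        return G_check(0) / Msuf[0]                   # Step 11: \check{G}_1(0) / M_1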
6.3 Analysis
Let ε > 0 be given, where we seek a (1 + ε)-approximation, as in (1). In this section we show that Algorithm 1 produces the required approximation and runs in time that is polynomial in both the input size and 1/ε.

Lemma 6.6. Algorithm 1 computes Ǧ_t, which is a convex K^{T+1−t}-approximation of G_t, for every t = 1, . . . , T + 1.

Proof. We first note that Ḡ_t is an integer-valued unimodal function. To see this, note that by Proposition 6.1, G′_t is a convex function, and the floor function is monotone. We prove the approximation ratio by backwards induction. We consider first the base case of time period T. Since G_{T+1} = Ǧ_{T+1} = 0, we get from the definition of Ř_T that Ř_T = R_T. Hence, Y̌_T = Y_T and G′_T = G_T. Therefore, by Proposition 6.2, G′_T is an integer-valued convex function, and Ḡ_T = G_T. By Theorem 5.8, Ĝ_T is an integer-valued K-approximation function of G_T. By Proposition 6.5, Ǧ_T is a (not necessarily integer-valued) convex K-approximation function of G_T, as needed.

We assume inductively that Ǧ_{t+1} is a convex K^{T−t}-approximation of G_{t+1}. By the second property of Proposition 5.2 we get that Ř_t is a (convex) K^{T−t}-approximation of R_t (M_{t+1} h_t is a K^{T−t}-approximation of itself). By the first two properties of the same proposition, Y̌_t is a convex K^{T−t}-approximation of Y_t. By the last three properties of the same proposition and Proposition 6.1, we get that G′_t is a convex K^{T−t}-approximation function of G_t. Due to Proposition 6.4, Ḡ_t is a unimodal integer-valued K^{T−t}-approximation function of G_t, i.e.,

    G_t ≤ Ḡ_t ≤ K^{T−t} G_t.                                                    (5)

By Theorem 5.8, Ĝ_t is a K-approximation of Ḡ_t. Therefore, by (5) we have

    G_t ≤ Ḡ_t ≤ Ĝ_t ≤ K Ḡ_t ≤ K^{T+1−t} G_t;

so Ĝ_t is a K^{T+1−t}-approximation of G_t. We conclude the proof by applying Proposition 6.5.

We conclude this section by proving that Algorithm 1 is an FPTAS for our problem.

Theorem 6.7. Algorithm 1 gives an FPTAS for the single-item discrete stochastic inventory control problem when K = 1 + ε/(2T), for any ε < 1.
Proof. By Lemma 6.6, the approximation ĝ_1(0) := Ǧ_1(0)/M_1 of the optimal total expected cost in periods 1, ..., T, starting in period 1 with an inventory of 0, satisfies g_1(0) ≤ ĝ_1(0) ≤ K^T g_1(0). Setting K = 1 + ε/(2T) gives g_1(0) ≤ ĝ_1(0) ≤ (1 + ε/(2T))^T g_1(0). From the inequality (1 + x/n)^n ≤ 1 + 2x, which holds for every 0 ≤ x ≤ 1, and since the optimal solution is g_1(0), we get that z* ≤ ĝ_1(0) ≤ (1 + ε)z*.

We now compute upper bounds on the values of the functions computed during the execution of the algorithm. Let

    A = 2T max_t {c_t(D*), h_t(D*), h_t(−D*)} ≥ Σ_{t=1}^T ( c_t(D*) + max{h_t(D*), h_t(−D*)} )
be an upper bound for the values of the r_t's and g_t's. Let B = K^T M_1 A. Due to Lemma 6.6, B is an upper bound for the values of the various Ř_t, Y̌_t, G′_t, Ḡ_t and Ǧ_t. Therefore, B serves as an upper bound for the functions considered throughout the algorithm. Note that Assumption 3.4 implies that log B is polynomially bounded by the input size. Since U = O(D*), log U is also polynomially bounded by the input size.

We next consider the running time of our algorithm. It consists of T iterations; we analyze the running time of iteration t. Step 4 is executed in constant time. A query to Ř_t is done by performing a single query to each of h_t and Ǧ_{t+1}. The former takes t_h time, and the latter takes O(log log_K B) time due to Theorem 5.8. Step 5 is done in constant time, and each query to Y̌_t is performed in O(n_t(t_h + log log_K B)) time. Step 6 is done in constant time. A calculation of G′_t(z) is carried out by performing binary search over [−D* − z, D* − z], since M_t c_t(x) + Y̌_t(z + x) is a convex function of x for every fixed z. Hence, a query to G′_t(·) is done in O(log U (t_c + n_t(t_h + log log_K B))) time. As for Step 7, due to Proposition 6.1, G′_t is convex, so finding x*_t is done by binary search over [−D*, D*], resulting in a total time of O(log² U (t_c + n_t(t_h + log log_K B))). By Theorem 5.8, Step 8 is performed in O(log² U log_K B (t_c + n_t(t_h + log log_K B))) time. Since the elements in S are kept sorted, the construction of the convex extension in Step 9 is done in time linear in |S| [PS85]. Therefore, the running time of each iteration of the for loop is dominated by Step 8, and the total running time of the algorithm over all T iterations is O(T log² U log_K B (t_c + n*(t_h + log log_K B))).

Note that O(log_K B) = O(T + log_K(M_1 A)). Without loss of generality, we can assume that ε < 1 and thus that K < 2, so O(log_K Y) = O((1/(K − 1)) log Y) for every Y. Replacing K with 1 + ε/(2T) we obtain O(log_K Y) = O((T/ε) log Y). In this way we get that O(log_K B) = O((T/ε) log(M_1 A)). Replacing the query times of c_t and h_t by t_f, we conclude that the running time of the algorithm is

    O( (T² log² U log(M_1 A) / ε) · [ n* ( t_f + log( T log(M_1 A) / ε ) ) ] ),

which is polynomial in the binary size of the input data and in 1/ε. The dependence on T is almost quadratic, and that on 1/ε is almost linear.
7 Extensions

7.1 Capacitated version
Convex cost functions can model capacity limits by making the procurement cost sufficiently high beyond these limits, while preserving convexity. Thus, our results hold also for single-item stochastic capacitated inventory control problems with discrete demands.
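One hedged way to realize this (our own illustrative construction, not taken from the paper) is to add to c_t a convex penalty that is zero up to the capacity C_t and prohibitively steep beyond it, which keeps Assumption 3.2 intact:

    def capacitated_cost(c_t, C_t, penalty):
        """Convex, nondecreasing surrogate for a per-period capacity C_t:
        equal to c_t(x) for x <= C_t, and c_t(x) plus a steep convex penalty
        beyond.  `penalty` should exceed any cost an optimal policy could save
        by over-ordering (e.g. the upper bound A of Section 6.3), so that no
        optimal policy ever orders more than C_t units."""
        return lambda x: c_t(x) + penalty * max(0, x - C_t)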
7.2 The lost sales model
So far, we have assumed that excess demand is backlogged. We now consider the case when the excess demand is lost. Even if the costs are convex, the value function g_t is not necessarily convex. However, we can transform the lost sales case to the previous convex cost case if all costs are linear, under a reasonable assumption on the cost of lost sales, to be specified next. We consider a problem with lost sales, in which I_t ≥ 0 in each period t, and where the lost sales in period t are L_t = max(0, −Ī_t). In addition, the linear costs of production, inventory, and lost sales in period t are c_t x_t, h_t Ī_t, and ℓ_t L_t, respectively. We also assume that ℓ_t ≥ c_{t+1} − h_t for t = 1, . . . , T. Note that this condition typically holds in practice, since typically ℓ_t ≥ c_{t+1}. [Zip00, pages 386–387] shows under this assumption that the optimal value function is convex and a base-stock policy is optimal. We next give an alternative proof by transforming this lost sales instance into an instance of our original stochastic inventory control problem with convex costs and backordering.

We transform the lost sales model by splitting period t of the original problem into two periods of the transformed problem, denoted as periods 2t − 1 and 2t. Period 2t − 1 corresponds to period t of the original problem. Production in period 2t of the transformed problem corresponds to lost sales in period t of the original problem. As such, we define costs c′_t(x) and h′_t(Ī_t) for the transformed problem as follows:

    c′_{2t−1}(x) = c_t x,    h′_{2t−1}(I) = 0,    D′_{2t−1} = D_t;
    c′_{2t}(x) = ℓ_t x,
    h′_{2t}(I) = h_t I  for I ≥ 0,  and  h′_{2t}(I) = −M I  otherwise,
    D′_{2t} = 0,

where M is a sufficiently large number whose binary size is polynomially bounded by the input size (e.g., M = D* max_t c_t). Note that h′_{2t} is convex.

Suppose that we have a feasible policy for the original problem. For any realization of the demand and for any solution with cost C, we can transform a feasible solution for the original problem into a feasible solution for the transformed problem with the same cost by setting x′_{2t−1} = x_t and x′_{2t} = L_t. We may restrict attention to solutions for the transformed problem in which the production in period 2t exactly covers the backordering incurred in period 2t − 1. There is no additional production in period 2t, since any additional unit would need to be inventoried to period 2t + 1, and due to our assumptions above the cost would be no higher by producing the unit in period 2t + 1. Any solution satisfying the condition stated in the above paragraph leads to a solution for the original problem with the same cost.
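A hedged rendering of this period-splitting construction (our own code, following the cost definitions above; the 0-based indexing and parameter names are ours):

    def lost_sales_to_backlog(T, c, h, ell, dists, M):
        """Split original period t into transformed periods 2t and 2t+1
        (0-based): the even period carries the original ordering cost and
        demand, the odd one charges the lost-sales cost ell[t] and has zero
        demand.  M is a large constant making negative end-of-period inventory
        in the odd periods prohibitively expensive, e.g. M = D_star * max(c)."""
        c2, h2, d2 = [], [], []
        for t in range(T):
            c2.append(lambda x, ct=c[t]: ct * x)            # paper's c'_{2t-1}(x) = c_t x
            h2.append(lambda I: 0)                          # paper's h'_{2t-1} = 0
            d2.append(dists[t])                             # paper's D'_{2t-1} = D_t
            c2.append(lambda x, lt=ell[t]: lt * x)          # paper's c'_{2t}(x) = ell_t x
            h2.append(lambda I, ht=h[t]: ht * I if I >= 0 else -M * I)  # paper's h'_{2t}
            d2.append([(0, 1)])                             # paper's D'_{2t} = 0
        return c2, h2, d2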
7.3 Discounted version
In the discounted single-item stochastic inventory control problem, we are also given a rational discounting factor 0 < λ = λ_1/λ_2 < 1, where λ_1 and λ_2 are positive integers. In this version the objective function changes to

    z* = min_{x_t} E_D( Σ_{t=1}^T λ^{t−1} (c_t(x_t) + h_t(I_t + x_t − D_t)) ).

It is easy to adapt our algorithm to this case. The recursive formula (2) for r_t changes to

    r_t(Ī_t) = h_t(Ī_t) + λ g_{t+1}(Ī_t),                                        (6)

while (3) remains intact. We need to account for the fact that r_t(Ī_t) is not necessarily integer valued even if g_{t+1}(Ī_t) is integer valued. However, by multiplying (6) by λ_2, i.e., by changing Q_t to be Q_t = λ_2 Σ_{j=1}^{n_t} q_{t,j}, we recover integrality, and we can use the same algorithm. The analysis of the running time can be easily extended. The only change is that M_1 is multiplied by λ_2^T, and therefore the running time of the entire algorithm is multiplied by O(T log λ_2). Therefore, it remains polynomial in the binary size of the data.
7.4 Disposal at a cost
We can also approximate a version of the problem where the procurement cost function is defined over Z_− as well. If the procurement level is positive, then actual procurement occurs; otherwise, a disposal cost is charged. We replace Assumption 3.2 by the following assumption.

Assumption 7.1. The procurement cost function c_t is nonnegative and convex over Z, and c_t(0) = 0, for every t = 1, ..., T.

By allowing the procurement level of an optimal policy to lie in the interval [−D*, D*], we get that our results carry over to this case as well.
7.5 Constant lead time
Under general lead times, the value function is multivariate. It is well known that this dynamic program can be transformed into a single-variate dynamic program (the state corresponds to the inventory position, which is defined as the inventory on hand plus all outstanding orders) of the same form as the one presented in Section 3. It is easy to show that this transformation preserves the approximation ratio, and as a result it suffices to find an FPTAS for this single-variate dynamic program. If L > 0 is an arbitrary lead time, then the underlying demand distribution of the transformed problem is D̄_t = Σ_{t̂=t}^{t+L−1} D_{t̂}. The presented FPTAS requires that we know Prob[D̄_t = d̄_{t,i}], which is a convolution of L distributions. As a result, computing these probabilities takes (n*)^L time. If L is 2 or 3 (or any other constant value), then the term (n*)^L is polynomial, and the algorithm is an FPTAS. If L is not constrained to be small (e.g., L = T/4), then the running time is exponentially large; in the latter case, our algorithm is not an FPTAS. It is an open question whether one can modify the approach and create an FPTAS for the problem in which the lead times are permitted to be a fraction of T. Note that, due to Theorem 4.1, in the presence of lead times it is hard to compute even a myopic policy that aims to minimize the period cost a lead time ahead.
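For a constant lead time L, the transformed demand distribution can be computed by straightforward convolution in the weight representation of Section 3; the sketch below (our own illustrative code) does exactly that and exhibits the (n*)^L blow-up in the worst case.

    from collections import defaultdict

    def convolve_demands(dists):
        """Distribution of D_bar = D_t + ... + D_{t+L-1} given the per-period
        lists dists[i] = [(value, integer weight), ...]; the common denominator
        of the result is the product of the per-period denominators."""
        total = {0: 1}                       # the empty sum
        for dist in dists:
            nxt = defaultdict(int)
            for s, w in total.items():
                for d, q in dist:
                    nxt[s + d] += w * q      # weights multiply under independence
            total = dict(nxt)
        return sorted(total.items())         # up to (n*)^L support points

    # Lead time L = 2: demands {1, 2} and {0, 3}, each value with weight 1.
    print(convolve_demands([[(1, 1), (2, 1)], [(0, 1), (3, 1)]]))
    # [(1, 1), (2, 1), (4, 1), (5, 1)]  over common denominator 4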
7.6 Non-exact evaluation of cost functions
In Assumption 3.4, we require that c and h be evaluated in polynomial time. We can weaken this assumption as follows.

Assumption 7.2. For every δ > 0 and for every time period t there exist integer-valued convex functions c̃^δ_t and h̃^δ_t such that

    |h̃^δ_t(x) − h_t(x)| / h_t(x) ≤ δ,    |c̃^δ_t(x) − c_t(x)| / c_t(x) ≤ δ,

for every x, and these functions can be evaluated in time polynomial in the input size and in 1/δ.

This assumption is equivalent to the statement that for every K > 1, c_t and h_t have a two-sided K-approximation.

Definition 7.3. Let K > 1 and let f : D → R be a function. We say that f̃ : D → R is a two-sided K-approximation of f if for all x ∈ D we have f(x)/K ≤ f̃(x) ≤ Kf(x).

Let f be either c_t or h_t. By Assumption 7.2, for every K̄ > 1 there exists an integer-valued convex function f̃_K̄ such that f(x)/K̄ ≤ f̃_K̄(x) ≤ K̄ f(x) for every x. Let t_f̃ be the time needed to evaluate f̃_K̄ at x. By Assumption 7.2, t_f̃ is polynomially bounded in the size of the problem and in 1/(K̄ − 1). We set K̄ = K^{1/3}, so that for every pair x_1, x_2 of succeeding elements in the K̄-approximation set S of f̃_K̄ we get

    f(x_2)/f(x_1) ≤ (K̄ f̃_K̄(x_2)) / (f̃_K̄(x_1)/K̄) = K̄² f̃_K̄(x_2)/f̃_K̄(x_1) ≤ K̄² · K̄ = K̄³ = K.
Theorem 5.8 remains true, with the exception that the approximation corresponding to S is a two-sided K-approximation. The FPTAS for the inventory control problem remains the same, its analysis is identical, and the resulting approximation functions are two-sided (1 + ε)-approximations. In particular, g_1(0)/(1 + ε) ≤ ĝ_1(0) ≤ (1 + ε) g_1(0). Note that (1 + ε)ĝ_1(0) serves as a one-sided (1 + ε)²-approximation for g_1(0). In order to get a one-sided (1 + ε)-approximation, we change Step 2 of Algorithm 1 so that K := 1 + (√(1 + ε) − 1)/(2T); the algorithm then calculates ĝ_1, which is a two-sided √(1 + ε)-approximation of g_1. Therefore, √(1 + ε) · ĝ_1 is a one-sided (1 + ε)-approximation of g_1, as needed.
8 Conclusions and Future Research
We presented the first FPTAS for the single-item stochastic inventory control problem. Other recent developments in approximation algorithms for stochastic dynamic and multistage programs are based on gradients or sampling. Our framework is based on the notion of approximation sets and functions. We still use the standard optimality equation or recursion; however, we consider only polynomially many states. Our algorithm relies on the convexity of the value function. We presented extensions to the basic model that do not require substantial modifications to the algorithm. It is an interesting open question whether there is an FPTAS if one relaxes the assumption of constant lead time. Yet another interesting open extension is the consideration of the infinite time horizon problem under stationary data. In [DGGJ03], the authors investigate classes of counting problems that are interreducible under approximation-preserving reductions. One of these classes is the class of counting problems that admit an FPRAS. It is therefore interesting in this context to investigate the class of counting problems that admit an FPTAS.

Acknowledgments. We thank the referees for their detailed comments, which helped us improve the presentation of this paper considerably. We thank Retsef Levi for pointing out that our original proof of #P-hardness for stochastic inventory control also implies the #P-hardness of evaluating cdfs for convolutions of discrete variables. Last, we thank Oded Goldreich for inspiring discussions and Leslie Ann Goldberg for pointing out references.
References

[DFJ98] M. Dyer, A. Frieze, and M. Jerrum. Approximately counting Hamilton paths and cycles in dense graphs. SIAM Journal on Computing, 27:1262–1272, 1998.
[DGGJ03] M. Dyer, L.A. Goldberg, C. Greenhill, and M. Jerrum. The relative complexity of approximate counting problems. Algorithmica, 38:471–500, 2003.
[DGV08] B.C. Dean, M.X. Goemans, and J. Vondrák. Approximating the stochastic knapsack problem: the benefit of adaptivity. Mathematics of Operations Research, 33:945–964, 2008.
[Dye03] M. Dyer. Approximate counting by dynamic programming. In Proceedings of the 35th Annual ACM Symposium on the Theory of Computing, pages 693–699, 2003.
[FZ84a] A. Federgruen and P.H. Zipkin. Computational issues in an infinite-horizon, multi-echelon inventory model. Operations Research, 32:818–836, 1984.
[FZ84b] A. Federgruen and P.H. Zipkin. An efficient algorithm for computing optimal (s,S) policies. Operations Research, 32:1268–1286, 1984.
[GJ79] M.R. Garey and D.S. Johnson. Computers and Intractability: A Guide to the Theory of NP-completeness. W. H. Freeman, 1979.
[JS89] M. Jerrum and A. Sinclair. Approximating the permanent. SIAM Journal on Computing, 18:1149–1178, 1989.
[JSV04] M. Jerrum, A. Sinclair, and E. Vigoda. A polynomial-time approximation algorithm for the permanent of a matrix with nonnegative entries. Journal of the ACM, 51(4):671–697, 2004.
[LPRS07] R. Levi, M. Pal, R. Roundy, and D.B. Shmoys. Approximation algorithms for stochastic inventory control models. Mathematics of Operations Research, 32:284–302, 2007.
[LRS07] R. Levi, R. Roundy, and D.B. Shmoys. Provably near-optimal sampling-based policies for stochastic inventory control models. Mathematics of Operations Research, 32:821–838, 2007.
[LRST08] R. Levi, R. Roundy, D.B. Shmoys, and V.A. Truong. Approximation algorithms for capacitated stochastic inventory control problems. Operations Research, 56:1186–1199, 2008.
[MW95] M. Mihail and P. Winkler. On the number of Eulerian orientations of a graph. Algorithmica, 16:402–414, 1995.
[Por02] E.L. Porteus. Foundations of Stochastic Inventory Theory. Stanford University Press, 2002.
[PS85] F.P. Preparata and M.I. Shamos. Computational Geometry: An Introduction. Springer, 1985.
[RM00] R. Roundy and J.A. Muckstadt. Heuristic computation of periodic-review base stock inventory policies. Management Science, 46:104–109, 2000.
[Rot96] D. Roth. On the hardness of approximate reasoning. Artificial Intelligence, 82:273–302, 1996.
[SCB05] D. Simchi-Levi, X. Chen, and J. Bramel. The Logic of Logistics: Theory, Algorithms, and Applications for Logistics and Supply Chain Management. Springer-Verlag, second edition, 2005.
[SO95] H. Safer and J. Orlin. Fast approximation schemes for multi-criteria flow, knapsack, and scheduling problems. Working paper 3757-95, Sloan School of Business, Massachusetts Institute of Technology, 1995.
[SS05] C. Swamy and D.B. Shmoys. Sampling-based approximation algorithms for multi-stage stochastic optimization. In 46th Annual IEEE Symposium on Foundations of Computer Science, pages 357–366, Pittsburgh, PA, 2005.
[SS06] D.B. Shmoys and C. Swamy. An approximation scheme for stochastic linear programming and its application to stochastic integer programs. Journal of the ACM, 53:973–1012, 2006.
[VW01] C.P.M. Van Hoesel and A.P.M. Wagelmans. Fully polynomial approximation schemes for single-item capacitated economic lot-sizing problems. Mathematics of Operations Research, 26:339–357, 2001.
[Wei06] D. Weitz. Counting independent sets up to the tree threshold. In Proceedings of the 38th Annual ACM Symposium on Theory of Computing, 2006.
[Woe00] G.J. Woeginger. When does a dynamic programming formulation guarantee the existence of a fully polynomial time approximation scheme (FPTAS)? INFORMS Journal on Computing, 12:57–75, 2000.
[ZF91] Y.Z. Zheng and A. Federgruen. Finding optimal (s,S) policies is about as simple as evaluating a single policy. Operations Research, 39:654–665, 1991.
[Zip00] P.H. Zipkin. Foundations of Inventory Management. McGraw-Hill, 2000.