Model Uncertainty, Robust Optimization and Learning

Andrew E. B. Lim, J. George Shanthikumar and Z. J. Max Shen
Department of Industrial Engineering & Operations Research, University of California at Berkeley,
{lim, shanthikumar, [email protected]}

INFORMS—Pittsburgh 2006, © 2006 INFORMS, doi 10.1287/educ.1053.0000

Abstract
Classical modelling approaches in OR/MS under uncertainty assume a full probabilistic characterization. The learning needed to implement the policies derived from these models is accomplished either through (i) classical statistical estimation procedures or (ii) subjective Bayesian priors. When the data available for learning is limited, or the underlying uncertainty is non-stationary, the error induced by these approaches can be significant and the effectiveness of the policies derived will be reduced. In this tutorial we discuss how we may incorporate these errors in the model (that is, model the model uncertainty) and use robust optimization to derive efficient policies. Different models of model uncertainty will be discussed, and different approaches to robust optimization with and without bench-marking will be presented. Two alternative learning approaches, Objective Bayesian Learning and Operational Learning, will be discussed. These approaches could be used to calibrate the models of model uncertainty and to calibrate the optimal policies. Throughout this tutorial we will consider the classical inventory control problem, the inventory control problem with censored demand data, and the portfolio selection problem as examples to illustrate these ideas.
Keywords: model uncertainty, robust optimization, learning, operational statistics
1. Introduction

The majority of the early models in OR/MS have been deterministic. Specifically, models for production planning, logistics and transportation have been based on the assumption that all variables of interest are known in advance of the implementation of the solutions. While some models, such as queueing, insurance and portfolio selection, naturally call for incorporating stochasticity, it is usually assumed that the full probabilistic characterization of these models is known in advance of the implementation of the solutions. Even when it is assumed that the parameters of a parametric stochastic model are unknown, it is assumed that a Bayesian prior for the parameters is known (e.g., see Azoury (1985), Berger (1985), Ding, Puterman and Bisi (2002), Robert (2001)). Such an approach is often justified by the axiomatic framework of Savage (e.g., see Savage (1972)) for decision making. Under this assumption one ends up with a model that has been fully characterized. In economics, beginning with the work of Knight (1921) and the Ellsberg paradox (Ellsberg (1961)), questions about this basic idea of full probabilistic characterization have been raised. The seminal work of Gilboa and Schmeidler (1989) provides an axiomatic framework justifying the notion of multiple fully characterized stochastic models for a single decision problem with a max-min objective. This formed the basis for model uncertainty and robust optimization in the economics and finance areas (e.g., see Anderson, Hansen and Sargent (1998), (2003), Cagetti, Hansen, Sargent and Williams (2002), Cao, Wang and Zhang (2005), Dow and Werlang (1992), Epstein (2006), Epstein and Miao (2003), Epstein and Schneider (2003), (2005a), (2005b), Epstein and Wang
(1994), Garlappi, Uppal and Wang (2005), Hansen and Sargent (2001), (2001), (2003)). For a recent account of the application of model uncertainty and robust optimization in economics and finance, see the monograph by Hansen and Sargent (2006). Within the OR/MS community, interest in deterministic robust optimization has recently been strong (e.g., see Atamturk (2003), Atamturk and Zhang (2004), Averbakh (2000), (2001), (2004), Ben-Tal and Nemirovski (1998), (1999), (2000), (2002), Bertsimas, Pachamanova and Sim (2004), Bertsimas and Sim (2004a), (2004b), (2006), El Ghaoui and Lebret (1997) and El Ghaoui, Oustry and Lebret (1998)). See Soyster (1973) for one of the earliest contributions to this area and the book by Kouvelis and Yu (1997) for a detailed account of the developments until the mid 90's. However, stochastic models of model uncertainty have not received as much attention in the OR/MS literature. In this tutorial we will describe the different ideas in modelling model uncertainty, finding the solution to such a model using robust optimization, and its implementation through learning.

Consider a static or a discrete time dynamic optimization problem defined on a sample space $(\Omega, \mathcal{F}, (\mathcal{F}_k)_{k \in \mathcal{M}})$. Here $\mathcal{M} = \{0, 1, 2, \ldots, m\}$, where $m$ is the number of decision epochs ($m = 1$ for a static optimization problem, $m = 2$ in a stochastic programming problem with recourse, and $m \geq 2$ for a discrete dynamic optimization problem). $\Omega$ is the set of all possible outcomes of the input variables $\mathbf{Y}_0$ and the future values $\mathbf{Y} = \{Y_k, k = 1, 2, \ldots, m\}$ of interest for the optimization problem (such as the demand over time for different items in an inventory control problem, or the arc lengths and costs in a network optimization problem). $\mathcal{F}$ is the sigma algebra of events in $\Omega$ and $\mathcal{F}_0$ is (the sigma algebra of) all possible information on the input variables that may be available to the decision maker at time 0 (such as the past demand or sales data for the different items in an inventory control problem, or the arc lengths and costs in a network optimization problem). The actual information $I_0$ available to the decision maker is an element of $\mathcal{F}_0$. Though it is not required, $\mathcal{F}_k$ is often the sigma algebra generated by the internal history of the variables $\{Y_k, k \in \mathcal{M}\}$ (that is, $\mathcal{F}_k = \sigma(Y_j, j = 0, 1, 2, \ldots, k)$). It should be noted that the information available to the decision maker at the beginning of period $k + 1$ ($k \geq 1$) may not be $\mathcal{F}_k$ (for example, in an inventory control problem one may only have information on the sales and not the actual demand values). Let $\pi_1$ be the decision made at the beginning of period 1 (which is adapted to an information subset $I_0$ in $\mathcal{F}_0$). This leads to an information set that may depend on $\pi_1$. Let $I_1(\pi_1)$ be the sigma algebra generated by this information set (which satisfies $I_1(\pi_1) \subset \mathcal{F}_1$). Now let $\pi_2$ be the decision made at the beginning of period 2 (which is adapted to $I_1(\pi_1)$). In general, the policy $\pi$ is adapted to an information filtration $(I_k(\pi))_{k \in \mathcal{M}}$ which in turn is sequentially generated by the policy $\pi$. Let $\psi(\pi, \mathbf{Y})$ be the reward obtained with policy $\pi$ and $\Gamma$ be the collection of all admissible policies $\pi$. We are then interested in finding a policy $\pi^* \in \Gamma$ that maximizes $\psi(\pi, \mathbf{Y})$ in some sense. One may adopt several alternative approaches to do this. All of these approaches in some way need to define a probability measure (say $P$) on $(\Omega, \mathcal{F}, (\mathcal{F}_k)_{k \in \mathcal{M}})$ given $I_0$.
Classical modelling approaches in OR/MS under uncertainty assume that a full probabilistic characterization can be done very accurately (that is, that we have perfect forecasting capability when a non-degenerate measure is used in our model, and that we have the capability to predict the future perfectly when the assumed measure is degenerate). When we do this we hope one or both of the following is true:

THE ASSUMPTIONS
• A1: The chosen probability measure $P$ is the true probability measure $P_0$ or very close (in some sense) to it.
• A2: The solution (optimal in some sense) obtained with $P$ leads to a performance that is either optimal or close to optimal (in some sense) with respect to $P_0$.

The learning needed to implement the policies derived from these models is accomplished either through (i) classical statistical estimation procedures or (ii) subjective Bayesian priors. It is not hard to see that these assumptions in many cases need not be true! When the data available for learning is limited, or the underlying uncertainty is non-stationary, the error induced by these approaches can be significant and the effectiveness of the policy derived will be reduced. In this tutorial we discuss how we may incorporate these errors in the model (that is, model the model uncertainty) and use robust optimization to derive efficient policies. Different models of model uncertainty will be discussed, and different approaches to robust optimization with and without bench-marking will be presented. Two alternative learning approaches, Objective Bayesian Learning and Operational Learning, will be discussed. These approaches could be used to calibrate the models of model uncertainty and to obtain robust optimal policies.

Before proceeding further with this discussion we will introduce a very simple canonical example: the newsvendor inventory problem with demand observed. This can be thought of as a sequence of $n$ static problems. This model is almost always used as a lab rat on which to experiment and test different ideas in inventory control. It will allow us to discuss the importance of model uncertainty and the integration of optimization and estimation. Later, in Section 7, we will work out three classes of dynamic optimization problems that will serve as examples to illustrate our ideas on learning with integrated dynamic optimization and estimation, and robust optimization with bench-marking.

THE INVENTORY RAT: Consider a perishable item inventory control problem. Items are purchased at $c$ per unit and sold for $s$ per unit. There is no salvage value and no lost sales penalty. Suppose $Y_1, Y_2, \ldots, Y_m$ represent the demand for this item for the next $m$ periods. We wish to find the optimal order quantities for the next $m$ periods. Suppose we order $\pi_k$ units in period $k$. Then the profit is
$$\psi(\pi, \mathbf{Y}) = \sum_{k=1}^{m} \{ s \min\{Y_k, \pi_k\} - c \pi_k \}.$$
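To make the objective concrete, the following is a minimal Python sketch of this profit function; the demand distribution and all parameter values here are our own illustrative choices, not part of the tutorial.

```python
import numpy as np

def newsvendor_profit(orders, demands, s, c):
    """Total m-period profit: sum over k of s*min(Y_k, pi_k) - c*pi_k."""
    orders = np.asarray(orders, dtype=float)
    demands = np.asarray(demands, dtype=float)
    return float(np.sum(s * np.minimum(demands, orders) - c * orders))

# Illustrative values: s = 1.2, c = 1.0, exponential demand with mean 1.
rng = np.random.default_rng(0)
demands = rng.exponential(scale=1.0, size=5)   # Y_1, ..., Y_m
orders = np.full(5, 1.0)                       # a constant ordering policy
print(newsvendor_profit(orders, demands, s=1.2, c=1.0))
```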
This problem allows us to illustrate the effects of separating modelling and optimization from model calibration, without having to bring in the consequences of cost-to-go (that is, residual) effects of current decisions at each decision epoch on future time periods. In evaluating the different approaches we will assume that $Y_1, Y_2, \ldots, Y_m$ are i.i.d. with an absolutely continuous distribution function $F_Y$. Further, when needed we will assume that $Y_k$ is exponentially distributed with mean $\theta$ (that is, $F_Y(y) = 1 - \exp\{-y/\theta\}$, $y \geq 0$). Let $\{X_1, X_2, \ldots, X_n\}$ be the past demand for the last $n$ periods. This information is contained in $\mathbf{Y}_0$. We will also assume that $\{X_1, \ldots, X_n\}$ are i.i.d. samples from the same distribution as $Y_k$.

In Section 2 we will discuss what is done now: how models are formulated, optimized and implemented. Following a discussion of the possible errors in the current approaches in Section 2, alternative approaches to model these errors through flexible modelling are discussed in Section 3. Flexible modelling is accomplished by defining a collection of models that is very likely to contain the correct model or a close approximation of it. Finding a robust solution to this collection of models then depends on defining a robust optimization approach. Alternative approaches to robust optimization are discussed in Section 4. Section 5 is devoted to the calibration of flexible models using classical statistics.
Integrated learning in flexible models using (i) min-max, duality and objective Bayesian learning and (ii) operational learning is introduced in Section 6. Detailed applications of the concepts discussed in this tutorial to dynamic inventory control and portfolio selection are given in Section 7.
2. Modelling, Optimization and Implementation

Almost always, the abstract formulation of the model and optimization is done independently of $I_0$ and of how the model will be calibrated. Here and in the remainder of the paper we will assume that $\mathbf{Y}_0$ contains the past $n$ values $\{X_k, k = 1, 2, \ldots, n\}$ that will be used to calibrate $\mathbf{Y}$ (that is, its probability measure $P$).

2.A. Deterministic Modelling, Optimization and Implementation

Though this is obvious, we wish to discuss deterministic modelling here since it forms a basis for a large body of work currently being done in robust optimization (e.g., see the special issue of Mathematical Programming, 107, Numbers 1-2 on this topic). Let $P^d_{\omega_0} = I\{\omega = \omega_0\}$, $\omega_0 \in \Omega$, be a collection of degenerate (Dirac) probability measures on $(\Omega, \mathcal{F}, (\mathcal{F}_k)_{k \in \mathcal{M}})$. In deterministic modelling one assumes that for some chosen $\omega_0 \in \Omega$ we have $P = P^d_{\omega_0}$. Then
$$\phi(\pi, \omega_0) = E[\psi(\pi, \mathbf{Y})] = \psi(\pi, \mathbf{Y}(\omega_0)).$$
Given that the feasible region of $\pi$ is $\Gamma$, one then has the following optimization problem:
$$\phi^d(\omega_0) = \max_{\pi \in \Gamma} \{\phi(\pi, \omega_0)\}$$
and chooses a $\pi^d(\omega_0) \in \Gamma$ such that $\phi(\pi^d(\omega_0), \omega_0) = \phi^d(\omega_0)$. To implement this policy, however, one would have to estimate $\mathbf{Y}(\omega_0)$. For example, one may assume that $\{X_1, \ldots, X_n, Y_1, \ldots, Y_m\}$ are i.i.d. and estimate $\mathbf{Y}(\omega_0)$ by, say,
$$\hat{Y}_k(\omega_0) = \bar{X}, \quad k = 1, 2, \ldots, m, \quad \text{where } \bar{X} = \frac{1}{n} \sum_{k=1}^{n} X_k.$$
For some problems, the effect of variability on the final solution may be insignificant so that such an assumption of determinism can be justified. For most real problems, however, such an assumption may be unacceptable. Often, such an assumption is made so that the resulting optimization problems are linear programs or integer linear programs so that some of the well established approaches in OR can be used to solve these optimization problems. Sometimes, even with this assumption of determinism, the solution may be hard to get. It is fair to say that the decision to assume determinism is mostly motivated by the desire to get a solution rather than to capture reality. However, with all the advances that have been made in convex optimization (e.g., see Bertsekas (2003) and Boyd and Vandenberghe (2004)) and in Stochastic Programming (e.g., see Birge and Louveaux (1997), Ruszczynski and Shapiro (2003), and van der Vlerk (2006)), it seems possible to relax this assumption and proceed to formulate stochastic models. Before we proceed to discuss stochastic modelling, we will
give the deterministic version of the inventory rat. We will later use this result in robust optimization with bench-marking.

THE INVENTORY RAT (continued):
$$\phi^d(\omega_0) = \max\Big\{ \sum_{k=1}^{m} \psi(\pi_k, Y_k(\omega_0)) : \pi_k \geq 0 \Big\} = (s - c) \sum_{k=1}^{m} Y_k(\omega_0)$$
and $\pi^d_k(\omega_0) = Y_k(\omega_0)$, $k = 1, 2, \ldots, m$. Then the expected profit is $\phi^d(\theta) = (s - c)m\theta$, where $\theta = E[Y_k]$. To implement this policy we need to know the future demand. If we don't, maybe we can approximate the future demand by the observed average! Hence the implemented policy would be $\hat{\pi}^d_k = \bar{X}$, $k = 1, 2, \ldots, m$, with profit
$$\hat{\psi}(\mathbf{Y}) = \sum_{k=1}^{m} \{ s \min\{Y_k, \bar{X}\} - c\bar{X} \},$$
where $\bar{X} = \frac{1}{n} \sum_{k=1}^{n} X_k$. Depending on when policy change is allowed, re-optimization will take place at some time in the future. Here and in the rest of the paper we will assume that we are allowed to re-optimize at the end of each period. Now, depending on the belief we have in the i.i.d. assumption for the demand, we may be willing to estimate the demand for the next period based only on the last, say, $l$ periods. For ease of exposition we will assume that $l = n$. Set $X_{n+j} = Y_j$, $j = 1, 2, \ldots, m$. Then, using an updated estimate of $Y_k(\omega_0)$ at the beginning of period $k$, we get
$$\hat{\pi}^d_k = \bar{X}_k, \quad k = 1, 2, \ldots, m,$$
where $\bar{X}_k = \frac{1}{n} \sum_{j=k}^{n+k-1} X_j$ is the $n$-period moving average for $k = 1, 2, \ldots, m$. The associated profit is
$$\hat{\psi}(\mathbf{Y}) = \sum_{k=1}^{m} \{ s \min\{Y_k, \bar{X}_k\} - c\bar{X}_k \}.$$
Suppose the demand is exponentially distributed with mean $\theta$. It is easy to verify that
$$\lim_{m \to \infty} \frac{1}{m} \hat{\psi}(\mathbf{Y}) = (s - c)\theta - s\theta \Big( \frac{n}{n+1} \Big)^n.$$
As $n \to \infty$ one gets an average profit of $(s - c)\theta - s\theta \exp\{-1\}$. It can be verified that this profit can be very inferior to the optimal profit. For example, when $s/c = 1.2$, $c = 1$ and $\theta = 1$, the optimal profit is 0.121 while the above policy results in a profit of $-0.241$.
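This limit is easy to check by simulation. The sketch below (horizon length, seed and parameter values are arbitrary choices of ours, assuming i.i.d. exponential demand as above) compares the realized long-run average profit of the moving-average policy with the closed form:

```python
import numpy as np

def avg_profit_moving_average(s, c, theta, n, m, seed=0):
    """Average per-period profit when each period's order equals the
    n-period moving average of the most recent demands."""
    rng = np.random.default_rng(seed)
    demand = rng.exponential(theta, size=n + m)    # X_1..X_n then Y_1..Y_m
    profit = 0.0
    for k in range(m):
        order = demand[k:k + n].mean()             # n-period moving average
        y = demand[n + k]
        profit += s * min(y, order) - c * order
    return profit / m

s, c, theta, n = 1.2, 1.0, 1.0, 4
sim = avg_profit_moving_average(s, c, theta, n, m=200_000)
limit = (s - c) * theta - s * theta * (n / (n + 1)) ** n
print(f"simulated {sim:.4f} vs closed form {limit:.4f}")
```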
2.B. Stochastic Modelling and Optimization

For stochastic modelling we assume a non-degenerate probability measure. That is, we define, given $I_0$, a non-degenerate probability measure $P$ on $(\Omega, \mathcal{F}, (\mathcal{F}_k)_{k \in \mathcal{M}})$. Wanting to specify a probability measure without any statistical assumption is indeed an idealized goal. Even if we are able to solve the resulting optimization problem, the calibration of $P$ given
$I_0$ will almost always require us to make some statistical assumptions regarding $\mathbf{Y}$ and $\mathbf{Y}_0$. These assumptions often take forms such as i.i.d., Markovian, or autoregressive of some order. If the state space of $\mathbf{Y}$ is finite, then we may try to solve the problem with respect to the probabilities assigned to the different states (treating them as parameters). Even then it may be difficult to solve the optimization problem. In such cases, and in cases where further information on the distributional characteristics is known, we make additional assumptions that allow one to fully characterize $P$ up to some finite dimensional parameter.

2.B.1. Parametric Modelling, Optimization and Implementation

Suppose we have fully characterized $P$ up to some finite dimensional parameter, say $\theta$. For example, this may be achieved by postulating that $Y_k$ has an exponential or normal distribution, or that the transition kernel of the Markov process $\mathbf{Y}$ is parameterized by a finite set, or the state space is finite. Let $P^p_\theta$ be the corresponding probability measure parameterized by $\theta$. Define $\phi^p(\pi, \theta) = E[\psi(\pi, \mathbf{Y})]$. Finding the solution to this formulation depends on which of two approaches one chooses for implementation: the frequentist or the Bayesian approach.

Frequentist Approach
Suppose we assume that the information $I_0$ we have will allow us to estimate the parameter $\theta$ exactly. Then one solves
$$\phi^p(\theta) = \max_{\pi \in \Gamma} \{\phi(\pi, \theta)\}$$
and chooses a $\pi^p(\theta) \in \Gamma$ such that $\phi(\pi^p(\theta), \theta) = \phi^p(\theta)$. To implement this policy, however, one would have to estimate $\theta$. Suppose we use some statistical estimator $\hat{\Theta}(\mathbf{X})$ of $\theta$ using the data $\mathbf{X}$. Then we would implement the policy $\hat{\pi}^p = \pi^p(\hat{\Theta}(\mathbf{X}))$.

THE INVENTORY RAT (continued): When the demand is exponentially distributed one has (e.g., see Liyanage and Shanthikumar (2005), Porteus (2002) and Zipkin (2000)),
$$\phi^p(\pi, \theta) = E[\psi(\pi, \mathbf{Y})] = s\theta\Big(1 - \exp\Big\{-\frac{\pi}{\theta}\Big\}\Big) - c\pi,$$
$$\pi^p(\theta) = \theta \log\Big(\frac{s}{c}\Big),$$
and
$$\phi^p(\theta) = (s - c)\theta - c\theta \log\Big(\frac{s}{c}\Big).$$
For an exponential distribution, the sample mean is the uniformly minimum variance unbiased (UMVU) estimator. Hence we will use the sample mean of the observed data to estimate $\theta$. Then the implemented policy would be
$$\hat{\pi}^p_k = \bar{X} \log\Big(\frac{s}{c}\Big), \quad k = 1, 2, \ldots, m,$$
with profit
$$\hat{\psi}(\mathbf{Y}) = \sum_{k=1}^{m} \Big\{ s \min\Big\{Y_k, \bar{X} \log\Big(\frac{s}{c}\Big)\Big\} - c\bar{X} \log\Big(\frac{s}{c}\Big) \Big\},$$
where $\bar{X} = \frac{1}{n} \sum_{k=1}^{n} X_k$. If we use the updated estimate of $\theta$ at the beginning of period $k$ we get
$$\hat{\pi}^p_k = \bar{X}_k \log\Big(\frac{s}{c}\Big), \quad k = 1, 2, \ldots, m.$$
With this implementation,
$$\hat{\psi}(\mathbf{Y}) = \sum_{k=1}^{m} \Big\{ s \min\Big\{Y_k, \bar{X}_k \log\Big(\frac{s}{c}\Big)\Big\} - c\bar{X}_k \log\Big(\frac{s}{c}\Big) \Big\},$$
and it can be easily verified that (see Liyanage and Shanthikumar 2005)
$$\lim_{m \to \infty} \frac{1}{m} \hat{\psi}(\mathbf{Y}) = s\theta \Big( 1 - \Big( \frac{n}{n + \log(s/c)} \Big)^n \Big) - c\theta \log\Big(\frac{s}{c}\Big).$$
Observe that the average profit achieved is smaller than the expected profit $(s - c)\theta - c\theta \log(s/c)$. For small values of $n$ this loss can be substantial. For example, when $n = 4$ and $s/c = 1.2$, the percent loss over the optimal value with known $\theta$ is 22.86 (see Liyanage and Shanthikumar 2005, page 343). When the demand is non-stationary, we will be forced to use a moving average or exponential smoothing to forecast the future demand. In such a case, we will need to use a small value for $n$.
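This loss figure can be checked directly from the two closed forms above; a small sketch follows (our own computation gives roughly 22.8, so the quoted 22.86 presumably reflects rounding):

```python
import math

def pct_loss_plugin(n, s_over_c, theta=1.0, c=1.0):
    """Percent profit loss of the plug-in policy Xbar*log(s/c) relative to
    the optimal policy with known theta, under exponential demand."""
    s = s_over_c * c
    L = math.log(s_over_c)
    optimal = (s - c) * theta - c * theta * L
    plugin = s * theta * (1 - (n / (n + L)) ** n) - c * theta * L
    return 100 * (optimal - plugin) / optimal

print(f"{pct_loss_plugin(4, 1.2):.2f}%")
```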
Subjective Bayesian Approach
Under the subjective Bayesian approach, given $I_0$, one assumes that the parameter characterizing the measure is random and postulates a distribution for that parameter ($\Theta$). Suppose we take the density function of $\Theta$ to be $f_\Theta(\theta)$, $\theta \in \Theta$, and the conditional density of $\{\Theta | \mathbf{X}\}$ to be $f_{\Theta|\mathbf{X}}(\theta | \mathbf{X})$, $\theta \in \Theta$. The objective function in this case is
$$E_\Theta[\phi(\pi, \Theta) | \mathbf{X}] = \int_{\theta \in \Theta} \phi(\pi, \theta) f_{\Theta|\mathbf{X}}(\theta | \mathbf{X}) d\theta.$$
Let
$$\pi^B_{f_\Theta}(\mathbf{X}) = \arg\max\{ E_\Theta[\phi(\pi, \Theta) | \mathbf{X}] : \pi \in \Gamma \}$$
and
$$\phi^B_{f_\Theta}(\theta) = E_\mathbf{X}[\phi(\pi^B_{f_\Theta}(\mathbf{X}), \theta)].$$
THE INVENTORY RAT (continued): Often the subjective prior is chosen to be the conjugate of the demand distribution (e.g., see Azoury 1985). When the demand is exponentially distributed we should choose the Gamma prior for the unknown rate, say $\lambda = 1/\theta$, of the exponential distribution (e.g., see Robert (2001), page 121). So let (for $\alpha, \beta > 0$)
$$f_\Theta(\theta) = \frac{(\beta/\theta)^{\alpha+1}}{\beta \Gamma(\alpha)} \exp\Big\{-\frac{\beta}{\theta}\Big\}, \quad \theta \geq 0.$$
Note that $E[\Lambda] = E[1/\Theta] = \alpha/\beta$. We still need to choose the parameters $\alpha$ and $\beta$ for this prior distribution. Straightforward algebra will reveal that
$$\pi^B_{f_\Theta}(\mathbf{X}) = (\beta + n\bar{X})\Big( \Big(\frac{s}{c}\Big)^{\frac{1}{\alpha+n}} - 1 \Big).$$
Even if the demand distribution is exponential, if the demand mean is non-stationary the Bayesian estimate will converge to an incorrect parameter value. Hence we need to re-initiate the prior distribution every now and then. Suppose we do that every n periods. Then
$$\pi^B_{k:f_\Theta}(\mathbf{X}) = (\beta + n\bar{X}_k)\Big( \Big(\frac{s}{c}\Big)^{\frac{1}{\alpha+n}} - 1 \Big), \quad k = 1, 2, \ldots, m,$$
with profit
$$\hat{\psi}(\mathbf{Y}) = \sum_{k=1}^{m} \Big\{ s \min\Big\{Y_k, (\beta + n\bar{X}_k)\Big(\Big(\frac{s}{c}\Big)^{\frac{1}{\alpha+n}} - 1\Big)\Big\} - c(\beta + n\bar{X}_k)\Big(\Big(\frac{s}{c}\Big)^{\frac{1}{\alpha+n}} - 1\Big) \Big\}.$$
With this implementation, it can be verified that
$$\lim_{m \to \infty} \frac{1}{m} \hat{\psi}(\mathbf{Y}) = s\theta \Big( 1 - \Big( \frac{\theta}{(s/c)^{\frac{1}{\alpha+n}} + \theta - 1} \Big)^n \exp\Big\{ -\frac{\beta}{\theta}\Big( \Big(\frac{s}{c}\Big)^{\frac{1}{\alpha+n}} - 1 \Big) \Big\} \Big) - c(\beta + n\theta)\Big( \Big(\frac{s}{c}\Big)^{\frac{1}{\alpha+n}} - 1 \Big).$$
For bad choices of $\alpha$ and $\beta$ the performance can be poor. The success of this policy will depend on a lucky guess for $\alpha$ and $\beta$.

2.B.2. Non-Parametric Modelling
Suppose we have characterized $P$ without making any assumptions regarding the parametric form of $\mathbf{Y}$. Now define $\phi^g(\pi, P) = E[\psi(\pi, \mathbf{Y})]$ and solve
$$\phi^g(P) = \max_{\pi \in \Gamma} \{\phi(\pi, P)\}$$
and choose a $\pi^g(P) \in \Gamma$ such that $\phi(\pi^g(P), P) = \phi^g(P)$.
THE INVENTORY RAT (continued): Observe that the optimal order quantity $\pi^g(F_Y)$ for demand distribution $F_Y$ is given by
$$\pi^g(F_Y) = \bar{F}_Y^{inv}\Big(\frac{c}{s}\Big),$$
where $\bar{F}_Y^{inv}$ is the inverse of the survival function ($\bar{F}_Y = 1 - F_Y$) of the demand. We may therefore use the empirical demand distribution ($\hat{\bar{F}}_Y$) to obtain an estimate of the order quantity. Let $X_{[0]} = 0$ and $X_{[r]}$ be the $r$-th order statistic of $\{X_1, \ldots, X_n\}$, $r = 1, 2, \ldots, n$. Since the demand is assumed to be continuous, we set
$$\hat{\bar{F}}_Y(x) = 1 - \frac{1}{n}\Big\{ r - 1 + \frac{x - X_{[r-1]}}{X_{[r]} - X_{[r-1]}} \Big\}, \quad X_{[r-1]} < x \leq X_{[r]}, \quad r = 1, 2, \ldots, n.$$
Then the implemented order quantity $\hat{\pi}^g$ based on the empirical distribution is
$$\hat{\pi}^g = \hat{\bar{F}}_X^{inv}\Big(\frac{c}{s}\Big) = X_{[\hat{r}-1]} + \hat{a}(X_{[\hat{r}]} - X_{[\hat{r}-1]}),$$
where $\hat{r} \in \{1, 2, \ldots, n\}$ satisfies
$$n\Big(1 - \frac{c}{s}\Big) < \hat{r} \leq n\Big(1 - \frac{c}{s}\Big) + 1$$
and
$$\hat{a} = n\Big(1 - \frac{c}{s}\Big) + 1 - \hat{r}.$$
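A short sketch of this empirical-quantile order quantity (the interpolation mirrors the survival-function estimate above; the sample is made up):

```python
import numpy as np

def empirical_order_quantity(x, s, c):
    """Order quantity from the interpolated empirical distribution,
    with X_[0] = 0: the empirical inverse survival function at c/s."""
    n = len(x)
    xs = np.concatenate(([0.0], np.sort(x)))   # X_[0], X_[1], ..., X_[n]
    q = n * (1 - c / s)
    r = int(np.floor(q)) + 1                   # unique integer with q < r <= q + 1
    a = q + 1 - r                              # interpolation weight a_hat
    return xs[r - 1] + a * (xs[r] - xs[r - 1])

rng = np.random.default_rng(1)
sample = rng.exponential(1.0, size=4)          # past demands X_1, ..., X_n
print(empirical_order_quantity(sample, s=1.2, c=1.0))
```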
It can be shown that (see Liyanage and Shanthikumar (2005), page 345)
$$\lim_{m \to \infty} \frac{1}{m} \hat{\psi}(\mathbf{Y}) = c\theta \Big\{ \frac{s}{c} \Big( 1 - \Big( \frac{n - \hat{r} + 2}{n + 1} \Big)\Big( \frac{n - \hat{r} + 1}{n - \hat{r} + 1 + \hat{a}} \Big) \Big) - \sum_{k=1}^{\hat{r}-1} \frac{1}{n - k + 1} - \frac{\hat{a}}{n - \hat{r} + 1} \Big\}.$$
The loss in expected profit in this case can be substantial. For example, when $n = 4$ and $s/c = 1.2$, the percent loss over the optimal value with known $\theta$ is 73.06. (This is much worse than the 22.86 percent loss with the use of the sample mean for this example.) It is clear that with limited data and/or non-stationarity in the underlying stochastic process, we may have significant errors in our models, both from errors in the statistical assumptions we used for the parametric or non-parametric models and from estimation errors. Therefore we should find solutions that are robust to these errors. We could do this by attending to two issues: (i) find ways to incorporate these errors in the model itself, and (ii) find a way to obtain a robust solution.
3. Model Uncertainty and Flexible Modelling

From the preceding discussion it is clear that we have to account for the errors we will make in calibrating the stochastic model. Therefore we will not know the exact probability measure for our model. Given this, it is reasonable to argue that one should not make a decision based only on a single model (that is, using a single probability measure). Under flexible modelling we would consider a collection of models and modify our assumption.

MODIFIED ASSUMPTION
• A1: The chosen collection of probability measures $\mathcal{P}$ contains the true probability measure $P_0$ or one that is very close (in some sense) to it.

It is up to us now to define this collection of measures. Following tradition, there are three different approaches one could take to develop models of model uncertainty.

3.1. Flexible modelling with a variable uncertainty set
If the goal is to keep the resulting optimization problem within a class that has efficient solution algorithms or strong approximations, one may consider a collection of degenerate probability measures. That is, one considers
$$\mathcal{P} = \{P^d_\omega, \omega \in \Omega\}.$$
This essentially amounts to identifying the possible values that $\mathbf{Y}$ can take. Let $\mathcal{Y}$ be this state space. Then one considers a collection of problems $\psi(\pi, \mathbf{Y})$, $\mathbf{Y} \in \mathcal{Y}$. It is easy to see that in almost all real problems the probability measure $P_0$ will not be in $\mathcal{P}$. Yet a vast majority of the robust optimization reported in the OR/MS literature follows this modelling approach (e.g., see Atamturk (2003), Atamturk and Zhang (2004), Averbakh (2000), (2001), (2004), Ben-Tal and Nemirovski (1998), (1999), (2000), (2002), Bertsimas, Pachamanova and Sim (2004), Bertsimas and Sim (2004a), (2004b), (2006), Bertsimas and Thiele (2003), Kouvelis and Yu (1997), Soyster (1973)).
3.2. Flexible modelling with a parametric uncertainty set
Suppose our statistical assumptions are valid and the only unknowns are the true parameter values. Then the collection of measures we consider could be
$$\mathcal{P} = \{P^p_\theta, \theta \in \Theta\},$$
for some set $\Theta$ of parameter values. Then one considers a collection of problems $\phi^p(\pi, \theta)$, $\theta \in \Theta$. This appears to be a very promising way to formulate and solve real problems. Application of this approach to portfolio optimization is discussed in Lim, Shanthikumar and Watewai (2005), (2006b).

3.3. Flexible modelling with a non-parametric uncertainty set
For flexible modelling with a non-parametric uncertainty set, we first identify a nominal model (or probability measure, say $\hat{P}$). The collection of models is then chosen to be a closed ball around this nominal model. Let $d(P, \hat{P})$ be some distance measure between $P$ and $\hat{P}$. If the measures are fully characterized by a density (or distribution) function, the distance will be defined with respect to the density (or distribution) functions. The collection of models thus considered will be
$$\mathcal{P} = \{P : d(P, \hat{P}) \leq \alpha\},$$
where $\alpha$ is the minimum deviation that we believe is needed to assure that the true probability measure $P_0$ is in $\mathcal{P}$. Some of the distance measures commonly used are listed below.

Distance Measures for Density Functions
We will specify the different types of distances for the density functions of continuous random variables. Analogous distances can be defined for discrete random variables as well.

Kullback-Leibler Divergence (Relative Entropy)
$$d_{KL}(f, \hat{f}) = \int_x f(x) \log\Big( \frac{f(x)}{\hat{f}(x)} \Big) dx.$$
It is easy to verify that $d_{KL}$ takes values in $[0, \infty]$ and is convex in $f$. However, it is not a metric (it is not symmetric in $(f, \hat{f})$ and does not satisfy the triangle inequality). One very useful property of $d_{KL}$ is that it is sum separable for product measures. This comes in very handy in dynamic optimization with model uncertainty.

Hellinger Distance
$$d_H(f, \hat{f}) = \Big[ \frac{1}{2} \int_x \Big( \sqrt{f(x)} - \sqrt{\hat{f}(x)} \Big)^2 dx \Big]^{\frac{1}{2}}.$$
The Hellinger distance as defined above is a metric that takes values in $[0, 1]$. One useful property of this metric in dynamic optimization is that the Hellinger affinity ($1 - d_H^2$) is product separable for product measures.
Chi-Squared Distance
$$d_{CS}(f, \hat{f}) = \int_x \frac{(f(x) - \hat{f}(x))^2}{\hat{f}(x)} dx.$$

Discrepancy Measure
$$d_D(f, \hat{f}) = \sup\Big\{ \Big| \int_a^b (f(x) - \hat{f}(x)) dx \Big| : a < b \Big\}.$$

Total Variation Distance
$$d_{TV}(f, \hat{f}) = \frac{1}{2} \sup\Big\{ \int_x h(x)(f(x) - \hat{f}(x)) dx : |h(x)| \leq 1 \Big\}.$$

Wasserstein (Kantorovich) Metric
$$d_W(f, \hat{f}) = \sup\Big\{ \int_x h(x)(f(x) - \hat{f}(x)) dx : |h(x) - h(y)| \leq |x - y| \Big\}.$$
Distance Measures for Cumulative Distribution Functions

Kolmogorov (Uniform) Metric
$$d_K(F, \hat{F}) = \sup\{ |F(x) - \hat{F}(x)| : x \in \mathbb{R} \}.$$

Levy (Prokhorov) Metric
$$d_L(F, \hat{F}) = \inf\{ h : F(x - h) - h \leq \hat{F}(x) \leq F(x + h) + h; h > 0; x \in \mathbb{R} \}.$$

Wasserstein (Kantorovich) Metric
$$d_W(F, \hat{F}) = \int_x |F(x) - \hat{F}(x)| dx.$$
Distance Measures for Measures

Kullback-Leibler Divergence (Relative Entropy)
$$d_{KL}(P, \hat{P}) = \int_\Omega \log\Big( \frac{dP}{d\hat{P}} \Big) dP.$$

Prokhorov Metric
Suppose $\Omega$ is a metric space with metric $d$. Let $\mathcal{B}$ be the set of all Borel sets of $\Omega$, and for any $h > 0$ define $B^h = \{x : \inf_{y \in B} d(x, y) \leq h\}$ for any $B \in \mathcal{B}$. Then
$$d_P(P, \hat{P}) = \inf\{ h : P(B) \leq \hat{P}(B^h) + h; h > 0; B \in \mathcal{B} \}.$$
Discrepancy Measure
Suppose $\Omega$ is a metric space with metric $d$. Let $\mathcal{B}^c$ be the collection of all closed balls in $\Omega$. Then
$$d_D(P, \hat{P}) = \sup\{ |P(B) - \hat{P}(B)| : B \in \mathcal{B}^c \}.$$

Total Variation Distance
$$d_{TV}(P, \hat{P}) = \sup\{ |P(A) - \hat{P}(A)| : A \subset \Omega \}.$$

Wasserstein (Kantorovich) Metric
Suppose $\Omega$ is a metric space with metric $d$. Then
$$d_W(P, \hat{P}) = \sup\Big\{ \int_\Omega h(\omega)(P(d\omega) - \hat{P}(d\omega)) : |h(x) - h(y)| \leq d(x, y), x, y \in \Omega \Big\}.$$
The majority of the flexible modelling in finance is done using uncertainty sets for measures (e.g., see Hansen and Sargent (2006) and the references therein). Application of this approach to dynamic programming is given in Iyengar (2005), and to revenue management in Lim and Shanthikumar (2004) and Lim, Shanthikumar and Watewai (2006a).
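For intuition, the following sketch evaluates a few of these distances numerically for two exponential densities. The closed-form KL divergence between exponentials with means $\theta$ and $\hat{\theta}$, namely $\log(\hat{\theta}/\theta) + \theta/\hat{\theta} - 1$, is a standard fact we use as a check; the grid resolution is an arbitrary choice:

```python
import numpy as np

theta, theta_hat = 1.0, 1.5
x = np.linspace(1e-6, 60.0, 400_000)
dx = x[1] - x[0]
f = np.exp(-x / theta) / theta                 # density with mean theta
f_hat = np.exp(-x / theta_hat) / theta_hat     # nominal density

kl = np.sum(f * np.log(f / f_hat)) * dx                        # Kullback-Leibler
hellinger = np.sqrt(0.5 * np.sum((np.sqrt(f) - np.sqrt(f_hat)) ** 2) * dx)
tv = 0.5 * np.sum(np.abs(f - f_hat)) * dx                      # total variation

print(kl, np.log(theta_hat / theta) + theta / theta_hat - 1)   # numeric vs closed form
print(hellinger, tv)
```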
4. Robust Optimization

Now that we have a collection of models, we need to decide how to find a solution for this collection such that the solution is indeed a very good solution for the true model. For this we assume that our robust optimization will give such a good solution.

MODIFIED ASSUMPTION
• A2: The robust solution (optimal in some sense) obtained with the collection of measures $\mathcal{P}$ leads to a performance that is either optimal or close to optimal (in some sense) with respect to $P_0$.

4.1. Max-min objective
The most commonly used approach to finding a (so-called) robust solution for the given set of models is to find the best solution to the worst model among the collection of models. The optimization problem is
$$\phi^r = \max_{\pi \in \Gamma} \Big\{ \min_{P \in \mathcal{P}} \{\phi(\pi, P)\} \Big\}$$
and the solution sought is
$$\pi^r = \arg\max_{\pi \in \Gamma} \min_{P \in \mathcal{P}} \{\phi(\pi, P)\}.$$
If the true model is the worst one, then this solution will be nice and dandy. However, if the true model is the best one or something close to it, this solution could be very bad (that is, the solution need not be robust to model error at all!). As we will soon see, this can be
the case. However, this form of (so-called) robust optimization is still very popular, since the resulting optimization tends to keep the algorithmic complexity very close to that of the original single model case. However, if we really want a robust solution, its performance needs to be compared to what could have been the best for every model in the collection. This idea of bench-marking will be discussed later. Let us now look at the inventory example:

THE INVENTORY RAT (continued): We will now apply max-min robust optimization to the inventory rat with the three different flexible modelling ideas.

Uncertainty Set for Demand: Suppose the demand can take a value in $[a, b]$. That is, $a \leq Y_k \leq b$, $k = 1, 2, \ldots, m$. Then we have the robust optimization problem
$$\phi^r = \max_{\pi_k \geq 0} \Big\{ \min_{a \leq Y_k \leq b} \sum_{k=1}^{m} \{ s \min\{Y_k, \pi_k\} - c\pi_k \} \Big\}.$$
Since the inner minimization is monotone in $Y_k$, it is immediate that
$$\phi^r = \max_{\pi_k \geq 0} \sum_{k=1}^{m} \{ s \min\{a, \pi_k\} - c\pi_k \} = (s - c)ma$$
and $\pi^r_k = a$, $k = 1, 2, \ldots, m$. Clearly this is a very pessimistic solution (for example if $a = 0$). Specifically, if the true demand happens to be $b$, the performance of this solution will be the worst! Furthermore, observe that the solution is independent of $s$ and $c$.

Uncertainty Set for the Mean of Exponentially Distributed Demand: Suppose the mean demand can take a value in $[a, b]$. That is, $a \leq E[Y_k] = \theta \leq b$, $k = 1, 2, \ldots, m$. Then we have the robust optimization problem
$$\phi^r = \max_{\pi_k \geq 0} \Big\{ \min_{a \leq \theta \leq b} \sum_{k=1}^{m} \Big\{ s\theta\Big(1 - \exp\Big\{-\frac{\pi_k}{\theta}\Big\}\Big) - c\pi_k \Big\} \Big\}.$$
As before, the inner minimization is monotone in $\theta$ and it is immediate that
$$\phi^r = \max_{\pi_k \geq 0} \sum_{k=1}^{m} \Big\{ sa\Big(1 - \exp\Big\{-\frac{\pi_k}{a}\Big\}\Big) - c\pi_k \Big\} = \Big( (s - c)a - ca \log\Big(\frac{s}{c}\Big) \Big)m$$
and
$$\pi^r_k = a \log\Big(\frac{s}{c}\Big), \quad k = 1, 2, \ldots, m.$$
Clearly this too is a very pessimistic solution (for example if $a = 0$). If the true mean demand happens to be $b$, the performance of this solution will be the worst!

Uncertainty Set for the Density Function of Demand: Suppose we choose the Kullback-Leibler divergence (relative entropy) to define the collection of possible demand density functions. Suppose the nominal model chosen is an exponential distribution with mean $\hat{\theta}$. That is,
$$\hat{f}(x) = \frac{1}{\hat{\theta}} \exp\Big\{-\frac{1}{\hat{\theta}} x\Big\}, \quad x \geq 0.$$
Then the collection of density functions for the demand is
$$\mathcal{P} = \Big\{ f : \int_{x=0}^{\infty} f(x) \log\Big( \frac{f(x)}{\hat{f}(x)} \Big) dx \leq \alpha; \ \int_{x=0}^{\infty} f(x) dx = 1; \ f \geq 0 \Big\}.$$
The min-max robust optimization problem is then
$$\max_{\pi \geq 0} \min_{f \in \mathcal{P}} \Big\{ s \int_{x=0}^{\pi} \Big\{ \int_{z=x}^{\infty} f(z) dz \Big\} dx - c\pi \Big\}.$$
Defining $\kappa(x) = \frac{f(x)}{\hat{f}(x)}$ and considering the Lagrangian relaxation of the above problem, one obtains (with $\beta \geq 0$)
$$\max_{\pi \geq 0} \min_{\kappa \geq 0} \Big\{ s \int_{x=0}^{\pi} \Big\{ \int_{z=x}^{\infty} \kappa(z)\hat{f}(z) dz \Big\} dx - c\pi + \beta \int_{x=0}^{\infty} \kappa(x) \log(\kappa(x)) \hat{f}(x) dx : \int_{x=0}^{\infty} \kappa(x)\hat{f}(x) dx = 1 \Big\}.$$
It can be verified that the solution to the above relaxation is
$$\kappa(x) = \frac{(s - c)\hat{\theta} + \beta}{\beta} \exp\Big\{-\frac{sx}{\beta}\Big\}, \quad 0 \leq x \leq \pi^r,$$
$$\kappa(x) = \frac{(s - c)\hat{\theta} + \beta}{\beta} \exp\Big\{-\frac{s\pi^r}{\beta}\Big\}, \quad \pi^r \leq x,$$
and
$$\pi^r = \hat{\theta} \Big\{ \log\Big(\frac{s}{c}\Big) + \log\Big( \frac{(s - c)\hat{\theta} + \beta}{\beta} \Big) \Big\} \Big\{ \frac{\beta}{\beta + s\hat{\theta}} \Big\}.$$
Furthermore, it can be shown that the solution to the original problem is obtained by choosing $\beta$ such that
$$\int_{x=0}^{\infty} \kappa(x) \log(\kappa(x)) \hat{f}(x) dx = \alpha.$$
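Numerically, $\beta$ can be found by a one-dimensional search. The sketch below (our own illustration, with arbitrary parameter values) bisects on $\beta$ so that the worst-case model attains the prescribed divergence $\alpha$, using the closed forms for $\kappa$ and $\pi^r$ above:

```python
import numpy as np

def kl_robust_newsvendor(s, c, theta_hat, beta):
    """Robust order pi_r and attained KL divergence for a given multiplier beta."""
    C = ((s - c) * theta_hat + beta) / beta
    pi_r = theta_hat * (np.log(s / c) + np.log(C)) * beta / (beta + s * theta_hat)
    x = np.linspace(0.0, 60.0 * theta_hat, 300_000)
    dx = x[1] - x[0]
    f_hat = np.exp(-x / theta_hat) / theta_hat              # nominal density
    kappa = C * np.exp(-s * np.minimum(x, pi_r) / beta)     # worst-case ratio
    kl = float(np.sum(kappa * np.log(kappa) * f_hat) * dx)  # attained divergence
    return pi_r, kl

def beta_for_alpha(s, c, theta_hat, alpha, lo=1e-4, hi=1e4):
    """Bisect (in log scale) on beta; the attained divergence decreases in beta."""
    for _ in range(80):
        mid = np.sqrt(lo * hi)
        _, kl = kl_robust_newsvendor(s, c, theta_hat, mid)
        lo, hi = (mid, hi) if kl > alpha else (lo, mid)
    return np.sqrt(lo * hi)

s, c, theta_hat, alpha = 1.2, 1.0, 1.0, 0.05
beta = beta_for_alpha(s, c, theta_hat, alpha)
pi_r, _ = kl_robust_newsvendor(s, c, theta_hat, beta)
print(f"beta = {beta:.4f}, robust order = {pi_r:.4f}, "
      f"nominal = {theta_hat * np.log(s / c):.4f}")
```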
It can be shown that $\beta$ monotonically decreases as a function of $\alpha$, with $\beta \to 0$ as $\alpha \to \infty$ and $\beta \to \infty$ as $\alpha \to 0$. Notice that the robust order quantity goes to zero as $\beta \to 0$ (that is, when $\alpha \to \infty$) and becomes the nominal order quantity $\hat{\theta} \log(s/c)$ when $\beta \to \infty$ (that is, when $\alpha \to 0$). Clearly, in the former case we allow a demand that is zero with probability one, and in the latter case we restrict the collection of models to the nominal one.

All the above three formulations suffer from the fact that the inner minimization is monotone and the worst model is chosen to optimize. In what follows we will see that the idea of using benchmarks overcomes this shortcoming.

4.2. Min-max regret objectives, utility and alternative coupling with benchmark
Recall that $\phi^g(P)$ is the optimal objective function value we can achieve if we knew the probability measure $P$. Hence we may wish to find a solution that gives an objective function value that comes close to this for all measures in $\mathcal{P}$. Hence we consider the optimization problem
$$\phi^r = \min_{\pi \in \Gamma} \Big\{ \max_{P \in \mathcal{P}} \{ \phi^g(P) - \phi(\pi, P) \} \Big\}$$
and the solution sought is
$$\pi^r = \arg\min_{\pi \in \Gamma} \max_{P \in \mathcal{P}} \{ \phi^g(P) - \phi(\pi, P) \}.$$
One may also wish to see how the robust policy performs relative to the optimal policy in terms of the actual profit and not its expectation. Given a utility function $U^r$ for this deviation, the coupled objective function is
$$\phi^r = \min_{\pi \in \Gamma} \Big\{ \max_{P \in \mathcal{P}} \{ E_P[U^r(\psi(\pi^g(P), \mathbf{Y}) - \psi(\pi, \mathbf{Y}))] \} \Big\}$$
and the solution sought is
$$\pi^r = \arg\min_{\pi \in \Gamma} \max_{P \in \mathcal{P}} \{ E_P[U^r(\psi(\pi^g(P), \mathbf{Y}) - \psi(\pi, \mathbf{Y}))] \}.$$

THE INVENTORY RAT (continued): Observe that clairvoyant ordering will result in a profit of $(s - c)Y$. Hence if we order $\pi$ units, the regret is
$$(s - c)Y - \{ s \min\{\pi, Y\} - c\pi \} = s \max\{Y - \pi, 0\} - c(Y - \pi).$$
Hence we wish to solve
$$\min_{\pi} \max_{a \leq Y \leq b} \{ s \max\{Y - \pi, 0\} - c(Y - \pi) \}.$$
The optimal solution is
$$\pi^r = a + (b - a)\Big( \frac{s - c}{s} \Big).$$
Unlike in the min-max robust optimization, here the order quantity depends on $s$ and $c$.
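A tiny sketch of this regret calculation (the interval endpoints are arbitrary). Because the regret is decreasing in $Y$ below $\pi$ and increasing above it, the worst case sits at an endpoint, and the optimal order equalizes the regrets at $Y = a$ and $Y = b$:

```python
def regret(order, y, s, c):
    """Regret versus clairvoyant ordering: (s-c)y - (s*min(order,y) - c*order)."""
    return (s - c) * y - (s * min(order, y) - c * order)

def minmax_regret_order(a, b, s, c):
    """Equalize the regret at the two endpoints Y = a and Y = b."""
    return a + (b - a) * (s - c) / s

a, b, s, c = 0.0, 10.0, 1.2, 1.0
pi = minmax_regret_order(a, b, s, c)
print(pi, regret(pi, a, s, c), regret(pi, b, s, c))   # endpoint regrets match
```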
4.3. Max-min competitive ratio objective with alternative coupling with benchmark
Suppose $\phi^g(P) \geq 0$ for all $P \in \mathcal{P}$. Then, instead of looking at the difference in the objective function values, we may wish to look at the ratios (and find a solution that achieves a ratio close to one for all $P$). Hence we consider the optimization problem
$$\phi^r = \max_{\pi \in \Gamma} \Big\{ \min_{P \in \mathcal{P}} \Big\{ \frac{\phi(\pi, P)}{\phi^g(P)} \Big\} \Big\}$$
and the solution sought is
$$\pi^r = \arg\max_{\pi \in \Gamma} \min_{P \in \mathcal{P}} \Big\{ \frac{\phi(\pi, P)}{\phi^g(P)} \Big\}.$$
One may also wish to see how the robust policy performs relative to the optimal policy in terms of the actual profit and not its expectation. Suppose $\psi(\pi^g(P), \mathbf{Y}) \geq 0$. Given a utility function $U^r$ for this deviation, the coupled objective function is
$$\phi^r = \max_{\pi \in \Gamma} \Big\{ \min_{P \in \mathcal{P}} \Big\{ E_P\Big[ U^r\Big( \frac{\psi(\pi, \mathbf{Y})}{\psi(\pi^g(P), \mathbf{Y})} \Big) \Big] \Big\} \Big\}$$
and the solution sought is
$$\pi^r = \arg\max_{\pi \in \Gamma} \min_{P \in \mathcal{P}} \Big\{ E_P\Big[ U^r\Big( \frac{\psi(\pi, \mathbf{Y})}{\psi(\pi^g(P), \mathbf{Y})} \Big) \Big] \Big\}.$$
5. Classical Statistics and Flexible Modelling

We will now discuss how classical statistics can be used to characterize model uncertainty of different types. To do this, we first have to postulate a statistical model for $\mathbf{X}, \mathbf{Y}$. Suppose the extended measure for this is $P^e$ (note that then $P = \{P^e | I_0\}$).

5.1. Predictive Regions and Variable Uncertainty Set
Let $S_\mathbf{Y}$ be the state space of $\mathbf{Y}$. Now choose a predictive region $\mathcal{Y}(\mathbf{X}) \subset S_\mathbf{Y}$ for $\mathbf{Y}$ such that
$$P^e\{\mathbf{Y} \in \mathcal{Y}(\mathbf{X})\} = 1 - \alpha,$$
for some appropriately chosen value of $\alpha$ ($0 < \alpha < 1$). Then we could choose $\mathcal{Y} = \{\mathcal{Y}(\mathbf{X}) | I_0\}$.

THE INVENTORY RAT (continued): Suppose $\{X_1, X_2, \ldots, X_n, Y\}$ are i.i.d. exponential random variables with mean $\theta$. Let $\chi^2_k$ be a Chi-squared random variable with $k$ degrees of freedom and $F_{r,s}$ be an $F$-random variable with $(r, s)$ degrees of freedom. Then
$$\frac{2n}{\theta} \bar{X} =^d \chi^2_{2n}$$
and
$$\frac{2}{\theta} Y =^d \chi^2_2.$$
Therefore
$$\frac{Y}{\bar{X}} =^d F_{2,2n}$$
and
$$P\{ f_{2,2n,1-\frac{\alpha}{2}} \bar{X} \leq Y \leq f_{2,2n,\frac{\alpha}{2}} \bar{X} \} = 1 - \alpha,$$
where $P\{f_{2,2n,\beta} \leq F_{2,2n}\} = \beta$, $\beta \geq 0$. A $(1 - \alpha)100\%$ predictive interval for $Y$ is $(f_{2,2n,1-\frac{\alpha}{2}} \bar{X}, f_{2,2n,\frac{\alpha}{2}} \bar{X})$. Hence, with a min-max objective, the robust solution is (see Section 4.1)
$$\pi^r = f_{2,2n,1-\frac{\alpha}{2}} \bar{X}.$$
Observe that this implementation is independent of $s$ and $c$. Alternatively, one may use a one-sided predictive interval $(f_{2,2n,1-\alpha} \bar{X}, \infty)$. Then
$$\pi^r = f_{2,2n,1-\alpha} \bar{X}.$$
This too is independent of $s$ and $c$. Therefore there is no guarantee that this solution will be robust to model uncertainty! Suppose we choose an $\alpha$ such that
$$1 - \alpha = P\Big\{ \Big( \Big(\frac{s}{c}\Big)^{\frac{1}{1+n}} - 1 \Big) n \leq F_{2,2n} \Big\}.$$
Then
$$\pi^r = \Big( \Big(\frac{s}{c}\Big)^{\frac{1}{1+n}} - 1 \Big) n \bar{X}.$$
Later, in operational learning, we will find that this is indeed the optimal order quantity when $\theta$ is unknown. It is thus conceivable that a good policy could be obtained using deterministic robust optimization, provided we have stable demand and sufficient data to test various $\alpha$. If that is the case, then retrospective optimization using the past data would have yielded a very good solution anyway! The issue in this method of using min-max robust optimization is that the solution can be sensitive to the choice of $\alpha$, and a good value for it cannot be chosen a priori. Hence we need a robust optimization technique that is robust with respect to the choice of $\alpha$.

5.2. Confidence Regions and Parameter Uncertainty Set
Let $t(\mathbf{X})$ be an estimator of $\theta$. Now choose a region $\mathcal{T}(\theta)$ such that
$$P^e\{t(\mathbf{X}) \in \mathcal{T}(\theta)\} = 1 - \alpha,$$
17
for some appropriately chosen value of α (0 < α < 1). Now define Θ(X) = {θ : t(X) ∈ T (θ)}. Then we could choose Θ = {Θ(X)|I0 }. THE INVENTORY RAT (continued): Suppose {X1 , X2 , . . . , Xn , Y } are i.i.d. exponential random variables with mean θ. Observing that 2n ¯ d 2 X = χ2n , θ it is immediate that P{
¯ ¯ 2nX 2nX ≤ θ ≤ } = 1 − α, χ22n, α χ22n,1− α 2
where
2
P {χ22n,β ≤ χ22n } = β, β ≥ 0.
A (1 − α) 100 % confidence interval for θ is
¯ 2nX χ22n, α
¯
, χ22nX α ). Hence with a min-max objective, 2n,1−
2
the robust solution is (see Section 4.1) πr =
2
¯ 2nX . 2 χ2n, α 2
Observe that this implementation is independent of s and c. Alternatively, one may use a ¯ X one sided predictive interval ( χ2n , ∞). Then 2 2n,α
πr =
¯ 2nX . 2 χ2n,α
This too is independent of s and c.
6 Learning Outside of Bayesian learning, the two popular techniques used for learning in decision making are: (i) Reinforcement Learning (e.g., see Sutton and Barto (1998)) and (ii) Statistical Learning (e.g., see Vapnik (2000)). Applying either one of these approaches to the inventory rat problem results in a solution, that is the same as in the non-parametric model discussed in Section 2.B.2 (see Jain, Lim and Shanthikumar (2006)) which we already know can result in poor results. We will not discuss these two approaches here. 6.1 Max-min, Duality and Objective Bayesian Learning In this section we will pursue the max-min bench-marking approach discussed earlier as a learning tool. Specifically, we will consider the dual problem, which can then be seen as a form of the objective Bayesian approach (see Berger (1985), Robert (2001)). In a dynamic optimization scenario, it is the recognition that the implemented policy π ˆk at time k is a function of the past data X that motivates the need to incorporate learning in the optimization itself. Hence in integrated learning and optimization, the focus is max Eθe [φ(π(X), θ)], π
Model Uncertainty, Robust Optimization and Learning c 2006 INFORMS INFORMS—Pittsburgh 2006, °
18
where the expectation over X is taken with respect to the probability measure Pθe . This is indeed the focus of Decision Theory (Wald (1950)), where minimization of a loss function is the objective. Naturally one could define −φ as the risk function and apply the existing decision theory approaches to solve the above problem. It has already been recognized in decision theory that without further characterization of π one may not be able to solve the above problem (e.g., see Berger (1985), Robert (1994)). Otherwise one could conclude that π p (θ) is the optimal solution. Hence one abides by the notion of an efficient policy π defined below: Definition: A policy π0 is efficient if there does not exist a policy π such that Eθe [φ(π(X), θ)] ≥ Eθe [φ(π0 (X), θ)], ∀ θ, with strict inequality holding for some values of θ. Observe that π0 = π p (θ0 ) for almost any θ0 will be an efficient solution. Indeed it is well known that any Bayesian solution π B (fΘ ), if unique, is an efficient solution. Thus one may have an unlimited number of efficient policies and the idea of an efficient solution does not provide an approach to identifying a suitable policy. While it is necessary for a solution to be efficient, it is not sufficient (unless it is optimal). Definition: A policy π0 is optimal, if Eθe [φ(π0 (X), θ)] ≥ Eθe [φ(π(X), θ)], ∀ θ, for all π. It is very unlikely that such a solution can be obtained without further restriction on π for real stochastic optimization problems. Consequently, in decision theory, one follows one of the two approaches. One that is commonly used in the OR/MS literature is to assume a prior distribution for the unknown parameter(s) (see Section 2.B.1). This eliminates any model uncertainty. However this leaves one to have to find this prior distribution during implementation. This task may not be well defined in practice (e.g., see Kass and Wasserman (1996)). To overcome this there has been considerable work done on developing non-informative priors (e.g., see Kass and Wasserman (1996)). The relationship of this approach to what we will do in the next two sections will be discussed later. The second approach in decision theory is min-maxity. In our setting, it is max min{Eθe [φ(π(X), θ)]}. π
θ
e Unfortunately though, in almost all applications in OR/MS, EX [φ(π(X), θ)] will be monotone in θ. For example, in the inventory problem, the minimum will be attained at θ = 0. In general, suppose the minimum occurs at θ = θ0 . In such a case, the optimal solution for the above formulation is π p (θ0 ). Hence it is unlikely that a direct application of the min-max approach of decision theory to the objective function of interest in OR/MS will be appropriate. Therefore we will apply this approach using objectives with benchmark (see Sections 4.2 and 4.3 and also Lim, Shanthikumar and Shen (2006b)). In this section, we will consider the relative performance φ(π(X), θ) . η(π, θ) = φp (θ)
Model Uncertainty, Robust Optimization and Learning c 2006 INFORMS INFORMS—Pittsburgh 2006, °
19
The optimization problem now is η r = max min{Eθe [η(π(X), θ)]}. π
θ
The dual of this problem (modulo some technical conditions; see Lim, Shanthikumar and Shen (2006b)) is e min max{EΘ [η(π(X), Θ)]}, fΘ
π
where fΘ is a prior on the random parameter Θ of X. For each given prior distribution fΘ , the policy π that maximizes the objective η is the Bayesian solution. Let πfBΘ be the solution and η B (fΘ ) be the objective function value. Two useful results that relate the primal and the dual problems are (e.g., see Berger (1985)): Lemma: If η B (fΘ ) = min θ
Eθe [φ(πfBΘ (X), θ)] , φp (θ)
then πfBΘ is the max-min solution to the primal and dual problems. (l)
Lemma: If fΘ , l = 1, 2, . . . , is a sequence of priors and πfBΘ is such that (l)
lim η B (fΘ ) = min
l→∞
θ
Eθe [φ(πfBΘ (X), θ)] , φp (θ)
then πfBΘ is the max-min solution to the primal problem. Now we add a bound that apart from characterizing the goodness of a chosen prior fΘ or the corresponding policy πfBΘ , will aid an algorithm in finding the max-min solution. Lemma: For any prior fΘ , Eθe [φ(πfBΘ (X), θ)] min ≤ ηr ≤ θ φp (θ)
R
θ
Eθe [φ(πfBΘ (X), θ)]fΘ (θ)dθ R . φp (θ)fΘ (θ)dθ θ
6.2 Operational Learning This section is devoted to describing how learning could be achieved through operational statistics. Operational statistics is introduced in Liyanage and Shanthikumar (2005) and further explored in Chu, Shanthikumar and Shen (2005, 2006a). The formal definition of operational statistics is given in Chu, Shanthikumar and Shen (2006b). In operational learning, we seek to improve the current practice in the implementation of the policies derived assuming the knowledge of the parameters. In this regard, let π p (θ) be the policy derived assuming that the parameter(s) are known. To implement, in the ˆ ˆ traditional approach, we estimate θ by, say Θ(X) and implement the policy π ˆ p = π p (Θ(X)). The corresponding expected profit is ˆ φˆp (θ) = Eθe [φ(π p (Θ(X)), θ)], where the expectation over X is taken with respect to Pθe . In operational learning, first we identify a class of functions Y and a corresponding class of functions H such that ˆ ∈Y Θ
Model Uncertainty, Robust Optimization and Learning c 2006 INFORMS INFORMS—Pittsburgh 2006, °
20 and
ˆ ∈ H. πp ◦ Θ
The second step is to choose a representative parameter value, say θ0 and solve max Eθe0 [φ(π(X), θ0 )] π∈H
subject to
Eθe [φ(π(X), θ)] ≥ φˆp (θ), ∀θ
ˆ ∈ H, we are guaranteed that a solution exists for the above First note that since π p ◦ Θ optimization problem. Second note that the selection of θ0 is not critical. For it may happen that the selection of H is such that the solution obtained is independent of θ0 (as we will see in the inventory examples). Alternatively, we may indeed use a prior fΘ on θ and reformulate the problem as, Z max Eθe [φ(π(X), θ)]fΘ (θ)dθ π∈H
subject to
θ
Eθe [φ(π(X), θ)] ≥ φˆp (θ), ∀θ
It is also conceivable that alternative forms of robust optimization may be defined. ¯ So we ˆ = X. THE INVENTORY RAT (continued): Recall that π p (θ) = θ log( sc ) and Θ(X) could choose H to be the class of order one homogenous functions. Note that H1 = {π : Rn+ → R+ ; π(αx) = απ(x); α ≥ 0; x ∈ Rn+ }. is the class of non-negative order-one-homogeneous functions. Furthermore, observe that ψ is a homogeneous-order-one function (that is, ψ(αx, αY ) = αψ(x, Y )). Let Z be an exponential r.v. with mean 1. Then Y =d θZ, and one finds that φ too is a homogeneous order one function (that is, φ(αx, αθ) = αφ(x, θ)). Now suppose we restrict the class of operational statistics π to homogeneous-order-one functions. That is, for some chosen θ0 , we consider the optimization problem: max {Eθe0 [φ(π(X), θ0 )]}.
π∈H1
subject to
Eθe [φ(π(X), θ)] ≥ φˆp (θ), ∀θ.
Let Z1 , Z2 , . . . , Zn be i.i.d exponential r.v.s with mean 1 and Z = (Z1 , Z2 , . . . , Zm ). Then X =d θZ. Utilizing the property that φ, π and φˆp are all homogeneous order-one functions, we get Eθe [φ(π(X), θ)] = θEZe [φ(π(Z), 1)] and φˆp (θ) = θφˆp (1). Hence we can drop the constraints and consider max {EZe [φ(π(Z), 1)]}.
π∈H1
Let V (with |V| =
Pm
k=1 Vk
= 1) and the dependent random variable R be defined such that
fR|V (r|v) =
1 1 1 exp{− }, r ≥ 0, rn+1 (n − 1)! r
Model Uncertainty, Robust Optimization and Learning c 2006 INFORMS INFORMS—Pittsburgh 2006, °
21
and fv (v) = (n − 1)!, |v| = 1; v ∈ Rn+ . Then Z =d
1 V. R
Therefore
V ), 1)|V]]. R Since we assumed π to be a homogeneous-order-one function, we get EZ [φ(π(Z), 1)] = EV [ER [φ(π(
EV [ER [φ(π(
1 Z ), 1)|V]] = EV [ER [ φ(π(V), R)]|V]]. R R
Hence all we need to find the optimal operational statistics is to find 1 φ(π, R)|V = v] : π ≥ 0}, v ∈ Rn+ ; |v| = 1. R Pn Then the optimal homogenous order one operational statistic is (with |x| = k=1 xk ), π os (v) = arg max{ER [
π os (x) = |x|π os (
x ), x ∈ Rn+ . |x|
After some algebra one finds that (see Liyanage and Shanthikumar (2005) and Chu, Shanthikumar and Shen (2005)): n
X s 1 π os (x) = (( ) 1+n − 1) xk c k=1
and
s 1 s φˆos (θ) = Eθ [φ(π os (X), θ)] = θ[c{ − 1 − (n + 1)(( ) 1+n − 1)}]. c c This policy compared to the classical approach improves the expected profit by 4.96% for n = 4 and sc = 1.2 (see page 344 of Liyanage and Shanthikumar (2005)).
7. Examples 7.1 Inventory Control with Observable Demand Consider an inventory control problem with instantaneous replenishment, backlogging and finite planning horizon. Define the following input variables: • • • • • •
m - number of periods in the planning horizon c - purchase price per unit s - selling price per unit {Y1 , Y2 , . . . , Ym } - demand for the next m periods b - backlogging cost per unit per period h - inventory carrying cost per unit per period
At the end of period m all remaining inventory (if any) is salvaged (at a salvage value of c per unit). If at the end of period m orders are backlogged then all orders are met at the beginning of period m + 1. Let πk (πk ≥ 0) be the order quantity at the beginning of period k (k = 1, 2, . . . , m). Then the total profit for the m periods is
Model Uncertainty, Robust Optimization and Learning c 2006 INFORMS INFORMS—Pittsburgh 2006, °
22
ψ(π, Y) =
m X
{−cπk + s{max{−Wk−1 , 0} + Yk − max{−Wk , 0}}}
k=1
+ c max{Wm , 0} + (s − c) max{−Wm , 0} −
m X
{h max{Wk , 0} + b max{−Wk , 0}},
k=1
(0) where W0 = 0 and Wk = Wk−1 + πk − Yk , k = 1, 2, . . . , m. Simple algebra reveals that, ψ(π, Y) =
m X
ψk (πk , Yk ),
k=1
where ψk (πk , Yk ) = (s − c − b)Yk + (b + h) min{Wk−1 + πk , Yk } − h(Wk−1 + πk ), k = 1, 2, . . . , m. Given Ik = Fk , we wish to find the optimal order quantity πk∗ for period k (k = 1, . . . , m). First let us see what we can do if we are clairvoyant. Here we will assume that all the future demand is known. It is not hard to see that πkd (ω0 ) = Yk (ω0 ), k = 1, 2, . . . , m, and φd (ω0 ) = (s − c)
m X
Yk (ω0 ).
k=1
Pm ˆ If we can implement this, then the profit experienced is ψ(Y) = (s − c) k=1 Yk and the ˆ expected profit is E[ψ(Y)] = (s − c)mθ. Suppose we assume that the future demand {Y1 , Y2 , . . . , Ym } for the next m periods given I0 are i.i.d. with exponential density function with mean θ (that is fY (y) = θ1 exp{− θ1 y}, y ≥ 0). Let q φk (q, θ) = E[(b + h) min{q, Yk } − hq] = (b + h)θ(1 − exp{− }) − hq, k = 1, 2, . . . , m. θ Then q ∗ (θ) = arg max{φk (q, θ)} = θ log(
b+h ). h
It is then clear that πk (θ) = q ∗ (θ) − Wk−1 , k = 1, 2, . . . , m, and
b+h ). h ¯ as an estimate for the θ for implementing this policy, we get If we use X φ(θ) = (s − c)mθ − hmθ log(
ˆ ψ(Y) = (s − c − b)
m X
k=1
Yk + (b + h)
m X
k=1
m
¯ log( min{X
X b+h ¯ log( b + h ), ), Yk } − h X h h k=1
Model Uncertainty, Robust Optimization and Learning c 2006 INFORMS INFORMS—Pittsburgh 2006, °
23
and an a priori expected profit of Ee[
n 1 ˆ n b+h ψ(Y)] = (s − c)θ − bθ( ) − 1). )n − hθ(( )n + log( b+h m h n + log( b+h n + log( ) ) h h
However, if we continue to update the estimate we have ¯ k log( b + h ) − Wk−1 , 0}, k = 1, 2, . . . , m, π ˆk = max{X h and
1 ˆ ˆ lim ψ(Y) = E e [ ψ(Y)] m
m→∞
We will now apply operational learning to this problem (for details of this analysis see Lim, Shanthikumar and Shen (2006a)). Specifically let H1 be the collection of order-onehomogeneous functions. Then, in operational learning we are interested in max1
πk ∈H
m X
Eθe [φk (πk , θ)],
k=1
where φk (πk , θ) = (b + h)E[min{Wk−1 + πk , Yk }] − hE[(Wk−1 + πk )], W0 = 0 and Wk = Wk−1 + πk − Yk , k = 1, 2, . . . , m. First we will consider the last period. Let Y1 be an empty vector and Yk = (Y1 , . . . , Yk−1 ), k = 2, . . . , m. Define the random vector Vm (|Vm | = 1) and the dependent random variable Rm such that (see Section 6.2) Vm d = (X, Ym ). Rm Now let π ˜m (z) = arg max{ERm [
φm (q, Rm ) n+m−1 |Vm = z] : q ≥ 0}, z ∈ R+ , |z| = 1, Rm
and π ˜m (x) = |x|˜ ym (
x n+m−1 ), x ∈ R+ . |x|
Define πm (X, Ym , w) = max{˜ ym (X, Ym ), w − Ym−1 }. and n+m−2 φ∗m−1 (x, q, θ) = φm−1 (q, θ) + EYm−1 [φm (πm (x, Ym−1 , q), θ)], x ∈ R+ .
Having defined this for the last period, we can now set up the recursion for any period as follows: Define the random vector Vk (|Vk | = 1) and the dependent random variable Rk such that Vk d = (X, Yk ), k = 1, 2, . . . , m − 1. Rk Now let π ˜k (z) = arg max{ERk [
φ∗k (z, q, Rk ) n+k−1 , |z| = 1, |Vk = z] : q ≥ 0}, z ∈ R+ Rk
and
$$\tilde{\pi}_k(\mathbf{x}) = |\mathbf{x}| \tilde{\pi}_k\Big( \frac{\mathbf{x}}{|\mathbf{x}|} \Big), \quad \mathbf{x} \in \mathbb{R}^{n+k-1}_+.$$
Define
$$\pi_k(\mathbf{X}, \mathbf{Y}_k, w) = \max\{ \tilde{\pi}_k(\mathbf{X}, \mathbf{Y}_k), w - Y_{k-1} \}$$
and
$$\phi^*_{k-1}(\mathbf{x}, q, \theta) = \phi_{k-1}(q, \theta) + E_{Y_{k-1}}[\phi^*_k(\pi_k(\mathbf{x}, Y_{k-1}, q), \theta)], \quad \mathbf{x} \in \mathbb{R}^{n+k-2}_+.$$
Now the target inventory levels $\tilde{\pi}_k$ and the cost-to-go functions $\phi^*_{k-1}$ can be recursively computed starting with $k = m$. Computation of these operational statistics using numerical algorithms and/or simulation is discussed in Lim, Shanthikumar and Shen (2006a).

7.2. Inventory Control with Sales Data
Let $m$, $c$, $s$, and $\{Y_1, Y_2, \ldots, Y_m\}$ be as defined earlier. At the end of each period all remaining inventory (if any) is discarded (and there is no salvage value). Furthermore, any excess demand is lost and lost demand cannot be observed. Let $\pi_k$ ($\pi_k \geq 0$) be the order quantity at the beginning of period $k$ ($k = 1, 2, \ldots, m$). Then the total profit for the $m$ periods is
$$\psi(\pi, \mathbf{Y}) = \sum_{k=1}^{m} \psi_k(\pi_k, Y_k),$$
where $\psi_k(\pi_k, Y_k) = sS_k - c\pi_k$, and $S_k = \min\{\pi_k, Y_k\}$ is the sales in period $k$, $k = 1, 2, \ldots, m$. Here $I_k(\pi) = \sigma(\{(S_j, \pi_j), j = 1, 2, \ldots, k\} \cup I_0)$. We wish to find the optimal order quantity $\pi^*_k$ for period $k$ ($k = 1, \ldots, m$).

Suppose we assume that the future demand $\{Y_1, Y_2, \ldots, Y_m\}$ for the next $m$ periods given $I_0$ is i.i.d. with an exponential density function with mean $\theta$ (that is, $f_Y(y) = \frac{1}{\theta}\exp\{-\frac{1}{\theta}y\}$, $y \geq 0$). If we knew $\theta$, this would be exactly the same as the inventory rat problem. However, if $\theta$ is unknown (which will be the case in practice), we need to estimate it using possibly censored data. Suppose we have past demands, say $\{X_1, \ldots, X_n\}$, and past sales $\{R_1, \ldots, R_n\}$. Let $I_k = I\{X_k = R_k\}$ be the indicator that the sales equal the demand in period $k$ (which will be the case if we had more on-hand inventory than the demand). Given $(\mathbf{R}, \mathbf{I})$, the maximum likelihood estimator $\Theta_{MLE}$ of $\theta$ is (assuming that $\sum_{k=1}^{n} I_k \geq 1$, that is, at least once we got to observe the true demand)
$$\Theta_{MLE} = \frac{1}{\sum_{k=1}^{n} I_k} \sum_{k=1}^{n} R_k.$$
The implemented order quantities are then (assuming no further updates of the estimator)
$$\hat{\pi}_k = \Theta_{MLE} \log\Big(\frac{s}{c}\Big), \quad k = 1, 2, \ldots, m,$$
and the profit is
$$\hat{\psi}(\mathbf{Y}) = \sum_{k=1}^{m} \Big\{ s \min\Big\{ \Theta_{MLE} \log\Big(\frac{s}{c}\Big), Y_k \Big\} - c\Theta_{MLE} \log\Big(\frac{s}{c}\Big) \Big\}.$$
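A minimal sketch of this censored-data estimator and the resulting plug-in order (the sales history here is simulated purely for illustration):

```python
import numpy as np

def theta_mle_censored(sales, fully_observed):
    """MLE of the exponential mean from sales censored at the stock level:
    Theta_MLE = sum(R_k) / sum(I_k), where I_k = 1 iff demand was fully observed."""
    I = np.asarray(fully_observed, dtype=float)
    assert I.sum() >= 1, "need at least one uncensored observation"
    return float(np.sum(sales) / I.sum())

rng = np.random.default_rng(2)
stock = np.full(8, 1.5)                      # past stocking levels
demand = rng.exponential(1.0, size=8)        # true demands (unobservable)
sales = np.minimum(stock, demand)            # R_k = min(stock, X_k)
I = (demand < stock).astype(int)             # I_k = 1 when sales equal demand

s, c = 1.2, 1.0
theta_hat = theta_mle_censored(sales, I)
print(theta_hat, theta_hat * np.log(s / c))  # estimate and plug-in order
```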
We will now show how operational learning can be implemented for a one-period problem ($m = 1$). Integrated learning for the multi-period case can be done similarly to the first example (see Lim, Shanthikumar and Shen (2006a)). Suppose we are interested in
$$\max_{\pi \in H_t} E_{\tilde{X}, \tilde{R}}\left[ s\, E_{\tilde{Y}_1}\big[\min\{\pi(\tilde{X}, \tilde{R}), \tilde{Y}_1\}\big] - c\, \pi(\tilde{X}, \tilde{R}) \right]$$
for some suitably chosen class $H_t$ of operational functions that includes the MLE-based order quantity. This class should also allow us to find the solution without knowledge of $\theta$ (what to do in operational learning when this is not possible is discussed in Chu, Shanthikumar and Shen (2006b)). Since $R_k \le X_k$, with $R_k = X_k$ when $I_k = 1$, and choosing a value of $X_k > R_k$ for $I_k = 0$, we can rewrite the MLE estimator as
$$\Theta_{MLE} = \frac{1}{\sum_{k=1}^n I\{X_k \le R_k\}} \sum_{k=1}^n \min\{X_k, R_k\}.$$
Suppose $H_t$ satisfies the following:
$$H_t = \big\{\eta : \mathbb{R}^n_+ \times \mathbb{R}^n_+ \to \mathbb{R}_+ \;:\; \eta(\alpha x, \alpha r) = \alpha\, \eta(x, r),\ \alpha \ge 0;\;\; \eta(y, r) = \eta(x, r) \text{ for } y = x + (\alpha_1 I\{x_1 \ge r_1\}, \ldots, \alpha_n I\{x_n \ge r_n\}),\ \alpha_k \ge 0\big\}.$$
That is, the operational functions are scale invariant and are unaffected by the (unobservable) magnitudes of censored demands. It is now easy to see that the function
$$h(x, r) = \frac{1}{\sum_{k=1}^n I\{x_k \le r_k\}} \sum_{k=1}^n \min\{x_k, r_k\}$$
is an element of $H_t$.
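As a quick sanity check (our own, with made-up numbers), both defining properties of $H_t$ can be verified for $h$ numerically; note that the perturbation is applied only to strictly censored coordinates ($x_k > r_k$), since under a continuous demand distribution ties $x_k = r_k$ occur with probability zero:

```python
import numpy as np

def h(x, r):
    """Operational estimator: observed sales total divided by the
    number of uncensored observations (x_k <= r_k)."""
    x, r = np.asarray(x, float), np.asarray(r, float)
    return np.minimum(x, r).sum() / (x <= r).sum()

x = np.array([2.0, 7.0, 3.0])             # demands
r = np.array([2.0, 4.0, 3.0])             # sales; only period 1 is censored

print(h(x, r))                             # base value
print(h(2.5 * x, 2.5 * r) / 2.5)           # scale invariance: same value
y = x + np.array([0.0, 9.0, 0.0]) * (x > r)  # inflate only the censored demand
print(h(y, r))                             # unchanged: same value
```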
Within this class of functions, the optimal operational statistic is
$$\pi(x, r) = \left( \left(\frac{s}{c}\right)^{1/\left(1 + \sum_{k=1}^n I\{x_k \le r_k\}\right)} - 1 \right) \sum_{k=1}^n \min\{x_k, r_k\}.$$
Hence the operational order quantity is
$$\hat{\pi} = \left( \left(\frac{s}{c}\right)^{1/\left(1 + \sum_{k=1}^n I_k\right)} - 1 \right) \sum_{k=1}^n R_k.$$
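To see the two policies side by side, the following Monte Carlo sketch (illustrative parameters of our own choosing; exponential demand assumed, as above) evaluates the expected one-period profit $E[s\min\{q, Y\} - cq] = s\theta(1 - e^{-q/\theta}) - cq$ at the MLE plug-in quantity and at the operational order quantity:

```python
import numpy as np

rng = np.random.default_rng(1)
s, c, theta, n = 10.0, 4.0, 100.0, 5      # assumed illustrative parameters
orders_hist = np.full(n, 120.0)           # historical order quantities

def expected_profit(q):
    # E[s*min{q, Y} - c*q] for Y ~ exponential with mean theta
    return s * theta * (1.0 - np.exp(-q / theta)) - c * q

mle_profit, ops_profit = [], []
for _ in range(20000):
    demand = rng.exponential(theta, n)
    sales = np.minimum(orders_hist, demand)
    unc = demand <= orders_hist
    if unc.sum() == 0:
        continue                           # MLE undefined without an uncensored period
    q_mle = (sales.sum() / unc.sum()) * np.log(s / c)
    q_ops = ((s / c) ** (1.0 / (1.0 + unc.sum())) - 1.0) * sales.sum()
    mle_profit.append(expected_profit(q_mle))
    ops_profit.append(expected_profit(q_ops))

print(np.mean(mle_profit), np.mean(ops_profit))
```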
Observe that if $I_k = 1$ for $k = 1, 2, \ldots, n$ (that is, if there is no censoring), this policy is identical to the operational statistics policy for the newsvendor problem (see Section 6.2).

7.3. Portfolio Selection with Discrete Decision Epochs
We wish to invest in one or more of $l$ stocks with random returns and a bank account with a known interest rate. Suppose at the beginning of period $k$ we have a total wealth of $V_{k-1}$. If we invest $\pi_k(i) V_{k-1}$ in stock $i$ ($i = 1, 2, \ldots, l$) and leave $(1 - \pi_k' e) V_{k-1}$ in the bank during period $k$, we will have a total wealth of $V_k(\pi_k) = Y_k(\pi_k) V_{k-1}$ at the end of period $k$, $k = 1, 2, \ldots, m$. Here $\pi_k = (\pi_k(1), \pi_k(2), \ldots, \pi_k(l))'$, $e = (1, 1, \ldots, 1)'$ is an $l$-vector of ones, and $Y_k(\pi_k) - 1$ is the rate of return for period $k$ with a
portfolio allocation $\pi_k$. The utility of the final wealth $V_m$ for a portfolio selection $\pi$ and utility function $U$ is then
$$\psi(\pi, Y) = U\Big(v_0 \prod_{k=1}^m Y_k(\pi_k)\Big),$$
where $v_0$ is the initial wealth at time 0.

We will now discuss how we traditionally complete these models, find the optimal policies and implement them. Naturally, to complete the modelling, we need to define a probability measure $P$ on $(\Omega, \mathcal{F}, (\mathcal{F}_k)_{k \in M})$ given $I_0$ and decide the sense (usually in the sense of expectation under $P$) in which the reward function is maximized. In these examples, we almost always simplify the analysis further by assuming a parametric family for $F_Y$. We will first describe the classical continuous-time model, which we will use to create our discrete-time parametric model $Y_k(\pi_k)$, $k = 1, 2, \ldots, m$.

Suppose the price process $\{S_t(i), 0 \le t \le m\}$ of stock $i$ is given by
$$dS_t(i) = \big(\mu_t(i)\,dt + \sigma_t'(i)\,dW_t\big) S_t(i), \quad 0 \le t \le m;\ i = 1, 2, \ldots, l,$$
where $\{W_t, 0 \le t \le m\}$ is a vector-valued standard Brownian motion, $\mu_t(i)$ is the drift and $\sigma_t(i)$ are the volatility parameters of stock $i$, $i = 1, 2, \ldots, l$. Let $r_t$, $0 \le t \le m$, be the known interest rate. Suppose the value of the portfolio is $V_t(\pi)$ at time $t$ under a portfolio allocation policy $\pi$. Under $\pi$ the value of the investment in stock $i$ at time $t$ is $\pi_t(i) V_t(\pi)$, and the money in the bank at time $t$ is $(1 - \pi_t' e) V_t(\pi)$. Then the wealth process $V_t(\pi)$ evolves according to
$$dV_t(\pi) = V_t(\pi)\big\{(r_t + \pi_t' b_t)\,dt + \pi_t' \sigma_t'\,dW_t\big\}, \quad 0 \le t \le m,$$
where $b_t(i) = \mu_t(i) - r_t$, $i = 1, 2, \ldots, l$, and $V_0(\pi) = v_0$.

Now suppose we are only allowed to decide on the portfolio allocation at time $k - 1$, and the same allocation is maintained during $[k-1, k)$, $k = 1, 2, \ldots, m$. In the classical continuous-time model now assume that $\mu_t = \mu_k$, $\sigma_t = \sigma_k$ and $\pi_t = \pi_k$ for $k - 1 \le t < k$, $k = 1, 2, \ldots, m$. Then the utility at $T = m$ is
$$\psi(\pi, Z) = U\Big(v_0 \prod_{k=1}^m \exp\big\{r_k + \pi_k' b_k - \tfrac{1}{2}\pi_k' Q_k \pi_k + \pi_k' \sigma_k Z_k\big\}\Big),$$
where $Q_k = \sigma_k \sigma_k'$ and $\{Z_k, k = 1, 2, \ldots, m\}$ are i.i.d. standard normal random vectors. Observe that the probability measure for this model is completely characterized by the parameters $(b_k, \sigma_k)$, $k = 1, 2, \ldots, m$. We will assume that these parameters are independent of $\{Z_k, k = 1, 2, \ldots, m\}$ (though this assumption is not needed, we use it to simplify our illustration).

Suppose the values of the parameters $(b_k, \sigma_k)$, $k = 1, 2, \ldots, m$, are unknown, but we know a parameter uncertainty set for them; that is, $(b_k, \sigma_k) \in H_k$, $k = 1, 2, \ldots, m$. We wish to find a robust portfolio. We will use the robust optimization approach with a competitive ratio objective with bench-marking, carrying out the bench-marking with the log utility function. In this case, the bench-mark portfolio is the solution of
$$\max_\pi E\Big[\log\Big(v_0 \prod_{k=1}^m \exp\big\{r_k + \pi_k' b_k - \tfrac{1}{2}\pi_k' Q_k \pi_k + \pi_k' \sigma_k Z_k\big\}\Big)\Big] \equiv \max_\pi \sum_{k=1}^m \big\{r_k + \pi_k' b_k - \tfrac{1}{2}\pi_k' Q_k \pi_k\big\}.$$
It is not hard to see that
$$\pi_k^p = Q_k^{-1} b_k, \quad k = 1, 2, \ldots, m,$$
and
$$V_m^p = v_0 \prod_{k=1}^m \exp\big\{r_k + \tfrac{1}{2} b_k' Q_k^{-1} b_k + b_k' Q_k^{-1} \sigma_k Z_k\big\}.$$
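For known $(b_k, \sigma_k)$ this bench-mark allocation is a single linear solve per period; a minimal sketch with made-up parameter values:

```python
import numpy as np

# Assumed single-period parameters: excess returns b_k and volatility sigma_k.
b = np.array([0.04, 0.02])
sigma = np.array([[0.20, 0.00],
                  [0.05, 0.15]])
Q = sigma @ sigma.T                        # Q_k = sigma_k sigma_k'

pi_p = np.linalg.solve(Q, b)               # bench-mark portfolio pi_k^p = Q_k^{-1} b_k
print(pi_p, 1.0 - pi_p.sum())              # stock weights and bank-account weight
```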
Taking the ratio of $V_m$ under a policy $\pi$ to the bench-mark value $V_m^p$, we find that the bench-marked objective is
$$\max_\pi \min_{(b,\sigma)\in H} E\left[U\left(\prod_{k=1}^m \frac{\exp\big\{r_k + \pi_k' b_k - \tfrac{1}{2}\pi_k' Q_k \pi_k + \pi_k' \sigma_k Z_k\big\}}{\exp\big\{r_k + \tfrac{1}{2} b_k' Q_k^{-1} b_k + b_k' Q_k^{-1} \sigma_k Z_k\big\}}\right)\right].$$
This simplifies to
$$\max_\pi \min_{(b,\sigma)\in H} E\Big[U\Big(\prod_{k=1}^m \exp\big\{-\tfrac{1}{2}(\pi_k - Q_k^{-1}b_k)' Q_k (\pi_k - Q_k^{-1}b_k) + (\pi_k - Q_k^{-1}b_k)' \sigma_k Z_k\big\}\Big)\Big].$$
Observe that
$$E\Big[\prod_{k=1}^m \exp\big\{-\tfrac{1}{2}(\pi_k - Q_k^{-1}b_k)' Q_k (\pi_k - Q_k^{-1}b_k) + (\pi_k - Q_k^{-1}b_k)' \sigma_k Z_k\big\}\Big] = 1.$$
Furthermore, $\prod_{k=1}^m \exp\{-\tfrac{1}{2}(\pi_k - Q_k^{-1}b_k)' Q_k (\pi_k - Q_k^{-1}b_k) + (\pi_k - Q_k^{-1}b_k)' \sigma_k Z_k\}$ is a log-concave stochastic function. Hence for any concave utility function $U$ the above objective can be rewritten as
$$\min_\pi \max_{(b,\sigma)\in H} \sum_{k=1}^m (\pi_k - Q_k^{-1}b_k)' Q_k (\pi_k - Q_k^{-1}b_k).$$
This decomposes into a sequence of single-period problems:
$$\sum_{k=1}^m \Big\{ \min_{\pi_k} \max_{(b_k,\sigma_k)\in H_k} (\pi_k - Q_k^{-1}b_k)' Q_k (\pi_k - Q_k^{-1}b_k) \Big\}.$$
Given the uncertainty sets $H_k$, $k = 1, 2, \ldots, m$, the above robust optimization problem can be solved using duality (see Lim, Shanthikumar and Watewai (2006a)).
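The cited paper treats general $H_k$ via duality; purely as an illustration, the sketch below handles a hypothetical finite scenario set $H_k$ by direct numerical minimization of the worst-case quadratic (the scenario values and the solver choice are our own assumptions):

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical finite uncertainty set H_k: scenarios for (b_k, sigma_k).
scenarios = []
for b in ([0.04, 0.02], [0.06, 0.01], [0.03, 0.05]):
    sigma = np.array([[0.20, 0.00], [0.05, 0.15]])
    Q = sigma @ sigma.T
    scenarios.append((np.array(b), Q))

def worst_case(pi):
    # max over (b, Q) in H_k of (pi - Q^{-1} b)' Q (pi - Q^{-1} b)
    vals = []
    for b, Q in scenarios:
        d = pi - np.linalg.solve(Q, b)
        vals.append(d @ Q @ d)
    return max(vals)

# The objective is a max of convex quadratics; a derivative-free
# method is adequate for this low-dimensional illustration.
res = minimize(worst_case, x0=np.zeros(2), method="Nelder-Mead")
print(res.x, res.fun)                      # robust allocation and worst-case loss
```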
8. Summary and Conclusion
Interest in model uncertainty, robust optimization and learning in the OR/MS areas is growing rapidly. The types of model uncertainty considered in the literature can be broadly categorized into three classes: models with uncertainty sets for (1) variables, (2) parameters and (3) measures. The robust optimization approaches used to find robust solutions fall into (a) min-max and (b) min-max with bench-marking; two common ways of bench-marking are through (1) regret and (2) competitive ratio. The main focus in OR/MS has been on the development of models with uncertainty sets for variables (deterministic models of model uncertainty) and on deterministic min-max and min-max-regret robust optimization. Within this framework, the emphasis has been on developing efficient solution procedures for robust optimization. Only a very limited amount of work has looked at stochastic models of model uncertainty and at robust optimization with bench-marking, and very little has been done on learning. We believe that a substantial amount of work remains to be done on these latter three topics.
Acknowledgement This work was supported in part by the NSF grant DMI-0500503 (for Lim and Shanthikumar) and the NSF CAREER awards DMI-0348209 (for Shen) and DMI-0348746 (for Lim).
References
Agrawal, V. and S. Seshadri (2000) Impact of Uncertainty and Risk Aversion on Price and Order Quantity in the Newsvendor Problem, Manufacturing and Service Operations Management, 2, 410-423.
Ahmed, S., U. Cakmak and A. Shapiro (2005) Coherent Risk Measures in Inventory Problems, Technical Report, School of Industrial and Systems Engineering, Georgia Institute of Technology, Atlanta, GA.
Anderson, E. W., L. P. Hansen, and T. J. Sargent (1998) Risk and Robustness in Equilibrium, Technical Report, University of Chicago.
Anderson, E. W., L. P. Hansen, and T. J. Sargent (2003) A Quartet of Semigroups for Model Specification, Robustness, Price of Risk, and Model Detection, Journal of the European Economic Association, 1, 68-123.
Atamturk, A. (2003) Strong Formulations of Robust Mixed 0-1 Programming, to appear in Mathematical Programming.
Atamturk, A. and M. Zhang (2004) Two-Stage Robust Network Flow and Design under Demand Uncertainty, to appear in Operations Research.
Averbakh, I. (2000) Minmax regret solutions for minmax optimization problems with uncertainty, Operations Research Letters, 27, 57-65.
Averbakh, I. (2001) On the complexity of a class of combinatorial optimization problems with uncertainty, Mathematical Programming, 90, 263-272.
Averbakh, I. (2004) Minmax regret linear resource allocation problems, Operations Research Letters, 32, 174-180.
Azoury, K. S. (1985) Bayes Solution to Dynamic Inventory Models under Unknown Demand Distribution, Management Science, 31, 1150-1160.
Ben-Tal, A. and A. Nemirovski (1998) Robust Convex Optimization, Mathematics of Operations Research, 23, 769-805.
Ben-Tal, A. and A. Nemirovski (1999) Robust solutions of uncertain linear programs, Operations Research Letters, 25, 1-13.
Ben-Tal, A. and A. Nemirovski (2000) Robust Solutions of Linear Programming Problems Contaminated with Uncertain Data, Mathematical Programming, A88, 411-424.
Ben-Tal, A. and A. Nemirovski (2002) Robust optimization - methodology and applications, Mathematical Programming, B92, 453-480.
Berger, J. O. (1985) Statistical Decision Theory and Bayesian Analysis, Second Edition, Springer, New York, NY.
Bernhard, P. (2003) A robust control approach to option pricing, Applications of Robust Decision Theory and Ambiguity in Finance (M. Salmon, ed.), City University Press, London.
Bernhard, P. (2003) Robust control approach to option pricing, including transaction costs, Advances in Dynamic Games, Annals of the International Society of Dynamic Games, 7 (A. S. Nowak, K. Szajowski, eds.), Birkhauser.
Bertsekas, D. (2003) Convex Analysis and Optimization, Athena Scientific.
Bertsimas, D., D. Pachamanova and M. Sim (2004) Robust Linear Optimization under General Norms, Operations Research Letters, 32, 510-516.
Bertsimas, D. and M. Sim (2003) Robust Discrete Optimization and Network Flows, Mathematical Programming Series B, 98, 49-71.
Bertsimas, D. and M. Sim (2004) The Price of Robustness, Operations Research, 52, 35-53.
Bertsimas, D. and M. Sim (2004) Robust Discrete Optimization under Ellipsoidal Uncertainty Sets, working paper, MIT.
Bertsimas, D. and M. Sim (2006) Tractable Approximation to Robust Conic Optimization Problems, Mathematical Programming, 107, 5-36.
Bertsimas, D. and A. Thiele (2003) A Robust Optimization Approach to Inventory Theory, Operations Research, 54, 150-168.
Bienstock, D. and N. Ozbay (2005) Computing Robust Basestock Levels, CORC Report TR-2005-09, Columbia University, NY.
Birge, J. R. and F. Louveaux (1997) Introduction to Stochastic Programming, Springer, New York.
Boyd, S. and L. Vandenberghe (2004) Convex Optimization, Cambridge University Press, Cambridge, UK.
Cagetti, M., L. P. Hansen, T. Sargent and N. Williams (2002) Robust Pricing with Uncertain Growth, Review of Financial Studies, 15(2), 363-404.
Cao, H. H., T. Wang and H. H. Zhang (2005) Model Uncertainty, Limited Market Participation, and Asset Prices, Review of Financial Studies, 18, 1219-1251.
Chen, X., M. Sim, D. Simchi-Levi and P. Sun (2004) Risk Aversion in Inventory Management, Working paper, MIT, Cambridge, MA.
Chen, X., M. Sim and P. Sun (2004) A Robust Optimization Perspective of Stochastic Programming, Technical Report, National University of Singapore, Singapore.
Chen, X., M. Sim, P. Sun and J. Zhang (2006) A Tractable Approximation of Stochastic Programming via Robust Optimization, Technical Report, National University of Singapore, Singapore.
Chen, Z. and L. G. Epstein (2002) Ambiguity, Risk and Asset Returns in Continuous Time, Econometrica, 70, 1403-1443.
Chou, M., M. Sim and K. So (2006) A Robust Framework for Analyzing Distribution Systems with Transshipment, Technical Report, National University of Singapore, Singapore.
Chu, L. Y., J. G. Shanthikumar and Z-J. M. Shen (2005) Solving Operational Statistics via a Bayesian Analysis, Working paper, University of California at Berkeley.
Chu, L. Y., J. G. Shanthikumar and Z-J. M. Shen (2006a) Pricing and Revenue Management with Operational Statistics, Working paper, University of California at Berkeley.
Chu, L. Y., J. G. Shanthikumar and Z-J. M. Shen (2006b) Stochastic Optimization with Operational Statistics: A General Framework, Working paper, University of California at Berkeley.
D'Amico, S. (2005) Density Selection and Combination under Model Ambiguity: An Application to Stock Returns, Technical Report 2005-09, Division of Research and Statistics and Monetary Affairs, Federal Reserve Board, Washington, D.C.
Ding, X., M. L. Puterman and A. Bisi (2002) The Censored Newsvendor and the Optimal Acquisition of Information, Operations Research, 50, 517-527.
Dow, J. and S. Werlang (1992) Ambiguity Aversion, Risk Aversion, and the Optimal Choice of Portfolio, Econometrica, 60, 197-204.
El Ghaoui, L. and H. Lebret (1997) Robust Solutions to Least Squares Problems with Uncertain Data Matrices, SIAM Journal on Matrix Analysis and Applications, 18, 1035-1064.
El Ghaoui, L., F. Oustry and H. Lebret (1998) Robust Solutions to Uncertain Semidefinite Programs, SIAM Journal on Optimization, 9, 33-52.
Ellsberg, D. (1961) Risk, Ambiguity and the Savage Axioms, Quarterly Journal of Economics, 75, 643-669.
Epstein, L. G. (2006) An axiomatic model of non-Bayesian updating, Review of Economic Studies, forthcoming.
Epstein, L. G. and J. Miao (2003) A Two-Person Dynamic Equilibrium under Ambiguity, Journal of Economic Dynamics and Control, 27, 1253-1288.
Epstein, L. G. and M. Schneider (2003) Recursive Multiple Priors, Journal of Economic Theory, 113, 1-31.
Epstein, L. G. and M. Schneider (2003) IID: independently and indistinguishably distributed, Journal of Economic Theory, 113, 32-50.
Epstein, L. G. and M. Schneider (2005) Learning under ambiguity, Working paper, University of Rochester.
Epstein, L. G. and M. Schneider (2005) Ambiguity, information quality and asset pricing, Working paper, University of Rochester.
Epstein, L. G., J. Noor and A. Sandroni (2005) Non-Bayesian updating: a theoretical framework, Working paper, University of Rochester.
Epstein, L. G. and T. Wang (1994) Intertemporal Asset Pricing Under Knightian Uncertainty, Econometrica, 62, 283-322.
Erdogan, E. and G. Iyengar (2006) Ambiguous Chance Constrained Problems and Robust Optimization, Mathematical Programming, 107, 37-61.
Follmer, H. and A. Schied (2002) Robust representation of convex measures of risk, Advances in Finance and Stochastics, Essays in Honour of Dieter Sondermann, Springer-Verlag, 39-56.
Follmer, H. and A. Schied (2002) Stochastic Finance: An Introduction in Discrete Time, de Gruyter Studies in Mathematics 27, Second edition (2004), Berlin.
Garlappi, L., R. Uppal and T. Wang (2005) Portfolio Selection with Parameter and Model Uncertainty: A Multi-Prior Approach, C.E.P.R. Discussion Papers 5041.
Gallego, G., J. Ryan and D. Simchi-Levi (2001) Minimax Analysis for Finite Horizon Inventory Models, IIE Transactions, 33, 861-874.
Gilboa, I. and D. Schmeidler (1989) Maxmin Expected Utility with Non-unique Prior, Journal of Mathematical Economics, 18, 141-153.
Goldfarb, D. and G. Iyengar (2003) Robust Portfolio Selection Problem, Mathematics of Operations Research, 28, 1-28.
Hansen, L. P. and T. J. Sargent (2001) Acknowledging Misspecification in Macroeconomic Theory, Review of Economic Dynamics, 4, 519-535.
Hansen, L. P. and T. J. Sargent (2001) Robust Control and Model Uncertainty, American Economic Review, 91, 60-66.
Hansen, L. P. and T. J. Sargent (2003) Robust Control of Forward Looking Models, Journal of Monetary Economics.
Hansen, L. P. and T. J. Sargent (2006) Robust Control and Economic Model Uncertainty, Princeton University Press, Princeton, NJ (forthcoming).
Hansen, L. P., T. J. Sargent and T. D. Tallarini, Jr. (1999) Robust Permanent Income and Pricing, Review of Economic Studies, 66, 873-907.
Hansen, L. P., T. J. Sargent, G. A. Turmuhambetova and N. Williams (2002) Robustness and Uncertainty Aversion, Working paper, University of Chicago.
Hansen, L. P., T. J. Sargent and N. E. Wang (2002) Robust Permanent Income and Pricing with Filtering, Macroeconomic Dynamics, 6, 40-84.
Iyengar, G. (2005) Robust Dynamic Programming, Mathematics of Operations Research, 30, 257-280.
Jain, A., A. E. B. Lim and J. G. Shanthikumar, Incorporating Model Uncertainty and Learning in Operations Management, Working paper, University of California at Berkeley.
Karlin, S. (1960) Dynamic Inventory Policy with Varying Stochastic Demands, Management Science, 6, 231-258.
Kass, R. E. and L. Wasserman (1996) The Selection of Prior Distributions by Formal Rules, Journal of the American Statistical Association, 91, 1343-1370.
Knight, F. H. (1921) Risk, Uncertainty and Profit, Houghton Mifflin, Boston, MA.
Kouvelis, P. and G. Yu (1997) Robust Discrete Optimization and Its Applications, Kluwer Academic Publishers, Boston, MA.
Lariviere, M. A. and E. L. Porteus (1999) Stalking Information: Bayesian Inventory Management with Unobserved Lost Sales, Management Science, 45, 346-363.
Lim, A. E. B. and J. G. Shanthikumar (2004) Relative Entropy, Exponential Utility and Robust Dynamic Pricing, Working paper, University of California at Berkeley (to appear in Operations Research).
Lim, A. E. B., J. G. Shanthikumar and Z-J. M. Shen (2006a) Dynamic Learning and Optimization with Operational Statistics, Working paper, University of California at Berkeley.
Lim, A. E. B., J. G. Shanthikumar and Z-J. M. Shen (2006b) Duality for Relative Performance Objectives, Working paper, University of California at Berkeley.
Lim, A. E. B., J. G. Shanthikumar and T. Watewai (2005) Relative Performance Measures of Portfolio Robustness, Working paper, University of California at Berkeley.
Lim, A. E. B., J. G. Shanthikumar and T. Watewai (2006a) Robust Multi-Product Dynamic Pricing, Working paper, University of California at Berkeley.
Lim, A. E. B., J. G. Shanthikumar and T. Watewai (2006b) A Balance Between Optimism and Pessimism in Robust Portfolio Choice Problems through Certainty Equivalent Ratio, Working paper, University of California at Berkeley.
Liu, J., J. Pan and T. Wang (2006) An Equilibrium Model of Rare-Event Premia, Review of Financial Studies, to appear.
Liyanage, L. and J. G. Shanthikumar (2005) A Practical Inventory Policy Using Operational Statistics, Operations Research Letters, 33, 341-348.
Porteus, E. L. (2002) Foundations of Stochastic Inventory Theory, Stanford University Press, Stanford, CA.
Robert, C. P. (2001) The Bayesian Choice, Second Edition, Springer, New York, NY.
Ruszczynski, A. and A. Shapiro (eds.) (2003) Stochastic Programming, Handbooks in Operations Research and Management Science, Volume 10, Elsevier, New York.
Savage, L. J. (1972) The Foundations of Statistics, Second Edition, Dover, New York.
Scarf, H. (1959) Bayes Solutions of the Statistical Inventory Problem, Annals of Mathematical Statistics, 30, 490-508.
Soyster, A. L. (1973) Convex Programming with Set-Inclusive Constraints and Applications to Inexact Linear Programming, Operations Research, 21, 1154-1157.
Sutton, R. S. and A. G. Barto (1998) Reinforcement Learning: An Introduction, The MIT Press, Cambridge, MA.
Uppal, R. and T. Wang (2003) Model Misspecification and Under Diversification, Journal of Finance, 58, 2465-2486.
van der Vlerk, M. H. (2006) Stochastic Programming Bibliography, World Wide Web, http://mally.eco.rug.nl/spbib.html, 1996-2003.
Vapnik, V. N. (2000) The Nature of Statistical Learning Theory, Second Edition, Springer, New York, NY.
Wald, A. (1950) Statistical Decision Functions, John Wiley and Sons, New York.
Zipkin, P. H. (2000) Foundations of Inventory Management, McGraw Hill, New York.