Ann Oper Res (2006) 146:119–134 DOI 10.1007/s10479-006-0051-6

Using error bounds to compare aggregated generalized transportation models Igor S. Litvinchev · Socorro Rangel

Published online: 24 June 2006 © Springer Science + Business Media, LLC 2006

Abstract  A comparative study of aggregation error bounds for the generalized transportation problem is presented. A priori and a posteriori error bounds were derived, and a computational study was performed to (a) test the correlation between the a priori, the a posteriori, and the actual error and (b) quantify the difference of the error bounds from the actual error. Based on the results we conclude that calculating the a priori error bound can be considered a useful strategy to select the appropriate aggregation level. The a posteriori error bound provides a good quantitative measure of the actual error.

Keywords  Clustering · Network models · Approximation algorithms

Introduction

An important issue in optimization modelling is the trade-off between the level of detail to employ and the ease of using and solving the model. Aggregation/disaggregation techniques have been developed to facilitate the analysis of large models and to provide the decision maker with a set of tools to compare models that have a high degree of detail with smaller aggregated models that have less detail. Aggregated models are simplified, yet approximate, representations of the original complex model and may be used in various stages of the decision making process. Frequently, a number of aggregated models can be considered for the same original model, and it is then desirable to select an aggregated model providing the tightest (in some sense) approximation.

In optimization-based modelling, the difference between the actual optimal objective function value and the aggregated optimal objective function value can be used as a measure

I. S. Litvinchev
Computing Center Russian Academy of Sciences, Vavilov 40, Moscow 119991, Russia

S. Rangel (corresponding author)
Department of Computer Science and Statistics, UNESP, Rua Cristóvão Colombo, 2265, S.J. Rio Preto, SP 15054-000, Brazil
e-mail: [email protected]


of the error introduced by aggregation. Many upper bounds on this error have been developed for aggregated optimization models and are referred to as error bounds (Leisten, 1997; Litvinchev and Rangel, 1999; Litvinchev and Tsurkov, 2003, and references therein). Two types of error bounds are considered: a priori error bounds and a posteriori error bounds. A priori error bounds may be calculated after the model has been aggregated but before the smaller aggregated model has been optimized. A posteriori error bounds are calculated after the aggregated model has been optimized. Error bounds allow the modeler to compare various types of aggregation and, if the error bounds are not adequate, to modify the aggregation strategy in an attempt to tighten them (Rogers et al., 1991).

There are two main issues associated with the error bounds. The first concerns the difference between an error bound and the actual error: the closer the error bound is to the actual error, the better. The second issue is how to compare different aggregation strategies based on the error bounds. It is not obvious that the aggregation yielding the tightest error bound has the smallest actual error (Rogers et al., 1991). Moreover, different bound calculation techniques may result in a different relationship (correlation) between the error bound and the actual error. The first issue above is relatively well explored, at least for the case of a posteriori error bounds for linear programming (see Leisten, 1997; Litvinchev and Tsurkov, 2003; Rogers et al., 1991, and the references therein). The second area is much less investigated. To our knowledge, the first comprehensive study in this direction was made recently in Norman, Rogers, and Levy (1999) for the classical transportation problem (TP), where they explored, among others, the following question:

• If a model is aggregated using different methods to create several different smaller models and an error bound (a priori and/or a posteriori) is calculated for each aggregated model, does the model with the tightest error bound(s) give the closest approximation of the objective function value?

Many applications of aggregation/disaggregation theory in optimization involve network problems such as location models (Francis, Lowe, and Tamir, 2002), the generalized assignment model (Hallefjord, Jornsten, and Varbrand, 1993) and, in particular, the TP. For the latter, both a priori and a posteriori error bounds were developed and investigated (Rogers et al., 1991; Zipkin, 1980a). In network flow problems the nodes traditionally correspond to constraints, and arcs to variables. For most aggregation schemes, aggregation of nodes (constraints) implies or necessitates a corresponding aggregation of arcs (variables). Usually, only variables and constraints of the same type (e.g., sources or destinations) are aggregated.

The TP minimizes the cost of transporting supplies from a set of sources to a set of destinations, and it is assumed that the transportation flow is conserved on every arc between the source and the destination. In the generalized transportation problem (GTP) the amount of flow that leaves the source can differ from the amount of flow that arrives at the destination. A certain multiplier is associated with each arc to represent a "gain" or a "loss" of a unit flow between the source and the destination. There are two common interpretations of the arc multipliers (Ahuja, Magnanti, and Orlin, 1993). In the first interpretation the multipliers are employed to modify the amount of flow of some particular item. Therefore it is possible to model situations involving physical or administrative transformations, for example, evaporation, seepage, or monetary growth due to interest rates.
In the second interpretation, the multipliers are used to transform one type of item into another (manufacturing, machine loading and scheduling, currency exchanges, etc.). For a more detailed discussion on the GTP and its applications see Ahuja, Magnanti, and Orlin (1993) and Glover, Klingman, and Phillips (1990). In contrast to the TP, much less work has been done


in aggregation for the GTP. Only a posteriori error bounds for aggregation in the GTP have been developed in Evans (1979). In this paper we present a study of the correspondence between the error bounds and the actual error for the aggregated GTP. The paper is organized as follows. In Section 1, we state the aggregated GTP and consider its properties. In the next section, a family of a priori error bounds for a class of the GTP is derived, and new a posteriori error bounds are proposed. A numerical study carried out to compare the quality of the error bounds and to determine if there are relationships between the error bounds and the actual error is presented in Section 3. Based on the results of this study we can conclude that there is a significant difference among the various methods of calculating error bounds. Moreover, the improved a priori and a posteriori error bounds have a statistically significant correlation with the actual error.

1. The original and the aggregated models

The original GTP before aggregation, (OP), is considered in the following form (Evans, 1979):

$$(\mathrm{OP})\qquad z^* = \min \sum_{i\in S}\sum_{j\in T} c_{ij}\, x_{ij}$$

$$\text{s.t.}\qquad \sum_{j\in T} d_{ij}\, x_{ij} \le a_i, \quad (i\in S),$$

$$\sum_{i\in S} x_{ij} = b_j, \quad (j\in T),$$

$$x_{ij} \ge 0, \quad (i\in S,\ j\in T).$$

Here S and T are the sets of all sources and all destinations (customers), respectively; x_ij is the amount of flow from a source node i to a destination node j; c_ij is the transportation cost; d_ij is the flow gain; a_i is the node supply, and b_j is the node demand. It is assumed that all the coefficients in (OP) are positive. The role of sources and destinations in (OP) can be reversed (Evans, 1979) by a simple transformation of variables: y_ij = d_ij x_ij. This transformation removes the arc multipliers from the first group of constraints in (OP) and places them in the second group. Taking into account this correspondence between the sources and the destinations, we consider here only aggregation of the destinations (customers) in (OP). The aggregation of the sources can be treated similarly.

The aggregation is defined by a partition of the set T of the destinations into a set T_A of clusters T_k, k = 1, …, K, such that T_k ∩ T_p = ∅ for k ≠ p and $\bigcup_{k=1}^{K} T_k = T$. Note that the columns in (OP) are also partitioned/clustered respectively. The fixed-weight variable aggregated linear programming problem (Zipkin, 1980b) is derived by replacing the columns in each cluster by their weighted average. The objective coefficients are combined similarly. For each k, consider the non-negative weights g_ij, i ∈ S, j ∈ T_k, fulfilling the following normalizing conditions:

$$\sum_{j\in T_k} g_{ij} = 1. \qquad (1)$$


The aggregated problem can be obtained by fixing the normalized weights g_ij and substituting the linear transformation (disaggregation)

$$x_{ij} = g_{ij} X_{ik}, \quad (j\in T_k) \qquad (2)$$

into (OP) (Litvinchev and Rangel, 1999). Thus the aggregated problem is:

$$\bar z = \min \sum_{i\in S}\sum_{k\in T_A} X_{ik} \sum_{j\in T_k} c_{ij}\, g_{ij} \qquad (3)$$

$$\text{s.t.}\qquad \sum_{k\in T_A} X_{ik} \sum_{j\in T_k} d_{ij}\, g_{ij} \le a_i, \quad (i\in S), \qquad (4)$$

$$\sum_{i\in S} g_{ij} X_{ik} = b_j, \quad (j\in T_k,\ k\in T_A), \qquad (5)$$

$$X_{ik} \ge 0, \quad (i\in S,\ k\in T_A). \qquad (6)$$

To write (3)–(6) we have used the observation that $\sum_{j\in T} = \sum_{k\in T_A}\sum_{j\in T_k}$. From (1) and (2) it follows that for any normalized weights we have $X_{ik} = \sum_{j\in T_k} x_{ij}$, and hence X_ik is the supply from the source i to the aggregated customer T_k. However, (3)–(6) still has |T| restrictions (5) corresponding to destinations, and hence cannot be considered as some GTP associated with S sources and T_A destinations. The number of constraints (5) can be reduced as follows.

Let the weights be chosen as $\bar g_{ij} = b_j/\sum_{j\in T_k} b_j$, $j\in T_k$. Then for each k ∈ T_A in (5) we obtain |T_k| conditions of the same form $\sum_{i\in S} X_{ik} = \sum_{j\in T_k} b_j$. The redundant constraints can be removed without changing the primal solution to form the problem which is usually referred to as the aggregated GTP (Evans, 1979):

$$(\mathrm{AP})\qquad \bar z = \min \sum_{i\in S}\sum_{k\in T_A} \bar c_{ik} X_{ik}$$

$$\text{s.t.}\qquad \sum_{k\in T_A} \bar d_{ik} X_{ik} \le a_i, \quad (i\in S),$$

$$\sum_{i\in S} X_{ik} = \bar b_k, \quad (k\in T_A),$$

$$X_{ik} \ge 0, \quad (i\in S,\ k\in T_A),$$

where $\bar c_{ik} = \sum_{j\in T_k} c_{ij}\bar g_{ij}$, $\bar d_{ik} = \sum_{j\in T_k} d_{ij}\bar g_{ij}$, $\bar b_k = \sum_{j\in T_k} b_j$.

We will denote by (ÃP) the aggregated problem obtained for the weights ḡ_ij before removing the redundant constraints. The primal solutions X_ik of (ÃP) and (AP) are the same, and the fixed-weight disaggregated solution x̄_ij = ḡ_ij X_ik, j ∈ T_k, is feasible to the original problem (OP). The dual problem to (ÃP) always has multiple optimal solutions, even if the optimal dual solution to (AP) is unique, as stated in Proposition 1.

Proposition 1. Let {ū_i, v̄_k} be an optimal dual solution to (AP). Denote

$$D(\bar v) = \Big\{ v_j,\ j\in T \ \Big|\ \sum_{j\in T_k} v_j b_j = \bar v_k \bar b_k,\ k = 1,\ldots,K \Big\}. \qquad (7)$$


Then any pair {ū, v} with v ∈ D(v̄) is an optimal dual solution to (ÃP) and

$$\bar z = -\sum_{i\in S} a_i \bar u_i + \sum_{k\in T_A}\sum_{j\in T_k} v_j b_j. \qquad (8)$$

Proof: Consider the dual to (ÃP) or, which is the same, the dual to (3)–(6) for g_ij = ḡ_ij:

(DÃP)

$$-\sum_{i\in S} a_i u_i + \sum_{k\in T_A}\sum_{j\in T_k} v_j b_j \to \max$$

$$\text{s.t.}\qquad \bar c_{ik} + \bar d_{ik} u_i - \sum_{j\in T_k} v_j \bar g_{ij} \ge 0, \quad (i\in S,\ k\in T_A),$$

$$u_i \ge 0, \quad (i\in S).$$

The dual to (AP) is:

(DAP)

$$-\sum_{i\in S} a_i u_i + \sum_{k\in T_A} v_k \sum_{j\in T_k} b_j \to \max$$

$$\text{s.t.}\qquad \bar c_{ik} + \bar d_{ik} u_i - v_k \ge 0, \quad (i\in S,\ k\in T_A),$$

$$u_i \ge 0, \quad (i\in S).$$



These two dual problems should have equal optimal values since the primal problems (AP) and (ÃP) have the same optimal solutions. Let {ū_i, v̄_k} be an optimal solution to (DAP). Construct {ũ_i, ṽ_j} as follows: ũ_i = ū_i, and ṽ_j is such that $\sum_{j\in T_k} \tilde v_j \bar g_{ij} = \bar v_k$. Comparing the restrictions of (DAP) and (DÃP), we see that this {ũ, ṽ} is feasible to (DÃP). Moreover, since $\bar g_{ij} = b_j/\sum_{j\in T_k} b_j$, it is not hard to verify that the objective value of (DÃP) for {ũ_i, ṽ_j} is equal to the optimal objective of (DAP). Hence this {ũ_i, ṽ_j} is optimal to (DÃP). □

Note that the nonuniqueness of the optimal duals in (ÃP) follows from the way the aggregation was performed, no matter whether (AP) has a unique optimal dual solution or not. The nonuniqueness of the duals will be used further to derive a priori error bounds for the GTP.
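The fixed-weight aggregation described in this section can be sketched in code as follows. The instance, the cluster sizes, and all data below are illustrative assumptions, not taken from the paper:

```python
import random

# Build the aggregated data (c_bar, d_bar, b_bar) of (AP) from a hypothetical
# random instance, using the demand-proportional weights g_ij = b_j / b_bar_k.
random.seed(7)
m, n, K = 3, 8, 2                                    # |S|, |T|, number of clusters
c = [[random.uniform(2, 20) for _ in range(n)] for _ in range(m)]   # costs c_ij
d = [[random.uniform(0.5, 3) for _ in range(n)] for _ in range(m)]  # gains d_ij
b = [random.uniform(1, 10) for _ in range(n)]                       # demands b_j

# Clusters of consecutive destinations; the last cluster takes the remainder.
size = n // K
clusters = [list(range(k * size, (k + 1) * size)) for k in range(K - 1)]
clusters.append(list(range((K - 1) * size, n)))

b_bar = [sum(b[j] for j in Tk) for Tk in clusters]   # aggregated demands b_bar_k
g = [[b[j] / b_bar[k] for j in Tk] for k, Tk in enumerate(clusters)]

# Aggregated costs and multipliers of (AP).
c_bar = [[sum(c[i][j] * w for j, w in zip(Tk, g[k]))
          for k, Tk in enumerate(clusters)] for i in range(m)]
d_bar = [[sum(d[i][j] * w for j, w in zip(Tk, g[k]))
          for k, Tk in enumerate(clusters)] for i in range(m)]

# The weights satisfy the normalizing condition (1) within every cluster.
assert all(abs(sum(gk) - 1.0) < 1e-9 for gk in g)
```

Given an optimal X_ik of (AP), the fixed-weight disaggregation x̄_ij = ḡ_ij X_ik would then yield a feasible solution of (OP), as noted above.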

2. The error bounds

If the fixed weights ḡ_ij are used to form the aggregated problem (AP), the optimal value z̄ of the objective function for (AP) is also an upper bound for the optimal objective z* of the original problem, so that z̄ − z* ≥ 0. This follows from the fact that the disaggregated solution x̄_ij = ḡ_ij X_ik, j ∈ T_k, is feasible to the original problem. To get an upper bound on the value z̄ − z* we will use the following result established in Litvinchev and Rangel (1999) for the fixed-weight aggregation in convex programming: if W is a closed convex and bounded set containing an optimal solution to the original problem (such a W is called a localization), then $\bar z - z^* \le \bar z - \min_{x\in W} L(x, \tau)$. Here L(·,·) is the standard Lagrange function associated with the original problem and τ is a vector of Lagrange multipliers. Assume that the condition x ≥ 0 is always included in the definition of W.


For the GTP, τ = {u, v}, and the Lagrangian has the form:

$$L(x, u, v) = -\sum_{i\in S} u_i a_i + \sum_{j\in T} v_j b_j + \sum_{i\in S}\sum_{j\in T} (c_{ij} + u_i d_{ij} - v_j)\, x_{ij}, \quad (u \ge 0)$$

and hence for any fixed dual multipliers u ≥ 0 and v we have:

$$\bar z - z^* \le \Big(\bar z + \sum_{i\in S} u_i a_i - \sum_{j\in T} v_j b_j\Big) + \max_{x\in W}\sum_{i\in S}\sum_{j\in T} (v_j - c_{ij} - u_i d_{ij})\, x_{ij} \equiv \varepsilon(u, v, W). \qquad (9)$$

In particular, if ũ_i, ṽ_j is an optimal solution of the dual to (ÃP), then:

$$-\sum_{i\in S} \tilde u_i a_i + \sum_{j\in T} \tilde v_j b_j = \bar z \qquad\text{and}\qquad \varepsilon(\tilde u, \tilde v, W) = \max_{x\in W}\sum_{i\in S}\sum_{j\in T} (\tilde v_j - c_{ij} - \tilde u_i d_{ij})\, x_{ij}. \qquad (10)$$

2.1. A priori error bounds

To get an a priori error bound for z̄ − z* by (10) we need to estimate the objective coefficients of the maximization problem in (10) without solving the aggregated problem. For the TP this can be done by choosing ũ_i = ū_i, ṽ_j = v̄_k, j ∈ T_k, that is, ṽ ∈ D(v̄). To see this, note that for d_ij = 1 the term in the brackets in (10) is v̄_k − c_ij − ū_i, and by the restrictions of (DAP) this is less than or equal to c̄_ik − c_ij, j ∈ T_k, which gives:

$$\bar z - z^* \le \max_{x\in W}\sum_{i\in S}\sum_{j\in T} \big(\bar c_{ik(j)} - c_{ij}\big)\, x_{ij},$$

where k(j) is the cluster to which j belongs, k(j) ∈ T_A. If the localization W is defined before an aggregated problem has been solved (we will refer to such a W as an a priori localization), then the inequality above provides the a priori error bound for the TP. However, if d_ij ≠ 1, we cannot estimate the coefficients in (10) under the same choice of the duals ũ, ṽ. For this reason it was conjectured in Evans (1979) and Rogers et al. (1991) that ". . . a priori bounds cannot be obtained for the GTP as is possible for simple transportation problems". To cope with this problem, we will use the nonuniqueness of the optimal dual multipliers for (ÃP), as stated in Proposition 1.

Proposition 2. Let d_ij = p_i t_j, where p_i > 0, t_j > 0. Then there exists v ∈ D(v̄) such that for the pair ū, v we have:

$$v_j - c_{ij} - \bar u_i d_{ij} \le (\bar c_{ik}\, d_{ij} - c_{ij}\, \bar d_{ik})/\bar d_{ik}, \quad (i\in S,\ j\in T_k).$$

Proof: From the restrictions of (DAP) we have $\bar u_i \ge (\bar v_k - \bar c_{ik})/\bar d_{ik}$, and hence:

$$v_j - c_{ij} - \bar u_i d_{ij} \le \big[(\bar c_{ik}\, d_{ij} - c_{ij}\, \bar d_{ik}) - (\bar v_k\, d_{ij} - v_j\, \bar d_{ik})\big]/\bar d_{ik}.$$

By the definition of d̄_ik and the correspondence between v̄_k and v_j, j ∈ T_k, we have:

$$\sum_{j\in T_k} b_j\, (\bar v_k\, d_{ij} - v_j\, \bar d_{ik}) = 0 \quad (\forall i).$$

Hence if v̄_k d_ij − v_j d̄_ik > 0 for some j ∈ T_k, then it is negative for another j. Now choose v_j to fulfill the conditions v̄_k d_ij − v_j d̄_ik = 0 for all j ∈ T_k. Resolving the latter system of equations with respect to v_j for d_ij = p_i t_j and $\bar d_{ik} = \big(\sum_{j\in T_k} b_j d_{ij}\big)/\big(\sum_{j\in T_k} b_j\big)$ we obtain:

$$v_j = \bar v_k\, t_j\, \frac{\sum_{j\in T_k} b_j}{\sum_{j\in T_k} b_j t_j}, \quad (j\in T_k).$$

It is simple to check that for this v_j we have $\sum_{j\in T_k} v_j b_j = \bar v_k \sum_{j\in T_k} b_j$. Hence v ∈ D(v̄) and the claim follows. □
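The choice of disaggregated duals in the proof can be checked numerically. The cluster data below (b_j, t_j, p_i, v̄_k) is purely illustrative:

```python
import random

# Verify the choice of v_j from Proposition 2 on a hypothetical cluster T_k.
random.seed(1)
b = [random.uniform(1, 10) for _ in range(5)]    # demands b_j, j in T_k
t = [random.uniform(0.5, 3) for _ in range(5)]   # "lengths" t_j
p_i = 1.7                                        # "velocity" p_i of source i
v_bar_k = 4.2                                    # aggregated dual for cluster k

d = [p_i * tj for tj in t]                       # multipliers d_ij = p_i * t_j
d_bar = sum(bj * dj for bj, dj in zip(b, d)) / sum(b)   # aggregated d_bar_ik

# v_j = v_bar_k * t_j * (sum_j b_j) / (sum_j b_j t_j), as derived in the proof.
scale = sum(b) / sum(bj * tj for bj, tj in zip(b, t))
v = [v_bar_k * tj * scale for tj in t]

# v_bar_k * d_ij - v_j * d_bar_ik vanishes for every j in T_k ...
assert all(abs(v_bar_k * dj - vj * d_bar) < 1e-9 for dj, vj in zip(d, v))
# ... and sum_j v_j b_j = v_bar_k * sum_j b_j, so v lies in D(v_bar).
assert abs(sum(vj * bj for vj, bj in zip(v, b)) - v_bar_k * sum(b)) < 1e-9
```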

Proposition 2 together with (10) immediately yields the following expression for the a priori error bound:

Corollary. For d_ij = p_i t_j we have:

$$\bar z - z^* \le \max_{x\in W}\sum_{i\in S}\sum_{j\in T} \delta^a_{ij}\, x_{ij} \equiv \varepsilon^a(W),$$

where

$$\delta^a_{ij} \equiv \left(\frac{\bar c_{ik(j)}}{\bar d_{ik(j)}} - \frac{c_{ij}}{d_{ij}}\right) d_{ij}$$

and k(j) is the cluster to which j belongs, k(j) ∈ T_A.

Remark 1. Under the assumption d_ij = p_i t_j we have $\sum_{j\in T} t_j x_{ij} \le a_i/p_i$, i ∈ S, in the "source" restrictions of the original problem. This means that the "gain" ("loss") along the arc (i, j) is the same for all sources i connected with a destination j. We can treat this case as an intermediate one between the TP, where the gains are the same for all arcs, and a general form of the GTP, where the gains vary from arc to arc. Interpreting the GTP as the machine loading problem (see, e.g., Ahuja, Magnanti, and Orlin (1993), and Glover, Klingman, and Phillips (1990)), producing one unit of product j on machine i consumes d_ij hours of the machine's time. If p_i is interpreted as a production "velocity" of the machine i, and t_j is a "length" of the product j measured in suitable units, then the assumption d_ij = p_i t_j holds.

A priori localizations can be defined, for example, by manipulating the constraints of the original problem. Let

$$W_b = \{x_{ij} \mid 0 \le x_{ij} \le r_{ij},\ i\in S,\ j\in T\}, \qquad r_{ij} = \min\{b_j,\ a_i/d_{ij}\},$$

$$W_s = \Big\{x_{ij}\ \Big|\ \sum_{j\in T} d_{ij} x_{ij} \le a_i,\ i\in S,\quad x_{ij} \ge 0\ \ \forall i, j\Big\},$$

$$W_d = \Big\{x_{ij}\ \Big|\ \sum_{i\in S} x_{ij} = b_j,\ j\in T,\quad x_{ij} \ge 0\ \ \forall i, j\Big\},$$


where the lower indices b, s, and d show that the restrictions defining the respective localization are associated with bounds, sources and destinations. Obviously W_s, W_d are localizations, as well as W_bs = W_b ∩ W_s and W_bd = W_b ∩ W_d. For these four localizations the optimization problem to calculate ε^a(W) can be solved analytically. Due to the decomposable structure of the localizations, the computation of ε^a(W_s), ε^a(W_d) results in a number of independent single-constrained continuous knapsack problems, which yields

$$\varepsilon^a(W_s) = \sum_{i\in S} a_i \max_{j\in T} \big[\delta^a_{ij}\big]^+ / d_{ij}, \qquad \varepsilon^a(W_d) = \sum_{j\in T} b_j \max_{i\in S}\, \delta^a_{ij},$$

where [α]⁺ = max{0, α}. Similarly, the calculation of ε^a(W_bs), ε^a(W_bd) is reduced to the solution of independent single-constrained knapsack problems with upper bounds on the variables. The respective solutions can also be obtained analytically (see, e.g., Litvinchev and Rangel, 1999). By construction we have min{ε^a(W_b), ε^a(W_s)} ≥ ε^a(W_bs), and similarly for W_d. Then ε^a_Best is defined as ε^a_Best = min{ε^a(W_bs), ε^a(W_bd)}.

It is interesting to compare the obtained a priori error bounds with the results known for the TP (Zipkin, 1980a). Let d_ij = 1, i ∈ S, j ∈ T, and let all "≤" in the linear restrictions of (OP) be substituted for "=". Then δ^a_ij = c̄_ik(j) − c_ij, and to get the estimate stated in Proposition 2 we should put v_j = v̄_k, j ∈ T_k. The expressions for our four a priori error bounds remain valid for the classical transportation problem if we change [δ^a_ij]⁺ for δ^a_ij. Then ε^a(W_s) and ε^a(W_d) coincide with the a priori error bounds obtained in Zipkin (1980a) for customer aggregation in the TP. Our error bound ε^a_Best is at least as good as min{ε^a(W_s), ε^a(W_d)}.

2.2. Tightening the a priori error bounds

From the definition of ε^a(W) it follows that the tighter the localization W, the better (smaller) the error bound ε^a(W). Defining W by the original constraints, we should retain as many constraints as possible. Alternatively, the localization W should be simple enough to guarantee that the computational burden associated with calculating the error bound is not too expensive. A compromise is to utilize "easy" restrictions directly, while dualizing "complicated" restrictions by the Lagrangian. Suppose that the localization W_all is defined by all constraints of (OP), the destination constraints are treated as "easy", and the source constraints are considered "complicated". Then by Lagrangian duality we have:

$$\varepsilon^a(W_{all}) \equiv \max_{x\in W_{all}}\sum_{i\in S}\sum_{j\in T} \delta^a_{ij}\, x_{ij} = \min_{u\ge 0}\Big\{\sum_{i\in S} u_i a_i + \max_{x\in W_d}\sum_{i\in S}\sum_{j\in T} \big(\delta^a_{ij} - u_i d_{ij}\big)\, x_{ij}\Big\} \equiv \min_{u\ge 0} \varepsilon^a(u, W_d) \le \varepsilon^a(0, W_d) \equiv \varepsilon^a(W_d),$$

where ε^a(u, W_d) denotes the expression in the curly brackets.
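Under the assumption d_ij = p_i t_j of the Corollary, the closed-form bounds ε^a(W_s) and ε^a(W_d) can be sketched as follows; the instance and its dimensions are illustrative assumptions, not data from the paper:

```python
import random

# Closed-form a priori bounds eps_a(W_s) and eps_a(W_d) for a hypothetical
# instance with d_ij = p_i * t_j, using delta^a_ij from the Corollary.
random.seed(3)
m, n, K = 3, 8, 2
p = [random.uniform(0.5, 2) for _ in range(m)]      # velocities p_i
t = [random.uniform(0.5, 3) for _ in range(n)]      # lengths t_j
c = [[random.uniform(2, 20) for _ in range(n)] for _ in range(m)]
b = [random.uniform(1, 10) for _ in range(n)]
a = [random.uniform(20, 40) for _ in range(m)]
d = [[p[i] * t[j] for j in range(n)] for i in range(m)]

size = n // K
clusters = [list(range(k * size, n if k == K - 1 else (k + 1) * size))
            for k in range(K)]
k_of = {j: k for k, Tk in enumerate(clusters) for j in Tk}

# Aggregated data with demand-proportional weights g_ij = b_j / b_bar_k.
b_bar = [sum(b[j] for j in Tk) for Tk in clusters]
c_bar = [[sum(c[i][j] * b[j] for j in Tk) / b_bar[k]
          for k, Tk in enumerate(clusters)] for i in range(m)]
d_bar = [[sum(d[i][j] * b[j] for j in Tk) / b_bar[k]
          for k, Tk in enumerate(clusters)] for i in range(m)]

# delta^a_ij = (c_bar_ik(j)/d_bar_ik(j) - c_ij/d_ij) * d_ij
delta = [[(c_bar[i][k_of[j]] / d_bar[i][k_of[j]] - c[i][j] / d[i][j]) * d[i][j]
          for j in range(n)] for i in range(m)]

# One continuous knapsack per source (W_s) / per destination (W_d):
eps_ws = sum(a[i] * max(max(delta[i][j] / d[i][j], 0.0) for j in range(n))
             for i in range(m))
eps_wd = sum(b[j] * max(delta[i][j] for i in range(m)) for j in range(n))
assert eps_ws >= 0.0
```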




After the error bound ε^a(0, W_d) = ε^a(W_d) has been calculated, it is natural to look for another value of u ≥ 0 to diminish the current error bound. If s is the search direction, we get u = ρs, where the stepsize ρ ≥ 0 and the components of the search direction should be nonnegative. Let

$$\varepsilon^a(W) = \max_{x\in W}\sum_{i\in S}\sum_{j\in T} \delta^a_{ij}\, x_{ij},$$

and denote by x̄(W) its optimal solution. Consider that ε^a(W_d) has been calculated and let x̄(W_d) be unique. Then by the marginal value theorem, ε^a(u, W_d) is differentiable at u = 0, and $\nabla_{u_i}\varepsilon^a(0, W_d) = a_i - \sum_{j\in T} d_{ij}\bar x_{ij}$. Now take one step of the projected gradient technique for the problem $\min_{u\ge 0}\varepsilon^a(u, W_d)$ starting from u = 0. This gives

$$u_i(\rho) = \rho\,\max\Big\{0,\ \sum_{j\in T} d_{ij}\bar x_{ij} - a_i\Big\} \quad (i\in S,\ \rho \ge 0).$$

The stepsize ρ̂ associated with the steepest descent can be defined by a one-dimensional search: under the uniqueness of x̄, the choice of u_i(ρ̂) guarantees a strict decrease of the a priori error bound, that is, either ε^a(û, W_d) < ε^a(0, W_d) = ε^a(W_d) or ε^a(W_d) = ε^a(W_all). The other a priori error bounds can be strengthened similarly. We will refer to the a priori error bound ε^a(W), tightened by searching the duals in the direction of the projected anti-gradient, as ε^a_∇(W).

2.3. A posteriori error bounds

Suppose now that (AP) has been solved and the values ū and v̄ are known. The expression for ε(ũ, ṽ, W) in (10) still provides a valid error bound, and we only need to construct ũ, ṽ optimal to (ÃP). By Proposition 1 we may put ũ = ū and ṽ_j = v̄_k, j ∈ T_k. We can calculate the a posteriori error bounds ε^p(ũ, ṽ, W) analytically for the same localizations used in Section 2.1, since:

$$\bar z - z^* \le \max_{x\in W}\sum_{i\in S}\sum_{j\in T} \delta^p_{ij}\, x_{ij} \equiv \varepsilon^p(\tilde u, \tilde v, W),$$

where $\delta^p_{ij} \equiv \tilde v_j - c_{ij} - \tilde u_i d_{ij}$. Hence we only need to substitute δ^p_ij for δ^a_ij in the expressions for the a priori error bounds obtained earlier. Note that the a posteriori error bound is valid for the GTP in its general form, without any assumptions about the specifics of the arc multipliers. For the choice of ũ, ṽ described above, the error bounds ε^p(ũ, ṽ, W_s) and ε^p(ũ, ṽ, W_d) coincide with the a posteriori error bounds obtained in Evans (1979). The new error bound ε^p_Best = min{ε^p(ũ, ṽ, W_bs), ε^p(ũ, ṽ, W_bd)}, obtained for the same choice of ũ, ṽ, is at least as good as ε^p(ũ, ṽ, W_s), ε^p(ũ, ṽ, W_d), and the numerical study below demonstrates that in many cases it is significantly better.

After the a posteriori error bound has been calculated for some fixed ũ ≥ 0, ṽ, it is natural to look for another dual solution to decrease the error bound. Let the duals {ũ, ṽ} be changed in the direction {s, q} with the stepsize ρ, such that ũ + ρs ≥ 0. Then the best (i.e., error


bound minimizing) duals in this direction are defined from the problem:

$$\min_{\rho:\ \tilde u + \rho s \ge 0} \varepsilon^p(\tilde u + \rho s,\ \tilde v + \rho q,\ W).$$
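The one-dimensional search over the stepsize can be sketched with a golden-section search; bound() below is a hypothetical stand-in for ρ ↦ ε^p(ũ + ρs, ṽ + ρq, W), taken here to be a simple convex piecewise-linear function rather than an actual bound evaluation:

```python
# Golden-section search for the stepsize minimizing a unimodal bound function.
def bound(rho):
    # Hypothetical stand-in for eps^p(u + rho*s, v + rho*q, W): convex,
    # piecewise linear, with its minimum at the kink rho = 1.6.
    return max(5.0 - 2.0 * rho, 1.0 + 0.5 * rho)

def golden_section_min(f, lo, hi, tol=1e-9):
    """Minimize a unimodal f on [lo, hi] by golden-section search."""
    phi = (5 ** 0.5 - 1) / 2
    a, b = lo, hi
    x1, x2 = b - phi * (b - a), a + phi * (b - a)
    f1, f2 = f(x1), f(x2)
    while b - a > tol:
        if f1 <= f2:           # minimum lies in [a, x2]
            b, x2, f2 = x2, x1, f1
            x1 = b - phi * (b - a)
            f1 = f(x1)
        else:                  # minimum lies in [x1, b]
            a, x1, f1 = x1, x2, f2
            x2 = a + phi * (b - a)
            f2 = f(x2)
    return (a + b) / 2

rho_best = golden_section_min(bound, 0.0, 10.0)
assert abs(rho_best - 1.6) < 1e-6
```

In practice the non-differentiability points of the piecewise-linear bound can be enumerated instead of searched, as noted below.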

The dual-searching approaches to tightening the a posteriori error bound are well known for linear programming aggregation (see Leisten (1997) and the references therein). Mendelssohn (1980) proposed to look for the new dual solution in the direction of the dual aggregated solution, i.e., for our case s = ũ, q = ṽ. Shetty and Taylor (1987) used the right-hand side vector of the original problem as the search direction. In Litvinchev and Rangel (1999) one step of the projected gradient technique was used, similar to what was done in Section 2.2 for the a priori error bounds. In all cases the stepsize ρ can be obtained either by a one-dimensional search or by analyzing the non-differentiability points of the expression for ε^p(ũ + ρs, ṽ + ρq, W) as a function of the stepsize. A detailed description and examples of the usage of the error bounds of Mendelssohn, Shetty and Taylor, and Litvinchev and Rangel can be found in Litvinchev and Tsurkov (2003). We will denote these error bounds by ε^p_M(W), ε^p_ST(W), and ε^p_∇(W), respectively, while ε^p(W) is used to denote the bound ε^p(ũ, ṽ, W). Thus for each of the four localizations W_d, W_s, W_bs, and W_bd we can calculate two a priori error bounds ε^a(W), ε^a_∇(W) and four a posteriori error bounds ε^p(W), ε^p_M(W), ε^p_ST(W), and ε^p_∇(W).

3. Numerical study

We tested the error bounds derived in this paper for the GTP with d_ij = p_i t_j. The objective of the study is to compare the performance of the proposed error bounds regarding the difference from the actual error and the correlation with it. Our numerical study is based on the study design presented in Norman, Rogers, and Levy (1999). To analyze the results we used the Pearson correlation coefficient, descriptive statistics, and the hypothesis test for the mean of two independent samples (t-test) (Montgomery, 2001).
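The Pearson coefficient used in the analysis can be computed as follows (a minimal sketch; the sample values are made up for illustration):

```python
# Pearson correlation coefficient between two samples.
def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

# Hypothetical samples: error bounds that track the actual error closely
# should give a coefficient near 1.
actual_errors = [0.9, 1.4, 2.2, 3.1, 4.0]
error_bounds = [1.8, 2.9, 4.3, 6.4, 8.1]
r = pearson(error_bounds, actual_errors)
assert 0.99 < r <= 1.0
```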
The problem instances were generated using the open source code GNETGEN, which is a modification by Glover of the NETGEN generator (Klingman, Napier, and Stutz, 1974). Five problem sizes grouped in two sets were considered: SET1 of relatively small problems and SET2 of relatively large problems. The problems have the following dimensions: SET1 (30×50, 50×100, 100×150) and SET2 (50×1000, 50×2000), such that the number of destinations is greater (or significantly greater) than the number of sources. For each problem size, ten instances were randomly generated using random number seeds. The available supplies a_i were randomly generated with Σ a_i = 10^4 for all instances in SET1 and Σ a_i = 1.8 · 10^6 for all instances in SET2. The cost coefficients c_ij and the arc multipliers d_ij were randomly generated in the intervals [2.0, 20.0] and [0.5, 3.0], respectively. The destinations are clustered in 5 levels: 80, 60, 40, 20 and 10% of their total number. This clustering is referred to as level 1 – level 5 in the tables below (the higher the level, the smaller the number of destinations). Each cluster has ⌊|T|/|T_A|⌋ destinations sorted by their index, except the last cluster, which contains all the others (⌊α⌋ denotes the integer part of α). There are a total of (5 problem sizes) × (10 problem instances) × (5 cluster levels) = 250 aggregated instances (150 in SET1 and 100 in SET2). Further on we refer to the aggregated instances simply as instances. Four localizations were used to calculate the aggregation error bounds: W_d, W_s, W_bs, and W_bd. For each localization we calculated two a priori error bounds ε^a(W), ε^a_∇(W) and four a posteriori error bounds ε^p(W), ε^p_∇(W), ε^p_ST(W) and ε^p_M(W). We will pay special attention


to the localizations W_bd and W_d. We conjecture that since only demands were aggregated, the bounds calculated by allocating the reduced costs based on demands may be tighter and may have a higher correlation with the actual error. The dimension of the original problem allows the exact solution for all problem sizes. Correlations between the error bounds and the actual error, z̄ − z*, were calculated using the Pearson coefficient and are presented in Table 1 for all localizations. In all correlation tables to be presented, the p-values are less than 0.0001 if not indicated. In Table 1 the exceptions are: in SET1, ε^a(W_s) (p = 0.0003); and in SET2, ε^a_∇(W_s) (p = 0.0005), ε^a(W_bs) (p = 0.1357), ε^p(W_s) (p = 0.2876), and ε^p_ST(W_s) (p = 0.3112).

Table 1  Correlation between the actual error and the error bounds

             SET1                                     SET2
             Wd       Ws        Wbd      Wbs          Wd       Ws        Wbd      Wbs
ε^a(W)       0.6609   0.2929    0.7196   0.4441       0.5615   −0.4840   0.5617   0.1502
ε^a_∇(W)     0.7131   0.4043    0.7376   0.6471       0.5626   −0.3418   0.5624   0.5611
ε^p(W)       0.9826   0.5016    0.9864   0.5738       0.9985   −0.1074   0.9985   0.4239
ε^p_∇(W)     0.9897   0.9897    0.9910   0.7288       0.9998   0.9998    0.9998   0.6958
ε^p_ST(W)    0.9827   0.4979    0.9864   0.5753       0.9986   −0.1023   0.9986   0.4243
ε^p_M(W)     0.9989   0.9765    0.9997   0.9660       1.0000   0.9419    1.0000   0.9086

The a priori error bounds calculated for the localizations W_d and W_bd are correlated with the actual error at a statistically significant level (p < 0.0001) for both SET1 and SET2, and the highest correlation is achieved for the error bound ε^a_∇. The correlation coefficient is smaller for the instances in SET2 than for the ones in SET1. The a posteriori error bounds for SET1 are strongly and statistically significantly correlated with the actual error for all localizations used. For the instances in SET2 the bounds ε^p(W_s) and ε^p_ST(W_s) are not correlated with the actual error. For both SET1 and SET2 the highest correlation is achieved for the bounds ε^p_∇(W) and ε^p_M(W) calculated for the localizations W_d and W_bd. The error bounds calculated using localizations W_s and W_bs have lower correlation values, and for some error bounds the correlation value is not statistically significant.

The correlations between the a priori error bounds, ε^a_∇ and ε^a, and the a posteriori error bounds are presented in Table 2.

Table 2  Correlation between the a priori error bounds, ε^a_∇(W) and ε^a(W), and the a posteriori error bounds (in each cell, the first value is for ε^a_∇, the second for ε^a)

SET1:
             Wd              Ws              Wbd             Wbs
ε^p(W)       0.7831/0.7259   0.5937/0.6513   0.7905/0.7677   0.8207/0.7695
ε^p_∇(W)     0.7382/0.6788   0.5905/0.6319   0.7582/0.7364   0.8263/0.5913
ε^p_ST(W)    0.7830/0.7258   0.6080/0.6383   0.7900/0.7671   0.8218/0.7680
ε^p_M(W)     0.7176/0.6693   0.4687/0.3719   0.7402/0.7230   0.7259/0.5418

SET2:
             Wd              Ws                Wbd             Wbs
ε^p(W)       0.5953/0.5944   0.5289/0.4141     0.5951/0.5945   0.3776/0.6970
ε^p_∇(W)     0.5743/0.5733   −0.1365/−0.1815   0.5741/0.5734   0.1677/0.0748
ε^p_ST(W)    0.5933/0.5924   0.5267/0.4071     0.5930/0.5924   0.3781/0.6972
ε^p_M(W)     0.5617/0.5607   −0.2326/−0.3261   0.5614/0.5606   0.4849/0.2847

For the localizations W_d and W_bd both a priori error bounds are correlated with the a posteriori error bounds, the


correlations for the instances in SET2 are weaker than for the instances in SET1. For these localizations ε∇a is more correlated with a the posteriori error bounds than εa . For the instances p p in SET2, the correlation with ε∇ and ε M when using localizations Ws and Wbs are very low and in some cases not significant (e.g., p-values for the correlation between ε∇a (Ws ) and p p ε∇ (Ws ) and for εa (Wbs ) and ε∇ (Wbs ) are 0.1757 and 0.4596 respectively). In Table 3 we present, sorted by aggregation level, the correlations between the actual error and the error bounds calculated for localization Wbd , the tightest localization based on demands. The a priori error bounds are highly and statistically significantly correlated with the actual error for all levels of aggregation. The lowest correlation was obtained for level 5. The correlations for the a priori error bound ε∇a are higher (except for level 4 in SET1) than for εa . Comparing with a priori error bounds, the a posteriori error bounds are more p correlated with the actual error for all levels of aggregation. The error bound ε M has the strongest correlation which is almost insensitive to the aggregation level. Table 3 Correlation between the actual error and the error bounds by level of aggregation (localization Wbd ) SET1

SET1      level 1   level 2   level 3   level 4   level 5
ε^a       0.6871    0.7660    0.6625    0.6297    0.5636
ε_∇^a     0.7100    0.9647    0.8000    0.6281    0.5803
ε^p       0.9858    0.9954    0.9837    0.9912    0.9557
ε_∇^p     0.9863    0.9965    0.9867    0.9948    0.9663
ε_ST^p    0.9858    0.9954    0.9835    0.9908    0.9556
ε_M^p     0.9997    0.9997    0.9995    0.9993    0.9992

SET2      level 1   level 2   level 3   level 4   level 5
ε^a       0.9849    0.9883    0.9080    0.9281    0.6920
ε_∇^a     0.9849    0.9887    0.9101    0.9288    0.7261
ε^p       0.9984    0.9992    0.9956    0.9786    0.9710
ε_∇^p     0.9998    0.9997    0.9992    0.9974    0.9948
ε_ST^p    0.9984    0.9992    0.9957    0.9783    0.9706
ε_M^p     0.9997    0.9995    0.9996    0.9993    0.9997
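The Pearson correlations reported above can be reproduced with a short routine. The sketch below is illustrative only: the data are made-up stand-ins for actual errors and bound values, not the paper's instances. Significance was judged in the paper via p-values, for which the t-statistic t = r·sqrt((n−2)/(1−r²)) against a t-distribution with n−2 degrees of freedom is the usual starting point.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)

def t_statistic(r, n):
    """t-statistic for testing H0: rho = 0 with n paired observations."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))

# Hypothetical actual errors and error-bound values for five instances
# (placeholders, not data from the paper's test sets)
actual = [0.12, 0.30, 0.25, 0.41, 0.18]
bound = [0.20, 0.52, 0.47, 0.80, 0.33]
r = pearson_r(actual, bound)
t = t_statistic(r, len(actual))
```

Converting t to a two-sided p-value then requires the t-distribution CDF (e.g., `scipy.stats` in practice); the pure-Python sketch stops at the test statistic.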

Similar to Leisten (1997) and Litvinchev and Rangel (1999), the numerical quality of the error bounds was characterized by the upper and lower bounds on z*: z̄ ≥ z* ≥ LB_ε, where LB_ε = z̄ − ε(W) and ε(W) is an error bound. We used the indicators (z̄ − z*)/z* and (z* − LB_ε)/z* to represent the proximity ratio. In particular, the smaller the proximity ratio (z* − LB_ε)/z*, the tighter the error bound. The average values of these indicators, together with standard deviations, are presented in Tables 4 and 7, sorted by localizations and levels of clustering. In Table 5 we present the relative number of instances having W as the best (bound-minimizing) localization for a fixed type of error bound (in % of the total number of instances). The localization Ws is not represented in Table 5 since it was never the best. In Table 6 we give the relative number of instances having a particular combination of bound and localization as the best among all the error bounds calculated.

Note that by construction, for any fixed localization W we have ε^a(W) ≥ ε_∇^a(W) and ε^p(W) ≥ max{ε_M^p(W), ε_ST^p(W), ε_∇^p(W)}, since the improved error bounds ε_∇^a(W), ε_M^p(W), ε_ST^p(W) and ε_∇^p(W) are at least as good as the original error bounds ε^a(W) and ε^p(W), respectively. To compare different localizations, observe that in Mendelsohn's (1980) approach we look for the new duals in the direction of the dual aggregated solution, while the Shetty and Taylor (1987) scheme uses the right-hand side vector of the original problem as the search direction. In both cases the search direction is independent of the localization used. Hence, ε_M^p(Wbd) ≤ ε_M^p(Wd) (and ε_ST^p(Wbd) ≤ ε_ST^p(Wd)), since Wbd ⊆ Wd and the objective functions used to calculate ε_M^p(Wbd) and ε_M^p(Wd) are the same. This observation is also valid for the localizations Wbs and Ws. But this is not the case for the gradient-based improvement scheme used to obtain the bound ε_∇^p(W). Here the dual search direction is defined by the optimal solution x(W) of the problem used to calculate the bound ε^p(W) and hence depends on the localization used. Thus the objective functions used to calculate ε_∇^p(Wbd) and ε_∇^p(Wd) are not

Table 4  Proximity ratio (each cell: average, with standard deviation in parentheses)

(z̄ − z*)/z*:   SET1: 1.6137 (0.6463)   SET2: 1.0293 (0.4953)

(z* − LB_ε)/z*:

SET1      Wd               Ws                 Wbd              Wbs
ε^a       1.4870 (0.8922)  7.1782 (5.9698)    1.2440 (0.7905)  5.4950 (4.5655)
ε_∇^a     1.2335 (0.8105)  4.4308 (3.4014)    1.1452 (0.7532)  2.9979 (1.5581)
ε^p       0.2707 (0.1610)  9.2814 (7.7396)    0.2076 (0.1349)  5.3673 (4.4300)
ε_∇^p     0.1684 (0.1095)  3.5182 (4.4342)    0.1228 (0.0991)  2.1924 (1.6776)
ε_ST^p    0.2706 (0.1609)  8.9535 (7.4195)    0.2068 (0.1348)  5.3447 (4.4138)
ε_M^p     0.1043 (0.0323)  0.6292 (0.1434)    0.0595 (0.0169)  0.5476 (0.1756)

SET2      Wd               Ws                 Wbd              Wbs
ε^a       1.1410 (0.9503)  10.9971 (13.1592)  1.1377 (0.9481)  3.4889 (1.7821)
ε_∇^a     1.1226 (0.9316)   4.5688 (1.7838)   1.1219 (0.9318)  2.3655 (0.6100)
ε^p       0.0495 (0.0352)   6.0126 (6.8493)   0.0493 (0.0351)  1.9566 (1.8763)
ε_∇^p     0.0140 (0.0115)   1.1395 (2.7697)   0.0138 (0.0114)  0.9779 (0.7742)
ε_ST^p    0.0481 (0.0346)   5.7335 (6.5401)   0.0478 (0.0344)  1.9549 (1.8762)
ε_M^p     0.0125 (0.0047)   0.6359 (0.1673)   0.0109 (0.0039)  0.4226 (0.2197)

Table 5  Relative number of instances (in %) having W as the best localization

SET1      Wbd      Wd      Wbs
ε^a       98       0       2
ε_∇^a     89.33    5.33    5.33
ε^p       100      0       0
ε_∇^p     98.67    1.33    0
ε_ST^p    100      0       0
ε_M^p     100      0       0

SET2      Wbd      Wd      Wbs
ε^a       100      0       0
ε_∇^a     77       23      0
ε^p       100      0       0
ε_∇^p     94       6       0
ε_ST^p    100      0       0
ε_M^p     100      0       0

the same, and we cannot guarantee that ε_∇^p(Wbd) ≤ ε_∇^p(Wd). The same holds for the localizations Wbs and Ws, as well as for the a priori error bound ε_∇^a(W).

It can be seen from Table 4 that the localizations Wd and Wbd give the lowest average proximity ratios for the a priori error bounds, and there is no statistically significant difference between ε^a and ε_∇^a calculated using these localizations, according to the t-test run to check the differences. However, for localizations Ws and Wbs, ε_∇^a gives a smaller proximity ratio than ε^a. As was expected, the a posteriori error bounds are much tighter than the a priori error bounds, giving the best results for localizations Wd and Wbd. Comparing the different types of error bounds, ε_M^p and ε_∇^p are tighter than ε^p and ε_ST^p for any fixed localization, with the best proximity ratio obtained for ε_M^p(Wbd). The second best is ε_∇^p(Wbd).

From Table 5 we can see that there are a few instances in SET1 for which the localization Wbs results in tighter a priori error bounds than Wbd and Wd. For the a priori error bounds calculated in SET2 and for all types of a posteriori error bounds, the tightest results were obtained for localizations Wbd and Wd. For the a posteriori error bounds ε^p(W), ε_ST^p(W), and ε_M^p(W) the localization Wbd was the best for all instances. This is not the case for the gradient-based error bounds ε_∇^a(W) and ε_∇^p(W), where for some instances the localization Wd provides better results, especially in SET2. Comparing the best error bounds in Table 6 we see that for the


Table 6  Relative number of instances (in %) having a given combination (error bound, localization) as the best

             SET1     SET2
ε_∇^a(Wbd)   89.33    77.00
ε_∇^a(Wd)     5.33    23.00
ε_∇^a(Wbs)    5.33     0
ε_M^p(Wbd)   90.67    48.00
ε_∇^p(Wbd)    9.33    49.00
ε_∇^p(Wd)     0        3.00

Table 7  Proximity ratio by level of aggregation (localization Wbd); each cell: average, with standard deviation in parentheses

SET1           level 1          level 2          level 3          level 4          level 5
(z̄ − z*)/z*    1.0332 (0.5283)  1.7219 (0.5319)  1.3672 (0.5210)  1.6523 (0.4520)  2.2939 (0.4527)
(z* − LB_ε)/z*:
ε^a            0.5469 (0.5043)  0.4652 (0.4212)  1.4389 (0.4980)  1.9824 (0.4607)  1.7865 (0.5297)
ε_∇^a          0.5092 (0.4757)  0.3213 (0.1522)  1.2852 (0.3354)  1.9223 (0.4587)  1.6881 (0.5213)
ε^p            0.0975 (0.1061)  0.1350 (0.0678)  0.2040 (0.1067)  0.2873 (0.0603)  0.3141 (0.1621)
ε_∇^p          0.0869 (0.1030)  0.0927 (0.0579)  0.1255 (0.0983)  0.1468 (0.0464)  0.1619 (0.1422)
ε_ST^p         0.0975 (0.1061)  0.1348 (0.0679)  0.2028 (0.1073)  0.2851 (0.0614)  0.3138 (0.1623)
ε_M^p          0.0469 (0.0134)  0.0583 (0.0123)  0.0610 (0.0163)  0.0653 (0.0170)  0.0663 (0.0182)

SET2           level 1          level 2          level 3          level 4          level 5
(z̄ − z*)/z*    0.7235 (0.1196)  1.4655 (0.1189)  0.3400 (0.0869)  0.9615 (0.0966)  1.6557 (0.1257)
(z* − LB_ε)/z*:
ε^a            0.0695 (0.0263)  0.0588 (0.0207)  1.2281 (0.0668)  2.2021 (0.0667)  2.1301 (0.0940)
ε_∇^a          0.0695 (0.0263)  0.0586 (0.0202)  1.2237 (0.0666)  2.1723 (0.0683)  2.0854 (0.0895)
ε^p            0.0244 (0.0074)  0.0339 (0.0053)  0.0213 (0.0082)  0.0669 (0.0228)  0.1000 (0.0312)
ε_∇^p          0.0061 (0.0026)  0.0071 (0.0030)  0.0074 (0.0035)  0.0199 (0.0077)  0.0286 (0.0130)
ε_ST^p         0.0244 (0.0074)  0.0339 (0.0053)  0.0199 (0.0082)  0.0633 (0.0236)  0.0977 (0.0314)
ε_M^p          0.0110 (0.0029)  0.0132 (0.0037)  0.0072 (0.0025)  0.0105 (0.0037)  0.0127 (0.0040)

relatively small problems in SET1 the error bound ε_M^p(Wbd) outperforms the bound ε_∇^p(Wbd) for most, but not all, instances. In SET2, with relatively large problems, the gradient-based bound ε_∇^p provided the tightest results slightly more frequently than Mendelsohn's bound ε_M^p.

The proximity ratios as a function of clustering are shown in Table 7 for localization Wbd. Concerning the a priori error bounds, ε_∇^a is slightly tighter than ε^a. The error bounds do not increase monotonically with respect to the clustering level. The a priori error bounds for levels 1 and 2 are significantly tighter than for the highly aggregated instances in levels 4 and 5. This holds for both SET1 and SET2. Comparing the a posteriori error bounds, ε_M^p and ε_∇^p have the lowest proximity ratios, with ε_∇^p outperforming ε_M^p only for levels 1 and 2 in SET2. The bound ε_M^p is very stable, while ε_∇^p increases with the clustering level. To summarize:

• The a posteriori error bounds ε_∇^p and ε_M^p are highly and statistically significantly correlated with the actual error and are the tightest error bounds for all instances and localizations.

• ε_M^p is less sensitive than ε_∇^p to the level of clustering and problem dimension, and it is tighter than ε_∇^p for most instances (90%) in SET1, while only for 48% of the instances in SET2 of larger problems.


• Localization Wbd gives the tightest results for all but a few a posteriori error bounds. The exceptions occur for the error bound ε_∇^p, where Wd performs better for 3% of the instances in SET2.

• For localizations Wbd and Wd the a priori error bounds ε^a and ε_∇^a are statistically significantly correlated with the actual error and with all the a posteriori error bounds.

• For all instances ε_∇^a provides the best results for localizations Wbd and Wd, except for 5.33% of the instances in SET1, for which ε_∇^a(Wbs) performs better.

4. Conclusions

In this paper we derived new a priori and a posteriori aggregation error bounds for the GTP and studied the relationships between the error bounds and the actual error. Only customers (destinations) were aggregated, and the clustering level was the only element of the aggregation strategy that was varied in the numerical study. Our numerical study demonstrates that the localization Wbd almost always performs better, resulting in higher correlations between the error bounds and the actual error and giving tighter error bounds. This was expected, since in our case the error is due to incomplete (aggregated) information in the aggregated problem about particular demands. Thus, using all demand constraints explicitly, as in Wbd and Wd, may provide a tighter estimation. Moreover, Wbd ⊆ Wd, and strengthening the localization improves the bound. It can be seen from the numerical results that strengthening the localization also increases the correlation between the error bound and the actual error.

The new a priori bound ε_∇^a(Wbd) is significantly correlated with the actual error (α = 0.05). That is, if the customers in the GTP are aggregated using the strategy outlined in this study, and the prescribed localization is chosen, the a priori error bound can be considered an appropriate guide for selecting the clustering level. Constructing various localizations for structured problems, including nonlinear, say, ellipsoidal ones (Litvinchev and Rangel, 1999), is an interesting topic for future research.

Comparing different techniques to calculate the a posteriori error bounds, surprisingly good results were provided by Mendelsohn's (1980) approach. We would like to stress that this is not the case for unstructured problems. In particular, for continuous knapsack problems, the approach of Shetty and Taylor (1987) gives a tighter error bound than Mendelsohn's (1980), as reported in Leisten (1997) and Litvinchev and Rangel (1999). The good performance of Mendelsohn's (1980) bound for the GTP needs to be further investigated and explained.

In summary, for our set of randomly generated problems, the aggregated model having the tightest a priori error bound ε_∇^a(Wbd) tends to give the best approximation of the optimal objective of the non-aggregated model. After the aggregated model (the level of clustering) has been selected by the a priori error bound and solved, Mendelsohn's (1980) a posteriori error bound ε_M^p, as well as the gradient-based error bound ε_∇^p, provides a good quantitative measure of the actual error.
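The two-stage strategy summarized above (select the clustering level by the a priori bound ε_∇^a(Wbd), then certify the solved aggregated model with an a posteriori bound) can be sketched as follows. All numeric values are hypothetical placeholders, since computing the actual bounds requires the GTP formulations given earlier in the paper.

```python
# Hypothetical a priori error bounds by clustering level (localization Wbd),
# standing in for values a real implementation would compute from the GTP data.
a_priori_bound = {1: 0.55, 2: 0.47, 3: 1.29, 4: 1.92, 5: 1.69}

# Stage 1: select the clustering level with the tightest a priori bound.
best_level = min(a_priori_bound, key=a_priori_bound.get)

# Stage 2: after solving the aggregated model at that level, bracket the
# unknown optimum z* using z_bar >= z* >= LB_eps, where LB_eps = z_bar - eps(W).
z_bar = 104.7          # optimal value of the aggregated problem (hypothetical)
eps_posteriori = 2.1   # a posteriori bound, e.g. eps_M^p or eps_grad^p (hypothetical)
lower_bound = z_bar - eps_posteriori

# Proximity ratio as used in Tables 4 and 7, relative to a known z* (hypothetical):
z_star = 103.5
proximity = (z_star - lower_bound) / z_star
```

The same bracketing works without knowing z*: the interval [z_bar − eps_posteriori, z_bar] always contains the original optimum, which is what makes the a posteriori bound usable in practice.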

Acknowledgments  The first author was partially supported by the Russian foundation RFBR (02-01-00638) and the Mexican foundation PAICyT (CA857-04), while the second author was partially funded by the Brazilian foundations FAPESP (2000/09971-9) and FUNDUNESP (00509/03-DFP). The authors thank the anonymous referees for a careful reading of the article and helpful suggestions. Thanks are also due to Adriana B. Santos for useful comments related to the Numerical Study section.


References

Ahuja, R.K., T.L. Magnanti, and J.B. Orlin. (1993). Network Flows: Theory, Algorithms and Applications. Prentice Hall, NY.
Evans, J.R. (1979). "Aggregation in the Generalized Transportation Problem." Computers & Operations Research, 6, 199–204.
Francis, R.L., T.J. Lowe, and A. Tamir. (2002). "Demand Point Aggregation for Location Models." In Z. Drezner and H.W. Hamacher (eds.), Facility Location: Applications and Theory. Springer, pp. 207–232.
Glover, F., D. Klingman, and N. Phillips. (1990). "Netform Modeling and Applications." Interfaces, 20(4), 7–27.
Hallefjord, A., K.O. Jornsten, and P. Varbrand. (1993). "Solving Large Scale Generalized Assignment Problems—An Aggregation/Disaggregation Approach." European Journal of Operational Research, 64, 103–114.
Klingman, D., A. Napier, and J. Stutz. (1974). "NETGEN: A Program for Generating Large Scale Capacitated Assignment, Transportation, and Minimum Cost Flow Problems." Management Science, 20, 814–821.
Leisten, R.A. (1997). "A Posteriori Error Bounds in Linear Programming Aggregation." Computers & Operations Research, 24, 1–16.
Litvinchev, I. and S. Rangel. (2002). "Comparing Aggregated Generalized Transportation Models Using Error Bounds." Technical Report TR2002-02-OC, UNESP, S.J. Rio Preto, Brazil. (Available from http://www.dcce.ibilce.unesp.br/~socorro/aggregation/gtp 2002/.)
Litvinchev, I. and S. Rangel. (1999). "Localization of the Optimal Solution and a Posteriori Bounds for Aggregation." Computers & Operations Research, 26, 967–988.
Litvinchev, I. and V. Tsurkov. (2003). Aggregation in Large Scale Optimization. Applied Optimization 83, Kluwer, NY.
Mendelsohn, R. (1980). "Improved Bounds for Aggregated Linear Programs." Operations Research, 28, 1450–1453.
Montgomery, D.C. (2001). Design and Analysis of Experiments, 5th edition. John Wiley & Sons.
Norman, S.K., D.F. Rogers, and M.S. Levy. (1999). "Error Bound Comparisons for Aggregation/Disaggregation Techniques Applied to the Transportation Problem." Computers & Operations Research, 26, 1003–1014.
Rogers, D.F., R.D. Plante, R.T. Wong, and J.R. Evans. (1991). "Aggregation and Disaggregation Techniques and Methodology in Optimization." Operations Research, 39, 553–582.
Shetty, C.M. and R.W. Taylor. (1987). "Solving Large-Scale Linear Programs by Aggregation." Computers & Operations Research, 14, 385–393.
Zipkin, P.H. (1980a). "Bounds for Aggregating Nodes in Network Problems." Mathematical Programming, 19, 155–177.
Zipkin, P.H. (1980b). "Bounds on the Effect of Aggregating Variables in Linear Programs." Operations Research, 28, 403–418.
