BULLETIN OF THE POLISH ACADEMY OF SCIENCES TECHNICAL SCIENCES, Vol. 64, No. 3, 2016 DOI: 10.1515/bpasts-2016-0056
Algorithms solving the Internet shopping optimization problem with price discounts J. MUSIAL1*, J.E. PECERO2, M.C. LOPEZ-LOCES3, H.J. FRAIRE-HUACUJA3, P. BOUVRY2, and J. BLAZEWICZ1, 4 1
2
Institute of Computing Science, Poznan University of Technology, 2 Piotrowo St., 60-965 Poznan, Poland Comp. Sci. and Commun. Res. Unit, University of Luxembourg, 6 rue Coudenhove Kalergi, L-1359 Luxembourg, Luxembourg 3 Instituto Tecnológico de Ciudad Madero, Tecnológico Nacional de México, 1º de Mayo S/N, 89440, Ciudad Madero, Mexico 4 Institute of Bioorganic Chemistry, Polish Academy of Sciences, 12/14 Noskowskiego St., 61-704 Poznan, Poland
Abstract. The Internet shopping optimization problem arises when a customer aims to purchase a list of goods from a set of web-stores with a minimum total cost. This problem is NP-hard in the strong sense. We are interested in solving the Internet shopping optimization problem with additional delivery costs associated to the web-stores where the goods are bought. It is of interest to extend the model including price discounts of goods. The aim of this paper is to present a set of optimization algorithms to solve the problem. Our purpose is to find a compromise solution between computational time and results close to the optimum value. The performance of the set of algorithms is evaluated through simulations using real world data collected from 32 web-stores. The quality of the results provided by the set of algorithms is compared to the optimal solutions for small-size instances of the problem. The optimization algorithms are also evaluated regarding scalability when the size of the instances increases. The set of results revealed that the algorithms are able to compute good quality solutions close to the optimum in a reasonable time with very good scalability demonstrating their practicability. Key words: e-commerce, Internet shopping, applications of operations research, approximations, algorithms, heuristics, combinatorial optimization.
1. Introduction Internet shopping, fitting into a business-to-consumer subcategory, becomes more and more popular thanks to the facilities provided by the new information and communication technologies such as cloud computing, mobile computing (mobile devices – smartphones, tablets), the availability of data centers, and the improvement in the payment gateways on trusted computing platforms. Products available in web-stores are often cheaper than those offered by regular local retailers, and a wide choice of offers is available just a click away from the customer. Based on outstanding logistics, the delivery can usually be operated within 48 hours or less. Customers often need to take into account that shipping cost are charged, so that it is a good idea to group purchased products into sets and buy them from small number of retailers to minimize these delivery costs. Automating such decisions requires three elements: information about the product availability, price lists and finally a specialized analytical tool that could find the minimal subset of shops (in this work we use shops and web-stores interchangeably) where all products from the customer’s shopping list could be bought at the lowest price. To automate these decisions one can prepare an algorithm / computer tool to calculate the best possible solution from the customers perspective. The issue could be perceived as an Internet optimization problem connected with online shop*e-mail:
[email protected]
ping. Partial solution to the most trivial version of the problem (just one product, no discounts, no additional costs, and so on) comes from software agents [1], so called price comparators. However their current functionality is limited to building a price rank for a single product among registered offers that fit to the customer’s query (search phrase). It is worth noting that beside all benefits of Internet shopping, it is still a type of shopping that may lead to problematic behaviors. One can notice that some customers may be addicted to shopping on the Internet. This kind of addiction is somehow similar to gaming and general Internet dependency. Many types of addictions and negative forms of consumptions connected with shopping in general have been widely researched and described in both medial and business journals. BusinessDictionary.com states that shopping is “the process of browsing and/or purchasing items in exchange for money’’. What will be important here that the platform is not so important. Therefore, Internet shopping experience may cause many of these negative behaviors. Follow the publication [2] for more information about different types of online shopping behaviors and addictions. State-of-the-art research on the problem we will tackle in this paper consists of works that introduced Internet shopping optimization problem (ISOP in short) and more complicated versions of the mentioned problem, taking into account also price discounts. [3] introduced ISOP with the following idea. The addressed problem is to manage a multiple item shopping list over several shopping locations. The objective is to have all
505
Bull. Pol. Ac.: Tech. 64(3) 2016 - 10.1515/bpasts-2016-0056 Downloaded from De Gruyter Online at 09/28/2016 10:36:44PM via free access
J. Musial et al.
shopping done at minimum total expense. It is worth pointing out that dividing the original shopping list into several sub-lists with items being delivered by different providers increases delivery costs. These are counted and paid individually for each package (sub-list) assigned to a specific Internet shop in the optimization process. Authors focused on the problem definition and its complexity analysis (with some sub-cases of the main problem). The first algorithms and computational results were introduced in [4]. Much more complicated version of ISOP that takes into account price discounts was introduced in [5, 6]. The Focus was on the problem definition and complexity analysis. Some basic algorithms were introduced. The current paper is focused on the introduced problem with new achievements and contributions to the shopping optimization topic. ISOP could be perceived as a base general name for Internet shopping problems since it leaves a lot of space for future additional requirements and attributes. Introducing complicated multi-objective decision-aided ISOP or a multi-objective problem could be one the upcoming interesting research steps. Many additional decisions could be made by customers with respect to their personal shops preferences, trust, delivery time, negative factors, and all other significant elements. Since the number of decisions could increase, it would be reasonable to support the client with a specific system or tool. This program would attempt to help them to make reasonable decisions that satisfy the customer and do not imply significant price changes. An interesting idea of providing a simple tool that helps decision makers make good decisions in the financial market is described in the literature [7]. In this paper we address the Internet shopping optimization problem considering delivery costs. It is of interest from the customer side to investigate an extended version of the problem which also considers price discounts [5, 6] offered by different sellers. The discount policy is based on real world observations and similar to that used in [8], in the sense that is expresses the discount as a function of the total quantity of money spent in the web-store; the more money a customer spends, the bigger discount he may obtain. In this work, we assume that there are no differences between the quality of the goods the web-stores sell beside the prices they charge for the different products. We introduce a new set of heuristic approaches to solve the problem. The set of heuristics is composed of a new lightweight metaheuristic based on a cellular optimization process, a extended greedy algorithm and two state-of-the-art greedy algorithms. We have designed the heuristics as a compromise solution balancing computational time and results close to the optimum solution. We empirically evaluate the heuristics on large set of real world data. The set of data was collected from 32 stores and covered the largest United States-based stores, including Amazon, BarnesandNoble.com, Borders.com, Buy. com, Booksamillion and top sellers among Internet bookstores in Poland such as empik.com and merlin.pl. We compare the performance of the proposed heuristics to the optimal values. The optimal solutions for small problem instances are computed using a branch and bound algorithm. To evaluate the scalability of the heuristics, we increased the problem size.
506
To enlighten the pure original content of the paper one should notice such important elements as: new original cellular algorithm created especially to solve a problem, redesigned forecasting algorithm, problem-specific modification of a stateof-the-art algorithm, new world data model as a data generator for an experiment, original designed experiment with a vast number of computational test, huge portion of computational results carefully analyzed and described by the authors. The paper is organized as follows. Section 2 describes the problem model. In Section 3, some of the relevant work is presented. In Section 4, we consider the extended version of the investigated problem. Section 5 provides a detailed description of a new set of proposed heuristics. It is followed by experimental results presentation in Section 6. We conclude the paper in Section 7.
2. Problem definition Notation used throughout this paper is given in Table 1. Table 1 Notation Symbol
Explanation
M
set of products
N
set of shops
m
number of products
n
number of shops
i
product indicator
j
shop indicator
Mj
multiset of products available in shop j
dj
delivery price of all products from shop j
yj
usage indicator for shop j
pij
cost of product i in shop j
xij
usage indicator for product i in shop j
T
cumulative value of all products bought in all shops
Tj
cumulative value of all products bought in shop j
f j (Tj)
piecewise function for all products bought in shop j
X = (X1,…, Xn)
sequence of selections of products in shops 1,…, n
F(X)
sum of product and delivery costs
d(x) X¤
0-1 indicator function for x = 0 and x > 0
F¤
optimal sequence of selections of products optimal (minimum) total cost
Internet shopping optimization problem (ISOP) should be described as follows. A single buyer is looking for a multiset of products M = f1, …, mg to buy in n shops. A multiset of available products M j, a cost pij of product i in store j, and a delivery cost dj of any subset of the products from shop j. It is assumed that pij = 1 if i 2 / M j. The problem is to find a sequence of disjoint selections (or carts) of products X = ( X1, …, X n), which we call a cart sequence, such that X j µ M j, j = 1, …, n, Bull. Pol. Ac.: Tech. 64(3) 2016 - 10.1515/bpasts-2016-0056 Downloaded from De Gruyter Online at 09/28/2016 10:36:44PM via free access
Algorithms solving the Internet shopping optimization problem with price discounts
n [ j=1 Xj = M, and the total product and delivery cost, denoted of n suppliers, indexed by i. For each good k in G, dk as the as F(X):= Σjn=1 (δ(|Xj |)dj + Σi2Xj pij), is minimized. Here |Xj | amount of good k to be procured is defined. To each supplier denotes the cardinality of the multiset Xj, and δ(x) = 0 if x = 0 i in S we associate a sequence of intervals Zi = f0, 1, …, maxig, and δ(x) = 1 if x > 0. Its solution is denoted as X ¤, and its indexed by j. Furthermore, for each supplier i 2 S and interval Solving the Internet Shopping Optimization Problem with Discount solution value Algorithms as F ¤. j 2 Zi, lij and uij define thePrice minimum and maximum number of It has been provedAlgorithms that ISOP Solving belongsthe to the NP-hard prob- Optimization goods respectively that needs to be ordered from supplier i to Internet Shopping Problem with Price Discount interval j ∈ Zi and 3. Related Work each good k ∈supplier G, let ci 2 S, the price for one lems family [3]. be in interval j. Finally, for each each interval i jk befor item ofi good purchased from supplier inthe itsfor j-th interval. j 2 Z and good be cthe price onefor item interval j ∈ each Zki and 3. Related Work eachk 2 G, good klet ∈ cG, price oneof ijk let i jk ibe Motivated by the problem of buying multiple products from Similarities goodofkgood purchased from supplier i in itsdiscounts interval. item k purchased from supplier i j-th in its j-thand interval. between ISOP with price TQD probby the buying multiple from Similarities differentMotivated web-stores, [3] problem modeledofInternet shoppingproducts as an opbetween ISOP withprice price discounts andTQD prob- G between with discounts and 3. Related work lemSimilarities can be noticed ifISOP we threat products to buy MTQD asprobgoods different web-stores, [3] modeled Internet shopping as an timization problem, where a customer wants to buy a list ofop- (dlem lem can be noticed if we threat products to buy M as goods can be noticed if we threat products to buy M as goods GG k is amount of the same good k) and shops N as suppliers where aofcustomer wants tofinal buy price. a listfrom of (d Motivated by problem multiple products (d is amount of the k) and shops N as suppliers S. amount of the same good shops N as suppliers productstimization from a setproblem, ofthe online stores atbuying the minimum kis k S. Piecewise discounting function f j for shop j will be associproducts from a set of [3] stores Internet the minimum different web-stores, modeled shopping as price. an op- S.Piecewise shop shopj will j willbebeassociated associPiecewisediscounting discountingfunction functionfj ffor j for The authors showed that theonline problem is atNP-hard in thefinal strong with a sequence of intervals Zi for supplier i. Price pi j of The authorsproblem, showed that the problem is NP-hard the a list strongof ated timization in which a customer wants toinbuy withwith a sequence of intervals Zi forZisupplier i. Price pij of product for supplier i. Price pi j of ated a sequence of intervals sense and designed a set of polynomial time algorithms for for one item ishop from shop jcan can beshown shown as price cfor sense andfrom designed a online set of stores polynomial time algorithms for product i k item products a set of at the minimum final price. i from j can be shown as price c k for one of good k one item of of product i from shop j be as price c i i special cases ofcases the of problem. During previous research difgood k purchased from supplier i. Piecewise function for shop special the problem. During previous research difThe authors showed that the problem is NP-hard in the strong good k purchased fromfrom supplier i. Piecewise function for shop j apk purchased supplier i. Piecewise function for shop ferent versions (specializations) of theofInternet shopping probferent versions (specializations) the Internet shopping probj applied to a product price f (p ) should be treaten as c jshould sense and designed a set of polynomial time algorithms for spe- jplied to a product price f j (pfijj)(p be treated as cijk applied to a product price be treaten as –ci price i j )i jshould jk - i jk lem were examined [5]. For example, due to NP-hardness of lem were examined [5]. For example, due to NP-hardness of of good purchased from supplier i inin its cial cases of the problem. During previous research different price for for one item of good kgood purchased from from supplier i in itsi j-th price for one one item of k kpurchased supplier in its the optimization problem, [9] designed a heuristic solution to the optimization problem, a heuristic solution to j-th versions (specializations) of[9] thedesigned Internet shopping problem were terval. However ISOP includes shipping costs that arethat specific j-th interval. However ISOP shipping costs are are interval. However ISOPincludes includes shipping costs that the basket andtoevaluate customer optimizeoptimize the shopping basket and due evaluate it forit for the the customer examined [5].shopping For example, NP-hardness of the optimi- specific to eachto This feature makes ISOP new enhanced version specific toshops. each shops. shops. This makes ISOP newnew enhanced each Thisfeature feature makes ISOP enhanced basket optimization problem to make it applicable for solving version basket optimization problem to make it applicable for to solving zation problem, [9] designed a heuristic solution optimize of the of TQD ofthe theproblem. TQD problem. version TQD problem. shopping cart in applications. the shopping basket andoptimization evaluate foron-line the applications. customer basket It Itis Itisis worthpointing pointingout outthat thatthe the decision version theTQD TQD worth pointing version ofofthe complexcomplex shopping cart optimization in iton-line worth out that thedecision decision version of the TQD Moreover, it is proven that the problem is not approximable optimization problem it applicable solving complex problem problemisisstrongly stronglyNP-complete. NP-complete. Moreover, polynomiMoreover, no no polynomialMoreover, it is proven that to themake problem is notfor approximable problem is strongly NP-complete. Moreover, no polynomialin polynomial time [3]. Theinarchetype of the presented prob- time shopping cart[3]. optimization on-line Moreover, al-time approximation algorithm with constant worst-caseratio ratio approximation algorithm with constant worst-case in polynomial time Thecustomer archetype of applications. the system presented prob- toit time approximation algorithm withPconstant worst-case ratio lem was a web-based assistance dedicated exists for the TQD problem (unless = N P). More inis proven that the problem is not approximable in polynomial exists for the TQD problem (unless ). More inforlem waspharmacy a web-based customer assistance system dedicated to foron the TQD problemonon (unless P permissible = N P). delay Morein inshopping that helps shopswas in aa webgeo- exists formation onmany many variations TQD(i.e. (i.e. permissible delay time [3]. The archetype of thecustomers presented find problem mation variations TQD pharmacy shoppingdefined that helps customers find shops in list a geographically range where the entire shopping could formation on many variations on TQD (i.e. permissible payments [18]), solutions, exact algorithms canfound bedelay based customer assistance system dedicated to pharmacy shop- inpayments in in [18]), solutions, exact algorithms [19][19] can be graphically defined range where the entire shopping list could be realized at the best total price [10]. For the general discusin payments in [18]), solutions, exact algorithms [19] can be the literature [8,21]. 20, 21]. ping that helps customers find shops in a geographically defined found in theinliterature [8, 20, be realized at the best total price [10]. For the general discussion on the e-commerce problems development and the way it range where the entire shopping list could be realized at the best found in the literature [8, 20, 21]. evolves over last For years should refers to the situation sion on the problems development andthe the wayinitthe 4. Extended Model totale-commerce price [10]. theone general discussion on e-commerce recent past [11]. evolves over last years one should refers to the situation in the problems development and the way it evolves over last years 4. Extended model 4.In Extended this section weModel present a known model of the optimization It is worth noting that there are some similarities recent past one[11]. should refers to the situation in the recent past [11]. beproblem, Internet Shopping Optimization Problem considering the well-known Facility Location Problem this section known model model the optimization It isISOP worthand noting that there some similarities similarities between In this sectionwe wepresent present aa known ofofthe optimization It is tween worth noting that there arearesome be- Indelivery costs and including price discounts (ISOPwD) [5]. Its (FLP) [12]. The main characteristics of the FLP (FLP) are space, ISOP and the well-known facility location problem [12]. problem, Internet shopping optimization problem considering problem, Internet Shopping Optimization Problem considering tween ISOP and the well-known Facility Location Problem mathematical program can be written as follows: the given customer locations given not given Themetric, main characteristics of the FLP areand space, theor metric, given delivery delivery costsand andincluding including price price discounts [5].[5]. Its Its costs discounts(ISOPwD) (ISOPwD) (FLP) [12]. Theformain characteristics of the FLP are space, m n n positions facility locations. A traditional FLP is to open a customer locations and given or not given positions for facility mathematical program can be written as follows: mathematical program canf j (p bei jwritten the metric, given customer and given orspace not given min ∑ ∑ xi j ) + ∑asd follows: jy j, number of A traditional facilities in locations arbitrary of the locations. FLP is positions to open a number of (continfacilities i=1 j=1 j=1 m n n positionsuous facilitypositions locations. A of traditional FLP is(discrete to openproba in problem) or in a subset given positions infor arbitrary of the space (continuous problem) or min ∑n ∑ f j (pi j xi j ) + ∑ d j y j , number lem) of facilities in arbitrary positions of the space (continand to assign customers to the opened facilities so that the a subset of given positions (discrete problem) and to assign cusxi j = 1, i = 1, . . .j=1 , m, s.t.i=1 ∑j=1 sum of opening costs and related distances between uous problem) or the in aopened subset ofcosts given (discrete probtomers to facilities sopositions that to thethe sum of opening costs j=1 their facility n lem) andcustomer to assign customers thecorresponding opened facilities solocations that the is and costslocations related toand theto distances between customer locations ≤ y . . , n, 0 ≤ x minimized. s.t. i j ∑j , ix=i j =1, .1,. . ,im,= j1,=. .1,. ,.m, and theircosts corresponding is minimized. sum of opening and costs facility relatedlocations to the distances between j=1 Discussions of FLPs can be found in [13, 14, 15, 16]. The Discussions FLPs can be foundfacility in [13–16]. The tradicustomer locations and of their corresponding locations is xi j ∈ {0, 1}, y j ∈ {0, 1}, i = 1, . . . , m, j = 1, . . . , n. traditional discrete FLP is NP-hard [17] in the strong sense. tional discrete FLP is NP-hard [17] in the strong sense. Note, 0 ≤ xi j ≤ y j , i = 1, . . . , m, j = 1, . . . , n, minimized. Note, however, thatgeneral the general problem dis- were a customer would like to buy products from a given set however, that the problem ISOPISOP with with priceprice discounts Discussions of FLPs can be found in [13, 14, 15, 16]. The . . {0, . , m} in ay jgiven of iInternet = 1, {1,. .. .. ,. ,n.n} counts be treated as a traditional FLP because cannot cannot be treated as a traditional discretediscrete FLP because there is M =x{1, , 1}, ∈ {0,set1}, = 1, . .shops . , m, Nj = ij ∈ traditional discrete FLP is NP-hard [17] in the strong sense. at the minimum total final price. There are the following given there is no evident motivation for a discount on the cumulative no evident motivation for a discount on the cumulative cost in a customer would like buy products from a given Note, however, that theofgeneral ISOP price disparameters and decision variables: cost in the sense distances. The important point tohere note the sense of distances. Theproblem important pointwith to note ishere that were where a customer wants to to buy products from a given set set M = {1, . . . , m} in a given set of Internet shops N = {1, . . . , n} counts cannot be treated as a traditional discrete FLP because is that this problem and ISOP are not each other’s sub-cases, this problem and ISOP are not each other’s sub-cases, while M = f1, …, mgprice in a given set of Internet shops of all products from shop j, N = f1, …, ng d j - delivery while the traditional discrete FLP is a special case of any of at the minimum total final price. There are the following given there is no motivation discount the of cumulative theevident traditional discrete for FLPa is a specialoncase any of these aty the minimum total final price. There are the following given j - usage indicator for shop j, these problems. problems. parameters anddecision decision variables: and variables: cost in the sense of distances. The important point to note here parameters pi j - standard price of product i in shop j, the ISOP Internet Shopping Optimization Problem Lookingatatand Internet shopping optimization problem with d – delivery price of all products shop j, j is that thisLooking problem are not each other’s sub-cases, usage indicator for product i in from shop from j, x i j priceindicator of all products shop j, d y j - delivery considering price discounts one can notice some similarities the focus on price discounts, one can notice some similarities – usage for shop j, j while the traditional discrete FLP is a special case of any of f j (T j ) - piecewise function (discounting) for final price of indicator price for shop j, j - usage with Quantitydiscount Discountproblem Problem(TQD) (TQD)[8]. [8].To To show with Total total quantity show the y p of product i in shop j, ij – standard all products T bought in shop j. these problems. price of product i in shop the similarities mostofofallalltotoshow show distinct distinct differences similarities andand most differenceswe we p x indicator for product i in j, shop j, i j - standard ij – usage Looking at enclose the Internet Shopping Optimization Problem functionfor model is(discounting) non-linear to itsof should mathematical formulation of TQD. One can deshould enclose mathematical formulation of TQD. One can x f piecewise function for final price usage product i in shop according j, j (T j) –indicator iA j -piecewise considering price discounts one can notice some similarities original nature and it is why we are using non-linear model fine G as set of indexed by k,by and as the of set n define G the as the setmofgoods, m goods, indexed k, Sand S assetthe all products T bought in shop j. for final price of function (discounting) f j (T j ) - piecewise with Total Quantity Discount Problem (TQD) the show amount here. suppliers, indexed by i. For each good k in G,[8]. dk asTo all products T bought in shop j. the similarities of all tois show distinct differences of good and k to most be procured defined. To each supplier iwe in S P ROPOSITION 1. In [3] ISOP is demonstrated to be part we sequence of intervals of Zi = {0, 1,One . . . , maxi}, piecewise function model is non-linear according507 to its Bull.associate Pol.mathematical Ac.: aTech. 64(3)formulation 2016 should enclose TQD. can de-in- ofANP-Hard problems. The ISOP can be reduced to the basic byofj. m Furthermore, for each supplier S and nature and it is why we are using non-linear model fine G asdexed the set goods, indexed by k, and S ias∈ the setinterval of n original 10.1515/bpasts-2016-0056 ISOPwD problem. define thegood minimum andd maximum number here. Zi , li j and Downloaded from De Gruyter Online at 09/28/2016 10:36:44PM suppliers,j ∈indexed byui.i j For each k in G, k as the amount via free in access the strong S TATEMENT 1. The ISOPwD is NP-Hard respectively that needs to be ordered from supplier of good ofk goods to be procured is defined. To each supplier i in S P ROPOSITION 1. In [3] ISOP is demonstrated to be part i to be in interval j. Finally, for each supplier i ∈ S, for each sense.
J. Musial et al.
5.2. Algorithm with forecasting. Observed weak points of Greedy led to the creation of a new, upgraded version. The local step choice analysis is more complicated than in the basic Proposition 1. In [3] ISOP is demonstrated to be part of Greedy. The forecasting method is looking for a step ahead NP- Hard problems. The ISOP can be reduced to the basic [4]. Therefore, technically this algorithm is not a strict greedy ISOPwD problem. algorithm. Sometimes it proposes J.a current which is Musial etsolution al. not optimal for the current step (local solution) but for a better 2 order solution. to hedge against Statement 1. The ISOPwD is NP-Hard in the strong sense. overallTable solution in hope of providing an optimalInglobal Price structure for poor performance of Greedy 2, we developed anoth In order to hedge against the instancesTable similar to that in 1 prod prod n another delivery heuristic algorithm, as Forecasting. Basic ISOP problem (known as strongly NP-hard) can beprod Table 2,2we ...developed denotedThe as main ide shop 1 forW − Forecasting. ε W −ε ...TheWmain − ε idea0 is to check thestep ahead one (forecasting bad s transformed to the ISOPwD considering the sub-case situation step shop 2 0 0 ... 0 W to the penultimate, the algori which all piecewise functions (discounts for each shop n 2 N) ahead (forecasting bad situations). From the first step to the antoeligible j for produc equals fj (Tj) = Tj. Therefore the extended ISOPwD problem penultimate, the algorithm calculates “rating” pick an shop eligible local optimum it looks one ste is NP-hard in the strong sense. shop j for product i. Instead for looking for a local optimum it Proof. Basic ISOP problem (known as strongly NP-hard) can case from occurring) and cal looks one step ahead (which prevents a bad case from occurf j (pi j +pi j+1 )+d j be transformed to the ISOPwD considering the sub-case for ring) and calculates the choosing factor asTT + for every j + 2 which all piecewise functions (discounts for each shop n ∈ N) with the calculated lowest calculated v for every shop j and picking the one with the lowest 5. Proposed algorithms equals f j (T j ) = T j . Therefore the extended ISOPwD problem i isactaken into a value. In each following step next productnext i is product taken into is NP-hard in the strong sense. T =j.TAfterwards, + f j (pi j ) +ddj j . Afterwa The ISOPwD is strongly NP-hard. Moreover, to our best knowl- count, i = i + 1. T is set as T = T + fj (pij) + d The last step of the algorit edge it cannot be reduced to one of the known problems. There- is set to 0. 5. Proposed Algorithms (forecast not work beyo fore, it is apparent right to propose heuristic solution, simple The last step of the algorithm works in a different waycould (forecast The ISOPwD is strongly NP-hard. Moreover, to our best last product is selected in its efficient greedy algorithms [5, 22] that use local knowledge could not work beyond the set of products i). The last product knowledge it cannot be reduced to one of the known probvalue T + f (p ). j ji(p j ij). and do not allow any backtracking for efficiency purpose. It is selected in its eligible shop j with minimum value T + f Therefore, right to propose sois worth to notice that the greedy algorithmlems. does not always it is apparent Piecewise function fj ( pijheuristic + pij+1) returns total costs of prodlution, an simple efficientucts greedy 22] that discount use local for shop Piecewise function f j (pi j + yield optimal solutions. However, it could provide optimal i andalgorithms i + 1 after[5,applying j. knowledge and do not allow any backtracking for efficiency products i and i + 1 after apply or close to optimal solution using much less resources and time purpose. It is worth to notice that the greedy algorithm does than other optimal working algorithms (i.e., full scan). 5.3. Cellular processing algorithm. The cellular processing Cellular Processing Alg not always yield optimal solutions. However, it could provide 5.3. based algorithm is a new pseudo-parallel optimization approcessing based algorithm is or close to optimal solution usingmultiple much less re5.1. Greedy algorithm. In the first heuristic an foroptimal the ISOPwD, proach [23]. It includes processing cells that explore tion approach [23]. It include sources and time other optimal working algorithms (i.e., denoted as Greedy [9], products are considered in a certain or-thandifferent regions of the search space. Each processing cell can explore different regions of th fullorders scan).and the be implemented using population or search der. The algorithm is run for various product based heuristics ing cell can be implemented u best solution found is presented to the customer. Let us consid- or an hybridization of them. The main idea and the principle 5.1. Greedy algorithm - Greedy. In the first heuristic for the heuristics or an hybridization er that the products are sorted in an ascending order 1, …, m. of the algorithm is to split a sequential algorithm into several ISOPwD, denoted as Greedy [9], products are considered in a principle of the algorithm is to Values of the total delivery for each store are initially set as dj, pseudo-parallel processing (i.e., cell) modules, so that each cell certain order. The algorithm is run for various product orders several pseudo-parallel proces j = f1, …, n. yj is the shipping indicator. The standard price for can explore different regions of the search space. The main and the best solution found is presented to the customer. Let each cell can explore different each product i in shop j is set as pij. In iteration i of Greedy, feature of the new approach is that the iterative verification us consider that the products are sorted in an ascending order main feature of the new appr product i is selected in its eligible shop j with minimum valof the stagnation conditions prevents wasting time on unnec1, . . . , m. Values of the total delivery for each store are initially cation of the stagnation condi ue T + f j (pij) + d j, and the corresponding T-value is re-set: essary tasks. set as d j , j = 1, . . . , n. y j is the shipping indicator. The stan- unnecessary tasks. T = T + fj (pij) + dj. Afterwards, dj is set to 0.dard price for each product Wei in design based on We design an opnew algorithm i a pseudo-parallel shop an j isnew set asalgorithm pi j . In iteration Piecewise function fj (pij) returns value of product i after applytimization approach introduced by [23]. The new algorithm we introduced approach of Greedy, product i is selected in its eligible shop j with min- timization ing discount for shop j. design is a cellular processing approach, which includes muldesign is a cellular processing imum value T + f j (pi j ) + d j , and the corresponding T -value is We observed that Greedy demonstrates very good perfortiple processing cells that helps to explore different regions of tiple processing cells that help re-set: T = T + f j (pi j ) + d j . Afterwards, d j is set to 0. mance on the experimental data. However, it can provide a soluthe search optimization space. the search optimization space. Piecewise function f j (pi j ) returns value of product i after aption whose value is much worse than the optimum. One can candidate of the alg The of components plying discount for shop The j. components of the algorithm are a pool consider products and delivery prices in Table 2. The experi-that Greedy solutions, generated either a constructive or a random algo- either by a generated We first observed demonstrates very by good perfor- solutions, mental computation results of Greedy can be found in [5] andexperimental [9]. rithm data. and a cell set that is simple, and a cell set that is sim mance on the However, it can provide independent, a so- rithmself-contained For any product sequence algorithm Greedy selects all prodapplied work the subset solutions that with the s and applied to work lution whose value is and much worseto than thewith optimum. One of cancandidate ucts in shop 1, which costs nW ¡ ε, while anconsider optimal products solution andwere givenprices to solve. This2.process continues untilgiven the cells stall This proc to solve. delivery in Table The first ex- were is to select all products in shop 2, which costsperimental W. all results solutions in theircan local Afterall that, the solutions solutions in their local opt computation of Greedy be optimum. found in [5] return to the pool, and the cells share information withpool, eachand the ce return to the and [9]. Table 2 other inalgorithm order to Greedy escape selects from the continue other inand order to escape from For any product sequence all local prod- optimum Price structure for poor performance ofucts Greedy search global optimum. in shop 1, which the costs nW −for ε, the while an optimal solution the search for the global optim In thisatwork we generated this2, work wecosts generated random. is to select all products inInshop which W . the candidate solutions prod 1 prod 2 … delivery prod n dom. Weasdesigned We designed an iterated local search algorithm (ILS) the core an Iterated thesimplicity core of theand cells. This c 5.2. withofforecasting - Forecasting. the cells. This choice was Observed made due toasthe shop 1 … 0 W–ε W–ε W – ε Algorithm plicity and high configurability weak points of Greedy led to the creation of a new, upgraded high configurability of this structure, which allows it to be highshop 2 0 0 … 0 W it to configurations. be highly scalable and to r version. The local step choice analysis is more complicated ly scalable and to run in a variety of hardware than in the basic Greedy. The forecasting method is looking figurations. Moreover, the ILS for a step ahead [4]. Therefore, technically this algorithm is metaheuristic that can be seen technique not a strict greedy algorithm. Sometimes it proposes a current 508 Bull. Pol. erful Ac.: Tech. 64(3) for 2016extending s The algorithm starts off by solution which is not optimal for the current step (local solu- 10.1515/bpasts-2016-0056 Then, a local search process i tion) but for a better overall solution in hope of providing an Downloaded from De Gruyter Online at 09/28/2016 10:36:44PM After that, following an it optimal global solution. via tion. free access A piecewise function model is non-linear according to its original nature and it is why we are using non-linear model here.
Algorithms solving the Internet shopping optimization problem with price discounts
Fig. 1. Algorithm Cellular-ILS
Moreover, the ILS algorithm is a trajectory-based metaheuristic that can be seen as a straight-forward, yet powerful technique for extending simple local search algorithms. The algorithm starts off by generating an initial solution. Then, a local search process is applied to the candidate solution. After that, following an iteration based approach, it seeks to improve the solutions from one iteration to the next. At each iteration, a perturbation of the obtained local optimum is carried out. The perturbation mechanism introduces a modification to a given candidate solution to allow the search process to escape from a local optimum. A local search is applied to the perturbed solution. The new solution is then evaluated and accepted as the new current solution under some conditions. The algorithm finishes when the termination condition is met. The proposed local search algorithm comprises the following methods: ● The criticalElements obtains a list of elements that may be promising in finding a better search region. This case comprises products that are more expensive and are bought only in one store, so the heuristic favors elements that are cheaper and are bought in the same store. ● The evaluateChange method explores the entire neighborhood of search to find the permutation that improves the current solution, but without perturbing it, and returns the store in which it is more convenient to purchase the product under consideration. ● The makeChange makes the proposed change given by evaluateChange.
● Once the change is performed, the applyDiscount method applies the discounts on the price of the products (piecewise function). The number of cells used in this configuration was established to five, each one starting from a different region of the search space and performing the process described beforehand. The communication, once all were stagnated, was done by comparing the quality of the solutions obtained by each processing cell. The stagnation was determined by the number of consecutive iterations without improvement, that in the case of each cell was established to ten iterations and for the whole process to five iterations without improvement. Coded algorithm was carefully designed and build especially to solve ISOPwD problem taking into account its specific and original nature. The algorithm is original and created from the basis especially for this purpose. It is natural to try to relate Cellular to Hyper heuristic. However, it is worth pointing out that Cellular differs from the Hyper heuristic approach in that each processing cell has complete knowledge on the problem that is being solved, as opposed to the Hyper heuristic, where exists a domain barrier among the controller and the low level heuristics. Another main difference resides in that the processing cells are adapted and tailored to the problem that is being solved, while the Hyper heuristic depends on generic low level heuristics that can be applied to a greater variety of optimization problems, according to Burke et al. [24]. 5.4. Minimum-minimum algorithm. The min-min algorithm is commonly and widely used in the context of scheduling independent tasks in distributed computing to minimize the total completion time of the tasks [25, 26]. One of the main advantages of this approach is that in general is capable to obtain good quality solutions with a relatively small computational cost, but as it schedules those tasks or processes with minimum cost first, it may result in an imbalanced solution [27]. [28] described the process of the heuristic min-min approach. In this paper we extended min-min to ISOPwD for the selection of a list of products in a set of web-stores. The extended version is called MinMin in this paper. The process begins with the search of the product in the list of unassigned products N, which minimizes the total cost TC in the shopping cart among the different stores M, given the cost of the product, Cij plus delivery cost, Di . In the case of a tie, the product with the lower delivery cost is selected, and if both stores have the same delivery cost, the product is chosen randomly. Once assigned, the total cost of the shopping cart is updated with the selected product, and it is removed from the unassigned product list N. The process continues until the unassigned product list is empty. 5.5. Branch and bound algorithm. To calculate the optimal solution, we designed a branch and bound algorithm [29]. The branch and bound (BB) algorithm starts off by calculating an upper bound (UB) employing the solution given by
509
Bull. Pol. Ac.: Tech. 64(3) 2016 - 10.1515/bpasts-2016-0056 Downloaded from De Gruyter Online at 09/28/2016 10:36:44PM via free access
J. Musial et al.
the Cell Processing Algorithm and proceeds to branch the first Internet bookstores in Poland such as empik.com and merlin. pl) level of the search tree in the stack. Subsequently, the algorithm to create our own model with instances generator as close to pops the top element of the stack and evaluates the objective reality as possible. value it would have if it were part of the current solution. If The working model was prepared as follows. In the the partial solution exceeds the limit given by the upper bound computational experiments we assume that n 2 f20, 40g, (UB), the current branch is fathomed. Otherwise, if it were not m 2 f2, 3, …, 10, 15, …, 100g. For each pair (n, m), 100 ina leaf of the tree, the algorithm would pile up the following stances were generated. In each instance, the following values Musialwere et al.randomly generated for all i and j in the corresponding elements within the stack. If it were a leaf, it would meanJ.that J. Musial et al. the current solution is better than the best global solution found ranges. Delivery price: d j 2 f5, 10, 15, 20, 25, 30g, publishmended price ofprice book ri ∈ boundsowith the new value found. 10, 10, 15, 15, 20,20, 25}, and far. Consequently it would update the upper bound with the er’s recommended of i: book i: r{5, 25g, andprice of i 2 f5, mended price of book i: r bound with the new value found. ∈ {5, 10, 15, 20, 25}, and pric new value found. i in bookstore aij ¸ 0.69r ∈ [aiji, i jb,ijb],i jwhere ], where ai j ≥j,0.69r bookof ibook in bookstore j: j:ppiijj 2 [a This process continues as long as there are elements in the price j, ∈ [a , b ], where a ≥ 0.6 book i in bookstore j: p This long as there are elements in the , and the structure of intervals [a , b ] is prepared Thisprocess processcontinues continues as long as there are been elements ≤ 1.47r , and the structure of intervals [a , b ] is prepared bij ∙ 1.47r i j i j i j i j j ij ij stack, which means that the as whole search tree has ex- inbthe ij j ij ij stack, which means that the whole search tree has been exas follows: ≤ 1.47r , and the structure of intervals [a , b ] is prep b stack, which means that the whole search tree has been exas follows: i j j i j i j plored. Founded upper bound is now considered as the optimal plored. Founded upper bound is now considered as the plored. Founded upper bound is now considered as optithe optimal as follows: minimum solution of the instance being evaluated. mal solution of the instance being evaluated. We consider it [32%] minimum solutionit of instance being evaluated. We consider as the a B&B model, even if it was implemented as a B&B model, even if it was implemented by ourselves, as [32%] minimum + (median−minimum) We consider it as a B&B model, even if it was implemented by ourselves, follows described the procedure in the lit4 it follows as theitprocedure in the described literature. Generating minimum + (median−minimum) by ourselves, as it follows the procedure described in the lit[9%] erature. Generating the search tree and stopping further ex4 the search tree and stopping further exploration where it’s not (median−minimum) [9%] erature. Generating the search tree and stopping further exploration where it’s not possible to find a better solutions on a possible to find a better solutions on a branch. minimum +
2 branch.ploration where it’s not possible to find a better solutions on a [9%] minimum + (median−minimum) 2 branch. (median−minimum) [9%] minimum + 6. Computational experimentsresults results 1.33 6. Computational experiments minimum + (median−minimum) [8%] 1.33 6. Computational experiments results Computational experiments were performedand and divided divided into [8%] Computational experiments were performed into median two groups thecomputational computational –divided a set Computational experiments werecomplexity performedtime) and two groups (due(due to to the complexity time) – a into [13%] median experiments BB exact algorithm and a set two groups including (due to the computational complexity set ofofexperiments including BB exact algorithm andofa expersettime) of – a [13%] iments optimalincluding solution asBB a comparison of all heurismedian + (maximum−median) set ofwithout experiments algorithm andalla set of 4 experiments without optimal solution asexact a comparison of tics to evaluate scalability issues by increasing instances’ size. median + (maximum−median) [6%] 4 experiments without optimal solution as a comparison of all heuristicsAll toalgorithms evaluate scalability issues by instances’ were implemented in increasing the PHP programming (maximum−median) [6%] heuristics to evaluate scalability increasing instances’ size. language median + in its 5.2 version. The aim issues of usingbyPHP was to easy 2 size. the algorithms median + (maximum−median) Allembed algorithms were implemented in the PHP programming in a website – to work in a conditions that [11%] 2 All algorithms were implemented the PHP programming are similar to its future final version. language in its 5.2 version. The aim of usinginPHP was to easy (maximum−median) [11%] median + 1.33 Thealgorithms experiments carried in an using Apple MacBook in its 5.2 version. The aim of PHP was to easy embedlanguage the inwere a website –out to work in a conditions median + (maximum−median) [12%] 1.33 Pro, version anfinal Intel processor at 2.3 embed the algorithms in aversion. website –running to work in aGHz, conditions that are similar to8.1 its with future maximum [12%] physical corestoand two virtual for each one and 4 GB that are similar future version. Thetwo experiments were its carried outfinal in an Apple MacBook Pro, where minimum =maximum 0.69r and maximum = 1.47r. DDR3 main memory at 1333 MHz. The experimentations were where and maximum = 1.47r. Thewith experiments were carried out in an Apple MacBook Pro, minimum = 0.69r version 8.1 an Intel processor running at 2.3 GHz, two where minimum = 0.69r and maximum = 1.47r. made on a virtual machine running GNU/Linux Mint 11 on version 8.1 with an Intel processor running at 2.3 GHz, two physical cores and two virtual for each one and 4 GB DDR3 Basedononrealreal world observation the following discounting a Mac OS X 10.7 host with two virtual cores assigned and 2 GB Based world observation the following discounting physical twoThe virtual for each one were and 4made GB DDR3 main of memory atcores 1333and MHz. experimentations Based on real world observation the following discoun main memory. function was proposed. function was proposed. main memory 1333 MHz. The experimentations were made function on a virtual machineatrunning GNU/Linux Mint 11 on a Mac was x proposed. if x ≤ 25, on a host virtual machine running GNU/Linux Mint2A chal11 World working andcores instances generator. OS X6.1. 10.7 with twomodel virtual assigned and GBon of a Mac x J. Musial et al. Cellular algori Fig. 4. Algorithm run time comparison - experiment with 20 shops m = 5 Forecas (including the optimal solution). as the best Cel Table 3 ucts. On the other hand, all heuristics are very fast so the ideamore is than m > theitsother hand, heuristics very fast issues so A comparison of algorithm results’ dispersion for the coefficient of products. to furtherOn test quality andallrun time forare scalability by among the test the idea is to further test itsofquality and runThe timeresults for scalability variation (CV) normalized measure increasing the number products m. are presented the quality of s issues bynext increasing the number of products m. The results are in the subsection after the dispersion analysis. products m (co products Greedy Forecasting Cellular MinMin BB presented in the next subsection after the dispersion analysis. The last heu 12 2.62% 1.02% 0.78% 6.33% 0.00% 6.3. Scalability: a set of experiments for heuristic algolutions for a lo 6.3. Scalability: a set ofgroup experiments for heuristic algo- abebigger numb rithms. In the second of experiments a comparison 13 3.96% 2.11% 4.23% 7.67% 0.00% rithms. In the second group of experiments a comparison be- the Forecasting tween the set of algorithms was done regarding scalability. In tween the set of algorithms was done regarding scalability. In provides soluti 14 3.63% 2.53% 1.68% 7.21% 0.00% these examples we increase the size of the instances as follow: these examples we increase the size of the instances as follow: than best Cellu n 2 f20, 40g, m 2 f5, 10,. .15, …, 100g, and piecewise discount15 3.72% 3.61% 2.33% 6.55% 0.00% n∈ {20, 40}, m ∈ {5, 10, 15, . , 100}, and piecewise discountFigure 6 co ing functions follow definition from subsection 6.1. ing functions follow definition from subsection 6.1. 16 3.27% 3.31% 2.17% 5.69% 0.00% sults dispersio Foreach each instances generated. Instances For pairpair (n, (n, m), m), 100 100 instances were were generated. Instances value (deviatio weregenerated generated using procedure described 17 3.62% 3.70% 2.71% 6.32% 0.00% were using the the samesame procedure described in Sec-in Secrithms) from 1 tion 6.2. tion 6.2. value for each 18 3.78% 3.81% 2.54% 5.93% 0.00% to compute the optimal solution valuesult in and the be As As we we are are not not ableable to compute the optimal solution value in 19 2.67% 2.93% 1.82% 4.84% 0.00% a reasonable computational timetime givengiven the size the of instances a reasonable computational theofsize the instances Forecasting Cell , be best weweevaluate error γ γofofeach evaluatethetherelative relative error eachstrategy strategyunder undereach each met10 3.64% 3.28% 1.84% 4.48% 0.00% tion value is pr F(X) metric. The relativeerror errorisisformally formally defined defined as = F(X) , where ric. The relative as γγ = products for 10 best F(X)F(X) represents the solution found by by a heuristic and where represents the solution found a heuristic andF(X)ful bestinformation thesolution best solution among all heuristics. test from a given instance (sometimes optimum or close to, F(X) denotes the best among all heuristics. best denotes the algorithm r sometimes quite far from it). We used the results of the BB is represented measure (perce set of of experiments withwith 20 shops. FigureFigure 5 presents algorithm as a reference value. In each cell the CV value is 6.3.1. 6.3.1.AA set experiments 20 shops. 5 presents solution values to the error error for thefor100 presented for the same instance of n shops and m products for average average solution values to relative the relative thein-100 in-Figure 7 exp stances over n =n = 20 20 shops, which are obtained by theby evaluated algorithm run t 100 computational tests. stances over shops, which are obtained the evaluated heuristics. from 100 com An important factor to consider in the context of online heuristics. 0.001
0.0001
1
2
3
4
5
6
7
8
9
10
Products
shopping is the run time needed by any algorithm to compute J. Musial et al. a solution. Figure 4 displays the results regarding the compari- 8 20 Shops 20 Shops Table 3 1.181.18 We can observe that Cellular outperforms the rest of son of algorithmsArun time in microseconds [ms] (including the Greedy Greedy comparison of algorithm results’ dispersion for the coefficient of variation Forecasting Forecasting the algorithms. For all instances n, m where m = Cellular Cellular (CV) normalized measure value from optimal solution). Each cell represents the average MinMin 1.161.16 {5,MinMin 10, 15, . . . , 100}, it provides the best quality solution. The Forecasting Cellular MinMin BB 100 computational products tests forGreedy 20 shops. behavior of Cellular is the same than in the first set of experi2 products 2.62% (m ∙ 5) 1.02% MinMin 0.78% is the 6.33% 0.00% 1.141.14 ments. For a low number of fastest 3 3.96% 2.11% 4.23% of 7.67% 0.00% algorithm. It can be observed that when the number products 1.121.12 4 3.63% 2.53% 1.68% 7.21% 0.00% is bigger than five (m > 5), Greedy outperforms the rest of 5 3.72% 3.61% 2.33% 6.55% 0.00% 1.1 1.1 algorithms. Differences between algorithms 6 3.27% all3.31% 2.17%are very 5.69% sig0.00% nificant. 7 3.62% 3.70% 2.71% 6.32% 0.00% 1.081.08 8 for BB 3.78%grows 3.81% 2.54% 5.93% 0.00% Run time execution exponentially and it was 2.67% 0.00% 1.061.06 impossible to prepare 9experiments for2.93% a bigger1.82% number4.84% of prod20 Shops
Relative error Relative error
1.18
1.16
Greedy Forecasting Cellular MinMin
1.14
10
3.64%
3.28%
1.84%
4.48%
0.00%
1.041.04
1.1
1.08
1.06
20 Shops 20 Shops 1.021.02
20 Shops
1000 1000
1.04
1000
Greedy Greedy Forecasting Forecasting Cellular Cellular MinMin MinMin BB BB 100 100
Relative error
1.12
Greedy Forecasting Cellular MinMin BB
100
1 1
1.02
10 10
20 20
30 30
40 40
1 10 10
Time [ms]
Time [ms]
1
1
0.1
0.1
30
40
50 50 60 60 Products 50Products 60 70
70 70 80
80 90
80
90
90
100 100
100
Products
Fig. 5. Algorithm resultscomparison comparison -–experiment withwith 20 shops Fig. 5. Algorithm results experiment 20 shops F(X) (without solution).Relative Relativeerror error is = F(X) , where where F(X) (without BB BB solution). is γγ = F(X) repbest denotes the represents the solution by a heuristicand andF(X) F(X)best resents the solution foundfound by a heuristic the best best denotes best solution among all heuristics. solution among all heuristics
10 Time [ms]
10
20
1
0.1
0.01
Greedy presents better scalability than MinMin and Forecasting by providing the second best quality of solutions. For We canthan observe Cellular outperforms the restofofthethe almore m > 15that products it stabilized within 10-11% gorithms. Foralgorithm all instances n, mFor where m 2 f5, …, 100g, Cellular solution. a low number10, of 15, products Fig. 4. Algorithm run time comparison - experiment with 20 shops 0.001 0.001 m = 5 Forecasting propose almost the same of solution (including the optimal solution). it provides the best quality solution. The quality behavior of Cellular the best Cellular more expensive). From is theassame than in thealgorithm first set(0.48% of experiments. more than m > 15 products it becomes the worst algorithm 0.00010.0001 are very 1 1 2 2 3products. 3 4 4On the 5 5other 6 hand, 6 7all 7 heuristics 8 8 9 9 10fast 10 so Greedy presents better scalability than MinMin and Foreamong the tested ones. Moreover, it is easily noticeable that Products Products the idea is to further test its quality and run time for scalability casting by providing second quality number of solutions. the quality of solutionthe degrades withbest the increasing of issues by increasing the number of products m. The results are For more than m > 15 products it stabilized within 10‒11% Fig. 4. Algorithm run time comparison – experiment with 20 shops products m (compared to the best Cellular algorithm). presented in the next subsection after the dispersion analysis. last heuristic algorithm - MinMin provides the worst so(including the optimal solution) of the The Cellular algorithm solution. For a low number of lutions for a lower number of products (m < 15). However, for 6.3. Scalability: a set of experiments for heuristic algo- a bigger number of products it provides better solutions than rithms. In the second group of experiments a comparison be- the Forecasting algorithm (but worse than Greedy). MinMin 512 Tech.bigger 64(3) 2016 tween the set of algorithms was done regarding scalability. In provides solutions between 6% andBull. 16%Pol. (on Ac.: average) these examples we increase the size of the instances as follow: than best Cellular solutions. - 10.1515/bpasts-2016-0056 n ∈ {20, 40}, m ∈ {5, 10, 15, . . . , 100}, and piecewise discountFigure comparison of heuristic algorithm reDownloaded from 6Decontains GruyteraOnline at 09/28/2016 10:36:44PM ing functions follow definition from subsection 6.1. free access sults dispersion. Each point represents thevia standard deviation For each pair (n, m), 100 instances were generated. Instances value (deviation around the best value calculated from all algo0.001
0.01 0.01
0.0001
1
2
3
4
5
6
Products
7
8
9
10
Greedy presents better scalability than MinMin and ForeGreedy presents better scalability than MinMin and Fore- casting by providing the second best quality of solutions. For casting by providing the second best quality of solutions. For more than m > 15 products it stabilized within 10-11% of the Fig.more 4. Algorithm comparison experiment with 20optimization shops than m Algorithms >run 15time products it stabilized within 10-11% of the Cellular solving the-Internet shopping problemalgorithm with pricesolution. discounts For a low number of products m = 5 Forecasting propose almost the same quality of solution (including the optimal solution). Cellular algorithm solution. For a low number of products xperiment with 20 shops m = 5 Forecasting propose almost the same quality of solution as the best Cellular algorithm (0.48% more expensive). From more than m > 15 products it becomes the worst algorithm Shops 2020 Shops as the best algorithm moreare expensive). From easily products. OnCellular the other hand, all(0.48% heuristics very fast so noticeable that the quality of solution degrades with 0.08 0.08 among the tested ones. Moreover, it is easily noticeable that Greedy Greedy more than m > 15 products it becomes the worst algorithm the increasing number of products m (compared to the best Forecasting the idea is to further test its quality and run time for scalability Forecasting ristics are very fast so Cellular Cellular the quality of solution degrades with the increasing number of among the tested the ones. Moreover, it is easily MinMin MinMin 0.07 issues by increasing number of products m. Thenoticeable resultsCellular arethat algorithm). 0.07 run time for scalability m (compared to the best Cellular algorithm). the quality of solution degrades with the increasing number of products heuristic algorithm – MinMin provides the worst ucts m. The results are presented in the next subsection after the dispersion analysis. The last The last heuristic algorithm - MinMin provides the worst soproducts m (compared to the best Cellular algorithm). 0.06 0.06 solutions for a lower number of products (m 15 products it becomes measuremeasure (percentage value).value). average solution valuesvalue). to the relative error for the 100 inFigure 7 exposes information regarding the comparison of measure (percentage hops. Figure 5 presents the worst algorithm among the tested ones. Moreover, it is Figure 7 exposes information regarding the comparison over n7 = 20 shops, which areregarding obtained by evaluated of algorithm run time [ms]. Each cell represents the average value error for the 100 in- stances Figure exposes information thethe comparison of algorithm run time [ms]. Each cell represents the average from 100 computational tests for 20 shops. We can observe tained by the evaluated heuristics. algorithm run time Each cell represents the average valuefrom Table[ms]. 4 value 100 computational tests for 20 shops. We can obfrom 100 computational tests for 20 shops. We can observe Heuristic algorithm results’ dispersion for the coefficient serve that for a low number of products m = 5 MinMin is the 8 of variation (CV) normalized measure Bull. Bull. Pol. Pol. Ac.: Ac.: Tech. Tech. XX(Y) XX(Y) 2016 2016 0.001 0.001
0.0001 0.0001
11
22
33
44
55
66
77
88
99
10 10
Products Products
8
9
10
Standard Deviation Standard Deviation
fastest algorithm. For other instances of the ISOPwD problem (number of products m > 5), algorithm Greedy is the fastest. Differences between all algorithms are very significant. The following example illustrates these differences. A few observations are worth noting.
Bull. Pol. Ac.: Tech. XX(Y) 2016
products
shops
Greedy
Forecasting
Cellular MinMin
15
20
4.45%
2.92%
2.62%
6.58%
10
20
3.75%
3.73%
0.94%
5.24%
15
20
2.79%
3.17%
0.00%
3.86%
20
20
3.18%
3.19%
0.00%
3.93%
25
20
2.65%
2.65%
0.00%
3.74%
30
20
2.71%
3.09%
0.00%
3.26%
35
20
2.36%
2.47%
0.00%
3.50%
40
20
3.42%
2.36%
0.00%
3.82%
45
20
3.04%
2.38%
0.00%
3.62%
50
20
3.09%
2.40%
0.00%
3.61%
55
20
2.69%
2.06%
0.00%
3.29%
60
20
2.51%
2.27%
0.00%
3.40%
65
20
2.10%
2.17%
0.00%
3.72%
70
20
3.77%
2.12%
0.00%
3.62%
75
20
2.57%
2.35%
0.00%
3.37%
80
20
2.73%
1.98%
0.00%
3.23%
85
20
3.48%
2.36%
0.00%
3.56%
90
20
2.06%
1.99%
0.00%
3.05%
95
20
3.47%
1.98%
0.00%
3.49%
100
20
2.71%
3.09%
0.00%
3.26%
Shops 20 20 Shops
2000 2000
Greedy Greedy Forecasting Forecasting Cellular Cellular MinMin MinMin
1500 1500
Time [ms] Time [ms]
7
1000 1000
500 500
00
1010
2020
3030
40 40
50 50 Products Products
60 60
70 70
80 80
90 90
100100
Fig. 7. Algorithm processing time comparison – experiment with 20 shops (without BB solution)
6.3.2. A set of experiments with 40 shops. There were no changes regarding the behavior of Cellular, it provides the best quality of solutions. That is, for all instances n, m it provided the best solution.
513
Bull. Pol. Ac.: Tech. 64(3) 2016 - 10.1515/bpasts-2016-0056 Downloaded from De Gruyter Online at 09/28/2016 10:36:44PM via free access
J. Musial et al.
4040 Shops Shops 1.16 1.16
1.14 1.14
Greedy Greedy Forecasting Forecasting Cellular Cellular MinMin MinMin
1.12 1.12
Relative error Relative error
1.11.1
1.08 1.08
1.06 1.06
1.04 1.04
1.02 1.02
1
1
10 10
20 20
30 30
40 40
50 50 Products Products
60 60
70 70
80 80
90 90
100 100
Fig. 8. Algorithm results comparison (Relative error measure) – experiment with 40 shops (without BB solution)
Regarding 40 shops experiment general observations are similar to the ones from the 20 web-stores experiment. However, some differences are significant and worth to notice. All standard deviation and CV values are lower for the experiment with 40 shops. Algorithm Greedy provides a lower dispersion level than Forecasting (slightly but still). Observations for the Cellular and MinMin algorithms are similar to the experiment with 20 web-stores. Regarding the computation time (see Fig. 10), for all instances with n = 40 web-stores, algorithm Greedy is the fastest. Differences between all algorithms are very significant. Calculation times are even more important here, since goal application will work as an online web-site. There is plenty of research showing that users do not want to wait to see a webpage content, because they quickly lose interest. An up-to-date survey is given in [32]. 40 Shops 40 Shops
4000 4000
3000 3000
2000 2000
1000 1000
0 0
4040Shops Shops Greedy Greedy Forecasting Forecasting Cellular Cellular MinMin MinMin
Standard Deviation Standard Deviation
0.06 0.06
0.05 0.05
0.04 0.04
0.03 0.03
0.02 0.02
0.01 0.01
00 1010
2020
3030
4040
5050 Products Products
6060
7070
8080
9090
100 100
Fig. 9. Algorithm standard deviation chart – experiment with 40 shops
514
10 10
20 20
30 30
40 40
50 50 Products Products
60 60
70 70
80 80
90 90
100 100
Fig. 10. Algorithm processing time comparison – experiment with 40 shops (without BB solution)
0.08 0.08
0.07 0.07
Greedy Greedy Forecasting Forecasting Cellular Cellular MinMin MinMin
Time [ms] Time [ms]
Figure 8 displays relative error average results. We can notice that characteristics of the results are not far away from experiment with m = 20 shops. Algorithm Greedy provides the second best quality of solutions. For more than m > 25 products it stabilized within 9.6‒10.8% of the Cellular algorithm solutions. For a low number of products m = 5 Forecasting computes almost the same quality of solution as the best Cellular algorithm (1.55% more expensive). From more than m > 10 products it becomes the worst algorithm among the tested ones. Moreover, it is easily noticeable that the quality of solution degrade with the increasing number of products m (compared to the best {Cellular algorithm}) from 1.55% to 15.83% percent on average greater than Cellular. MinMin provides the worst solutions for a lower number of products (m = 5). However, for a bigger number of products it provides better solutions than the Forecasting algorithm (but worse than Greedy). The algorithm also provides better scalability than the previous experiment for smaller number of web-stores. Figure 9 contains a comparison of heuristic algorithm results dispersion.
5000 5000
Concerning all results collected in the computational experiment it can be stated that the Cellular algorithm provides the best solutions among all heuristic algorithms. The algorithm also computes more stable solutions than the rest of compared algorithms. On the other hand, it is the slowest one. What is worth noting is that it could be much more efficient when parallelized on five machines (each cell can process on a different machine). Another important consideration is that in a real e-commerce web site the instances of the algorithms will be executed in the server machines (i.e. in a web-server of a data center), which usually are more powerful than customers’ machines. Algorithm Greedy can result in a lot of interest due to very fast processing times in which it can provide a good quality of solutions. Both algorithms Greedy and Forecasting can be parallelized for two machines (both algorithms make two separate executions and pick the best of these two at the final step). The performance of MinMin can be improved by using a local search algorithm [33]. To improve the run time execution MinMin can be easily parallelized. There are different Bull. Pol. Ac.: Tech. 64(3) 2016
- 10.1515/bpasts-2016-0056 Downloaded from De Gruyter Online at 09/28/2016 10:36:44PM via free access
Algorithms solving the Internet shopping optimization problem with price discounts
parallel implementations of the min-min scheduling algorithm reported in literature [33, 34], both CPU and GPU parallel implementations. Given the fast run time execution and the good quality of solutions that Greedy, Forecasting and MinMin can compute in average their solutions can be used as initial seeds for Cellular. It could help to improve the run time execution of the algorithm without degrading its performance. Special database system optimization could be performed to decrease all-around computational times [35].
7. Conclusions In this paper, we addressed the Internet shopping optimization problem including price discounts. For the practical application, we created a working model as close to real Internet shopping conditions as possible. The main reason was their wide choice in Internet web-stores and frequency purchase. A new set of algorithms is presented, three heuristics and a new cellular processing optimization algorithm. Computer experiments demonstrated their good performance. The results generated by the algorithms were compared to the optimal solutions computed by a proposed branch and bound algorithm. As for current version of the problem (doesn’t count yet with a linear model) it is not possible at the moment to perform a relaxation technique on the restrictions or to implement it on a MIP solver such as CPLEX or Gurobi. Nonetheless, presentation of a valid MILP model is certainly worth considering and interesting future step – this will be one of our next challenges. In the future, we plan to extend the Internet shopping model to include additional constraints such as: minimum delivery times, incomplete shopping lists realization, and maximum budget. Moreover, we plan to propose new quality algorithms for dual discounting function ISOP [36]. The experimental results demonstrated the potential by the new Cellular processing optimization algorithm. However, one of the main concerns for its applicability to ISOP is the time needed to find a solution. To alleviate the problem and also deal with scalability we consider investigating a parallel version of the algorithm on a GPU infrastructure. Furthermore, there are some interesting similarities between ISOP and exciting Cloud Brokering [37]. Linking it with ISOP may result in the possibility of using algorithms prepared for ISOP. Acknowledgments. This study was partially supported by the FNR (Luxembourg) and NCBiR (Poland), through IShOP project, INTER/POLLUX/13/6466384.
References [1] K. M. Tolle and H. Chen, “Intelligent software agents for electronic commerce”, Handbook on Electronic Commerce, Springer Berlin Heidelberg, 365–382 (2000). [2] S. Rose and A. Dhandayudham, “Towards an understanding of Internet-based problem shopping behaviour: The concept of online shopping addiction and its proposed predictors”, Journal of Behavioral Addictions 3 (2), 83–89 (2014).
[3] J. Blazewicz, M. Kovalyov, J. Musial, A. Urbanski and A. Wojciechowski, “Internet shopping optimization problem”, International Journal of Applied Mathematics and Computer Science 20 (2), 385–390 (2010). [4] J. Blazewicz and J. Musial, “E-commerce evaluation – multiitem Internet shopping. Optimization and heuristic algorithms”, Operations Research Proceedings 2010, 149–154 (2011). [5] J. Blazewicz, P. Bouvry, M. Y. Kovalyov and J. Musial, “Internet shopping with price sensitive discounts”, 4OR-Q J Oper Res 12 (1), 35–48 (2014). [6] J. Blazewicz, P. Bouvry, M. Y. Kovalyov and J. Musial, “Erratum to: Internet shopping with price-sensitive discounts”, 4OR-Q J Oper Res 12 (4), 403–406 (2014). [7] B. Sawik, “Downside risk approach for multi-objective portfolio optimization”, Operations Research Proceedings 2011, 191–196 (2012). [8] D. R. Goossens and A. J. T. Maas, “Exact algorithms for procurement problems under a total quantity discount structure”, European Journal of Operational Research 178 (2), 603–626 (2007). [9] A.Wojciechowski and J. Musial, “Towards optimal multi-item shopping basket management: Heuristic approach”, Lecture Notes in Computer Science (LNCS) 6428, 349–357 (2010). [10] A. Wojciechowski and J. Musial, “A customer assistance system: optimizing basket cost”, Foundations of Computing and Decision Sciences 34 (1), 59–69 (2009). [11] M. Shaw, R. Blanning, T. Strader and A. Whinston, Handbook on Electronic Commerce, International Handbooks on Information Systems, Springer, Berlin-Heidelberg (2000). [12] C. Revelle, H. Eiselt and M. Daskin, “A bibliography for some fundamental problem categories in discrete location science”, European Journal of Operational Research 184 (3), 817–848 (2008). [13] J. Krarup, D. Pisinger and F. Plastria, “Discrete location problems with push-pull objectives”, Discrete Applied Mathematics 123 (13), 363–378 (2002). [14] H. Eiselt and C. Sandblom, Decision Analysis, Location Models, and Scheduling Problems, Springer-Verlag, Berlin-HeidelbergNew York (2004). [15] M. Melo, S. Nickel and F. Saldanha-da Gama, “Facility location and supply chain management – A review”, European Journal of Operational Research 196 (2), 401–412 (2009). [16] C. Iyigun and A. Ben-Israel, “A generalized Weiszfeld method for the multi-facility location problem”, Operations Research Letters 38 (3), 207–214 (2010). [17] M. Garey and D. Johnson, Computers and Intractability: A Guide to the Theory of NP-Completeness, New York, Freeman (1979). [18] S. Krichen, A. Laabidi and F. B. Abdelaziz, “Single supplier multiple cooperative retailers inventory model with quantity discount and permissible delay in payments”, Computers & Industrial Engineering 60 (1), 164–172 (2011). [19] D. Manerba and R. Mansini, “An exact algorithm for the capacitated total quantity discount problem”, European Journal of Operational Research 222 (2), 287–300 (2012). [20] S. H. Mirmohammadi, S. Shadrokh and F. Kianfar, “An efficient optimal algorithm for the quantity discount problem in material requirement planning”, Computers & Operations Research 36 (6), 1780–1788 (2009). [21] C. Munson and J. Hu, “Incorporating quantity discounts and their inventory impacts into the centralized purchasing decision”, European Journal of Operational Research 201 (2), 581– 592 (2010).
515
Bull. Pol. Ac.: Tech. 64(3) 2016 - 10.1515/bpasts-2016-0056 Downloaded from De Gruyter Online at 09/28/2016 10:36:44PM via free access
J. Musial et al.
[22] T. Cormen, C. Leiserson, R. Rivest and C. Stein, Introduction to Algorithms, McGraw-Hill Higher Education, 2nd edition (2001). [23] J. D. Terán-Villanueva, H. J. F. Huacuja, J. M. C. Valadez, R. A. Pazos Rangel, H. J. P. Soberanes and J. A. M. Flores, “Cellular processing algorithms”, Studies in Fuzziness and Soft Computing 294, 53–74 (2013). [24] E. Burke, G. Kendall, J. Newall, E. Hart, P. Ross and S. Schulenburg, “Hyper-heuristics: An emerging direction in modern search technology”, Handbook of Metaheuristics, Springer, 457–474 (2003). [25] C. O. Diaz, J. E. Pecero and P. Bouvry, “Scalable, low complexity, and fast greedy scheduling heuristics for highly heterogeneous distributed computing systems”, Journal of Supercomputing 67 (3), 837–853 (2014). [26] S. Nesmachnow, B. Dorronsoro, J. E. Pecero and P. Bouvry, “Energy-aware scheduling on multicore heterogeneous grid computing systems”, Journal of Grid Computing 11 (4), 653–680 (2013). [27] M. Wu, W. Shu and H. Zhang, “Segmented min-min: A static mapping algorithm for meta-tasks on heterogeneous computing systems”, Heterogeneous Computing Workshop IEEE, 375–385 (2000). [28] T. Braun, H. Siegel, N. Beck, L. Bölöni, M. Maheswaran, A. Reuther, J. Robertson, M. Theys, B. Yao, D. Hensgen et al., “A comparison of eleven static heuristics for mapping a class of independent tasks onto heterogeneous distributed computing systems”, Journal of Parallel and Distributed Computing 61 (6), 810–837 (2001).
516
[29] A. Land and A. Doig, “An automatic method for solving discrete programming problems”, Econometrica 28 (3), 497–520 (1960). [30] E. W. Karen Clay, Ramayya Krishnan, “Prices and price dispersion on the web: evidence from the online book industry”, The Journal of Industrial Economics 49 (4), 521–539 (2001). [31] W. Chu, B. Choi and M. Song, “The role of on-line retailer brand and infomediary reputation in increasing consumer purchase intention”, International Journal of Electronic Commerce 9 (3), 115–127 (2005). [32] J. Marszalkowski, J. M. Marszalkowski and M. Drozdowski, “Empirical study of load time factor in search engine ranking”, Journal of Web Engineering 13 (1&2), 114–128 (2014). [33] F. Pinel, B. Dorronsoro and P. Bouvry, “Solving very large instances of the scheduling of independent tasks problem on the GPU”, Journal of Parallel and Distributed Computing 73 (1), 101–110 (2013). [34] P. Ezzatti, M. Pedemonte and A. Martin, “An efficient implementation of the Min-Min heuristic”, Computers & Operations Research 40 (11), 2670–2676 (2013). [35] J. Marszalkowski, J. Marszalkowski and J. Musial, “Database scheme optimization for online applications”, Foundations of Computing and Decision Sciences 36 (2), 121–129 (2011). [36] J. Blazewicz, N. Cheriere, P.-F. Dutot, J. Musial and D. Trystram, “Novel dual discounting functions for the Internet shopping optimization problem: new algorithms”, Journal of Scheduling 19 (3), 245–255 (2016). [37] M. Guzek, A. Gniewek, P. Bouvry, J. Musial and J. Blazewicz, “Cloud brokering: current practices and upcoming challenges”, IEEE C
Bull. Pol. Ac.: Tech. 64(3) 2016 - 10.1515/bpasts-2016-0056 Downloaded from De Gruyter Online at 09/28/2016 10:36:44PM via free access