European Journal of Operational Research 183 (2007) 1230–1248 www.elsevier.com/locate/ejor
Applying self-adaptive evolutionary algorithms to two-dimensional packing problems using a four corners’ heuristic Kevin J. Binkley *, Masafumi Hagiwara Department of Information and Computer Science, Hagiwara Research Lab, Keio University, 3-14-1 Hiyoshi, Kohoku-ku, Yokohama 223-8522, Japan Received 30 September 2004; accepted 15 December 2004 Available online 16 June 2006
Abstract

This paper proposes a four corners’ heuristic for application in evolutionary algorithms (EAs) applied to two-dimensional packing problems. The four corners’ (FC) heuristic is specifically designed to increase the search efficiency of EAs. Experiments with the FC heuristic are conducted on 31 problems from the literature both with rotations permitted and without rotations permitted, using two different EA algorithms: a self-adaptive parallel recombinative simulated annealing (PRSA) algorithm, and a self-adaptive genetic algorithm (GA). Results on bin packing problems yield the smallest trim losses we have seen in the published literature; with the FC heuristic, zero trim loss was achieved on problems of up to 97 rectangles. A comparison of the self-adaptive GA to fixed-parameter GAs is presented and the benefits of self-adaption are highlighted. © 2006 Elsevier B.V. All rights reserved.

Keywords: Packing; Cutting; Genetic algorithms; Simulated annealing; Evolutionary algorithms
1. Introduction

Two-dimensional (2D) packing problems occur in a wide range of industries. The goal is simply the optimal utilization of the space or material available. In the garment and paper industries the problem is often to cut smaller pieces from a large roll of cloth or paper while reducing the scrap. In the wood, glass, and metal industries the task is not to cut from a roll, but to cut from a fixed-size sheet or plate. In the semiconductor industry the task is VLSI floor planning, with the goal being to optimize the use of space on the chip. There are two major variants of the 2D packing problem: bin packing and strip packing. In the 2D bin packing variant, rectangles are to be packed in bins of given width and height, the goal being to minimize the number of bins used. In the 2D strip packing variant, rectangles must be packed in a fixed-width, infinite-height strip, the goal being to minimize the height. The bin packing variant is

* Corresponding author. Tel.: +81 045 566 1762; fax: +81 045 566 1747. E-mail addresses: [email protected] (K.J. Binkley), [email protected] (M. Hagiwara).
0377-2217/$ - see front matter © 2006 Elsevier B.V. All rights reserved. doi:10.1016/j.ejor.2004.12.029
most suitable for the wood, glass, metal, and semiconductor industries, while the strip packing variant will generally apply to the paper and garment industries. An important consideration is whether or not the rectangles can be rotated as they are placed. In the wood and garment industries one may care about the grain of the material, and rotations may not be permitted, while in the paper, glass, and semiconductor industries there may be no particular restrictions. Allowing rotations adds flexibility and can result in a better packing, while at the same time complicating the task.

Given the many permutations of 2D packing problems, most research focuses on a particular type of packing problem: either bin packing or strip packing, with or without rotations. A recent survey of 2D packing problems is given by Lodi et al. (2002). A typology of cutting and packing problems is defined in Wäscher et al. (2006). Being NP-hard, 2D packing problems are an attractive challenge for evolutionary algorithms (EAs). For application to strip packing problems, Jakobs (1996) developed a genetic algorithm (GA) using the bottom-left (BL) packing method with support for rotation of rectangles. Hopper and Turton (2001) provide a comprehensive comparison of GA, simulated annealing (SA), naïve evolution, and simple hill-climbing for the strip packing problem with rotations and for various BL methods. A recent paper by Lesh et al. (2004) gives a method to find strip packing solutions using an exact branch-and-bound exhaustive search; they successfully find perfect packings (zero trim loss) for problem sizes up to 30 rectangles. With regard to bin packing problems, Babu and Babu (1999) use a GA and a BL packing heuristic. Recently, Leung et al. (2003) developed a mixed simulated annealing-genetic algorithm. In this paper, we design the four corners’ (FC) packing heuristic with the aim of increasing the search efficiency of EAs when applied to bin packing problems.
By the nature of the FC heuristic, the packings produced are non-guillotineable and bottom-left stability is not preserved. In this research we study only orthogonal packing problems consisting of one bin, with the goal of minimizing trim loss. In the typology of cutting and packing problems defined by Wäscher et al. (2006), we deal with a two-dimensional, output maximization problem: we have one large rectangle of fixed dimensions, into which the value of the placed smaller rectangles (assumed proportional to the area of the rectangle)
must be maximized. The smaller rectangles are strongly heterogeneous and must be laid out orthogonally. Specifically, we study the two-dimensional single orthogonal knapsack problem. The FC heuristic is tested with two different self-adapting EAs: a self-adapting parallel recombinative simulated annealing (PRSA) algorithm and a self-adapting GA. The results are presented both with and without support for rotations. The FC heuristic combined with the self-adapting PRSA algorithm is found to produce tighter packings than any of the methods we have seen published in the literature.

This paper is outlined as follows. In Section 2, an analysis of how EAs work on packing problems is performed and the EA-friendly four corners’ packing heuristic is introduced. In Section 3, a detailed description of the EA components developed in this study is given. Section 4 describes details of the PRSA implementation. Section 5 describes the GA implementation. The remaining sections cover the experimental settings, results, and conclusions.

2. Designing the four corners’ heuristic

In this section, we analyze how EAs commonly function on packing problems and propose a four corners’ heuristic designed specifically for EAs. GAs are chosen as the representative EA for this discussion; however, most of the principles apply in general to the many types of EAs. For example, SA can be implemented using the genome encoding and the mutation operators of a GA. An implementation of PRSA (see Section 4) could use both the mutation and crossover operators of a GA, as is done in this paper. Tabu search, random search, hill-climbing, and other meta-heuristic search algorithms can be similarly implemented. The rest of this section is outlined as follows. Section 2.1 gives a brief introduction to the theory behind GAs. In Section 2.2, we analyze how GAs typically function on packing problems. In Section 2.3, using this knowledge we design the four corners’ (FC) heuristic. 2.1.
A brief review of the theory of genetic algorithms A GA works through processing schemata, resulting in the growth in the population of useful schemata. The theory of schemata is discussed in Goldberg (1989). For the discussion here it will be helpful to think of a schema as a section of the
genome. The theory says that as evolution continues, useful schemata will increase in the population. The useful schemata can be thought of as building blocks (BBs) (Goldberg et al., 1991). In a properly functioning GA, these BBs are recombined by the crossover operator. Through the process of evolution, the more fit BBs survive and the less fit BBs die out, causing the population to converge toward a potential solution. In order to prevent the population from converging too quickly to a non-optimal solution, the mutation operator randomly perturbs individuals in the population, continually introducing new BBs. The design of the crossover operators, the mutation operators, and the encoding of the genome are very much intertwined and have a critical effect on the efficiency of a GA. We now consider a typical GA applied to packing problems and take a look at the effect the GA operators have on the BBs in the genome.

2.2. Analyzing the building blocks in a packing problem

For packing problems, a common genome encoding is the packing order of the rectangles, and a commonly used decoding method is the bottom-left-fill (BLF) heuristic. The BLF heuristic packs the rectangles in the order given, placing each rectangle in the lowest possible position in the container and then left justifying it (Hopper and Turton, 2001). Fig. 1 shows the phenotypes from the genomes (0, 1, 2, 3, 4, 5, 6) and (2, 1, 0, 3, 4, 5, 6). The second genome is the result of a common swap mutation applied to the first genome. Only two rectangles are swapped (rectangles 0 and 2), yet this leads to the phenotypic positions of every rectangle changing.

Fig. 1. The effects of swapping two rectangles in a BLF packing genome.

This example shows only seven rectangles, but with 200 rectangles changes in the first part of the genome will have an even more critical effect on the packing result. Considering the case of 200 rectangles, a swap between the 200th rectangle and the 195th rectangle can be considered fine-tuning: the bottom-left part of the packing will not change, and the upper-right rectangles will change a small amount. But a swap between the 1st and 5th rectangles is closer to a random jump to a different area of the search space. The bottom-left rectangles change and, as the subsequent rectangles are packed, the position of all 200 rectangles may change, the result being a completely different packing phenotype. In general, we would like a small change in the genome, such as the swap of two rectangles, to produce a small change in the phenotype. When this is the case, we hypothesize that a good phenotype can be fine-tuned by the GA. It might be productive to design a mutation operator based on this observation. For example, rather than choosing swap locations based on a uniform random distribution, use a non-uniform distribution based on the degree of randomness the swap locations introduce. We do not investigate such mutation operators in this paper. However, we note that Lai and Chan (1997) developed an SA approach that performed swaps from the end of an ordered rectangle list based on the current annealing temperature. The previous example considered a swap mutation operator; however, the same logic applies to choosing crossover operators. For application to BLF packing genomes, a crossover operator should pay special attention to preserving the first part of the genome. Frequent perturbations to the initial allele sequence are likely to introduce too much randomness into the search, which is the job of the mutation operator, not the crossover operator.
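The sensitivity just described is easy to reproduce with a small sketch of BLF-style decoding. This is a simplified, grid-based illustration only, not the authors’ implementation: real BLF implementations track free space far more efficiently, and the function names (`bottom_left_place`, `decode_blf`) and the unit-cell grid are our own assumptions.

```python
def bottom_left_place(occupied, bin_w, bin_h, w, h):
    """Find the lowest, then left-most, free spot for a w x h rectangle.

    `occupied` is a set of filled unit cells -- a crude stand-in for the
    free-space bookkeeping of a real BLF implementation."""
    for y in range(bin_h - h + 1):            # lowest position first
        for x in range(bin_w - w + 1):        # then left-justify
            cells = {(x + dx, y + dy) for dx in range(w) for dy in range(h)}
            if not cells & occupied:
                return x, y, cells
    return None

def decode_blf(order, sizes, bin_w, bin_h):
    """Decode a permutation genome by packing rectangles in the order given."""
    occupied, positions = set(), {}
    for r in order:
        hit = bottom_left_place(occupied, bin_w, bin_h, *sizes[r])
        if hit is not None:
            x, y, cells = hit
            occupied |= cells
            positions[r] = (x, y)             # rectangles that fit nowhere are skipped
    return positions
```

With three rectangles of sizes {0: (2, 2), 1: (2, 2), 2: (4, 1)} in a 4 × 4 bin, the orders (0, 1, 2) and (2, 0, 1) place every rectangle differently, echoing the sensitivity to early genes discussed above.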
One popular crossover operator, partially matched crossover (PMX) (discussed in Section 3), has the desirable property of preserving commonly located alleles of the parents in the children, such that if both parents start with the same initial order of rectangles then both children will have the same initial order. PMX was used by Hopper and Turton (2001) in their study, where it was found that for problems of more than 70 rectangles a GA without crossover (their naïve EA) performed just as well as the GA with crossover. This suggests that although crossover plays an important role for packing problems of low order, as the genome gets longer it becomes difficult for the PMX crossover operator to work effectively with the BBs available
in the genome, in particular those at the beginning of the chromosome. With this in mind, we designed the four corners’ packing heuristic with the goal of increasing the efficiency of the crossover and mutation operators, leading to more effective use of the BBs available in the genome.

2.3. Four corners’ heuristic

For BL heuristics, a very small change in the packing order permutation can reposition every rectangle in the packing. A change at position i in a permutation generally changes the position of every rectangle j, j > i. Of course, changes in the genome introduced by mutation and crossover operations are an unavoidable necessity in an EA. However, the operators and the genome structure can be changed to allow for more gradual, effective convergence of an EA to an optimal solution. With this design goal we introduce the four corners’ (FC) packing heuristic for EAs. As the name implies, in the FC approach rectangles are packed to the four corners, finally converging in the center. The genome encoding is a permutation of rectangles broken into four roughly equal sections, as shown in Fig. 2. The sections are labeled bottom-left, top-left, bottom-right, and top-right. The genome size may not be divisible by four or two, in which case care is taken to allocate the alleles consistently. The specific process is as follows:
1. Divide the genome in half, resulting in two parts A and B; any extra allele goes to A.
2. Divide A in half, resulting in two parts, A1 and A2; any extra allele goes to A1.
3. Divide B in half, resulting in two parts, B1 and B2; any extra allele goes to B1.

The parts A1, A2, B1, and B2 correspond to bottom-left, top-right, bottom-right, and top-left. In some respects, the FC genome may be considered a set of four chromosomes. Special crossover and mutation operators can be designed to take advantage of the chromosomes and how they affect the packing. In this paper, we consider only one special mutation operator that swaps two corners (discussed further in Section 3).

Fig. 2. Encoding of the genome for the FC heuristic.

Fig. 3. An example of the FC packing heuristic.

The arrows in Fig. 2 represent the direction of the packing. In the example, the genome consists of 10 rectangles and is (4, 8, 2, 9, 0, 5, 3, 7, 1, 6). The first rectangle packed from the bottom-left is 4, the second is 8. From the top-right, the sequence of packing is rectangle 0 and then 9. This genome structure was chosen such that the rectangles packed first are grouped at the beginning, end, and middle (these tend to be the larger rectangles). The rectangles packed last are grouped where bottom-left meets top-right and bottom-right meets top-left; in other words, the one-fourth and three-fourth positions in the genome (these tend to be the smaller rectangles). The FC heuristic performs the packing by alternating through the corners, packing one rectangle to each corner in the following order: bottom-left, top-right, bottom-right, and top-left. This packing sequence is chosen to minimize interference during the packing. Each rectangle is packed as close as possible to its assigned corner. The genome diagrammed in Fig. 2 has the phenotype shown in Fig. 3. In this example, rectangle 4 is packed to the lower-left, 0 is packed to the upper-right, 5 to the lower-right, and 6 to the upper-left. Next, rectangle 8 is packed as close as possible to the lower-left corner, placing 8 above 4. Packing is continued in this fashion until at last rectangle 7 is packed to the lower-right. The open space is tracked as each rectangle is placed. The rectangles from different corners will likely interfere with each other near the end of the packing process. Here, the position
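The three division steps can be sketched directly. The function name `split_fc` is our own, and the section labels follow the correspondence stated in the text; this is an illustrative reading, not the authors’ code.

```python
def split_fc(genome):
    """Split a permutation into the four FC sections.

    Steps 1-3: halve into A and B (extra allele to A), then halve A and
    B again (extras to A1 and B1).  A1, A2, B1, B2 map to bottom-left,
    top-right, bottom-right, top-left."""
    n = len(genome)
    a, b = genome[:(n + 1) // 2], genome[(n + 1) // 2:]
    a1, a2 = a[:(len(a) + 1) // 2], a[(len(a) + 1) // 2:]
    b1, b2 = b[:(len(b) + 1) // 2], b[(len(b) + 1) // 2:]
    return {"bottom-left": a1, "top-right": a2,
            "bottom-right": b1, "top-left": b2}
```

On the ten-rectangle example genome (4, 8, 2, 9, 0, 5, 3, 7, 1, 6) this yields (4, 8, 2) for the bottom-left, (9, 0) for the top-right, (5, 3, 7) for the bottom-right, and (1, 6) for the top-left, matching Fig. 2.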
of rectangle 0 influenced rectangle 1, leaving a large open space in the upper left. At some point in the packing process, rectangles will not fit and will be skipped. The size and distribution of the open space of the resulting phenotypic packing can be used to direct the EA search. We now move to Section 3, where we discuss the components of the EAs presented in this paper.

3. Common components of a self-adapting evolutionary algorithm

The PRSA and GA algorithms implemented in this paper share the genome structure, the crossover operators, the mutation operators, and the fitness function.

3.1. Genome structure

The genome structure for a simple packing problem is a string of alleles. Each allele represents a rectangle, and together they represent a permutation of rectangles. In order to support 90° rotations of rectangles, this structure was extended by adding a boolean flag to each allele to indicate the rotational state of the rectangle. When it is necessary to compare alleles during crossover, equivalence is determined by comparing only the rectangle; thus the allele (rectangle #1, rotated) is equivalent to (rectangle #1, non-rotated). Parameters for self-adaptation are appended to the genome. These parameters are used for controlling the EA and do not affect the packing directly.

3.2. Crossover operators

We implemented four different crossover operators. Each operator takes two parents and produces two children. We hypothesize that the best crossover operator will likely depend not only on the particular packing problem, the population size, the mutation operator, the mutation rate, and the many other parameters of the EA, but also on the current state of the EA population. In our self-adapting EA implementation we chose to let the choice of crossover operator evolve with the population. An algorithm for making the choice of crossover operator adaptive was proposed by Spears (1995).
Spears appended tag bits to the genome, allowing the individuals to adapt between uniform crossover and two-point crossover. Following up on this idea, we append a tag integer to the genome and
pick one of the four types. The integer undergoes mutation during the mutation phase based on a specified crossover tag mutation rate. Mutation is performed by randomly generating a new integer crossover tag. The type of crossover operation to perform is determined by the following pseudocode:

If (parent0.crossover_type = parent1.crossover_type)
    Do parent0.crossover_type
Else if (rand(0, 1) < 0.5) then
    Do parent0.crossover_type
Else
    Do parent1.crossover_type

After the crossover operator has been determined, the tag integer of both children is updated to reflect the type of crossover operator performed when these individuals were generated. A plain copy of their parents’ crossover operator tag integers would allow some non-contributing crossover operator types to pass through when the parents’ tags differed. Two of the crossover operators implemented in this study are the partially mapped crossover (PMX) and cycle crossover operators. The implementation of these crossover operators is discussed in Whitley (1997). Before discussing the other two crossover operators designed for this study, we consider how PMX and cycle crossover function on packing genomes. In the case where a location has the same rectangle in both parents, it will be identical in both children. In the case where a location has a different rectangle in each parent, there is a chance the location will be changed. When a change occurs, cycle crossover always swaps rectangles at corresponding locations in the genome, and PMX tends to do so. This leads to a tendency to replace rectangles with similar rectangles. We explain this tendency as follows. In packing problems, it is well known that a good general packing order rule is to pack large rectangles first. In such a permutation the large rectangles will be first, followed by the smaller ones. Thus, for two parents following this type of rule, the corresponding rectangles at the beginning of each parent’s genome will have the tendency to be similar.
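The selection rule in the pseudocode above translates directly into code; the function and parameter names here are our own illustrative choices.

```python
import random

def pick_crossover_type(type0, type1, rng=random):
    """Choose which parent's crossover tag to apply: identical tags win
    outright, otherwise a fair coin decides between the two."""
    if type0 == type1:
        return type0
    return type0 if rng.random() < 0.5 else type1
```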
Of course, it is not known what packing order rule the EA will follow. However, as the individuals evolve it is likely that the EA will tend to place similar rectangles at corresponding locations in the genome. Indeed, our experiments do show
that the larger rectangles get placed first during the packing process. As evolution proceeds, rectangles at corresponding locations become similar and crossover becomes more likely to lead to effective recombination. The remaining crossover operators implemented in this study, partially mapped crossover random locations and preserve location crossover, were designed specifically with packing problems in mind. Both are discussed in detail in the following sections.

3.2.1. Partially mapped crossover random locations

This crossover operator is essentially the PMX crossover operator, except that instead of cutting the chromosomes at two points and using the alleles in between for the mapping, partially mapped crossover random locations (PMXR) takes N random locations in the chromosome and maps these locations as is done in PMX. The crossover section is non-continuous and scattered about the genome, and unlike PMX there is no bias toward including the center alleles in the mapping. The idea is similar to the order crossover-2 operator as covered in Whitley (1997), which also selects N random locations in the crossover process. Specifically, the PMXR algorithm is:

1. Pick a random number N from 0 to (length of chromosome − 2).
2. Pick N random locations in the chromosome; these locations define the crossover section.
3. From the two parents map the N locations as done with PMX.

Suppose the four random locations 1, 5, 3, and 8 are chosen. Consider the following example:

Parent 1:          0 1 2 3 4 5 6 7 8 9
Parent 2:          1 4 2 3 7 8 5 6 9 0
Random locations:  _ X _ X _ X _ _ X _

Proceeding analogously to PMX, the resulting mappings become 1 ↔ 4, 3 ↔ 3, 5 ↔ 8, and 8 ↔ 9. The elements from the crossover section of parent 1 are copied to the child. Next, the elements not found in either parent’s crossover section are copied to the child from parent 2. And finally, using the PMX mappings, the remaining elements are mapped (5 → 9 and 1 → 4). The operations are shown in the construction of the first child.

Child 1 (step 1):  _ 1 _ 3 _ 5 _ _ 8 _
Child 1 (step 2):  _ 1 2 3 7 5 _ 6 8 0
Child 1 (step 3):  4 1 2 3 7 5 9 6 8 0

The second child is produced by swapping the roles of the parents. The resulting second child is

Child 2:           0 4 2 3 1 8 6 7 9 5
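Our reading of the PMXR mapping step can be sketched as follows. The random choice of N and of the locations (steps 1–2) is left to the caller, and `pmxr_child` is a hypothetical name, not the authors’ code.

```python
def pmxr_child(p1, p2, locs):
    """Build one PMXR child: fix parent 1's alleles at the chosen
    locations, then fill every other position from parent 2, following
    the PMX mapping chain whenever parent 2's allele is already taken."""
    section = {p1[i] for i in locs}          # values fixed from parent 1
    child = list(p2)
    for i in locs:
        child[i] = p1[i]
    for i in range(len(p1)):
        if i in locs:
            continue
        v = p2[i]
        while v in section:                  # e.g. 5 -> 8 -> 9 in the example
            v = p2[p1.index(v)]
        child[i] = v
    return child
```

Calling it with the parents and locations {1, 3, 5, 8} from the worked example reproduces both children; the second child comes from swapping the parents’ roles.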
PMXR’s function in packing problems is similar to PMX. However, while PMX is restricted to an interval determined by its two-point nature, PMXR is able to create mappings throughout the genome.

3.2.2. Preserve location crossover

The preserve location crossover operator (PLX) is a single-point crossover operator that works to preserve the location of as many alleles as possible from both parents. Alleles whose location cannot be preserved are reassigned randomly. PLX is best explained by example. Consider the following example, where the crossover location 4 is chosen:

Parent 1:  0 1 2 3 | 4 5 6 7 8 9
Parent 2:  1 4 2 7 | 3 8 5 6 9 0

Child 1 takes all alleles up to the crossover point from parent 1. For the locations after the crossover point, the alleles not already placed in child 1 are taken from parent 2.

Child 1 (step 1):  0 1 2 3 | _ _ _ _ _ _
Child 1 (step 2):  0 1 2 3 | _ 8 5 6 9 _
Child 1 (step 3):  0 1 2 3 | 7 8 5 6 9 4
In this case, alleles 3 and 0 were already placed in child 1 and not able to be taken from parent 2. The remaining unplaced alleles, 4 and 7, are taken from parent 2 and placed randomly in child 1; in this example they are randomly placed in the order 7 then 4. Child 2 is produced in a similar manner, where the alleles after the crossover point are copied from parent 2 to child 2. The alleles before the crossover point are taken from parent 1, preserving the location from parent 1 if possible, otherwise randomly placing them in open positions; here the remaining alleles 4 and 7 are randomly placed in the order 4 then 7.

Child 2 (step 1):  _ _ _ _ | 3 8 5 6 9 0
Child 2 (step 2):  _ 1 2 _ | 3 8 5 6 9 0
Child 2 (step 3):  4 1 2 7 | 3 8 5 6 9 0
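One side of PLX can be sketched as below (`plx_child` and its parameters are our names). Because the leftover alleles are scattered randomly, only the structural properties of the result are guaranteed, not a specific child.

```python
import random

def plx_child(p1, p2, cut, rng=random):
    """One PLX child: head copied from parent 1, tail positions keep
    parent 2's allele when still unused, and the leftover alleles are
    scattered randomly into the remaining holes."""
    n = len(p1)
    child = list(p1[:cut]) + [None] * (n - cut)
    used = set(p1[:cut])
    for i in range(cut, n):
        if p2[i] not in used:
            child[i] = p2[i]
            used.add(p2[i])
    leftovers = [v for v in p1 if v not in used]
    rng.shuffle(leftovers)
    for i in range(n):
        if child[i] is None:
            child[i] = leftovers.pop()
    return child
```

On the worked example (cut at 4), the head 0 1 2 3 and the tail entries 8 5 6 9 are always preserved, while 4 and 7 land in the two open positions in a random order.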
Like the other three crossover operators, PLX has the property of preserving all commonly located alleles from parents to children. However, PLX introduces a degree of randomness into the crossover operation for the unmatched alleles. PLX guarantees the preservation of the location of each unmatched allele in one of the children, while in the other child a random unmatched allele is chosen. Due to the inherent randomness, the process can be thought of as a kind of directed mutation.

3.3. Mutation operators

Three different mutation operators are implemented: swap two rectangles, rotate a rectangle, and swap two corners. As for almost all EA parameters, it is difficult to determine appropriate mutation rates. Bäck (2000) reviews many of the methods from the literature. However, most of the theory and experiments are carried out on rather simple problems. In Bäck (1993), the constant rate p_m = 1/l, where l is the length of the genome, was proposed as a heuristic for genetic search on unimodal pseudo-boolean functions. This heuristic seems reasonable, since as the EA approaches the optimum, very small changes in the genome will be required to reach the optimal solution. Indeed, if the mutation rate were set to 2/l, then depending on the mutation operator an average of two alleles would change per genome per generation; if the optimal solution is just one allele change away, the probability of reaching it becomes very small. Following this logic, we estimate that the mutation rate should be inversely proportional to the genome length. To make the EAs presented in this paper more flexible in solving problems with arbitrary numbers of rectangles, we have chosen to divide the mutation rate parameter by the length of the genome (the exception is the swap corners mutation, which does not get divided; the reasoning is explained in Section 3.3.3). This allows one mutation rate setting to function well over a wide range of problem sizes.
Bäck (1993) also indicates that multimodal problems may benefit from a non-constant mutation rate. As was done for the crossover operation type, self-adaptable mutation rates are implemented. Evolution is performed using methods based on evolutionary programming (EP) (Yao et al., 1999). In this study we chose not to use the full methods of EP; the standard deviations were fixed throughout the run. Each of the variable mutation rates m_i is updated according to the formula

m_{i,g+1} = m_{i,g} + N(0, 1) σ_i,

where i indexes the three variable mutation rates, g is the generation number, N(0, 1) is a normally distributed random number with mean zero and standard deviation one, and σ_i is the standard deviation of the ith mutation rate. The three σ_i and m_i are appended to each genome. The initial mutation rates and standard deviations are initialized randomly from a uniform distribution, where the initial ranges are parameters to be specified. During evolution the mutation rates and standard deviations are constrained to these ranges. With adaptable mutation rates, much of the trial and error of setting the mutation rates goes away. Since the mutation rates are divided by the length of the genome, we can expect a reasonable setting to be around 1.0 for all sizes of problems. As the evolution proceeds, the rates for each individual are adjusted by Gaussian-based mutations. Through the evolution process, rates suitable to the particular packing problem should produce good packing results, survive, and further propagate. Next, the three different mutation types are discussed.

3.3.1. Swap mutation

The swap mutation is the straightforward swap of two alleles. As determined by the swap mutation rate, each allele has a chance of being swapped with another allele. The following pseudocode demonstrates the process.
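The per-individual rate update just described can be sketched as a single step with range clamping; the function name and argument order are our own assumptions.

```python
import random

def update_mutation_rate(m, sigma, lo, hi, rng=random):
    """One self-adaptive step: m' = m + N(0, 1) * sigma, constrained to
    the configured range [lo, hi].  Sigma itself stays fixed here, as in
    the paper's simplified EP scheme."""
    return min(hi, max(lo, m + rng.gauss(0.0, 1.0) * sigma))
```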
For pos = 0 to num_alleles - 1
    If (rand(0, 1) < swap_mutation_rate)
        swap_pos = (rand_int(0, num_alleles - 1) + pos + 1) % num_alleles
        swap(pos, swap_pos)
    End if
End for
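An executable rendering of the pseudocode above. We read `rand_int`'s bound as excluding a self-swap, so `swap_pos` never equals `pos`; that detail is our assumption.

```python
import random

def swap_mutation(genome, swap_mutation_rate, rng=random):
    """Each position has a chance of being swapped with a uniformly
    chosen *other* position (no self-swap -- our reading of the
    pseudocode's rand_int bound)."""
    g = list(genome)
    n = len(g)
    for pos in range(n):
        if rng.random() < swap_mutation_rate:
            swap_pos = (rng.randrange(n - 1) + pos + 1) % n
            g[pos], g[swap_pos] = g[swap_pos], g[pos]
    return g
```

With rate 0 the genome is untouched, and at any rate the result remains a permutation of the input.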
Fig. 4. Division of the genome for the FC heuristic.
As the EA converges toward an optimum, the diversity of the population decreases; schemata and building blocks are lost from the population. The swap mutation helps the EA by introducing new building blocks into the population. Indeed, with the appropriate stroke of luck, from any point in permutation space the swap mutation is capable of generating any other point in permutation space (when rotations are not considered). Mutation is particularly critical later in the evolution process, when all individuals become quite similar to each other. However, if the swap mutation becomes excessive, the EA approaches a random search.

3.3.2. Rotation mutation

For problems where rotations are supported, the swap mutation alone may not be sufficient to reach the best packing. Given the extra degree of freedom, better packing results can generally be reached by searching the larger space of permutations and rotations. The rotation mutation simply rotates a rectangle by 90°. Similar to the swap mutation, the rotation mutation is performed on each rectangle in the genome, with a rotation mutation rate chance of the rectangle rotating. The rotation mutation’s purpose is the same as that of the swap mutation: as the GA heads to a solution, some new or lost schemata can be introduced into the population through the rotation mutation.

3.3.3. Swap corners’ mutation

It has been found that special problem-specific crossover and mutation operators can help a GA converge to a better solution faster. With this in mind, the swap corners’ mutation is introduced specifically for the FC heuristic. Recalling the discussion from Section 2.3, the FC heuristic is implemented by packing rectangles alternately to each of the corners. Near the end of the packing process the corners meet each other in the center and come to interfere. Packing is done in the order: bottom-left, top-right, bottom-right, and then top-left. Due to this order, the rectangles in the bottom-left part of the genome get packed first and receive priority over the others. If instead priority is given to another corner, the outside rectangles will remain very much unaffected, but the interference order in the center will change, resulting in a slightly perturbed, possibly improved packing. Fixing the first corner and considering the permutations of the others, there are 3! = 6 possible different corner orderings. An integer could be appended to the genome indicating the ordering to use, but then crossover would be complicated, as corresponding allele locations in two recombining genomes would lose some of the tendency to represent similar rectangles. Avoiding this complexity, we chose to implement a swap corners mutation, swapping the alleles corresponding to two corners. Four different swaps are implemented: bottom-left ↔ bottom-right, bottom-right ↔ top-left, bottom-left ↔ top-right, and top-left ↔ top-right. In the case where one corner has an excess allele, the excess allele is not moved. Fig. 4 shows the genome (4, 8, 2, 9, 0, 5, 3, 7, 1, 6). A bottom-left ↔ top-right swap will swap corresponding alleles. In this case, 4 ↔ 0 and 8 ↔ 9 are the corresponding alleles, since 4 is packed first in the bottom-left position and 0 is packed first in the top-right position; 2 is an excess allele and is not moved. The resulting genome is (0, 9, 2, 8, 4, 5, 3, 7, 1, 6). In the case of a bottom-left ↔ bottom-right swap, the packing order and number of alleles are the same, so the genome sections can be swapped in the straightforward manner, resulting in the genome (5, 3, 7, 9, 0, 4, 8, 2, 1, 6). The swap corners mutation’s main purpose is not the reintroduction of lost schemata. Rather, it introduces new schemata that the EA has likely never seen before. It serves as an aid directing the EA to investigate solutions similar in the phenotypic sense to those already in the population.
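The two worked swaps can be reproduced with a small sketch. The function names are ours; the section arithmetic follows the division steps of Section 2.3, and the top-right slice is paired in its reverse packing order, as the text explains.

```python
def fc_slices(n):
    """Index ranges of the four FC sections for a genome of length n:
    bottom-left, top-right, bottom-right, top-left (extras go forward)."""
    ha = (n + 1) // 2
    a1 = (ha + 1) // 2
    b1 = (n - ha + 1) // 2
    return (0, a1), (a1, ha), (ha, ha + b1), (ha + b1, n)

def swap_bl_tr(genome):
    """bottom-left <-> top-right: pair alleles by packing order (the
    top-right section packs from the end of its slice); any excess
    allele stays put."""
    g = list(genome)
    (bl0, bl1), (tr0, tr1), _, _ = fc_slices(len(g))
    for i, j in zip(range(bl0, bl1), range(tr1 - 1, tr0 - 1, -1)):
        g[i], g[j] = g[j], g[i]
    return g

def swap_bl_br(genome):
    """bottom-left <-> bottom-right: both pack forward, so corresponding
    slice positions are simply exchanged."""
    g = list(genome)
    (bl0, bl1), _, (br0, br1), _ = fc_slices(len(g))
    for i, j in zip(range(bl0, bl1), range(br0, br1)):
        g[i], g[j] = g[j], g[i]
    return g
```

Both calls reproduce the genomes derived in the text for (4, 8, 2, 9, 0, 5, 3, 7, 1, 6).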
Using only the other mutation operators and crossover operators, the EA would have no means to move quickly to investigate the individuals produced by the swap corners’ mutation. From the point of view of the swap mutation the individuals produced by the swap corners’ mutation are quite distant in search space:
about half the genome is changed. The swap corners' mutation provides an expedient path to an otherwise very distant, quality point in search space, essentially unreachable by the other mutation operators.

3.4. Fitness function

The genome is decoded and the packing is performed using the FC heuristic as described in Section 2.3. A straightforward measure of packing quality is the resulting trim loss. However, there are many packing genomes with exactly the same trim loss. Using the trim loss alone results in a very flat search space with only a few far-flung peaks representing the improvements the EA seeks. To the EA, many individuals in the population will look identical, resulting in an undesirable, nearly random, undirected search. These needle-in-the-haystack problems are very difficult for EAs. For packings of identical trim loss, an additional measure is necessary to direct the EA's search.

Consider Fig. 5, in which both packings have the same amount of unpacked area. Which packing is better? With knowledge of the inner workings of the FC heuristic and the genome encoding, the left packing should be valued more heavily. The reasoning goes as follows. For individuals with identical trim loss, if more value is given to packings with open space near the center, the EA will be encouraged to create individuals whose open space lies near the center. The encoding of the FC genome and the design of the EA operators allow small perturbations of the central rectangles without affecting the layout of the corners and edges. Through these perturbations, contiguous open space will gradually move to and accumulate in the center, allowing more rectangles to be packed.

Fig. 5. Comparing two FC packing results, the one on the left is the better packing (white numbered rectangles represent open space).

What is needed is a method to measure the spatial scatter of open space with respect to the bin center. As a measure of spatial scatter from the bin center, a fourth-moment statistic over all open bin space is proposed. Specifically, the following integral is calculated over all open space:

Z = ∫∫ [(x - x_c)^n + (y - y_c)^n] dx dy,    (1)
where the center of the bin is represented by (x_c, y_c) and n is set to 4. In general, an nth-moment statistic might be implemented (if n is odd then appropriate absolute values must be taken). However, if n is too large, packings with open space away from the center are penalized excessively, prohibiting the EA from effectively exploring alternatives. A second moment was also experimented with, but the fourth moment produced better preliminary results and was settled upon for this study.

To illustrate the calculation of the Z value for a packing, consider the packing shown on the right of Fig. 5. First the open space is broken up into non-intersecting rectangles, resulting in rectangles 1, 2 and 3 shown in Fig. 5. The Z value is calculated for each rectangle and the sum is taken as the Z value for the whole packing. Evaluating the integral (1) for a rectangle is performed as follows. First, the center of the bin is taken to be (0, 0), and the position of the rectangle is considered relative to the new center. The constants x_c and y_c drop out of (1); with n = 4, (1) is evaluated and Z is calculated as follows:

Z = (1/5) [ (x_1^5 - x_0^5)(y_1 - y_0) + (y_1^5 - y_0^5)(x_1 - x_0) ],
where the center-relative coordinates of the rectangle are (x_0, y_0, x_1, y_1). For ranking purposes the Z value is sufficient: the size of Z will change as the problem size changes, but a consistent ranking is always achievable. A GA using rank-based selection or tournament selection is an example where Z alone is sufficient. However, for a GA using proportional selection, or an SA where the energy corresponds to the fitness, a bounded real number that does not depend on the bin width or height is more desirable. With a bounded value, the fitness vector can be more appropriately converted to a real number for use during the selection process. A straightforward method is to normalize Z by dividing by Z_open, the moment of a completely open bin. The value Z is much smaller than Z_open, resulting in very small values after normalization. In practice, with appropriate weighting this method appeared to work reasonably well. However, an even better method was found. In general, the trim loss component will be weighted
quite heavily compared to the spatial scatter component. For a given trim loss the amount of open bin space is known. By placing the open space around the perimeter of the bin, an estimate of the maximum spatial scatter component for this trim loss, Z_perimeter, can be obtained. Z_perimeter is estimated indirectly by placing all the packed area in a perfectly filled rectangle in the center of the bin. Labeling the spatial component of this filled centered rectangle Z_filled, then Z_perimeter = Z_open - Z_filled. The resulting fitness vector provided by the FC fitness function becomes (trim loss, Z/Z_perimeter), trim loss being the primary quality.

4. Self-adaptive parallel recombinative simulated annealing

Like the well-known GA, simulated annealing (SA) also appears frequently in the packing literature. In Hopper and Turton (2001), it was found that initially a GA yields better solutions quickly, but as more fitness evaluations are performed SA was found to yield better overall solutions than a GA. Leung et al. (2003) designed and implemented a mixed simulated annealing-genetic algorithm for bin packing problems. In comparison to a simple GA, they found that in the long run the mixed SA–GA yielded better results. Given these results from the literature and our own experience, we believe that for the more challenging packing problems, when more computational resources are available, better results will be achieved more easily by an SA-type algorithm than by the simple GA. Not willing to give up the benefits of the crossover operator, rather than SA we chose to implement parallel recombinative simulated annealing (PRSA) (Mahfoud and Goldberg, 1995). PRSA is essentially a parallel SA that uses recombination. PRSA, being a form of SA, has the bonus of inheriting the proof of guaranteed convergence from SA theory, whereas the standard GA has still not been shown to have the guaranteed convergence property. The following sections describe the PRSA algorithm used in this paper.

4.1. The parallel recombinative simulated annealing algorithm

The PRSA implemented in this paper is as follows.
1. Set initial temperature.
2. Initialize population randomly.
3. Repeat until stopping criteria reached:
   (A) Repeat (population size)/2 times:
       (1) Pick two individuals randomly from the population without replacement.
       (2) Generate two children: apply crossover and then mutation.
       (3) Hold Boltzmann trials:
           1. between parent 1 and child 1;
           2. between parent 2 and child 2.
       (4) Trial winners are placed in the next generation.
       (5) Periodically lower the temperature.

The algorithm is very similar to a GA. In a GA, a pre-selection method such as tournament selection is used to determine the parents that reproduce. The parents are paired randomly for reproduction through crossover and mutation. Finally, in post-selection only the children survive to the next generation; in the simple GA, the post-selection simply selects all children. In a sense somewhat opposite to the GA, in PRSA the pre-selection method simply selects all parents. Reproduction is done as in a GA. Then post-selection is performed by holding a Boltzmann trial between each parent and its corresponding child; the winner survives to the next generation. A Boltzmann trial is defined as the competition between a parent and child using the logistic acceptance criterion. The logistic acceptance criterion is: accept the parent with probability 1/(1 + e^((E_p - E_c)/T)), otherwise accept the child. The parent's energy E_p and the child's energy E_c are calculated as explained in Section 4.3.

4.2. Crossover and mutation operators

SA, GA, and PRSA, all being EA algorithms, have much in common. For GA and SA, it is possible to use the same mutation operators. In moving to PRSA it is straightforward to use the same crossover operators. In this study, all the PRSA operators are the same as the GA operators. Adaptive crossover is implemented in PRSA; however
adaptive mutation rates were not used (see Section 6.2).

4.3. Parallel recombinative simulated annealing energies

In order to hold the Boltzmann trials, the individual's fitness must be converted to a real energy. We use a linear weighted average in converting the fitness vector to a real energy value. Specifically, E = (w_t·t + w_s·s)/(w_t + w_s), where E is the resulting energy value, s and t are respectively the spatial scatter component and the trim loss of the fitness vector, and w_t and w_s are the weight parameters.

4.4. Cooling schedule

We use the common SA geometric cooling schedule. After a particular number of iterations, T is reduced according to the formula T_next = a·T_prev. The constant a is called the cooling constant and determines how fast cooling occurs. To define the cooling schedule, our PRSA implementation takes the initial temperature T_i, the final temperature T_f, and the cooling interval x (the number of generations between coolings) as parameters. The number of generations to execute, G, is also known. Thus the number of coolings is n = ⌊G/x⌋. Solving for a from the formula T_f = a^n·T_i yields a = (T_f/T_i)^(1/n). The temperature is initialized to T = T_i and updated every x generations according to the formula T_next = a·T_prev.

5. Self-adapting genetic algorithm

Most of the components of the self-adapting GA were described in Section 3. The remaining components are outlined in this section. The GA algorithm implemented in this paper is as follows.

1. Initialize population randomly.
2. Evaluate initial population.
3. Repeat until stopping criteria reached:
   (a) Select parents (pre-selection mechanism).
   (b) Perform recombination.
   (c) Perform mutation.
   (d) Evaluate children.
   (e) Select next generation (post-selection mechanism).
4. Return best individual.
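A minimal sketch (ours, not the authors' code) of the pieces just described: the fitness-to-energy conversion of Section 4.3, the logistic acceptance criterion of the Boltzmann trial, and the geometric cooling schedule of Section 4.4. The default weights and the temperatures in the comments follow the settings given in Section 6.2.

```python
import math

def energy(trim_loss, scatter, w_t=6.0, w_s=1.0):
    """Linear weighted average of the fitness vector (Section 4.3);
    the defaults w_t = 6, w_s = 1 are the paper's PRSA settings."""
    return (w_t * trim_loss + w_s * scatter) / (w_t + w_s)

def parent_accept_prob(e_parent, e_child, temperature):
    """Logistic acceptance criterion for a Boltzmann trial: the parent
    is accepted with this probability, otherwise the child survives."""
    return 1.0 / (1.0 + math.exp((e_parent - e_child) / temperature))

def cooling_constant(t_initial, t_final, generations, interval):
    """Geometric schedule T_next = a * T_prev with n = floor(G/x)
    coolings, so a = (T_f / T_i) ** (1/n)."""
    n = generations // interval
    return (t_final / t_initial) ** (1.0 / n)
```

At equal energies the parent is accepted with probability 0.5, and a parent with lower (better) energy is accepted with probability above 0.5. With T_i = 0.01, T_f = 0.0003, and a hypothetical G = 1000 and x = 10, applying a to T_i a hundred times returns T_f, as the schedule intends.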
In the recombination and mutation steps, the GA uses the common EA operators defined in Section 3. For the pre-selection mechanism, we use tournament selection with replacement. To perform the ranking of individuals for tournament selection, the trim loss component of the fitness vector is compared. If two individuals have the same trim loss component, then the spatial scatter component is compared. As implemented in PRSA, a weighted average of the components is also possible, but this was not studied in this research.

The post-selection mechanism in general can select the individuals for the next generation from both the parents and the children of the current generation. An elitist model is often used to ensure that the best parent survives to the next generation. In this study we chose the non-elitist model, simply selecting all the children to become the next generation.

6. Experimental settings

6.1. Common settings

Unless stated otherwise in a specific experiment, the following common experimental settings are used.

6.1.1. Evolutionary algorithm packing software

The EA packing software was developed in C++ and runs on Microsoft Windows XP. We plan to make information about the experimental results, data files, packing software, and source code available for download at http://www.soft.ics.keio.ac.jp/~kbinkley/packing/index.html.

6.1.2. The packing problems

The experiments are carried out on 31 problems published in the literature. The first 21 problems are published in Hopper and Turton (2001), Tables 9–15. These problems correspond to problems 1–21 in this paper: problem 1 ↔ (category 1, P1), problem 2 ↔ (category 1, P2), etc., and problem 21 ↔ (category 7, P3). Problems 22–31 are taken from Leung et al. (2003), and correspond to Tables 1–10 in their paper.

6.1.3. Common overall settings

All problems are run 10 times and the average results are presented. At the beginning of each run of each problem, the random number seed is set to 3, with the standard C rand() function being
used for random number generation. The crossover rate is kept constant at 1.0. Adaptive crossover is used as described in Section 3.2, with the crossover tag integer mutation rate set to 0.05. The standard deviations for the adaptive mutation rates are fixed at 0.1, 0.1, and 0.01, respectively, for the swap mutation rate, rotation mutation rate, and swap corners' mutation rate. The number of generations is not used as a parameter; rather, population size and number of fitness function evaluations are used. The number of generations to execute is determined by dividing the number of fitness function evaluations by the population size and subtracting one to properly count the initial generation. To speed up the EA, caching of fitness evaluations was implemented, as most of the computational resources are spent performing the packing to evaluate the genomes. Also, when a perfect packing was achieved, the run was stopped.

6.2. Parallel recombinative simulated annealing

PRSA does not require a large population size. In the extreme case of a population of one, it degenerates to SA, recombination not being performed. However, due to recombination, PRSA does benefit from a large population. The population size is set to 200 for the results presented here. The PRSA-specific cooling settings, the initial temperature, final temperature, and cooling interval, are set to 0.01, 0.0003, and 10, respectively. Self-adapting mutation rates are not used for PRSA. Through the cooling schedule, PRSA controls the effect mutation has on the population. As in an SA, a mutation is clearly required during each generation for progress to be made. With this observation as a guideline, the mutation rate was set such that on average about one mutation is introduced per individual per generation. Performing mutation such that exactly one mutation occurs in each individual in each generation is also worth investigating; however, this is left to future research.
For problems without rotations, the swap mutation rate is set to 1.0 and the swap corners mutation rate to 0.1. Where rotations are allowed, the swap mutation rate is set to 0.333, the rotation mutation rate to 0.667, and the swap corners rate to 0.1. Emphasis was placed on the rotation mutation, as it affects only one rectangle and is reasoned to be a smaller, less destructive, more desirable mutation. The weight parameters wt and ws are set to 6 and 1, respectively.
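Under our reading of Section 3.3 (each genome-level rate r is applied per rectangle with probability r/l, so a genome of length l expects r mutations of that kind per generation), the "about one mutation per individual per generation" guideline can be checked with a small sketch; the function name is ours:

```python
def expected_mutations_per_individual(genome_length, *rates):
    """Each genome-level rate r is applied per rectangle with probability
    r / l (our reading of Section 3.3), so a genome of length l expects
    l * (r / l) = r mutations of that kind per generation."""
    return sum(genome_length * (r / genome_length) for r in rates)

# PRSA settings with rotations: swap 0.333 + rotation 0.667 gives about
# one mutation per individual per generation; without rotations, the
# swap rate of 1.0 alone gives the same expectation.
```

Note that the swap corners' rate of 0.1 is applied per genome, not per rectangle, so it is excluded from this bookkeeping.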
Table 1
Common settings for the self-adapting mutation rates

             Swap          Rotation      Swap corners
             Min    Max    Min    Max    Min    Max
Rate range   0.10   1.90   0.10   1.90   0.01   0.19
6.3. Genetic algorithm

The population size is set to 400 and the tournament size to 4. These numbers are somewhat arbitrary; our preliminary studies suggested that while the larger problems need more function evaluations (and larger populations), the smaller problems do fine with fewer. Self-adapting mutation rates are used with the limits shown in Table 1. The swap and rotation mutation rates are designed to initially produce an average mutation rate of 1/l, where l is the length of the genome, as described in Section 3.3. The swap corners' rate is not divided by l; it is initially set so that one in ten genomes will undergo mutation. For the initial population the rates are initialized from a random uniform distribution in the ranges given. In problems where rotation is not allowed, the rotation mutation rate is fixed at 0.

7. Experimental results

In Section 7.1, we seek the best possible trim losses and run the EAs for 1,000,000 fitness function evaluations. In Section 7.2 we analyze the benefits of the self-adaptive GA.

7.1. Bin packing

Table 2 shows the trim loss achieved with the FC heuristic for both the PRSA algorithm and the GA when run for up to 1,000,000 fitness function evaluations (the algorithms stop when a perfect packing is reached). On the larger problems, consisting of 73 or more rectangles, the average trim loss is generally much less than one percent for both algorithms. On the smaller problems, the failure to pack just one of the smaller rectangles can constitute several percent. For these smaller problems, if perfect packings are not found, the results will not reach the low trim losses seen in the larger problems. Using the FC heuristic, PRSA produced better results than the corresponding GA. Looking only at the problems with rotations permitted, we find
Table 2
Bin packing trim loss results for PRSA and GA (1,000,000 evaluations). For comparison, results from Leung et al. (2003) are shown.

Prob  # of    PRSA, with rot.     PRSA, without rot.  GA, with rot.       GA, without rot.    Exec time
 #    rects   Best     Average    Best     Average    Best     Average    Best     Average    (seconds)
 1    16      0.00000  0.00350    0.00000  0.00000    0.00000  0.00150    0.00000  0.00000      157
 2    17      0.00000  0.00400    0.03500  0.03575    0.00000  0.01450    0.01000  0.03300      171
 3    16      0.00000  0.00000    0.00000  0.00000    0.00000  0.00000    0.00000  0.00000      154
 4    25      0.00000  0.00200    0.00000  0.00000    0.00000  0.00500    0.00000  0.00633      292
 5    25      0.00000  0.00000    0.00000  0.00100    0.00000  0.00150    0.00000  0.00300      258
 6    25      0.00000  0.00000    0.00000  0.00000    0.00000  0.00217    0.00000  0.00067      269
 7    28      0.00000  0.00600    0.00000  0.00000    0.00000  0.00578    0.00000  0.00572      310
 8    29      0.00444  0.00694    0.00444  0.01100    0.00444  0.01094    0.00444  0.01683      377
 9    28      0.00000  0.00561    0.00000  0.00394    0.00278  0.00767    0.00000  0.00628      332
10    49      0.00333  0.00675    0.00750  0.01342    0.00333  0.00733    0.00972  0.01572      927
11    49      0.00000  0.00333    0.00639  0.01061    0.00139  0.00353    0.00833  0.01486      751
12    49      0.00333  0.00333    0.00333  0.00519    0.00167  0.00361    0.00444  0.00944      795
13    73      0.00222  0.00244    0.00222  0.00439    0.00074  0.00330    0.00556  0.01107     1590
14    73      0.00000  0.00067    0.00222  0.00417    0.00259  0.00452    0.00944  0.01491     1584
15    73      0.00000  0.00196    0.00259  0.00433    0.00148  0.00328    0.00426  0.00941     1600
16    97      0.00167  0.00413    0.00375  0.00646    0.00219  0.00411    0.00698  0.00949     3301
17    97      0.00000  0.00152    0.00167  0.00271    0.00167  0.00367    0.00510  0.01121     2479
18    97      0.00167  0.00267    0.00333  0.00471    0.00125  0.00333    0.00469  0.00893     2889
19    196     0.00620  0.00900    0.01195  0.01376    0.00508  0.00628    0.00807  0.01007    19,712
20    197     0.00135  0.00348    0.00263  0.00515    0.00273  0.00380    0.00523  0.00666    17,895
21    196     0.00422  0.00665    0.00669  0.00901    0.00344  0.00470    0.00596  0.00782    19,183
22    10      0.00000  0.00000    0.00000  0.00000    0.00000  0.00000    0.00000  0.00000       87
23    15      0.00000  0.00000    0.00000  0.00000    0.00000  0.00000    0.00000  0.00000      129
24    20      0.00000  0.00469    0.00000  0.00000    0.00000  0.01144    0.00000  0.00000      233
25    20      0.02732  0.03141    0.00000  0.00402    0.02732  0.03645    0.00000  0.01684      236
26    25      0.01357  0.02305    0.01071  0.01780    0.01357  0.02529    0.01071  0.02409      348
27    25      0.00000  0.00133    0.00000  0.00000    0.00000  0.00133    0.00000  0.00467      270
28    30      0.00000  0.00533    0.00000  0.00356    0.00000  0.00711    0.00000  0.00889      324
29    30      0.00000  0.00369    0.00000  0.00308    0.00000  0.00246    0.00000  0.00923      356
30    40      0.01018  0.01733    0.02061  0.02422    0.00970  0.01661    0.01745  0.02640      671
31    50      0.01042  0.01481    0.01375  0.02225    0.01042  0.01604    0.01167  0.02529     1004

Leung et al. (2003) SA–GA (without rotations); no results reported for problems 1–12:

Prob  Mutation rate = 0.3   Mutation rate = 0.7   Mutation rate = 1.0
 #    Best     Average      Best     Average      Best     Average
13    0.00444  0.00979      0.00463  0.00833      0.00648  0.01025
14    0.00148  0.00543      0.00259  0.00600      0.00370  0.00678
15    0.00519  0.00919      0.00333  0.00925      0.00678  0.00963
16    0.00563  0.00835      0.00583  0.00969      0.00656  0.01035
17    0.00208  0.00439      0.00260  0.00531      0.00396  0.00733
18    0.00469  0.00760      0.00448  0.00695      0.00427  0.00858
19    0.01190  0.01507      0.01120  0.01313      0.01227  0.01488
20    0.00284  0.00665      0.00443  0.00569      0.00453  0.00628
21    0.00792  0.00982      0.00776  0.00920      0.00724  0.00960
22    0.00000  0.00000      0.00000  0.00000      0.00000  0.00000
23    0.00000  0.00000      0.00000  0.00000      0.00000  0.00000
24    0.00000  0.01688      0.00000  0.01238      0.00000  0.02400
25    0.00000  0.00182      0.00000  0.01205      0.00000  0.02123
26    0.01571  0.02661      0.02143  0.03112      0.02143  0.03280
27    0.00000  0.00178      0.00000  0.00267      0.00000  0.00400
28    0.00000  0.00948      0.00000  0.01185      0.01778  0.01778
29    0.00000  0.01313      0.00000  0.00930      0.01231  0.01997
30    0.01891  0.03108      0.01527  0.02615      0.01091  0.02555
31    0.02271  0.02667      0.01958  0.02325      0.02083  0.02757
that PRSA produced better average trim losses on 20 of the problems, the GA producing better results on only seven, the remaining problems being ties. Similarly, on the problems without rotations permitted, we find that PRSA produced better average trim losses on 23 of the problems, the GA producing better results on only three, with the remainder being ties. However, we note that the GA did better on two out of three of the most difficult problems, both with and without rotations. On problems of up to 97 rectangles, perfect packings were achieved when rotations were permitted (see Fig. 7). However, without rotations, perfect packings were found only on problems of up to 30 rectangles. In addition, allowing rotations produced better average trim losses on all problems of 40 or more rectangles, for both the PRSA and GA methods. There are likely two causal factors. First, the increase in search space likely increases the number of solutions. Second, through the rotation mutation the EA is better able to fine-tune the packing. The rotation mutation affects only one rectangle, in one part of the packing, whereas the swap mutation affects at least two positions in the packing. On the larger problems, we find that allowing rotations clearly makes it easier to achieve a perfect packing.
Interestingly, perfect packings of 49 or more rectangles were achieved only by the PRSA algorithm with the FC heuristic. Although the GA did well on some of the larger problems, it never achieved a perfect packing on such problems. Of the four packing methods using the FC heuristic investigated in this paper, we find that the PRSA algorithm with the FC heuristic and rotations permitted produced the overall best results.

To get an idea of how the PRSA algorithm comes to outperform the GA, the decrease in trim loss as a function of fitness function evaluations is shown in Fig. 6. This graph shows the best result achieved by each algorithm on problem 20 with rotations permitted. The GA quickly reached a trim loss of 0.0040 after 250,000 evaluations, then stagnated and remained quite steady for the next 500,000 evaluations before finally moving lower. After 1,000,000 evaluations a trim loss of 0.0027 was reached. PRSA, on the other hand, due to the high initial temperatures, does not converge quickly but rather gradually moves to an improved final trim loss of 0.0014. This graph illustrates an important consideration when increasing computational resources. In PRSA, using the increased computational resources is straightforward. The cooling schedule can be kept the same and the algorithm allocates the additional
Fig. 6. The drop in trim loss with respect to the number of fitness function evaluations (problem 20, 197 rectangles).
fitness function evaluations geometrically throughout the cooling schedule. The caveat is that the number of fitness function evaluations must be determined before the run is started. In preliminary experiments, we found that the quality of the final result improved steadily as the number of fitness function evaluations allowed for the complete run was increased from 50,000 to 1,000,000. The GA, on the other hand, is more complex to adjust. Effectively using additional function evaluations is more difficult, as simply running the GA for more generations increases the problem of premature convergence. In general, much more tuning is necessary, such as increasing the population, changing the tournament size, detecting convergence, and restarting the GA. However, the GA is not without merit: as pointed out above, the GA did better on four of the six problems with 196–197 rectangles. In addition, as shown in Section 7.2, the GA clearly beat the PRSA algorithm in a study where only 100,000 fitness function evaluations were permitted. The choice between the PRSA algorithm
and the GA likely depends on the available computational resources.

We now take a look at a few specific packing results to illustrate the characteristics of FC packings. Fig. 7 shows a perfect packing achieved for a 97-rectangle problem. The tendency for the larger, bulky rectangles to be packed first to the corners and sides is clearly visible. When a perfect packing is not achieved, the trim loss tends to accumulate in the center. This tendency holds true for all problem sizes investigated, and is clearly illustrated by the simple problem consisting of only 29 rectangles shown in Fig. 8. The rectangle packing structure observed here conforms to the intuitive method of placing the larger rectangles first in the corners and sides, then jiggling the smaller rectangles around in the center until a better solution is found.

To give a general idea of the computer time necessary for the FC bin packing algorithm, Table 2 shows the estimated worst-case execution time in seconds for 1,000,000 fitness function evaluations. These execution times were measured on an Intel
Fig. 7. Perfect packing of problem 17 with 97 rectangles (rotations permitted).
Fig. 8. An example of a 29 rectangle problem illustrating how trim loss (white space) accumulates in the center.
Pentium 4 550 processor (3.4 GHz) using the PRSA algorithm with rotations permitted; caching of fitness evaluations was disabled, and the algorithm was not stopped when a perfect packing was reached. Standard compiler optimizations were enabled; however, the code was not hand-optimized.

We compared the PRSA algorithm using the FC heuristic to the mixed SA–GA introduced by Leung et al. (2003). For reference, the results of their mixed SA–GA from Table 11 of their paper are duplicated in Table 2. They give results for the SA–GA method using three different mutation rates on problems 13–31 without rotations. For the SA–GA mutation rate of 0.3, PRSA with the FC heuristic produces better average results 16 times; the SA–GA method produces a better average result only once. For a mutation rate setting of 0.7, PRSA with the FC heuristic again produces better average results 16 times, SA–GA producing a better result only once. For the final mutation rate setting of 1.0, PRSA with the FC heuristic produces better results 17 times, SA–GA not producing any better results. Better overall results are achieved by PRSA with the FC heuristic.
7.2. The benefits of a self-adapting EA

In this section, we take a look at the benefits of self-adaption. The test problems with rotations permitted are used for this study. Initial runs with both the PRSA algorithm and the GA were performed with 100,000 fitness function evaluations. With only 100,000 fitness function evaluations permitted, PRSA did not perform nearly as well as the GA. Hence, for the results presented in this section we vary the self-adapting GA parameters to analyze their effects on the GA.

With many possible crossover operators and mutation rates, a complete comparison is impossible. Some hints are sought from the self-adapting GA runs. At the end of a typical run, observing the mutation rates appended to the genome, it was found that the swap mutation, rotation mutation, and swap corners' mutation rates decreased from their initial average values of 1.0, 1.0, and 0.1 to final average values of 0.3, 0.6, and 0.05. Bäck (2000) indicates that many researchers have proposed that mutation rates be adjusted during a run using a decreasing schedule. The self-adapting GA follows this rule of thumb. The actual decreases were quite
problem and run specific; however, the rotation mutation rate generally remained higher than the swap mutation rate. Since a decrease in mutation rate is to be expected as evolution progresses, it cannot be concluded that these lower rates are appropriate initial settings. Still, we may interpret these rates to suggest that the ratio between the swap mutation and rotation mutation might not be 1:1 but instead closer to 1:2 (these are the actual ratios we used for our fixed PRSA mutation rates). Similarly, the crossover tag indicated that cycle crossover was the most frequently surviving crossover operator.

Using these observations as guidelines to select good fixed crossover operators and mutation rates, experiments were performed to analyze the benefits of using adaptive crossover operators and adaptive mutation rates. In addition, to analyze the sensitivity of the self-adaptive GA to the adaptive mutation rate ranges, two different sets of adaptive mutation rates are tested. Specifically, the following eight bin packing experiments were conducted.

1. Adaptive A – the self-adaptive GA as described in the common settings section of this paper.
2. Adaptive PRSA – the PRSA algorithm as described in the common settings section of this paper, but with the fitness function evaluations set to 100,000.
3. Adaptive B – Adaptive A settings, but with the self-adaptive mutation rates scaled down as indicated in Table 3.
4. Fixed PMX – Adaptive A settings, but with self-adaptive crossover disabled, PMX being the exclusive crossover operator.
5. Fixed Cycle – Adaptive A settings, but with self-adaptive crossover disabled, cycle crossover being the exclusive crossover operator.
6. Fixed Mutation A – Adaptive A settings, but with the mutation rates fixed respectively at 1.0, 1.0, and 0.1 for the swap mutation, rotation mutation, and swap corners' mutation (the expected values of the initial Adaptive A rates).
7. Fixed Mutation B – Adaptive A settings, but with the mutation rates fixed respectively at 0.333, 0.667, and 0.1 for the swap mutation, rotation mutation, and swap corners' mutation (the expected values of the initial Adaptive B rates).
8. Fixed Mutation B and Cycle – Fixed Mutation B settings, but with the crossover operator fixed to cycle crossover.

Table 3
Self-adaptive mutation rate settings for Adaptive B

             Swap           Rotation       Swap corners
             Min     Max    Min     Max    Min     Max
Rate range   0.033   0.633  0.067   1.267  0.010   0.190

Table 4 summarizes the results of the eight experiments. For each of the eight experiments, the average trim loss of the ten runs is given. For each problem, the best average trim loss achieved is shown in bold; in the case of ties, all best results are shown in bold. The total number of best average trim losses achieved is tallied. Overall, we find that Fixed Mutation B using adaptive crossover performed best with 11 best averages, followed by the completely adaptive Adaptive A and the completely unadaptive Fixed Mutation B and Cycle, both of which achieved 10 best averages. However, looking at the details, we find that although the fixed settings perform better on the problems with fewer rectangles, the fully adaptive settings do much better on the large problems with 73 or more rectangles. Comparing Adaptive A to the fully fixed Fixed Mutation B and Cycle, we find that Adaptive A produced better results on eight out of nine of the problems with 73 or more rectangles.

Interestingly, cycle crossover performed better than the well-known PMX crossover operator on all 14 problems of 40 or more rectangles. On the smaller problems of fewer than 40 rectangles, the PMX crossover operator did better. As we anticipate that most real problems GAs will be used to tackle will be large, we recommend cycle crossover over PMX. When it comes to fixed mutation rates, Fixed Mutation B performed better than Fixed Mutation A 28 times, the latter never achieving a better result. The GA results are thus quite sensitive to fixed mutation rates. Although Adaptive A bettered Adaptive B 20 times, Adaptive B still performed better 9 times.
We find that the GA using adaptive mutation rates is not as sensitive to the initial parameter settings as the GA using fixed mutation rates. One might be able to find completely fixed GA parameter settings, such as Fixed Mutation B and Cycle, that work well on a particular class of problems; however, finding these parameter settings is generally time-consuming. In addition, verifying that the GA will continue to perform well on new problems believed to be in the class is difficult. We con-
Table 4
Comparison of the various EA parameter settings; average trim loss is shown, and the best average trim loss for each problem is bold

Problem #   # of rects   Adaptive A
 1          16           0.00450
 2          17           0.02450
 3          16           0.00000
 4          25           0.01033
 5          25           0.00400
 6          25           0.00200
 7          28           0.00894
 8          29           0.01156
 9          28           0.00889
10          49           0.00725
11          49           0.00656
12          49           0.00558
13          73           0.00580
14          73           0.00730
15          73           0.00491
16          97           0.00962
17          97           0.00493
18          97           0.00593
19          196          0.02725
20          197          0.01143
21          196          0.02063
22          10           0.00000
23          15           0.00000
24          20           0.01913
25          20           0.03529
26          25           0.02391
27          25           0.00400
28          30           0.01422
29          30           0.00492
30          40           0.01867
31          50           0.01850
# of best averages on problems with: