Multi-Project Scheduling using Competent Genetic Algorithms

Ali A. Yassine+ Department of Industrial & Enterprise Systems Engineering University of Illinois at Urbana-Champaign Urbana, IL 61801 USA [email protected]

Christoph Meier Institute of Astronautics Technische Universität München 85748 Garching, Germany [email protected]

Tyson R. Browning Neeley School of Business Texas Christian University Fort Worth, TX 76129 USA [email protected]

University of Illinois Department of Industrial & Enterprise Systems Engineering (IESE) Working Paper

Current version: Feb. 1, 2007

+ Corresponding author.

The second author is grateful for support from the Bavarian Science Foundation.

ABSTRACT

In a multi-project environment, many projects must be completed while relying on a common pool of scarce resources. In addition to resource constraints, there exist precedence relationships among the activities of individual projects. This project scheduling problem is NP-hard, and most practical solutions that can handle large problem instances rely on priority-rule heuristics and meta-heuristics rather than optimal solution procedures. In this paper, a Competent Genetic Algorithm (CGA), hybridized with a local search strategy, is proposed to minimize the overall duration, or makespan, of the resource-constrained multi-project scheduling problem (RCMPSP) without violating inter-project resource constraints or intra-project precedence constraints. The proposed Genetic Algorithm (GA) is tested, with several varied parameters, on sample scheduling problems generated according to two popular multi-project summary measures, the Average Utilization Factor (AUF) and the Average Resource Load Factor (ARLF). The superiority of the proposed CGA over simple GAs and well-known heuristics is demonstrated.

Keywords: Multi-Project Scheduling, Resource Constraints, Competent Genetic Algorithm, Heuristic Priority Rules

Nomenclature (Project Scheduling)

m: Number of projects
n: Total number of activities
nh: Number of activities in project h
h: Index of project
Vz: Set of activities {a(1)…a(nh)} in project h
di: Processing time for activity i
i→j: Predecessor relationship
Pred(i): Set of predecessors for activity i
Rk: Set of renewable resources of type k
rihk: Per-period usage of resource k by activity i of project h
Kh: Number of types of resources used by project h
θ: Set of feasible schedules
θT: Set of precedence-feasible schedules
θR: Set of resource-feasible schedules
Cmax: Vector of task completion times
ARLF: Average Resource Loading Factor
AUF: Average Utilization Factor
CPh: Non-resource-constrained critical path duration of project h
Xiht: Boolean variable, true (equal to 1) if activity i of project h is active at time t
Ziht: Equal to -1 if activity i of project h is active at time t ≤ CPh/2; otherwise equal to 1
S: Number of time intervals spanning a problem

Nomenclature (Genetic Algorithm)

SGA: Simple Genetic Algorithm
CGA: Competent Genetic Algorithm
BB: Building Block
c: Constant factor
kBB: Order or size of BBs
l: Current chromosome/problem length
mBB: Number of BBs
npop: Population size
pcut: Cut probability
pκ: Bitwise cut probability
pm: Mutation probability
ps: Splice probability
s: Size of tournament used in the Tournament Selection phase
μ: Calibration coefficient
Φp: Phenotypic search space
Φg: Genotypic search space

1. INTRODUCTION

Due to increasingly impatient customers and competitive threats, improvements in the efficiency with which projects are completed and new products are brought to market have become increasingly important. To complicate things further, many organizations face the challenge of managing the simultaneous execution of a portfolio of projects under tight time and resource constraints. In such an environment, project management and scheduling skills become critical to the organization. “Multi-project environments seem to be quite common in project scheduling practice…. It has been suggested [65,75] that up to 90%, by value, of all projects are carried out in the multi-project context, and thus the impact of even a small improvement in their management on the project management field could be enormous” [34]. In this paper, we address the case of a portfolio of simultaneous projects with identical start times. Each project consists of precedence-constrained activities that draw from common pools of resources, which are usually not large enough for all of the activities to work concurrently. In such cases, which activities should get priority? The goal is to prioritize them so as to optimize an objective function, such as minimizing the delay of each project or of the whole portfolio. This is the basic resource-constrained multi-project scheduling problem (RCMPSP). In an RCMPSP environment, a company has m concurrent projects P1…Pm, each comprised of a set of activities Vz = {a(1)…a(nh)}, where nh specifies the total number of activities in project Ph. In addition, any activity i has several associated attributes, such as its duration di and the types and amounts of resources required. Each project has a corresponding precedence network whose structure is often depicted by an activity-on-node network.
The predecessor relationship between two activities i and j is denoted by i→j or (i,j), and the entire set of predecessors of activity j is denoted by Pred(j). Although projects may be unrelated by precedence constraints, they depend on a common pool of resources and are therefore related by resource constraints. We consider a set of renewable resources where the per-period usage of resource k by activity i of project h is written as rihk. The set Rk constitutes the constant amount of resource k available during every time period. We assume that a resource must be devoted to an activity until the activity is completed before beginning another activity (i.e., no preemption is allowed). Moreover, we assume the single-mode case, where a single resource type is assigned for performing a particular activity, and the processing time di and the resources required rihk for any activity i are fixed. When considering the set of feasible project schedules θ = θT ∩ θR, where θT denotes the set of precedence-feasible schedules and θR denotes the set of resource-feasible schedules, there exist many possible schedules and many potential objectives for choosing between them. If n defines the total number of activities in all m projects, popular objectives include minimizing the maximum project makespan (i.e., minimizing Cmax = max{d1…dn}), maximizing the net present value (NPV), maximizing resource leveling, or minimizing project costs [8,46]. This paper utilizes a new genetic algorithm (GA) approach to solve the RCMPSP; however, in contrast to previous research, we use a state-of-the-art competent GA (CGA) design. This design differs from simple GA (SGA) strategies in that it strives to identify highly fit hyperplanes in the search space, called building blocks (BBs), which are subsequently combined in a sophisticated manner. The resulting project schedules are expected to be superior to those produced by priority-rule heuristics or SGAs. To cope with the precedence and resource constraints in the RCMPSP, we introduce an efficient repair mechanism that ensures feasibility throughout the search. We further enhance the performance of the CGA with a local search strategy tailored to the RCMPSP. The performance of this approach is thoroughly tested on 77 test problems, carefully constructed according to two popular multi-project summary measures, the Average Resource Load Factor (ARLF) and the Average Utilization Factor (AUF), proposed by Kurtulus and Davis [48]. As a result, we discovered that the parallel schedule generation scheme (SGS) outperforms the self-adapting SGS in combination with the proposed CGA. Furthermore, the proposed CGA outperforms many well-known priority-rule-based heuristics in 78% of the problems. The paper proceeds as follows.
Section 2 provides background on activity scheduling and SGAs, after which §3 presents the design and benefits of a CGA for combinatorial problems and §4 describes its tailoring to RCMPSPs. §5 evaluates the performance of the proposed CGA compared with a SGA on the RCMPSP test bank. In §6, we present comparative results of the CGA with 20 popular heuristic priority rules used in the literature. The paper concludes in §7 with a brief summary of the work completed and possible extensions for future work.


2. BACKGROUND

Project scheduling is of great practical importance, and its general model can be applied to product development, production planning, and a wide variety of other scheduling applications. Early efforts in project scheduling focused on minimizing the overall project duration (makespan) assuming unlimited resources. Well-known techniques include the Critical Path Method (CPM) [40] and the Project Evaluation and Review Technique (PERT) [56]. Scheduling problems have been studied extensively for many years by attempting to determine exact solutions using methods from the field of operations research [46]. It was shown early on that the scheduling problem subject to precedence and resource constraints is NP-hard [49], which means that exact methods are too time-consuming and inefficient for solving the large problems found in real-world applications. There exist benchmark instances with as few as 60 activities that have not been solved to optimality [31]. Kolisch [46] surveyed a number of techniques developed for resource-constrained project scheduling, including dynamic programming, zero-one programming, and implicit enumeration with branch and bound. Some examples of exact solution methods can be seen in [8,15,16,72]. Among them, the branch and bound approach is the most widely applied. However, its depth-first or breadth-first searches cannot exhaustively explore a large-scale project scheduling problem. Simulation modeling provides another angle on the RCMPSP: a simulation model has been proposed for multi-project resource allocation, interpreted as multi-channel queuing [20]. The innate drawbacks of simulation are its time and cost, as well as its reliance on a particular simulation language, which can hinder its dissemination. Finally, many different heuristic approaches have been developed to solve intractable problems quickly, efficiently, and fairly satisfactorily. A survey of heuristic approaches can be found in [8,45].

GAs, first proposed in [35], are adaptation procedures based on the mechanics of genetics and natural selection. These algorithms are designed to act as problem-independent algorithms, which is at times contrary to the practice in Operations Research, where algorithms are often matched to problems. In brief, a simple GA works as follows. A feasible instance of the underlying problem is encoded as a so-called chromosome, while multiple chromosomes form a GA population. By selecting the fittest chromosomes (i.e., the ones with the highest value according to an objective function) and applying the ordinary genetic operators—selection, crossover, and mutation—the population is expected to improve over time. The GA proceeds until a predefined convergence criterion is reached. Figure 1 depicts this simple circular flow.

Figure 1: Simple GA Flowchart
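To make the flow in Figure 1 concrete, the following minimal sketch implements the selection-crossover-mutation loop in Python. The bit-string encoding, binary tournament, one-point crossover, and OneMax objective are illustrative placeholder choices, not the encoding or operators used for the RCMPSP later in this paper.

```python
import random

random.seed(7)  # fixed seed so this sketch is reproducible

def simple_ga(fitness, length=20, npop=50, pm=0.01, generations=100):
    """Minimal SGA loop: selection, crossover, and mutation until a
    fixed generation budget (a simple convergence criterion) is spent."""
    pop = [[random.randint(0, 1) for _ in range(length)] for _ in range(npop)]

    def select():
        # binary tournament: keep the fitter of two random chromosomes
        a, b = random.sample(pop, 2)
        return a if fitness(a) >= fitness(b) else b

    for _ in range(generations):
        nxt = []
        while len(nxt) < npop:
            p1, p2 = select(), select()
            point = random.randrange(1, length)            # one-point crossover
            child = p1[:point] + p2[point:]
            child = [1 - g if random.random() < pm else g for g in child]
            nxt.append(child)
        pop = nxt
    return max(pop, key=fitness)

# Toy objective (OneMax): count the 1-bits; the optimum is the all-ones string.
best = simple_ga(lambda chrom: sum(chrom))
print(sum(best))
```

Note that nothing in this loop is specific to scheduling; the problem enters only through the encoding and the fitness function, which is what makes the SGA a problem-independent template.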

In terms of the scheduling problem, GAs were first used by Davis [13]. Since then, a vast literature on the application of GAs to various scheduling problems has emerged [10,17,19,37,67,78]. With a special focus on the single-project problem1, Hartmann developed a GA with permutation-based encoding [31] and introduced a self-adapting representation scheme which automatically determines the best schedule decoding procedure [32]. Gonçalves et al. [27] used a SGA approach for the RCMPSP based on a random key chromosome encoding and a schedule generation procedure which creates so-called parameterized active schedules. Recently, Valls et al. [79] proposed a hybrid GA tailored to the RCPSP with a specific crossover operator and a local search operator. Despite slight modifications of the GA flow illustrated in Figure 1, past and recent publications on the application of GAs to the RCMPSP/RCPSP nevertheless have the SGA design in common. In the subsequent sections, we explain why a SGA design, even when improved through local search or specialized crossover and mutation operators, is ultimately inferior to a state-of-the-art CGA design.

1 This problem is known as the resource-constrained project scheduling problem (RCPSP).


3. DESIGN AND IMPLEMENTATION OF A COMPETENT GA

Past research on GA applications to scheduling problems is characterized by the use of a SGA design as illustrated in Figure 1. This design can be enhanced by several techniques, including niching [55], parallelization [9], or hybridization of the GA, which is oriented more towards global search, with an efficient local search strategy [60]. Despite all of these well-known enhancement techniques, a SGA fails to consistently provide adequate solutions as problem difficulty increases [23]. A mathematical explanation for this statement is the dimensional analysis of the building block (BB) mixing process in SGAs [74]. BBs are useful hyperplanes in the search space, called schemata, which can be understood as portions of a solution that contribute much to the global optimum [22]. Goldberg [22] claims that the continual juxtaposition and selection of BBs forms better and better solutions over time, leading to a global optimum in the search space. In a study of the BB mixing process, Thierens [74] pointed out the exponential scale-up in computational expense with linearly increasing problem difficulty. He defined the following condition on the population size necessary for a successful GA:

n_{pop}\left(\ln n_{pop} + \ln(\ln n_{pop})\right) > c \cdot \frac{2^{m_{BB} + \mu k_{BB}} \cdot \ln s \cdot 2m_{BB}}{p_c \, m_{BB}^{2.5}} \qquad (1)

where kBB denotes the BB order/size, mBB corresponds to the number of BBs, npop is the population size, μ can be regarded as a calibration coefficient, s is the tournament size, pc is the crossover probability, and c is a constant. Since problem difficulty can be expressed in terms of the length and order of the BBs, equation (1) demonstrates the negative effect of increasing difficulty on computational resource requirements. To tackle the mixing problem in a SGA, several so-called CGAs were developed. Three different approaches to CGAs can be distinguished: (a) perturbation techniques, (b) linkage adaptation techniques, and (c) probabilistic model-based techniques. Examples of the first approach include the fast messy GA (FMGA) [26], the ordering messy GA (OmeGA) [41], and the gene expression messy GA (GEMGA) [39]. An example of the second approach is the linkage learning GA, introduced by Harik [28]. The third approach includes the compact genetic algorithm [29] and the Bayesian optimization algorithm (BOA) [66]. At present, the OmeGA is the only CGA constructed for combinatorial problems like scheduling or the quadratic assignment problem. Essentially, the OmeGA combines the FMGA with random keys to represent permutations. Empirical tests by Knjazew [41] on artificial test functions show a promising sub-quadratic scale-up of resources with problem length, O(l^1.4), where l = kBB·mBB.

Therefore, in the remainder of this section, we present the OmeGA and its application to the RCMPSP.

3.1. Components of the OmeGA

3.1.1. Data Structure

Using any GA as an optimization technique requires the existence of a proper data structure for manipulation. Each instance of this structure represents one point in the space of all possible solutions. In the context of GAs, this data structure is usually called a chromosome, which is a juxtaposition of genes. Genes occur at different locations, or loci, of the chromosome and have values which are called alleles. While the term genotype refers to the specific genetic makeup of an individual in natural systems and corresponds to the encoded structure of a GA, the term phenotype corresponds to the decoded structure, which can be regarded as one point in the search space. To facilitate linkage learning by permitting genes to move around the genotype, the OmeGA uses a different representation technique than ordinary GAs: each gene is tagged with its location via the pair representation (locus, allele). For example, the two messy chromosomes shown in Figure 2 both represent the permutation (1-34-89-15-13-19). Such permutations constitute a schedule priority list for the RCPSP.

Figure 2: Illustration of a Messy Chromosome

Messy chromosomes may also have a variable length; they can be overspecified or underspecified. As an example, consider the chromosomes in Figure 3. The problem length is 6 in this example, so the first chromosome is overspecified since it contains an extra gene. The second chromosome is underspecified because it contains only three genes. To handle overspecification, Goldberg [26] proposes a gene expression operator that employs a first-come-first-served rule on a left-to-right scan. In the example of Figure 3, the gene assigned to locus 1 occurs twice in chromosome A. Thus, the left-to-right scan drops the second instance, obtaining the valid permutation (1-34-89-15-13-19) instead of (99-34-89-15-13-19). In the case of underspecification, the unspecified genes are filled in using a competitive template, which is a fully specified chromosome from which any missing genes are directly inherited. At the start of the OmeGA, the genes of the competitive template are randomly chosen in consideration of feasibility issues. For example, using the competitive template shown in Figure 4, the underspecified chromosome B is completed by inheriting genes 4 through 6 from the competitive template.
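The two expression rules just described can be sketched as follows. The permutation (1-34-89-15-13-19) from the running example is reused; since the exact pair layouts of Figures 3 and 4 are not reproduced here, the messy chromosomes below are assumed variants consistent with the text.

```python
def express(messy, template):
    """Resolve a messy chromosome into a fully specified permutation.

    messy:    list of (locus, allele) pairs, possibly over- or underspecified.
    template: competitive template as a dict locus -> allele covering all loci.
    """
    alleles = {}
    for locus, allele in messy:
        # first-come-first-served on a left-to-right scan:
        # later duplicates of an already-seen locus are dropped
        if locus not in alleles:
            alleles[locus] = allele
    # underspecified loci inherit their genes from the competitive template
    for locus, allele in template.items():
        alleles.setdefault(locus, allele)
    return [alleles[locus] for locus in sorted(alleles)]

template = {1: 1, 2: 34, 3: 89, 4: 15, 5: 13, 6: 19}
over = [(1, 1), (1, 99), (2, 34), (3, 89), (4, 15), (5, 13), (6, 19)]  # locus 1 twice
under = [(1, 1), (2, 34), (3, 89)]                                     # loci 4-6 missing
print(express(over, template))   # -> [1, 34, 89, 15, 13, 19]
print(express(under, template))  # -> [1, 34, 89, 15, 13, 19]
```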

Figure 3: Messy Chromosomes May Have Variable Length

Figure 4: Use of a Competitive Template on Underspecified Chromosomes

SGAs for combinatorial problems typically utilize an integer encoding for chromosomes, representing alleles as ordinary integer values. Accordingly, various GA operators have been developed to maintain feasibility with respect to allele duplication in the population when using integer encoding [22,68].2 In contrast, the OmeGA uses a binary coding representation where messy chromosomes are encoded through random keys [2], as demonstrated in Figure 5. Each gene on the chromosome is assigned a number ri ∈ [0,1]. The permutation sequence is then determined by sorting the genes in ascending order of their associated random keys. This encoding has the advantage that any crossover operator can be used, since random keys always produce duplicate-free solutions for combinatorial problems. Moreover, information about partial relative ordering is preserved under crossover [41]. The floating point numbers used as ascending sorting keys are initially determined randomly and change only under the influence of mutation and crossover. Accordingly, a permutation of length l is encoded by a vector r = {r1, r2, ..., rl}.
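Decoding a random-key vector is a single sort. The key values below are made up for illustration and are not those of Figure 5.

```python
def decode_random_keys(r):
    """Sort gene positions (1..l) in ascending order of their random keys
    to obtain a duplicate-free permutation."""
    return sorted(range(1, len(r) + 1), key=lambda i: r[i - 1])

r = [0.46, 0.91, 0.33, 0.75, 0.51]
print(decode_random_keys(r))  # -> [3, 1, 5, 4, 2]
```

Because any crossover of two key vectors yields just another vector of reals, the decoded offspring is always a valid permutation; no repair for duplicate alleles is ever needed.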

Figure 5: Demonstration of Random Keys

2 These operators do not ensure predecessor- or resource-feasibility.


3.1.2. Fitness Function

The second component of any GA is the fitness function. Every optimization technique must be able to assign a measure of quality to each structure in the search space to distinguish good from bad results. For this purpose, GAs use a fitness function to assign each individual chromosome a fitness value. Generally, an optimization problem can be decomposed into a genotype-phenotype mapping fg and a phenotype-fitness mapping fp [52]. Assuming a genotypic search space Φg, which can be either discrete or continuous, the fitness function f assigns each element in Φg a value as follows: f(x): Φg → ℝ. According to this decomposition, the genotype-phenotype mapping occurs first, where the genotype elements are mapped to elements in the phenotypic search space Φp: fg(xg): Φg → Φp. Subsequently, the phenotype-fitness mapping is performed: fp(xp): Φp → ℝ. Thus, the fitness function can be regarded as a composition of both mappings: f = fp ∘ fg = fp(fg(xg)). Due to the frequent fitness function evaluations, an efficient implementation is crucial for achieving adequate computational processing times. One opportunity for speeding up this operation is the use of parallelization techniques [9], which can be applied to some extent in the OmeGA. In their comprehensive survey of the project scheduling literature, Kolisch and Padman [46] provide an overview of objectives for scheduling problems, including traditional ones such as makespan and cost minimization but also more recent ones like maximization of project quality. We use total project lateness as the RCMPSP performance measure to be minimized, so we use the following fitness function for the OmeGA:

\min \sum_{z=1}^{m} \left( D_{Rz} - D_z \right) \qquad (2)

where DRz corresponds to the duration of project z under consideration of the resources available and Dz denotes the duration of project z neglecting any resource constraints.
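The decomposition f = fp ∘ fg can be sketched directly. The phenotype-fitness mapping below is a toy stand-in: computing the paper's actual objective (total project lateness, equation (2)) requires running a schedule generation scheme on the priority list, which is covered in §4.2.

```python
def f_g(x_g):
    """Genotype-phenotype mapping: random-key vector -> activity permutation."""
    return sorted(range(1, len(x_g) + 1), key=lambda i: x_g[i - 1])

def f_p(x_p):
    """Phenotype-fitness mapping (toy stand-in for total project lateness):
    here, the number of adjacent activity pairs already in ascending order."""
    return sum(1 for a, b in zip(x_p, x_p[1:]) if a < b)

def f(x_g):
    # f = f_p o f_g: decode the genotype, then score the phenotype
    return f_p(f_g(x_g))

print(f_g([0.2, 0.4, 0.1]))  # -> [3, 1, 2]
print(f([0.2, 0.4, 0.1]))    # -> 1
```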

3.2. Mechanics of the OmeGA

CGAs operate in a fundamentally different way from SGAs (shown in Figure 1). Figure 6 shows the overall flow of the OmeGA, which iterates over epochs, each of which contains two loops, an inner and an outer. The outer loop is called an era and corresponds to one BB level.3 The inner loop consists of three stages: (1) Initialization Phase, (2) BB Filtering / Primordial Phase, and (3) Juxtapositional Phase. An important property of any good optimization technique is the ability to avoid local optima. For this purpose, the OmeGA uses level-wise processing. After each era (level), the best solution found so far is stored and used as the competitive template for the next era. We now explain the three phases of each era in more detail.

3 A BB “level kBB” denotes the processing of BBs of maximum size kBB.

Figure 6: The Flow of the OmeGA

3.2.1. Initialization Phase

In the original study of the messy GA, the authors claimed that a population size with a single copy of all substrings of order/length kBB (ensuring the presence of all BBs) is necessary to detect solutions near the global optimum [24]. But the population size needed to guarantee such a copy is extremely high:

n_{pop} = 2^{k_{BB}} \cdot \binom{l}{k_{BB}} = 2^{k_{BB}} \cdot \frac{l!}{k_{BB}!\,(l - k_{BB})!} \qquad (3)

This equation results from the number of ways to choose BBs of order kBB from a string of length l, multiplied by the number of different BB combinations (assuming a binary alphabet). This exponential demand for resources would cause a serious problem, known as an initialization bottleneck, but Goldberg et al. [26] found a way to overcome it. They derived the theoretical population size for a probabilistically complete initialization and found only a linear scale-up with BB number and size. They verified this formula empirically on artificial functions. Unfortunately, the BB structure (size and number of BBs) of a problem is usually unknown. Hence, in practice, population size is normally determined by empirical tests. Nevertheless, we can estimate the required population size for a successful convergence of the OmeGA, since its design is identical to that of the fast messy GA except for its encoding scheme. During the Initialization Phase, the length of the chromosomes can be chosen arbitrarily in the interval between kBB and l. Note that in each era a completely new population is initialized. The cumulative population size after level (or era) t, assuming a starting level of 1, is given by \sum_{i=1}^{t} pop_i; thus, the overall population size is the sum of the population sizes for each era.
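Equation (3) is easy to evaluate numerically, which makes the initialization bottleneck tangible even for modest problem sizes:

```python
from math import comb

def naive_init_size(l, k_bb):
    """Population size guaranteeing one copy of every order-k_BB substring
    over a binary alphabet, per equation (3): 2^k_BB * C(l, k_BB)."""
    return 2 ** k_bb * comb(l, k_bb)

print(naive_init_size(10, 3))  # -> 960
print(naive_init_size(60, 5))  # -> 174768384, already impractical
```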

3.2.2. Building Block Filtering Phase

A common characteristic of every CGA is a means of identifying the BBs. According to the BB Hypothesis [22], these specific hyperplanes in the search space can subsequently be combined to form solutions near the global optimum. In the OmeGA, the BB Filtering Phase performs this essential task through repeated selection and random deletion of genes. At the beginning of the BB Filtering Phase, which is depicted in Figure 7, chromosomes arrive from the Initialization Phase with a length almost equal to the problem length. First, a Selection Segment probabilistically filters highly fit chromosomes from less fit ones. The current competitive template is used to complete the genotype of under- or overspecified chromosomes. Then, in a Length Reduction Segment, random deletion cuts chromosomes down to a length equal to the current BB level. Assuming the Selection Segment provides sufficiently good genes, the random deletion is not expected to destroy all the BBs. The mathematical equations calculating the number of selections and deletions to be performed can be found in [24,41]. As a selection scheme, we use Tournament Selection without replacement and a tournament size of 4, for the reasons mentioned in [54,64].

3.2.3. Juxtapositional Phase

In addition to being identified, BBs must be properly recombined. For this purpose, the OmeGA incorporates a Juxtapositional Phase, which corresponds to the crossover stage in SGAs. Instead of traditional crossover operators like uniform crossover [73], the OmeGA uses cut and splice operators [24], as demonstrated in Figure 8. The cut operator divides a chromosome into two parts with cut probability pcut = pκ·(l − 1), where l is the current length of a chromosome and pκ is a specified bitwise cut probability. Knjazew [41] suggests keeping l ≤ 2n, which yields a pκ equal to the reciprocal of half of the problem length, pκ = 2/n. In contrast to the cut operator, the splice operator connects two chromosomes with probability ps, where ps is usually chosen rather high. The OmeGA thereby combines the identified BBs to find the optimum in exactly the manner suggested by the BB Hypothesis. Note that the population size for each Juxtapositional Phase is held constant.
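The two operators can be sketched as follows; chromosomes are plain Python lists here, standing in for the lists of (locus, random-key) genes described above.

```python
import random

def cut(chrom, p_kappa):
    """Cut operator: with probability p_cut = p_kappa * (l - 1), split the
    chromosome at a uniformly chosen point; otherwise return it intact."""
    l = len(chrom)
    if l > 1 and random.random() < min(1.0, p_kappa * (l - 1)):
        point = random.randrange(1, l)
        return [chrom[:point], chrom[point:]]
    return [chrom]

def splice(a, b, p_s):
    """Splice operator: with probability p_s (usually chosen high),
    concatenate two chromosomes into one."""
    if random.random() < p_s:
        return [a + b]
    return [a, b]

# With Knjazew's suggestion p_kappa = 2/n for a problem of size n = 8:
parts = cut([1, 2, 3, 4, 5, 6, 7, 8], p_kappa=2 / 8)
print(parts)  # one or two pieces; genes are never lost or duplicated
```

Note that, unlike one-point crossover, cut and splice let chromosome lengths grow and shrink, which is what allows BBs of different sizes to be assembled.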


Figure 7: Illustration of the BB Filtering Phase

Figure 8: Examples of Cut and Splice Operators

4. APPLYING THE OmeGA TO PROJECT SCHEDULING PROBLEMS

We have found that it is important to be careful in applying any GA to an RCMPSP, because slight nuances in the GA’s attributes can have important implications for solution quality and efficiency. Therefore, we use this section to provide some details of the OmeGA’s application to the RCMPSP.

4.1. Preserving Predecessor Feasibility

Applying the OmeGA to the RCMPSP requires addressing the issue of predecessor feasibility. The outcome of the OmeGA is a permutation of the priority list for all activities to be scheduled. Based on this list, we explain in the next section how a so-called schedule generation scheme (SGS) builds a real schedule with the start and finish times for each activity. However, a prerequisite for any SGS is a precedence-feasible schedule list. Since the mechanics4 and the random key encoding in the OmeGA merely ensure the prevention of duplicate alleles in chromosomes, an additional, efficient repair mechanism is required to transform any schedule list into a precedence-feasible one. In the case of predecessor violation, a straightforward repair strategy is to iterate the processing step which caused the violation (e.g., the Juxtapositional Phase) until all constraints are satisfied. Considering the great discrepancy between the immense number of possible permutations (n! for n activities) and the number of feasible solutions, it becomes obvious that such a brute-force method would be too time consuming. Therefore, we handle predecessor constraints in another deterministic and efficient way, using a repair mechanism prior to schedule construction and fitness assignment. Predecessor conflicts cannot occur between activities which can be executed concurrently, i.e., activities which do not rely on predecessor information at the same point in time Ti. Assuming we start with an empty schedule list at time T0, we can calculate the set of parallel activities at T0 and subsequently pick an activity out of this set according to a deterministic strategy. The chosen activity is then appended at the end of the current priority list, and all its relationships within the network of activities are deleted. Repeating this procedure until all activities have been assigned to a spot in the schedule list, we will never violate any predecessor constraints. The pseudo-code shown in Figure 9 describes the repair mechanism in more detail.
As input, the algorithm needs the permutation to be mapped, q, the number of projects, m, the number of activities, n, the three-dimensional array of all projects, DSM[m][n][n] (explained below), and two auxiliary variables, i and j. The output is a precedence-feasible schedule list, s. The algorithm identifies the first activity of q without unscheduled predecessors and assigns it to spot i in s. Then, all dependencies on the selected activity are deleted from the design structure matrix (DSM).5 This simple algorithm scales in complexity as O(n^2).

4 In particular, the crossover operators that preserve predecessor feasibility in SGA designs, such as Union Crossover [9], cannot be applied in the OmeGA and are replaced by cut and splice operators.

Input:  Integer i, j, m, n;  ScheduleList q[n];  Array DSM[m][n][n]
Output: Feasible schedule list s[n]

i ← 0; s ← new ScheduleList[n];
WHILE i < n
    FOR j ← 0 TO n-1
        IF q[j].numberOfPredecessors = 0 AND q[j].isScheduled = false THEN
            s[i] ← q[j];
            BREAK;
        ENDIF
    ENDFOR
    FOR j ← 0 TO DSM[s[i].projectID].length-1
        DSM[s[i].projectID][j][s[i].columnID] ← 0;
    ENDFOR
    s[i].isScheduled ← true;
    i ← i+1;
ENDWHILE

Figure 9: Pseudo-Code for Mapping any Permutation to a Feasible Schedule List

As an example, consider two projects, each with four activities, modeled by the two DSMs in Figure 10(a), and a permutation representing a schedule list in Figure 10(b). The DSMs indicate the precedence relationships between the activities: e.g., activity 1 precedes activity 4, and activity 2 precedes activity 3. The permutation q = {3-7-1-8-5-4-2-6} does not yield a precedence-feasible schedule since, for instance, activity 3 is scheduled prior to activity 2. Applying the algorithm in Figure 9 leads to the following results. Activities 1, 2, 5, and 6 do not depend on any other activities in the set and thus comprise the initial set of parallel activities. The first member of this set to appear in q is activity 1, so the first value of the feasible schedule list s must be 1: s[0] = 1. After deleting the dependency entries for activity 1 in DSM 1, the next iteration of the algorithm begins, detecting a new set of parallel activities: {2,4,5,6}. In this set, activity 5 is the earliest one in q, and consequently s[1] = 5 holds. The dependency entries for activity 5 are then deleted and a new loop starts. Repeating all steps of the algorithm until convergence, we obtain the precedence-feasible schedule list in Figure 10(c), s = {1,5,7,8,4,2,3,6}.
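For concreteness, here is a runnable Python sketch of the repair mechanism. It stores precedence as predecessor sets rather than the DSM array, and since Figure 10's DSMs are not reproduced here, the project 2 relations (5→7, 7→8) are assumptions chosen to be consistent with the worked example; the project 1 relations (1→4, 2→3) are stated in the text.

```python
def repair(q, preds):
    """Map an arbitrary permutation q onto a precedence-feasible schedule
    list: repeatedly append the first activity in q whose predecessors have
    all been scheduled (the deterministic strategy from the text). Assumes
    an acyclic precedence graph; overall effort is O(n^2)."""
    scheduled, s = set(), []
    while len(s) < len(q):
        for a in q:
            if a not in scheduled and preds.get(a, set()) <= scheduled:
                s.append(a)
                scheduled.add(a)
                break
    return s

preds = {3: {2}, 4: {1},   # project 1: 1 -> 4, 2 -> 3 (from the text)
         7: {5}, 8: {7}}   # project 2: assumed for illustration
q = [3, 7, 1, 8, 5, 4, 2, 6]
print(repair(q, preds))  # -> [1, 5, 7, 8, 4, 2, 3, 6]
```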

5 A DSM is an efficient and commonly used method of showing the relationships between the activities in a project [5]. Essentially, it can be understood as the precedence-matrix representation of an activity-on-node network. Given a set of n activities in a project, the corresponding DSM is an n × n matrix where the activities are the diagonal elements and the off-diagonal elements indicate the precedence relationships.


[Figure 10(c) shows the mapped permutation, q = {1, 5, 7, 8, 4, 2, 3, 6}.]

Figure 10: Two Projects (a) Modeled by DSMs with (b) an Unmapped Schedule Priority List and (c) a Precedent-Feasible Schedule List after Executing the Proposed Repair Mechanism
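A compact sketch of this repair mechanism follows (a minimal Python sketch, not the paper's implementation; the predecessor sets are assumed from the Figure 10 example: activity 4 depends on 1, 3 on 2, 7 on 5, and 8 on 7):

```python
def map_to_feasible(q, preds):
    """Repair mechanism of Figure 9: repeatedly take the earliest activity in q
    whose predecessors have all been scheduled, yielding a precedent-feasible list."""
    scheduled, s = set(), []
    while len(s) < len(q):
        for a in q:
            if a not in scheduled and all(p in scheduled for p in preds.get(a, ())):
                s.append(a)
                scheduled.add(a)
                break
    return s

# Assumed precedence structure of the two Figure 10 projects: 1->4, 2->3, 5->7, 7->8
preds = {4: [1], 3: [2], 7: [5], 8: [7]}
print(map_to_feasible([3, 7, 1, 8, 5, 4, 2, 6], preds))  # [1, 5, 7, 8, 4, 2, 3, 6]
```

With the assumed precedence structure, the sketch reproduces the mapped permutation of Figure 10(c).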

4.2. Schedule Generation Schemes

To assign a fitness value (according to a predefined objective function) to a permutation (schedule list) in the OmeGA, a schedule generation scheme (SGS) is necessary for building a real schedule out of a schedule list. Boctor [4] distinguishes between a "serial" and a "parallel" SGS. In a serial SGS, each activity's priority is calculated once, at the beginning of the SGS algorithm, whereas in a parallel SGS an activity's priority is re-determined as necessary at each time step.

The serial SGS proceeds as follows. First, the overall problem duration is broken down into N stages, where N is the total number of activities to be scheduled. This SGS separates the activities into two mutually exclusive and disjoint sets: the scheduled set, S (already scheduled activities), and the decision set, D (unstarted activities that depend only on activities in S). In each stage, one activity is selected from D and scheduled at its earliest precedence- and resource-feasible start time [40], which moves it to set S.

The parallel SGS proceeds as follows. First, the overall problem duration is broken down into time steps. At each time step, the algorithm separates the activities into four mutually exclusive and disjoint sets: the complete set, C (finished activities), the active set, A (ongoing, "already scheduled" activities), the decision set, D (unstarted activities that depend only on activities in C), and the ineligible set, I (activities which depend on activities in A or D). Since preemption is not allowed, the SGS automatically assigns resources to activities in A. If the remaining resources are sufficient to perform the activities in D, then the algorithm adds these to A. If not, then it uses a priority rule to rank the activities in D. The highest-ranking activities are added to A as resources allow. The time step ends when the shortest activity (or activities) in A finishes. Finished activities are moved to C, and activities in I are checked for potential transfer to D. The schedule is complete (i.e., the project duration is known) when all activities are in C.
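The serial SGS can be illustrated with a minimal sketch (one renewable resource and hypothetical data; not the paper's implementation):

```python
def serial_sgs(priority_list, dur, preds, req, capacity):
    """Serial SGS: schedule each activity, in priority-list order, at its earliest
    precedence- and resource-feasible start time; returns finish times."""
    finish, usage = {}, {}  # usage: time period -> units of the resource in use
    for a in priority_list:
        t = max((finish[p] for p in preds.get(a, ())), default=0)  # precedence-feasible
        while any(usage.get(u, 0) + req[a] > capacity for u in range(t, t + dur[a])):
            t += 1                                                 # resource-feasible
        for u in range(t, t + dur[a]):
            usage[u] = usage.get(u, 0) + req[a]
        finish[a] = t + dur[a]
    return finish

# Hypothetical 4-activity example: activity 3 depends on 1; one resource of capacity 1
fin = serial_sgs([1, 2, 3, 4], dur={1: 2, 2: 2, 3: 1, 4: 2},
                 preds={3: [1]}, req={1: 1, 2: 1, 3: 1, 4: 1}, capacity=1)
print(max(fin.values()))  # makespan: 7
```

Because the single resource forces the four activities to run one at a time, the makespan is the sum of durations, 7.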

Unfortunately, it is impossible to predict in advance which SGS will perform best for an arbitrary RCMPSP. As the serial and parallel SGSs exhibit two different behaviors, potentially resulting in two unequal schedules for an identical schedule priority list [32], determining which scheme is best becomes an optimization problem in itself. To address this dilemma, Hartmann [32] proposed a GA-based heuristic, the self-adapting GA, to help determine whether the serial or the parallel SGS is better suited for the underlying problem. Instead of selecting a SGS in advance, the self-adapting GA allows chromosomes to be evaluated via either the parallel or the serial SGS. As the self-adapting GA proceeds, more fit chromosomes will prevail, not only with respect to their schedule list but also in terms of their SGS. For this purpose, the self-adapting GA maintains an additional gene for each chromosome that uniquely specifies its SGS. The modifications necessary to permit such flexibility in the choice of the SGS are described in [32] and were easily incorporated into the OmeGA.

4.3. Hybridization with the 2-opt Heuristic

In general, a GA tends to explore the search space via its crossover operator rather than extensively exploiting specific regions through mutation. This is mainly due to the high crossover probability and the low mutation probability, both of which are necessary for a successful GA.6 However, exploitation of interesting regions within the search space can usually be accomplished efficiently and effectively by a local search. Moreover, local and global optima can sometimes be identified only by a fine-grained local searcher, since single peaks are too difficult to detect for a coarse-grained crossover operator. Both the SGA and the OmeGA can be extended with an efficient local search strategy to combine the positive traits of both approaches. In SGAs, local search is typically applied after crossover and mutation. (Another interesting opportunity is to incorporate the local search into the fitness function itself.)

Aside from its appropriate placement in the GA flow, the choice of a suitable local search strategy for the underlying problem is crucial. A potential approach would be to apply a well-known priority rule heuristic (see section 5) for the RCMPSP to one chromosome in the initial GA population and to the competitive template after each era. In this case, the initial GA population would quickly increase its average fitness, and the final outcome of the GA would never be worse than the outcome of the priority rule itself. Although this local search approach sounds promising, it would in fact be a poor choice. The initial chromosome produced by a priority rule heuristic would too quickly

6 For more information on a reasonable adjustment of crossover and mutation probability, the reader may refer to [22,58,61,74].


dominate the entire GA population, resulting in premature convergence of the GA.7 Typically, this phenomenon arises if the chosen selection scheme provides too much "selection pressure" or intensity. In other words, by placing a highly-fit chromosome into the initial population, the GA is precluded from properly seeking interesting search regions that might contain the global optimum. Instead of a local searcher that strives to push chromosomes to the optimum quickly, thereby risking getting stuck in a local optimum, we favor a strategy which increases the average fitness of the GA population at a more moderate pace, thereby exploring more of the search space. Therefore, we decided to combine the OmeGA with a tailored 2-opt heuristic for the RCMPSP. In the RCMPSP, the 2-opt neighborhood is defined as the set of all feasible solutions that can be reached by a swap of two elements in the priority list.

Since the OmeGA initializes a new population in each era and does not embody separate crossover and mutation phases, but a Juxtapositional Phase instead, we invoke our local search approach at the end of each era. Furthermore, to strike a good balance between computational time and effectiveness, we apply the local search only to the competitive template. Although we execute the 2-opt heuristic just once per era, we do it for the most promising chromosome, which serves as a template for upcoming chromosomes in the next era.

Tailoring the fundamental concept of the 2-opt heuristic to the RCMPSP can be accomplished as follows. Assuming a precedent-feasible schedule list, s, the DSM, and several auxiliary variables, as noted in Figure 11, s must be decomposed into its independent sets of parallel activities (a vector of vectors), followed by a 2-opt search in each of these disjoint sets (see Figure 12). For example, consider the DSMs and s in Figure 10. Due to the predecessor constraints, a traditional swap of two alleles is not generally possible, as it would often lead to a predecessor violation—e.g., exchanging activities 5 and 8 must not be allowed. However, if we determine the three disjoint sets of parallel activities in s—namely {1,2,5,6}, {3,4,7}, and {8}—we can modify the order of activities within these sets as they appear in s without violating any predecessor constraints. In particular, we can apply a swap between two activities in every parallel set, leading to a new schedule list which is subsequently evaluated by a SGS. Figure 12 demonstrates the application of this algorithm, which produces nine feasible schedule lists that differ in exactly two positions from s. Generally, the total number of

7 At this time only mutation can produce slightly new chromosomes. However, the mutation probability is typically set very low. Hence, the chance to generate new genotypes exists but is unlikely.


exchanges for a schedule list with z parallel sets of activities is:

\sum_{i=1}^{z} \frac{z_i^2 - z_i}{2}    (4)

where z_i is the number of activities in the ith parallel set.

Input:
    Integer       h, i, j, m, n, projectID, swapID_1, swapID_2;
    ScheduleList  s[n], t[n];
    Vector        parallel;
    Array         DSM[m][n][n];
Output: Performs 2-opt heuristic for the RCMPSP

h ← 0; i ← 0; j ← 0;
// decompose s into its disjoint sets (bands) of parallel activities
WHILE i < n
    Vector band ← new Vector();
    FOR j to n-1
        IF s[j].numberOfPredecessors = 0 AND s[j].isScheduled = false THEN
            s[j].isScheduled ← true;
            band.add(s[j]);
            i ← i+1;
        ENDIF
        j ← j+1;
    ENDFOR
    j ← 0;
    // remove the banded activities' columns from their projects' DSMs
    FOR j to band.size()-1
        projectID ← band.elementAt(j).projectID;
        FOR h to DSM[projectID].length-1
            DSM[projectID][h][band.elementAt(j).columnID] ← 0;
            h ← h+1;
        ENDFOR
        h ← 0;
        j ← j+1;
    ENDFOR
    parallel.add(band);
    j ← 0;
ENDWHILE
h ← 0; i ← 0; j ← 0;
// evaluate every swap of two activities within the same parallel set
FOR i to parallel.size()-1
    FOR j to parallel.elementAt(i).size()-2
        swapID_1 ← parallel.elementAt(i).elementAt(j);
        h ← j+1;
        FOR h to parallel.elementAt(i).size()-1
            swapID_2 ← parallel.elementAt(i).elementAt(h);
            t ← s;
            t[swapID_1.spotInList] ← swapID_2;
            t[swapID_2.spotInList] ← swapID_1;
            CALL SGS with t;
            h ← h+1;
        ENDFOR
        j ← j+1;
    ENDFOR
    j ← 0;
    i ← i+1;
ENDFOR

Figure 11: Pseudo-code for 2-opt Heuristic in the RCMPSP8

8 The algorithm in Figure 11 describes the 2-opt heuristic only for one chromosome.
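The decomposition and swap neighborhood can be sketched compactly (Python, with the predecessor sets assumed from the Figure 10 example); for s = {1,5,7,8,4,2,3,6} it reproduces the three parallel sets and the nine neighbors counted by equation (4):

```python
def parallel_sets(s, preds):
    """Decompose a precedent-feasible list s into disjoint sets of parallel activities."""
    sets, done = [], set()
    while len(done) < len(s):
        band = [a for a in s if a not in done
                and all(p in done for p in preds.get(a, ()))]
        done.update(band)
        sets.append(band)
    return sets

def two_opt_neighbors(s, preds):
    """All schedule lists reachable by swapping two activities in the same parallel set."""
    pos = {a: i for i, a in enumerate(s)}
    out = []
    for band in parallel_sets(s, preds):
        for i in range(len(band) - 1):
            for j in range(i + 1, len(band)):
                t = list(s)
                t[pos[band[i]]], t[pos[band[j]]] = t[pos[band[j]]], t[pos[band[i]]]
                out.append(t)
    return out

preds = {4: [1], 3: [2], 7: [5], 8: [7]}  # assumed Figure 10 precedence structure
s = [1, 5, 7, 8, 4, 2, 3, 6]
print(len(two_opt_neighbors(s, preds)))   # 9 = (16-4)/2 + (9-3)/2 + (1-1)/2
```

Each neighbor would then be evaluated by an SGS, and the best one kept as the new competitive template.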

[Figure 12 shows the precedent-feasible schedule list s = {1, 5, 7, 8, 4, 2, 3, 6}, the vector parallel of parallel activity sets, and the nine precedent-feasible schedule lists t obtained after the local search.]

Figure 12: Demonstration of Local Search in the RCMPSP

5. COMPUTATIONAL RESULTS

5.1. OmeGA (CGA) vs. SGA

In order to illustrate the superior performance of CGAs compared to SGAs, we tested both designs without local search extensions on artificial functions which allow us to scale the level of difficulty. While a search space can exhibit many problem characteristics, such as epistasis [14], noise [59], or symmetry [36], the GA community mainly examines GA performance using so-called deceptive functions [21]. Deceptive functions attempt to mislead the GA toward local optima by assigning low fitness values to solutions near the global optimum and high fitness values to solutions far from the optimum. The global optimum is thereby isolated and surrounded by low-fitness solutions, like the "needle in a haystack" problem. Kargupta et al. [38] introduced two deceptive functions for combinatorial problems which spawn an extremely difficult search space for a GA. Therein, k deceptive sub-functions (which can be regarded as BBs) of order m are combined into one deceptive function of length l in such a way that only one global optimum exists among l!/(k!)^m local optima. The global optimum can be detected only if the GA subsequently identifies all k sub-functions—i.e., all sub-problems of the overall problem. Problem difficulty can be scaled not only by the number and length of deceptive sub-functions but also by their corresponding encoding. In short, two encoding schemes exist, tight and loose, with the loose encoding causing more difficulties for a GA.9

For tests on deceptive functions, we used the parameters listed in Table 1. The reader may refer to [58] for detailed information on how to set these parameters. Due to prior knowledge of the BB structure of the problem (kBB equals the order of the deceptive function and l equals the length of the deceptive function), we roughly calculated the population size using the equation proposed by Harik [30] and the convergence time according to Goldberg [23]. Figures 13 and 14 display average fitness results for 10 independent test runs.

In our initial tests, we demonstrate the advantage of the OmeGA (a CGA) on a deceptive function of order 4 and length 32 with a single global optimum (fitness value of 32): it performs equally well for easy (tight encoding) and hard (loose encoding) problems. While the SGA requires fewer function calls than the OmeGA to perceive the global optimum in the case of tight-encoded deceptive functions, it clearly struggles as the search space becomes more difficult, and it is unable to detect the global optimum even after many fitness evaluations. Further tests on deceptive functions of varying length, depicted in Figure 14, exhibit the inability of SGAs to deal with difficult (loose encoding) problems. The computational resources necessary to reliably identify the global optimum scale up exponentially with problem difficulty (as predicted by equation 1), which is particularly problematic when the problem size exceeds about 16. Note that the population size depicted on the y-axis of Figure 14 had to be sufficient to detect the global optimum in 9 out of the 10 runs.
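For intuition, a standard binary trap function of order m — a simpler, bit-string analogue of the permutation-ordering deceptive functions of Kargupta et al. used here — can be sketched as:

```python
def trap(block, m=4):
    """Order-m deceptive trap: fitness rises as a block moves AWAY from the
    all-ones optimum, except at the optimum itself (which scores m)."""
    u = sum(block)
    return m if u == m else m - 1 - u

def deceptive(x, m=4):
    """Concatenation of len(x)//m traps; the single global optimum is all ones."""
    return sum(trap(x[i:i + m], m) for i in range(0, len(x), m))

print(deceptive([1] * 32))  # global optimum: 32
print(deceptive([0] * 32))  # strong deceptive attractor: 24
```

A hill-climber following the fitness gradient is pulled toward all zeros, which is why only a method that identifies the building blocks reliably finds the all-ones optimum.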

SGA
  pc: 1.0
  pm: 1/4n
  Selection scheme: TWOR with/without continuously updated sharing and tournament size 4 [58,70]
  Crossover operator: Position-based Crossover, Version 2 [62]
  Mutation operator: Shift mutation [62]

OmeGA (CGA)
  pcut: 2/npop
  ps: 1.0
  pm: 1/4npop
  Selection scheme: TWOR with tournament size 4 [70]
  Number of eras per epoch: 4
  Number of generations per epoch: 60
  Ratio of population size in eras 1-4: 1:1:2:6

Table 1: Test Parameters

9 The encoding scheme influences the so-called defining length [22] of the BBs or sub-functions—the greater the defining length, the greater the problem difficulty.


Figure 13: Test Results for a Tight- and Loose-Encoded Deceptive Function of Length 32: (a) SGA, (b) OmeGA

Figure 14: Comparative Performance of the OmeGA and the SGA on Various Deceptive Functions

5.2. Performance of the Hybrid OmeGA on Generated Test Problems

Due to its problem-independent performance (as demonstrated in the previous section), the CGA design is much better suited for real-world problems than the SGA design, since problem difficulty is unknown in advance of optimization. In order to achieve the best optimization results, we extended the OmeGA with the local search strategy explained in section 4.3 and tested it on 77 test problems. These problems were constructed based on two popular multi-project measures: the Average Resource Load Factor (ARLF) and the Average Utilization Factor (AUF) [48,80]. The ARLF identifies whether the bulk of a project's total resource requirements falls in the front or back half of its critical path duration10 and the relative size of the disparity. For project h, it is defined as:

10 Based on scheduling each activity at its early start time.

ARLF_h = \frac{1}{CP_h} \sum_{t=1}^{CP_h} \sum_{k=1}^{K_{ih}} \sum_{i=1}^{n_h} Z_{iht} X_{iht} \left( \frac{r_{ihk}}{K_{ih}} \right)    (5)

where

Z_{iht} = \begin{cases} -1 & \text{if } t \le CP_h/2 \\ 1 & \text{if } t > CP_h/2 \end{cases}    (6)

X_{iht} = \begin{cases} 1 & \text{if activity } i \text{ of project } h \text{ is active at time } t \\ 0 & \text{otherwise} \end{cases}    (7)

Z_{iht} X_{iht} ∈ {−1, 0, 1}; n_h is the number of activities in project h, K_{ih} is the number of types of resources required by activity i in project h, and r_{ihk} is the amount of resource type k required by activity i in project h. Projects with a negative ARLF are "front loaded" in their resource requirements, while projects with a positive ARLF are "back loaded." The ARLF for a problem is simply the average of the ARLFs of its constituent projects. The AUF indicates the average tightness of the constraints on (i.e., the average amount of contention for) each resource type:

AUF_k = \frac{1}{S} \sum_{s=1}^{S} \frac{W_{sk}}{R_k S_s}    (8)

where R_k is the (renewable) amount of resource type k available at each interval, and S is the number of time intervals in the problem. Using S = 3 intervals, for example, once the projects have been sorted from shortest to longest, such that CP_1 ≤ CP_2 ≤ CP_3, then S_1 = CP_1, S_2 = CP_2 − CP_1, and S_3 = CP_3 − CP_2. The total amount of resource k required over any interval s is given by:

W_{sk} = \sum_{t=a}^{b} \sum_{h=1}^{m} \sum_{i=1}^{n_h} r_{ihk} X_{iht}    (9)

where a = CP_{s−1} + 1, b = CP_s, and r_{ihk} and X_{iht} are defined as above. Since the AUF is essentially a ratio of resources required to resources available, averaged across intervals of problem time, AUF_k > 1 indicates that resource type k is, on average, constrained over the course of a problem. To get the AUF for a problem involving K types of resources:

AUF = Max(AUF_1, AUF_2, …, AUF_K)    (10)
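As a numerical sketch of equations (5)-(7) (hypothetical single-project data, not the paper's generator):

```python
def arlf(cp, activities):
    """ARLF_h per equations (5)-(7). activities: list of (start, finish, requirements)
    where requirements maps resource type -> amount, and an activity is active at
    integer times start..finish inclusive (X_iht = 1)."""
    total = 0.0
    for t in range(1, cp + 1):
        z = -1 if t <= cp / 2 else 1             # Z_iht
        for start, finish, req in activities:
            if start <= t <= finish:             # X_iht = 1
                k_ih = len(req)                  # K_ih: resource types used
                total += z * sum(r / k_ih for r in req.values())
    return total / cp

# One activity needing 2 units of one resource, active only in the first half (t = 1..2)
# of a critical path of length 4: front-loaded, so the ARLF comes out negative.
print(arlf(4, [(1, 2, {"A": 2})]))  # -1.0
```

Shifting the same activity into the back half (t = 3..4) flips the sign, matching the front-/back-loaded interpretation above.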

For the test problems, we generated (using the generator described in [7]) seven random problems, each composed of three projects (i.e., 21 total networks), and with 20 activities per project. Each of the seven problems has a different ARLF setting, varied in integer increments over -3 ≤ ARLF ≤ 3. For each problem, we adjusted the number of resources available at 11 levels, thereby varying the AUF in 0.1

increments over 0.6 ≤ AUF ≤ 1.6.11 This approach, originally taken in [48], yielded 77 test problems.

First, we compared the serial and parallel SGS results, averaged over all 77 problems. To enable drawing reliable conclusions, we invoked the OmeGA 50 independent times on each problem for each SGS. The most important CGA parameters, population size and convergence time, were set to 2000 individuals and four epochs, respectively. The remaining parameters were defined as in Table 1. An unknown property of each problem is its underlying BB structure—the size, scaling, and number of BBs. Hence, for the tests we assumed a "conservative" problem difficulty with BB size 4, even though tests with different values for the size of BBs might lead to better results.12

Table 2 depicts the best average fitness value out of the three different SGSs (serial, parallel, and self-adapting) for each of the 77 problems. In some cases the OmeGA produced identical average fitness values for more than one SGS. Interestingly, the self-adapting SGS "won" in only 21 of 77 cases (27.3%). The serial SGS won in 11 of 77 problems (14.3%). Meanwhile, the parallel SGS won in 39 of 77 cases (51%). When we sum the average and best fitness values and the standard deviations (over all 50 independent runs) for each SGS on all 77 problems, as shown in Table 3, we conclude that the hybrid OmeGA performs best in combination with the parallel SGS. While this conclusion contradicts the insights in [32], this is not necessarily surprising given the different GA design and problem instances.

Legend for Table 2 (best schedule generation strategy): Parallel SGS; Serial SGS; Self-Adapting; ALL Strategies; Parallel / Self-Adapting

AUF \ ARLF      -3       -2       -1        0        1        2        3
0.6            1.76     4.32    12.78     2.00     3.00     3.00     0.00
0.7            4.96    16.40    24.00     5.00    11.16     9.60     2.30
0.8           13.72    29.28    38.92    11.00    25.82    21.32     7.14
0.9           19.62    46.92    59.82    23.94    36.40    34.88    16.90
1.0           30.00    57.24    69.54    34.84    60.96    52.94    22.06
1.1           39.54    74.38    90.50    51.66    71.90    67.32    34.34
1.2           50.98    88.12   112.06    68.08   105.84    75.00    43.16
1.3           59.96   109.50   134.82    79.40   121.64    93.52    50.26
1.4           70.28   125.02   148.50   105.68   142.04   111.46    65.22
1.5           82.36   140.14   171.00   119.72   172.44   131.84    78.68
1.6           94.10   152.66   194.28   142.78   233.10   148.60    91.48

Table 2: Comparison of Different SGS Strategies for the Hybrid OmeGA (average fitness of 50 independent runs)

11 We found that we could not adapt standard single-problem generators and test sets such as ProGen/PSPLIB [42] to create multi-project problems to our specifications.
12 Currently, there is no way to figure out a "good" BB size except by testing all sizes. However, one point is clear: the larger the selected BB size, the more difficult it is to identify these BBs. Thus, we think a BB size of 4 is a conservative and good choice, although it might not be the best for some problems.


                 Sum of Best Results   Sum of Average Results   Sum of Standard Deviations
Serial SGS              5105                  5281.86                    92.29
Parallel SGS            5020                  5166.44                    77.82
Self-Adapting           5021                  5175.00                    83.07

Table 3: Aggregate Performance of Each SGS in Terms of Total Project Lateness (TPL)

6. COMPARISON TO POPULAR PRIORITY-BASED HEURISTICS

We also compared the performance of the OmeGA to 20 popular priority rule heuristics found in the project scheduling literature, as summarized in Table 4. Some of these rules were developed specifically for a multi-project environment, while others have been reported to be successful in a single-project environment. To increase their comparability, we standardized the tie-breaker for all rules to FCFS.

Priority Rule (* = multi-project) — Formula — Comments

1. FCFS—First Come First Served. Formula: Min(ES_il), where ES_il is the early start time of the ith activity from the lth project. Best in study by Bock and Patterson [3].
2. SOF—Shortest Operation First. Formula: Min(d_il), where d_il is the duration of the ith activity from the lth project. Best in study by Patterson [64].
3. MOF—Maximum (longest) Operation First. Formula: Max(d_il).
4. MINSLK*—Minimum Slack. Formula: Min(SLK_il), where SLK_il = LS_il − Max(ES_il, t), LS_il is the late start time of the ith activity from the lth project, and t is the current time step.13 Best in studies by Davis and Patterson [12], Boctor [4], and Bock and Patterson [3].
5. MAXSLK*—Maximum Slack. Formula: Max(SLK_il).
6. SASP*—Shortest Activity from Shortest Project. Formula: Min(f_il), where f_il = CP_l + d_il and CP_l is the critical path duration of the lth project without resource constraints. Best in studies by Kurtulus and Davis [48] and Maroto et al. [57].
7. LALP*—Longest Activity from Longest Project. Formula: Max(f_il).
8. MINTWK*—Minimum Total Work content. Formula: Min( \sum_{k=1}^{K} \sum_{i ∈ AS_l} d_il r_ilk + d_il \sum_{k=1}^{K} r_ilk ), where AS_l is the set of activities already scheduled (i.e., in work) in project l.
9. MAXTWK*—Maximum Total Work content. Formula: Max( \sum_{k=1}^{K} \sum_{i ∈ AS_l} d_il r_ilk + d_il \sum_{k=1}^{K} r_ilk ). Best in studies by Maroto et al. [58] and Lova and Tormos [54].
10. RAN—Random. Activities selected randomly. Best in study by Akpan [1].
11. EDDF—Earliest Due Date First. Formula: Min(LS_il).
12. LCFS—Last Come First Served. Formula: Max(ES_il).
13. MAXSP—Maximum Schedule Pressure. Formula: Max( (t − LF_il) / (d_il W_il) ), where W_il is the percentage of the activity remaining to be done at time t. Also known as "critical ratio."
14. MINLFT—Minimum Late Finish time. Formula: Min(LF_il). Equivalent to MINSLK in the serial scheduling case (Kolisch [43]).
15. MINWCS*—Minimum Worst Case Slack. Formula: Min( LS_i − Max[E(i,j) | (i,j) ∈ AP_t] ), where E(i,j) is the earliest time to schedule activity j if activity i is started at time t, and AP_t is the set of all feasible pairs of eligible, un-started activities at time t. Best in study by Kolisch [43]; without resource constraints, reduces to MINSLK.
16. WACRU*—Weighted Activity Criticality & Resource Utilization. Formula: Max( w \sum_{q=1}^{N_i} (1 + SLK_iq)^{−α} + (1 − w) \sum_{k=1}^{K} r_ik / R_{Max,k} ), where N_i is the number of immediate successors of the ith activity, w is the weight associated with N_i (0 ≤ w ≤ 1), SLK_iq is the slack in the qth immediate successor of the ith activity, and α is a weight parameter. Best in study by Thomas and Salhi [76]; we use w = 0.5 and α = 0.5.
17. TWK-LST*—MAXTWK & earliest Late Start time (2-phase rule). Prioritize first by MAXTWK (without FCFS tie-breaker) and then by Min(LS_il). (Lova and Tormos [54]); minimum late start time (MINLST) was best in study by Davis and Patterson [12].
18. TWK-EST*—MAXTWK & earliest Early Start time (2-phase rule). Prioritize first by MAXTWK (without FCFS tie-breaker) and then by Min(ES_il). (Lova and Tormos [54]).
19. MS—Maximum Total Successors. Formula: Max(TS_il), where TS_il is the total number of successors of the ith activity in the lth project. Best in study by Kolisch [43].
20. MCS—Maximum Critical Successors. Formula: Max(CS_il), where CS_il is the number of critical successors of the ith activity in the lth project; CS_il ⊆ TS_il.

13 t is relevant only when using the parallel SGS, where an activity's slack will diminish the longer it is delayed.

Table 4: Overview of Popular Priority Rules Used for the RCMPSP (Adapted from [6])

To compare results of the non-deterministic CGA with the deterministic priority rules, we used the average fitness values of the CGA instead of the absolute best values out of 50 runs.14 We compared these average values with the best result from any priority rule. Nevertheless, the OmeGA outperformed the priority-rule-based heuristics in 60 of the 77 problems (77.9%) in terms of solution quality (Tables 5 and 6).

Interestingly, Tables 5 and 6 show the priority rules outperforming the OmeGA at low AUF values. When AUF = 0.6, the resources are relatively unconstrained and the consequential delays are small. In these cases, many of the priority rules gave good solutions, while the OmeGA did not. We therefore conclude that using the OmeGA in the context of nominal resource constraints is probably not worth the effort. It is also interesting that the priority rules performed well when AUFs were very high—i.e., when resources were very highly constrained.

With respect to computational time, the priority rules have a clear advantage: the OmeGA required an average of 211 seconds,15 while each priority rule required less than one second. However, optimization of a RCMPSP does not necessarily constitute a real-time application, and the time required for the OmeGA should be acceptable for many purposes. Nevertheless, priority rules remain most practical for very large problems.

14 Since GAs constitute a non-deterministic optimization technique, it is fairer to compare average values than absolute best fitness values.
15 Tests were performed on a PC with a 3.0 GHz Intel Pentium IV processor, 1024 MB RAM, and a Windows XP operating system.
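Ranking a decision set under one of these rules — here SOF with the standardized FCFS tie-breaker — can be sketched as follows (hypothetical activity data, not from the test problems):

```python
def sof_order(decision_set):
    """Rank a decision set by SOF (min duration d_il), breaking ties FCFS (min ES_il)."""
    return [a["id"] for a in sorted(decision_set, key=lambda a: (a["dur"], a["es"]))]

acts = [{"id": "B", "dur": 3, "es": 0},   # hypothetical decision set
        {"id": "A", "dur": 3, "es": 1},
        {"id": "C", "dur": 1, "es": 5}]
print(sof_order(acts))  # ['C', 'B', 'A']
```

C wins on shortest duration; B and A tie on duration, so the FCFS tie-breaker (earlier early start) places B before A. Swapping the key function yields any of the other single-phase rules in Table 4.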


Legend ("Winner"): OmeGA / Priority Rule / Both

AUF \ ARLF      -3       -2       -1        0        1        2        3
0.6            1.86     3.00    11.00     1.00     3.00     3.00     0.00
0.7            4.96    16.40    23.00     4.00    11.16     9.88     2.32
0.8           13.72    29.28    38.92    11.00    26.00    21.72     7.14
0.9           19.62    46.92    59.94    23.94    36.40    34.88    16.92
1.0           30.12    54.00    69.82    34.84    61.12    53.34    22.06
1.1           39.54    74.38    90.66    51.66    72.46    67.32    34.58
1.2           51.00    88.12   112.06    68.08   106.22    75.14    43.22
1.3           59.96    97.00   135.28    79.40   122.06    93.00    50.26
1.4           70.44   111.00   148.50   105.68   142.04   102.00    65.58
1.5           82.38   129.00   168.00   119.72   170.00   131.84    78.68
1.6           94.22   152.66   194.28   142.78   223.00   149.08    91.50

Table 5: Comparative Performance of the OmeGA and the Best-Performing Priority Rule on Each Test Problem (Average Total Project Lateness)

Legend ("Winner"): OmeGA / Priority Rule / Both

AUF \ ARLF       -3        -2        -1         0         1         2         3
0.6           -7.00%    45.33%    16.18%   100.00%    33.33%    15.33%     0.00%
0.7          -17.33%   -13.68%     4.35%    25.00%   -20.29%   -17.67%   -42.00%
0.8           -8.53%   -11.27%   -15.39%   -21.43%   -18.75%   -19.56%   -40.50%
0.9          -18.25%    -9.77%   -10.54%   -11.33%   -28.63%    -8.21%   -10.95%
1.0          -18.59%     6.00%   -10.49%   -18.98%    -8.78%    -8.03%   -15.15%
1.1          -14.04%    -8.17%   -11.98%   -16.68%   -15.74%    -3.83%    -6.54%
1.2          -16.39%    -9.15%    -9.63%    -5.44%    -4.31%   -11.60%    -9.96%
1.3          -14.34%    12.89%    -4.06%    -6.59%    -4.64%     1.14%    -8.62%
1.4          -14.10%    12.63%    -2.30%    -4.79%    -3.37%     9.61%   -10.16%
1.5           -9.47%     8.88%     2.27%    -4.98%     1.64%    -0.12%    -1.65%
1.6          -12.76%    -0.22%    -0.88%    -8.47%     4.53%    -6.82%    -4.69%

Table 6: Percent Difference between the OmeGA and the Best-Performing Priority Rule on Each Test Problem

7. CONCLUSION

In this paper, we present the first results of applying a CGA, the OmeGA, to the RCMPSP. Because slight differences in GA settings can have a large influence on its efficacy and performance, we carefully explored these settings in relation to the RCMPSP. While traditional SGAs scale up exponentially in computational resources (i.e., fitness function evaluations and necessary population size) with increasing problem difficulty, CGAs exhibit sub-exponential scale-up behavior due to their ability to identify the BBs of the solution. Furthermore, we extended the OmeGA with a local search strategy tailored to the RCMPSP. As a basis for tests, we constructed 77 test problems according to the traditional ARLF and AUF measures. The test results were twofold. First, we found that the parallel SGS—not the self-adapting SGS, as stated in [32]—performed best in combination with the OmeGA. Second, we compared

the OmeGA with many well-known priority-rule heuristics, concluding that the OmeGA outperforms the rules in 78% of problem instances. Moreover, the 22% of instances where the rules performed best were concentrated at the very low and very high AUF values, where resources are very slightly or very highly constrained. Thus, we have been able to provide some reasonable guidance on where the OmeGA would be best applied. As problem size increases, we expect even better performance from the OmeGA compared to the rules (which makes it useful for practical applications), even though this performance comes at additional computational expense.16

For future work, we suggest research on even more effective local search strategies for the RCMPSP. With CGAs already being able to identify BBs, and thus interesting regions of the search space, new local search strategies that extensively exploit the BB neighborhood should be beneficial. Furthermore, it would be very worthwhile to develop strategies for accurately estimating the BB structure of the underlying problem.

REFERENCES [1]

E.O.P. Akpan, Priority Rules in Project Scheduling: A Case for Random Activity Selection, Production Planning & Control 11(2) (2000) 165-170. [2] J.C. Bean, Genetic algorithms and random keys for sequencing and optimization, ORSA Journal On Computing 6(2) (1994) 154-160. [3] D. Bock, J. Patterson, A Comparison of Due Date Setting, Resource Assignment, and Job Preemption Heuristics for the Multi-Project Scheduling Problem, Decision Science 21(3) (1990) 387-402. [4] F.F. Boctor, Some Efficient Multi-heuristic Procedures for Resource-constrained Project Scheduling, European Journal of Operational Research 49 (1990) 3-13. [5] T.R. Browning, Applying the Design Structure Matrix to System Decomposition and Integration Problems: A Review and New Directions, IEEE Transactions on Engineering Management, 48(3) (2001) 292-306. [6] T.R. Browning, A.A. Yassine, Resource-Constrained Multi-Project Scheduling: Priority Rule Performance Revisited, TCU M.J. Neeley School of Business, Working Paper, 2006a. [7] T.R. Browning, A.A. Yassine, A Random Generator for Resource-Constrained Multi-Project Scheduling Problems, TCU M.J. Neeley School of Business, Working Paper, 2006b. [8] P. Brucker, A. Drexl, R. H. Möhring, K. Neumann, E. Pesch, Resource-constrained project scheduling: Notation, classification, models, and methods, European Journal of Operational Research 112(1) (1999) 3-41. [9] E. Cantú-Paz, Efficient and accurate parallel genetic algorithms, Kluwer, Norwell, 2000. [10] R. Cheng, M. Gen, Y. Tsujimura, A tutorial survey of job-shop scheduling problems using genetic algorithms, part II: hybrid genetic search strategies, Computers and Industrial Engineering 36(3) (1999) 343-364. [11] E.W. Davis, Project Network Summary Measures and Constrained Resource Scheduling, AIIE 16

The larger the problem instance, the larger (usually) the search space. Certainly this depends on the constraints, too. Since heuristics explore only small subsets of the overall search space, the chance to find the best solution should decrease with increasing search space size. 26

Transactions 7(2) (1975) 132-142.
[12] E.W. Davis, J.H. Patterson, A Comparison of Heuristic and Optimum Solutions in Resource-Constrained Project Scheduling, Management Science 21(8) (1975) 944-955.
[13] L. Davis, Job shop scheduling with genetic algorithms, in: Proceedings of the 1st International Conference on Genetic Algorithms, 1985.
[14] Y. Davidor, Epistasis variance: A viewpoint on GA-hardness, in: Foundations of Genetic Algorithms, 1991.
[15] E. Demeulemeester, W. Herroelen, A branch-and-bound procedure for the resource constrained project scheduling problem, Management Science 38(12) (1992) 1803-1818.
[16] E. Demeulemeester, W. Herroelen, An efficient optimal solution procedure for the resource constrained project scheduling problem, European Journal of Operational Research 90(2) (1996) 334-348.
[17] H. Fang, P. Ross, D. Corne, A Promising Hybrid GA/Heuristic Approach for Open-Shop Scheduling Problems, in: 11th European Conference on Artificial Intelligence (1994) 590-594.
[18] M.R. Garey, D.S. Johnson, Computers and Intractability: A Guide to the Theory of NP-Completeness, W.H. Freeman, 1979.
[19] M. Gen, R. Cheng, Genetic Algorithms and Engineering Design, Wiley, 1997.
[20] S. Ghomi, B. Ashjari, A simulation model for multi-project resource allocation, International Journal of Project Management 20(2) (2002) 127-130.
[21] D.E. Goldberg, Simple genetic algorithms and the minimal, deceptive problem, in: Genetic Algorithms and Simulated Annealing (1987) 74-88.
[22] D.E. Goldberg, Genetic Algorithms in Search, Optimization, and Machine Learning, Addison-Wesley, 1989.
[23] D.E. Goldberg, The Design of Innovation, Kluwer, Norwell, 2002.
[24] D.E. Goldberg, B. Korb, K. Deb, Messy genetic algorithms: Motivation, analysis, and first results, Complex Systems 3(5) (1989) 493-530.
[25] D.E. Goldberg, K. Deb, A comparative analysis of selection schemes used in genetic algorithms, in: Foundations of Genetic Algorithms, 1991.
[26] D.E. Goldberg, K. Deb, H. Kargupta, G. Harik, Rapid, Accurate Optimization of Difficult Problems Using Fast Messy Genetic Algorithms, in: Proceedings of the Fifth International Conference on Genetic Algorithms, 1993.
[27] J.F. Gonçalves, J.J. de Magalhães Mendes, M.G.C. Resende, A Genetic Algorithm for the Resource Constrained Multi-Project Scheduling Problem, AT&T Labs Technical Report TD668LM4, 2004.
[28] G. Harik, D.E. Goldberg, Learning Linkage, in: Foundations of Genetic Algorithms IV, 1997.
[29] G. Harik, F. Lobo, D.E. Goldberg, The Compact Genetic Algorithm, in: Proceedings of the IEEE International Conference on Evolutionary Computation 3(4) (1998) 523-528.
[30] G. Harik, E. Cantú-Paz, D.E. Goldberg, B.L. Miller, The gambler's ruin problem, genetic algorithms, and the sizing of populations, Evolutionary Computation 7(3) (1999) 231-253.
[31] S. Hartmann, A competitive genetic algorithm for resource-constrained project scheduling, Naval Research Logistics 45 (1998) 733-750.
[32] S. Hartmann, A self-adapting genetic algorithm for project scheduling under resource constraints, Naval Research Logistics 49(5) (2002) 433-448.
[33] S. Hartmann, R. Kolisch, Experimental investigation of state-of-the-art heuristics for the resource-constrained project scheduling problem, European Journal of Operational Research 127 (2000) 394-407.
[34] W.S. Herroelen, Project Scheduling - Theory and Practice, Production and Operations Management 4(4) (2005) 413-432.
[35] J.H. Holland, Adaptation in Natural and Artificial Systems, The University of Michigan Press, Ann Arbor, 1975.
[36] C. van Hoyweghen, B. Naudts, D.E. Goldberg, Spin-flip symmetry and synchronization, Evolutionary Computation 10(4) (2002) 317-344.
[37] W.H. Ip, Y. Li, K.F. Man, K.S. Tang, Multi-product planning and scheduling using Genetic Algorithm approach, Computers & Industrial Engineering 38 (2000) 283-296.
[38] H. Kargupta, K. Deb, D.E. Goldberg, Ordering genetic algorithms and deception, in: Parallel Problem Solving from Nature II (1992) 47-56.

[39] H. Kargupta, The gene expression messy genetic algorithm, in: Proceedings of the International Conference on Evolutionary Computation (1996) 814-819.
[40] J.E. Kelley Jr., Critical-Path Planning and Scheduling: Mathematical Basis, Operations Research 9(3) (1961) 296-320.
[41] D. Knjazew, OmeGA: A Competent Genetic Algorithm for Solving Permutation and Scheduling Problems, Kluwer, Norwell, 2002.
[42] R. Kolisch, A. Sprecher, A. Drexl, Characterization and Generation of a General Class of Resource-Constrained Project Scheduling Problems, Management Science 41(10) (1995) 1693-1703.
[43] R. Kolisch, Efficient Priority Rules for the Resource-Constrained Project Scheduling Problem, Journal of Operations Management 14(3) (1996a) 179-192.
[44] R. Kolisch, Serial and Parallel Resource-Constrained Project Scheduling Methods Revisited: Theory and Computation, European Journal of Operational Research 90 (1996b) 320-333.
[45] R. Kolisch, S. Hartmann, Heuristic algorithms for solving the resource constrained project scheduling problem: classification and computational analysis, in: Handbook on Recent Advances in Project Scheduling, Kluwer, Boston, 1998.
[46] R. Kolisch, R. Padman, An integrated survey of deterministic project scheduling, International Journal of Management Science 29(3) (2001) 249-272.
[47] R. Kolisch, R. Padman, An integrated survey of deterministic project scheduling, OMEGA 29 (2001) 249-272.
[48] I. Kurtulus, E.W. Davis, Multi-Project Scheduling: Categorization of Heuristic Rules Performance, Management Science 28(2) (1982) 161-172.
[49] J.K. Lenstra, A.H.G. Rinnooy Kan, Complexity of Scheduling under Precedence Constraints, Operations Research 26(1) (1978) 22-35.
[50] M.J. Liberatore, B. Pollack-Johnson, Factors Influencing the Usage and Selection of Project Management Software, IEEE Transactions on Engineering Management 50(2) (2003) 164-174.
[51] F.S.C. Lam, B.C. Lin, C. Sriskandarajah, H. Yan, Scheduling to minimize product design time using a genetic algorithm, International Journal of Production Research 37(6) (1999) 1369-1386.
[52] G.E. Liepins, M.D. Vose, Representational issues in genetic optimization, Journal of Experimental and Theoretical Artificial Intelligence 2(2) (1990) 4-30.
[53] S. Leu, C. Yang, A GA-based multicriteria optimal model for construction scheduling, Journal of Construction Engineering and Management 125(6) (1999) 420-427.
[54] A. Lova, P. Tormos, Analysis of Scheduling Schemes and Heuristic Rules Performance in Resource-Constrained Multi-project Scheduling, Annals of Operations Research 102 (2001) 263-286.
[55] S.W. Mahfoud, Niching methods for genetic algorithms, Doctoral dissertation, University of Illinois at Urbana-Champaign, 1995.
[56] D.G. Malcolm, Application of a Technique for Research and Development Program Evaluation, Operations Research 7(5) (1959) 646-669.
[57] C. Maroto, P. Tormos, A. Lova, The Evolution of Software Quality in Project Scheduling, in: Project Scheduling: Recent Models, Algorithms and Applications, Kluwer, Boston, 1999.
[58] C. Meier, A.A. Yassine, T.R. Browning, Design Process Sequencing with Competent Genetic Algorithms, ASME Journal of Mechanical Design (2006), forthcoming.
[59] B.L. Miller, Noise, sampling, and efficient genetic algorithms, Doctoral dissertation, University of Illinois at Urbana-Champaign, 1997.
[60] M. Mitchell, An Introduction to Genetic Algorithms, MIT Press, Cambridge, 1996.
[61] H. Mühlenbein, How genetic algorithms really work: Mutation and hillclimbing, in: Parallel Problem Solving from Nature II (1992) 15-26.
[62] T. Murata, H. Ishibuchi, Performance evaluation of genetic algorithms for flowshop scheduling problems, in: Proceedings of the First IEEE Conference on Evolutionary Computation (1994) 812-817.
[63] I.M. Oliver, D.J. Smith, J.R.C. Holland, A study of permutation crossover operators on the traveling salesman problem, in: Genetic Algorithms and Their Applications (1987) 227-230.
[64] J.H. Patterson, Alternative Methods of Project Scheduling with Limited Resources, Naval Research Logistics Quarterly 20(4) (1973) 767-784.
[65] J.H. Payne, Management of Multiple Simultaneous Projects: A State-of-the-Art Review, International Journal of Project Management 13(3) (1995) 163-168.

[66] M. Pelikan, D.E. Goldberg, E. Cantú-Paz, BOA: The Bayesian Optimization Algorithm, in: Proceedings of the Genetic and Evolutionary Computation Conference (1999) 525-532.
[67] P. Pongcharoen, C. Hicks, P.M. Braiden, The development of genetic algorithms for the capacity scheduling of complex products, with multiple levels of product structure, European Journal of Operational Research 152 (2004) 215-225.
[68] P.W. Poon, J.N. Carter, Genetic algorithm crossover operators for ordering applications, Computers & Operations Research 22(1) (1995) 135-147.
[69] K. Sastry, D.E. Goldberg, Modeling tournament selection with replacement using apparent added noise, in: Proceedings of the Genetic and Evolutionary Computation Conference 11 (2001) 129-134.
[70] K. Sastry, D.E. Goldberg, Modeling tournament selection with replacement using apparent added noise, Intelligent Engineering Systems Through Artificial Neural Networks 11 (2001) 129-134.
[71] M. Spinner, Improving Project Management Skills and Techniques, Prentice-Hall, Englewood Cliffs, 1989.
[72] A. Sprecher, Solving the RCPSP efficiently at modest memory requirements, Manuskripte aus den Instituten für Betriebswirtschaftslehre, No. 425, University of Kiel, 1996.
[73] G. Syswerda, Uniform crossover in genetic algorithms, in: Proceedings of the Third International Conference on Genetic Algorithms (1989) 2-9.
[74] D. Thierens, Mixing in genetic algorithms, Doctoral dissertation, Katholieke Universiteit Leuven, 1995.
[75] J.R. Turner, The Handbook of Project-Based Management, McGraw-Hill, United Kingdom, 1993.
[76] P.R. Thomas, S. Salhi, An Investigation into the Relationship of Heuristic Performance with Network-Resource Characteristics, Journal of the Operational Research Society 48 (1997) 34-43.
[77] A.A. Yassine, D. Braha, Four Complex Problems in Concurrent Engineering and the Design Structure Matrix Method, Concurrent Engineering Research & Applications 11(3) (2003) 165-176.
[78] S. Kumanan, J. Jegan Jose, K. Raja, Multi-project scheduling using an heuristic and a genetic algorithm, International Journal of Advanced Manufacturing Technology 31 (2006) 360-366.
[79] V. Valls, F. Ballestín, S. Quintanilla, A Hybrid Genetic Algorithm for the Resource-Constrained Project Scheduling Problem, European Journal of Operational Research (2007), forthcoming.
[80] S. Tsubakitani, R. Deckro, A heuristic for multi-project scheduling with limited resources in the housing industry, European Journal of Operational Research 49 (1990) 80-91.
