Minimizing the Number of Tardy Jobs with Chance Constraints and Linearly Associated Stochastically Ordered Processing Times

by Dan Trietsch and Kenneth R. Baker

Working Paper, 2 February 2006
Revised 21 November 2006

Abstract
We consider the single-machine sequencing model with stochastic processing times and the problem of minimizing the number of stochastically tardy jobs. In general, this problem is NP-hard. Recently, however, van den Akker and Hoogeveen found some special cases that could be solved in polynomial time. We generalize their findings by providing a polynomial solution for any stochastically ordered processing times. We then show that the algorithm applies as well to basic linearly associated processing times, thus allowing more realistic modeling of positive dependencies.

Introduction
Schedulers must control the risk of excessive tardiness penalties, but even stochastic scheduling models rarely address this need. When due dates are decision variables, safety can be incorporated by using balanced safety time (Trietsch, 1993), but when due dates are given and capacity is rigid, it may be necessary to reject some jobs. For this purpose, we need models that minimize the expected total penalties due to rejections and tardiness. An alternative approach, which we take in this paper, is to address the need for safety by constraining the probability of tardiness.

We consider the standard single-machine model with stochastic processing times. Job j has a given due date dj, and scheduling decisions dictate its completion time Cj. In the probabilistic context, we replace the deterministic definition of "on time" by a stochastic one. Define the service level for job j as P(Cj ≤ dj), the probability that job j completes by its due date. Let βj denote a given target probability, 0 ≤ βj ≤ 1. Then the form of a service-level constraint for job j is P(Cj ≤ dj) ≥ βj. We say that job j is stochastically on time if its service-level constraint is satisfied; otherwise, the job is stochastically tardy. A stochastically tardy job may be rejected or, equivalently, placed at the end of the sequence. A subset of jobs is stochastically feasible (or, in our context, simply feasible) if a schedule exists in which all the jobs in it are stochastically on time. Our problem is to find the largest stochastically-feasible set, denoted A. The other jobs are rejected, and become part of a set R. Using standard notation for classifying scheduling problems, we refer to this problem as the stochastic version of 1 | | ΣUj. In the deterministic case, this problem is solved by a polynomial algorithm often referred to as the Moore-Hodgson algorithm (Moore, 1968). The stochastic version of the problem was introduced by Balut (1973) and later proven NP-hard by Kise and Ibaraki (1983). Recently, van den Akker and Hoogeveen (to appear) found some special cases that have polynomial complexity. In this note, we generalize those findings. We use the notation [k] to denote the index (sequence position) of the k-th job in a schedule. A subset of jobs is stochastically feasible if a schedule exists in which the subset is contained in A and stochastically on time. Lemma 1 is a rewording of Theorem 3.6 in the van den Akker and Hoogeveen paper.

Lemma 1. Consider two jobs, i and j. If di ≤ dj and βi ≥ βj, then either there exists an optimal schedule where i precedes j or at least one of jobs i and j must be rejected.

When di ≤ dj and βi ≥ βj for all pairs (i, j), we say that the dj and βj are agreeable. Based on Lemma 1, we henceforth break due date ties by nondecreasing βj; any remaining ties can be broken by SEPT. This tie-breaking mechanism is understood to hold for the EDD sequence. Assuming statistical independence, van den Akker and Hoogeveen obtain the following results.
1. They identify two processing time distributions for which there exists an equivalent deterministic 1 | | ΣUj problem that can be solved by the Moore-Hodgson algorithm.
2. They describe a special case where all job processing times consist of a deterministic component plus an iid stochastic disturbance. For this case, the Moore-Hodgson algorithm also applies if all dj and βj are agreeable.
3. They observe that in the case where all job processing times follow normal distributions, the Moore-Hodgson algorithm does not apply and the problem is NP-hard; however, a polynomial solution is possible if each variance is a constant times the corresponding mean and βj ≥ 0.5. [This can be shown to be equivalent to a case with stochastic dominance. Furthermore, it is sufficient that job means and variances be agreeable in the regular sense.]

Agreeability, which is acknowledged in their paper, and stochastic ordering, which is not, are key properties of their special cases. We show that when the processing times are stochastically ordered, the problem can be solved by a polynomial algorithm. We also show that in the special subcase of agreeable due dates and target levels, the Moore-Hodgson algorithm applies. Finally, we introduce the concept of linearly associated random variables, discuss why it is useful in practice, and show that our problem remains in P (and our algorithm works) for basic linearly associated random variables with stochastically ordered processing times. As described by Moore (1968), the Moore-Hodgson algorithm (i) starts with the EDD sequence; (ii) identifies the smallest k for which a subset made up of jobs [1] through [k] is infeasible (i.e., where job [k] is the only tardy job); (iii) allocates the longest job in any such infeasible subset to R; and (iv) repeats (ii) and (iii) until the schedule is feasible for all jobs not in R. For stochastic application of this algorithm, we consider a subset feasible if it is stochastically feasible; and in an infeasible subset, the job with the largest expected processing time is the one we reject (i.e., assign to R). In what follows, we denote stochastic dominance with probability one (or almost surely) by ≤as (or ≥as), regular stochastic dominance by ≤st (or ≥st), and we use ≤ex (or ≥ex) to denote a similar ordering by expectation. None of these relationships requires statistical independence (Ross, 1996).
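To make the stochastic application concrete, the variant of the Moore-Hodgson algorithm just described can be sketched in a few lines. The sketch below assumes normally distributed, independent processing times with agreeable means and variances (so the times are stochastically ordered and every completion time has a closed-form normal distribution); the function name and the job-tuple format are our own illustrative conventions, not notation from the paper.

```python
import math

def phi(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def stochastic_moore_hodgson(jobs):
    """jobs: list of (mean, variance, due_date, beta) tuples, already in
    EDD order with ties broken by nondecreasing beta, then SEPT.
    Returns (accepted, rejected) as lists of job indices."""
    accepted, rejected = [], []
    for j, (mu, var, d, beta) in enumerate(jobs):
        accepted.append(j)
        # Only the newly appended job can be stochastically tardy here;
        # its completion time is the normal sum of all accepted jobs.
        m = sum(jobs[i][0] for i in accepted)
        v = sum(jobs[i][1] for i in accepted)
        level = phi((d - m) / math.sqrt(v)) if v > 0 else float(m <= d)
        if level < beta:
            # Reject the job with the largest expected processing time;
            # one rejection per infeasibility suffices (Theorem 1 below).
            worst = max(accepted, key=lambda i: jobs[i][0])
            accepted.remove(worst)
            rejected.append(worst)
    return accepted, rejected
```

For instance, with three jobs (mean, variance, due date, β) = (2, 0.2, 4, 0.9), (6, 0.6, 7, 0.9), (3, 0.3, 10, 0.9), the middle job is rejected and the other two are accepted.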


Polynomial Algorithms with Stochastic Dominance
We begin with a lemma that helps us determine which job to reject.

Lemma 2. Assume we are given a sequence of n jobs with stochastically-ordered and independent processing times (pj). Suppose that we must reject exactly m out of the first k jobs (where 1 ≤ m ≤ k < n) and that our objective is to minimize the number of stochastically tardy jobs among the last (n − k) jobs in the given sequence. Then it is optimal to reject the m stochastically largest jobs.

Proof. »» Let X, Y, V, W, and S be independent random variables. If X ≥st Y and V ≥st W, then X + V ≥st Y + W (Ross, 1996). Similarly, S + X ≥st S + Y and S − X ≤st S − Y. Therefore, the sum of the processing times of the m largest jobs is stochastically larger than the sum for any other m jobs, and the sum of the processing times of the remaining (k − m) jobs is stochastically smallest among all possible such subsets. Accordingly, the completion time of each of the (n − k) jobs that follow is stochastically minimized. Therefore, rejecting the m stochastically largest jobs maximizes the service levels of those (n − k) jobs and thus minimizes the number of stochastically tardy jobs. ««
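The dominance-preservation step in this proof (X ≥st Y and V ≥st W imply X + V ≥st Y + W for independent variables) can be checked exactly for discrete distributions. A minimal sketch, using two-point distributions of our own choosing:

```python
from itertools import product

def convolve(p, q):
    """Distribution of the sum of two independent discrete random
    variables, each given as a {value: probability} dict."""
    out = {}
    for (x, px), (y, qy) in product(p.items(), q.items()):
        out[x + y] = out.get(x + y, 0.0) + px * qy
    return out

def dominates(p, q):
    """True if p >=st q, i.e., P(p > t) >= P(q > t) for every t."""
    tail = lambda dist, t: sum(pr for v, pr in dist.items() if v > t)
    return all(tail(p, t) >= tail(q, t) - 1e-12 for t in set(p) | set(q))

# Illustrative two-point distributions with X >=st Y and V >=st W
X = {1: 0.5, 3: 0.5}; Y = {1: 0.8, 3: 0.2}
V = {2: 0.4, 5: 0.6}; W = {2: 0.7, 5: 0.3}
```

Here `dominates(convolve(X, V), convolve(Y, W))` holds, as Lemma 2's proof requires.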

Theorem 1. Assume stochastically-ordered and independent processing times (pj), and assume that all pairs of jobs have agreeable dj and βj. Then the Moore-Hodgson algorithm minimizes the number of stochastically-tardy jobs.

Proof. »» If all jobs are feasible, the Theorem holds by Lemma 1 (because the algorithm begins with the EDD sequence). Now assume that we encounter infeasibility, and recall that the Moore-Hodgson algorithm guarantees that it is enough to reject one job per such occurrence. Let v ≥ 0 denote the number of jobs in the previous (feasible) subset when we first encounter infeasibility, so [v + 1] is the index of the job for which the infeasibility occurs. Consider two cases: In Case 1, Job [v + 1] is the stochastically largest job among the (v + 1) jobs in the current subset. Rejecting this job achieves subset feasibility. Moreover, by Lemma 2 (with m = 1), rejecting Job [v + 1] is the best choice for the jobs that follow the subset. In Case 2, suppose Job [u], where u ≤ v, is the stochastically longest job. To complete the proof it remains to show that after rejecting Job [u], Jobs [u + 1], [u + 2], …, [v], [v + 1] will be feasible in positions u, u + 1, …, v − 1, v. (We continue to identify jobs by their former positions.) For Jobs [u + 1], [u + 2], …, [v], feasibility is preserved because their completion times cannot increase. Now consider the completion time of the last job, C[v + 1], and compare it to the former completion time of Job [v], denoted by C[v]. Let the completion time of the current next-to-last job be C0; then C[v + 1] = C0 + p[v + 1] and C[v] = C0 + p[u]. But p[v + 1] ≤st p[u], so C[v + 1] ≤st C[v]. Therefore, P(C[v + 1] ≤ d[v]) ≥ β[v]. But d[v + 1] ≥ d[v] (by EDD) and β[v] ≥ β[v + 1] (by agreeability), so P(C[v + 1] ≤ d[v + 1]) ≥ P(C[v + 1] ≤ d[v]) ≥ β[v] ≥ β[v + 1]. ««

Thus, we can apply the Moore-Hodgson Algorithm directly when (a) the dj and βj values are agreeable and (b) processing times are independent and stochastically ordered. We now remove the agreeability restriction and present another polynomial procedure, which runs in O(n³) time, for solving the problem. The procedure starts with all jobs unscheduled, but ultimately assigns all jobs either to set A of accepted jobs or to set R of rejected jobs. We treat the jobs in the order of Shortest Expected Processing Time (SEPT), which is equivalent to stochastic ordering from shortest to longest. Without agreeability, we cannot rely on the EDD sequence to determine whether a set is feasible. Instead, we use the feasibility check developed by van den Akker and Hoogeveen, which verifies feasibility for any set of k jobs in O(k²) time, assuming that convolutions and similar calculations take constant time. The feasibility check starts with all jobs unsequenced. The distribution of C[k] can be determined, so the feasibility of any job in that position can be ascertained. If any unsequenced job is feasible in position [k], we assign it to that position, reduce k by 1, and repeat for the remaining unsequenced jobs, thus constructing a sequence from back to front. A feasible sequence is obtained if and only if the initial set of k jobs is feasible. If a set of jobs is feasible, then all its subsets are also feasible. Therefore, it is possible to construct a feasible set from progressively larger feasible subsets of it. The following algorithm obtains the optimal solution by this approach.

SEPT-Feasibility (S-F) Algorithm
1. Index the jobs in SEPT order; initialize sets A and R as empty; set i = 1.
2. Tentatively add the first unscheduled job to set A.
3. Run the feasibility check for the i jobs in set A. If it fails, assign the tentative job permanently to R. Otherwise, leave the job in A and increment i by 1. Return to step 2 until all jobs are scheduled.
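The back-to-front feasibility check and the S-F Algorithm can be sketched compactly when specialized to independent normal processing times, so that the distribution of C[k] has a closed form (the general case replaces the normal tail computation with convolutions). Function names and the job-tuple format are our own illustrative conventions.

```python
import math

def phi(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def feasible(jobs, subset):
    """Back-to-front feasibility check in the style of van den Akker and
    Hoogeveen. jobs[i] = (mean, variance, due_date, beta); subset is a
    list of job indices. Builds a sequence from the last position forward."""
    remaining = list(subset)
    while remaining:
        # Whatever the order, the completion time of the last open
        # position is the sum of all remaining processing times.
        m = sum(jobs[i][0] for i in remaining)
        v = sum(jobs[i][1] for i in remaining)
        for i in remaining:
            _, _, d, beta = jobs[i]
            level = phi((d - m) / math.sqrt(v)) if v > 0 else float(m <= d)
            if level >= beta:
                remaining.remove(i)  # assign job i to the last open position
                break
        else:
            return False  # no job is feasible in the last open position
    return True

def sept_feasibility(jobs):
    """S-F Algorithm; jobs must already be indexed in SEPT order."""
    A, R = [], []
    for j in range(len(jobs)):
        if feasible(jobs, A + [j]):
            A.append(j)
        else:
            R.append(j)
    return A, R
```

Each of the n feasibility checks costs O(k²) position tests, giving the O(n³) bound stated above.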


Theorem 2. Assume stochastically-ordered and independent processing times (pj). The SEPT-Feasibility Algorithm minimizes the number of stochastically-tardy jobs.

Proof. »» At stage k, the S-F Algorithm addresses the first k jobs in SEPT order, and we denote this set as S[k]. Let A[k] = {a[1], …, a[mk]} be the subset of S[k] accepted by the S-F Algorithm, where mk = |A[k]|. The theorem is true if and only if A[k] is optimal for all k. We proceed by induction on k; i.e., we establish that the theorem is correct for S[1]; for k ≥ 2 we assume it is correct for S[k − 1], and we prove that it must be correct for S[k]. For k = 1, the S-F Algorithm accepts Job 1 if and only if it is feasible, so A[1] must be optimal for S[1]. For k ≥ 2, the S-F Algorithm performs a feasibility check for {a[1], …, a[m(k − 1)], k}. If this set is feasible, then A[k] = {a[1], …, a[m(k − 1)], k} and because A[k − 1] is optimal by assumption and |A[k]| > |A[k − 1]|, A[k] must be optimal. So assume the set {a[1], …, a[m(k − 1)], k} is not feasible and trace the feasibility check on {a[1], …, a[m(k − 1)], k}. If any jobs are feasible in the last positions, mk, m(k − 1), …, we can ignore them because they cannot cause infeasibility in an earlier job. By the assumed infeasibility we know that there exists a set of j ≥ 1 jobs, of which Job k is one, none of which is feasible in position j. By feasibility of A[k − 1] we know that it is sufficient to remove one of these jobs. Furthermore, Jobs (k + 1), (k + 2), …, n that will subsequently be examined by the S-F Algorithm are stochastically larger than Job k (and thus larger than each of the j − 1 remaining jobs), so none of them will be feasible in one of the first j − 1 positions without removing at least one of the existing jobs from A as well. We now invoke Lemma 2 to select Job k, which is indeed the job that the S-F Algorithm will reject. Hence, as we start with an optimal A[k − 1] and take the optimal next step, the resulting A[k] must be optimal too. ««

Theorem 2 is more general than Theorem 1, but the complexity associated with Theorem 1, like that of the Moore-Hodgson algorithm, is O(n log n); since each operation here involves convolutions and numerical function checks, efficiency is important. It is possible to achieve better average complexity in various ways (e.g., utilizing Lemma 1 to impose a partial order), but this is beyond our present scope.

Linearly Associated Random Variables
When random variables are positively correlated as a result of common causes of variation impacting more than one of them in the same direction, they satisfy the definition of associated random variables (Esary et al., 1967). Random variables are associated if the correlation between any positive nondecreasing functions of them is nonnegative. Independent random variables are associated (weakly), but negatively correlated ones are not. Association may arise not only by adding the same random variable to two or more independent random variables, but also if two or more positive random variables are multiplied by the same positive bias element (Trietsch, 2005). For example, if a regular worker may be sick tomorrow, and the replacement worker is slower, then this constitutes a bias element, and for scheduling purposes tomorrow's job processing times are positively dependent. If the quality of a particular tool deteriorates, then those jobs that require it may take longer to process, and so on. In general, it is likely that various causes impact different subsets of jobs in such a way that positive dependence is introduced between them in various degrees. The case of associated processing times is significant in practice because there are many such common causes of variation that impact more than one job in the same direction. We say that two nonnegative random variables, X1 and X2, are linearly associated if there exist four independent nonnegative random variables, R1, R2, Z and B, and two nonnegative scalars, α1 and α2, such that X1 = (R1 + α1Z)B and X2 = (R2 + α2Z)B. If we set Z = 0 and B = 1, then X1 = R1, X2 = R2, and they are independent by assumption (and thus associated). At the other extreme, if R1 and R2 are 0, then X1 and X2 are proportional (and thus associated). Here B models the common bias shared by X1 and X2, whereas Z represents any additive element they may share. In what follows we assume α1 = α2 = 1.
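A small simulation can illustrate the definition: drawing X1 = (R1 + Z)B and X2 = (R2 + Z)B with a shared additive element Z and a shared bias B produces a clearly positive correlation, as association predicts. The specific distributions below are illustrative choices of ours, not taken from the paper.

```python
import random

def linearly_associated_pair(rng, alpha1=1.0, alpha2=1.0):
    """One sample of (X1, X2) with Xi = (Ri + alpha_i * Z) * B.
    R1, R2, Z, B are independent and nonnegative, per the definition."""
    R1 = rng.expovariate(1.0)         # idiosyncratic parts
    R2 = rng.expovariate(0.5)
    Z = rng.expovariate(1.0)          # shared additive element
    B = rng.lognormvariate(0.0, 0.3)  # shared multiplicative bias
    return (R1 + alpha1 * Z) * B, (R2 + alpha2 * Z) * B

def sample_correlation(n=20000, seed=1):
    """Sample Pearson correlation between X1 and X2."""
    rng = random.Random(seed)
    xs, ys = zip(*(linearly_associated_pair(rng) for _ in range(n)))
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    vx = sum((x - mx) ** 2 for x in xs) / n
    vy = sum((y - my) ** 2 for y in ys) / n
    return cov / (vx * vy) ** 0.5
```

With these parameters the correlation settles around 0.4; setting Z to zero and B to one recovers independence, with correlation near zero.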

Theorem 3. Given four nonnegative independent random variables—R1, R2, Z and B—let X1 = (R1 + Z)B and X2 = (R2 + Z)B. Then X1 ≤ex X2 if and only if R1 ≤ex R2; X1 ≤st X2 if and only if R1 ≤st R2; and X1 ≤as X2 if and only if R1 ≤as R2.

Proof. »» For ≤ex, by independence, E(Xj) = E(B)E(Rj + Z), so the theorem holds. For ≤st, let W1 = R1 + Z and W2 = R2 + Z; then W1 ≤st W2 if and only if R1 ≤st R2. By the definition of stochastic dominance, for nonnegative random variables, log W1 ≤st log W2 if and only if W1 ≤st W2. Therefore log W1 + log B ≤st log W2 + log B (and thus W1B ≤st W2B) if and only if R1 ≤st R2. Finally, the proof for ≤as is essentially the same as for ≤st. ««
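The ≤st part of Theorem 3 can be verified exactly for discrete distributions by composing the marginal distributions of X1 and X2. (The shared Z and B make X1 and X2 dependent, but each marginal is a composition of independent parts, which is all that ≤st concerns.) The distributions below are illustrative choices of ours.

```python
from itertools import product
import operator

def combine(p, q, op):
    """Distribution of op(P, Q) for independent discrete variables,
    each given as a {value: probability} dict."""
    out = {}
    for (x, px), (y, qy) in product(p.items(), q.items()):
        out[op(x, y)] = out.get(op(x, y), 0.0) + px * qy
    return out

def st_dominated(p, q):
    """True if p <=st q, i.e., P(p > t) <= P(q > t) for every t."""
    tail = lambda dist, t: sum(pr for v, pr in dist.items() if v > t)
    return all(tail(p, t) <= tail(q, t) + 1e-12 for t in set(p) | set(q))

# Illustrative discrete choices with R1 <=st R2
R1 = {1: 0.7, 4: 0.3}; R2 = {1: 0.4, 4: 0.6}
Z = {0: 0.5, 2: 0.5}
B = {1: 0.5, 3: 0.5}

# Marginals of X1 = (R1 + Z)B and X2 = (R2 + Z)B
X1 = combine(combine(R1, Z, operator.add), B, operator.mul)
X2 = combine(combine(R2, Z, operator.add), B, operator.mul)
```

As the theorem states, `st_dominated(X1, X2)` holds here because `st_dominated(R1, R2)` does.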

To construct general scheduling models with linear association, we might wish to assume that there are k bias elements (generalizing B in the definition of linear association), such as workers, tools, weather, etc. Similarly, we can model m possible common elements, generalizing Z in the definition, and let each job incorporate a weighted subset of them. Then each job is subject to a subset of biases and a subset of common elements. For each particular pair of jobs, the product of the bias elements in the intersection of bias elements acts as B, and the intersection of common elements acts as Z. Bias and common elements that are not shared by the two jobs can be incorporated into Rj. Such a model, however, poses a very significant estimation challenge, and for this reason we simplify by using just one bias element and one common element (with the same weight everywhere). We refer to the result as the case of basic linear association. For basic linear association, Theorem 3 holds for any number of stochastically ordered jobs because the same B and the same Z apply to all processing times. Theorem 1 required statistical independence because it relied on Lemma 2, and a similar observation applies to Theorem 2. But if in the proof of Lemma 2 we allow the random variables X, Y, V, W, and S to be linearly associated with the same B and Z everywhere, then by an argument similar to the proof of Theorem 3, the lemma remains valid. Similarly, Lemma 1 and the feasibility check can be shown to hold for linearly associated random variables. Therefore, Theorems 1 and 2 can be generalized for linearly associated random variables. Hence, our models and results require a weaker condition than statistical independence.

Conclusion
When constructing schedules, practitioners must consider safety explicitly. Sometimes this necessitates rejecting some jobs to make others more likely to be on time. van den Akker and Hoogeveen (to appear) showed that, although the problem of minimizing the number of stochastically tardy jobs on a single machine is NP-hard, the Moore-Hodgson Algorithm applies in some special cases. We elaborated on their result in three ways. First, we imposed a weaker condition, namely stochastic ordering of the processing times. Second, we showed that with stochastic ordering the problem is in P. Finally, we introduced linearly associated random variables and showed that they model realistic processing-time dependencies better than statistically independent random variables. Our first two results are among the potentially many models originally proven under a stochastic independence assumption that can be generalized by allowing some form of linear association.

References
van den Akker, J.M. and J.A. Hoogeveen (to appear) "Minimizing the Number of Late Jobs in Case of Stochastic Processing Times with Minimum Success Probabilities," Journal of Scheduling.
Balut, S.J. (1973) "Scheduling to minimize the number of late jobs when set-up and processing times are uncertain," Management Science 19, 1283-1288.
Esary, J.D., F. Proschan and D.W. Walkup (1967) "Association of Random Variables, with Applications," Annals of Mathematical Statistics 38, 1466-1474.
Kise, H. and T. Ibaraki (1983) "On Balut's Algorithm and NP-Completeness for a Chance Constrained Scheduling Problem," Management Science 29, 384-388.
Moore, J.M. (1968) "An n Job, One Machine Sequencing Algorithm for Minimizing the Number of Late Jobs," Management Science 15, 102-109.
Ross, S.M. (1996) Stochastic Processes, 2nd Ed., Wiley.
Trietsch, D. (1993) "Scheduling Flights at Hub Airports," Transportation Research, Part B (Methodology) 27B, 133-150.
Trietsch, D. (2005) "The Effect of Systemic Errors on Optimal Project Buffers," International Journal of Project Management 23, 267-274.
