Constraint Modeling in the Context of Academic Task Assignment

Robert Glaubius and Berthe Y. Choueiry

Department of Computer Science and Engineering, 115 Ferguson Hall, University of Nebraska-Lincoln, Lincoln NE 68588-0115

glaubius|[email protected]

1 Introduction

We explore some fundamental issues in the modeling, implementation, and processing of non-binary constraints. These issues include constraint representation and processing efficiency. Our work builds on that of Gent et al. [3], but we focus on the practical aspects of using non-binary constraints in a real-world application. Techniques for reformulating non-binary constraints into binary ones [1] are not practical for high-arity constraints such as capacity constraints [4]. Our practical experience reinforces the perception that the study of non-binary constraints requires careful attention. We first motivate our need to express practical requirements as non-binary constraints, then explore methods for dealing with high-arity constraints for which the conventional methods [1] become impractical.

We discuss these needs in the context of a specific real-world application: the assignment of graduate teaching assistants (GTAs) to courses in the Computer Science department of the University of Nebraska-Lincoln. The idea for this particular application is borrowed from Rina Dechter, at the University of California, Irvine. In addition to the theoretical motivations of our study, we undertake these investigations in order to aid our department in this task.

Each semester, a pool of twenty-five to forty GTAs must be assigned as graders or instructors to the majority of courses offered during that semester. Typically, the number of courses offered is between fifty-five and seventy. In the past, this assignment has been made manually and has involved at least three administrators (i.e., the department chair, the vice-chair, and the Graduate Program secretary) in addition to the faculty and students concerned. Besides requiring a large and painful investment of effort, this process tends to produce less than satisfactory results (see footnote 1).
We expect that (semi-)automation of this process will facilitate the discovery of substantially less problematic solutions, as was largely confirmed by our first field experiments on solving the assignment of GTAs for Fall 2001. Currently, we are working on the theoretical questions of the representation and processing of non-binary constraints that we isolated in our practical study.

This document is meant as an extended abstract and an introduction to our research. It is organized as follows. Section 2 formalizes the graduate teaching assistant assignment problem. Section 3 examines our approach to modeling this problem. Section 4 briefly discusses strategies for confronting high-arity constraints, and we conclude in Section 5.

Footnote 1. GTAs are often assigned to help with courses that have time conflicts with their other commitments, or assigned to courses in which they have inadequate proficiency, among other difficulties.

1.1 Definitions

While we expect that the reader of this summary is familiar with ideas from constraint processing in general, we offer here a brief overview of some of the terms we use throughout this work. A Constraint Satisfaction Problem (CSP) is a triple, $P = (V, D, C)$, where $V = \{V_1, V_2, \ldots, V_n\}$ is a set of variables, and $D = \{D_{V_1}, D_{V_2}, \ldots, D_{V_n}\}$ is the set of variable domains, such that $D_{V_i}$ is the domain of variable $V_i$. When the domains are finite, the CSP is said to be finite. $C = \{C_i, C_{j,k}, \ldots, C_{i,j,\ldots,m}, \ldots, C_n\}$ is a set of constraints such that $C_{i,j}$ indicates a constraint between variables $V_i$ and $V_j$. For a given constraint $C_{i,j,\ldots,m}$, the set of variables $\{V_i, V_j, \ldots, V_m\}$ is called the constraint's scope, and the size of this set is the constraint's arity. A constraint $C_{i,j,\ldots,m}$ is said to be defined by extension when it enumerates, in a list, all allowed tuples. This is in contrast to a constraint defined by intension, where a predicate is used to determine whether or not a given assignment is consistent.
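The distinction between extension and intension can be illustrated with a minimal Python sketch. The variable names, domains, and the "less than" relation below are hypothetical and chosen only for illustration; they are not taken from the GTA application.

```python
from itertools import product

# Hypothetical toy CSP: two variables with small finite domains.
variables = ["V1", "V2"]
domains = {"V1": {1, 2, 3}, "V2": {1, 2, 3}}

# A binary constraint over (V1, V2); its arity is the size of its scope.
scope = ("V1", "V2")
arity = len(scope)

# Definition by extension: list every allowed tuple explicitly.
allowed_tuples = {(1, 2), (1, 3), (2, 3)}

def check_extension(assignment):
    """Consistent iff the projected tuple appears in the allowed list."""
    return tuple(assignment[v] for v in scope) in allowed_tuples

# Definition by intension: a predicate decides consistency.
def check_intension(assignment):
    """Same relation, expressed as the predicate V1 < V2."""
    return assignment["V1"] < assignment["V2"]

# Both definitions agree on every complete assignment of the scope.
agree = all(
    check_extension(dict(zip(scope, values)))
    == check_intension(dict(zip(scope, values)))
    for values in product(*(sorted(domains[v]) for v in scope))
)
```

The extensional form stores one tuple per allowed combination, so its size grows with the product of the domain sizes in the scope, whereas the intensional form stores only the predicate.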

2 Informal problem description

The graduate teaching assistant (GTA) assignment problem can be stated as follows. Given a set of graduate teaching assistants and a set of courses for a semester, find an assignment of GTAs to courses that is consistent with all constraints specified on the courses and 'satisfactory' according to the number of courses assigned and the preferences expressed by the GTAs. The latter criterion makes the problem an optimization one.

This problem can naturally be formulated as a CSP. Courses are variables whose domains consist of the available GTAs. Constraints in the system are based on the practical constraints elicited from informal descriptions provided by the CSE department. We define them by extension when practical (e.g., for unary constraints), and by intension otherwise. We next discuss the variables, values, and constraints in this problem. We also briefly discuss the solution 'quality' for this problem.

A course has several features that may directly or indirectly affect which GTAs can be assigned to it. These include the course type, such as lecture, lab, or recitation; GTAs typically grade homework for lecture courses, but teach labs and recitations. The course duration is another important feature. If a course meets for only half of a semester, we should take into account that a GTA assigned to this course may be free before the course begins or after it ends. The meeting time is important; it is not good practice to force a GTA to teach during a course he or she is enrolled in. Finally, we want to take into account the expected difficulty, or load, that a course places on an assigned GTA.

The attributes of a GTA are also important, in that these attributes are usually the reason that a GTA is constrained from assignment to a given course. For instance, a GTA's enrollment may conflict with a course that we would like to assign him or her to. An international student requires ITA certification before he or she can teach a lab or recitation. We also must take into account whether the GTA has received a half-time teaching assistantship (TAship) or a full-time TAship. The type of TAship determines the GTA's capacity: a GTA with only a half-time TAship should not be assigned more than a set workload. Finally, we proposed and introduced a new attribute into the department's data collection mechanism: GTAs can now indicate an integer-valued preference in the range [0, 5] for each course. A preference of 0 indicates that the GTA should not be considered, while 5 indicates a strong preference for the assignment.

Above, we summarized most of the constraints in our application. In Section 3, we discuss them and how we implemented them in more detail.

Typically, a solution is satisfactory when all variables are assigned and no constraint is violated. Because this problem is often over-constrained in practice, we relax the first criterion. Therefore, a solution is satisfactory if it breaks no constraints. Since this would allow solutions that leave some variables uninstantiated, we introduce several optimization criteria. The main optimization criterion is to choose the solution with the largest number of assigned variables (an absolute requirement of the department). To break ties between solutions, we choose either the solution that maximizes the geometric mean of GTA preferences or the solution that maximizes the minimum preference.
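The ranking described above can be sketched as a lexicographic key: number of assigned courses first, then one of the two tie-breakers. This is only an illustrative Python sketch; the function names, the data layout (a dictionary from course to GTA, with `None` for an unassigned course), and the sample preferences are assumptions, not the authors' implementation.

```python
import math

def assigned_preferences(solution, pref):
    """Preferences of the (GTA, course) pairs actually used by a solution."""
    return [pref[(gta, course)]
            for course, gta in solution.items() if gta is not None]

def geometric_mean(values):
    """Geometric mean of positive numbers; 0.0 for an empty list."""
    return math.prod(values) ** (1.0 / len(values)) if values else 0.0

def rank_key(solution, pref, tie_break="geometric"):
    """Primary criterion: number of assigned courses. Secondary: tie-breaker."""
    prefs = assigned_preferences(solution, pref)
    if tie_break == "geometric":
        secondary = geometric_mean(prefs)
    else:  # "maxmin": maximize the minimum preference
        secondary = min(prefs) if prefs else 0
    return (len(prefs), secondary)

# Hypothetical preferences (zero preferences are already excluded by a
# unary constraint, so only values 1..5 appear here).
pref = {("g1", "c1"): 5, ("g2", "c2"): 2, ("g2", "c1"): 3, ("g1", "c2"): 3}

sol_a = {"c1": "g1", "c2": "g2"}   # preferences 5 and 2
sol_b = {"c1": "g2", "c2": "g1"}   # preferences 3 and 3
sol_c = {"c1": "g1", "c2": None}   # leaves one course unassigned

candidates = [sol_a, sol_b, sol_c]
best_geometric = max(candidates, key=lambda s: rank_key(s, pref, "geometric"))
best_maxmin = max(candidates, key=lambda s: rank_key(s, pref, "maxmin"))
```

On this sample data the two tie-breakers disagree: the geometric mean favors `sol_a` (one very happy GTA), while the max-min rule favors `sol_b` (no GTA with a low preference). Either way, `sol_c` ranks last because it assigns fewer courses.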

3 Approach and modeling

3.1 Approach

The first stage of our work focuses on the elicitation of the problem description from the department, as discussed in Section 2. This consists of determining the variables that must receive assignments (courses) and the values to assign to them (GTAs). Of utmost importance at this stage is determining, through intensive and iterative discussions with the department staff, the informal description of the constraints that must be taken into account when assigning GTAs to courses.

The second stage of our work is the implementation of this system. An important question at this point is how best to handle some of the complex constraints observed in practice. Frequently it is most natural to describe complex constraints as non-binary constraints, including global constraints. For example, a global constraint may be the most natural way to describe a constraint on capacity.

Naturally, we iterated over these two stages, repeatedly refining both our model and the rules followed so far by the department.

3.2 Problem model

While the implementation of variables and values is relatively easy once the data has been collected, the implementation of the constraints between variables is substantially trickier. We identified and modeled the following constraints:

1. Unary constraints: These are overlap, enrollment, ITA certification, and zero-preference constraints.
2. Binary constraints: One mutex constraint between variables that overlap in time.
3. High-arity constraints: These are equality, capacity, and confinement constraints.

Overlap constraints specify that a GTA may not be assigned to teach a course that overlaps in time with a course he or she is enrolled in. This is a unary constraint based on the variable's meeting time and a GTA's enrollment status, and it applies only to labs and recitations. Enrollment constraints prevent a GTA from being assigned to a course in which he or she is enrolled. At first glance this constraint may appear redundant with the overlap constraint; however, enrollment constraints are more general in that they apply to all variables in a problem. ITA certification constraints restrict an international student from teaching courses if he or she is not certified, and zero-preference constraints prevent students with a preference of zero for a given course from being assigned to that course. All of these constraints apply to exactly one course.

We also include a binary mutex constraint between two courses. This constraint is placed between any pair of labs and/or recitations that overlap in time. It tends to be easier to avoid these assignments than to coerce a GTA to be in two places at once.

Finally, we have elected to use three types of non-binary constraints in our model: equality, capacity, and confinement constraints. An equality constraint specifies that all courses in its scope should be assigned the same GTA. Capacity constraints prevent a GTA from being assigned more courses than his or her TAship allows.
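The mutex, equality, and capacity constraints lend themselves to definition by intension. The following Python sketch shows one possible set of predicates over a partial assignment (a dictionary from course to GTA, with `None` for an unassigned course). The function names, the numeric load and capacity values, and the choice to measure capacity as a sum of course loads are assumptions made for illustration only.

```python
def mutex_ok(assignment, course_a, course_b):
    """Binary mutex: two time-overlapping courses need different GTAs."""
    gta_a, gta_b = assignment.get(course_a), assignment.get(course_b)
    return gta_a is None or gta_b is None or gta_a != gta_b

def equality_ok(assignment, scope):
    """Equality: all assigned courses in the scope share one GTA."""
    used = {assignment[c] for c in scope if assignment.get(c) is not None}
    return len(used) <= 1

def capacity_ok(assignment, load, capacity):
    """Capacity: the total load given to each GTA stays within its TAship."""
    total = {}
    for course, gta in assignment.items():
        if gta is not None:
            total[gta] = total.get(gta, 0.0) + load[course]
    return all(total[g] <= capacity[g] for g in total)

# Hypothetical data: g1 holds a full-time TAship, g2 a half-time one.
load = {"lec1": 0.5, "lab1": 0.5, "lab2": 0.5}
capacity = {"g1": 1.0, "g2": 0.5}
partial = {"lec1": "g2", "lab1": "g1", "lab2": "g1"}

within_capacity = capacity_ok(partial, load, capacity)
labs_compatible = mutex_ok(partial, "lab1", "lab2")
labs_equal = equality_ok(partial, ["lab1", "lab2"])
```

Treating unassigned courses as consistent lets the same predicates be used on partial assignments during search. Note that the capacity predicate ranges over every course in the problem, which is what makes it a high-arity (global) constraint.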
The third of these constraints, the confinement constraint, is by far the most complex. This constraint specifies that, for a fixed, predefined subset of the variables in its scope (called the confinement set), if a GTA g is assigned to a variable in this set, then g should not be assigned to variables outside of that set. A symmetric condition holds for courses outside of the confinement set: if g is first assigned to one of these courses, then we cannot assign g to courses in the confinement set.

As is the case for each of the constraints discussed above, the motivation behind the confinement constraint is based on practical considerations. A given lecture course may have several labs associated with it. Further, there may be several sections of this lecture, each with its own set of labs. This is true for many introductory-level courses. For instance, there are frequently three or four sections of the lecture course "Introduction to Computer Science." Each of these sections can have from two to five labs. The course material taught in each of the lectures may vary (e.g., honors program versus regular program), and this variation is then reflected in the labs. It is preferable to assign a GTA to teach labs that are all associated with the same lecture, since this requires less preparation on the part of the GTA. We achieve this by placing the labs associated with the same lecture in the confinement set of a global confinement constraint.

We suspect that confinement constraints are relevant to many practical applications, since they can be used to enforce locality of assignments to a set of proximal variables. For instance, when scheduling workers to run machines in a manufacturing plant, we could use a confinement constraint to ensure that a given worker is assigned only to machines that are close together.

This constraint has the admirable quality of looking good on paper. However, we noticed that, as defined above, it is prohibitively expensive to check this constraint even when it is defined by intension (see footnote 2). For this reason, we are currently investigating efficient reformulations of the confinement constraint so that it can be checked at a reasonable cost.
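To make the semantics of the confinement constraint concrete, here is a small Python predicate that reads the definition directly: no GTA may appear both inside and outside the confinement set. It is a sketch of the meaning only, under the assumption that an assignment is a dictionary from course to GTA with `None` for unassigned courses. The course names are hypothetical, and the sketch is not the authors' implementation and makes no claim about the checking cost they observed during search.

```python
def confinement_ok(assignment, scope, confinement_set):
    """True iff no GTA is used both inside and outside the confinement set."""
    inside, outside = set(), set()
    for course in scope:
        gta = assignment.get(course)
        if gta is None:
            continue  # unassigned courses cannot violate the constraint
        (inside if course in confinement_set else outside).add(gta)
    return inside.isdisjoint(outside)

# Hypothetical labs of two lecture sections, A and B.
scope = ["lab_a1", "lab_a2", "lab_b1", "lab_b2"]
confinement_set = {"lab_a1", "lab_a2"}  # labs of section A

# g1 stays within section A: consistent.
confined = {"lab_a1": "g1", "lab_a2": "g1", "lab_b1": "g2", "lab_b2": None}
# g1 teaches labs of both sections: inconsistent.
straddling = {"lab_a1": "g1", "lab_a2": None, "lab_b1": "g1", "lab_b2": "g2"}

confined_is_ok = confinement_ok(confined, scope, confinement_set)
straddling_is_ok = confinement_ok(straddling, scope, confinement_set)
```

The predicate is symmetric in the two sides, which matches the statement that a GTA first assigned outside the confinement set is then barred from courses inside it.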

4 Constraint Reformulation

Much of the work on constraint satisfaction assumes only binary constraints. This has been justified by the existence of methods for converting a non-binary CSP into a binary CSP [1]. Unfortunately, these methods rely on an extensional definition of the non-binary constraint. In particular, when dealing with global constraints of arity sixty or more, we cannot afford this luxury.

We are therefore confronted with a dilemma. Confinement constraints are expensive to check when formulated as a global non-binary constraint, yet the established procedures for translating these constraints into binary constraints are infeasible. We therefore examine the plausibility of a semantic reformulation of this constraint through constraint decomposition [3]. Our initial results establish that we can reformulate the confinement constraint as an equivalent decomposition into a network of binary mutex constraints. However, as a consequence of [3], it seems that the complexity of checking our non-binary constraints will instead rear its head during search. Gent et al. show that, in general, using the non-binary forward checking (nFC) algorithms of [2], in particular nFC2 through nFC5, on a non-binary problem formulation makes it possible to avoid a combinatorial explosion in the number of branches explored during search on the binary decomposition of that problem. We are currently investigating how to characterize this constraint and its reformulation theoretically, and studying the effects on the performance of search.

Footnote 2. We do not consider defining this constraint by extension, since the space required is prohibitively high. One such constraint had 365 5 acceptable combinations.
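One plausible reading of such a decomposition is to post a binary mutex (not-equal) constraint between every course inside the confinement set and every course outside it. The Python sketch below builds that network for a tiny hypothetical instance and verifies by brute force that it accepts exactly the same complete assignments as the global constraint. The text above states only that an equivalent decomposition exists, so the construction, names, and data here are assumptions for illustration.

```python
from itertools import product

def confinement_ok(assignment, confinement_set):
    """Global check: no GTA appears both inside and outside the set."""
    inside = {g for c, g in assignment.items() if c in confinement_set}
    outside = {g for c, g in assignment.items() if c not in confinement_set}
    return inside.isdisjoint(outside)

def decompose(scope, confinement_set):
    """One binary mutex per (inside course, outside course) pair."""
    inside = [c for c in scope if c in confinement_set]
    outside = [c for c in scope if c not in confinement_set]
    return [(u, v) for u in inside for v in outside]

def mutexes_ok(assignment, pairs):
    """Binary network check: every mutex pair uses two different GTAs."""
    return all(assignment[u] != assignment[v] for u, v in pairs)

# Hypothetical instance: four courses, three GTAs.
scope = ["lab1", "lab2", "lec1", "lec2"]
confinement_set = {"lab1", "lab2"}
gtas = ["g1", "g2", "g3"]

pairs = decompose(scope, confinement_set)

# Compare both formulations on all complete assignments.
equivalent = all(
    confinement_ok(dict(zip(scope, values)), confinement_set)
    == mutexes_ok(dict(zip(scope, values)), pairs)
    for values in product(gtas, repeat=len(scope))
)
```

With a confinement set of size k in a scope of size n, this construction yields k(n - k) binary constraints, so the network stays polynomial in size even though an extensional table for the global constraint would not.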

5 Conclusions

We discuss here the formulation and modeling of a practical application, the assignment of GTAs to courses in an academic setting. In the course of our study, we examine some of the issues surrounding constraint representation, such as intensional versus extensional constraints, and constraint decomposition. The merit of this work lies in the fact that theoretical concerns, such as the equivalence of binary and non-binary CSPs, are considered alongside practical concerns regarding time and space complexity.

References

1. Fahiem Bacchus and Peter van Beek. On the conversion between non-binary and binary constraint satisfaction problems. In AAAI/IAAI, pages 310–318, 1998.
2. Christian Bessière, Pedro Meseguer, Eugene C. Freuder, and Javier Larrosa. On Forward Checking for Non-binary Constraint Satisfaction. In Principles and Practice of Constraint Programming (CP'99), pages 88–102, 1999.
3. Ian P. Gent, Kostas Stergiou, and Toby Walsh. Decomposable constraints. In New Trends in Constraints, pages 134–149, 1999.
4. Jean-Charles Régin. Usenet message posted on comp.constraints, 1998.
