Proceedings of the CP 2003 Workshop on Immediate Applications of Constraint Programming (ACP)
Marius-Calin Silaghi and Markus Zanker (editors)
Held on September 29th 2003 in conjunction with the 9th International Conference on Principles and Practice of Constraint Programming (CP 2003), Kinsale, County Cork, Ireland
ORGANISATION
Organizers and Workshop Chairs
Marius-Calin Silaghi, Florida Institute of Technology,
[email protected]
Markus Zanker, University Klagenfurt,
[email protected]
Program Committee
Roman Bartak, Charles University,
[email protected]
Frederic Benhamou, University of Nantes,
[email protected]
Berthe Choueiry, University Nebraska Lincoln,
[email protected]
Joerg Denzinger, University Calgary, Canada,
[email protected]
Boi Faltings, EPFL, Switzerland,
[email protected]
Alexander Felfernig, University Klagenfurt,
[email protected]
Christian Frei, ABB Switzerland,
[email protected]
Gerhard Friedrich, University Klagenfurt, Austria,
[email protected]
Dietmar Jannach, University Klagenfurt, Austria,
[email protected]
Pedro Meseguer, Institut d'Investigació en Intelligència Artificial (IIIA),
[email protected]
Debasis Mitra, Florida Institute of Technology,
[email protected]
Christian Russ, ConfigWorks Informationssysteme & Consulting, Austria,
[email protected]
Markus Stumptner, University of South Australia,
[email protected]
Barry O'Sullivan, University College Cork,
[email protected]
Franz Wotawa, Technical University Graz,
[email protected]
PREFACE

Constraint satisfaction is a successful paradigm developed by the AI community that has found acceptance in many fields of computer science. While specific techniques (e.g. from operations research) exist for most of the problems that can be represented as Constraint Satisfaction Problems (CSPs), constraint satisfaction owes its success to the simplicity and straightforwardness with which humans can formalize new problems and thereby profit from powerful existing techniques.

While there exists extensive literature about abstract techniques for solving more or less general CSP formulations, the applicability of the field is covered by a veil of mystery. The companies that propose CSP-based solutions seem to have particular success on the market, but the details and the contribution of constraint satisfaction research to all of it are anything but well known. The lack of feedback to the large number of academic researchers involved in the field encourages research in breadth rather than focused research. Moreover, most conferences in the field currently give only little attention to the relation between technique and application and focus solely on randomly generated problems or on toy evaluation criteria. Most problems faced by researchers trying to apply constraint satisfaction to real applications are invisible to the community, as they do not easily find an avenue into the main conferences like CP.

With the series of applied-science workshops initiated by the current Workshop on Immediate Applications of CSPs we plan to foster a better interaction between physical reality and academic research on CSPs and to direct CSP research towards areas of high promise and social interest. (All the difference is BFS vs. A*!) This goal of interaction between industry and academia is also reflected in the choice of invited speakers: Eugene Freuder and his team will address current research challenges and the issue of engaging industry, and Ulrich Junker discusses successful application development and problem solving engines for large-scale Constraint Programming at ILOG.

The working notes of this workshop gather contributions dealing with applications in diverse fields such as storage management, assembly planning, resource allocation and distributed product configuration. The six papers demonstrate that continuous technical advances in CSPs lead to a technology that fits more and more real applications.

Marius-Calin Silaghi and Markus Zanker
Sept 05, 2003
CONTENTS

Invited talk: Large-Scale Constraint Programming: from Advanced Application Development to Advanced Problem Solving Engines
Ulrich Junker, Daniel Godard, Daniel Mailharro   1-2

Invited talk: Immediate Applications and the Role of Academia: Research Challenges and Engaging Industry
Gene Freuder, Chris Beck, Ken Brown, James Little, Barry O'Sullivan, Rick Wallace   3

Storage Management by Constraint Satisfaction
Garret Swart   4-25

CB2 – A Constraint-Based Algorithm for Assembly Sequence Planning
Carmelo Del Valle, Antonio A. Márquez, Rafael M. Gasca, Miguel Toro   26-44

A software tool for modelling and solving optimization problems with side constraints
Andrew Davenport   45-56

Constraint Based Type Inferencing in Helium
Bastian Heeren, Juriaan Hage, S. Doaitse Swierstra   57-78

Characterizing the Behavior of a Multi-Agent Search by Using it to Solve a Tight, Real-World Resource Allocation Problem
Hui Zou, Berthe Y. Choueiry   79-99

Distributed Generated CSP approach towards multi-site product configuration
A. Felfernig, G. Friedrich, D. Jannach, M.C. Silaghi, M. Zanker   100-123
Large-Scale Constraint Programming: from Advanced Application Development to Advanced Problem Solving Engines
Ulrich Junker
ILOG S.A., 1681, route des Dolines, 06560 Valbonne, France
Daniel Godard and Daniel Mailharro
ILOG S.A., 9, rue de Verdun, BP 85, F-94253 Gentilly Cedex
[email protected]
{dgodard,dmailharro}@ilog.fr
Abstract

Constraint programming (CP) comes with a powerful paradigm, namely that of combining a rich and declarative language for formulating combinatorial problems with efficient algorithms for solving tractable subproblems. The expressiveness of the constraint language makes CP very attractive for industrial scheduling and resource allocation problems and permits a natural modelling of complex resource and temporal constraints. Furthermore, efficient algorithms from graph theory and operations research can be encapsulated inside global constraints such as the alldifferent constraint, path constraints, and resource constraints.

In spite of this powerful computational paradigm, constraint programming has faced a lot of challenges when applied to industrial logistics and engineering problems such as production scheduling, air traffic control, crew scheduling, equipment configuration, and many others. Those problems are inherently large-scale, involving thousands of variables with thousands of possible values. Global resource constraints make it difficult to identify local causes when search runs into dead-ends and to recover from those failures. Finally, user interaction and online optimization require responses within a small time-frame. Thus, it is not astonishing that naive applications of standard backtracking search with constraint propagation often are not sufficient to solve these problems.

We discuss several examples where a successful application of CP required the development of sophisticated problem solving methods. Clever heuristics and good problem decomposition are important techniques for solving large-scale CP problems. We discuss a complex heuristic for solving an air-traffic control problem. Next we outline different problem decomposition schemes. Operations Research has a long tradition in problem decomposition, and it is a natural idea to use CP for certain subproblems. Crew scheduling problems are usually decomposed into a set partitioning problem (solved with a MIP solver) and a roster generation subproblem, which can adequately be solved by CP. Furthermore, we explain how a production planning and scheduling problem can be decomposed into a MIP-part
and a CP-part. In addition to heuristics and problem decomposition, symmetry breaking methods may be essential for certain large-scale problems such as the configuration of an instrumentation and control system. These examples show that successful applications of CP to large-scale optimization problems rely on clever application development. A question is whether higher forms of search programming can help the developer in this task. We argue that a separation of search model and search behaviour is a first step in this direction. The search model defines search decisions and subproblems, whereas the search behaviour defines how these subproblems and decisions are treated during search. The preference-based problem solving of ILOG JConfigurator implements this separation. The search behaviour can be changed (e.g. random backtracking instead of systematic backtracking) without changing any search decision. Search programming thus becomes more modular and easier to adapt to complex problems.
Immediate Applications and the Role of Academia: Research Challenges and Engaging Industry
Gene Freuder, Chris Beck, Ken Brown, James Little, Barry O'Sullivan, Rick Wallace
Cork Constraint Computation Centre, Computer Science Department, University College Cork, Cork, Ireland
www.4c.ucc.ie
Abstract. As academics interested in immediate applications of our technology we are concerned both with identifying suitable research challenges, and with engaging industry. We will address some of the research challenges posed by:
− Uncertainty and Change
− Collaboration and Privacy
− Acquisition and Explanation
We will discuss the experience of establishing the Industry Associates Program at the Cork Constraint Computation Centre. This work has received support from Science Foundation Ireland, Enterprise Ireland, the Royal Irish Academy, cadcoevolution.com, ILOG SA, and Xerox Corporation.
Storage Management by Constraint Satisfaction*
Garret Swart
Computer Science Department, University College Cork, Cork, Ireland
* This work was partially supported by Science Foundation Ireland under Grant 00/PI.1/C075.
[email protected]
Abstract. Storage management is the process of managing an enterprise’s digital storage infrastructure to best meet the needs of that enterprise. It is a natural constraint satisfaction problem because of the need to minimize the cost of the storage infrastructure while simultaneously meeting the requirements of the enterprise for data capacity, performance, security and availability. In this paper we discuss the applicability of constraint satisfaction to storage management and show how constraint satisfaction is a useful way of framing and solving the problems. We examine the “clean-slate” configuration problem and show how this problem can be modeled and solved efficiently. We also define the incremental storage management problem, where we start with an arbitrary configuration and find a path to a new configuration that meets an updated set of requirements.
1 Introduction
Storage management is the process of managing an enterprise's digital storage infrastructure to best meet the needs of that enterprise. It is a natural constraint satisfaction problem because of the need to minimize the cost of the storage infrastructure while simultaneously meeting the data requirements of the enterprise. In its simplest form the storage management problem is nothing more than a bin-packing problem: stuffing the enterprise's data into the right size storage units. However, in more realistic scenarios it is a second order bin-packing problem, because in a modern virtualized storage system the inputs to a storage management problem are not the bins themselves but the materials out of which a set of bins may be constructed. The bins so constructed not only have to be big enough but they also have to have other attributes in amounts appropriate to the items that are to be placed in them, including availability, performance and security.
While storage management is generally performed manually, it needs the discipline of an automated solution because of the large number of constraints that need to be checked and the large number of possible configurations to be considered. It can further benefit from an automated solution because data requirements and systems
configurations are continually changing, and in many circumstances reconfigurations can be made to a running storage system without disruption and without making physical changes in the system. It is well-known lore that making manual changes to a system's configuration is one of the most likely causes of system instability [13]. It is hoped that by building automatically managed storage solutions we can increase the efficiency of the resulting systems, give guarantees that all the requirements are met, and increase the overall availability of the resulting system by doing hands-off administration.
There are two approaches that have been taken to build automatically managed storage systems. One is to design individual components that are fundamentally self-managing, for example a smart virtual disk that understands its requirements and attempts to find the resources it needs to meet those requirements. This approach, while interesting and useful, is fundamentally limited because each component of the storage system has only limited knowledge of the environment it is operating in and can generally communicate only with components of the same type. The other approach is to use a set of agents that are external to the storage elements and whose scope is much larger than that of a single storage component, or even a component type. These agents collect enough information about the enterprise's complete storage requirements and its storage infrastructure to allow them to determine an optimal storage configuration and how to best move from the current configuration to the optimal configuration. Once the changes have been determined the agent may be configured to effect those configuration changes, monitoring the system to make sure the changes have the expected effect, and then to continue monitoring the environment, making additional changes as the environment continues to change.
The main topic of this paper is the formulation of the constraint satisfaction problem that such an agent will need to solve to determine the new configuration and the techniques used to solve these CSPs. This work is part of a larger project to apply constraint satisfaction and agent technology to a wide array of system configuration problems.
In the remainder of this section, we look at some of the related work done in this area and define the basics of a storage environment and the roles of the people that administer it. The structure of the storage environment, both technical and organizational, motivates some of the decisions made in the way that this problem is modeled as a CSP. We then examine the different ways that storage units are combined using RAID technology to create logical storage units with specific desired properties. We next examine the requirements that administrators want to place on data in more detail and see how this is modeled in OPL. We then look at the 'fresh slate' storage problem and discuss the search strategy used to efficiently solve this problem using the OPL language. We then define the incremental storage management problem, where we start with an existing set of storage requirements and a configuration that meets those requirements. We then accept a modification of the requirements and find a minimal set of changes to the storage infrastructure to have it meet the new requirements. Finally we outline some directions for future work in this area.
1.1 Related Research and Products
There is a wide history and literature in the development of self-administering (or nearly self-administering) storage systems, and it is an explicit goal of various standards and research bodies [6, 7, 11, 12]. While current systems are far from self-administering, we have been making progress. Early storage devices required administrators to be involved in revectoring bad blocks on a disk with good ones elsewhere on the disk, while today no self-respecting disk vendor would think of requiring administration for this purpose. Early file systems would sometimes require administrators to specify which disk cylinders to allocate to a given file, while on modern file systems not only are files automatically placed but some file systems will automatically defragment themselves while the file system is in use. Again and again computers have proved themselves to be capable of taking some burden off of the backs of poor overworked system administrators. The HP AutoRAID system [31], the Petal Disk System [18], the Cooperative File System [11] and the Object-based storage project [20] are all examples of research projects to build single storage systems that are more flexible and easier to administer.
Most of the work on external configuration engines has been done in industry. The initial products in this area have been focused on information collection. Newer versions of these products have moved on to making configuration suggestions, and future versions may be able to effect those changes without human intervention. Examples include products from Legato [18], EMC [10], Veritas [30], Raidtec [23], Sun [26], HP [14], Storability [24], and Creekpath [44]. However, none of the commercial solutions take on the problem globally and look at the breadth of requirements that are needed.
HP Labs has made great strides in building research systems that make complex configuration decisions, and they have developed a number of systems for automatically laying out storage systems [1, 2, 3, 17, 30]. They have made important advances in the modeling of RAID arrays and some progress on the development of special-purpose solvers for quickly finding near-optimal configurations. The work in this paper is related, but we focus on how to model and solve even more complex configuration problems using general-purpose solvers.
A problem related to determining optimal configurations is that of modeling a given storage configuration so that the performance of a configuration can be estimated accurately. The simple performance modeling techniques in this paper only apply to long-term throughput; they do not apply to synchronous or latency-sensitive environments. The work done by Varki et al. [28, 29] addresses this problem, and incorporating her results with the ones in this paper is a goal for the future.

1.2 Corporate Storage Environment
A modern physical storage infrastructure consists of storage units, generally disks, tapes, optical storage and solid-state devices, connected to storage controllers. The storage controllers are either directly attached to a general-purpose computer or the
controllers may be connected with many general-purpose computers as part of a storage area network (SAN). On top of the physical infrastructure is overlaid a logical (also known as virtual) storage infrastructure. In the logical infrastructure the storage units are carved up or combined to form logical storage units that may have quite different characteristics than the physical storage units from which they are built. Storage virtualization has been around for some years, pioneered by research such as Berkeley's RAID project [22], and is now a common feature in most commercial products.
It is important to distinguish between data management, database management and storage management. Data management is determining what data is needed by the enterprise; the requirements that are to be placed on that data; and how the data is accessed. The requirements that are placed on a service, including a storage service, are sometimes known as an SLA, or service level agreement. An SLA is a pseudo-legal document that specifies exactly what the service provider needs to provide to the service customer. Data management is generally performed by the IT management and design staff. They weigh the costs and benefits of maintaining data at different levels of reliability and availability against the effect it will have on the business if a service is unavailable. It is at this level where they decide which data has to be maintained at a mere four nines (99.99%) of availability and what data needs a lofty six nines (99.9999%) of availability.
Database management is the task of determining how that data should be organized so as to allow it to be accessed efficiently. Database management is usually performed by a person well versed in the intricacies of the databases used to store the data. This person will work with the database designers and programmers to make sure that information can be organized in such a way as to give fast enough access to the required information. The database designer should ideally know enough about the storage behavior of the database so that, given the usage estimates from the data manager, the minimum requirements on the underlying logical storage can be estimated.
Finally, storage management is concerned with four interrelated problems:
− What storage units, storage controllers and network controllers should be purchased?
− How should the physical units, storage controllers and SAN infrastructure be configured physically?
− How should the logical storage infrastructure be configured out of the physical infrastructure?
− And finally, how should the data be assigned to the logical storage infrastructure?
Storage management is generally done by systems administrators specifically trained in the tools provided by the operating systems and the storage infrastructure provider, and in wiring and installing the physical infrastructure. Storage management is often split into monitoring, maintenance and design functions. There has been a fair bit of work in automation of the monitoring function and some easy maintenance operations, e.g. replacing a failed storage unit with a hot standby unit, but there has been very little automation of the storage management
design task. When designing a storage infrastructure, storage administrators are primarily concerned with:
− Making sure that data is reliable enough: the data is not lost.
− Making sure that data is available enough: the data is accessible when it is needed.
− Making sure that enough data storage capacity is available to meet the projected needs.
− Making sure that the data may be accessed with acceptable performance.
The frustrations of a storage infrastructure designer are myriad:
− Incomplete requirements from customers. Some of this is because storage managers don't know which questions to ask, while some is because data and database administrators don't know how to estimate these requirements.
− Incomplete understanding of the characteristics of the physical components. Some of this is because the hardware providers don't provide good information about hardware performance or failure probabilities; some is because the properties of the constructed logical storage units are not understood.
− A huge number of potential configurations to explore.
− Incomplete and incorrect information about the configuration currently installed. This can be because of bad bookkeeping as configurations are changed quickly to solve problems, and also because many vendors do not provide enough information to rebuild a configuration map automatically.
2 Storage System Components
In this section we discuss the components that go together to make a storage system. The relationships between these components are shown in the UML diagram of Figure 1.

Figure 1: Storage System Components (UML diagram relating the Physical Unit Class, Physical Storage Unit, Logical Storage Unit, Storage Unit, RAID Type, Dataset, Stream and Application classes)
2.1 Datasets
There is a need to aggregate data for the purposes of administration. While one could imagine placing different requirements on individual bytes in a file, in practice even the individual file is too small a unit for specifying requirements. For this reason we use the notion of a dataset. A dataset is a collection of related data files that is accessed by applications as a unit. For example, a particular user's email files might be a dataset, the files implementing a particular web site might be a dataset, and the set of files holding a particular database might also be a dataset. Datasets can be broken up as needed so that more fine-grained requirements can be expressed. For example, one could split off the log files as a separate dataset from the rest of the files representing a database so that they may have their own set of requirements. In other literature [2] a dataset is called a store.
In traditional hand-configured systems, users and administrators have been trained to err on the side of simplicity. In an automated system, where many datasets can be dealt with on an automated basis, users may be encouraged to define more datasets to allow their requirements to be specified on a finer basis.
Datasets, by themselves, don't have many requirements. Most requirements on datasets come from applications' use of a dataset, and are thus associated with the application rather than the dataset. Requirements that are associated directly with a dataset include:
− Capacity: The expected maximum number of bytes that are to be stored in the dataset over the next reasonable time period. Using too long a time period for such an estimate will consume resources now, not six years from now when storage prices have fallen. The time frame to choose for this depends on an organization's budget cycle, the storage unit vendor, and the expected delivery and installation time for new hardware. Again, because of the difficulty of adding storage automatically, and the desire to amortize the disruption and the potential hazards of each such disruption, administrators have tended to plan for storage requirements up to a year in advance.
− Security: The likelihood that an unauthorized application can access the dataset. Security at the dataset level is usually enforced through a combination of firewalls and password or PKI based access control. Firewalls are the most important, as most enterprise storage infrastructure is on private networks insulated from the corporate intranet as well as from the public Internet. A high security setting on a dataset would indicate that the storage used for that dataset might need to be obtained from storage units available only on a high security SAN, rather than stored on a file server on the corporate intranet. In practice most storage level security is quite good, and most unauthorized access to data is made through applications, e.g. web servers and databases, whose security issues are not part of the storage management problem. Given that the probability of an unauthorized access to a storage service is difficult to quantify, we generally express security requirements either in qualitative terms, e.g., high, medium or low, or in an implementation-specific way, e.g. the firewall or password policy. As security is only as strong as its weakest link, the level of
security for a dataset is the minimum level supplied by any of its constituent storage units. Note that setting a dataset's security level not only affects the storage units for storing the data, it also may restrict the processors from which applications accessing the dataset may run.
− Privacy: The likelihood that the data on a dataset may be read by an unauthorized individual. The major threat to the privacy of a dataset is through the physical media, for example, off-site backup tapes that may get into the wrong hands or hot-swap disks that may be stolen or removed by maintenance personnel. To ensure privacy the common techniques are physical security measures, encryption of the data on the disk units or on the tape backups, and the use of encryption on any network connection between the application and the storage unit. Again, the likelihood of a privacy violation is hard to quantify, and so again we use qualitative measures and ensure that the minimum level implemented by the storage units used to store this dataset matches the dataset requirement. Again, privacy at the storage level only takes you so far, as most privacy violations occur at the application level.
− Reliability: The probability that any data stored in this dataset is lost or corrupted. This requirement is typically handled by a combination of good engineering and checksums of data on the storage unit to detect and correct problems with recorded data. Other threats, like physical damage or software corruption, are handled by backups, while others may be handled by using RAID or other replication techniques. There are two types of replication in common use in storage systems: synchronous and asynchronous. Replication is called synchronous if it provides single-copy serializable semantics to the client. This is called synchronous because providing these semantics generally requires updating all copies (or at least a majority) of the data before the update is recorded as complete. RAID units provide synchronous replication. Asynchronous replication, on the other hand, allows an update to complete before all of the replicas are updated; in this case, the failure of the master replica may cause a certain number of in-flight updates to be lost. Asynchronous replication has the advantage that replicas may be geographically separated without the consequent latency penalty. Reliability calculations for asynchronously replicated data have to balance the larger probability that a few seconds of updates will be lost because of an inopportune temporary failure at the master site, against the smaller probability that an entire day's updates, as well as several days of availability, will be lost if the single data center is destroyed due to a catastrophe.

2.2 Applications
In contrast to the dataset, which is the passive container for data, the application is the active consumer of the data. The performance and availability related requirements on a dataset are not inherent in the dataset itself but instead spring from the needs of the applications that use it.
A single application may use many different datasets, and for each one of those datasets it may have a schedule during which its load on the associated dataset may vary. For example, an online data processing application may generate 800 I/O requests per second on a particular dataset during the peak times of 10 am until 4 pm, but at other times its load may go down to 200 I/O requests per second. In contrast, a batch-scheduled application may be scheduled by the batch scheduling system for an arbitrary low-load time during the day. Following the existing literature [21], we call each application's use of a dataset a stream. The performance and load requirements a stream may place on a dataset are:
− Sequential I/O Bandwidth: An application may need to consume or generate sequential data from a dataset at a certain rate. Depending on the buffering in the application, that rate may need to be guaranteed over a smaller or larger time window. For example, an application with a large amount of buffering may allow for a 30-second bandwidth window, while an application with a small amount of buffering may require that bandwidth to be given on a one-second window. An application like playing a movie or doing a backup typically has firm bandwidth requirements that are determined by the video device or the tape drive.
− Random I/O Operation Throughput: An application may generate random read and write operations at a given rate. It may be that adequate functioning of the application requires that each of these I/Os complete within a certain length of time. Synchronous random I/Os are often generated by database index operations. The distinction between a random operation and a sequential operation is that a sequential operation is large enough that the cost of seeking to the data is insignificant. Also, for RAID 5 units, sequential writes are assumed to be full stripe writes, avoiding the large cost of reconstructing parity by re-reading data that is unaffected by the write.
We can conservatively combine the performance requirements of multiple streams on a dataset and multiple datasets on a logical unit by simply adding the throughputs. In our current formulation we do not yet handle latency requirements on operations or windows on bandwidth guarantees, which complicate those calculations.
The node in the network where an application is running is an important consideration in determining the security and the available bandwidth to the dataset. An application running on an application processor which has poor connectivity to the storage unit will not get guaranteed high-throughput access to the storage unit even if the unit itself has plenty of bandwidth available. Similarly, an application running outside of the corporate firewall will not be able to access a high security dataset, because such a dataset is only accessible inside of a corporate firewall.
− Application Location: A set of application processors where an instance of this application may be configured to run. The datasets being accessed from this application need to be placed on physical units with enough network capacity and security between the physical units and the application processor.
Applications also have to meet availability and reliability requirements, and these availability requirements have to be reflected back to the dataset through the stream. If a dataset is required for an application to function then the availability requirement on
the application is passed directly through to such datasets. Other datasets may have a more complex way of mapping availability from the application to the dataset. For example, an application with a huge data requirement might decide to split its data into 101 different datasets. For all of these datasets to be available 99.999% of the time, each dataset would need a phenomenal 99.99999% availability (this calculation is sketched after the list below). But it may be that 100 of the datasets are for customer data and each customer's data is stored on only one dataset. If that is the case then the availability of the individual datasets need merely be 99.999%, because the failure of a single dataset will affect only 1% of the users. Specifying 100 reasonably sized datasets with a given availability will be much easier for the configuration manager to fulfill than a single gigantic dataset with the same availability. A positive correlation between the availability of multiple datasets being used by an application generally improves the availability of the application, as applications seldom make use of any redundancies between datasets.
− Availability: The probability that a dataset is available and providing the correct performance guarantees. If several applications have different availability requirements on a dataset, then the requirement on the dataset is the maximum of the requirements of each of the applications using the dataset.
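The arithmetic behind the figures above is easy to automate. The following small Python sketch (an illustration of ours, not part of the paper's OPL model) computes the per-dataset availability needed when an application requires all of its datasets simultaneously, assuming independent dataset failures:

def required_dataset_availability(app_availability, n_datasets):
    # If the application needs all n datasets at once and failures are
    # independent, each dataset must satisfy a**n >= app_availability.
    return app_availability ** (1.0 / n_datasets)

# All 101 datasets required together: each needs roughly seven nines.
print(required_dataset_availability(0.99999, 101))   # ~0.9999999
# If a dataset failure only affects 1% of the users (customer-partitioned
# data), each customer dataset merely needs the application-level target.
print(0.99999)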
2.3 Physical Storage Units
Physical storage units are available from manufacturers with different specifications and with different costs. There are four types of storage in active use today.
− Tapes: Tape units provide high throughput and density for sequential I/O and very limited support for random input. Low cost media, high density and transportability make them ideal for offline storage, but the potential for magnetic and chemical decay makes them unattractive for long-term storage. Hybrid systems consisting of tape robots and disks are still attractive for the very lowest price-per-byte systems.
− Disks: The mainstay of most storage systems. While old-style disks used many different geometries, starting with the 24" disk used on RAMAC [21], the 3.5-inch fully sealed disk has become the standard for both personal computer and data center use. 2.5-inch and 1-inch sizes are popular for notebooks and handheld devices respectively.
− Optical: While quite unpopular in storage systems, optical is still popular as a distribution medium and for long-term storage because of its high reliability rate over the long term. The main problem pushing it out of data centers is its low storage density.
− Solid State: Once popular in stand-alone units because of its high performance and low latency, the lowering cost of electronics has caused solid-state storage to become incorporated into most traditional disk and RAID controllers to improve latency on disks transparently. Given the lack of moving parts, solid state disks have very high availability, but depending on the technology used to keep the data stable, their reliability can be lower.
There are three popular kinds of interconnect used for connecting a storage unit to its controller:
− IDE (Integrated Drive Electronics) disks are the cheapest disks and are generally used in mass-market devices and in personal computers. Because of economies of scale, these tend to be about half the price of SCSI disks. Because of the good price/performance of these disks, storage vendors are starting to use them in data center applications. Each IDE controller can talk to four devices, a master and a slave on each of two channels. Large IDE configurations generally use many IDE controllers networked together. A difficulty with IDE networking is that there is no mastering protocol, so each disk can only be connected to one IDE controller, a potential single point of failure in accessing the disk.
− SCSI (Small Computer System Interface) disks are generally used in workgroup servers, where extra complexity in the SCSI interface provides better performance for a multiprocessing workload. Modern SCSI versions also allow dual porting, where a single SCSI chain of disks may be connected to two SCSI controllers, allowing the disks to be accessible even if one of the hosts they are connected to is down.
− Fibre Channel disks are generally used in larger disk farms where the highest performance and reliability are demanded. Fibre Channel allows for any number of hosts and over 100 disk drives per Fibre Channel loop. Because Fibre Channel can use optical technology, it allows for very long cable lengths and small connectors.
With the introduction of storage area networks, the number of storage units that can be directly accessed by a computer has skyrocketed; however, there are few tools that allow the bottlenecks in SANs to be diagnosed and repaired.

2.4 Logical Storage Units
A logical storage unit is constructed out of one or more physical storage units. A logical storage unit is sometimes called a partition (Windows), a raw disk (Unix), or a logical volume (IBM, HP). Logical storage units are generally used as the storage for a file system or as part of the storage for a database. The storage controller may implement logical storage units itself, or they may be implemented by the operating system of the application processors or of the storage or file server. Some systems use a hybrid approach, allowing the operating system to be aware of the logical (virtual) mapping while allowing the controller to do the required heavy lifting, such as parity computations and replacement disk resynchronizing.
There are several ways that logical storage units can be built out of other logical storage units and physical storage units. The most popular forms are given here. They have had much discussion in the storage system literature. The properties of the resulting logical volumes are summarized in Table 1.
− Partitioning: A physical storage unit may be partitioned into several logical storage units. Historically this was the first type of mapping between physical and logical storage units that was supported on most systems. Partitioning was used to allow
one to build two or more file systems on the same physical disk. This allows you to run different operating systems on the same disk, or to separate datasets that have conflicting access patterns. For example, image processing datasets don't do well sharing a file system with the operating system; the huge files in the image processing dataset tend to cause fragmentation problems for the OS files. Logical storage units created via partitioning have correlated availability, reliability and performance, and they inherit the availability, reliability and security characteristics of their parent storage unit. Generally partitioning is not used to build high performance storage systems.
− Concatenation: Several storage units of varying sizes can be concatenated to form a larger storage unit. The I/O performance of the resulting storage unit will not be much better than that of the constituent storage units, as sequential operations will continue to use only one storage unit at a time, and even random reads and writes will often be localized on a particular constituent unit. The availability of a concatenated store is less than the availability of its constituents, because all of the constituent storage units have to be available for the concatenation to be available. For example, if each of the k units in a concatenation was independently available with probability a, then the availability of the resulting concatenated logical volume is a^k. The primary reason concatenation is used, as opposed to, say, striping (described below), is that concatenation can be done on the fly with no disruption to the applications, while adding another storage unit to a stripe set usually requires a complete reformatting of the storage unit. Concatenations are thus very useful when a logical storage unit unexpectedly runs out of room. Concatenations are generally avoided in the data center; if they are used to fix a problem in the short term, they are often removed and replaced when a reformatting is possible.
− Striping (RAID 0): Several logical storage units of the same size can have their blocks interleaved to form a single larger logical volume with much better sequential and random I/O throughput. The availability of such a configuration is the same as above; however, the sequential read and write throughput increase linearly because large or random I/Os are naturally split evenly among the constituent units.
− Mirroring (RAID 1): Several logical storage units of the same size are combined so that they replicate each other. This increases the availability and reliability of the data stored on the storage units. The availability of an n-way mirrored storage unit constructed out of units with availability a is thus 1 − (1−a)^n, because the data is unavailable only if all n of the replicas are unavailable. Mirroring increases the random read throughput because reads may be done from either disk. It can also increase the sequential read throughput to the same degree because it is possible to read alternate tracks from each replica. Write throughput on the combined unit is the same as for the base units because both replicas have to be written.
− Striping and Mirroring (RAID 10): This combines the benefits of RAID 1 and RAID 0 by mirroring each individual disk and then combining them in a stripe set. The availability of n-way mirroring followed by k-way striping is (1 − (1−a)^n)^k, because the stripe set is available only if each of its mirrored components is available. Note that striping and then mirroring gives the same performance but less availability, so it is not generally used.
− Striping and Distributed Parity (RAID 5): This is another way of combining the benefits of RAID 1 and RAID 0, but instead of keeping a complete extra copy of the data, we store the parity, the exclusive OR function applied to each block of the stripe set. The parity can be used to reconstruct the data if one of the blocks in a stripe fails. The availability of an n-way RAID 5 logical volume is na^(n−1)(1−a) + a^n, because the probability that a RAID 5 set is available is the probability that all n members of the stripe set are available, plus the probability that any one of the n disk units is down times the probability that all of the n−1 remaining storage units are available. Note that random write operations are much more expensive on a RAID 5 system because each write entails 4 separate I/O operations: (1) the old data has to be read, (2) the old parity has to be read, (3) the new parity has to be computed by XORing the old data and the new data with the old parity and then written, and (4) the new data has to be written.
− Striping over multiple RAID 5 arrays (RAID 50): This is a way of creating very large logical units. Compared with a single large RAID 5 unit, the availability is somewhat higher and the random write operation throughput is better, but the space used for parity is much larger. One reason this configuration is used is that many RAID 5 disk arrays are packaged with the disks and the controller in a single enclosure. This type of RAID 5 controller cannot handle disks outside the enclosure, so if a single higher-capacity, highly available logical unit is required, external striping is required. Striping has low overhead and can be done easily at host level.

Table 1: Properties of logical units with given RAID types, given properties of the constituent physical units

                 Availability: a           Read BW: r   Write BW: w   Capacity: s   Read Ops TP: ro   Write Ops TP: wo
Partitioning     a                         r/k          w/k           s/k           ro/k              wo/k
Concatenation    a^k                       r            w             ks            ro                wo
Striping         a^k                       kr           kw            ks            k·ro              k·wo
Mirroring        1 − (1−a)^n               nr           w             s             n·ro              wo
RAID 10          (1 − (1−a)^n)^k           knr          kr            ks            kn·ro             k·wo
RAID 5           na^(n−1)(1−a) + a^n       nr           (n−1)r        (n−1)s        n·ro              wo·n/4
RAID 50          (na^(n−1)(1−a) + a^n)^k   knr          k(n−1)r       k(n−1)s       kn·ro             k·wo·n/4
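As a quick check of the availability column of Table 1, the formulas can be evaluated directly. The following Python sketch (our own helper functions, not part of the paper) does so under the same independence assumption used above, for constituent availability a, mirroring/parity group size n and stripe width k:

def avail_striping(a, k):
    return a ** k                      # all k stripe members must be up

def avail_mirroring(a, n):
    return 1.0 - (1.0 - a) ** n        # down only if all n replicas are down

def avail_raid10(a, n, k):
    return avail_mirroring(a, n) ** k  # every mirrored component must be up

def avail_raid5(a, n):
    # all n members up, or exactly one down and the other n-1 up
    return n * a ** (n - 1) * (1.0 - a) + a ** n

def avail_raid50(a, n, k):
    return avail_raid5(a, n) ** k

a = 0.99
print(avail_striping(a, 4), avail_mirroring(a, 2), avail_raid10(a, 2, 4),
      avail_raid5(a, 5), avail_raid50(a, 5, 2))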
3 Constraint Satisfaction Problem
A few liberties and simplifications have been taken with the storage management problem as outlined above. These simplifications will be revisited in future work.
− We have combined the dataset requirements from the applications using a dataset to make a single requirements structure per dataset. This removes the interaction between application scheduling and storage management, a significant simplification.
− We have ignored the issue of the network location of the storage unit and the application. This removes the ability to look at congestion along the data path between the application and the storage units, and removes the ability to check for inconsistencies between the security and privacy levels and the network paths that must be navigated by the data.
− We assume that the availability and reliability of each of the storage units is independent, when in fact there are correlations: storage units may be in close physical proximity, meaning that they could both be destroyed by the same accident or natural disaster; storage units may be connected into the same network switches or electrical system, or they may be connected to the same storage controllers. Ideally each of these shared access points is itself both replicated and highly available, making the correlation coefficient quite low, as a failure due to such a problem is quite unlikely with respect to the failure of the storage units themselves.
− We have ignored issues of disk latency and real-time requirements and focused our performance work just on bandwidth and throughput.
− We do not consider building logical units out of other logical units; we only consider building logical units directly out of physical units. We make a special case of RAID 10, considering it as one level of logical unit and assuming that the level of mirroring is always 2.
In this section we will look at how the CSP is formulated in OPL [27]. The intent is not that storage administrators should understand this formulation; they will be supplying the parameters to the model through a management application which will then trigger the solution of the CSP. When the solution is complete, the answer will be displayed to the storage administrator.

3.1 Dataset Requirements
To drive the construction of logical storage units and the assignment of datasets to the storage units, it is necessary for the system to get the requirements on the datasets. In OPL these requirements were modeled as an array of structures. The StorageData
structure contains a field for each requirement that must be met by the logical storage unit where the dataset is stored. Note that this same structure is also used to define the capabilities of the physical storage units in the next section.

range float Probability 0.0 .. 1.0;
enum Privacy {LowPrivacy, MediumPrivacy, HighPrivacy};
enum Security {LowSecurity, MediumSecurity, HighSecurity};
struct StorageData {
  int+ capacity;         // in KB
  int+ readBandwidth;    // KB/sec
  int+ writeBandwidth;   // KB/sec
  int+ readOpsTP;        // ops/sec
  int+ writeOpsTP;       // ops/sec
  Probability availability;
  Probability reliability;
  Privacy privacy;
  Security security;
};

enum DataSets ...;
StorageData dataSetReq[DataSets] = ...;
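For readers less familiar with OPL, the same record type can be mirrored in an ordinary host language. The following Python sketch is our own illustration (not part of the paper's model) of the fields and units declared above:

from dataclasses import dataclass
from enum import Enum

class Privacy(Enum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3

class Security(Enum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3

@dataclass
class StorageData:
    capacity: int           # KB
    read_bandwidth: int     # KB/sec
    write_bandwidth: int    # KB/sec
    read_ops_tp: int        # ops/sec
    write_ops_tp: int       # ops/sec
    availability: float     # probability in [0, 1]
    reliability: float      # probability in [0, 1]
    privacy: Privacy
    security: Security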
3.2 Physical Storage Unit Model
Rather than consider each physical storage unit as an individual object, we reduce the search space by putting the storage units into classes, each member of which is considered to be identical. We also allow a restriction on the maximum number of units from each class.

enum PUnitClass ...;
StorageData unitCapabilities[PUnitClass] = ...;
int+ unitCost[PUnitClass] = ...;
int+ unitCount[PUnitClass] = ...;
int+ maxCount = (max(p in PUnitClass) unitCount[p]);
int+ maxCost = (sum(p in PUnitClass) (unitCost[p]*unitCount[p]));

Note that for the bandwidth and throughput capabilities, we assume that when presented with a mixed workload, the capabilities may be scaled and combined. For example, if a physical unit has a maximum specification of 5 MB/s of read bandwidth, 4 MB/s of write bandwidth, 40 random reads per second and 20 random writes per second, then we assume it can simultaneously perform any combination of those operation rates whose fractions of the maxima add up to at most one. For example, such a device would be able to read 1.25 MB/s, write 1 MB/s, and perform 10 random reads and 5 random writes per second. Given that we are not concerned with latency at this time, this assumption clearly holds.
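This scaling assumption amounts to a single linear constraint: the load fractions of the rated maxima must sum to at most one. A minimal Python sketch of the check (the dictionary keys and function name are ours, not the OPL model's):

def fits_on_unit(load, rated):
    # load and rated map the same capability names to rates (KB/s, ops/s)
    return sum(load[c] / rated[c] for c in rated) <= 1.0

rated = {"readBW": 5000, "writeBW": 4000, "readOps": 40, "writeOps": 20}
load  = {"readBW": 1250, "writeBW": 1000, "readOps": 10, "writeOps": 5}
print(fits_on_unit(load, rated))  # True: each fraction is 0.25, summing to 1.0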
3.3 Logical Storage Configuration Model
In this OPL model we ignore partitioning and concatenation, as partitioning is typically used on personal computers much more often than in large data centers, and concatenation is not used when configuring new logical units. We construct each logical storage unit out of identical physical units, that is, units that are from the same physical unit class. This is good practice, as the characteristics of a logical unit tend to be a blend of the worst characteristics of each physical unit, so having the physical units be identical smoothes the operation. Also, just as airlines find cost savings in using the same type of aircraft for all of their flights, so also do data centers find cost savings in using as few different types of physical units as possible.

enum RAIDTypes {
  RAID0,    // Striped n-ways
  RAID1,    // Mirrored n-ways
  RAID10,   // Mirrored 2-ways and striped n/2-ways
  RAID5};   // Distributed Parity and n-1 striped

// Maximum number of virtual disks
int+ maxLogical = ...;
// No more logical units than datasets
assert maxLogical

...
    addConstraint(st(T) ≥ ft(TFather) + ∆mov(SA, M(T), M(TFather)))
    if ∃ T' ∈ tree / M(T') = M(T) ∧ C(T') ≠ C(T)
      addConstraint(st(T) ≥ ft(T') + ∆cht(M(T), C(T), C(T')))
    endif
  endif
  addConstraint(ft(T) ≤ makespan)
  addConstraint(T.requires(M(T)))
end

Fig. 6. Including constraints for a task.
The function adds the constraints for the start times (st) and end times (ft) of the task, taking into account that the start time must be later than the end time of the predecessor task plus the delay due to the transportation of the subassembly from the machine where the predecessor task was executed to the machine where the task must be executed. Moreover, if there exists in the current tree a predecessor task using the same machine with a different configuration, the possible delay due to the change of configuration in the machine between the execution of the two tasks is taken into account. Another constraint is imposed on the end time of the task, so that it must be earlier than the makespan. Finally, each task is associated to a specific resource (constraint requires), so that two tasks requiring the same resource cannot be executed simultaneously. In order to tackle the corresponding disjunctive constraints, our algorithm uses an edge-finder propagator [16], so that it can detect more inconsistencies.

Figure 7 shows the way the solutions are calculated, after the tree of tasks has been completed. Once the tasks are known, the function rankResTasks ranks the resource constraints, so that it selects first the resource with the smallest slack, and the task with the minimal earliest start time.

procedure findSolution( )
  rankResTasks()
  ok = false
  while (nextSolution(makespan))
    solution = getValue(makespan)
    ok = true
  endwhile
  return()
end

Fig. 7. Finding solutions.

Each call to the function nextSolution produces a valid instantiation of all the variables of the CSP (times of tasks and makespan), taking into account the constraints and propagating them properly when instantiating each variable, so that it backtracks if the domain of some variable becomes empty. If all variables can be instantiated, nextSolution returns true, and false otherwise. This way, successive calls to nextSolution(makespan) produce better and better solutions (improving the makespan), until the last call cannot instantiate all the variables when minimizing the makespan, so that the last solution found represents the best solution for that tree of tasks.
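The overall effect of the loop in Fig. 7 is a simple improvement scheme: each call must beat the best makespan found so far, and the last solution found before failure is the best one for the current tree of tasks. The following Python sketch uses a stand-in solver interface of our own (not the actual solver API) to show the pattern:

def find_solution(next_solution):
    # next_solution(bound) returns a makespan strictly below bound,
    # or None if no such solution exists.
    best, bound = None, float("inf")
    while True:
        result = next_solution(bound)
        if result is None:
            return best          # last solution found is the best for this tree
        best, bound = result, result

solutions = iter([20, 17, 15])   # pretend improving makespans (hypothetical)
print(find_solution(lambda bound: next(solutions, None)))  # prints 15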
5 Heuristics for bounding solutions
The double backtracking algorithm presented in the previous section is not very efficient in general. The adaptation to branch and bound allows its behaviour to be improved because of two factors: in the branching phase, more intelligent criteria can be used when choosing the expansion order of alternatives; on the other hand, in the bounding phase, heuristic functions can be used that estimate the values of the solutions that can be obtained from a task, in order to prune more branches that will not provide better solutions.

Bounding solutions can be implemented by saving the best solution found so far and by adding constraints ft(T) < optimum on the end times of the tasks incorporated into the solution, optimum being the value of the current best solution found. However, the use of heuristic functions that estimate the values of the solutions that can be obtained from a task will help to prune the search space. This way, as we are constructing a certain tree and calculating its partial solution, we can detect more accurately, with the help of the heuristic functions, whether that tree will not improve the best solution found, so that we can discard it earlier. As shown below, the same heuristics can also serve to choose the order of exploring alternatives more appropriately.

The estimation of branches must be optimistic, so that when excluding the alternatives that do not improve the current best solution we do not lose any solution; that is, it must be a lower bound of the value of all the possible solutions deriving from this partial solution. If no heuristics are considered, the estimation when a task is added to a partial solution would be the end time of the task. Obviously, this estimation is too optimistic, because it does not consider the tasks below T. Notice that the tighter the bound imposed by the estimation, the better the algorithm will behave in pruning the search space, but the greater the computational cost of calculating the estimation.

In this work several heuristic functions are presented that carry out optimistic estimations of the solutions. They are based on separating what can be pre-calculated off-line from those calculations that must wait until the execution of the branch and bound algorithm, in order to minimize the computational cost.

5.1 The heuristic function hs
The first significant estimation takes into account the minimum time needed to execute each task and its successors in the And/Or graph, without considering the constraints due to the use of shared resources among tasks of different branches in the
graph and without considering the possible delays due to the changes of configuration in the machines and to the transportation of subassemblies between machines. This way, the minimum time to execute a task T and its successors is estimated by the expression:
hs(T) = dur(T) + max( min_{Ti ∈ Or1} hs(Ti), min_{Tj ∈ Or2} hs(Tj) )    (1)
where Ti and Tj refer to the tasks associated to the Or nodes connected by T. The minimum operation refers to the selection of the most favorable task in the Or node (only one is executed), while the maximum operation refers to the most loaded branch (Or node) according to the estimation (at least a task of each branch must be chosen to complete the solution). Notice that these calculations can be done off-line, i.e., before the algorithm starts the search for solutions. If task T was added to a partial solution, the estimation of the total assembly time (eft) would be:

eft = est(T) + hs(T)    (2)
est(T) being the earliest start time of task T and hs(T) the estimated minimum time needed to execute T and all the tasks below it in any subtree. Notice that the precedence constraints defined in the And/Or graph are taken into account via the constraints collected for st(T), the start time of task T. The previous calculations make it possible to exclude those tasks (and to delete their associated subtrees) whose estimated accumulated time eft is not less than the value of the current best solution. In addition, if all the tasks corresponding to one branch have been eliminated, the partial solution must be discarded and the algorithm must backtrack to the closest choice point. These estimations are incorporated into the algorithm, in order to obtain a tighter domain for the variables st(T) and ft(T), by substituting in the function createConstraintsTask of Figure 6 the constraint

ft(T) ≤ makespan    (3)

by

st(T) + hs(T) ≤ makespan    (4)
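The following sketch shows how hs could be pre-computed and used in the pruning test of Equation (4). The encoding of the And/Or graph (a task object carrying its duration and two tuples of alternative successor tasks, one per Or node) is an assumption of this sketch, not the data structures actually used in CB2.

from functools import lru_cache

class Task:
    def __init__(self, name, dur, or1=(), or2=()):
        self.name, self.dur = name, dur
        self.or1, self.or2 = tuple(or1), tuple(or2)   # alternatives of each Or node

@lru_cache(maxsize=None)                              # hs can be computed off-line once
def hs(task):
    branch1 = min((hs(t) for t in task.or1), default=0)   # most favorable alternative
    branch2 = min((hs(t) for t in task.or2), default=0)
    return task.dur + max(branch1, branch2)                # most loaded branch

def may_improve(est_T, task, best_makespan):
    # Pruning test of Equation (4): est(T) + hs(T) must stay below the incumbent.
    return est_T + hs(task) < best_makespan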
5.2 The heuristic function ht
If the delays due to the changes of configuration in the machines (∆cht) and to the transportation of intermediate subassemblies (∆mov) are taken into account, a new heuristic ht can be defined, with the same meaning as hs:

ht(T) = dur(T) + max( min_{Ti∈Or1} htc(Ti, T), min_{Tj∈Or2} htc(Tj, T) )    (5)

where

htc(Ti, T) = ht(Ti) + max( τ(Ti, M(T), C(T)), ∆mov(sa(Ti), M(Ti), M(T)) )    (6)

M(T) and C(T) being the machine and configuration required for the execution of task T, respectively, M(Ti) the machine needed for executing task Ti, and sa(Ti) the subassembly produced by task Ti (a successor of task T). The new factor τ(T, M, C) represents the additional time, beyond the execution of T and its successors, needed because configuration C is required on machine M so that it can be used by a task above T in the And/Or graph. The value of τ is given by the expressions:

τ(T, M(T), C(T)) = 0
τ(T, M(T), C) = ∆cht(M(T), C(T), C)
τ(T, M, C) = max(0, τ1(T, M, C)), if M ≠ M(T), where    (7)
τ1(T, M, C) = max( min_{Ti∈Or1} τ2(T, Ti, M, C), min_{Tj∈Or2} τ2(T, Tj, M, C) )
τ2(T, Ti, M, C) = τ(Ti, M, C) − (ht(T) − ht(Ti))
Notice that, when M ≠ M(T), τ is positive (τ1 > 0) if the configuration change needed for the execution of some successor task takes longer than the execution of T itself. Notice also that these calculations can be done off-line. The estimation given in Equation (2) for the minimum assembly time (eft) is still valid, but now using the heuristic ht instead of hs:

eft = est(T) + ht(T)    (8)

Obviously, the criteria for pruning the search tree are the same as before, even though the estimation can be different.
6 Heuristics for exploration order
As explained below, the heuristic functions defined in the previous section can be used to select the order of exploration in the search tree. However, other strategies can also be used, and some additional criteria are defined in the next subsections.
6.1 The heuristic function nTrees
This heuristic takes into account the number of different subtrees that can be generated starting from a task T. In this work we examine the influence of selecting Or nodes (function selectNode in Fig. 4) with a smaller or larger number of subtrees below them when exploring the And/Or graph to construct assembly trees. In the experiments, two different criteria will be used: selecting the Or node with the lowest and with the highest number of subtrees, respectively. The number of different subtrees from a task T (And node) in the And/Or graph, nTrees(T), can be calculated as
nTrees(T) = nTrees(Or1) × nTrees(Or2)    (9)
Or1 and Or2 being the two Or nodes below T in the And/Or graph. On the other hand, the number of different subtrees from an Or node, nTrees(Or), can be calculated as

nTrees(Or) = max( 1, Σ_{i=1..nTasks(Or)} nTrees(Ti) )    (10)
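A direct transcription of Equations (9) and (10), under the same assumed task/Or-node encoding as in the earlier hs sketch:

def ntrees_task(task):
    # And node: one subtree per combination of subtrees of its two Or nodes.
    return ntrees_or(task.or1) * ntrees_or(task.or2)

def ntrees_or(or_tasks):
    # Or node: sum over its alternative tasks; a leaf Or node counts as one subtree.
    return max(1, sum(ntrees_task(t) for t in or_tasks))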
6.2 The heuristic function height
This heuristic considers the height of a task in the And/Or graph. The height of the nodes in the And/Or graph is defined recursively, and it refers to the maximum distance from the node to a leaf node in the best case (subtree):

height(Or) = min_{Ti∈Or} height(Ti)
height(T) = max(height(Or1), height(Or2)) + 1    (11)
The height of an Or node is defined as the minimal height of the tasks below it, and the height of a task (And node) is one plus the maximal height of the Or nodes below the task. For the leaf nodes, the height is defined as zero. As for the previous heuristic function, two different criteria will be used in order to examine the influence of the height on the selection of Or nodes (function selectNode in Fig. 4): selecting the Or node with the lowest and with the highest height, respectively.
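Equation (11) can be transcribed in the same way; leaf Or nodes get height zero:

def height_or(or_tasks):
    return min((height_task(t) for t in or_tasks), default=0)

def height_task(task):
    return max(height_or(task.or1), height_or(task.or2)) + 1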
6.3 Selection of the exploration order
As we saw in Section 4, there are two choice points in the algorithm CB2 (see Figure 4): the first corresponds to the function selectNode(OrList, heuristic), which selects an Or node in order to grow the tree of tasks, and the second corresponds to the function rankTasks(SA, heuristic), which decides the order of selection of tasks from an Or node (subassembly SA), that is, the order in which the different trees are generated. Other criteria for the exploration order could be similar to those usually adopted for variables and values in CSPs, i.e., the fail-first and succeed-first principles [17] [18]. Under these principles, the most restricted variable is chosen first, i.e., the one that could make the search fail most easily, so that failures are detected as soon as possible. On the other hand, the most promising value is chosen for the variable, i.e., the one that leaves the most possibilities of satisfying all the constraints. In our case we may consider that the Or nodes play the role of the CSP variables, and the tasks of each Or node play the role of the values. Therefore, we should start with the most critical Or node and, within an Or node, with the most promising task. According to this, we can use the heuristic functions hs and ht defined for bounding solutions in order to rank both Or nodes and And nodes: the first selector function, selectNode, would select first the Or nodes with higher values of hs or ht, while the second selector, rankTasks, would select first the tasks with lower values of hs or ht. However, we have also tested other strategies, in order to evaluate their influence on the computational efficiency of the algorithm. For the selection of Or nodes in the function selectNode, we have tested nine different strategies: selecting the Or node with the smallest number of subtrees, with the largest number of subtrees, with the best hs, with the worst hs, with the best ht, with the worst ht, with the smallest height, with the largest height, and finally selecting the Or node according to the order of inclusion in the OrList (referred to as the none strategy). For the selection of tasks in the function rankTasks, we have tested three different strategies: selecting the tasks in increasing order according to the heuristic hs or ht considered, selecting the tasks in decreasing order of number of subtrees, and using the order of the And/Or graph (referred to as the none strategy). Finally, we have used three different constraints in the function createConstraintsTask, corresponding to the use of hs, ht (see Subsections 5.1 and 5.2) or none of them, so that the domains of the variables can be reduced, which can also affect the search.
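The sketch below illustrates how a few of these strategies could be plugged into the two selector functions; the function names selectNode and rankTasks come from Figure 4, but the implementation and the strategy names are invented for illustration, reusing the hs, nTrees and height helpers sketched earlier.

def select_node(or_list, strategy):
    # Or nodes play the role of CSP variables: pick the most critical one first.
    if strategy == "none":
        return or_list[0]                                  # order of inclusion in OrList
    keys = {
        "min_ntrees": lambda o: ntrees_or(o),
        "max_ntrees": lambda o: -ntrees_or(o),
        "best_hs":    lambda o: -min(hs(t) for t in o),    # highest hs value first
        "min_height": lambda o: height_or(o),
        "max_height": lambda o: -height_or(o),
    }
    return min(or_list, key=keys[strategy])

def rank_tasks(or_tasks, strategy):
    # Tasks play the role of CSP values: try the most promising alternative first.
    if strategy == "hs":
        return sorted(or_tasks, key=hs)                    # lowest estimate first
    if strategy == "ntrees":
        return sorted(or_tasks, key=ntrees_task, reverse=True)
    return list(or_tasks)                                  # order of the And/Or graph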
7 Results and Discussion
The algorithm CB2 has been implemented using ILOG Solver and Scheduler [19], a C++ library for Constraint Programming and scheduling problems. The algorithm has been tested in a variety of situations, considering different numbers of machines and configurations and different durations of the tasks. The results in the next tables correspond to a hypothetical product of 30 parts, with 396 Or nodes and 764 And nodes in the And/Or graph, so that the number of legal linear sequences is about 10^21. The tables show the results of 11 different problems generated randomly (each problem has a different combination of durations and resources for the tasks in the And/Or graph) in a system with 2 different machines and 2 different configurations for each machine. The values in the tables correspond to the execution time, in seconds, on a Pentium IV (1.47 GHz). As indicated in the previous section, we have tested different strategies for the search, in order to evaluate their influence, so that for each of the 11 problems we used 72 different compatible strategies. Table 2 shows the influence of using the heuristics hs and ht in the function createConstraintsTask when substituting the constraint

ft(T) ≤ makespan    (12)

by

st(T) + hs(T) ≤ makespan  /  st(T) + ht(T) ≤ makespan    (13)
The table shows, for each alternative heuristic (or none of them) used for bounding the makespan, the average time obtained and the ratio of problems that were solved in less time than with the other heuristics.
Table 2. Influence of using heuristics hs and ht when bounding the makespan.

Problem      –                     hs                    ht
           time (s)  best (%)    time (s)  best (%)    time (s)  best (%)
#09          36.56     0.0        962.39     0.0         28.80    100.0
#13         257.99     3.7         73.78    63.0        261.16     33.3
#18         197.02     3.7        196.49     0.0        187.57     96.3
#19          93.11     0.0         90.15     0.0         80.33    100.0
#28         264.87    14.8        665.64     0.0        258.16     85.2
#35         134.18     0.0        132.93     0.0        129.42    100.0
#61         109.92     7.4         77.59    48.1         96.19     44.4
#63        1683.79     3.6        125.97    32.1       1946.31     64.3
#65         102.56     0.0        100.53     0.0         96.53    100.0
#81         525.13    51.9        564.53     3.7        524.78     44.4
#91         929.02    96.3        948.04     0.0        944.05      3.7
#Ave        394.01    16.5        358.00    13.4        413.94     70.1
Each entry for hs and ht corresponds to 27 different combinations of search strategies, and 18 for the other alternative. We can see that the use of ht gives better results than the other alternatives in 70.1% of the tests (19 out of 27). The average time for ht is worse than for the other heuristics, but this can be explained by the results for Problem #63, whose execution time is very high. In fact, the heuristic ht obtained better results in 7 of the 11 problems, that is, 63.6%. Table 3 shows the influence of using the heuristics "hs or ht" and the number of subtrees (referred to as nTrees) in the function rankTasks, which decides the order of selection of alternative tasks from each Or node. The table shows, for each heuristic (or none of them), the average time obtained and the ratio of problems that were solved in less time than with the other heuristics. In this case, each entry for the heuristics "nTrees" and "–" (no heuristic) corresponds to 27 different combinations of search strategies, and 18 for the heuristic "hs or ht".
Table 3. Influence of the heuristics hs or ht and nTrees when selecting the tasks.

Problem      –                     nTrees                hs or ht
           time (s)  best (%)    time (s)  best (%)    time (s)  best (%)
#09          41.25    33.3         45.24     0.0        941.26     66.7
#13          61.49    85.2        100.07     0.0        431.38     14.8
#18         356.32     0.0        188.05     0.0         36.73    100.0
#19         106.59     0.0        133.38     0.0         23.62    100.0
#28         470.71     0.0         54.14   100.0        663.81      0.0
#35         195.21     0.0        179.96     0.0         21.36    100.0
#61         117.75     0.0         88.40    37.0         77.55     63.0
#63          72.87     0.0         48.70    96.4       3627.40      3.6
#65         195.13     0.0         86.51     3.6         20.74     96.4
#81         115.39   100.0        744.19     0.0        756.24      0.0
#91        1086.75     0.0        948.27     3.6        791.51     96.4
#Ave        256.31    19.9        237.90    21.9        671.96     58.3
Table 4. Average time for the heuristics used in the selection of Or nodes.

Problem      –      min     max     best    best    worst   worst   min     max
                    nTrees  nTrees  hs      ht      hs      ht      height  height
#09        346.5    341.5   350.1   345.1   344.1   337.8   339.7   337.0   341.4
#13        205.3    166.1   219.8   167.7   167.5   212.3   205.4   227.9   206.9
#18        194.1    185.7   204.7   187.3   187.1   196.0   197.1   190.3   200.9
#19         89.8     89.6    85.6    93.2    87.0    86.2    87.3    87.7    84.4
#28        395.9    390.9   419.8   391.6   393.4   389.8   390.7   399.2   394.6
#35        137.4    128.7   136.8   130.9   132.4   132.3   131.4   127.5   132.1
#61        110.2     76.5   109.2    79.9    74.3   107.2   110.9   112.0    70.8
#63       1322.9   1326.5  1330.5  1325.4  1322.6  1311.1  1313.1  1299.8  1303.4
#65        103.1     95.3    95.8    97.4   103.4    93.6   107.6   101.3   101.2
#81        558.6    555.7   478.8   591.9   436.6   562.5   591.2   575.7   468.6
#91        938.1    935.3   960.3   937.9   939.4   947.2   938.6   918.1   945.2
#Ave       400.2    392.6   398.3   394.6   383.4   398.4   400.2   393.5   387.5
We can see that in more than half of the cases the use of hs or ht gives better results than the other heuristics. As before, its average time is worse than the others, but mainly because of the very high times for Problems #63 and #09. However, the heuristic "hs or ht" obtains better results than the other strategies in 6 of the 11 problems (54.5%). Moreover, even for Problem #09, where its average time was higher, this heuristic obtained better results than the others in 66.7% of the cases. Finally, Tables 4 and 5 show the results obtained with the 9 different strategies used in the function selectNode, which selects the Or node to be expanded. Each entry shows the results for 8 tests, corresponding to the compatible combinations of strategies referred to in Tables 2 and 3. For each problem, Table 4 shows the average execution time for each heuristic, and Table 5 shows the ratio of problems solved in less time than with the other heuristics. Table 4 shows that the best average times are obtained with the heuristics "best ht" and "max height". However, when analysing the individual problems, we can see that most of the time the better heuristics are "min nTrees" and "min height". Concretely, "min height" obtained the best average times in 6 of the 11 problems, while "min nTrees" obtained the best average times in 3 problems and in 3 others was the second best heuristic. At the opposite end, the heuristic "best ht" was the best 2 times and the second best once, while "max height" was the best only once and the second best twice. In fact, if Problem #81 were not considered, the average times would agree with the previous reasoning. Table 5 shows these effects more clearly, and we can confirm that the heuristics that obtain the best results are "min height" (the best for 59.9% of the tests) and "min nTrees" (the best for 20.6%). This behaviour can be explained as follows: when selecting the Or node with the smallest height (or the fewest subtrees), the tree tends to grow in depth, so that the algorithm adds tasks which have precedence constraints with others already introduced in the tree, and propagateConstraints in the CB2 algorithm (see Figure 4) can prune more solutions.
Table 5. Number of times (%) that a heuristic solves the problem more quickly than the others.

Problem      –      min     max     best    best    worst   worst   min     max
                    nTrees  nTrees  hs      ht      hs      ht      height  height
#09         0.0%    66.7%   0.0%    0.0%    0.0%    22.2%   11.1%    0.0%    0.0%
#13         0.0%    11.1%   0.0%   11.1%    0.0%     0.0%    0.0%   77.8%    0.0%
#18         0.0%    22.2%   0.0%   11.1%    0.0%     0.0%    0.0%   66.7%    0.0%
#19         0.0%    22.2%   0.0%    0.0%    0.0%    11.1%    0.0%   66.7%    0.0%
#28         0.0%     0.0%   0.0%    0.0%    0.0%    33.3%    0.0%   66.7%    0.0%
#35         0.0%     0.0%   0.0%   11.1%    0.0%     0.0%   22.2%   66.7%    0.0%
#61         0.0%    11.1%   0.0%    0.0%    0.0%     0.0%    0.0%   88.9%    0.0%
#63         0.0%    33.3%   0.0%    0.0%    0.0%     0.0%   11.1%   55.6%    0.0%
#65        10.0%    50.0%   0.0%    0.0%    0.0%    10.0%    0.0%   30.0%    0.0%
#81        10.0%    10.0%   0.0%    0.0%    0.0%    10.0%    0.0%   40.0%   30.0%
#91         0.0%     0.0%   0.0%    0.0%    0.0%     0.0%    0.0%  100.0%    0.0%
#Ave        1.8%    20.6%   0.0%    3.0%    0.0%     7.9%    4.0%   59.9%    2.7%
In that part of the algorithm, the propagation is limited to the precedence constraints, without considering the resource constraints, which are taken into account in the function findSolution. So, using this strategy, the algorithm tends to choose those tasks that are close to the leaf nodes and have more precedence constraints than the others. This also explains why "max height" and "max nTrees" obtain the worst results, because the trees then tend to grow in width.
8 Conclusions and future work
This paper presents CB2, a constraint-based algorithm using a double backtracking approach, in order to obtain optimal assembly sequences in systems with multiple assembly machines. This problem involves not only the ordering of assembly tasks, but also their selection from a set of alternatives, which are represented through an And/Or graph for the assembly of the product. The problem is solved by generating the different promising assembly plans (trees of the And/Or graph) containing a complete set of tasks sufficient to assemble the product. An assembly plan is modelled as a CSP, with precedence constraints among the tasks defined in the And/Or graph, and resource constraints. The model takes into account (in the constraints) the possible delays due to the changes of configuration in the assembly machines between the execution of different tasks, and to the transportation of intermediate subassemblies between different machines. The combinatorial character of the problem is due to the resource constraints (use of shared resources by tasks without precedence constraints among them), and also to the selection of tasks from the And/Or graph. Instead of enumerating all the assembly trees and solving the corresponding subproblems separately with well-known techniques for handling disjunctive constraints, such as in [1] and [16], the algorithm presented makes a combined treatment of all the disjunctive constraints through backtracking. This approach takes advantage of the calculations associated with the tasks located near the root of the And/Or graph, which are common to many of the possible solutions. In order to improve the computational behaviour of the algorithm, different heuristic criteria are proposed, both for bounding solutions and for selecting the exploration order of the algorithm. Some results of the use of those criteria are presented. The scheduling of alternative tasks has received little attention, so the development of techniques for search and constraint propagation is of special interest. This can be done in different ways, as in [3]. Other open ideas include the adaptation of heuristics proposed for A* algorithms for the same problem [13].
Acknowledgements This work has been partially funded by the Spanish Ministry of Science and Technology, project DPI2000-0666-C02-02. The authors would also like to extend their thanks to the anonymous referees for their helpful suggestions.
References
1. Y. Caseau and F. Laburthe. Improving Branch and Bound for Jobshop Scheduling with Constraint Propagation. Proc. 8th Franco-Japanese and 4th Franco-Chinese Conference CCS'95.
2. P. Esquirol, H. Fargier, P. Lopez, T. Schiex. Constraint programming. Belgian Journal of Operations Research, Statistics and Computer Sciences, 1996.
3. J. C. Beck and M. S. Fox. Constraint-directed techniques for scheduling alternative activities. Artificial Intelligence, 121 (2000) 211-250.
4. C. Del Valle and E.F. Camacho. Automatic Assembly Task Assignment for a Multirobot Environment. Control Engineering Practice, Vol. 4, No. 7, pp. 915-921, 1996.
5. C. Del Valle, M. Toro, E.F. Camacho and R.M. Gasca. A Scheduling Approach to Assembly Sequence Planning. Proceedings of the 2003 IEEE International Symposium on Assembly and Task Planning, pp. 103-108, Besançon, France, July, 2003.
6. L.S. Homem de Mello and A.C. Sanderson. A Correct and Complete Algorithm for the Generation of Mechanical Assembly Sequences. IEEE Trans. on Robotics and Automation, Vol. 7, No. 2, pp. 228-240, 1991.
7. T. L. Calton. Advancing design-for-assembly. The next generation in assembly planning. Proceedings of the 1999 IEEE International Symposium on Assembly and Task Planning, pp. 57-62, Porto, Portugal, July, 1999.
8. L. Kavraki, J.C. Latombe, and R.H. Wilson. On the Complexity of Assembly Partitioning. Information Processing Letters, Vol. 48, pp. 229-235, 1993.
9. L. Kavraki and M. Kolountzakis. Partitioning a planar assembly into two connected parts is NP-complete. Information Processing Letters, Vol. 55, pp. 156-165, 1995.
10. R.H. Wilson, L. Kavraki, T. Lozano-Pérez and J.C. Latombe. Two-Handed Assembly Sequencing. International Journal of Robotic Research, Vol. 14, pp. 335-350, 1995.
11. Homem de Mello, L.S., Lee, S. (eds.): Computer-Aided Mechanical Assembly Planning. Kluwer Academic Publishers, 1991.
12. M. H. Goldwasser and R. Motwani. Complexity measures for assembly sequences. International Journal of Computational Geometry and Applications, 9:371-418, 1999.
13. C. Del Valle. Algoritmos heurísticos para la selección de secuencias óptimas de ensamblaje. Tesis doctoral, Universidad de Sevilla, 2001.
14. L.S. Homem de Mello and A.C. Sanderson. And/Or Graph Representation of Assembly Plans. IEEE Transactions on Robotics and Automation, Vol. 6, No. 2, pp. 188-199, 1990.
15. J. Wolter. A Combinatorial Analysis of Enumerative Data Structures for Assembly Planning. Journal of Design and Manufacturing, Vol. 2, No. 2, pp. 93-104, 1992.
16. J. Carlier, E. Pinson. Adjustment of heads and tails for the job-shop problem. European J. Oper. Res. 78, pp. 146-161, 1994.
17. R. M. Haralick and G. L. Elliot. Increasing tree search efficiency for constraint satisfaction problems. Artificial Intelligence, 14: 263-313, 1980.
18. Pascal Van Hentenryck. Constraint Satisfaction in Logic Programming. The MIT Press, 1989.
19. ILOG, France, http://www.ilog.fr/.
A software tool for modelling and solving optimization problems with side constraints

Andrew Davenport
[email protected] IBM T.J.Watson Research Center, Yorktown Heights, New York, USA
Abstract. Many real world optimization problems can be modelled as one of a number of well known efficiently solvable problem types such as network flow problems, matching problems, minimum spanning tree or shortest paths problems or as an NP-hard problem such as the multiple knapsack problem or travelling salesman problem. Often side constraints are used to model additional aspects of the problem being solved which do not fit into one of these core problem models. Solving such problems with side constraints requires expertise in either adapting specialised algorithms to handle the new constraints, making use of techniques such as Lagrangean relaxation or modelling the problem using a general purpose solver such as constraint programming or integer programming. We make the case that what is needed by optimization developers is a general purpose software tool that (a) can be used to model and solve a wide range of “standard” problems using the most efficient specialised solution techniques available; (b) provides a declarative language for stating side constraints to these problems, and (c) provides a number of general purpose and specialized techniques for solving such problems with side constraints. Such a tool should be usable by a non-expert optimization developer. We illustrate how such a tool could be used to solve an inventory matching problem arising from the steel industry.
1 Introduction
The Operations Research and Computer Science communities have, over the years, identified and developed efficient algorithms and approximation schemes for solving a wide range of recurring problem types that are often encountered in practice. Examples of such problems include many for which polynomial time algorithms are known, such as network flow problems (max-flow, min-cost maxflow), matching problems (weighted bipartite matching, weighted general matching, assignment problems, b-matchings), minimum spanning tree problems and shortest path problems. Many NP-hard problems have also been well studied [1], and good algorithms and approximation schemes have been developed for solving them. Examples of such problems include the multiple knapsack problem, the travelling salesman problem and the set covering problem.
In practice many real world optimization problems can be modelled as variations of these well studied problem types, either directly or with the addition of side constraints. For example:

1. Inventory matching of partially finished steel coils can be modelled as a multiple knapsack problem [2]. Restrictions on combinations of items that can be placed together in a knapsack can be modelled using colour constraints [3].
2. Sequencing of coils in a steel mill finishing line facility can be modelled as a travelling salesman problem with side constraints [4].
3. The winner determination problem in combinatorial auctions can be modelled as a set covering problem. Business rules such as minimum and maximum number of winners can be modelled using side constraints [5, 6].

There has been much work directed towards solving these well known problem types in their pure form, without side constraints, using a diverse range of specialised and efficient techniques. But despite the fact that solutions to these problems are applicable in a wide range of industry settings, there does not exist, as far as we are aware, any publicly available software package containing efficient implementations of these algorithms that can be used by developers who need to solve such problems. General purpose solvers based on integer programming, such as CPLEX or OSL, or constraint programming, such as ILOG Solver, are often used to model and solve these types of problems. However, general purpose technology may not be the most efficient way of solving these problems. Furthermore, integer programming has a weak modelling language which is difficult to use for the non-expert optimization developer. Constraint programming provides a more declarative modelling language, but is not specifically designed for solving optimization problems. One software tool that does contain good implementations of many useful polynomial time algorithms (such as shortest paths, matching and network flow algorithms) is LEDA [7]. LEDA is a C++ library that provides a graph modelling language, uses graph-based data structures as inputs to its optimization algorithms and provides a graph visualization capability. We have used LEDA as the basis for a number of optimization applications. However LEDA does not provide any capability for solving problems with side constraints, so such capabilities usually have to be added by the developer1. Furthermore LEDA does not provide any implementations of algorithms for solving common NP-hard problems. In this paper we propose that what is needed by software developers and consultants working on optimization applications is a software tool that:
A research project extending LEDA to handle side constraints using a combination of Lagrangean relaxation techniques and solution ranking has been made available [8], however this package is not easy to use and is not applicable to a wide range of problems.
1. Provides a capability for modelling and solving a wide range of "standard" polynomial time and NP-hard problems.
2. Provides a declarative modelling language for stating a wide range of side constraints to these problems.
3. Provides efficient implementations for solving such problems with side constraints that have been studied in the literature, as well as an open interface for adding new implementations for new classes of side constraints.
4. Provides implementations of general techniques for solving problems with side constraints when no existing techniques are available.

In the following sections we present some usability considerations that motivate the proposal we are making. We outline how this software tool could be used to solve an example of a real world inventory matching problem we encountered at a steel company. We also discuss some existing techniques for solving problems with side constraints. Unfortunately, while we raise many requirements for what an "ideal" software tool for us would look like, we do not provide many answers for how to build such a tool. We leave this as an exercise for (much) further research.
2 Usability considerations
In [9] it is pointed out that the wider uptake of constraint programming is hampered by the shortage of constraint programmers who have sufficient expertise in modelling constraint programs, and that tools should be developed to support the non-expert user in developing constraint programming applications. While we agree that this is a very important research direction, from our experience in the manufacturing domain we believe that constraint programming and optimization in general has a long way to go before software developers without any background in optimization can fully develop and deploy optimization technology without help from experts. Companies today deploy optimization technology in one of two ways. Firstly, they may use software packages which have been developed to solve a particular function, such as capacity planning for a supply chain, which then may be customised in some way by consultants. Alternatively when problems arise that cannot be handled by such general tools, for instance detailed scheduling with complex constraints or new applications of optimization technology are being pursued, companies may hire consultants with expertise in optimization to design and develop software tools to solve these types of problems. One of the problems of working with consultants in this way is that the issue of maintenance of the software after consultants exit from a project is a major concern to the client. Requirements for optimization applications usually change over time, often after consultants have left the project. Even during a project, tight schedules often require that changes have to be made very quickly to the constraints and objectives of the problem being solved by the application. This places further requirements on the nature of the optimization technology being deployed during a project. Modelling languages need
to be intuitive enough so that they can be understood by non-expert developers responsible for the maintenance of the optimization application. The optimization technology itself needs to be flexible enough so that it is robust to changes in the application requirements.
3 Case study: inventory matching

3.1 Problem overview
We consider as a case study a problem that arises in operations planning in the process industry. In make-to-order production systems, a surplus inventory accumulates due to cancellations of orders and rejection of production units for failing to satisfy quality requirements. It is to the advantage of the production facility to utilize this surplus inventory before planning its production activities. The inventory matching problem involves matching orders in an order book against this surplus inventory. The objectives are both maximizing the total amount of orders that are assigned, and minimizing total waste of production units. This process occurs as a preprocessing step to production planning, and was encountered in the operation of a large steel plant. Figure 1 presents a bipartite graph representation of an inventory matching problem, where the nodes on the left of the graph represent orders and the nodes on the right represent inventory items (in this case surplus slabs of steel). Edges in the graph represent possible assignments of orders to surplus slabs in inventory. Manufacturability considerations such as the compatibility of the orders and slabs in terms of quality and size impose assignment constraints on which orders can be applied to which slabs. Each order has a target weight that needs to be delivered. However, in practice a single order will be satisfied by shipping a number of production units, for which a minimum and maximum weight will be specified. In the example of Figure 1, order C has a target weight of 20 tons, which can be satisfied with a number of production units of between 8 and 12 tons. Furthermore, the size of all production units being manufactured from a single slab of steel must be the same. For instance, order A can be satisfied by matching it to slab S2 by making two production units of 10 tons each. Thus, in order to solve the inventory matching problem, we need to decide which orders will be matched to which slabs, and for each application, how much weight of the applied order will be satisfied using the applied slab and what production unit size and number will be used to apply this weight. In practice, the weights of the orders can be much larger than the weights of the slabs, so typically a single order will be satisfied by matching it to many slabs. We also want to minimize the amount of partial surplus in the solution. Partial surplus occurs when orders are applied to a slab with some remaining waste on the slab. For instance, after applying 20 tons of order A to slab S1, the remaining 10 tons of the slab, if not applied elsewhere, is considered partial surplus. It is also possible to apply multiple orders to a single slab, if they share
Fig. 1. Bipartite graph representation of a simple steel inventory matching problem (orders A: 20 tons [7..11], B: 30 tons [10..15], C: 20 tons [8..12] on one side; slabs S1: 15 tons, S2: 20 tons on the other; edges labelled with the applicable order weights 15, 15, 20 and 20 tons)
common physical characteristics and process routes. This situation is referred to as the packing of orders on a slab.
3.2 Multiple knapsack problem representation
We first consider a simple version of the problem where no partial surplus and no packing is allowed. This is the best possible type of solution from the viewpoint of the user, since packing is a bottleneck operation and partial surplus is to be minimized. The problem can be modelled as a variation of the multiple knapsack problem [10] called the sparse multiple knapsack problem. In the multiple knapsack problem we are given a set of n items and a set of m knapsacks. Each item has a profit and a weight. Each knapsack has a capacity. The goal of the problem is to select m disjoint subsets of items so that the total profit of the selected items is a maximum and each subset can be assigned to a knapsack whose capacity is no less than the total weight of items in the subset. In the sparse multiple knapsack problem, assignment constraints exist between items and knapsacks, restricting which items can be placed in which knapsacks. The multiple knapsack problem formulation of the inventory matching problem associates orders with knapsacks and slabs with items. In the bipartite graph representation of the problem, we associate a weight on each edge corresponding to the profit associated with matching each order and slab. In this formulation the profit is the weight of the order that can be applied to the slab 2 . Finally,
In practice, this weight takes into account other factors, such as order priority and due date, hence the problem becomes a generalised assignment problem [10].
we only include an edge in the graph if the application of the corresponding order to the slab is feasible with respect to the assignment constraints and there exists a feasible production unit size and weight of the order for the application. An optimal solution to the problem presented in Figure 1 is to apply two production units of 7.5 tons of order A to slab S1 and two production units of 10 tons of order C to slab S2, giving a total applied weight of 35 tons. The multiple knapsack problem is a fundamental "core" operations research problem, and there has been much work directed to developing efficient algorithms to solve it. Specialised branch and bound algorithms are usually used to solve this problem to optimality [10]. However, these algorithms can be complicated and difficult to implement efficiently. Unfortunately, as far as we are aware, no commercial software programming library exists that provides a capability for modelling and solving instances of the multiple knapsack problem using efficient implementations of the best techniques that are known.
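To make the kind of interface we are arguing for more concrete, the sketch below states the Figure 1 instance as a sparse multiple knapsack. Every class and method name is hypothetical (no such package exists, which is precisely our point), the brute-force solver is only there to make the example complete, and the edge between order B and slab S1 is inferred from the figure rather than stated in the text.

from itertools import product

class SparseMultipleKnapsack:
    # Toy, exponential-time model of the sparse multiple knapsack problem,
    # standing in for the efficient specialised solver a real tool would provide.
    def __init__(self):
        self.capacity = {}     # knapsack (order) -> capacity
        self.weight = {}       # item (slab) -> weight
        self.profit = {}       # (knapsack, item) -> profit of this assignment

    def add_knapsack(self, name, capacity):
        self.capacity[name] = capacity
        return name

    def add_item(self, name, weight):
        self.weight[name] = weight
        return name

    def allow(self, knapsack, item, profit):
        self.profit[(knapsack, item)] = profit    # sparse assignment constraint

    def maximize_profit(self):
        items = list(self.weight)
        best_value, best_assignment = 0, {}
        for choice in product([None] + list(self.capacity), repeat=len(items)):
            assignment = dict(zip(items, choice))
            if not self._feasible(assignment):
                continue
            value = sum(self.profit[(k, i)]
                        for i, k in assignment.items() if k is not None)
            if value > best_value:
                best_value, best_assignment = value, assignment
        return best_value, best_assignment

    def _feasible(self, assignment):
        load = {k: 0 for k in self.capacity}
        for item, knapsack in assignment.items():
            if knapsack is None:
                continue
            if (knapsack, item) not in self.profit:   # assignment not allowed
                return False
            load[knapsack] += self.weight[item]
        return all(load[k] <= self.capacity[k] for k in self.capacity)

model = SparseMultipleKnapsack()
a = model.add_knapsack("A", capacity=20)     # orders are knapsacks
b = model.add_knapsack("B", capacity=30)
c = model.add_knapsack("C", capacity=20)
s1 = model.add_item("S1", weight=15)         # surplus slabs are items
s2 = model.add_item("S2", weight=20)
model.allow(a, s1, profit=15)                # applicable order weight per edge
model.allow(b, s1, profit=15)                # edge inferred from Figure 1
model.allow(a, s2, profit=20)
model.allow(c, s2, profit=20)
value, assignment = model.maximize_profit()  # value == 35 for this instance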
3.3 Network flow representation with side constraints
While the simple version of the inventory matching problem can be solved efficiently using well known operations research techniques, the problem becomes more complicated when we allow multiple orders to be matched to a single slab. Consider the problem illustrated in Figure 2. If we allow packing of multiple orders on a single slab then we can find a solution of applied weight 60 tons by matching 20 tons of order A to slab S1, 10 tons of order B to slab S1, 5 tons of order B to slab S2 and 25 tons of order C to slab S2.
Fig. 2. Bipartite graph representation of a steel inventory matching problem allowing packing (orders A: 20 tons [8..12], B: 15 tons [5..7], C: 28 tons [10..15]; slabs S1: 30 tons, S2: 30 tons; edges labelled with the applicable order weights 20, 15, 15 and 28 tons)
The inventory matching problem with packing cannot be formulated as a multiple knapsack problem, since solutions to the multiple knapsack problem provide a “many to one” matching, whereas what is needed is a “many to many” matching. In inventory matching with packing, we have orders which can be applied to more than one slab, and slabs that can be applied to more than one order. In [3] a heuristic solution using an iterated bipartite matching approach to solving this problem is presented. We present an alternative formulation of this problem as a network flow problem [11] with side constraints. By adding a source and a sink node to the bipartite graph of Figure 2, we can formulate a max-flow problem, as shown in Figure 3. Edges in this figure are labelled with a lower and upper edge flow capacity ([LB, U B]) and the flow in the max-flow solution to this problem (in round brackets). Edges between the source node and the order nodes, and the order nodes and the slab nodes, have a lower capacity of 0 and an upper capacity corresponding to the target weight of order. Edges between the slab nodes and the sink node have a lower capacity of 0 and an upper capacity corresponding to the weight of the slab. In a solution to the max-flow problem, the flow between each order node and slab node corresponds to the weight of the order that is to be applied to the slab.
Fig. 3. Network flow representation and solution of a steel inventory matching problem allowing packing (a source and a sink node are added to the bipartite graph of Fig. 2; each edge is labelled with its capacity bounds [LB, UB] and, in round brackets, its flow in the max-flow solution; the edge from order B to slab S2, marked with an asterisk, carries a flow of 2)
The max-flow formulation of the inventory matching problem can handle assignments of multiple orders to a slab, unlike the multiple knapsack formulation. However, it cannot guarantee feasibility of the assigned weights of orders to slabs with respect to their range of possible production unit sizes. Note that the max-flow relaxation of the problem presented in Figure 2 finds an infeasible solution of applied weight 60 tons. The infeasibility is with respect to the constraint on the minimum production unit size of order B. The flow on the edge between the order B node and the slab S2 node is 2, representing an applied weight of 2
tons of this order to the slab. The minimum production unit weight of order B is 5 tons. It is not possible to directly modify the max-flow formulation to state that if there is any flow on this edge, it must have a minimum value corresponding to the minimum production unit size of the order. This constraint on minimum production unit size is a side constraint to the network flow formulation of the problem. One technique based on constraint programming which is applicable to problems such as this is probe backtrack search [12]3. Probe backtrack search partitions the set of constraints in a problem into easy constraints and hard constraints. Easy constraints can be solved efficiently using some specialized algorithm. However, the solution found by the easy constraint solver may be infeasible with respect to the hard constraints. In this case, a constraint-based branch and bound search is performed. Branching takes place on possible ways of satisfying the violated hard constraints in the solution to the easy constraint problem. We illustrate the approach with the inventory matching problem. The easy constraints in this problem are those stating that we want to maximize the application of total order weight to surplus slab weight, while not exceeding, for each order and slab, the applied order and slab weight respectively. This easy problem is solved using the max-flow relaxation of the problem. The hard constraints for the max-flow relaxation are those relating to the minimum and maximum production unit size of each order. In our example, the infeasible constraint in the solution to the relaxation is that of the minimum production unit size of order B. The two branches that can be explored in the max-flow relaxation to satisfy this constraint are:

1. Set the lower capacity of the edge between the order B node and the slab S2 node to be 5.
2. Set the upper capacity of the edge between the order B node and the slab S2 node to be 0.

When we investigate these branches by solving two new max-flow problems with the constraints on each branch added, we find a feasible solution of applied weight 60 tons for the first branch (applying 5 tons of order B and 25 tons of order C to slab S2) and a feasible solution of applied weight 58 tons for the second branch. Constraint propagation techniques can be applied at each node in the search to detect infeasibilities or to prune edge capacity domains. Computing solutions to the max-flow problem can be performed in time O(n²m^(1/2)) for graphs with n nodes and m edges [14]. However, incremental max-flow algorithms have recently been developed that recompute from an existing solution the maximum flow after edges have been added or deleted [15]. These algorithms have a time complexity of O((∆n)²m), where ∆n is the number of affected nodes in the new max-flow solution. Such incremental algorithms can be used efficiently in a branch and bound search.
The same idea has been extended to use local search and probing [13].
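A skeleton of this probe-backtrack loop, specialised to the minimum-production-unit side constraint, is sketched below. The data structures are invented for illustration, and solve_max_flow stands for any max-flow solver that accepts lower and upper bounds on edge flows and returns either the flows on the order-to-slab edges or None if the bounds are infeasible.

def probe_backtrack(edges, min_unit, solve_max_flow, best=0.0):
    # edges: {(order, slab): [lower, upper]} flow bounds; min_unit: {order: tons}.
    flow = solve_max_flow(edges)              # probe: solve the easy (relaxed) problem
    if flow is None:
        return best
    total = sum(flow.values())
    if total <= best:                         # the relaxation bounds any completion
        return best
    violated = [(o, s) for (o, s), f in flow.items() if 0 < f < min_unit[o]]
    if not violated:                          # hard (side) constraints all satisfied
        return total
    o, s = violated[0]
    up = dict(edges)                          # branch 1: force at least one full unit
    up[(o, s)] = [min_unit[o], edges[(o, s)][1]]
    best = probe_backtrack(up, min_unit, solve_max_flow, best)
    down = dict(edges)                        # branch 2: forbid the application entirely
    down[(o, s)] = [0, 0]
    return probe_backtrack(down, min_unit, solve_max_flow, best)

Only the minimum production unit size is checked here; a fuller version would also branch on the maximum unit size and apply constraint propagation at each node.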
3.4 Why use the network flow representation with side constraints?
Constraint-based probing may not be the best way to solve the inventory matching problem we have presented here. Other specialised techniques may exhibit better performance. However, we do believe that the network flow representation with side constraints provides a good way of modelling the problem, for the following reasons:

1. A network flow model with side constraints is much more intuitive for non-optimization experts to understand than an integer programming model. A complicated, non-intuitive model will not be maintainable by developers who do not have a background in optimization.
2. Using a declarative constraint programming language to express side constraints on the network flow problem allows new constraints to be added easily to the model. For instance, the full inventory matching problem has further constraints covering, for instance, what is allowed in terms of partial surplus if a slab cannot be filled completely with orders 4. Adding new constraints to an integer programming formulation can be complicated, and may require new classes of variables to be created (such as indicator variables). A developer without an optimization background cannot perform this sort of task.
3. Many of the efficiently solvable problems we are discussing here (network flows, matching, spanning tree problems) have natural graph representations. Graphs can be easily visualized, aiding development, debugging and maintenance of the model.
4. By identifying that an efficiently solvable problem, such as a network flow, is at the core of the problem, we may be able to suggest specialized techniques for solving it. For some instances of problems with side constraints, efficient solution techniques have been developed (for example, there has been a lot of work on solving the resource constrained shortest path problem [8]).
4 Approaches to dealing with side constraints
In the previous section we have outlined how side constraints to a network flow problem can be handled using a constraint programming approach. Constraint programming provides a rich declarative framework for modelling side constraints, and constraint-based probing [12] can provide a general mechanism for solving problems with side constraints when no specialized techniques are available. The flexibility of the constraint-based probing approach allows for any appropriate algorithm to be used to handle the easy constraints. For instance, 4
An example is that each order may have associated with it a minimum amount of partial surplus that should occur if it is applied to a slab with partial surplus. When multiple orders are applied to a slab, the minimum of the minimum partial surpluses for the applied orders is the minimum amount of partial surplus that should be used.
a standard technique used in constraint-based approaches to solving scheduling problems is to satisfy the temporal constraints efficiently using an algorithm to solve the all-pairs shortest path problem [16, 17], while violations of the resource capacity constraints in the problem are resolved by branch and bound search and constraint propagation. Constraint-based probing is certainly not the most efficient technique for solving all problems with side constraints, even though a constraint-based language may be desirable for modelling such problems. There are a number of other general approaches for solving problems with side constraints, as well as many specialized algorithms for specific problems. In addition to constraint-based probing, an optimization expert might make use of a range of techniques such as:

1. Using modelling tricks to formulate the side constraints inside the language of the underlying easy problem.
2. Modifying and implementing existing algorithms to handle the side constraints.
3. Using Lagrangean relaxation, which moves side constraints into the objective function.
4. Performing iterative enumeration of multiple solutions to the "easy" problem, not taking into account the side constraints, and checking the feasibility of these solutions with respect to the side constraints. This can be done for problems such as shortest path with side constraints, where multiple solutions to the shortest path problem can be easily generated by solving the k-shortest path problem; a small sketch of this pattern is given at the end of this section.
5. Using general purpose optimization tools, such as integer programming or constraint programming.

The development and use of these techniques requires significant expertise in optimization. We propose that such expertise should be provided within a software tool for solving different classes of optimization problems with side constraints. For instance, many variations of the shortest path problem have been studied, such as the resource constrained shortest path problem [8], the shortest k-path problem [4], the k-shortest path problem, the shortest path problem with specified nodes and many others. Such problems arise surprisingly often in real world optimization applications. A declarative modelling language for stating such problems, and efficient implementations of specialised algorithms for solving these variations of the shortest path problem, would be invaluable for developers. In recent years a number of global constraints have been added to constraint programming languages to allow improved propagation over problems containing polynomial time sub-problems such as weighted bipartite matching [18] or network flows [19]. These global constraints do address some of the issues raised in this paper, in that they provide a modelling framework and solution technology for solving problems with side constraints. However, the search in these frameworks is guided not by the solution to the "easy" optimization problem, but by the satisfaction of the constraints in the problem. Optimization is handled indirectly by solving the constraint problem many times, with different
bounds on the value of the objective function represented by a constraint. Thus these techniques are more applicable to problems where the satisfaction of the constraints is difficult. However, they do provide an alternative way of solving problems with side constraints that could be used in a software tool.
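As a final illustration, the enumerate-and-check pattern of technique 4 above reduces to a few lines once a ranked enumerator for the easy problem is available; the enumerator and the side-constraint predicates below are placeholders.

def first_feasible(ranked_solutions, side_constraints, limit=1000):
    # ranked_solutions: easy-problem solutions in order of cost, e.g. from a
    # k-shortest-path generator; side_constraints: one predicate per constraint.
    for k, solution in enumerate(ranked_solutions):
        if k >= limit:                        # give up after a fixed number of probes
            return None
        if all(check(solution) for check in side_constraints):
            return solution
    return None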
5 Summary
In this paper we have made the observation that while many real world optimization problems involve solving "standard" polynomial time and NP-hard problems for which efficient algorithms are known, software tools providing good implementations of these algorithms are not commercially available. We believe that a software library of specialised algorithms for solving these types of problems would be of great value to the optimization developer community. LEDA [7] provides an excellent example of how to do this for a range of polynomial time algorithms, using graph-based data structures for formulating problems. We believe that such representations are more intuitive for non-expert users to understand than models formulated with general purpose solvers such as those based on constraint and integer programming, and that efficient implementations of specialised algorithms can provide better performance than models formulated with general purpose solvers. Secondly, the addition of side constraints to these "standard" problems makes them challenging to solve, especially for the non-expert developer. However, such side constraints frequently arise in real world optimization applications. Many specialised techniques have been developed for solving specific instances of problems with side constraints. Implementations of such techniques would be invaluable within an algorithms library such as LEDA. In situations where a particular type of side constraint cannot be handled by using specialised techniques, we propose that a constraint-based modelling language should allow arbitrary side constraints to be stated for these problems. General techniques, such as constraint-based probing, could be used for solving these problems when specialised techniques are not available.
References
1. Garey, M.R., Johnson, D.S.: Computers and Intractability: A Guide to the Theory of NP-Completeness. W.H. Freeman and Company, New York (1979)
2. Salman, F.S., Kalagnanam, J.R., Murthy, S., Davenport, A.J.: Cooperative strategies for solving the bicriteria sparse multiple knapsack problem. Journal of Heuristics 8 (2000) 215–239
3. Kalagnanam, J., Dawande, M., Trumbo, M., Lee, H.S.: The surplus inventory matching problem in the process industry. Operations Research 48 (2000)
4. Okano, H., Morioka, T., Yoda, K.: A heuristic solution for the continuous galvanizing line scheduling problem in a steel mill. Technical Report RT0478, IBM Research Report (2002)
5. Davenport, A., Kalagnanam, J.: Price negotiations for procurement of direct inputs. In Dietrich, B., Vohra, R., eds.: Mathematics of the Internet: E-Auction and Markets. Volume 127 of The IMA Volumes in Mathematics and its Applications. Springer (2001) 27–44
6. Hohner, G., Reid, G., J., R., Ng, E., Davenport, A., Kalagnanam, J., Lee, H.S., An, C.: Combinatorial and quantity-discount procurement auctions benefit Mars, Incorporated and its suppliers. Interfaces 33 (2003)
7. Mehlhorn, K., Näher, S., Uhrig, C.: The LEDA platform for combinatorial and geometric computing. In: Proceedings of the 24th International Colloquium on Automata, Languages and Programming (ICALP'97), Springer-Verlag, LNCS 1256 (1997) 7–16
8. Mehlhorn, K., Ziegelmann, M.: Cnop – a package for constrained network optimization. In: Proceedings of ALENEX'01, Springer-Verlag, LNCS 2239 (2001) 17–31
9. Little, J., Gebruers, C., Bridge, D., Freuder, E.: Capturing constraint programming experience: A case based approach. In: International Workshop on Reformulating Constraint Satisfaction Problems, Workshop Programme of the Eighth International Conference on Principles and Practice of Constraint Programming, CP'2002 (2002)
10. Martello, S., Toth, P.: Knapsack problems. John Wiley and Sons Ltd., New York (1989)
11. Ahuja, K., Magnanti, T.L., Orlin, J.B.: Network flows. Prentice Hall (1993) ISBN 0-13-617549-X
12. El Sakkout, H., Wallace, M.: Probe backtrack search for minimal perturbation in dynamic scheduling. Journal of Constraints 5 (2000) 359–388
13. Kamarainen, O., El Sakkout, H.: Local probing applied to scheduling. In: Principles and Practice of Constraint Programming, CP'2002, Springer-Verlag, LNCS 2470 (2002) 155–171
14. Goldberg, A.V., Tarjan, R.E.: A new approach to the maximum flow problem. J. Assoc. Comput. Mach. 35 (1988) 921–940
15. Kumar, S., Gupta, P.: An incremental algorithm for the maximum flow problem. Journal of Mathematical Modeling and Algorithms 2 (2003) 1–16
16. Cheng, C.C., Smith, S.F.: Applying constraint satisfaction techniques to job shop scheduling. Annals of Operations Research, Special Volume on Scheduling: Theory and Practice 1 (1996)
17. Cesta, A., Oddi, A.: Gaining efficiency and flexibility in the simple temporal problem. In Chittaro, L., Goodwin, S., Hamilton, H., Montanari, A., eds.: Proceedings of the Third International Workshop on Temporal Representation and Reasoning (TIME-96), Los Alamitos, CA, IEEE Computer Society Press (1996)
18. Caseau, Y., Laburthe, F.: Solving various weighted matching problems with constraints. In: Principles and Practice of Constraint Programming, CP'1997, Springer-Verlag, LNCS 1330 (1997)
19. Bockmayr, A., Pisaruk, N., Aggoun, A.: Network flow problems in constraint programming. In: Principles and Practice of Constraint Programming, CP'2001, Springer-Verlag, LNCS 2239 (2001) 196–210
Constraint Based Type Inferencing in Helium

Bastiaan Heeren, Jurriaan Hage, and S. Doaitse Swierstra
{bastiaan,jur,doaitse}@cs.uu.nl
Institute of Information and Computing Science, Universiteit Utrecht,
P.O. Box 80.089, 3508 TB Utrecht, Netherlands
Abstract. The Helium compiler implements a significant subset of the functional programming language Haskell. One of the major motivations for developing it was to yield understandable and appropriate type error messages. The separation between the generation, the ordering and the solving of constraints on types has led to a flexible framework which has been employed successfully in a classroom setting. Among its many advantages are the possibility to plug in heuristics for deciding the most likely source of a type error, the generation of multiple type error messages during a single compilation, and the possibility for programmers to tune the type inference process and resulting messages to their own needs without having to know any of the details of the implementation.
1 Introduction
At Universiteit Utrecht, the Helium compiler was developed with a special focus on generating understandable error messages for novice functional programmers [ILH]. The compiler implements a large subset of Haskell, the most serious omission being that of type classes, an omission which is currently being dealt with. The compiler has been successfully employed in an educational setting and the feedback obtained from the students is very positive. The generation of "good" type error messages for higher-order functional programming languages such as Haskell is a lively field, mainly because novice users have great difficulty understanding those generated by most existing compilers [HW03,Chi01,YMT00,Sul,McA00]. However, too little of this research has ended up in prominent compilers and interpreters available for the language. In that sense, external tools have an advantage, because they do not have to be built into a compiler. Every programmer has his own style of programming, like top-down, bottom-up or something in between. A programmer should be able to choose the style of type inferencing which suits him best, based on his style of programming. Therefore, we think that there is not a single deterministic type inference process which works best in all cases for everybody, and advocate the use of a more flexible system that can be tuned by the programmer. To make this workable, some mechanism should be present to make this possible without having to delve into the innards of the compiler. A drawback of existing compilers is that they are rigid, i.e., it is not possible for a programmer to change the way the type
inference process works, unless he is willing to inspect and modify the compiler itself; given the complexity of compilers, this is usually a very difficult task. The flexibility we sought has been obtained in the implementation of the type inference process of Helium by taking a constraint based approach that divides the process into three distinct phases:

1. the generation of constraints in the abstract syntax tree,
2. the ordering of the constraints in the tree into a list, and
3. the solving of constraints.

For the latter phases, multiple instantiations are possible. This separation has resulted in a flexible framework, which has been relatively easy to implement using an attribute grammar system (an abstraction layer on top of Haskell) developed by Swierstra et al. [SBL]. In fact, we plan to show that the translation of the type rules to an attribute grammar is quite straightforward. The generality of our framework also allows the emulation of well-known type inference algorithms such as W [DM82] and M [LY98], by choosing an appropriate order in which the type constraints should be solved. This makes it easy to compare these standard algorithms with type inference processes which use global heuristics and elaborate data structures such as type graphs (see Section 4.3). In this way we can gain insight into the computational penalties to be paid, and it has the additional benefit that the standard ways of type inferencing remain available to those who have grown used to them. The paper is organized as follows. After a short tutorial on type inferencing, and after introducing the sorts of constraints we need, we give the type inference rules for our language in Section 3. After showing how these can be transformed into an executable specification in the attribute grammar system in Section 4.1, we consider ways of ordering the constraints in Section 4.2. Section 4.3 considers various types of solvers: greedy ones, which can act similarly to algorithms like W and M, and a more global one, which tries to find a minimal set of errors based on a global analysis of a type graph. We conclude in Section 5 by briefly discussing the type inference directives described in [HHS03]. This facility gives programmers, and more specifically the developers of combinator libraries, a tool to control the behaviour of the compiler by specifying how the type inference process should behave on certain sets of expressions, and a way to provide tailor-made error messages for the combinator language. This, we feel, is an important innovation that has become feasible through the underlying constraint based approach.
2 Preliminaries
Types and substitutions. Many existing programming languages have a type system, “a tractable syntactic method for proving the absence of certain program behaviors by classifying phrases according to the kinds of values they compute” (Pierce [Pie02]).
Reasons for having a type system are many: source documentation, generating more efficient code, and finding program errors early. Tractability is essential for the usefulness of type systems within a compiler. Usually this is obtained by ensuring that the method can be performed in a syntax directed manner, i.e., determined by the structure of the abstract syntax tree. In fact, we generate constraints from our program in such a manner. In many programming languages with such a type system, it is necessary that the programmer annotates identifiers and procedures or methods with their types. It is then the task of the compiler to verify that these types are consistent and do not lead to run-time errors. The field which addresses this kind of problem is called type checking. In the case of languages like Haskell and ML, many of these annotations can be dropped since Damas and Milner showed how to infer the types of polymorphic let expressions [DM82]. For some language constructs, polymorphic lambda abstractions for instance, Haskell needs externally supplied types to make the type inferencing process decidable. For more information, the reader might consult Chapters 22 and 23 of Pierce [Pie02] for its constraint based development of a type system for a polymorphic functional language. The syntax of types and type schemes is given by:

    (type)          τ ::= α | T τ1 ... τn      where arity(T) = n
    (type scheme)   σ ::= ∀ᾱ.τ
A type can be either a type variable or a type constructor applied to a number of types. The arity of each type constructor is fixed. Typical examples are Int → Bool and a → (b → a), the function space constructor → being a binary type constructor. In the following, we use the standard infix notation τ1 → τ2 for function types, and a special notation for list and product types. The set of type constructors can be extended with user-defined data types such as Maybe. A type scheme ∀ᾱ.τ is a type in which a number of type variables ᾱ = α1, ..., αn, the polymorphic type variables, are bound by a universal quantifier. The free type variables are called monomorphic. Note that n may be zero, in which case a type scheme is simply a type. Although the type variables have an implicit order in any given type scheme, the order itself is not important. For this reason we may view the vector ᾱ as a set when the need arises. The set of free type variables of a type τ is denoted by ftv(τ) and simply consists of all type variables in τ. Additionally, ftv(∀ᾱ.τ) = ftv(τ) − ᾱ, where we use − to denote set difference. A substitution, usually denoted by S, is a mapping of type variables to types. For type variables D = {α1, ..., αn} and types τ1, ..., τn the substitution mapping αi to τi is denoted by [α1 := τ1, ..., αn := τn]. Implicitly we assume all type variables not in D are mapped to themselves. As usual, a substitution only replaces free type variables, so the quantified type variables in a type scheme are not affected by a substitution.
Generalizing a type τ with respect to a set of type variables M entails the quantification of the type variables in τ that do not occur in M.

    generalize(M, τ)  =def  ∀ᾱ.τ      where ᾱ = ftv(τ) − M
An instantiation of a type scheme is obtained by replacing the quantified type variables with fresh type variables.

    instantiate(∀α1...αn.τ)  =def  [α1 := β1, ..., αn := βn]τ      where β1, ..., βn are fresh
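To make these definitions concrete, the following minimal Haskell sketch gives one possible representation of types, type schemes, free type variables, generalization and instantiation. All names (Type, TVar, TCon, Scheme, ftv, generalize, instantiate) are our own illustration and are not taken from the Helium sources.

    import qualified Data.Set as Set
    import Data.Set (Set)

    -- A type is a type variable or a type constructor applied to argument types.
    data Type   = TVar Int | TCon String [Type]   deriving (Eq, Show)
    -- A type scheme quantifies a number of type variables over a type.
    data Scheme = Forall [Int] Type               deriving Show

    -- Free type variables of a type and of a type scheme.
    ftv :: Type -> Set Int
    ftv (TVar a)    = Set.singleton a
    ftv (TCon _ ts) = Set.unions (map ftv ts)

    ftvScheme :: Scheme -> Set Int
    ftvScheme (Forall as t) = ftv t `Set.difference` Set.fromList as

    -- generalize(M, t): quantify the variables of t that do not occur in M.
    generalize :: Set Int -> Type -> Scheme
    generalize m t = Forall (Set.toList (ftv t `Set.difference` m)) t

    -- instantiate: replace the quantified variables by fresh ones, supplied here
    -- as an explicit list of fresh variable numbers.
    instantiate :: [Int] -> Scheme -> Type
    instantiate fresh (Forall as t) = subst (zip as (map TVar fresh)) t
      where
        subst s (TVar a)    = maybe (TVar a) id (lookup a s)
        subst s (TCon c ts) = TCon c (map (subst s) ts)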
A type τ1 is a generic instance of a type scheme σ = ∀ᾱ.τ2, denoted τ1 ≺ σ, if there exists a substitution S with {β | S(β) ≠ β} ⊆ ᾱ such that τ1 = Sτ2. In the following we shall often encounter (finite) sets of pairs of the form x : τ, where usually x is a variable and τ a type. For such a set X we define dom(X) = {x | x : τ ∈ X} and ran(X) = {τ | x : τ ∈ X}.
Constraints. A constraint set, usually denoted by C, is a set of type constraints. We introduce three forms of type constraint:

    (constraint)    C ::= τ1 ≡ τ2 | τ1 ≤M τ2 | τ ≼ σ
An equality constraint (τ1 ≡ τ2) reflects that τ1 and τ2 should be unified at a later stage of the type inference process. The other two kinds of constraints are used to cope with the polymorphism introduced by let expressions. An explicit instance constraint (τ ≼ σ) states that τ has to be a generic instance of σ. This constraint is convenient if we know the type scheme of a variable before we start inferring its definition; this occurs, for instance, when an explicit type for the variable was given. In general, the (polymorphic) type of a variable introduced by a let expression is unknown and must be inferred before it can be instantiated. To overcome this problem we introduce an implicit instance constraint (τ1 ≤M τ2), which expresses that τ1 should be an instance of the type scheme that is obtained by generalizing type τ2 with respect to the set of monomorphic type variables M, i.e., quantifying over the other type variables, ftv(τ2) − M. We return to the necessity of the implicit instance constraint when we consider the type rule for let. Equality constraints on types can be lifted to sets of pairs of types by (X ≡ Y) = {τ1 ≡ τ2 | x : τ1 ∈ X, x : τ2 ∈ Y}, and similarly for ≤M and ≼. Once a constraint set has been generated, we can look for the minimal substitution that satisfies each constraint in the set. In the compiler, when we encounter a constraint that cannot be satisfied (or, in general, raises a conflict), we simply dismiss the constraint, and use it to generate an error message instead. In much of the literature, the unification of two types which cannot be unified results in a special error substitution. Needless to say, such a substitution never arises in our case.
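For concreteness, the three constraint forms could be represented as follows. This is a sketch with constructor names of our own choosing, reusing the Type and Scheme types from the previous fragment; it is not the Helium representation.

    -- The three forms of type constraint.
    data Constraint
      = Equal    Type Type             -- t1 == t2          (equality constraint)
      | Implicit Type (Set Int) Type   -- t1 <=_M t2        (implicit instance constraint)
      | Explicit Type Scheme           -- t  is instance of sigma (explicit instance constraint)
      deriving Show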
Satisfaction of a constraint by a substitution S is defined as follows:

    S satisfies (τ1 ≡ τ2)    =def  Sτ1 = Sτ2
    S satisfies (τ1 ≤M τ2)   =def  Sτ1 ≺ generalize(SM, Sτ2)
    S satisfies (τ ≼ σ)      =def  Sτ ≺ Sσ
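Satisfaction can be phrased directly over the representation sketched above. The fragment below reuses those definitions and adds a substitution type, a simple one-way matching function for the generic instance relation ≺, and the three cases of the satisfaction test; again, all names are ours and the code only mirrors the definitions, it is not the Helium implementation.

    import Control.Monad (foldM)
    import qualified Data.Map as Map
    import Data.Map (Map)

    type Subst = Map Int Type

    applySubst :: Subst -> Type -> Type
    applySubst s (TVar a)    = Map.findWithDefault (TVar a) a s
    applySubst s (TCon c ts) = TCon c (map (applySubst s) ts)

    -- A substitution leaves the quantified variables of a scheme untouched.
    applySubstScheme :: Subst -> Scheme -> Scheme
    applySubstScheme s (Forall as t) = Forall as (applySubst (foldr Map.delete s as) t)

    -- One-way matching: a substitution for the variables of the first type that
    -- makes it equal to the second type, if one exists.
    match :: Type -> Type -> Maybe Subst
    match t0 u0 = go t0 u0 Map.empty
      where
        go (TVar a) u s = case Map.lookup a s of
                            Nothing           -> Just (Map.insert a u s)
                            Just u' | u == u' -> Just s
                            _                 -> Nothing
        go (TCon c ts) (TCon d us) s
          | c == d && length ts == length us =
              foldM (\acc (t, u) -> go t u acc) s (zip ts us)
        go _ _ _ = Nothing

    -- The generic instance relation: the type is obtained from the body of the
    -- scheme by substituting only the quantified variables.
    isGenericInstanceOf :: Type -> Scheme -> Bool
    isGenericInstanceOf t (Forall as body) =
      case match body t of
        Nothing -> False
        Just s  -> and [ a `elem` as || u == TVar a | (a, u) <- Map.toList s ]

    satisfies :: Subst -> Constraint -> Bool
    satisfies s (Equal t1 t2)      = applySubst s t1 == applySubst s t2
    satisfies s (Explicit t sigma) = applySubst s t `isGenericInstanceOf` applySubstScheme s sigma
    satisfies s (Implicit t1 m t2) =
      applySubst s t1 `isGenericInstanceOf` generalize m' (applySubst s t2)
      where m' = Set.unions [ ftv (applySubst s (TVar a)) | a <- Set.toList m ]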
After substitution, the two types of an equality constraint should be syntactically the same. The logical choice for S in this case is the most general unifier of the two types. For an implicit instance constraint, the substitution is not only applied to both types, but also to the set of monomorphic type variables M. The substitution is applied to the type and the type scheme of an explicit instance constraint, where the quantified type variables of the type scheme are, as usual, untouched by the substitution. The distinction between the explicit and implicit instance constraint is that type variables in the (right hand side of the) latter can still become monomorphic. This information is known at the outset for explicit instance constraints. We illustrate this by an example.
Example 1. Let c = α3 ≤{α5} α1 → α2 be a constraint. We want a substitution S that satisfies c, i.e., Sα3 ≺ generalize(Sα5, S(α1 → α2)), where the latter is equal to ∀β1 β2. β1 → β2 if α1, α2 and α5 are not touched by S. A most general substitution to satisfy this constraint is S = [α3 := α → β], where α and β are fresh type variables. But what happens if we later encounter c′ = α5 ≡ α1? We have already chosen α1 to be polymorphic in c, although the constraint c′ now tells us that α1 was in fact monomorphic. It seems then that we considered c too early, and should have waited until we were certain that α1 was indeed polymorphic.
A safe approach in this case is to infer let definitions without an explicit type signature before considering their occurrences in the body of the let. To be more precise, an implicit instance constraint c = τ1 ≤M τ2 can be solved when τ2 does not contain any active type variables beside those in M. Activeness can be defined as follows.

    activevars(τ1 ≡ τ2)    =def  ftv(τ1) ∪ ftv(τ2)
    activevars(τ1 ≤M τ2)   =def  ftv(τ1) ∪ (ftv(M) ∩ ftv(τ2))
    activevars(τ ≼ σ)      =def  ftv(τ) ∪ ftv(σ)
The condition for the implicit instance constraint c to be solvable is then that

    (ftv(τ2) − M) ∩ activevars(C) = ∅                                    (1)

where activevars(C) = ∪ {activevars(c) | c ∈ C}. Let S be a substitution which satisfies the set of constraints C and was computed without ever violating condition (1) for an implicit instance constraint. Lemma 3 of [HHS02] proves that solving C by considering the constraints of C in any order (again, without ever violating condition (1)) results in the same substitution S (up to renaming of type variables). Thus, it follows that there is a lot of freedom when choosing an order, restricted only by the side condition (1).
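Condition (1) translates directly into a small check over the Constraint type from the earlier sketch (again, our own names):

    activevars :: Constraint -> Set Int
    activevars (Equal t1 t2)      = ftv t1 `Set.union` ftv t2
    activevars (Implicit t1 m t2) = ftv t1 `Set.union` (m `Set.intersection` ftv t2)
    activevars (Explicit t sigma) = ftv t `Set.union` ftvScheme sigma

    -- An implicit instance constraint may be solved only when no type variable of
    -- its right-hand side outside M is still active in the remaining constraints.
    solvable :: Constraint -> [Constraint] -> Bool
    solvable (Implicit _ m t2) rest =
      Set.null ((ftv t2 `Set.difference` m)
                  `Set.intersection` Set.unions (map activevars rest))
    solvable _ _ = True   -- equality and explicit instance constraints are always solvable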
    expr = lit | var | constructor
         | expr expr+
         | if expr then expr else expr
         | 'λ' pat+ '→' expr
         | case expr of alts
         | let decls in expr
         | '(' expr ')' | '[' expr ']'
         | expr '::' typescheme

    pat  = lit | var | constructor pat*
         | '(' pat ')' | '[' pat ']'
         | var '@' pat | '_'

    alt  = pat '→' expr
    decl = fbs | var '::' typescheme
    fb   = var pat* '=' rhs
    rhs  = expr (where decls)?

Fig. 1. Context-free grammar for the subset of Helium
As demonstrated in [HHS02], there are sets of constraints that contain only implicit instance constraints for which condition (1) is not fulfilled. In the next section we shall see that for sets of constraints generated from a Helium program this situation never occurs.
3 The bottom-up type inference rules
In this section we give the type inference rules for a large part of Helium. It is important to note that the notational similarity with the Hindley-Milner rules is deceptive. The Hindley-Milner rules form a system of logical deduction which formulates, for a lambda calculus enriched with a let construct, what the valid typings of a given expression are. Constructing the principal type for a given expression under these deduction rules is exactly what an algorithm such as W does. In this and the next section we formulate a non-deterministic algorithm which is a generalization of W, both by the introduction of more programming constructs as well as by being less deterministic. The rules given below are thus the first part of an algorithm. They specify for a given abstract syntax tree what constraints should be constructed at various nodes in the tree. In fact, the rules are close to an attribute grammar, and we show later how they can be transformed into executable code in the uu ag system. The part of Helium we consider is given in Fig. 1, where we have concentrated on expressions and patterns¹. In addition, we have alternatives (for case expressions), explicit types and where-clauses. The Helium language incorporates a few more constructs such as data type declarations, type synonyms, list comprehensions, monadic do-notation, guarded function definitions and a module system. For the sake of brevity we omit these. Note that Helium uses the same layout rule as Haskell, which implies that the semicolons between multiple declarations in a let expression or where clause are not always necessary. In general, our approach is to label a subexpression (node in the abstract syntax tree) with a fresh type variable (usually called β in the type rules) and
¹ For a non-terminal X, we abbreviate a comma-separated, possibly empty sequence of X's (that is, empty | (X ,)* X) by an overlined X. Similarly, Xs is equivalent to a semicolon-separated sequence of X's: Xs = empty | (X ;)* X. Also, X? indicates optionality of X.
to generate constraints for the restrictions to be imposed at this node. Often the type for an expression is simply a type variable (during the solving process we discover exactly what type it represents), although in some cases it is advantageous to return a type expression containing a sequence of new variables. Such a sequence is denoted β1, ..., βn (see for instance the [FB] type rule in Fig. 4). The judgements in the type rules for expressions are of the form

    M, A, C ⊢ e : τ

where the turnstile carries the subscript BU and a superscript (e for expressions, p for patterns, and so on) in the figures, to distinguish the judgements for the various syntactic categories.
Here C is a set of constraints, e is the expression, τ is the type of e, M is the set of monomorphic type variables and A is the so-called assumption set. An assumption set collects the type variables that are assigned to the free variables of e. Contrary to the standard type environment used in the Hindley-Milner inference rules, there can be multiple (different) assumptions for a given variable. In fact, for every occurrence of a free variable there will be a different pair in A. As can be seen from the expressions for A and C in the rules, there is implicitly a flow of information from the bottom up. The only piece of information that is passed downwards is the set of monomorphic variables M. An implicit instance constraint depends on the context of the declaration. Every node in the abstract syntax tree has a set of monomorphic type variables M. For an arbitrary subexpression, the set M contains exactly the type variables that were introduced by lambda abstractions and other monomorphic binding constructs at a higher level in the abstract syntax tree. Note that the rules allow for flexibility in coping with unbound identifiers. At any point, those identifiers for which we still have assumptions in our assumption set are unbound in the expression. Also, every assumption corresponds with a unique use of this identifier within the expression. We shall now consider some of the type rules in detail. The type rule [Lit] expresses that for every literal (such as 1, False, 'a', ...) we generate the constraint β ≡ τ, where τ is the fixed type of the literal (such as Int, Bool, Char, ...). For a variable we simply generate a fresh type variable, while for constructors we have to instantiate the type scheme of that constructor. We assume that the type of the constructor is already known. The rule for application is a basic one. If we have types τ and τ1, ..., τn for the function and the list of n arguments respectively, then we have to impose the constraint that τ is in fact a function type taking arguments τ1, ..., τn, giving us a result of type β (β is as always fresh). The reason we have chosen n-ary instead of binary application is to stay as close as possible to the source code. A lambda abstraction abstracts over a number of patterns. These patterns contain variables to which the variables in the body of the lambda can be bound. Similar to assumption sets, each occurrence of such a pattern variable is paired with a unique type variable in B. These type variables are then passed along in M, so that the body of the abstraction is informed about which type variables are definitely monomorphic.
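To illustrate how these rules cooperate, consider the small expression f True (an example of our own, not taken from the paper's figures). The [Lit] rule produces the judgement M, ∅, {β1 ≡ Bool} ⊢ True : β1, the [Var] rule produces M, {f : β2}, ∅ ⊢ f : β2, and the [App] rule combines the two into M, {f : β2}, {β1 ≡ Bool, β2 ≡ β1 → β3} ⊢ f True : β3. The assumption f : β2 is passed upwards until it is matched against a binding for f, for instance by a surrounding lambda abstraction or binding group.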
In the rules below, ⊢ abbreviates the ⊢BU judgement for expressions; premises marked ⊢p, ⊢d and ⊢a are the corresponding judgements for patterns, declarations and alternatives, and ∪i denotes the union over 1 ≤ i ≤ n.

[Lit]
    literal : τ
    --------------------------------------------------------------
    M, ∅, {β ≡ τ} ⊢ literal : β

[Var]
    --------------------------------------------------------------
    M, {x : β}, ∅ ⊢ x : β

[Con]
    C : σ
    --------------------------------------------------------------
    M, ∅, {β ≼ σ} ⊢ C : β

[App]
    M, A, C ⊢ f : τ      M, Ai, Ci ⊢ ai : τi   for 1 ≤ i ≤ n
    --------------------------------------------------------------
    M, A ∪ ∪i Ai, C ∪ ∪i Ci ∪ {τ ≡ τ1 → ... → τn → β} ⊢ f a1 ... an : β

[If]
    M, A1, C1 ⊢ e1 : τ1      M, A2, C2 ⊢ e2 : τ2      M, A3, C3 ⊢ e3 : τ3
    --------------------------------------------------------------
    M, A1 ∪ A2 ∪ A3, C1 ∪ C2 ∪ C3 ∪ {τ1 ≡ Bool, τ2 ≡ β, τ3 ≡ β} ⊢ if e1 then e2 else e3 : β

[Abs]
    Bi, Ci ⊢p pi : τi  for 1 ≤ i ≤ n      M ∪ ran(B), A, C ⊢ e : τ      B = ∪i Bi
    --------------------------------------------------------------
    M, A\dom(B), ∪i Ci ∪ C ∪ (B ≡ A) ∪ {β ≡ τ1 → ... → τn → τ} ⊢ (λ p1 ... pn → e) : β

[Let]
    M, Bi, Ai, Ci ⊢d di  for 1 ≤ i ≤ n      M, A, C ⊢ e : τ
    (A′, C′) = BindingGroupAnalysis(M, explicits, {(∅, A), (B1, A1), ..., (Bn, An)})
    --------------------------------------------------------------
    M, A′, ∪i Ci ∪ C ∪ C′ ∪ {β ≡ τ} ⊢ let explicits; d1; ...; dn in e : β

[Case]
    M, A, C ⊢ e : τ      M, Ai, Ci ⊢a ai : φi, ψi  for 1 ≤ i ≤ n
    --------------------------------------------------------------
    M, A ∪ ∪i Ai, C ∪ ∪i Ci ∪ ({τ, φ1, ..., φn} ≡ {β1}) ∪ ({ψ1, ..., ψn} ≡ {β2}) ⊢ (case e of a1; ...; an) : β2

[Tuple]
    M, Ai, Ci ⊢ ei : τi  for 1 ≤ i ≤ n
    --------------------------------------------------------------
    M, ∪i Ai, ∪i Ci ∪ {β ≡ (τ1, ..., τn)} ⊢ (e1, ..., en) : β

[List]
    M, Ai, Ci ⊢ ei : τi  for 1 ≤ i ≤ n
    --------------------------------------------------------------
    M, ∪i Ai, ∪i Ci ∪ ({τ1, ..., τn} ≡ {β1}) ∪ {β2 ≡ [β1]} ⊢ [e1, ..., en] : β2

[Typed]
    M, A, C ⊢ e : τ
    --------------------------------------------------------------
    M, A, C ∪ {β ≼ σ, τ ≼ σ} ⊢ (e :: σ) : β

Fig. 2. The Bottom-Up type inference rules for expressions
The constraints generated for the lambda abstraction itself specify that the type of each identifier is equal to the type of each of its occurrences in the body (expressed by B ≡ A), and that the resulting type is an appropriate function type. For [Let] we take into account that the types of some definitions are explicitly given. Essentially, a let expression can be used to introduce an abbreviation. For instance, the let expression let i = \x -> x in i i has the same semantics as the unfolded code (\x -> x)(\x -> x). Note that it is essential that the type of both lambda abstractions should have a common form, ∀a. a → a, but that the instances of this type may differ, e.g., b → b for the latter and (b → b) → (b → b) for the former abstraction. It is possible to cope with let expressions without resorting to constraints other than equality constraints, but this has two significant drawbacks:
– The number of constraints increases; in fact, it may explode exponentially, because an independent set of constraints is generated for every occurrence of a definition.
– If a let definition contains an error, it is much more difficult to discover this fact, because of the independence between the various sets of constraints mentioned in the previous point.
Since our focus is the generation of good error messages and efficiency is important, we have chosen to introduce special constraints for the let construct (the implicit instance constraints introduced in Section 2). In Helium, the binding groups are determined in part by the given explicit typings. It is the function BindingGroupAnalysis that generates the necessary implicit and explicit instance constraints based on the sets of assumptions (the As) and the variables introduced by the patterns occurring in the left-hand sides (the Bs). Obviously, the function also needs to know the set of monomorphic type variables at this point. In our formulation, the body of the let is considered to be just another binding group that does not introduce new variables. We explain the working of the function BindingGroupAnalysis by an example. If we have two mutually recursive definitions in a let expression of the form f = ... g ... and g = ... f ..., then without explicit types for f or g, the definitions belong to the same binding group and we generate equality constraints between the type variables for the definitions of f and g and every single use of f and g in these definitions. Consequently, all occurrences of f and g in these definitions are monomorphic. However, if the let expression includes an explicit polymorphic type for f, then f and g are no longer in the same binding group and f may be used polymorphically in g and vice versa. In that case, we generate an explicit instance constraint, based on the explicit type of f, for each use of f in g, and if g is not explicitly typed itself, we generate an implicit instance constraint for every use of g in f. The same applies to uses of f in the body of f. In other words, if an explicit type for f is given, then polymorphic recursion is allowed and we depart from the type system of Hindley-Milner [DM82].
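To make the role of binding groups concrete, the following small Haskell fragment (our own illustration, not taken from the paper) shows the two situations discussed above.

    -- Without signatures, f and g are mutually recursive and form one binding
    -- group, so their uses inside these definitions are treated monomorphically.
    example1 :: (Bool, Bool)
    example1 = let f x = if x then g x else x
                   g y = f (not y)
               in (f True, g False)

    -- With an explicit polymorphic signature for f, f ends up in its own binding
    -- group: uses of f in g give rise to explicit instance constraints based on
    -- the signature, so f can be used at two different types inside g.
    example2 :: (Char, Bool)
    example2 = let f :: a -> a
                   f x = x
                   g y = (f y, f True)
               in g 'c'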
Helium expressions can be explicitly typed, in which case both the type τ of the expression itself and the returned type have to be an instance of the explicitly mentioned type σ. The check that σ is not more general than the type τ is postponed until a later stage. We leave the remaining rules to the reader.

[Lit]p
    literal : τ
    --------------------------------------------------------------
    ∅, {β ≡ τ} ⊢p literal : β

[Var]p
    --------------------------------------------------------------
    {x : β}, ∅ ⊢p x : β

[Con]p
    C : σ      Bi, Ci ⊢p pi : τi  for 1 ≤ i ≤ n
    --------------------------------------------------------------
    ∪i Bi, ∪i Ci ∪ {β1 ≼ σ, β1 ≡ τ1 → ... → τn → β2} ⊢p C p1 ... pn : β2

[Tuple]p
    Bi, Ci ⊢p pi : τi  for 1 ≤ i ≤ n
    --------------------------------------------------------------
    ∪i Bi, ∪i Ci ∪ {β ≡ (τ1, ..., τn)} ⊢p (p1, ..., pn) : β

[List]p
    Bi, Ci ⊢p pi : τi  for 1 ≤ i ≤ n
    --------------------------------------------------------------
    ∪i Bi, ∪i Ci ∪ ({τ1, ..., τn} ≡ {β1}) ∪ {β2 ≡ [β1]} ⊢p [p1, ..., pn] : β2

[As]p
    B, C ⊢p p : τ
    --------------------------------------------------------------
    B ∪ {x : β}, C ∪ {τ ≡ β} ⊢p x@p : β

[Wc]p
    --------------------------------------------------------------
    ∅, ∅ ⊢p _ : β

Fig. 3. The Bottom-Up type inference rules for patterns
We now proceed with patterns in Helium, see Fig. 3. Patterns occur in left-hand sides of function definitions, in lambda abstractions, and in the left-hand sides of case alternatives. The variables introduced in a pattern are, together with their fresh type variable, passed upwards in the set of bindings B. The rules for literals, variables, lists, and tuples are the same as for expressions, except that we do not need to pass a set of monomorphic variables down into the pattern. The constructor rule combines the function application and constructor rules for expressions into one. The as-pattern allows us to bind nontrivial patterns to variables. The corresponding rule simply equates the type of the pattern with the type of the variable using a fresh type variable. Wildcards do not introduce new restrictions, so we only give them a dummy type β. Finally, consider the rules in Fig. 4 for the remaining constructs that we deal with in this paper. The rules for function bindings and declarations are rather subtle. A declaration of a function consists of m function bindings, all starting with the same function identifier, here f. Each function binding consists of the function name, a list of patterns pi (here numbered from 1 to n − 1 to fit better with the [Decl] rule) and a right-hand side. The rule is very similar to that for the lambda abstraction except that we do not return a function type, but construct a type sequence of n types (n − 1 parameters plus the type of the right-hand side). The type sequences for the various function bindings are collected in [Decl].
[Alt]
    B, C1 ⊢p p : τ1      M ∪ ran(B), A, C2 ⊢ e : τ2
    --------------------------------------------------------------
    M, A\dom(B), C1 ∪ C2 ∪ (B ≡ A) ⊢a (p → e) : τ1, τ2

[RHS]
    M, A, C ⊢ e : τ      M, Bi, Ai, Ci ⊢d di  for 1 ≤ i ≤ n
    (A′, C′) = BindingGroupAnalysis(M, explicits, {(∅, A), (B1, A1), ..., (Bn, An)})
    --------------------------------------------------------------
    M, A′, C ∪ ∪i Ci ∪ C′ ⊢rhs (e where explicits; d1; ...; dn) : τ

[FB]
    Bi, Ci ⊢p pi : τi  for 1 ≤ i ≤ n − 1      M ∪ ran(B), A, C ⊢rhs rhs : τn      B = ∪i Bi
    --------------------------------------------------------------
    M, A\dom(B), ∪i Ci ∪ C ∪ (B ≡ A) ⊢fb (f p1 ... pn−1 = rhs) : τ1, ..., τn

[Decl]
    M, Ai, Ci ⊢fb fbi : τ1,i, ..., τn,i  for 1 ≤ i ≤ m
    --------------------------------------------------------------
    M, {f : β1 → ... → βn}, ∪i Ai, ∪i Ci ∪ ∪j {βj ≡ τj,i | 1 ≤ i ≤ m} ⊢d fb1; ...; fbm
    (where f is the single function being declared by the function bindings fbi)

Fig. 4. The remaining Bottom-Up type inference rules
Now we can see the reason why we did not immediately construct a function type: the types of the first pattern in each of the function bindings have to be the same, and similarly for the other patterns and the right-hand sides. Of course, this could have been done by equating the function types, but our way seems to us more intuitive and more amenable to generating good type error messages.
Example 2. A small example is given in Figure 5, showing the constraints generated for the expression λ f x → f (f x). If we assume the set of monomorphic variables of the root to be empty, then the subexpressions of the body of the lambda all have {τ0, τ1} as their set of monomorphic type variables. Note that below every node in the tree we have indicated the fresh type variable introduced at that point.
4 The type inference process
The type inference process consists of three different phases. The first of these generates constraints in the nodes of the abstract syntax tree (using the rules of the previous section), yielding a Rose tree² where each node is labelled with a collection of constraints. The second phase consists of traversing the abstract syntax tree and ordering the constraints found in the tree. Our implementation contains an infrastructure for building traversals, and the most important ones are predefined. The final phase solves the flattened constraint tree. The solver may be a greedy solver which continually updates a substitution, or a more global solver which builds elaborate data structures such as type graphs in order to be able to decide in a more global fashion which constraints are likely to be responsible for the inconsistency. In both cases, the outcome includes a list of unsatisfiable constraints, each with a type error message which explains to the programmer what the problem is.
² A tree data structure where each node has an arbitrary number of children.
[Figure: the abstract syntax tree of λ f x → f (f x); below every node the fresh type variable introduced there is shown (τ0 and τ1 for the lambda-bound f and x, τ2, τ3 and τ4 for the occurrences of f, f and x, τ5 and τ6 for the inner and outer application, τ7 for the abstraction), and the constraints τ7 ≡ τ0 → τ1 → τ6, τ0 ≡ τ2, τ0 ≡ τ3, τ1 ≡ τ4, τ2 ≡ τ5 → τ6 and τ3 ≡ τ4 → τ5 are attached to the nodes that generated them.]
Fig. 5. Constraints for the expression λ f x → f (f x)
4.1 Collecting the constraints using uu ag
In this section we explain how the type inference rules of the previous section can be implemented easily in the uu ag system. We assume that the reader is familiar with attribute grammars in general. For more information on the uu ag system, consult [SBL]. The main focus of this section is not to show how it works exactly, but only to give the reader a flavour of what it looks like and to highlight the correspondence between the type rules and the code. The advantage of using attribute grammars at this point is that we only need to formulate how values depend on each other, and the attribute grammar system automatically determines how these values can actually be computed. All fragments of code are shown in Fig. 6 (slightly modified from the actual code). First we define the attributes of the non-terminals in the abstract syntax tree, found in the ATTR section. Besides the top-down (inherited) and the bottom-up (synthesized) aspects, there are chained attributes that are passed along in both directions. Note that all elements in a judgement for an expression turn up as attributes, with M an i-attribute and the remaining ones (A, C and the type) s-attributes. Additionally, the chained attribute unique provides a counter to generate fresh type variables. Instead of using a list to collect the type constraints, a Rose tree that follows the shape of the abstract syntax tree is constructed, in which the nodes are decorated with any number of constraints. In this way we retain some flexibility in the order in which the constraints will be considered later.
    ATTR Expr [ mono : Types                                              (inherited)
              | unique : Int                                              (chained)
              | aset : Assumptions  ctree : ConstraintTree  beta : Type ] (synthesized)

    SEM Expr
      | If      lhs   . aset   = @guard.aset ++ @then.aset ++ @else.aset          (1)
                      . ctree  = Node [ [@guard.beta ≡ boolType] `add` @guard.ctree
                                      , [@then.beta  ≡ @beta]    `add` @then.ctree
                                      , [@else.beta  ≡ @beta]    `add` @else.ctree ] (2)
                guard . unique = @lhs.unique + 1                                  (3)
                loc   . beta   = TVar @lhs.unique                                 (4)

      | Lambda  lhs   . aset   = removeKeys (map fst @pats.bset) @expr.aset
                      . ctree  = [ beta ≡ foldr (→) @expr.beta @pats.betas ] `add`
                                 Node [ @pats.ctree, @binds `spread` @expr.ctree ]
                pats  . unique = @lhs.unique + 1
                expr  . mono   = map snd @pats.bset ++ @lhs.mono
                loc   . beta   = TVar @lhs.unique
                      . binds  = [ τ1 ≡ τ2 | (x1,τ1) ← @pats.bset
                                           , (x2,τ2) ← @expr.aset, x1 == x2 ]

Fig. 6. Code fragments of the attribute grammar
Hence, this part of the code only generates the constraints, while fixing the order and actually solving the constraints is considered later. Due to space restrictions we limit ourselves to giving the semantic functions for the conditional expression and the lambda abstraction. The three sub-expressions of a conditional are referred to as guard, then and else. Consider the attribute unique. We use it to generate a new variable in the local attribute beta. Note that beta is also the name of a synthesized attribute of the if-node. The reason we introduce it as a local attribute is that we also need it for ctree. The value of unique is threaded through the tree, incremented along the way every time we introduce a new type variable. The semantic rules are given in the SEM section of Fig. 6. Here, the syntax for referring to an attribute is @child.attribute, where lhs and loc are special keywords to refer to inherited attributes of the parent node and attributes that are defined locally, respectively. Obviously, the subtrees for the condition, the then-part and the else-part will contain the constraints generated for those parts. However, the if construct has some constraints of its own: the constraint that the result of the condition is of type boolean, the constraint that the result of the then-part is equal to the result of the conditional as a whole, and similarly for the else-part.
[Figure: the flow of the attributes between the If node and its guard, then and else children (m = mono, u = unique, a = aset, c = ctree, b = beta); the dashed edges labelled (1)-(4) correspond to the numbered semantic equations in Fig. 6, and the dotted edges represent attributes that are passed on unchanged.]
Fig. 7. Dependencies between the attributes for the conditional
It would be quite natural to put these constraints into a set of constraints attached to the if-node. However, to retain as much flexibility as possible, we have chosen differently. In addition to being able to add a constraint to the set of constraints of the if-node, it is also possible to add a constraint to one of the subtrees, and this is what happens here. The three type constraints for a conditional are added to the constraint trees of their corresponding subexpression with the function add. We consider the uses of this facility in more detail in the next section. The dependencies between the various attributes for the conditional are made more explicit in Fig. 7. The number following a semantic equation refers to the correspondingly numbered dashed lines in Fig. 7; the dotted edges represent the passing of unmodified attributes. For instance, mono is passed on unchanged to all three children. We do not have to write the code for passing these attributes ourselves. Instead, the compiler generates these copy rules automatically. For lambda abstractions, the assumptions concerning the bound variables are removed from the assumption set, and the type variables that are introduced in the patterns are inserted into the set of monomorphic type variables that is passed to the body. A constraint is constructed for each matching combination of a tuple from the assumption set and from the binding set. This set of constraints, which is the local attribute binds, is added to the constraint tree with the function spread.
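The following is a minimal sketch of a constraint tree offering the add and spread operations used in Fig. 6. The representation and names are our own choice (reusing the Constraint type from the earlier sketches) and merely illustrate the idea; they do not reproduce the Helium data structures.

    -- A Rose tree of constraint collections. Constraints can live at a node, be
    -- added in front of a particular subtree (add), or be destined for specific
    -- variable occurrences (spread). Occurrences are assumed to carry a unique
    -- number here.
    type Occ = Int

    data ConstraintTree
      = CNode   [Constraint] [ConstraintTree]        -- node constraints and children
      | CAdd    [Constraint] ConstraintTree          -- constraints added to a subtree
      | CSpread [(Occ, Constraint)] ConstraintTree   -- constraints spread onto occurrences
      | CLeaf   Occ                                  -- a variable occurrence

    add :: [Constraint] -> ConstraintTree -> ConstraintTree
    add = CAdd

    spread :: [(Occ, Constraint)] -> ConstraintTree -> ConstraintTree
    spread = CSpread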
4.2 Flattening the constraint tree
After the generation phase we have obtained a Rose tree in which a node contains a set of constraints, and a list of children. Each child is a subtree containing its own constraints as well as a separate collection of constraints put there by its parent. The location where most type inferencers detect an inconsistency for an ill-typed expression strongly depends on the order in which types are unified.
[Figure: the constraint tree of Fig. 5 after the constraints τ0 ≡ τ2, τ0 ≡ τ3 and τ1 ≡ τ4 have been spread from the abstraction node to the corresponding variable occurrences. The resulting constraint orders for three treewalks are:]

    Algorithm W : τ0 ≡ τ2, τ0 ≡ τ3, τ1 ≡ τ4, τ3 ≡ τ4 → τ5, τ2 ≡ τ5 → τ6, τ7 ≡ τ0 → τ1 → τ6
    Algorithm M : τ7 ≡ τ0 → τ1 → τ6, τ2 ≡ τ5 → τ6, τ0 ≡ τ2, τ3 ≡ τ4 → τ5, τ0 ≡ τ3, τ1 ≡ τ4
    Bottom-up   : τ3 ≡ τ4 → τ5, τ2 ≡ τ5 → τ6, τ0 ≡ τ2, τ0 ≡ τ3, τ1 ≡ τ4, τ7 ≡ τ0 → τ1 → τ6

Fig. 8. Spreading the constraints for λ f x → f (f x)
By specifying how to flatten the constraint tree we can imitate several type inference algorithms, each with its own properties and characteristics. We flatten a constraint tree by defining a treewalk over it that puts the constraints in a certain order. Besides the standard preorder and postorder treewalks, one can think of more experimental ones such as a right-to-left treewalk. In our current implementation we use the same treewalking strategy in each node of a constraint tree, in the sense that we cannot specify a treewalk going left-to-right in application nodes and right-to-left in the case of a list node; it is a straightforward extension to use different strategies depending on the non-terminal in the abstract syntax tree. A treewalk takes the various collections of constraints available in a node: the constraints of the node itself, the collections collected from the subtrees and the collections of constraints added (by means of the function add, see Fig. 6) to each of the subtrees, and orders these collections in some fashion, usually without changing the order within the collections themselves. Special care is taken for inserting a constraint that corresponds to the binding of a variable to a pattern variable; these constraints are inserted with the function spread. Instead of adding the constraint to the node where the actual binding takes place, a constraint may be mapped onto the occurrence of the bound variable. Fig. 8 shows the spreading of three constraints. The constraint orders of three treewalks are shown as well: a postorder treewalk with spreading (W), a preorder treewalk with spreading (M), and a postorder treewalk without spreading. The main motivation for spreading is that the way the standard algorithms W and M solve their constraints corresponds to spreading. To be able to mimic these algorithms it has been included. Finally, we want to point out that any treewalk for ordering the constraints should adhere to the principle that it only generates lists of constraints in which condition (1) is satisfied whenever we encounter an implicit instance constraint. This is guaranteed by solving the constraints arising from the definitions in a let before continuing with the constraints arising from the body. An implicit instance constraint can only depend on definitions higher up in the tree. Remember that we consider an entire binding group at a time and that polymorphic recursion is only allowed when explicit types are given.
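Continuing the sketch above, one possible treewalk is a post-order flattening with spreading; on the example of Fig. 8 this yields an ordering similar to the one labelled W. The code below is again our own illustration, not the Helium implementation.

    -- A post-order flattening: spread constraints are emitted at the occurrence
    -- they belong to, node constraints after their children, and added
    -- constraints after their subtree. The environment carries the constraints
    -- that were spread onto variable occurrences.
    flatten :: [(Occ, Constraint)] -> ConstraintTree -> [Constraint]
    flatten env (CNode cs ts)   = concatMap (flatten env) ts ++ cs
    flatten env (CAdd cs t)     = flatten env t ++ cs
    flatten env (CSpread scs t) = flatten (scs ++ env) t
    flatten env (CLeaf occ)     = [ c | (o, c) <- env, o == occ ]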
4.3 Solving the constraints
In this section we consider a number of approaches to solving the collected constraints. We describe the general characteristics of a constraint solver by listing all the operations that it should be able to perform. We continue by explaining how various greedy solving methods can be implemented within this framework. These greedy algorithms can be tuned quite easily by specifying a small number of parameters. Finally, we spend some time on discussing how the constraints can be solved in a more global way using type graphs, which enables us to remove the left-to-right bias inherent in tree-based online algorithms such as W and M. Note that “solving” in this context means that the constraint is taken from the list of constraints and dealt with. Solving does not imply that the constraint is consistent with the information we have obtained so far. Directly, or at some later point, we may decide that the constraint is a source of an inconsistency and generate an appropriate error message for it.
A type class for solving constraints. To pave the way for multiple implementations of a solver, we present a Haskell type class. Since it is convenient to maintain a state while solving the constraints, we have implemented this class using a State monad that provides a counter for generating unique type variables, a list of reported inconsistencies, and a substitution or something equivalent. A type solver can solve a set of constraints, in which each constraint carries additional info, if it is an instance of the following class.

    class Solver solver info where
      initialize      :: State solver info ()
      makeConsistent  :: State solver info ()
      unifyTypes      :: info → Type → Type → State solver info ()
      newVariables    :: [Int] → State solver info ()
      findSubstForVar :: Int → State solver info Type
By default, the functions initialize, makeConsistent, and newVariables do nothing, that is, leave the state unchanged. If unifyTypes is called with two non-unifiable types, it can either deal with the inconsistency immediately, or postpone it and leave it to makeConsistent. The function applySubst, which performs substitution on types, is defined in terms of a more primitive function called findSubstForVar. A function solve can now be defined which takes an initial value for the type variable counter and a collection of constraints, where each constraint is labeled with info, and results in a State which describes the result of solving the collection of constraints. After initialization, the constraints are solved one after another, resulting in a possibly inconsistent state.
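A minimal, self-contained sketch of such a solve function, together with a greedy way of dealing with the individual constraints (described in prose below), could look as follows. It reuses Type, Constraint, Subst, applySubst, ftv and generalize from the earlier sketches; the state representation and all names are our own simplifications, and the occurs check is omitted.

    import Control.Monad (foldM)
    import Control.Monad.State (State, execState, gets, modify)
    import qualified Data.Map as Map
    import qualified Data.Set as Set

    data SolveState = SolveState
      { counter :: Int       -- fresh type variable supply
      , errors  :: [String]  -- reported inconsistencies
      , subst   :: Subst     -- substitution built so far (greedy solver)
      }

    freshVar :: State SolveState Type
    freshVar = do n <- gets counter
                  modify (\st -> st { counter = n + 1 })
                  return (TVar n)

    -- Most general unifier of two types (standard; occurs check omitted).
    mgu :: Type -> Type -> Maybe Subst
    mgu (TVar a) t = Just (Map.singleton a t)
    mgu t (TVar a) = Just (Map.singleton a t)
    mgu (TCon c ts) (TCon d us)
      | c == d && length ts == length us = foldM step Map.empty (zip ts us)
      where step s (t, u) = do s' <- mgu (applySubst s t) (applySubst s u)
                               return (Map.map (applySubst s') s `Map.union` s')
    mgu _ _ = Nothing

    -- Greedy handling of a single constraint.
    solveOne :: Constraint -> State SolveState ()
    solveOne (Equal t1 t2) = do
      s <- gets subst
      case mgu (applySubst s t1) (applySubst s t2) of
        Just s' -> modify (\st -> st { subst = Map.map (applySubst s') (subst st) `Map.union` s' })
        Nothing -> modify (\st -> st { errors = "cannot unify two types" : errors st })
    solveOne (Explicit t (Forall as body)) = do
      fresh <- mapM (const freshVar) as
      solveOne (Equal t (applySubst (Map.fromList (zip as fresh)) body))
    solveOne (Implicit t1 m t2) = do
      s <- gets subst
      let m' = Set.unions [ ftv (applySubst s (TVar a)) | a <- Set.toList m ]
      solveOne (Explicit t1 (generalize m' (applySubst s t2)))

    solve :: Int -> [Constraint] -> SolveState
    solve unique cs = execState (mapM_ solveOne cs) (SolveState unique [] Map.empty)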
Calling makeConsistent will remove possible inconsistencies and, as a side effect, add error messages to the state. We consider now how the three types of constraints may be solved. An equality constraint is solved by unification of the two types. The type scheme of an explicit instance constraint is instantiated, and the state is informed about the fresh type variables that are introduced. An implicit instance constraint is solved by first making the state consistent, and subsequently applying the substitution to the type and the monomorphic type variables. Only then may we solve the constraint by transforming it into an explicit instance constraint and solving it. Note that due to the lazy semantics of Haskell the list of constraints generated by a treewalk is only constructed insofar as we actually solve the constraints. Whenever an error is encountered with the kind of solver that terminates once it has seen a type error, the other constraints are not even computed. Needless to say, laziness imposes its own penalties.
Greedy constraint solving. The most obvious instance of the type class Solver is a substitution. The implementation of unifyTypes then simply returns the most general unifier of two types. The result of this unification is incorporated into the substitution. When two types cannot be unified, we immediately deal with the inconsistency. As a result, makeConsistent can be the default skip function, because unifyTypes always results in a consistent state: if adding a constraint results in an inconsistent state, then it is ignored instead, although an appropriate error message is generated (and added to the state). After the discovery of an error, we can choose to continue solving the remaining constraints, which can lead to the detection of more type errors. For efficiency reasons, we represent a substitution by a mutable array in a strict state thread, also because the domain of the substitution is dense. Instead of maintaining an idempotent substitution, we compute the fixpoint in case of a type variable lookup.
Constraint solving with type graphs. Because we are aiming for an unbiased method to solve type constraints, we discuss an implementation which is based on the construction of a type graph inspired by the path graphs described in [Por88]. The type graph allows us to perform a global analysis of a set of constraints, which makes type inferencing a non-local operation. Because of its generality, adding heuristics for determining the “correct” type error is much easier. It is important to note that although the constraint solver based on type graphs does consider the constraints in the order that the treewalk generated them, this does not mean that an early constraint has a smaller chance of being held responsible for an inconsistency. This is completely determined by a number of heuristics which work on the type graph. Each vertex in the type graph corresponds to a subterm of a type in the constraint set. A composed type has an outgoing edge labelled with (i) to the vertex that represents the ith subterm. For instance, a vertex that represents a function type has two outgoing edges.
All occurrences of a type variable in the constraint set share the same vertex. Furthermore, we add undirected edges labelled with information about why two (sub)terms are unified. For each equality constraint, an edge is added between the vertices that correspond to the types in the constraint. Equivalence of two composed types propagates to equality of the subterms. As a result, we add derived (or implied) edges between the subterms in pairwise fashion. For example, the constraint τ1 → τ1 ≡ Bool → τ2 enforces an equality between τ1 and Bool, and between τ1 and τ2. Therefore, we add a derived edge between the vertex of τ1 and the vertex of Bool, and similarly for τ1 and τ2. For each derived edge we can trace the constraints responsible for its inclusion. Note that adding an edge can result in the connection of two equivalence classes, and this might lead to the insertion of more derived edges. Whenever a connected component of the type graph contains two or more different type constructors, we have encountered a type error. Infinite types can also be detected, but we skip the details.
Example 3. Consider the following ill-typed program.

    f 0 y = y
    f x y = if y then x else f (x - 1) y
The set of type constraints that is collected for this program is as follows.

    #1  τ0 ≡ τ2 → τ3 → τ1      #2  Int ≡ τ4                #3  τ4 ≡ τ2
    #4  τ5 ≡ τ3                #5  τ5 ≡ τ6                 #6  τ6 ≡ τ1
    #7  τ7 ≡ τ2                #8  τ8 ≡ τ3                 #9  τ8 ≡ τ10
    #10 τ10 ≡ Bool             #11 τ7 ≡ τ11                #12 τ11 ≡ τ9
    #13 τ0 ≡ τ13               #14 τ17 ≼ Int → Int → Int   #15 τ7 ≡ τ16
    #16 Int ≡ τ18              #17 τ17 ≡ τ16 → τ18 → τ15   #18 τ15 ≡ τ14
    #19 τ8 ≡ τ19               #20 τ13 ≡ τ14 → τ19 → τ12   #21 τ12 ≡ τ9
    #22 τ9 ≡ τ1
Fig. 9 depicts the type graph for this set. The shaded area indicates a number of type variables that are supposed to have the same type. The graph is clearly inconsistent, because of the presence of both Int and Bool. Applying the heuristic of Walz and Johnson [WJ86], which measures the proportion of each constant, would result in cutting off the boolean. Cutting off the boolean removes the inconsistency, but even then there are various constraints which can be removed to obtain this effect. If #10 is chosen (this amounts to saying that the type of y is not a boolean), then the following type error message is generated.

    (2,12): Type error in conditional
       expression     : if y then x else f (x - 1) y
       term           : y
       type           : Int
       does not match : Bool
[Figure: the type graph for constraints #1-#22. Vertices represent the type variables, the constants Int and Bool, and the function type constructors; undirected edges are labelled with the constraint numbers that caused them, and the edges labelled (1) and (2) point to the first and second subterm of a composed type. The shaded equivalence class of type variables contains both Int and Bool, which makes the graph inconsistent.]
Fig. 9. TypeGraph
If instead #12 is chosen, which corresponds to the fact that the type of the then-branch of a conditional should have the same type as the conditional as a whole, then the following message results:

    (2,19): Type error in conditional
       expression     : if y then x else f (x - 1) y
       term           : x
       type           : Int
       does not match : Bool
Constraints have been implemented in our compiler in such a way that each constraint carries all the information it needs, e.g., position information, to generate an appropriate error message. We refer the reader to the Helium website [ILH] for a list of example programs with the output as generated by the compiler. The type error messages were all generated using the type graph data structure of this section, guided by a collection of heuristics. After an inconsistency is detected in the type graph, a large repertoire of heuristics is applied, each heuristic resulting in two things: a list of constraints contributing to an inconsistency and a score proportional to the amount of trust the heuristic has in the result. These values are compared for the various heuristics and the 'best' one is chosen (a sketch of this selection step follows the list of heuristics below). The impact of a heuristic can be changed by modifying the way the score is computed. Heuristics fall into two natural classes: the ones which look for special mistakes programmers often make (the first two listed next), and a number of more general ones (the others). The scoring in Helium is set up such that the latter are used as tie-breakers in case none of the former return anything useful.
– If the permutation of arguments to a function resolves a type inconsistency, and no other permutation does, then we assume that this is the problem and generate an error message and a hint for improving the program. This is in fact one example of a host of heuristics which result in easy fixes: forgotten arguments, permuted elements in a pair, too many arguments, etcetera.
– If an integer is used where a float was expected, or vice versa, then this is a likely source of an error. Since we do not have overloading for numerals as is the case in Haskell, this is an error in Helium. Experienced Haskell programmers are hence likely to make this type of mistake, and we have included a special heuristic to give a hint.
– The proportion of constants in an inconsistent part of the type graph determines which constant is considered to be the error (see the example earlier on).
– Constraints which are part of a cycle are usually not chosen, because they cannot by themselves resolve an inconsistency.
– Definitions from the Prelude are to be trusted more than definitions in the program itself.
– The order in which the constraints were added to the type graph: the later a constraint was added, the more likely it is to be considered a source of inconsistencies.
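The selection among heuristics sketched above (each heuristic reports the constraints it blames plus a score, and the highest score wins, with general heuristics acting as tie-breakers through low scores) could be pictured as follows. The types and names are our own, and the actual Helium machinery is considerably richer.

    import Data.List (maximumBy)
    import Data.Ord  (comparing)

    -- A heuristic inspects (some representation of) the inconsistency and may
    -- report a list of constraints it blames together with a confidence score.
    type Heuristic problem = problem -> Maybe ([Constraint], Int)

    -- Apply all heuristics and pick the highest-scoring result, if any.
    selectBlame :: [Heuristic problem] -> problem -> Maybe ([Constraint], Int)
    selectBlame heuristics p =
      case [ r | h <- heuristics, Just r <- [h p] ] of
        []      -> Nothing
        results -> Just (maximumBy (comparing snd) results)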
5 Type inference directives
In the previous sections we have already shown that an infrastructure is present to be able to experiment with various type inference processes within the Helium compiler. Some of these can be selected by setting the appropriate parameters of Helium; others require small modifications to the compiler. To improve the quality of type error messages in functional programming languages, we proposed four techniques in [HHS03] to further influence the behaviour of the constraint based type inference process. These techniques take the form of externally supplied type inference directives, precluding the need to make any changes to the compiler. A second advantage is that the directives are automatically checked for soundness with respect to the underlying type system. These techniques can be used to improve the type error messages reported for a combinator library. More specifically, they can help to generate error messages which are conceptually closer to the domain for which the library was developed. In [HHS03] we show how this can be obtained within Helium. The main reason that this can be done at all is the separation between the generation, the ordering and the solving of constraints. An additional necessity is that it should be possible for type error messages to be overridden by messages supplied by the user of the compiler. We conclude this section with a necessarily limited example to give a flavour of how this facility can be used.
Example 4. Novice programmers often have difficulty learning to use higher-order functions such as map and filter. User-defined type rules can be used to isolate expressions which use these functions and generate specific error messages for them. In our setup this can be formulated in a .type file which accompanies the usual .hs source file. We now give a much simplified type rule for expressions in which map is applied to two parameters (which can be any expression).
    f  :: t1 ;
    xs :: t2 ;
    --------------------------------
    map f xs :: [b] ;;

    t1 == t3 -> t4 : The first argument to map should be a function
    t4 == b        : The result type of the function should match the
                     elements of the resulting list
    t2 == [t3]     : The second argument to map should be a list of @t3@s

By means of the notation @t3@, the programmer can refer to the type to which the type variable t3 maps during execution. Similar expressions can be used to refer to range information for expressions occurring in the type rule, and so on. The main thing to note is that the list of constraints implies a preference from the side of the programmer about the order in which they should be checked, in the sense that the error message for t4 == b can only be given if t1 == t3 -> t4 was satisfied. Because of this, we can use the fact that t1 == t3 -> t4 is known to hold when we consider t4 == b, so that referring to 'the function' in the error message is guaranteed to make sense. This allows for much more concrete error messages. The impact of this is even more strongly felt when type rules are given for expressions which include complex user-defined data types.
6 Conclusion and future work
In this paper we have described the Helium compiler, which deals with a large subset of Haskell and can, by virtue of a constraint based type inference process, generate understandable type error messages. The type inference process consists of three distinct phases: 1. the generation of constraints, 2. the ordering of constraints and, finally, 3. the solving of constraints. Extra flexibility is obtained by means of various tree walking algorithms and the even more powerful feature of type inference directives, which allow a programmer to tune the type inference process to his liking. All of this has been made possible by virtue of a constraint based type inference process, and the way it has been implemented. We are currently working on extending our type inferencer with type classes and turning the solver into a general implementation for solving constraints. One of the major challenges is to extend the type inference directives into an easy to use programming language with which programmers can tailor a compiler to their own needs. On the one hand, we expect to include directives with which one can specify the tree walk to be performed and which heuristics should be applied. On the other hand, the addition of type classes shall give rise to a new set of directives, e.g., specifying that two type classes have to be disjoint. This can help to give good error messages in case of ambiguities that may arise.
References

[Chi01]  Olaf Chitil. Compositional explanation of types and algorithmic debugging of type errors. In Proceedings of the Sixth ACM SIGPLAN International Conference on Functional Programming (ICFP'01), pages 193–204, September 2001.
[DM82]   L. Damas and R. Milner. Principal type schemes for functional programs. In Principles of Programming Languages (POPL '82), pages 207–212, 1982.
[HHS02]  Bastiaan Heeren, Jurriaan Hage, and S. Doaitse Swierstra. Generalizing Hindley-Milner type inference algorithms. Technical Report UU-CS-2002-031, Institute of Information and Computing Science, University Utrecht, Netherlands, July 2002.
[HHS03]  Bastiaan Heeren, Jurriaan Hage, and S. Doaitse Swierstra. Scripting the type inference process. In International Conference on Functional Programming '03, 2003. To appear.
[HW03]   Christian Haack and J. B. Wells. Type error slicing in implicitly typed higher-order languages. In Proceedings of the 12th European Symposium on Programming, pages 284–301, April 2003.
[ILH]    Arjan IJzendoorn, Daan Leijen, and Bastiaan Heeren. The Helium compiler. http://www.cs.uu.nl/helium.
[LY98]   Oukseh Lee and Kwangkeun Yi. Proofs about a folklore let-polymorphic type inference algorithm. ACM Transactions on Programming Languages and Systems, 20(4):707–723, July 1998.
[McA00]  Bruce J. McAdam. Generalising techniques for type debugging. In Phil Trinder, Greg Michaelson, and Hans-Wolfgang Loidl, editors, Trends in Functional Programming, pages 49–57. Intellect Books, March 2000.
[Pie02]  Benjamin C. Pierce. Types and Programming Languages. MIT Press, Cambridge, MA, 2002.
[Por88]  Graeme S. Port. A simple approach to finding the cause of non-unifiability. In Robert A. Kowalski and Kenneth A. Bowen, editors, Proceedings of the Fifth International Conference and Symposium on Logic Programming, pages 651–665, Seattle, 1988. The MIT Press.
[SBL]    S. Doaitse Swierstra, Arthur I. Baars, and Andres Loeh. The UU-AG attribute grammar system. http://www.cs.uu.nl/people/arthurb/ag.html.
[Sul]    Martin Sulzmann. The Chameleon system. http://www.comp.nus.edu.sg/~sulzmann/chameleon.
[WJ86]   J. A. Walz and G. F. Johnson. A maximum flow approach to anomaly isolation in unification-based incremental type inference. In Conference Record of the 13th Annual ACM Symposium on Principles of Programming Languages, pages 44–57, St. Petersburg, FL, January 1986.
[YMT00]  J. Yang, G. Michaelson, and P. Trinder. Helping students understand polymorphic type errors. 2000.
!#" $&%' )(+*,-. /0 132547698:;' < :=/0-%->?@-. ACBD E%'(GF -%H BD 8:-$I*%%JK -MLN-2&%@O
P QSR T+UVQWYXGZ[ \@]_^_`S\Ka.b+c1`SUVQG\@Rd]_e ~ x'pnolyzx'f: htlgihhkgrsjmlox)nqf:nypjogirrslhtzwlvg@utVwVly jmx)lyn1x'x'|VzuVnq
)jprx)jo{+hp@
'p@|}x {gp@nqrshhp.
)lygignosh w hrshx'x)nyrshk Y@yV@YY V¡@¢t£¥¤¦i§V¨©¤iª oj«Eº¦xfp¬¥uVny'Ä:
_®'µ¹¿-¯°zƱǮiÈx_ɼ²´lyʵ³ ËgYh»xlyµkº¼
'rjgi½1zp¾v}p@¿-x)nynxEÀ¶g@· nlyµxjoxgx_¸Vs}Vlyx)rsx)nohkhÀÁg@nyzEpÂlyµkpx f:hx'gi
)zhkx.jmkglonqrsÌnyprslyrs
htµp@l&rsYjÍjmuVlyzp@lyx)VrslywjmµÃp@gYg@
)Slyp rslygizKgJhÅklyĵlypnyrsgl´p@|igsx'x' htz p l szgt
x'p@ht l jog@x1p@nyÑ1
qµÂnqpikº¼{©ukp@ly¿xpÐ+hxp@
_p5µkjmrshwkJjmlyx)¾1zEjojop@rslyjmrslq
Îp@hYº¼Ï lyj-к¦¿ÑvjoЩx¾vp@ny¿
qµlygÎÀg@p@nK
pijogVx'Yzrshrs
ÎlqlypµkjoxEYj'pÊ¥jojoÐ rsiµkhVrj ny
'xgp@x_¦n¥·:jmglonynyx'Ehp@klyµjsrsp@
p@hly rsgijohGµ¶glynorslyi
)µtgil:zprshhÍj©ggdlykx)lyhµrsgj¥zKx_noks
)lygirhpjmlox'nqp@htrsl©hjoxxSp¶tnyp
_µGssg¶··j:µrsk
_jÒµ ly· g gVrjm hkg½ lӾŵp
'sxxp@|}nyx'w0x'hgiV}lyg}jojox)rsno|Àg@sx1nyzg@lyjµx)|}nog·lyµrsjoxi{©ÊtuÕËp@xhjo0µgÏ ·ÔÐ ly¶¥µp@p@jlrÀlg@n ÖjohgVj pp-|ksjoxig¶lyVrslyirsµtgil hf·ukµkÄx'j'h ¶ lylyµkµkx-xvzKg@lykµsx)lynrpl·:x'gJhtllyx)jo
_xµkphny
_r×tµ5zx'j x_lyõp@rgYÃEÊÒØ kx'g·:x'x'hx_x)nqnp@¶©lyx'dgjnKrsh-g@ilyx)x)nony¦z
'gijhkg@jm¥lonqjmplqrsph|kxrs&rlVw.nyp@gih|Esx'zlyµj'x ¶ rs}×thg.jop@josprrsl|khw´sÕxÍg@
qp@Gµlyp@µkVnqnyxp@gjo
)pgily
qx)sµknyx)rslyÚ'jrsgxdhgrjÓl n1nyp@µx'jÍp@ph
qp-µkxksrsSxhpÊ´kËpsgtxh
qrÎkjox)gihYµslyYx)rrsÙhhwKgizlyµVx)xxhpijogiVghGkgtÊGny
q
'ÛYxkj'nogÊ lySµklyx)µknr¶}jÒ· joµx´g@knolyrsjo
'
'giz´joj Ü ÝÞÌß}àSáâÍãä+ßSåyáÌÞ c1UVXGæq^_]'WYRdX^ç}WY^_RÀæyè¦Wké^)RÙUkXê`GWkæ´\@ë5\@])ìV\iZCWkæWÕí+Utî \@]_è¼QSï íGWY]'WVZ}RdìVëðè¼Uk]ë5UZS\@ïdRÙXSì&WVXGZ æqUkïÙñRdXSìÁïÀWY])ìV\Cé@UVëòGRÙXGWY^_Uk]_RÀWYïÍíS]_UkòSïd\@ëÎæ@bç}é'`S\iZ}QSïdRÙXGìóWYXZ¹]_\iæqUkQS]'é\CWVïÙïdU}é@Wt^)RÙUkX!WY])\ æqUkë.\ÍUYè^)`S\K\WY])ïdRÙ\iæy^1WVíSíSïdRdéWt^)RÙUkX0WY])\WkæUYè¥^)`SRÀæÌ^)\é'`SXGUVïdUVìVeô õSö_÷YøbùoXJ^)`SRÀæ ZSUé@QSë5\@X^ö î1\vZ}RÀæ_é@QGæ_æ WKæqí+\é@Rú+é1WVíSíSïdRdéWt^_RdUVX-UYèé@UVXGæq^_]'WYRdX^ÓíS])U}é\iæ_æ_RÙXGì ^_U´WÍ]_\iWYïÙûî UV])ïdZKíS])UVòGïÙ\ëb üv`SRdæÌRdæ^_`S\ÍWVæ)æqRdìVXSë5\Xk^UVèÒýK])WkZ}QGWt^)\ üÒ\iWVé'`SRdXSìEþÍæ_æ_Rdæq^)WVX^)æ ÿý´üÓþ ^_U-éUVQG])æ_\æ RdXÎ^_`G\ \íGWY]_^_ë5\@X^ UYè+c1Ukë.íGQ}^_\]çSéRd\@XGé@\1WYX Z XGìVRdXS\@\]_RdXSì UVè}^_`S\ XSRdñV\])æ_RÙ^yeKUVè \òS])Wk æ tWYû RdXGéUkïÙXb©üv`S\.RdZS\WÎè¼Uk] ^)`SRdæKíGWV]q^)Rdé@QSïÀWY]ÍWVíSíSïdRdéWt^)RÙUkX î Wkæè¼UkQSXGZ UkX&^_`G\-î \@ò íWYìV\UYè RÙX W \ié'`^_\@] Wt^1^_`S\ XSRdñV\])æ_R^ye-UYè:c WVïÙRÙè¼UV])XSRÀWSöYùo])ñRÙXS\kbküv`SRÀæ WYíSíSïdRÀé@Wt^)RÙUkXÎRdæÌRÙXJè¦WVé^ W é])R^)RdéWYïÓWVXGZWY]'Z}QSUkQGæ])\æ_íUkXGæqRdòSRdïÙRÙ^yeÕ^_`GWY^ ^_`S\.Z}\@íWY]_^_ë5\@X^ æ WkZ}ë5RÙXSRÀæq^_]'Wt^_RdUVX&`WVæv^)U `GWYXGZSïÙ\\ñV\@])eJæ_\@ë5\æq^_\]b ùoXó^)`SRÀæÎíGWYí+\@]0î1\ é@UVXGZ}Qé^0W éUkë.íWY]'Wt^_RdñV\&æq^_QGZSe UYèK^_`G\&ò\`GWñRÙUk]ÎUYè´ñWV]_RdUVQæ æq\iWY]'é'` WYïdìVUV])RÙ^_`SëÎævè¼Uk]Kæ_UVïdñRÙXSìJ^_`SRÀæíG]_UkòSïÙ\ëb©ç\iWY]'é'` WYïdìVUk]_RÙ^_`SëÎæ è¼UV]´æqUkïÙñRdXSìcvç 1æ WY])\ÍQGæ_QGWVïÙïdeJéïÀWVæ)æqRÙúG\iZJRdXk^)U.^yî1U.ëÎWYRdXÕé@WY^_\ìVUV])Rd\æ kïdU}é@WYï©])\@íGWVRÙ]WVXGZ0æ_e}æy^)\@ëÎWt^)Rdé´WYïdìVUVû ]_RÙ^_`SëÎæb \ié\@X^)ïÙekö RdQ5\^ÌWYïbGô tøSíS]_UkíUæq\iZ-W´é@UVë5í\@^_RÙ^_RdñV\ë-QSï^)RûoWYìk\@X^òGWVæ_\Z5æ_\WV])é'` î`S\@])\ÎWYìV\X^)æ´éUkë.ë-QSXSRÀé@WY^_\ !#"%$'&)(&^)`S]_UkQSìV` WéUkë.ë5UkXê\XñRd]_UkXSë5\@X^+b *ÍXÔ^_`G\
79
qæ QS]_è¦WVé@\VöG^_`SRÀæ^)\é'`SXSR QS\-WVíSí+\WY]'æv^)UJò\.WÎñtWY])RdWY^_RdUVXUYèïdU}é@WVïÒæq\iWY]'é'`bGPUtî1\ñV\@]iö}^)`S\@e Z}R ©\@]æ_ïdRÙìk`k^)ïÙe-WVæ î \vWV]_ìkQS\ RÙXÎç\é^_RdUVX }b VbùoX5UV]'Z}\]Ó^)U´QSXGZ}\])æq^)WVXGZ-WVXGZ.é'`GWY]'WVé^_\@])R @\ ^_`S\Eò+\@`WñRdUV]UVè ^_`SRÀæ ë-QSï^)RûoWYìk\@X^æ_\WY]'é'`ö}î \EéUkë.íWY])\R^ ^)U5^yî1UÎUV^_`S\]WYíGíS]_UWVé'`S\iæ î1\êRÙë5íSïd\@ë5\@X^)\ZÅ^_Uóæ_UVïdñV\C^_`SRÀæíS])UVòSïd\@ë vW æ_eæq^_\ëÎWt^_RÀé òWV#é ^_]'WV#é æq\iWY]'é'`!WYXGZÅW `SRÙïdïÙûmé@ïÙRdëòSRdXSìÔïÙU}éWYïvæq\iWY]'é'`b Ô\æy^)]_\iæ_æ^_`GWY^.UkQS].RÙXñV\iæy^)RÙìWt^)RÙUkXGæWV]_\Õë.UV^_RdñtWt^_\iZÁòeVö WYXGZ è¼Ué@QGæ_\ZêUVXö+^_`G\Îý´ü:þ íS])UVòSïd\@ëDUVè1î`SRÀé'`Cî \5`GWñV\.éUVïdïd\é^)\ZêWJè¼\î ])\WYïZSWY^)WYû æ_WVë.íGïÙ\iæ@b ÔUV] RdXSìêUVXó^_`SRÀæ.íS])Wké^)RdéWYïWYíSíGïÙRÀé@WY^_RdUVX `WVæ5WYïdïÙUtî \ZÁQGæ-^_U RdZS\@X^_RÙè¼eÂ^_`G\ WVZ}ñtWYX^'WYìV\iæÍWVXGZÔæq`SUk]q^'éUkë.RdXSìæÍUVèÌ^)`S\5ëQSïÙ^_RÙûmWVìV\Xk^Eæ_\WV])é'`ö+î`SRÀé'`êî UVQGïdZêXSUY^`GWñk\ ò\\@X0í+Ukæ)æ_RÙòSïd\KRè:î1\Íî1\]_\KæqUkïÙñRdXSìE^)Ute.íG]_UkòSïÙ\ëÎæ@ö}WYXZJæqQG]_\ïÙeJZ}R JéQSïÙ^vRè:î1\Íî1\]_\ÍQGæyû RÙXSì0])WVXGZ}UVë5ïdeûìk\@XS\])WY^_\Z0íS])UVòSïd\@ëÎæbGüv`S\ò+\@`WñRdUV]'ævî \ERdZS\@X^_RÙè¼eWVXGZ&Z}\iæ_é@]_Rdò\`G\@])\ æq`SUkQSïÀZJò+\´ZSRÙ])\é^_ïde5WVíSíSïdRdéWYòSïd\Í^_U.ìV\@XG\@]'WYïcvç Ìæ´ÿ¼Rb \Vbdöò+\@ekUVXGZ0UVQS]víGWV]q^)Rdé@QSïÀWY]1WVíSíSïdRû é@Wt^)RÙUkX öSWVXGZÕî1\´WY])\éQS])]_\X^_ïdeÎñWVïÙRÀZSWY^_RdXSì-UkQS]éUkXGéïdQGæ_RÙUkXGæ UVX])WVXGZ}UVë5ïdeÎìV\XS\@]'Wt^)\Z XSUVX}ûmòSRdXGWY])eÕcvç 1æ@b üv`S\´]_\iæqQSïÙ^)ævUVèÓUkQS] æy^)QGZ}eé@WYXò+\´æ_QSë5ëÎWY])R \ZWVæ è¼UkïÙïdUtî æ@bSüv`G\Kë-QSïÙ^_RÙûmWVìV\@X^ WYíSû íS]_UWVé'`-\ }`SRdòSR^'æWYX5WYëÎW @RdXSìKWVòSRdïÙRÙ^ye^_UEWñkUVRÀZEïdU}é@WYïUkí}^_RdëÎWSb Ô\v^_]'WVé@\Ì^)`SRdæÌWYòSRdïÙRÙ^ye^)U R^'æ úGXG\ûmìV]'WYRdXWVXGZ Z}\@ûmé@\@X^_]'WYïdR @\Zé@UVX^_])UVïÓë.\ié'`GWYXGRdæ_ë RdX î`SRÀé'`CWVìV\@X^'æv^_])e^_U & &)(&])\WYïdR @\E^)`S\@Rd]´RÙXZ}RÙñRÀZ}QGWVïìVUkWVïdæî`SRdïÙ\ÎéUkë5ëQSXSRÀé@WY^_RdXSì0RdXGZ}Rd])\é^)ïÙe^)`S])UVQSìk` ^_`G\ \@XñRÙ])UVXSë5\Xk^ibþÍæÍWÎ])\æ_QSïÙ^öG^_`S\-ëQSïÙ^_RÙûmWVìV\Xk^KWYíSíS])UkWké'`&é@WYX æ_UVïdñV\E^_RdìV`^cvç 1æ î`S\X ^_`S\-UY^_`G\@]^yî U0WYíGíS]_UWVé'`S\iæ è¦WYRdïb+P Utî \@ñk\@]iöSîR^)` QSXGæ_UVïdñtWYòSïd\íS])UVòSïd\@ëÎæöGRÙ^)æò\`GWñRÙUk] ò\iéUVë5\iæ\]_]'Wt^)RdéÕWYXGZÂQGXS]_\ïÙRÀWYòGïÙ\WYXGZÁ]_\iæqQSïÙ^)æ-RdXóWCZS\WVZSïÙU}#é Âæy^'Wt^)\Vb ê\Õ^_]'WVé@\J^)`SRdæ æq`SUk]q^'éUkë.RdXSì´^)U^_`S\KZ}\é@\@X^_]'WYïdR @\ZÎé@UVX^_])UVïÓÿÃ^_`G\ñV\]_e.æ_WVë5\è¼\WY^_QS])\^_`Wt^ é@UVXGæq^_RÙ^_QS^_\æ ^_`S\æq^_])\@XSìV^_`0UVè^_`GRdævWYíGíS]_UWVé'` ÌWYXGZÕWYïÀæqU^_U.^_`G\ÍïÀWV#é ÎUVèÓZ}Rd]_\ié^vRÙX^_\]qûoWYìk\@X^ é@UVë5ëQSû XSRdéWt^)RÙUkXGæ@b Ô\.WV]_ìkQS\E^)`GWt^Z}\WkZ}ïdU#é RÀæÍWké^)QGWYïdïÙe&WÕíS])\éRdUVQæè¼\WY^_QS])\-^_`GWY^´WVïÙïdUtî æ QGæ ^_U&RÀZ}\@X^_RÙè¼eêé@UVX Rdé^)æö¥î`SRdé'` RÀæ Kûm`GWY]'ZCRdX ìV\XS\@]'WYïb ê\Jæ_`SUtî ^)`GWt^Eî`G\@X î1\.^)]_e ^_U0WñkUVRÀZ0^)`S\-ZS\WVZSïÙU}#é 0íG`S\@XSUkë5\@XSUkX&òe&WkZSZ}RdXSìÎìVïdUVòGWVïÓé@UVX^_])UVï¥^)UÎ^_`G\ëQGï^)RûoWYìk\@X^ æ_é'`S\ëÎWSö©î1\ÎZ}\iæy^)]_UteRÙ^)æ´Rdë5ëQSXSRÙ^ye ^)UÕïdU}é@WVïUVí}^)RÙëÎWSb Ô\ÎWY])ìVQS\^_`GWY^éUkX GRÀé^]_\iæqUVû ïÙQ}^)RÙUkX&]_\ QSRÙ])\ævXG\@ìVUV^_RÀWt^)RÙUkXÕë5\é'`WYXSRÀæqëÎæ WVë5UVXSì5^)`S\WYìk\@X^)æöSWYXZ0^)`QGæ RÀæ`S\QS]_RÀæq^_RÀé WYXGZÕíS])UVòSïd\@ë.ûoZ}\@í+\@XGZS\@X^b üv`SRdæKíGWVí\]Rdææq^_])QGé^_QS])\ZêWVæKè¼UVïdïdUtî æ@bç\ié^_RdUVX 0RÙX^_])U}Z}QGé@\æKëQSïÙ^_RÙûoWYìV\X^´òWVæ_\Z æq\iWY]'é'`öS^)`S\5ý´ü:þNíS]_UkòSïd\@ëö+WVXGZ&^)`S\ þ ë.U}Z}\ïbç\ié^)RÙUkXC÷ÕZS\æ)é])RÙò+\æúGñV\-\ }í+\@]_û RÙë5\@X^'æ@ö+æqQSë5ëÎWY])R @\ævUkQS] 
UkòGæq\]_ñtWY^_RdUVXGæöSWYXGZ&ZSRdæ)éQGæ)æ_\æv^)`S\E])\@ïÀWt^_RdñV\´í+\@]_è¼UV])ëÎWYXGé@\UYè ^_`S\1ë.\@^_`SU}ZSæ^_\iæy^)\Z¥btç\é^_RdUVX ZSRdæ)éQGæ)æ_\æí+Ukæ)æ_RÙòSïd\ÌWVíSíS])UkWVé'`G\æ+^_UÍWñVUVRÀZÍ^_`S\ Z}\WkZ}ïdU#é íS`S\@XGUVë5\@XSUkXÁWYXGZ íG]_\iæq\Xk^'æ^yî U&\ ^_\XGæ_RÙUkXGæ^)U þb :RdXGWYïdïÙekö:ç\ié^)RÙUkX íS])UtñRÀZ}\iæ Z}RÙ])\é^_RdUVXGæ è¼Uk]vè¼Q}^_QG]_\])\æ_\WV])é'`b ä àGáÌãÞÍâ þc1UkXGæy^)])WVRÙX^ç}Wt^)Rdæqè¦WVé^_RdUVX Ì]_UkòSïÙ\ë9ÿmcvç RÀæKZS\úGXS\iZ òe ,30 0 ,E2 Í]_QSïd\æbüv`G\WYìk\@X^ y^e íSRdéWYïdïÙe WVíSíSïdRÙ\iæ& 6* $ 0 ,E2 ´WYXGZCQGæ_\æ * >,30 0 ,E2 5îRÙ^_` WJíS])UVòGWVòSRdïÙRÙ^ye C^)U ìV\@^ UVQ}^UYèW.ïÙU}éWYï¥UVíS^_RdëQSëb RÀæÒ^_`S\véUkëòSRdXGWt^)RÙUkXUYèS^)`S\ =# $ $ 01,32 vWVXGZ *>,30 0 ,E2 v])QSïd\æbùm^ Rdæ:æ_RÙë5RdïdWV] ^_U \ Sé@\@í}^^)`GWt^ RÙ^ ]_\íSïÀWVé\iæ &* $ 0 ,E2 îR^)`K=# $ $ 01,32 @b Rdæ5^)`S\Cé@UVëòGRÙXGWY^_RdUVX UYèK^_`S\ = $'$ 01,32 @ö &* #$ 0 ,E2 Vö WYXGZ *+,E0 01,32 ]_QGïÙ\iæ@b}üv`S\´WYìk\@X^ÌúG]'æq^vWVíSíSïdRÙ\iæ =# $ $ 01,32 Í^)UúGXZ0RÙ^)æ XS\ ^ví+Ukæ_R^)RÙUkXbùmèÓRÙ^1è¦WVRÙïÀæ@ö R^WVíSíSïdRÙ\iæ b ¥ :RÙ]'æy^iö}^_`G\EWYìV\X^WVíSíSïdRÙ\i æ E^)RÙë5\æ^_`S\])QSïd\ = $'$ 01,32 @b}ùmè R^ è¦WVRÙïÀæ ^_U5úXGZ UVXS\kö}R^WVíSíSïdRÙ\iæv^_`G\ ])QSïd\Vb æ ¥ è¼Uk]^)`S\EúG]'æq^ ÎRÙ^_\])WY^_RdUVXGæÍWYXGZ ^_`S\XCRÙ^KWYíGíSïÙRd\æ
¥ üv`S\.WYìk\@X^KWVíSíSïdRÙ\i e Sb ö^yeíSRÀé@WVïÙïd RdQJ\^1WVïb+ô øè¼QS]q^)`S\@] ])\@í+UV]_^_\Z-^)`S\ è¼UkïÙïdUtîRÙXGìUVòGæ_\@])ñtWt^_RdUVXæ@bG ÿ Ìüv`S\éUæy^1UY;è =#%$'$ % 01,32 1RdXc !^_Rdë5\ÍRÀæÌë-QGé'`0æ_ëÎWYïdïÙ\]Ì^)`GWYXÎ^)`GWt^vUYè & 6* $ 0 ,E2 öî`SRÀé'`0])\ QSRd])\æ\ñtWYïÙû QGWt^)RÙXSì WVïÙï1íUæqRÙ^_RdUVXæ@bÌÿ üv`G\Jé'`GWYXGé@\ÎUYè æ_QGé@é@\æ)æyè¼QSïdïde úGXGZSRÙXSìêWíUæqRÙ^_RdUVX ^)U&ë.Utñk\ ^_UÔîR^)` =#%$'$ % 0 ,E2 ERÀæ QSR^)\Õ`GRÙìk`b ÿ÷ $'$ 01,32 .WVïÙïdUtî æEë5Ukæq^ÎWYìV\X^)æ^)UCúXGZ ò\@^q^_\]ÎíUæqRÙ^_RdUVXæ.WY^5^_`S\úG]'æy^Îæq^_\@íÒb ÿ ¥ UVQS^_í+\@]_è¼UV])ë5}æ ¥ ö:î`SRÀé'`óRdXó^_QS])X UVQ}^)í\]qè¼Uk]_ëÎæ RdXÕ^)\@])ëÎævUYèÓ])QSX^_Rdë5\Vb
C %'I> H CEI $ 9! ' H# I H CEIJ$
*ÍQG]ïdUéWYïÌæ_\WV])é'`¹ÿ Óç ´RÀæW `SRÙïdïÙûmé@ïÙRdëòSRdXSìCæ_\WV])é'` QGæ_RdXSì^_`S\Õë5RÙX}û éUVX GRdé^K`G\@QS])Rdæq^_RÀéè¼UV]´ñtWYïdQS\.æ_\@ïd\é^_RdUVX ô iõtømb+ùm^´ò+\@ìVRdXGæKîRÙ^_` WÕé@UVë5íSïd\^_\5WVæ)æ_RÙìkXSë5\@X^ ÿ¼XSUV^XS\ié\iæ_æ)WY])RÙïde éUVXæqRÀæy^)\@X#^ ÕWYXZ ^_])RÙ\iæJ^)UóRdë5íS])UtñV\ RÙ^&òe¹é'`GWVXSìVRdXSìÁRdXGé@UVXGæ_Rdæq^_\X^ WVæ)æqRdìVXSë5\Xk^'æRdXCUk])ZS\@] ^)UÕ])\ZSQGé\^_`S\.XQSëò+\@]KUYè1é@UVXGæq^_]'WYRdX^ÍñRdUVïÀWt^_RdUVXæ@b ê\.íS])UVíGWYû ìkWt^)\´^_`S\\ +\ié^UVè é@UVXGæ_RÀæy^)\@X^WVæ)æqRdìVXGë.\X^)æ UtñV\]v^_`S\EZ}UkëÎWYRdXGæUYè:^_`G\EñtWY])RdWVòSïÙ\iæ îR^)` RÙXGé@UVXGæ_RÀæy^)\@X^ÕWVæ)æ_RÙìkXSë5\@X^)æbÌüv`SRÀæ0Z}\iæqRdìVX¹Z}\iéRÀæqRdUVX!WVïÙïdUtî æ5QGæJ^_UÁ`GWVXGZ}ïd\ \ ©\é^_RdñV\ïÙe XSUVX}ûmòSRdXGWY])e éUVXæy^)])WVRÙX^)æb Ô\RÙë5íSïd\@ë5\@X^5ïÙU}éWYïvæq\iWY]'é'`ÂRÙX W ìV])\@\iZ}eêè¦WVæ_`SRÙUkXÁRdXÁ^_`G\ æq\XGæq\.^)`GWt^Eî \ÎZ}UXSUY^òGWk#é k^)])Wk#é UtñV\@]é@UVXGæ_Rdæq^_\X^Wkæ_æ_RdìVXSë5\@X^'æ@b &Uk]_\UtñV\@]iö+î1\ÎWYíSû íSïÙeÕW-]'WYXZ}UVë.ûmî WVï Jæq^_]'Wt^)\@ìVe5^)U5\æ)é@WYí+\Kè¼]_UkëïdUéWYï¥Ukí}^_RdëÎWô ømb ¹RÙ^_`&W.íS]_UkòGWYòGRÙïdR^ye ÿ ö+î \5é'`SUUkæ_\^)`S\5ñWVïÙQG\-UVè WÕñWV]_RÀWYòGïÙ\-QGæqRdXSìÕ^_`S\5ë5RdX}ûmé@UVX Rdé^´`G\@QS])Rdæq^_RÀéYö¥WVXGZ îR^)`êíS])UVòWYòSRdïÙRÙ^ye î1\5é'`SUUkæ_\-^_`SRÀæ´ñtWYïdQS\5])WVXGZ}UVë5ïdeVb SUkïÙïdUtîRdXSìJ^)`S\ÎRÙXZ}RdéWt^)RÙUkXGæKUYè ô tøöÓî1\Jé'`SUUæq\ S§ S Sb SQS]_^_`S\]-æq^_QGZSeêUkXê^)`S\0é'`SUkRdé@\ÎUYèv^_`G\JñtWYïdQS\JUYè è¼Uk]E^_`G\ ý´üÓþ íS]_UkòSïd\@ë îRdïdïÓò+\])\@í+UV]_^_\iZÕRdXÁô tømb GQS]_^_`S\]_ë5UV])\VöGî1\EQæq\E]'WYXZ}UVë ]_\iæy^'WY]_^)æv^)U òS]_\iW JUkQ}^UYè ïÙU}éWYï¥UVíS^_Rdë5WGb ' H# I H CEIJ$ ê\.Q}^)RÙïdR \.æ_e}æy^)\@ëÎWt^)Rdé-æq\iWY]'é'`^_\é'`GXSR QS\iæKòGWVæ_\Z&UkXCZ}\í}^_`}û úG])æq^ÍòWV#é ^_]'WV#é æq\iWY]'é'`Õ^)UÕæ_UVïdñV\^)`S\Îý´üÓþ íS]_UkòSïd\@ëbùoXCUVQS]ÍRÙë5íSïd\@ë5\@X^'Wt^_RdUVX ÿ¦[vü ö è¼UV])î WV])Z-é'`S\#é RdXSì.ô øGWYXGZ.òS]'WYXé'`}ûmWVXGZûmòUkQSXGZë.\ié'`GWYXGRdæ_ëÎæWY])\ RdX^_\ìV]'Wt^_\iZERdX^_U´^_`G\ æq\iWY]'é'`æy^)])WY^_\ìVeVbþóè¼QSïdï}ïÙUU ûoWY`S\iWVZæq^_]'Wt^)\@ìVe´î UVQSïÀZZS])Wkæy^)RdéWYïdïÙe´RdXGé])\Wkæq\Ì^_`S\vXQSëò+\@] UYèÒéUkXGæq^_]'WYRdXk^Ìé'`G\#é }æî`SRdïÙ\Í\ ©\é^_RdñV\@ïde5eRÙ\ïdZ}RdXSìEïdRÙ^q^_ïd\ úï^)\@])RÙXSì-æqRdXGé\^_`S\KWYíSíGïÙRÀé@WY^_RdUVX `GWVæ.ëÎWYXeÁë-Q}^_\ óWYXGZ ìVïdUVòGWVïéUVXæy^)])WVRÙX^)æÿ¦R^ÎRÀæÎWC])\æ_UVQS]'é\WYïdïÙU}éWt^_RdUVXóíS])UVòSïd\@ë b þæZ}\í}^_`}ûúG]'æy^ æ_\WV])é'`J\ }íGWYXGZGæ1XGUZS\æ RÙXW5æq\iWY]'é'`JíGWY^_`ö}î \Ké'`G\#é ÎRÙèÒ^_`G\K\ }íGWYXGæ_RdUVX UYèÌ^)`S\Îæq\iWY]'é'` íGWY^_`êéWYXêRÙë5íS])UtñV\.UVXC^_`S\ÎéQG]_])\@X^´ò\iæy^Eæ_UVïdQ}^_RdUVX+b *ÍXGé@\^)`S\ÎéQS])])\@X^ ò\iæy^-æ_UVïdQ}^)RÙUkX éWYXSXGUY^ò\0RÙë5íS])UtñV\iZ¥ö¥òGWV#é ^)])Wk#é &UééQS]'æbùoXÂWVZSZ}RÙ^_RdUVXÒö^_`G\JZ}eXGWYë5RÀé %'I+ : H CEI $ )
ñWV]_RÀWYòGïÙ\5WYXGZÔñtWYïdQS\5UV]'Z}\]_RdXSì0`G\@QS])Rdæq^_RÀé@æWV]_\ÎWVíSíSïdRÙ\iZCRdXÂ[ ü´bÒýKRdñV\XC^)`GWt^EíG]_UkòSïÙ\ëÎæ ë5WeÔò+\0UtñV\]qûoéUkXGæq^_]'WYRdXS\ZCî1\æqïdRÙìk`^_ïdeêë5U}Z}RÙúG\ZÁæ_\WV])é'` XSUV^^)U òGWV#é ^)])Wk#é Cî`S\XÁW Z}UVëÎWYRdX&îRdí\@ûUkQ}^U}é@é@QS])æöSòSQS^ÍUVXGïÙeÕî`S\@X&^)`S\é@QS]_])\@X^ íGWY^_` éWYXSXSUV^Rdë.íG]_Utñk\QSíUkX ^_`S\Eé@QS]_])\@X^RdXGéQSë-ò\X^b *ÍQS]Rdë5íSïd\@ë5\@X^)WY^_RdUVXRÀæZ}\æ)é])RÙò+\ZÕRdXZS\^)WVRÙïÒRÙXÂô Rtømb ÍåqàGåqä qã
ÓßSåqáÌÞ= & ! ,38!"# ^)`S\&éQSë-QSïdWY^_RdñV\ñtWYïdQS\UYè ^)`S\&])\@ëÎWYRdXSRdXSìÔéWYíGWkéRÙ^ye UYèKWVïÙï ý´üÓþÍæJWtèÃ^)\@]0Wkæ_æ_RdìVXSë5\@X^ ÿ¦é@UVïb kö õGöWYXGZ RdX ü:WYòÒb bÌüv`SRÀæÎíS])UtñRÀZ}\iæ.WVX \æq^_RdëÎWt^_\UVèÓî`S\@^_`S\]W5æq\iWY]'é'`Õæq^_]'Wt^)\@ìVeÎRÀæîvWVæq^_\@è¼QSï¥UYè ]_\iæqUkQS]'é\æb } b ^_`G\EXQSëò+\@]ÍUYèéUkXGæy^)])WVRÙX^é'`S\i#é }æ@öGé@UVQSX^)\ZQGæ_RÙXGì5^_`S\.éUVXñk\@X^_RdUVXUYè[vWVéû é'`QGæWYXGZ WVX[ \@\ Cô ø1ÿéUVïb Sö }öSWYXGZ -RdXü:WYòÒb b ê\])\@í+UV]_^v^_`S\´è¼UkïÙïdUtîRÙXGì-UkòGæq\]_ñtWY^_RdUVXGæ *ÍXSïde þ RÀæ-WVòSïd\Î^_UCúGXGZÁW è¼QSïÙïvæ_UVïdQ}^_RdUVXÂ^)UCWYïdïvæqUkïÙñtWYòGïÙ\JíG]_UkòSïÙ\ëÎæ0ÿ¦é@UVïdQSë5X RÕUYè1ü:WYòÒb b¥[ UY^)`Ô[vü WVXGZ :ç&è¦WVRÙï è¼Uk]WYïdïÓ^)`S\æ_\5RÙXæy^'WYXGé@\æb+ùoXC^_`SRÀæ´])\æ_í\ié^ö þ éïd\WV]_ïdeUVQ}^)í\]qè¼Uk]_ëÎæ^_`S\.UY^)`S\@]Í^yî1Uæy^)])WY^_\ìVRd\æWVXGZ WñkUVRÀZSæ ìk\^q^)RÙXGìÕæq^_QG#é RÙXQGæ_\@ïd\æ)æví+UV]_^_RdUVXGævUVèÒ^_`G\Eæq\iWY]'é'`Õæ_íGWVé@\Vb ¹`S\@XJ^_`S\Í])WY^_RdUEUYè©^_UV^)WVï+éWYíGWkéRÙ^yeE^_U^_UV^)WYïïÙUWVZ5RÀæìk]_\iWt^_\]^_`WYX ÿ¼^_`S\KíS])UVòSïd\@ë ëÎWeêUk]EëÎWeêXSUV^ò+\Õæ_UVïdñWVòSïd\ ö þ éïd\WY])ïdeCUVQS^_í+\@]_è¼UV])ë5æ[ üWVXGZ Óç©bc1UVXSû ñV\])æ_\@ïdeVöî`S\X´^_`S\Ì])WY^_RdU RÀæ¥ïd\æ)æ©^_`WYX ÿ¼íS])UVòGïÙ\ë»RÀæ¥XS\ié\æ)æ)WY])RÙïde UtñV\]qûoéUkXGæq^_]'WYRdXS\Z ö þ æEí\]qè¼Uk]_ëÎWYXé\5RdæK^)`S\Jî UV]'æy^iö¥WVææ_`SUtîX RÙX :RÙìb÷SbùoXZ}\@\iZ¥öî \Îë5W V\.^_`G\ éUkX y\é^_QS])\^)`GWt^ þ Rdæ XSUV^W-])\@ïdRdWVòSïÙ\^_\ié'`SXSR kQG\Íè¼Uk]æqUkïÙñRdXSìUtñk\@]_ûmé@UVXGæq^_]'WYRdXS\Z íS])UVòSïd\@ëÎæb *ÍX¹WñV\@]'WYìk\Ôÿæq\\ :RÙìb ö Óç í+\@]_è¼UV])ëÎæÎëQé'`óè¼\î1\]0éUkXGæq^_]'WYRdXk^0é'`G\#é }æ.^_`GWVX þöSî`SRÀé'`Õí+\@]_è¼UV])ëÎæ è¼\@î \@] é@UVXGæq^_]'WYRdX^é'`G\#é }æ1^_`GWVX[vü´b üv`SRÀæè¼\iWt^_QG]_\ÍUY+è ÓçJRdæ1QGæ_\è¼QSï+î`S\XÕé'`S\i#é RdXSìé@UVXGæq^_]'WYRdX^)æÍÿ¼\Vb ìGbdöXSUVX}ûmòSRdXGWY])e5éUVXSû æy^)])WVRÙX^'æ ö}òSQ}^ RÀæ W5éUkæq^_ïdeJUkí\])WY^_RdUVXb
[Table: experimental results for each data set and for the three search strategies (backtrack search, local search, and the multi-agent approach)]
[Figure: number of unassigned courses per data set (ratio) for systematic search, local search, and multi-agent search; data sets spring 01b (0.88), fall 02 (0.88), spring 03 (1.00), fall 01b (1.02), fall 01b (1.06), spring 03 (1.08), spring 01b (1.18), fall 02 (1.27)]
ïd\Wñk\æKë5UV])\5ý´ü:þæQGXGWVæ)æqRdìVXG\Z&^)`GWYXÂ[vü Uk] Óçóÿ¦éUkïb SSö }ö¥WYXGZ 3SRdX Óü WVòb ö1î`SRÀé'`!])WVRdæ_\æ0éUVXé\@])XGæJWVòUkQ}^ÕRÙ^)æWYòGRÙïdR^yeó^_Uó\ ©\é^_RdñV\@ïde \ }íSïdUVRÙ^0^_`G\ WñtWYRdïdWVòSïÙ\K])\æ_UVQS]'é\iæ@b ùoXEíGWV]q^)Rdé@QSïdWV]ö'è¼Uk]:çíG]_RdXSì >S>S@ò5ÿ'* ö Rý´ü:þæÓ]_\ë5WVRÙX´QSXQGæ_\Zbüv`SRÀæÒWVïdWV]_ë5RdXSì æqRÙ^qû QGWt^)RÙUkXíS])UVë5í}^_\iZQGæÒ^_Ué@ïÙUæq\ïÙeK\ SWVë.RdXS\^)`S\1æ_UVïdQ}^)RÙUkXGæìk\@XS\])WY^_\Zöî`SRÀé'`EeRÙ\ïdZ}\iZ UVQS]KRÀZ}\@X^_RÙúéWt^_RdUVXCUYè^_`G\ * &@," 14 %,E0 %, ÂZ}RÀæ_é@QGæ_æ_\Z RÙXêç}\é^)RÙUkXê÷Gb b :Rû XGWYïdïdeVö¥î \.é@UVë5íGWY])\Z ^_`S\Îò+\@`WñRdUV]KUYè þ UkXÔæ_UVïdñtWYòSïd\.WVXGZêQSXGæ_UVïdñWVòSïd\-íS])UVòSû ïÙ\ëÎævRÙXÕ^_\]_ëÎævUYèÓ^_`S\´XQSë-ò\]UYè:WVìV\Xk^'æ RÙ[X £m¤¥§¦©¨ª¦ª«m¬¬*¦®5í\]R^)\@]'Wt^_RdUVXÒbüv`G\ æqUkïÙñtWVòSïÙ\íS]_UkòSïd\@ëÎæWY])\ çíS])RÙXGì 3S+S òÿ[ ö WYïdï >S>S@òÿ¦x[ w* ö GWVïÙï 3S+S .ÿ¦[ öWVXGZ çíS])RÙXGì 3S+SV÷Áÿx[ w* Õÿ :RÙìb büv`S\QSXæqUkïÙñtWYòGïÙ\UVXS\iæRdXGé@ïÙQGZS\&ç}íS]_RdXSì 3S>S@ò!ÿ * WYXGZ WYïdï >S>S 0'ÿ * ´ÿ ÓRdìGbõ b \R^)`S\@]-^_`S\òGWkæqRÀé ÓçÂXSUV] þ ÿ¼Rb \kbÙö:îRÙ^_`SUkQ}^.])\æq^)WV]q^æq^_]'Wt^)\@ìkRÙ\iæ ´RdXGé@ïÙQGZS\æ.W ë5\é'`GWVXSRdæ_ë è¼UV]Rdë5íS])UtñRdXSì5^)`S\ QGWVïÙRÙ^ye0UVè^_`S\.æqUkïÙQ}^)RÙUkXRÙX&^)\@])ëÎæ UYèÌý´ü:þ íS])\èÃû \@])\@XGé@\æöYî`SRÀé'`5Rdæ^_`S\æ_\é@UVXGZGWY])eUVí}^)RÙë5R Wt^)RÙUkXJé])R^)\@])RÙUkXbtùoXGZ}\\Z¥öV^_`S\ kQWYïdR^ye-UYè ^_`S\æ_UVïdQ}^_RdUVXæè¼UkQSXGZ-òeE[vüÂRÀæ WYïdë5Ukæq^:éUkXGæ_Rdæq^_\Xk^)ïÙe`GRÙìk`S\@]ibiPUtî1\ñV\]ö@^_`SRÀæ ïÙ\XSìY^)` '
UYè^)`S\ÎæqUkïÙQS^_RdUVXGæè¼UVQSXZ òe þNUVXÔæqUkïÙñtWVòSïÙ\-RÙXæy^'WYXGé@\æö+î`GRdé'`êRdæÍ^_`G\-íS])Rdë5WV]_e UVí}^)RÙë5R WY^_RdUVXé])R^)\@])RÙUkXö}RdææqRdìVXSRÙúéWYX^_ïdeÎïdWV]_ìk\@] ^_U5ò+UY^)`[vü WYXGZ Óç¥b :RÙìæ@b KWVXGZ5õKæ_`SUtî ^_`GWY^^_`S\í+\@]_è¼UV])ëÎWYXGé@\ UVè þ Rdæë.Uk]_\æq^)WVòSïÙ\vî`S\XÎæqUkïÙñRdXSì æqUkïÙñtWVòSïÙ\´íS])UVòGïÙ\ë5æ ^)`GWYXî`S\Xæ_UVïdñRÙXGì.QGXGæqUkïÙñtWVòSïÙ\´íS])UVòGïÙ\ë5æb
[Figures: number of agents in zero position per iteration (iterations 1 to 191), shown for the data sets fall 2001b, spring 2003, fall 2002 (B), fall 2002 (O), spring 2001b (O), and spring 2001b (B)]
& C ' H$& ' %9CD%
q'hq
iteration 10
!& '!& G H
üÒ]'WV#é RdXSì ^)`S\íUæqRÙ^_RdUVXGæ.UVèÍRdXGZ}RdñRÀZ}QGWVïWYìV\X^)æ5Wt^5ñtWY])RÙUkQGæ.R^)\@]'Wt^_RdUVXæ@ö:î1\UkòGæq\]_ñk\Z ^_`S\^_`S])\@\^yeí+\æÍUYèÌWVìV\Xk^Íë.Utñk\@ë5\@X^Íæ_`SUtîX&RdX :RÙìb }bùoX ^_`SRÀæ úìVQS])\Vöî1\-QGæq\iZ^_`G\ RÙXGZS\ CUVèÌ^_`G\JWYìV\X^ æÍí+Ukæ_RÙ^_RdUVXê^_URÙXZ}RdéWt^)\.RÙ^)æWkæ_æ_RdìVXS\iZ ñtWYïdQS\Vbüv`G\-^)`S])\@\.^yeí\iæ´UYè ë.Utñk\@ë5\@X^'æ WV]_\K^)`S\´è¼UVïdïÙUtîRdXSì #
[Figure: index of position over the iterations for the three types of agent movement — variable, stable, and constant]
WY])RdWVòSïd\t^)`S\WYìk\@X^é'`GWYXGìV\æ RÙ^)æÌíUæqRÙ^_RdUVX5])\@ïÀWt^)RÙñk\@ïdeè¼]_\ QS\@X^_ïde5WYXGZ-è¦WYRdïÀæ ^_UúXGZ R^'æ £m¤#¥m¦ ¨§¦ª«m¬W¬ ¦®¥b ç^)WVòSïd\}^)`S\EWYìk\@X^])WV]_\ïÙeJé'`GWVXSìV\iæ R^'æíUæqRÙ^_RdUVX ÿ¼UkXGé\´Uk]v^yîRdé@\b
c1UVXGæq^)WVX^ S^)`S\-WVìV\Xk^úGXGZSæÍW £m¤¥§¦©¨ª¦ª«m¬¬*¦®0Wt^Í^_`S\-ò\ìVRdXSXSRdXSìJUVè^_`S\æ_\WV])é'`Òö YW XGZXS\ñV\]vé'`WYXSìk\æ R^ib rP CE q ê\.æq\@^^_`G\ëÎW }RdëQSëðXQSëò+\@]ÍUYèRÙ^_\])WY^_RdUVXGæ ^_U >S>SÎWVXGZ^_]'WVé# k\Z ^_`S\Îí+Ukæ_R^)RÙUkXGæ´UYèvWYìk\@X^)æKUtñV\]^)`S\Î\@X^)RÙ])\ÎZSWt^'Wæ_\^iö¥ìV])UVQSí+\ZCRÙX^)Uæ_UVïdñtWYòSïd\ÎWYXGZêQSX}û æqUkïÙñtWYòGïÙ\´RdXGæy^'WYXGé@\æb ê\UkòGæq\]_ñk\ZJ^_`S\´è¼UkïÙïdUtîRÙXGì ùoXêæ_UVïdñtWYòSïd\RdXGæq^)WYXé\æö+ë.Uæy^´WYìk\@X^)æKWY])\-æq^)WVòSïÙ\kö+WJè¼\@î WV]_\.éUkXGæy^'WYX^ö©WYXZ&XSUVXG\ UYè:^_`S\EWVìV\Xk^'æ RdæñtWV]_RÀWYòSïd\Vb ùoXJQSXGæ_UVïdñtWYòSïd\RdXGæq^)WVXGé\iæ@ökë.Uæy^vWYìk\@X^)æWV]_\ñWV]_RÀWYòGïÙ\kökWè¼\î!WY])\æq^)WYòGïÙ\köWVXGZ5XSUVXG\ UYè:^_`S\EWVìV\Xk^'æ Rdæ é@UVXGæq^)WVX^b
q'
F/$& !&H! %'I P$H&% &%
*ÍXKUkQS]¥^yî UvQSXGæ_UVïdñWVòSïd\RÙXGæq^)WVXGé\iæ¥UYè}ü:WYòb ´ÿ¼Rb \kbÙöç}íS]_RdXSì 3S>S@ò.ÿ'* ¥WVXGZ GWYïdï 3S>S Kÿ'* ö þ ïÙ\@èÃ^æ_UVë5\^'WVæ! }æQSXWVæ)æqRdìVXS\iZÂÿ¦é@UVïb R WYXGZ æ_UVë5\-]_\iæqUkQS])é@\æQGXQæq\iZÁÿ¦é@UVïb >S ö
#
YW ïÙ^_`SUkQSìV`öÓRÙX íS]_RdXGé@RÙíSïd\VöÓò\@^q^)\@]5æqUkïÙQ}^)RÙUkXGæé@UVQSïÀZÂò+\0])\Wké'`S\Z¥bÓ[1e é@WV]_\@è¼QSïdïÙe WYXGWVïÙe @û RÙXSì0^_`S\iæq\.æqRÙ^_QWt^_RdUVXæ@ö+î1\QSXGéUtñk\@])\Z^)`S\ 6* &@," -4 %,E0 %, :ö¥î`SRdé'`CRdæKWÎëÎWyUk] æq`SUk]q^'éUkë.RdXSìÍUYè þóWVXGZEëÎWe`SRdXGZ}\] R^'æÓQæq\@è¼QSïÙXG\æ)æÓRdX.íS])Wké^)Rdé@\Vbiùm^ RÀæ:XSUY^ UVQG]:é@ïdWVRÙë ^_`GWY^^_`S\EZ}\iWVZ}ïdU}#é ÎíS`S\XSUVë5\@XGUVXRdævQSXGR QS\´^)U þEbGùm^ ë5WeJWVïdæ_UÎæq`GUtî!QSíRÙXUV^_`S\] æq\iWY]'é'` WVïÙìkUV])R^)`SëÎæ@bÓP Utî \@ñV\]ö©^)`S\Jè¦Wké^E^)`GWt^ þ\ }`SRdòSR^'æE^)`SRÀææ_`SUV]_^)é@UVë5RÙXGì&î Wkæ XSUY^ XSUV^_RÀé\iZ0\iWY])ïÙRd\@]ib rP CE ¹R^)` ^_`G\.çíS])RdXSì >S>S òÔÿ * ZSWY^)WGöGî \E\ SWYë5RdXS\Z^)`S\í+Ukæ_RÙ^_RdUVXGæUYè \WVé'`ÔWYìk\@X^RÙXê^_`S\Îæq^)WY^_\ÎéUk]_])\æ_íUkXGZ}RdXSìÕ^_U^_`S\5ò+\æq^WYíSíG]_U }Rdë5WY^_\.æqUkïÙQ}^)RÙUkXCè¼UkQSXGZ¥ö WYXGZWYXGWVïÙe \Z0^_`S\EWVïÙïdU}é@Wt^)RÙUkXÕUVèÓ])\æ_UVQS]'é\iæ1^_U.^)Wk æ }æb üv`S\ ò+\æq^ WYíGíS]_U }RdëÎWt^_\ æ_UVïdQ}^)RÙUkX-è¼Uk]^)`SRdæÌíS])UVòSïd\@ë î Wkæ è¼UVQGXGZJWt^1R^)\@]'Wt^_RdUVX }ötîR^)` ÎéUkQS])æ_\ævQGXGWVæ)æqRdìVXG\ZWYXGZ RÎý´üÓþÍæQSXQGæ_\Z¥büv`S\´^_UV^)WYïÒXQGëò+\@] UVè é@UVQS]'æq\iæ RÙX^)`SRdæ íS]_UkòSïd\@ë Rdæ õ Sb Ô\UVòæq\]_ñk\Z-^_`GWY^Ì^)`S\QSXæ_WY^_RÀæyúG\iZÎéUkQS])æ_\æ1é@WVXJWVé^)QGWYïdïde-ò+\Íæ_\@])ñRdé@\Z òe ^_`S\êWñWVRÙïÀWYòGïÙ\köQSXQGæq\iZ¹ý´üÓþÍæ@b þðîvWVæ5XSUV^0WYòSïd\^_UóZ}U ^_`S\ Wkæ_æ_RdìVXSë5\@X^Îè¼Uk] ^_`S\0è¼UVïdïÙUtîRdXSì ])\WkæqUkXbÓüv`S\]_\0î1\]_\0æ_\@ñk\@]'WYïQGXGæ_WY^_RÀæyú\ZÁWVìV\@X^'æ0ÿ¼Rb \Vbdö:é@UVQS]'æq\iæ ´^_`GWY^ é'`SUkæ_\^_U0ë.Utñk\^_UÕWJí+Ukæ_R^)RÙUkX&RÙX&^)`S\@Rd]K]_\iæqí+\é^_RdñV\E])Utî æ éUk]_])\æ_í+UVXGZ}RdXSì5^)UÎ^_`S\5æ)WYë5\ WñWVRÙïÀWYòGïÙ\ý´üÓþöî`SRdïÙ\Õ^_`SRÀæJý´ü:þDéUVQGïdZ UVXSïdeÂò\&Wkæ_æ_RÙìkXS\ZÁ^_U WVæ.ëÎWYXeÁWYìk\@X^)æ5Wkæ R^'æÕéWYíGWkéRÙ^ye î UVQSïÀZ!WYïdïdUtîbÌüv`SRÀæÕæ_R^)QGWt^)RÙUkX¹]_\iæqQGï^)\Z RdXÅéUkXGæy^)])WVRÙX^'æJò+\@RdXSìÁòS])U k\@X WYXGZXSUkXS\EUYè:^_`G\WYìk\@X^)æv])\Wké'`SRÙXGì5 W £m¤#¥§¦©¨ª¦ª«¬W¬*¦®©bþæÍWÎéUkXGæq\ QS\@XGé@\VöGWVï^)`SUVQGìV` WYìV\X^)æEë5Utñk\ZÔ^_U ^_`GWY^-í+Ukæ_R^)RÙUkXöÓXGUVXS\éUVQGïdZ ò+\Wkæ_æ_RdìVXS\iZC^)`GWt^.íUæqRÙ^_RdUVXÒö:WVXGZ ^_`G\ éUV])])\æ_íUkXGZ}RdXSì5ý´ü:þ ])\@ëÎWYRdXS\ZQSXWVæ)æqRdìVXS\iZ¥b Ô\RÙïdïÙQæy^)])WY^_\´^_`GRdæ æ_RÙ^_QGWY^_RdUVXRÙX :RdìGb RSb ÌWVé'`5éRd]'éïd\vé@UV])]_\iæqí+UVXZSæÓ^_UEWKìVRdñV\XÎý´ü:þEb UY^)\v^_`GWY^^_`S\]_\RÀæ\ }Wké^)ïÙeEUkXS\ éRd]'éïd\ í+\@] ý´üÓþb ÌWké'`.æ QGWY])\v]_\íS])\æ_\@X^)æWVX5WVìV\Xk^ibVüv`S\í+Ukæ_R^)RÙUkX5UYè¥WYXJæ kQWY])\ UkX.^_`S\ é@RÙ]'éïd\Rdæ RÙ])]_\ïÙ\ñtWYX^ÌWYXGZJUVXSïde-Qæq\@è¼QSïGè¼Uk]ÌñRÀæqQGWVïÙR WY^_RdUVX5íSQS])í+Ukæ_\æbküv`S\]_\ ëÎWe.ò\ @\]_UUk]Ìë5UV])\ æ QGWV]_\iæ-UkX¹WêìkRÙñk\@X éRd])é@ïÙ\kb[ ïdWVX æ kQWY])\æ.RÙXZ}RdéWt^)\^_`GWY^Î^_`S\&í+Ukæ_R^)RÙUkX RdæÎW £m¤#¥§¦ ¨§¦ª«m¬W¬ ¦®Eè¼Uk]Ì^)`S\´WYìV\X^ Y^)`S\æ_\îRdïdï+eRd\@ïÀZÎ\ ©\é^_RdñV\KWVæ)æqRdìVXSë5\Xk^'æ@büv`S\úGïÙïd\ZÕæ QGWV]_\iæ RÙXGZSRdéWt^_\Ì^_`GWY^:WVï^)`SUVQSìk`^_`G\Ìí+Ukæ_R^)RÙUkXERdæ^_`S\ ò+\æq^ÓUVXG\è¼UV]Ò^_`S\ WYìk\@X^öiR^:]_\iæqQGï^'æRdX-æ_UVë5\ òS]_U V\XÕéUkXGæq^_]'WYRdXk^'æ@b}üv`QGæRÙ^ RÀævXSUY^ W £m¤#¥m¦ ¨§¦ª«m¬W¬ ¦®¥ökWVXGZ0^_`S\EWké^_QWYïWVæ)æ_RÙìkXSë5\@X^ UYèÌ^)`S\Îí+Ukæ_R^)RÙUkXC^)UÕ^)`S\ÎWYìk\@X^Eé@WYXGXSUY^´ò\ÎëÎWkZ}\Vbüv`S\Îé@RÙ]'éïd\æ´í+UVíSQGïdWY^_\ZêòeCæ_\@ñV\])WVï úGïÙïd\Zæ QGWY])\ævWY])\Eý´üÓþÍæ ^_`GWY^ ]_\ëÎWYRdXÕQSXQGæ_\Zb
[Figure: a deadlock state — agents in zero position vs. agents in deadlock; total agents: 65, agents involved in deadlock: 24, unused GTAs: 8]
% 6* &@," #$ * $ ¹`S\X ^)`S\UkXSïÙeí+Ukæ_R^)RÙUkXGæKWVé@é@\@í}^'WYòSïd\E^)UÕWÕæqQSòæq\@^ è¼UV]5WYìV\X^)æWV]_\0ëQ}^)QGWYïdïÙe \ }é@ïÙQæqRdñV\köÓWCZS\WVZSïÙU}é# ÔUééQS]'æ^_`GWY^-íS])\@ñk\@X^)æ-WYXeêUVè ^_`G\ WYìV\X^)æ RÙX^)`S\EæqQGòGæq\@^è¼]_Ukëò+\@RdXSìJWYïdïÙUtî \ZJ^_UÎë5UtñV\Í^_U.^)`S\]_\ QS\æq^_\iZ0í+Ukæ_R^)RÙUkXb UVXG\&UYèK^_`S\ ñtWV]_RÀWYòSïd\æ.RdX!WÂZ}\WkZ}ïdUé# RdæÎRdXGæq^)WVXk^)RdWY^_\iZ¥öÌWYïÙ^_`GUVQSìk` æ_UVë5\ éUkQSïÀZóò+\Vb SQS]_^_`S\]öSW5Z}\iWVZ}ïdU}#é Jé@WVQGæq\iæ1^_`S\´ò+\@`GWñRdUV]vUYè þ»^_U5Z}\ìV]'WVZ}\kb ¹`G\@X&æqUkë5\KWVìV\@X^'æ WY])\0RdXóWÔZ}\WkZ}ïÙU}#é Âæq^)Wt^)\Vö UVXG\0î UVQSïÀZÁ`SUkí\Õ^_`Wt^.^_`S\RÙXZ}\@í+\@XGZS\@XGé@\UVè ^_`G\WVìV\@X^'æ î1UkQSïdZWVïÙïdUtî¹^_`G\@ë ^_UÎìV\@^ UVQ}^ UVèÓ^)`S\EZ}\iWVZ}ïdU}#é Cÿ¼Uk]]_\ë5WVRÙXRdXR^ vîR^)`SUVQS^W ©\é^)RÙXGì ^_`S\5æy^'Wt^)QGæUVèÌWYìk\@X^)æRÙX £¤#¥§¦©¨ª¦ª«m¬'W¬*¦®¥b *ÍQS]ÍUkòGæ_\@])ñWY^_RdUVXGææq`SUtî ^)`GWt^Í^_`SRÀæKRdæÍXSUV^ ^_`S\5é@Wkæq\kbùoXGZS\@\Zö þNRdæÍXSUV^´WVòSïÙ\-^_UWñVUkRdZ&ZS\WVZSïÙU}#é }æÍWVXGZ eRd\@ïÀZSæKWÕZ}\ìV]'WVZSWY^_RdUVX UYè^_`G\æ_UVïdQ}^_RdUVX RÙX ^_`S\æq\XGæq\0^_`GWY^.RÙ^ÎZ}U\æ-XSUY^.ë5W }RÙë5R @\J^)`S\XQGëò+\@].UYèéUkQS]'æq\iæ æ_WY^_RÀæyúG\iZ¥bùoXGZ}\\Z¥ö}æ_QSòGæ_\ QS\@X^vR^)\@]'Wt^)RÙUkXGæÌUVè þEö}RdXGæy^)\WkZJUYèÓë.UtñRdXSì-WVìV\Xk^'æÌUkQ}^vUYè Z}\WkZ}ïÙU}#é .æqRÙ^_QGWY^_RdUVXGæöVë5Utñk\Z.WYìV\X^)æWVïÙ])\WkZ}eRdX £m¤¥§¦©¨ª¦ª«m¬¬*¦®EUVQS^^_`S\RÙ]ÌíUæqRÙ^_RdUVXGæ WYXGZWt^_^_\ë.íS^ ^_U.úGXGZUV^_`S\] £m¤¥§¦©¨ª¦ª«m¬¬*¦®Sæè¼Uk]v^_`S\ëbüv`S\Eé@QS]_])\@X^ò+\æq^ÍæqUkïÙQS^_RdUVX Rdæv^)UY^'WYïdïÙe0ZS\æq^_])UteV\ZöWVXGZ0^_`S\ò+\@`GWñRdUV]vUYè:^_`G\Eæqe}æq^_\@ë Z}\@ìk])WkZ}\æb
4 6,>= & 0
. * ,$ )49, $ # 4 2 ,E8 054&0 % $)*$ , N, &) %& ( $ (:. !,$ $ $ #-, ,E2 "6, $'* "* '8 $ % $ % ,38 #& ( $ M*64+4&) "*+=%&)$'( , $ $ #" 78 $), 8 ,&?2E*>= &Q46,>= & 0 ùm^5Rdæ-Rdë.í+UV]_^)WVX^ ^_`GWY^ þ ò+\Eë5U}Z}Rú\ZWYXGZ\XS`GWYXé\ZîRÙ^_` W5éUkX GRÀé^ ])\æ_UVïdQ}^_RdUVXë5\é'`GWVXSRÀæqë ^_`GWY^ WYïdïÙUtî æ RÙ^^_UÎRÀZ}\@X^_RÙè¼eÕWVXGZæqUkïÙñk\´ZS\WVZSïÙU}é#} æ@b Ô\EZ}RÀæ_é@QGæ_æ ^)`SRÀæRdæ)æ_QS\´RÙX ç\ié^_RdUVX b
= #"*E8
7
)
)
8
q'
I>& %
"
S])UVëUkòGæq\]_ñRdXSìê^_`S\ò+\@`GWñRdUV]5UYè þðUkXÁ^_`G\Cý´üÓþðíS])UVòGïÙ\ëöî \&éUkXGéïdQGZ}\^_`G\ è¼UVïdïÙUtîRdXSì #
$'$ 01,32 /2 &* #$ 0 ,E2 Gþ k\@eÎí+UVRdX^vRdX0RÙ^_\])WY^_RdñV\@ûRdë5íS]_Utñk\@ë5\@X^1æq^_]'Wt^)\@ìkRÙ\iæ Rdæ0^_U RdZ}\X^_RÙè¼e!WÂìVUU}Z XS\@RdìV`ò+UV])RÙXGì æy^'Wt^)\VbÌùoX þöÌ^)`SRdæÕRÀæÕWké'`SRÙ\ñV\iZóòe ^_`G\ ]_\iWVé^_RdñV\0])QSïd\æb ùoX &RdX^_UVXó\^ÎWYïbô iõtømöÓ^)`SRÀæ-RÀæ-^_`S\ë5RÙXSûmé@UVX GRÀé^Î`S\QS])Rdæq^_RÀéYb Ô\ XSUY^)Rdé@\Z^)`GWt^ =#%$'$ % 0 ,E2 íG]_UtñRÀZ}\æë5Uk]_\EUkíSí+UV]_^_QSXSRÙ^_Rd\æ^_UÕ\ }íSïdUV])\^_`G\.æ_\WV])é'` æqíWVé\´^)`GWYX &* $ 0 ,E2 ZSU\iæ@ö+WYXGZ&WñkUVRÀZSævìV\@^q^)RÙXSìÕæy^)QG#é ÕRdX&ïdUéWYïUkí}^_RdëÎWSb ¹R^)` &* $ 01,32 @öWVXÎWYìV\X^ë.Utñk\æ:^)UER^'æò+\æq^1í+Ukæ_R^)RÙUkX.î`G\@])\R^1æq^)We}æbVüv`SRÀæRdXGé])\Wkæq\iæ ^_`S\ ZSR JéQGï^)RÙ\iæJUYè´UY^)`S\@]JWVìV\Xk^'æ5WYXGZó^_`G\ éUVë5íSïd\ }RÙ^ye UVè^_`G\ íS])UVòSïd\@ëöî`SRÀé'` QSRd#é ïdeJò+\éUkë5\æv`GWV])Z}\]v^_UÎæ_UVïdñV\Vb *" $'2 =# *32 , R ©\@])\@X^Óò+\@`GWñRdUV]'æÒæ_RÙìkXSRÙúé@WVXk^)ïÙeKW ©\é^Ò^_`G\Ìí+\@]_è¼UV])ë5WVXGé\ÌUYè þb ê\Kè¼UVQGXGZÎ^)`GWt^ ¥ ]_\iæqQGï^'æÌRdX0^)`S\Kò\iæy^ò+\@`GWñRdUV] RÙX0^_\@])ëÎæ UYèÒ])QSX^_Rdë5\ WYXGZÎæ_UVïdQ}^)RÙUkX QGWVïÙRÙ^yeVbþ ^^_`G\ ò\ìVRdXSXSRdXSìEUVè+^)`S\æ_\WV])é'`9ö =# $ $ 01,32 Ìé@WYX QSRÀ#é ïÙe ìVQSRÀZ}\5ë5UV])\ÎWYìV\X^)æK^_UtîvWY]'Z^)`S\@Rd] £m¤#¥m¦ ¨§¦ª«m¬W¬ ¦®¥büv`G\@X &* $ 01,32 ´íG]_\ñV\@X^'æ Z}]'WVæq^_RÀé-é'`GWVXSìV\iæRdX ^)`S\ÎéQS])])\@X^´æq^)WY^_\.î`SRdïÙ\ÎWYïdïdUtîRÙXSìWYìk\@X^)æ^_URÙë5íS])UtñV\^_`S\RÙ] íUæqRÙ^_RdUVXæ@b ÓRdXGWVïÙïde * >,E0 0 ,E2 1ZS\WYïÀæ\ +\ié^)RÙñk\@ïde-îRÙ^_`ÕíSïdWY^_\iWYQJæ_R^)QGWt^)RÙUkXGæÌWVXGZ ïÙU}éWYï¥UVíS^_Rdë5WGb $ *>= &-2 8 $ *>= & 2E, &?8$' , ©þæ`SRÙìk`SïdRÙìk`k^)\ZRdX :RÙì b ÎWVXGZ õSöG^_`S\-\@ñVUkïÙQS^_RdUVX UYè þ Wké])Ukæ)æKR^)\@]'Wt^_RdUVXæ@öWVï^)`SUVQSìk`êXGUY^EXS\ié\æ)æ)WY])RÙïde&ë5UVXSUV^_UVXGRdéVö¥RdæEæq^)WVòSïd\.UkX æqUkïÙñtWVòSïÙ\víS])UVòGïÙ\ë5æÌWYXGZ.ìV]'WVZSQGWYïdïÙeë5Utñk\æÓ^_UtîvWY]'Z-WÍè¼QSïdïæ_UVïdQ}^)RÙUkXb *ÍX5QSXGæ_UVïdñtWYòSïd\ íS])UVòSïd\@ëÎæöiRÙ^)æ:\ñVUVïdQ}^)RÙUkXERdæ:QSXGíS]_\iZ}RÀé^)WVòSïd\ WVXGZWVíSí\iWY]'æ^)UÍUæ_é@RÙïdïÀWt^_\væ_RÙìkXSRú+é@WYX^)ïÙekb üv`SRÀæéUVë5íSïd\@ë5\Xk^'æ:^_`S\v])\æ_QSïÙ^)æ UYè RdQ-\@^WYïbSô ø}òeEé'`GWY]'WVé^_\@])R @RdXSì ^)`S\vò\`GWñRÙUk] UYè þ UkXUtñk\@]_ûmé@UVXGæq^_]'WYRdXS\iZ5íS]_UkòSïd\@ëÎæ@ö}î`GRdé'`Õ^_`S\e0`GWkZXSUY^ æq^_QGZ}Rd\Zb S])UVëéUkë5íGWY])RÙXGìæqe}æq^_\ë5WY^_RÀéYöïdUéWYï¥WYXZ0ë-QSï^)RûoWYìk\@X^ æ_\WV])é'`0UVXJ^)`S\Eý´üÓþ íS])UVòSû ïÙ\ëöVî1\vRÀZ}\@X^_RÙè¼eE^)`S])\@\víGWY]'WYë5\@^_\@]'æÓ^_`GWY^Ìæq\\@ë ^)UZ}\^)\@])ë.RdXS\v^)`S\ò\`GWñRÙUk] UYè+æ_\WV])é'`Òö XGWYë5\@ïde ÿ 5^_`S\ÔéUVX^)]_Ukïæ_é'`G\@ëÎWSöÿ Î^_`S\&è¼])\@\iZ}UVë ^_U QSXGZ}U WVæ)æqRdìVXSë5\Xk^'æÎZ}QS])RÙXGì æq\iWY]'é'`öÓWVXGZ!ÿ¦÷ ´^)`S\ÕîvWe éUVX GRdé^)æ.WY])\0æ_UVïdñV\Z WYXGZÁZS\WVZSïÙU}#é }æEòS])U k\@Xb:[ \@ïdUtîöÒî \ Z}Rdæ)éQæ_æÓ^_`G\1ò+\@`GWñRdUV]ÓUYèS^)`S\ ^_`S])\@\ æy^)])WY^_\@ìkRÙ\iæÒî \1^_\æq^_\iZERdXïdRÙìk`^:UVèS^_`S\iæq\ íGWV])WVë.\@^_\])æb ê\5\ }í+\é^K^)`SRÀæWYXGWVïÙe}æ_Rdæ^_Uò+\.ìk\@XS\])WVïÙR WVòSïd\Eò\eVUkXGZCUkQS]KïÙRdë5R^)\Z éUkXGæqRÀZ}\])WY^_RdUVXGæb *ÍQS] WYXGWVïÙe}æ_Rdæ RÀæ æ_QSë5ë5WV]_R @\iZ0RdXüÓWVòbG÷Sb
BN%
½ ¾ {©u
DI3 % &
S®'¯mV±]X 0° c]_(l ka
³ zzÍhxlyg´sgt
p@+giVlyrszEp ¹uYlqp@|sx|}x'µpYrsgn {rp|ksx lygKgt
'p+gklyrszEp
5p'w-wVrsx'.rshjmlqp@|rssrslw a_'` ka
° Y XS® § b4i bwkg jehd b
Û©sx)¸Vrs|sx
±i®´¯XVmV+® n_g Ml_ddehjj/k'a ~ xpksgt
_ uVµkgnolyx)njogVlyrsgihj b4*^ ehijeBl
¹uYgisx'jlyrsiµtl fukÄj n_Zb^bb?ok %^_(kl krs
_Ywjmlqp|rssrsÚ'x'j ¹ {ghix_n1jogsklyrsgihkj ÛprssjÌlyg´jogsxvlyrsiµtl fukÄjÌx)x'h-·rlyµ nqphkgzhx'joj! nyx'jmlqpnolÌjmlonqp@lyx'rx)j ">ga ^&#b4g`klvuj^klvuehg b4*^ ehijeBl krs
_Ywjmlqp|rssrsÚ'x'j {ghix_n1jogsklyrsgihkj ¹ ÛprssjÌlyg´jogsxvlyrsiµtl fukÄjÌx)x'h-·rlyµ |p@
_tlonqp@
_Yrsh# nyx'jmlqp@nol1jmlonqp@lyx'rsx'j
Ï Ð ¹ Ð uYlqkµ p@nq|p@sjoxe µ|}x'ij x'jMµb4dpYkrsgjeBn l U °k&¬ VYX rG² \ _d k^ ehg$ jYb` b#k(e_^4i_Dni b?k^l ij^?kjMb e/b4iehgz_*^nehd!macb4db4g#j/kjeB_gmp
% CE% IJ$& 5BN
% & '+
%'I+ ùoX þö\Wké'`ÔWVìV\@X^´RdæéUVXé\@])XS\iZ îRÙ^_`ö YW XGZÁè¼Ué@QGæ_\æ.UVXö WVé'`GRÙ\ñRdXSìêR^'æ-UtîXóïÙU}éWYï ìVUkWVï ¹ë5UtñRÙXSì ^_UêWêë5RÙXGRÙëÎWYïvñRdUVïÀWt^_RdUVXSû ñWVïÙQG\&íUæqRÙ^_RdUVXbüv`GRdæÎRdXGé@]_\iWVæ_\æ-^_`S\ è¼])\@\iZ}UVëUYèWYX WYìk\@X^5^_UÂ\ íGïÙUk]_\RÙ^)æJæ_\WV])é'` æqíGWké\köî`SRÀé'`êWVïÙïdUtî æKæq\iWY]'é'`^_UWñVUVRÀZïdUéWYï UVíS^_Rdë5WGb+þÍæ´W0])\æ_QSïÙ^ö+^_`S\ þ `GWkæÍWVX RÙXS`G\@])\@X^ RÙë5ë-QSXSRÙ^yeJ^_UÎïdU}é@WVïUVí}^)RÙëÎWGbGüv`S\ìVïdUVòWYï¥ìVUWYï©UYè ë5RÙXGRÙë5R @RdXSì0éUkX GRÀé^'ævUYèW æy^'Wt^_\0RdæRÙë5íSïdRdé@R^)ïÙeÂé@UVX^_])UVïdïÙ\iZÔòeê^_`G\J\@XñRÙ])UVXGë.\X^ öÒ^)`S])UVQSìk`Âî`SRdé'` ^)`S\0WVìV\@X^'æ é@UVë5ëQSXGRdéWt^_\ tWYë5UVXSìÍ\Wké'`EUY^)`S\@]ibiüv`GRdæ éUkë5ëQSXSRÀé@WY^_RdUVXEë5\iZ}RÙQGë WYXGZïÙU}éWYïéUVX^)]_Ukï æ_é'`S\ëÎWÕUVè þWY])\.\ +\ié^)RÙñk\.î`G\@Xê^_`S\ÎíS])UVòGïÙ\ë,Rdææ_UVïdñtWYòSïd\Vö¥òGQ}^^_`S\e&è¦WYRdïî`S\X íS]_UkòSïd\@ëÎæWY])\vUtñV\@]_ûoéUVXæy^)])WVRÙXS\iZ¥b@ùoXZ}\@\iZ¥öYUVX.QSXGæ_UVïdñtWYòSïd\RdXGæy^'WYXGé@\æö þ RÀæQSXGæq^)WVòSïd\ WYXGZé@WVQGæq\iævUkæ)éRdïÙïÀWt^)RÙUkXGæb üv`SRdæZ}\é@\@X^_]'WYïdR \ZéUkXk^)]_Ukïæq`GUVQSïÀZò\é@UVX^_]'WVæq^_\iZ0îRÙ^_`^)`S\é\X^_]'WYïdR \ZéUVX^)]_Ukï UYèÍïdU}é@WVï æq\iWY]'é'`Áî`S\@])\^_`S\XS\RÙìk`òUk]_RdXSì æy^'Wt^_\iæÎWY])\Õ\ñtWYïdQGWt^)\ZóìVïdUVòGWVïÙïdeÂòe Wêé@\@X}û ^_]'WYïdR \Z&è¼QGXGé^)RÙUkXbýKïÙUkòGWYïéUkX^_])UVïÒQæq\iZ RdX :ç&ïd\WVZGæ ^)UWæy^'WYòSïd\.í\]qè¼Uk]_ëÎWYXé\ ^_`G\ ë.Utñk\@ë5\@X^1^_U5Wæ_QGé@é@\æ)æqUk]1æq^)Wt^)\ÍRÀæöRdX0ìk\@XS\])WVïöWYïdïdUtî1\iZ5UVXSïdeÎî`S\@X0^_`S\KXS\RÙìk`ò+UV])RdXSì æy^'Wt^_\])\ZSQGé\iæv^_`S\ìkïÙUkòGWYïé@Ukæq^öGæ_QGé'`&WVæ^_`S\^)UY^'WYïXQSë-ò\] UYè òS])U V\Xé@UVXGæq^_]'WYRdX^)ævRdX ^_`S\Eæq^)WY^_\VbP Utî \@ñV\]ö^)`SRdæ RÙXGZUVè é@UVX^_])UVï¥Utñk\@])ïÙeÎ])\æq^_])Rdé^)æ ^)`S\ë5UtñV\@ë5\Xk^UVè WVìV\@X^'æ WYXGZ ^_`S\.^_`G\5æ_\WV])é'`C\WVæ_RdïÙe&ìk\^)æÍ^)])WVíSí\iZ RdXêïdU}é@WVï:Ukí}^_RdëÎWSö©î`SRÀé'`êRÀæKQSXSïdR k\@ïde^_Uò+\ UtñV\@]'éUkë5\\@ñk\@XîR^)`&])WVXGZ}Ukë ]_\iæy^'WY]_^)æ´ô ømb ùoXÕòGWk#é ^_]'WV#é Îæq\iWY]'é'`ö}WYïÙ^_\]_XGWY^_RdñV\´æqUkïÙQS^_RdUVXGævWY])\K\ SWYë5RdXS\ZÕRÙX&W.æqe}æq^_\ë5WY^_RÀéKî Wekb ýK\@XS\])WVïÙïdeÂæqí+\W RdXSìöÒî \Õ\R^)`S\@].\ }íGWVXGZÁW íWY]_^_RÀWYïvæqUkïÙQ}^)RÙUkXÁUV].î \Õé'`S])UVXSUkïÙUkìVRÀé@WVïÙïde éUVXæqRÀZ}\@] Rdë.ë5\iZ}RdWY^_\´WYïÙ^_\]_XGWY^_RdñV\iæ^_U-^_`G\ÍïÀWVæq^ Z}\iéRÀæqRdUVXÒb ÍæqQGWVïÙïdeVöî \])\é@UV]'Z.^_`S\´ò+\æq^ æqUkïÙQ}^)RÙUkX5è¼UVQSXZJæqU´è¦WV]1WkæÌWYXÎRdXGé@QSëò+\@X^ WVXGZ5QSí©ZSWt^)\RÙ^1UkXSïÙe.î`S\X0Wò+\^q^)\@]væqUkïÙQS^_RdUVX Rdæ1è¼UVQGXGZ¥b}þævW-]_\iæqQSïÙ^ök^_`G\ QGWVïÙRÙ^ye5UYèÓæ_UVïdQ}^)RÙUkXGæÌRdë5íS])UtñV\æ1îRÙ^_`J^)RÙë5\´WYXGZJ^_`S\æ_\WV])é'` RdæK^yeíSRÀé@WVïÙïde æy^'WYòSïd\Vb¥PUtî1\ñV\@]iö^_`S]'WVæ_`SRdXSìÕRÀæÍ^_`S\ÎíG]_RÀé\-^_UíGWeè¼UV]´^)`S\Îæy^'WYòSRdïdR^ye WVXGZ éUVë5íSïd\^)\@XS\iæ_æ:UVè+æ_\WY]'é'`b Ô\1^)\æq^_\iZ-ò+UY^)`-`G\@QS])Rdæq^_RÀévWYXGZ.æq^_U}é'`GWVæq^_RÀé1òGWV#é ^_]'WV#é æ_\WV])é'` ô Stø1WVXGZêè¼UVQSXGZÔ^)`GWt^-òGWV#é ^_]'WV#é RdXSìXS\@ñk\@]ìkU\iæò+\@eVUkXGZê^_`S\J^_`SRd])Z UVèv^_`S\0ZS\@í}^)`ÂUYè ^_`S\^_])\@\ UkXJUVQS]1íS])UVòSïd\@ëÎæb WVXGZ}UVë ])\æq^)WY]_^1æq^_]'Wt^)\@ìVRd\æÌWYXGZÎé@]_\iZ}R^_ûòWVæ_\ZÎæ_\WY]'é'`ÎWY])\ î We}æ1^_UJWñkUVRÀZÎ^_`GRdæv^)`S]'WVæ_`SRÙXGìGöòSQ}^ ^)`S\@eÕæ_Wké])Rúé@\éUVë5íSïd\^)\@XS\iæ_æb
CEH!&% % &!% #H > þ ë5UkXSì.^_`S\Í^_`S])\@\æq^_]'Wt^)\@ìVRd\æ1î \^)\æq^_\Zæ_Uè¦WV] ¼ÿ î \WY])\v^_\iæy^)RÙXSìUY^_`G\@]'æ öVUVXGïÙe þ¹îvWVæÌWYòSïd\^_U-æqUkïÙñk\UVQS]`WY]'Z¥öVæ_UVïdñWVòSïd\RÙXGæq^)WVXGé\iæ@b üv`SRdæWYòSRdïÙRÙ^ye0éWYXò\´^)])Wké\Z0^_U5RÙ^)æ WVòSRdïÙRÙ^yeJ^)U.QGXGZ}UJWVæ)æ_RÙìkXSë5\@X^)æb ùoX þö+WVXCWYìk\@X^ÍéWYX QSXZ}U0RÙ^)æKWVæ)æqRdìVXGë.\X^ÍWVæXS\@\iZ}\Z¥ö©\ñV\@X&RÙèÌRÙ^KRÀæWÕéUkXGæqRÀæqû ^_\@X^UVXG\Vb©ùoX è¦Wké^ö¥XGUWVìV\@X^Kë5We&])\@ëÎWVRÙX RdX W0ìkRÙñk\@X í+Ukæ_R^)RÙUkXCQSXGïÙ\iæ_æK^)`SRdæKí+Ukæ_R^)RÙUkX RdæÎWké@é\í}^)WVòSïd\Õ^)UÁWYïdïUY^)`S\@]JWVìV\@X^'æ@ö:^_`Wt^JRÀæ@öRÙ^J]_\ëÎWYRdXGæÎW £¤#¥§¦©¨ª¦ª«m¬'W¬*¦® Wké])Ukæ)æ R^)\@]'Wt^_RdUVXæ@bSüv`SRÀævè¼\iWt^_QG]_\Eæ_\@\ë5æ^_UÎò+\´^_`S\EëÎW yUV]v]_\iWVæ_UVXî`e þ Rdæ WVòSïd\Í^)UJæqUkïÙñk\ æqQGéé\iæ_æqè¼QSïdïÙeÔïdWV]_ìk\Vö¥^)RÙìk`^-íS])UVòSïd\@ëÎæ´^)`GWt^.]_\iæqRÀæq^_\ZÔ^)`S\0UV^_`S\]^_\ié'`SXSR QS\æ-î1\J^_\æq^_\iZ ÿ¼Rb \kbÙö}^)`S\Eæ_UVïdñWVòSïd\KRdXGæq^)WVXGé\iævUYèüÓWVòb -î1\]_\´UVXGïÙe0æ_UVïdñV\iZ0òe þ b ùoXCéUkXk^)])Wkæy^iöSRdX òUV^_`Ôæqe}æq^_\@ëÎWY^_RÀéEWYXGZ `SRÙïdïÙûmé@ïÙRdëòSRdXSìæq\iWY]'é'`öGWJñtWVïÙQS\RdæKWVæ)æqRdìVXG\Z ^_UE^)`S\KñWV]_RÀWYòGïÙ\^_`GWY^vé@ïdWVRÙëÎæ1R^ úG]'æy^iökUkXÕWú])æq^qûoéUkë.\köYúG]'æq^qûoæq\]_ñk\Z5òGWkæqRÀæ@b *ÍQS] RÙë5íSïd\û ë.\X^)Wt^)RÙUkX UYèïÙU}éWYïÌæ_\WV])é'` ÿ¦W&`GRÙïdïûoéïdRdëòSRdXSì æq^_]'Wt^_\ìVeêîR^)`ÁW&éUVë-òSRdXGWt^)RÙUkXÔUVè éUVXSû æy^)])WVRÙX^JíS])UVíGWVìkWY^_RdUVX WYXGZ WÂë5RdX}ûmé@UVX Rdé^0`S\@QG]_RÀæy^)Rdéè¼Uk]JñtWYïdQS\Cæ_\@ïd\é^_RdUVX ÎZSU\iæÎXSUV^ QSXGZ}U0é@UVXGæ_RÀæy^)\@X^WVæ)æqRdìVXGë.\X^)æbSP Utî \@ñV\]ö}ë5Uk]_\ìk\@XS\])WVïÙïdeVö}RdX&òGWV#é ^)])Wk#é Jæq\iWY]'é'`WVXGZ Óç©öWkæ_æ_RÙìkXSë5\@X^)æ´éWYXÔò\5QSXGZSUVXS\5QGæ_RÙXGìòWV#é ^_]'WV#é RdXSìÕWVXGZ ]'WYXGZSUVë.û])\æq^)WV]q^´æy^)])WY^_\@û
ìVRd\æöÓ])\æ_í\ié^)RÙñk\@ïdeVb:ùoX UVQG]-\ í+\@])Rdë.\X^)æöÓò+UY^)` òGWk#é k^)])Wk#é RÙXSì WVXGZÁ]'WYXZ}UVë.ûm]_\iæy^'WY]_^)æ è¦WYRdïÙ\iZ0^)UÎæqUkïÙñk\Í^)RÙìk`^vRdXGæq^)WVXGé\iæ@öSZ}QG\´^_U5^_`G\Eæq`S\\@] æ_R \´UYèÒ^)`S\Eæ_\WY]'é'`Õæ_íGWké\Vb %&I3 CEH%
% 9! !H!&
% I P&CE H %; Ô\JRdZS\@X^_RÙè¼eÔ^yî1U&ëÎWYRdXÁWYíSû íS]_UWVé'`S\iævè¼UV]´æ_\WY]'é'`^_UZ}\WVïÓîRÙ^_`ÔéUVX GRdé^)æ Ó ÿ `S\@QS])RÀæy^)RdéVöòGWkæq\iZUVXCæ_UVë5\EíG]_RdUV])R^ye æqQGé'`ÎWkæW úG]'æq^qûoéUVë5\köúG])æq^qûoæ_\@])ñV\Zö ïd\WVæq^ÌéUkë5ë.RÙ^_ë5\Xk^iöYè¦WVRÙïÙûú])æq^íS]_RdXGé@RÙíSïd\VöVUV]QGæ_RÙXGì QGæq\]qûoZ}\@úGXS\ZíG]_\@è¼\@])\@XGé@\æöWYXZÂÿ XSUVXSûmé@UVë5ë5R^_^)WYïöGî`G\@])\-éUkX GRÀé^Íæ_\^'æÍWV]_\ë.\]_\ïÙe RdZ}\X^_RÙúG\ZÕWYXGZÎ`WYXGZ}\iZÎ\@RÙ^_`G\@]1^_UE^)`S\KQGæq\]ÌUk]Ì^)U-Wé@UVX Rdé^1])\æ_UVïdQ}^_RdUVXÎíS])U}é\iZ}QS])\ô ÷Yøb ¹`S\@XÕR^ Rdæ XSUV^ WYòGïÙ\^_U-æ_UVïdñV\KWEéUkX GRÀé^ÿ¼\kb ìbÙöWE])\æ_UVQS]'é\ÍéUVX^)\@X^_RdUVXÎRdXJ^)`S\KéWVæ_\ UYè W])\æ_UVQG])é@\-WYïdïdUéWt^)RÙUkXêíS])UVòGïÙ\ë ö þ * >,64 $ D* "*E8$' ,E8 *64>4 , *" * & 6*32 $ 12J* *>= & 8* (#tb¥üv`SRÀæ´eRÙ\ïdZSæÍ^)`S\ÎZ}\WkZ}ïÙU}#é íS`S\XSUVë5\XSUVXê\@XGé@UVQSX^_\]_\iZ RdX UtñV\@]_ûoéUVXæy^)])WVRÙXS\iZ éWVæ_\æöRÙX^)]_U}Z}QGé@\Z¹RdXÅç\ié^_RdUVX»÷Sb ÁWYXGZ¹Z}RÀæ)éQGæ)æq\iZ RÙX ç\é^_RdUVX Gb ê\ ò+\@ïdRd\@ñV\ ^_`GWY^J^_`S\CXSUVX}ûoéUkë5ë.RÙ^q^'WYïKæy^)])WY^_\@ìkeÁRÀæJë5UV])\ WYíSíG]_UkíS]_RÀWt^)\RÙX¹íS]'WVé^_RÀé@WVï æq\@^q^_RdXSìæ´ò\ié@WYQæq\5RÙ^ " &* & ( &)0 $ $ ,E8 " , ", " $´WYXGZêëÎW V\iæÍ^_`S\ë,^_`G\ ]_\iæqí+UVXGæ_RdòSRÙïdRÙ^yeUYèGWæ_QSòGæ_\ QS\X^ÓéUkX GRÀé^Ò])\æ_UVïdQ}^_RdUVX´íS])U}é\iæ_æb Ô\1éUkXGæ_RdZ}\]¥^_`GRdæè¼\Wt^)QS])\ UYè þ»^_UÎò+\íGWV]q^)Rdé@QSïdWV]_ïdeJWt^_^_]'WVé^_RdñV\VbùoXGZS\@\ZöGéUkX GRÀé^ RÀZ}\@X^_RÙúéWt^_RdUVXRÀæ W5Z}R Jé@QSï^ ^)WV!æ &ÿ¼í+\@])`GWYíæ Kûm`GWY]'Z WVXGZ þ!ëÎWeÎéUkXGæy^)R^)Q}^_\Í^_`G\ úG]'æy^ \ ©\é^_RdñV\ÍWVXGZÎìk\@XS\])WVï æy^)])WY^_\@ìke5^_UJWVíSíS])UkWVé'`J^_`SRÀæíS])UVòSïd\@ëb [1UV^_`JòWV#é ^_]'WV#é .æq\iWY]'é'`JWYXGZ :ç5UVí+\@]'Wt^)\RdXÕWë.Uk]_\]_\iæqUkïÙQ}^)\îvWe Y^_`G\@e5`S\@QG]_RÀæy^)Rû é@WYïdïde´Wkæ_æ_RdìVX´ñtWYïdQS\æÒ^_UKWVæëÎWVXe´ñtWY])RÀWYòSïd\æÓWVæí+Ukæ)æ_RÙòSïd\Vbþæ:W])\æ_QSïÙ^öiî`S\@X-ë5W }RÙë5R @RdXSì ^_`S\æ_UVïdQ}^)RÙUkXïd\@XGìY^_`ÒöWVæ RÙX^)`S\ý´üÓþ íS])UVòSïd\@ëö}^)`S\@eÕ\@XGZQSíúXGZ}RdXSì0æ_UVïdQ}^_RdUVXGæ^_`GWY^ WY])\Kë5Uk]_\´éUkë.í+\^)R^)RÙñk\Jÿ¼Rb \kbÙöGïÙUkXSìV\#] Ì^_`WYX þb _åqÞ óåy ß ß â ) â yáä ¹`SRÙïd\ þóRdæ , $ÒWKéUkë.íGïÙ\@^_\ íS]_U}é@\Z}QS])\Vöî1\ î \@])\ÌíSQ @ïd\Zòe´RÙ^)æ WYòGRÙïdR^ye´^)U QSRÀ#é ïÙe æqUkïÙñk\E^_RdìV`^KíS]_UkòSïd\@ëÎæ@b©P Utî \@ñk\@]iöGRdXCUtñk\@]_ûmé@UVXGæq^_]'WYRdXS\ZÕíS])UVòSïd\@ëÎæö+æ_UVë5\.WYìV\X^)æÍë5We ò\.WYïdî We}æíS])\@ñk\@X^_\iZòeUY^)`S\@]KWYìk\@X^)æ è¼]_Ukë ])\Wké'`SRÙXGìJW £m¤¥§¦©¨ª¦ª«m¬¬*¦®¥b *ÍXS\é@WVX ^_`SRdX UYè}^)`S\1ZS\WVZSïÙU}#é KíS`S\@XGUVë5\@XSUkXWVæ:Wí+Utî1\]qè¼QGïVè¼\iWt^_QG]_\1UYè þ æ_RÙXGé@\ÌRÙ^:WVïÙïdUtî æQGæ ^_U RÀZ}\@X^)Rè¼eWYXZ´Rdæ_UVïÀWt^)\éUVX GRdé^)æbc1UkXñk\@]'æq\ïÙeköUVXS\1éUVQGïdZK^)`SRÙX KUVè}R^ÓWkæÒWæ_`SUV]_^)é@UVë5RdXSì UYè þ æ_RÙXé\VöÒRÙX UtñV\]qûoéUkXGæq^_]'WYRdXS\Z éWVæ_\æö¥R^EeRd\@ïÀZSæEæ_`SUV]_^_\]æqUkïÙQ}^)RÙUkXGæ´^)`GWYX ÓçêUk] [ ü´b ê\RÀZ}\@X^)Rè¼e0è¼UVQS]í+Ukæ)æqRdòSïd\WñV\@XQS\iæ1è¼UV] Z}\iWYïdRÙXSì5îRÙ^_`&Z}\iWVZ}ïdU}#é æb Vb " $ ",30Q0-8 "* $' , * )(, $' * $' , D0 #" * 0 VùoX þEöYWVìV\Xk^'æÒ\ Sé'`GWYXSìk\ RÙXSè¼UV])ë5WY^_RdUVXRdXGZ}Rd]_\ié^)ïÙekö^_`G]_UkQSìV`K^_`G\Ì\@XñRd]_UkXSë5\@X ^ tÎöWVXGZ´`GWñk\XSU \ }íSïdRÀéRÙ^:éUkë.û ëQSXGRdéWt^_RdUVXÕë5\é'`GWVXSRdæ_ëb}üv`S\´RÙX}è¼Uk]_ëÎWY^_RdUVXÎ^)`GWt^RÀæ1íWVæ)æq\iZJRÀæ W.æ_QSë5ë5WV]_e5UVèÒ^_`G\ æy^'Wt^)\UYèÒ^_`S\Í\@XñRÙ])UVXSë5\Xk^ibþìV\Xk^'æ1WV]_\XSUY^WYòSïd\ ^)U])\é@UVìVXGR \ \Wké'`ÎUY^_`G\@] æ1RÙXZ}Rû ñRdZ}QWYïXS\@\iZSæ WYXGZE^_`QGæ WY])\ÌQGXGWYòSïd\^)UÍ\iæy^'WYòSïdRÀæq`.éUkWVïÙRÙ^_RdUVXÒb *ÍXS\véUkQSïÀZRÙXñk\æq^_RdìkWt^)\ `SUtî!^)UÎ\æq^)WVòSïÙRÀæ_`Õë5UV])\´\ ©\é^_RdñV\VöGRÙX}è¼Uk]_ëÎWY^_RdñV\éUkë.ë-QSXSRÀé@WY^_RdUVXGævWYë5UkXSìÎWYìV\X^)æö 
WVævRdX&W^)]_QGïÙeÕëQSïÙ^_RÙûmWVìV\Xk^WYíGíS]_UWVé'`b }b (>= * $' , * & (+, $ 0 ¹`S\@X¹W Z}\iWVZ}ïdU}#é ÂUééQS]'æ.RÙX þEöî \&éUVQGïdZóQGæ_\ ^_`S\æ_UVïdQ}^_RdUVX0è¼UVQSXZÕWkæ1W.æq\\Z0è¼UV]vWYXSUV^_`S\]æq\iWY]'é'`Î^_\ié'`SXSR kQG\Kæ_QGé'`ÕWkæ Óç0UV]v[ ü´b *ÍXS\ éUkQSïÀZ5\@ñV\X5RÙëÎWYìkRÙXG\ W´íUk]q^_è¼UVïdRÙUUVè¥WVïÙìkUV])R^)`SëÎæ î`S\@])\ñtWY])RÙUkQGææ_UVïdñV\])æöîR^)` ñtWY])RÙUkQGæÓè¼\Wt^)QS])\æ WYXGZ-î \W XS\iæ_æ_\æöéUUVí+\@]'Wt^)\^_Uæ_UVïdñV\ WÍìkRÙñk\@X.Z}R Jé@QSï^íS])UVòGïÙ\ëb ê\EWV]_\´î UV] RÙXSì.RdX0^)`SRdæZ}RÙ])\é^_RdUVXb
)
1
"
"
1
:
7
&
3*
&
)(
):
)
÷Sb &@,>=* &", $',& üv`S\JZS\é\X^_]'WYïdR \ZCé@UVX^_])UVï UVè þN\XGWYòGïÙ\iæWYX WYìV\X^Í^)UíGQS]qû æqQG\Î^_`S\Õæ_WY^_RÀæyè¦Wké^_RdUVX UVèvRÙ^)æUtîXöïdU}é@WYïÌìVUkWVïbPUtî1\ñV\]R^-WVïdæ_U&QSXGZS\@])ë.RdXS\iæ^_`G\ WYòSRdïdR^yeUYè^)`S\5æqe}æq^_\@ë ^_UWVé'`SRd\@ñk\éUUkí\])WY^_RdñV\@ïdeWJéUkë.ë5UkX ìVïdUVòGWVïÒìkUkWYï ÿ¼î`S\X æqQé'`CWJìkUkWYïÓ\ }RÀæy^'æòSQS^ÍRÀæÍXGUY^ WV]_\@^_UÎUkí}^_RdëÎWYï b+ùoXÔç\ié^_RdUVX Gb Eî1\-RÙXñk\æq^_RdìkWt^)\ `SUtî ^)UÕ\XS`GWYXé\ þNîR^)`CìkïÙUkòGWYïéUkX^_])UVïÓWVXGZ \ SWYë5RÙXG\^)`S\ÎWVZ}ñtWYX^'WYìV\iæWVXGZ æq`GUV]_^)éUkë5RÙXSìæ UYè UVQS]víS])UVí+Ukæ_\Zæy^)])WY^_\ìVeVb Gb , " $ , & 8$ , SþXUtñV\@]_ûoéUVXæy^)])WVRÙXS\iZ.íS]_UkòSïd\@ëòeJZ}\@úGXSRÙ^_RdUVXöS`WVæ XSUÎæqUkïÙQSû ^_RdUVXbSc1UVX GRdé^Ì])\æ_UVïdQ}^_RdUVX5RÀæ^)`QæXS\ié\æ)æ)WY])RÙïde`S\QS])Rdæq^_RÀé WYXZ5íS]_UkòSïd\@ë.ûmZS\@í+\@XGZ}\X^ ô i÷tømbSüv`S\@])\WY])\K^yî1U5ëÎWYRdXWVíSíS])UkWké'`S\æÌ^_UJé@UVX GRÀé^])\æ_UVïdQ}^_RdUV+X $ * "%$'2 ùoX.WYXERdX^_\@]'WVé^_RdñV\Ìæ_\^_^_RdXSìGö^_`G\ÌRÀZ}\@X^_RÙúG\iZéUkX GRÀé^'æÓWV]_S \ ®ª¦§¦¦ æ ^_`Wt^0é@WVXóò+\ íS])\æ_\@X^)\Z ^_UÔ^_`S\&Qæq\])æÎWVXGZóWVïÙïdUtî^)`S\@ë>^_UÂRdX^_\@ìk])WY^_\Õ^_`S\RÙ] UtîX yQGZSìVë5\@X^Ìè¼Uk] é@UVX Rdé^1])\æ_UVïdQ}^_RdUVXbüv`SRÀæÌRÀæ^)`S\´é@WVæ_\ RdXJUVQG] WVíSíSïdRdéWt^_RdUV+X ë5Ukæq^ éUVX GRdé^]_\iæqUkïÙQ}^)RÙUkX0RÀævéQS])]_\X^_ïdeJZ}UVXS\´RdX^_\@]'WVé^_RdñV\ïÙekökî`SRÀé'`WYïdïÙUtî æ1^_`G\ RdXk^)\@ìk])WY^_RdUVXUYè QSX QGWVX^_RÙúWYòSïd\ éUVXæy^)])WVRÙX^)æ RdX^_U.^_`S\Eæ_UVïdQ}^)RÙUkXGæ@b
8$ ,E0 * $' " :çUYèÃ^é@UVXGæq^_]'WYRdX^)æö©íS]_\@è¼\@])\@XGé@\æö¥WYXGZê]_QGïÙ\iæK^_U]_\ïdW CéUkXGæy^)])WVRÙX^'æ éUkQSïÀZ0ò+\´RÙXGé@ïÙQZ}\ZRdX0^)`S\´ë.U}Z}\ï¥RÙXUV]'Z}\]1^)UÎæqUkïÙñk\ÍéUkX GRÀé^'æWYQ}^)UVëÎWt^)RdéWYïdïÙekb *ÍXGé@\vWKìkRÙñk\@X-é@UVX Rdé^ RÀæ RÀZ}\@X^_RÙúG\iZ.WVXGZ.æqUkïÙñk\Z¥ötWÍXS\@î íS])UVòSïd\@ë RÀæ ìV\XS\@]'Wt^)\Z òGWkæq\iZUVX^_`S\ ë5UZSRúéWt^)RÙUkXEUYè}^)`S\1RdXSRÙ^_RÀWYïUVXG\VöWYXGZE^_`S\ íS])UVòSïd\@ë æ_UVïdñV\@]ÓRdæÓ])QSX UVX^)`SRdæ XS\@î»íS]_UkòSïd\@ëbSüv`SRÀæíS])Ué@\æ)æ ]_\í\iWt^)ævQGXk^)RÙïÓWYïdïÒéUkX GRÀé^)æWV]_\æ_UVïdñV\iZ¥b [1\ïÙUtî î1\RÙXñk\æq^_RdìkWt^)\0^yî U Z}RÙ])\é^_RdUVXGæ.^)UÂWVZSZ}])\æ)æ5Z}\WkZ}ïdU#é Âòe WkZSZ}RdXSìÔìVïdUVòGWVï éUVX^)]_Ukï^)U þÿç\é^_RdUVX Gb WYXZÕòeJ`G\@QS])Rdæq^_RÀé´éUkX GRÀé^ ])\æ_UVïdQ}^_RdUVXÔÿç\ié^_RdUVX b b
"
"
)
)
"
"
r $&&I> rs $
% &9 ;I+% CJ%
ùoXGæqíGRÙ])\ZòeïdU}é@WVïæ_\WV])é'`Òöî \ÌíS])UVí+Ukæ_\1^_UK\@XS`GWVXGé\ þ îRÙ^_`.ìVïdUVòGWVïé@UVX^_])UVïRdXUV]'Z}\] ^_U0WñkUVRÀZZS\WVZSïÙU}é# }æ@bSüÓUÎ^)`SRdæÍ\@XZ¥öGî \WVZGZ^_`S\è¼UkïÙïdUtîRdXSìÎ]_\iWVé^_RdñV\])QSïÙ\E^_UÎ^)`S\ þ æqe}æq^_\@ëb@þ èÃ^_\]Óæ_\@ïd\é^)RÙXGì WvíUæqRÙ^_RdUVXK^_Uë5Utñk\ ^_Uv^_`Wt^ÒRdë5íS]_Utñk\æ+RÙ^)æUtîX´ïdU}é@WYïtìkUkWVïSÿÃ^_`GRdæ RdæZ}UkXS\WVééUV]'Z}RdXSìE^)U5^_`S\´])\Wké^_RdñV\K])QSïÙ\iævUYèç\é^_RdUVX }b ÷ ö^)`S\WYìk\@X^vWVïdæ_UÎé'`S\#é }æ1^_`G\ \ ©\é^UVè^)`SRÀæë5UtñV\vUVX-^_`S\ìkïÙUkòGWYï}ìkUkWVïöë5\WkæqQS])\Z.WVæ:^)`S\^)UY^)WVï}XQSëò+\@]UVèñRdUVïÀWt^_RdUVXæ UYè^)`S\Í\X^_Rd]_\Kæy^'Wt^_\kbùoXGZ}\@\iZ¥ö^_`S\´WYìk\@X^ æ1XS\@î¹í+Ukæ_R^)RÙUkX0ëÎWe5RÙXé])\WVæ_\^)`S\´XQGëò+\@] UYè ñRdUVïÀWt^)RÙUkXGæè¼UV]KUVXS\.UV]Kë5UV])\UY^)`S\@]WVìV\@X^'æ@b *ÍXSïde&î`S\@XêR^Z}U\æKXSUV^´Z}\@^_\]_RdUV]'Wt^)\E^_`G\ ìVïdUVòGWVï+ìkUkWVïZ}U\æ ^_`G\EWYìV\X^v\ +\ié^)RÙñk\@ïdeJ\ }\é@Q}^_\´^)`S\EéUkXGæ_RdZ}\]_\iZJë5UtñV\kb rP CE çUVïdñV\K^_`G\ý´üÓþ íS])UVòGïÙ\ëè¼UV]v^)`S\EZSWY^)W5æq\@^ UYèçíS])RÙXSì 3S+S -ÿ * öGWVX UtñV\@]_ûoéUVXæy^)])WVRÙXS\iZ&RdXGæy^'WYXGé@\VöWVXGZê^_`GWY^EUYè GWYïdï 3S>Sÿ * öÒW&æ_UVïdñtWYòSïd\5RÙXæy^'WYXGé@\Vö¥QGæ_RÙXGì ^_`S\ Uk]_RdìVRdXGWVïvWVXGZ ^_`G\@X ^_`S\Cë.U}Z}RÙúG\iZ þDWVïÙìkUV])R^)`Sëb ê\Cé'`GUUæq\ ¥ Wkæ5^_`G\ Z}\è¦WVQSï^ ò+\@`WñRdUV]WVXGZUVòGæ_\@])ñV\Í^_`S\XQSë-ò\]UYèWYìk\@X^)æ Rd[X £m¤#¥§¦©¨ª¦ª«¬W¬*¦®©b *ÍX ^_`G\-Utñk\@]_ûmé@UVXGæq^_]'WYRdXS\ZZSWY^)Wæq\@^ö+^_`G\.XG\@î ])QSïd\-î \.WkZSZ}\iZ&è¼UV]KìVïdUVòWYï éUVX^)]_Ukï î WkæJWYòSïd\^_UÂ])\ZSQGé\&^)`S\CZ}\iWVZ}ïdU}#é ÁòSQ}^ÕXSUY^0\ïÙRdë5RÙXWt^_\CR^Õé@UVë5íSïd\^)\@ïdeVbùoXGZS\@\ZöÌî \ UVòGæ_\@])ñV\iZ ^_`GWY^Î^_`S\ë5U}Z}RÙúG\Z þS Jÿ * ÿ¦æ_\@\ Utî æÿ¦ò WVXGZÁÿ¼è RdXCü:WYòb öeRd\@ïÀZ}RdXSìJ^)`S\-ZSWY^)WÕæq\@^)ææ_`SUtîX RdX Utî æ-ÿ¦W WYXGZCÿ¼\ RÙXJü:WYòÒb Sb Ô\XSUV^_RÀé\Z.^)`GWt^ þÅî WkæWYòGïÙ\^)U-æ_UVïdñV\v^)`S\íS])UVòGïÙ\ëNî`SRdïd\ ^_`S\J^yî1U&UV^_`S\]æq\iWY]'é'`êæq^_]'Wt^)\@ìVRd\æKè¦WVRÙïd\ZbÒùoXGZ}\\Z¥öÒ^_`S\]_\JWV]_\5XSUV^\@XGUVQSìk`Âý´ü:þæ ^_UÎæ_UVïdñV\Í^_`S\iæq\´ZSWt^'W5æq\@^)æb[ UUkæq^_RdXSì-^_`S\ë òeJWVZGZ}RÙXGì Z}QSë5ë-e }ñtWVïÙQS\iæ@öUkXS\Wt^W ^_Rdë5\Vö¥WYïdïÙUtî æ þ ^_UÕæ_UVïdñV\E^_`S\iæq\-íS]_UkòSïd\@ëÎæ@b+þèÃ^)\@]´W0éUkë5íSïÙ\@^_\æ_UVïdQ}^)RÙUkX RdæÍUVòSû ^)WVRÙXS\iZ¥öî1\Í]_\ë5UtñV\^)`S\ÍZ}QGë.ë-eÎñtWYïdQS\æb Ô\KXSUY^)Rdé@\Z5^)`GWt^ ^_`SRÀæ1^_\ié'`SXSR kQG\ÍWYïdïdUtî æ QGæÍ^_UìV\XS\@]'Wt^)\íGWV]q^)RdWVï:æ_UVïdQ}^)RÙUkXGæ´æqRdìVXSRÙúéWYX^_ïdeò\@^q^_\]´^_`GWVX ^)`SUkæ_\-Ukò}^)WVRÙXG\Z òe ÓçWYXZÕ[vü´öSî`SRÀé'`Õè¦WYRdïÙ\iZ0^)UJæqUkïÙñk\Í\ñV\X0^)`S\òUUæy^)\ZÕRÙXGæq^)WVXGé\iæ@b }b ýKRÙñk\@X.WZ}\iWVZ}ïdU}#é .ÿZ}\æ)é])RÙò+\Z`S\@])\1WkæÓWÍæq\@^ÓUVèéUkX GRÀé^_RdXSìKWYìV\X^)æÒWVXGZ^)`S\ QSXSR kQG\ RÙXZ}\ UVè^_`S\RÙ]Eé@UV])]_\iæqí+UVXZ}RÙXGìJíUæqRÙ^_RdUVXæ ö^yî UÕé@UVX}úìVQS]'Wt^)RÙUkXGæWV]_\-íUæ_æ_RÙòGïÙ\ +\Rû ^_`S\]WYïdï WVìV\Xk^'æ`WñV\E^)`S\Îæ_WVë5\-ñtWYïdQS\-è¼UV]Í^_`S\5í+Ukæ_R^)RÙUkXCUk]^)`S\@e `WñV\-ZSR ©\@])\@X^ö XSUVXSû \@])UêñtWYïdQS\iæ@büv`S\&æ_R^)QGWt^)RÙUkXóRÀæ5RdïÙïdQGæq^_]'Wt^_\iZ RdX :RÙìb SbþDéRd]'éïd\Õ])\@íS])\æ_\@X^'æ ^_`S\RdXGZ}\ ÕUVèÓ^)`S\EéUkX GRÀé^_RdXSìÎí+Ukæ_R^)RÙUkXGæöSWÎæ QGWV]_\K]_\íS])\æ_\@X^)ævWYX&WVìV\@X^iöSWYXGZÕ^_`G\ ñtWYïdQS\JîRÙ^_`SRdX ^_`S\Jæ QGWY])\ÎRÀæ´^_`S\0ñRdUVïÀWt^)RÙUkXêñtWVïÙQS\Jè¼UV]´^_`G\JíUæqRÙ^_RdUVXÒbÓüv`S\JWVìV\@X^'æ ïÙU}éWt^_\iZ.UVXÎWEéRd]'éïd\véWYQGæ_\ WZ}\iWVZ}ïdU}#é +bVùoX-^)`S\úG]'æq^Ìé@WVæ_\VöVî1\ WV]_\RdXÎWæqRÙ^_QGWY^_RdUVXÎRdX î`SRÀé'`Õ^)`S\EéUkXGæq^_]'WYRdXk^'ævWYXGZÕíS])\è¼\@])\@Xé\ævRdX^_`S\íG]_UkòSïÙ\ëeRÙ\ïdZS\ZíUæqRÙ^_RdUVXæ ^_`GWY^ `GWñV\ H*" $'&)(^)`S\´æ_WVë.\ñWVïÙQG\æÌè¼UV]1^_`S\ÍñWV]_RÀWYòGïÙ\iæRdX0WZ}\iWVZ}ïdU}#é ©bVùoX0UV^_`S\]Ìî UV]'ZSæö 
þ»Z}U\æÌXSUY^ `GWñV\ \XSUVQGìV`ÎRdX}è¼UV])ëÎWt^_RdUVX5^)UZ}RÀæ_é@]_Rdë5RÙXWt^_\KWYë5UVXGì´^_`S\ÍñWV]_RÀWYòGïÙ\iæ RÙXñkUVïdñV\ZJRÙXÕ^_`S\Z}\iWVZ}ïdU}#é ©büv`QGæö}WYXÕWV]_òGR^)])WV]_ekökìk]_\\Z}eÎWVæ)æ_RÙìkXSë5\@X^1UVèÒ^_`G\Kí+Ukæ_RÙû ^_RdUVXÎ^)UWYXe.UYè©^_`G\ÍWYìk\@X^)æÌRÙX5^)`S\ÍZS\WVZSïÙU}#é .Rdæ^_`S\ÍUVXSïde-ë5\ié'`GWYXSRÀé@WVïGîvWeE^)UæqUkïÙñk\ ^_`S\Z}\iWVZ}ïdU}#é ©b Ô\´íS]_UkíUæq\^_UÎæ_UVïdñV\´c Wkæq\ Wkæ1è¼UVïdïÙUtî æb Ô\æqUk]q^ ^_`G\Ké@UVX GRÀé^)RÙXGì
#
#
3 case1 3 3
2 case2 5 3
² G²^Q_l kiHb4i_Dko b?koa_(lvup WYìk\@X^)æÌRÙX0WVXJRdXGé])\WkæqRdXSìUV]'Z}\]1WVééUk])Z}RdXSì´^)UE^_`G\@Rd]1ñRdUVïÀWt^_RdUVXÎñtWVïÙQS\kb Ô\\ SWYë5RÙXG\ ^_`S\´WñtWYRdïdWVòSïd\ íUæqRÙ^_RdUVXJè¼UV]vWVæ)æqRdìVXSë5\Xk^Ì^_UE^)`S\ÍúG])æq^ WVìV\Xk^ RÙXJ^_`S\KíS])RÙUk]_RÙ^ye QS\QS\Vö Y
òS])\W RdXSì-^_Rd\æv]'WYXGZ}Ukë5ïÙeköWVXGZé'`S\#é Îî`S\@^_`S\]v^_`SRÀæZ}U\ævXGUY^eRÙ\ïdZWYXeÎRdXGéUkXGæqRÀæqû ^_\XGéeÕRÙX^_`G\E\@X^_Rd])\KíG]_UkòSïÙ\ëbSùmè WVXeJRÙXéUVXæqRÀæy^)\@XGé@eJRÀæ\@XGé@UVQSX^)\@])\Z¥ö}î \]_\ë5UtñV\ ^_`S\5WYìk\@X^è¼])UVëhkib4x)^n¶ Ç3 kÊ ~ Ç p^ jYe !rÊEl eBk'4Êa i ggjojMsb rahh.a e pb4hg l b ~ jpYb4riHEb?kÄ¥^l Ê f:¶©sÇ x'"z x'Èhtlyj'Ê©+ukY×t¶ÒÇ x' ptVw´Ê ˵kx'x' l klyrszrsÚplyrghÊ *_*^4gk'a]_ ÇÈY5Ê r^zje rs! hkl veBk'{a rsG g¶iØ jMb paha h e b4rghkl kb ¶¶+pÇ h 5 sKÇ YÊ KÇ ÊÇ Ð© + Vp@h¶ VÊ .Ê lyrpx'htlÓg@nyrsx'htlyxÍ
'ghjmlonqp@rhtlÒjyp@lyrsjmÀp
)lyrsgihGÊ Ç Ê¾óu}Ê Ø .x'rsVhtnylyrgijmhlyrs¶
Õ½Ìx'Ê ~ p@Ê6rsn i.giµkx)hlyjmµklygYg.h¶ÒÀg¾nÊ¥f:Ï1gihkÊÒjmÄÒlonqµkprsrsrshtkl j'¶ÒukpplyhrsjmÃpÄ©
_ÊÒlyr{+ghp@rnqp@GhÊ Î.uY
_rsµkhkxrzkkrsÚ'srrshkhkÄ¥f:nyggih4|7sx'rsz
)lyjj'Ê ÇYÊ Ä¥ p^ j/lojkeny! jrs
_eBl _eBgmk'Ä¥ka nya gig jogjMjob jMx)ahb na aYeÊa e Ø1 b4gb4wVgml |kb l ny¶Sb r¶ È 8-¾1sÇ º si )¿g@Ç nyrlyµÈYz¶¥j© Ç ÀY g@¶¥nÇ lyÊ µxVf:Ê ghjmlonqprshtl:ukplyrjmÀp
)lyrsghEÄnyg|sx'z5Ê \ _d Ç VÊ nyÐrp@gizE5p@¾ jVuV¶}p¾1h¾ Vµ¾ g³ z5 ÊG¶V¾Ì¾1sis|}gx)nyrnolylqµkpVz¶(ÓjÓVÀg@zn1gif:htgilygizÍhG|¶krsf hpplyhgpinyrppYS¶9¾Ìik
)w lyrsgi hkj p@Ê hE¸V
qµp@hix)j'Ê©Ð+Vlyg Ç VÊØ1sx'z:r j'}Ê gi5Gp@Ê jmly³ x)lyx_n nq; jÌp@lylyrsµkx'x jors³ j'z¶ ~knyx)gp@x'nozlyzx'htx)l hYl Ð+gx'Ò
qµf:hkgrÙ×tzkx'kj lyÀx_g@n1nKuVuV
'girsx'sYhkrs
'hx pl h.x)nohf:igirshkhkx'jmx)lonynqrsphrshVx¶ Õ hkÄrsnygx)|kno jorlw-g@ x'|Vnqpjop@{rshk
'gish¶k{Grhk
'gishG¶v¶ VÊ:Ûgnolyµk
'gizrshkkÊ
Distributed generative CSP approach towards multi-site product configuration A. Felfernig1 , G. Friedrich1 , D. Jannach1 , M.C. Silaghi2 , and M. Zanker1 1
Universitaet Klagenfurt, A-9020 Klagenfurt, Austria {felfernig,friedrich,jannach,zanker}@ifit.uni-klu.ac.at 2 Florida Institute of Technology (FIT), Melbourne, FL 32901, USA
[email protected]
Abstract. Today’s configurators are centralized systems and do not allow manufacturers to cooperate on-line for offer-generation or sales-configuration. However, supply chain integration of configurable products requires the cooperation of the configuration systems from the different manufacturers that jointly offer solutions to customers. As a consequence, there is a business requirement for methods that enable the computation of such configurations by independent specialized agents. Some of the approaches to centralized configuration tasks are based on constraint satisfaction problem (CSP) solving. Most of them extend the traditional CSP approach in order to answer the specific expressivity and dynamism requirements for configuration and similar synthesis tasks. The distributed generative CSP (DisGCSP) framework proposed here builds on a CSP formalism that encompasses the generative aspect of variable creation and of extensible domains for problem variables. It also builds on the distributed CSP (DisCSP) framework, allowing for approaches to configuration tasks where the knowledge is distributed over a set of agents. Notably, the notions of a constraint and a nogood are generalized to an additional level of abstraction, extending inferences to types of variables. The usage of the new framework is exemplified by describing modifications to asynchronous algorithms. Our experimental evaluation gives evidence that for typical configuration tasks, encodings within a DisGCSP framework can be solved more efficiently than encodings with DisCSP.
1 Introduction/Background
The paradigm of mass-customization allows customers to tailor (configure) a product or service according to their specific needs, i.e., the customer can select between several features and options that should be included in the configured product and can determine the physical component structure of the personalized product variant. Typically, there are several technical and marketing restrictions on the legal parameter constellations and on the physical layout. This led manufacturers to develop support for checking the feasibility of user requirements and for computing a consistent solution. Such functionality is provided by product configuration systems (configurators), which constitute a successful application area for different Artificial Intelligence techniques [27], e.g. description logics [17] or rule-based [2] and constraint-based solving algorithms. [8]
describes the industrial use of constraint techniques for the configuration of large and complex systems such as telecommunication switches and [16] details an example of a powerful tool based on Constraint Satisfaction available on the market. However, companies find themselves in dynamically determined coalitions with other highly specialized solution providers that jointly offer customized solutions. This high integration aspect of today’s digital markets requests that software products supporting the selling and configuration tasks are no longer conceived as standalone systems. A product configurator can be therefore seen as an agent with private knowledge that acts on behalf of its company and cooperates with other agents to solve configuration tasks. This paper abstracts the centralized definition of configuration tasks in [28] to a more general definition of a generative CSP that is also applicable to the wider range of synthesis problems. Furthermore, we propose a framework that allows us to address distributed configuration tasks by extending DisCSPs with the innovative aspects of local generative CSPs: 1. The constraints (and nogoods) are generalized to a form where they can depend on types rather than on identities of variables. This also enables an elegant treatment of the following aspects. 2. The number of variables of certain types that are active in the local Generative CSP (GCSP) of an agent, may vary depending on the state of the search process. In the DisCSP framework, the external variables existing in the system are predetermined, but in a DisGCSP the set of variables defining the problem is determined dynamically. 3. The domain of the variables may vary dynamically. Some variables model possible connections and they depend on the existence of components that could become connected. Namely, these domains extend when the possibility of connection to new components is created. We describe the interesting impact of the previously mentioned changes on asynchronous algorithms. After giving a motivating example in Section 2, Section 3 defines a generative CSP. Section 4 formalizes a distributed generative CSP and in Section 5 extensions to current DisCSP frameworks are presented. Finally, Section 6 evaluates DisGCSP encoding vs. classic DisCSP problem representation for typical configuration problems.
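To make the first two extensions concrete — constraints stated over variable types rather than over fixed variable identities, and a variable set that grows during search — the following minimal Python sketch is our own illustration (not code from the paper); the function names and data layout are assumptions made only for this example.

from itertools import product

def expand(generic_constraint, types, variables):
    # variables: dict mapping a type name to the dict {index: value} of the
    # concrete variables of that type that exist at this point of the search.
    pools = [variables.get(t, {}).items() for t in types]
    for combo in product(*pools):
        values = [value for (_index, value) in combo]
        yield generic_constraint(*values)

# A type-level constraint: a variable of the first type and one of the second
# type never take the same value.
gamma = lambda va, vb: va != vb

variables = {'ta': {1: 0, 2: 0}, 'tb': {1: 1}}
print(all(expand(gamma, ['ta', 'tb'], variables)))   # True
variables['tb'][2] = 0   # a new variable of type 'tb' is generated during search
print(all(expand(gamma, ['ta', 'tb'], variables)))   # False: the same constraint now also covers the new variable

Because the constraint is re-expanded over whatever concrete variables currently exist, newly generated variables are covered without rewriting the constraint, which is the behaviour the generative extension relies on.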
2 Example of configuration problem
In the following we exemplify a problem from the domain of product configuration ([8]). A telecommunication switch consists of a set of frames, where each frame has optional connection points (denoted as ports) for modules to be plugged in. These modules can be configured to be either analog or digital (Figure 1). In addition, problemspecific constraints describe legal combinations of the number and configuration of the modules on the different frames. As different companies have to cooperate to provide a switching solution, the distribution aspect is inherent in this scenario. In order to keep the example simple, we assume a two-agent setting. The agents A1 and A2 are capable of configuring different functionalities of the telecommunication switch and therefore each of them owns a different set of constraints and they have a limited view on the
overall system, e.g., agent A1 requires only a view on the configuration of the upper two frames.
Fig. 1. Telecommunication switch
Short introduction on encoding technique When configuring large technical systems, the exact number of problem variables, i.e. employed modules and their connections, is not known from the beginning, it is therefore not efficient to formulate constraints directly on concrete variables. Instead, comparable to programming languages, variable types exist that allow us to associate a newly created variable with a domain and we can specify relationships in terms of generic constraints. [28] defines a generic constraint γ as a constraint schema, where meta-variables Mit act as placeholders for concrete variables of a specific type t, denoted by the superscript t. The subscript i allows us to distinguish between different meta-variables in one constraint3 . We therefore formalize this configuration task as a CSP, where for each frame two different types of variables exist. The first one encompasses the variables that represent the ports of a frame and the second one the modules to which the ports can be connected, i.e. tpa , tpb , tpc denote the types of port variables of frame A, B and C and ta , tb , tc represent the types of module variables. Every single port and module is represented by a variable4 . Variables representing modules take either the value 0 (analog) or 1 (digital), while port variables are assigned the index value of the module they are connected to. In our example for denoting a single variable the naming convention typei is chosen, where the subscript i gives the index value of variables of the same type, e.g., a1 , a2 , . . . represent the module instances on frame A and 3 4
3 The exact semantics of generic constraints is given in Definition 3 in Section 3.1.
4 Note that the frame components themselves are not explicitly modeled, but only via their characterizing port variables.
pa1 , pa2 , . . . are the names of its port variables. In order to be able to formulate restrictions on the number of variables of each type, a counter variable (xtype ) exists that holds the number of instantiated variables of type type. The configuration constraints are distributed between the two agents, i.e., each agent Ai possesses a set of local constraints5 Γ Ai , i.e., Γ A1 = {γ1 , γ2 , γ4 , γ6 } and Γ A2 = {γ3 , γ5 , γ7 , γ8 }, which are defined as follows:
Variables representing modules can take either the value 0 ('analog') or 1 ('digital'): γ1 : M ta ≤ 1. γ2 : M tb ≤ 1. γ3 : M tc ≤ 1.
For agent A1 the modules on frame A and those on frame B must be configured differently, i.e. all modules on frame A are set as 'analog' and all modules on frame B as 'digital' or vice versa: γ4 : M1ta ≠ M2tb .
For agent A2 the modules on frame A and on frame C must have the same configuration, i.e. they must all be set as either 'analog' or 'digital': γ5 : M1ta = M2tc .
Agent A1 ensures that the number of modules on frame A equals the number of modules on frame B: γ6 : xta = xtb , where xta resp. xtb is the counter variable that holds the number of instantiated module variables on frame A resp. frame B.
Similarly, for agent A2 the number of modules on frame C must be at least the number of modules on frame A minus 1: γ7 : xta − 1 ≤ xtc .
Agent A2 ensures that all modules on frame C are configured as 'digital': γ8 : M tc = 1.
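To show how these generic constraints read operationally, the following sketch (added here for illustration; it is not the paper's implementation) checks each agent's local constraints against the concrete typed variables existing at some point of the search. The variable layout and helper names are assumptions of this sketch, and the connection constraints omitted in footnote 5 are not modeled.

def vars_of(assignment, vtype):
    # Values of all concrete variables of a given type, e.g. 'ta' for modules on frame A.
    return [val for (t, val) in assignment.values() if t == vtype]

def check_agent_A1(assignment, counters):
    values_a, values_b = vars_of(assignment, 'ta'), vars_of(assignment, 'tb')
    g1 = all(v <= 1 for v in values_a)                        # gamma_1
    g2 = all(v <= 1 for v in values_b)                        # gamma_2
    g4 = all(va != vb for va in values_a for vb in values_b)  # gamma_4: A and B differ
    g6 = counters['ta'] == counters['tb']                     # gamma_6: equal counts
    return g1 and g2 and g4 and g6

def check_agent_A2(assignment, counters):
    values_a, values_c = vars_of(assignment, 'ta'), vars_of(assignment, 'tc')
    g3 = all(v <= 1 for v in values_c)                        # gamma_3
    g5 = all(va == vc for va in values_a for vc in values_c)  # gamma_5: A and C equal
    g7 = counters['ta'] - 1 <= counters['tc']                 # gamma_7
    g8 = all(v == 1 for v in values_c)                        # gamma_8: C all digital
    return g3 and g5 and g7 and g8

# State reached in Figure 2.b/2.c: modules on frame A set to 0, on frame B to 1,
# one module on frame C set to 1; agent A2 detects the violation of gamma_5.
assignment = {'a1': ('ta', 0), 'a2': ('ta', 0), 'b1': ('tb', 1), 'b2': ('tb', 1), 'c1': ('tc', 1)}
counters = {'ta': 2, 'tb': 2, 'tc': 1}
print(check_agent_A1(assignment, counters))   # True
print(check_agent_A2(assignment, counters))   # False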
Example of the usage of the framework Now we exemplify the solving process in our framework for distributed generative constraint satisfaction, which is explained later in this paper. Solving is performed by our asynchronous forward checking algorithm with local constraints (compare the trace in Figure 3), which is detailed in Section 4. The subscripted index value of the agents Ai also denotes their priority, where 1 is highest. The example in Figure 2.a depicts an initial situation where a customer-specific requirement imposes a restriction on the configuration result, e.g. the telecommunication switch must contain at least two modules that must be connected via ports to frame A. Agent A1 fulfills this initial customer requirement by generating problem variables and communicates them via an announce message to the other agent A2 . The parameter of an announce is a list of new variables, each denoted by a pair (type, index). Agent A2 determines its interest in the newly announced variables and communicates an addlink message back. As can be seen from Figure 2.b, agent A1 creates two modules and their ports on frame B in order to fulfill constraint γ6 and sets the modules on frame A to 0 and those on frame B to 1 (constraint γ4 ). Agent A2 creates a module instance on frame C and configures it 5
For reasons of presentation we omit those constraints that ensure that all modules are connected and that once a port variable is assigned a value, the corresponding connected component variable must exist.
[Fig. 2 panels: a) Initial situation, b) first round, c) conflict occurrence, d) solution — module variables (a1, b1, c1, c2) and their port variables (pa1, pb1, pc1, pc2) on racks A, B, and C, with assignments such as a1 = {?}, a1 = {0}, a1 = {1}, b1 = {1}, c1 = {1}, c2 = {1} across the panels]
Fig. 2. Example problem
1: A1 announce(ta, 1, ta, 2, tpa, 1, tpa, 2) → A2
2: A2 addlink{(ta, 1), (ta, 2), (tpa, 1), (tpa, 2)} → A1
3: A1 announce(tb, 1, tb, 2, tpb, 1, tpb, 2) → A2
4: A2 announce(tc, 1, tpc, 1) → A1
5: A1 ok? M a, {1, 2}, 0 → A2
6: A1 ok? M pa, {1}, 1 → A2
7: A1 ok? M pa, {2}, 2 → A2
8: A2 nogood ¬(M a, {1, 2}, 0) → A1
9: A1 ok? M a, {1, 2}, 0 → A3
Fig. 3. Trace of the solving process with AFCc -GCSP
following constraint γ8 as 'digital' (c1 = 1). In order to reflect the interchangeability of value assignments, generic assignments consisting of a meta-variable, a set of index values, and their assigned value are exchanged in an ok? message. When agent A1 communicates the assignments of module variables a1 and a2 to agent A2, an inconsistency is detected by agent A2 (violation of constraint γ5). As agent A2 is not capable of locally resolving this conflict by changing its own assignments, it calculates a nogood and communicates it back to agent A1 (Figure 2.c). Note that we use generic nogoods here that exploit the interchangeability of the conflicting variables, e.g., no variable of type ta may take the value 0. Agent A1 resolves the conflict by changing its variable assignments and finally a solution is found (Figure 2.d). In this simple example the generic nogood involved only one type, but in general it can involve several variable types. Consequently, a solution to a generative constraint satisfaction problem requires not only finding valid assignments to the variables, but also determining the exact size of the problem itself. In the remainder of the paper we define a model for the generative constraint satisfaction problem of a local configurator and then detail the extensions that make asynchronous algorithms applicable to DisGCSPs.
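As a small, hypothetical sketch (not taken from the paper) of the generic nogood exchanged in message 8 of the trace, a nogood recorded at the level of a variable type rules out a whole family of concrete assignments at once; the class and function names below are our own.

from dataclasses import dataclass

@dataclass(frozen=True)
class GenericAssignment:
    vtype: str          # type the meta-variable ranges over, e.g. 'ta'
    indices: frozenset  # indices of the concrete variables covered, e.g. {1, 2}
    value: int          # common value proposed for all of them (an ok? message)

@dataclass(frozen=True)
class GenericNogood:
    vtype: str
    value: int          # value forbidden for every variable of this type

def violates(assignment: GenericAssignment, nogood: GenericNogood) -> bool:
    # The nogood applies to every index of the type, so only type and value matter.
    return assignment.vtype == nogood.vtype and assignment.value == nogood.value

ok_msg = GenericAssignment('ta', frozenset({1, 2}), 0)   # message 5 of the trace
ng = GenericNogood('ta', 0)                              # message 8 of the trace
print(violates(ok_msg, ng))                                          # True: A1 must revise
print(violates(GenericAssignment('ta', frozenset({1, 2}), 1), ng))   # False

Exchanging nogoods at this level is what lets an agent prune all interchangeable concrete assignments of a type with a single message instead of one nogood per variable.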
3 Generative Constraint Satisfaction
In many applications, solving is a generative process, where the number of involved components (i.e., variables) is not known from the beginning. To represent these problems we employ an extended formalism that complies to the specifics of configuration and other synthesis tasks. For efficiency reasons, problem variables representing components of the final system are generated dynamically as part of the solution process because their exact number cannot be determined beforehand. The framework is called generative CSP (GCSP) [10, 28]. This kind of dynamicity extends the approach of dynamic CSP (DCSP) formalized by Mittal and Falkenhainer [19], where all possibly involved variables have to be known from the beginning, since the activation constraints reason on the variable’s activity state. [20] propose a conditional CSP to model a configuration task, where structural dependencies in the configuration model are exploited to trigger the activation of subproblems. Another class of DCSP was first introduced by [6] where constraints can be added or removed independently of the initial problem statement. A GCSP is extended in order to find a consistent solution while the DCSP in [6] has already a solution and is extended due to influence from the outside world (e.g., additional constraints) that necessitates finding a new solution. Here we give a definition of a GCSP that abstracts from the specifically for configuration tasks formulated approach in [28] and applies to the wider range of synthesis problems. Exploiting incremental relaxation of the unary constraints in CSPs (as we introduce in GCSP) is a promising technique for approaching any CSP. To formally refer to states reached along this relaxation, a first attempt could consider the framework given by the following definition [7, 26]:
Definition 1 (Open Constraint Satisfaction Problems). An Open CSP (OCSP), O, is defined as a set of CSPs with the same variables and constraints6, O(1), O(2), .... A partial order, ≺, is defined on O. O(i) ≺ O(j) when all the values in any domain of O(i) are also present in the corresponding domain of O(j). O(i) ≺ O(j) implies i < j. Excepting O(1), all other CSPs in O have a predecessor: ∀j > 1, ∃i, O(i) ≺ O(j). The relation between an OCSP and a Generative CSP will be discussed in more depth in Section 3.2.
3.1 Generative Constraint Satisfaction
Definition 2 (Generative constraint satisfaction problem (GCSP)). A generative constraint satisfaction problem is a tuple GCSP(X0 , Γ , T , ∆0 ), where: – X0 is the set of initially given variables. – Γ is the set of generic constraints. – T = {t1 , . . . , tn } is the set of variable types ti , where dom(ti ) associates the same domain to each variable of type ti , where the domain is a set of atomic values. – For every type ti ∈ T there exists a counter variable xti ∈ X0 that holds the number of variable instantiations for type ti . Thus, explicit constraints involving the total number of variables of specific types and reasoning on the size of the CSP becomes possible. The set of counter variables is C. – ∆0 is a relation on X0 \C ×(T, N ), where N is the set of positive integer numbers. Each tuple (x, (t, i)) associates a variable x ∈ X \ C with a unique type t ∈ T and an index i, that indicates x is the ith variable of type t. The function type(x) accesses ∆0 and returns the type t ∈ T for x and the function index(x) returns the index of x. Definition 3 (Generic constraint). A meta-variable Mi is associated a variable type type(Mi ) ∈ T and must be interpreted as a placeholder for all concrete variables xj , where type(xj ) = type(Mi ). A generic constraint γ ∈ Γ formulates a restriction on the meta-variables Ma , . . . , Mk , and eventually on counter variables. By generating additional variables and relaxing unary constraints7 , a previously unsolvable CSP can become solvable, which is explained by the existence of counter variables that hold the number of variables. To concisely specify that a meta-variable Mi has type t, we denote it by Mit . When modeling a configuration problem, variables representing named connection points between components, i.e., ports, will have references to other components as their domain. Consequently, we need variables whose domain varies depending on the size of a set of specific variables [28]. Example Given ta as the type of variables representing modules and tpa as the type of port variables that are allowed to connect to modules, then the domain of the pa 6
6 [31] introduces the notion of constraints that can be known without knowing domains, as a source of privacy in ABT.
7 Which can be dually seen as addition of values.
variables dom(tpa ) must contain references to modules. This is specified by defining t dom(tpa ) = {0, . . . , ∞}, by a meta-constraint M1pa