LPFML: A W3C XML Schema for Linear and Integer Programming Robert Fourer Department of Industrial Engineering and Management Sciences, Northwestern University, Evanston, IL,
[email protected]
Leo Lopes Department of Systems and Industrial Engineering, University of Arizona, Tucson, AZ,
[email protected]
Kipp Martin Graduate School of Business, University of Chicago, Chicago, IL,
[email protected]
There are numerous modeling systems for generating linear programs and numerous solvers for optimizing them. However, it is often not possible for modelers to combine their preferred modeling system with their preferred solver. Current modeling systems use their own proprietary model instance formats that various solvers have been adapted to recognize. The existence of all of these formats suggests that one way to encourage modeling system and solver compatibility is to use a standard representation of a problem instance. Such a standard must: be simple to manipulate and validate; be able to express instance-specific and vendor-specific information; and promote the integration of optimization software with other software. In this paper we present LPFML, an XML Schema for representing linear programming (LP) instances. In addition, we provide open source C++ libraries that simplify the exchange of problem instance and solution information between modeling systems and solvers. We show how our system is used to enable previously unavailable language-solver connections and how our design improves on the state of the art under three different scenarios relevant to communication between solvers and modeling systems. Key Words: linear programming, information systems, XML History: revised 9/01
1
1.
Introduction
There are varied modeling languages for expressing linear programming models as input to computer systems, and there are many efficient implementations of algorithms for solving linear programs — see Table 1. This proliferation of languages and solvers presents a difficulty for developers of linear programming software, as modelers may want to use any solver with any modeling language. If there are M modeling languages and N solvers, then M × N “drivers” are required for complete interoperability, as depicted in Figure 1. Table 1: Examples of modeling languages and of solvers. Modeling languages
Solvers
AIMMS (AIMMS (2003)) AMPL (Fourer et al. (1993)) GAMS (Brooke et al. (1988)) LINGO (Schrage (2000)) Mosel (Dash Optimization (2003a)) MPL (Software (2002)) OPL (ILOG (2003b))
CLP (IBM (2003a)) CPLEX (ILOG (2003a)) GLPK (Makhorin (2003)) LINDO (Schrage (1997)) MINOS (Murtagh and Saunders (1983)) MOSEK (Mosek ApS (2003)) Xpress-MP (Dash Optimization (2003b))
Note: Many other examples are listed in the Linear Programming Frequently Asked Questions, www-unix.mcs.anl.gov/otc/Guide/faq/linear-programming-faq.html.
One way to increase modeler-solver compatibility is to adopt a standard representation of a problem instance. Here it is important to make a distinction between models and instances. A model is an abstract, symbolic representation of an optimization problem, while an instance is an explicit description of a problem’s objective and constraints. For linear programming, a model can be described by linear algebraic expressions; an instance can be represented as a list of nonzero coefficients of variables in the objective and constraint functions, along with bounds on the variables and the constraint functions. Using a standard representation of an instance, only M + N software drivers are needed for complete interoperability, as seen in Figure 2. Each modeling language translator generates an instance in the standard format and each solver reads an instance in the standard format. Note that the arrows in Figures 1 and 2 are doubleheaded: a standard form should be able to express results returned from a solver as well as problems sent to a solver. 2
Modeling Language 1
Solver 1
Modeling Language 2
Solver 2
Modeling Language M
Solver N
MN Drivers Required Without XML
Figure 1: M × N drivers required without common interface. Modeling Language 1
Modeling Language 2
Solver 1
XML Instance
Modeling Language M
Solver 2
Solver N M + N Drives Required With XML
Figure 2: M + N drivers required with common interface. Any representation mechanism for linear programming instances is readily extended to specify that certain variables take only integer or zero-one values. Thus in the remainder of this paper, every claim we make regarding linear programming extends to mixed-integer linear programming. Our proposed representation also permits communication of solver output options (e.g. quantity of output), branching strategies, and other solver directives. This paper motivates and describes a standards proposal based on the concept of a markup language and using the technology known as XML. This approach offers substantial advantages over any of the likely alternatives, with few drawbacks. An XML-based approach also offers significant strategic benefits for the mathematical programming community, as we will explain. 3
We are not the first to incorporate XML into mathematical modeling. Chang (2003) and Kristj´ansson (2001) have also proposed XML representations for linear program instances. In contrast to these, our proposal provides a broader range of representational options; equally important, we provide open-source libraries to read and write instances expressed in our proposed format, whether from files or directly in memory. Martin (2002) demonstrates how to bypass a traditional algebraic modeling language and use XSLT (a language for transforming XML files) to convert raw data into an XML description of a problem instance such as we propose in this paper. Bradley (2003) provides a good overview of the uses of XML technologies in operations research. Section 2 describes the problem of representing instances of linear programs, and explains why XML is an appropriate technology on which to base a linear programming standard. Section 3 provides the necessary background material on XML, schemas, and other concepts used in this paper, using some snippets of our representation as examples but assuming no specific knowledge of it. These set the stage for Section 4, in which we introduce LPFML, the W3C XML Schema that defines our proposed format for representing instances of linear programs. Section 5 describes open-source libraries that support the LPFML schema. These libraries hide the XML details completely and reduce interacting with LPFML to manipulating vectors and sparse matrices. As an illustration, we use these libraries to make previously unavailable connections between the AMPL modeling language and the LINDO solver and the OSI solver interface. Section 6 deals with issues raised by the large data requirements of LP instances, by presenting the results of computational tests. We describe several methods for compressing LP instances, and demonstrate that LPFML is competitive in file size and processing speed. Section 7 discusses our strategies to prevent fragmentation of the standard, including our choice of software licensing, distribution structure, and official standardization steps. Finally, Section 8 proposes several extensions and discusses important implications of this work.
4
2.
Instance Representation
An algebraic representation of a simple product mix model, in the AMPL modeling language, is given in Figure 3. This model describes the data needed by any product mix problem, but does not supply particular data. Instead the data for an instance of the product mix linear program is given separately. Figure 4 shows an example of the data (see Anderson et al. (1991)) for an instance, in AMPL’s format for data. set PROD; set DEP;
# products # processing departments
param hours {DEP}; param rate {DEP,PROD};
# time available in each department # hours used in each dept per product unit made
param profit {PROD}; var Make {PROD} >= 0;
# profit per unit of each product made # number of units of each product to be made
maximize TotalProfit: sum {j in PROD} profit[j] * Make[j]; subject to HoursAvailable {i in DEP}: sum {j in PROD} rate[i,j] * Make[j]