Integration of Deduction and Computation

Jacques Calmet, Clemens Ballarin and Peter Kullmann
University of Karlsruhe
Postfach 6980, 76128 Karlsruhe, Germany
fcalmet, ballarin, [email protected]
http://iaks-www.ira.uka.de/calmet/
October 12, 2000

Abstract
We outline some of our approaches to the integration of Computer Algebra Systems and Automated Theorem Provers. Experimental couplings led to the development of the OMSCS framework, an architecture to specify the coupling of computational and reasoning systems. A model defining the context of a computation is proposed next. Finally, a multiagent approach, built upon our KOMET project, is outlined through the integration of Mathematica.
1 Introduction

The design of languages and environments to combine and integrate several heterogeneous systems was initiated a few years ago in many areas. The proceedings of the FroCos series of conferences [19] illustrate this trend in the area of logic. The integration of theorem proving and symbolic mathematical computing has evolved from prototype extensions of single systems to the study of environments enabling interaction among distributed systems. The CALCULEMUS initiative is an essential factor in the structuring of this domain of research. It started a few years ago as a collaborative effort of groups in Computer Algebra (CA) and Automated Theorem Proving (ATP). Today it is both a European Research Training Network and a series of workshops [17].

Computer algebra systems (CASs) and automated theorem provers (ATPs) exhibit complementary abilities. CASs focus on efficiently solving domain-specific problems. ATPs are designed to allow for the formalization and solution of wide classes of problems within some logical framework. Integrating CASs and ATPs allows for the solution of problems of a higher complexity than those either class can handle alone [2].

There are three basic approaches to the integration of deduction (or reasoning) and computation. The first is to extend the capabilities of a CAS to enable deduction. A well-known example is the Theorema project of Buchberger, which is built upon Mathematica. The second is to extend an ATP by implementing computational capabilities. A typical example is the Omega system of Siekmann. References are found in the proceedings cited above. A drawback of these methods is that they require redoing, to some extent, what has been achieved over three decades in TP and CA respectively. This is why we have selected a third method, which consists in coupling existing systems.

We do not aim at giving a general survey of the state of the art of the domain. In fact, no recent survey exists. The proceedings of the FroCos and CALCULEMUS workshops are the best sources of references. We present some of the work done in Karlsruhe in recent years. We start by describing some experimental couplings. Then, we present OMSCS, a framework suitable for integrating several systems simultaneously in a semantically sound way. The following section investigates some ways to improve the specification and semantic soundness of CASs. A tool in that direction is to formulate an algorithm as a schema and then to use this feature to define the context of a computation. The next section deals with an approach that is gaining strength in this domain: the multiagent interpretation of cooperating systems. The KOMET system is briefly introduced and we sketch how Mathematica can be queried within KOMET. A concluding section is devoted mainly to a discussion of potential applications and how to tackle them.
2 Combining Dtp and Magma

Communication and cooperation mechanisms for logical and symbolic computation systems make it possible to study and solve new classes of problems and to perform efficient computation through cooperating specialized packages. As experiments, we have designed and implemented interfaces between the tactical theorem prover (TP) Isabelle and the general computer algebra system (CAS) Maple [3], and between the automated theorem prover Dtp (Don Geddis' Theorem Prover) [13] and Magma [5]. The results have been used to design an interface between Imps and Calvin, an experimental CAS we are designing to bypass the black box feature of most CASs. The first coupling is described in [3]; we do not duplicate it here. Maple has been used in other experimental couplings, for instance in [15]. The second coupling has been the topic of several talks but was never published. It is the topic of this section.

One of the goals when combining TPs and CASs is to assess whether the integrated system has better deduction capabilities than a stand-alone theorem prover. The main reason for selecting Dtp is that it is an automated TP based on resolution. Indeed, this was the first attempt to link a CAS and an automated TP. A second reason is that it is freely available and written in Common Lisp. Magma is one of the two CASs available for serious computational problems in group theory. Besides including most of the specialized algorithms required in this domain, it offers several relevant databases of results.
2.1 The Interface
Designing the interface is straightforward since Dtp allows any external Lisp function to be called within proofs: ground terms starting with the keyword eval are evaluated by Lisp. As Magma does not provide interfaces for interaction, we implemented the communication through standard Unix pipes. Dtp then acts as a master, providing the user interface and calling Magma when desired. Magma remains unchanged and reads the initial definitions provided by Dtp. For technical purposes, we have ported Dtp to CLISP, a Common Lisp developed in Karlsruhe. It is sufficient to implement two unidirectional links, clisp2magma and magma2clisp. This ensures easy portability since the kernels of the systems are not extended. Three new Dtp functions have been introduced: connectmagma, magmacommand and closemagma. The second one is used to pass commands to Magma. This also requires implementing functions to parse the Lisp expressions in the input and output of Magma.
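The master/slave link over two unidirectional pipes can be illustrated with a minimal sketch. It is written in Python rather than Lisp and does not reproduce the actual Dtp code: the class PipeLink, its methods (which loosely mirror the roles of connectmagma, magmacommand and closemagma) and the echo process standing in for Magma are all assumptions made for illustration.

```python
import subprocess
import sys

# Stand-in for the external CAS (an assumption, not Magma): a child process
# that reads one command per line and answers with one line.
ECHO_SERVER = (
    "import sys\n"
    "for line in sys.stdin:\n"
    "    print('ok ' + line.strip(), flush=True)\n"
)


class PipeLink:
    """Master side of a link made of two unidirectional pipes."""

    def __init__(self, argv):
        self.argv = argv
        self.proc = None

    def connect(self):
        # stdin of the child is the master-to-slave pipe,
        # stdout of the child is the slave-to-master pipe.
        self.proc = subprocess.Popen(
            self.argv,
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
            bufsize=1,
        )

    def command(self, text):
        # Send one command and read back exactly one line of reply.
        self.proc.stdin.write(text + "\n")
        self.proc.stdin.flush()
        return self.proc.stdout.readline().rstrip("\n")

    def close(self):
        # Closing the write end signals end-of-input to the child.
        self.proc.stdin.close()
        self.proc.wait()
        self.proc.stdout.close()
        self.proc = None


link = PipeLink([sys.executable, "-c", ECHO_SERVER])
link.connect()
reply = link.command("Order(G);")
link.close()
```

Because the child process is started unchanged and only its standard input and output are used, neither side needs a kernel extension, which is the portability argument made above.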
2.2 Examples
To illustrate some features of the combination we give three simple examples taken from a classical exercise book in group theory [12].

Find a group of order 32 with the smallest number of conjugacy classes. Magma includes a database of all such groups, so they do not have to be generated. It is easy to guess candidates, e.g. the generalized group of quaternions (relations $B^{-1}ABA$, $B^2A^8$), but difficult to prove minimality. The cooperation consists of the following steps:
1. Magma provides, at the request of the prover, a database and its cardinality;
2. Dtp retrieves all objects in the database;
3. the number of conjugacy classes of every object is computed by Magma;
4. Dtp computes the minimum and determines the result.

Show that $W_3$ is isomorphic to the direct product of $A_5$ and $Z_2$, where $W_3$ is a sporadic Coxeter group with relations $x_1^2$, $x_2^2$, $x_3^2$, $(x_1x_3)^2$, $(x_1x_2)^3$, $(x_2x_3)^5$. A simple algorithm for constructing a monomorphism from $G_1$ to $G_2$, if $G_1$ is isomorphic to a subgroup of $G_2$, has been implemented in Dtp and Magma for any finite permutation groups $G_1$ and $G_2$. $W_3$ is transformed by the Todd-Coxeter algorithm CosetAction into a permutation group of order 120, and the isomorphism is automatically verified by considering the conjugacy classes.

Find a minimal $n$ such that the group of quaternions $Q$ is a subgroup of $S_n$.
Since $Q$ is of order 8 we know that $n \le 8$ (by Cayley's theorem) and $n \ge 4$ because 8 must divide $n!$. Although very inefficient, Magma could be called to test all possible values for $n$ by a simple algorithm. A better solution to this problem is automatically computed by the combination of Dtp and Magma, based upon reasoning with Sylow theorems by stepwise elimination of $S_4$, $S_5$, $S_6$ and $S_7$. For instance, the dihedral group $D_4$ of order 8 is a subgroup of $S_4$. Since $D_4$ is a 2-Sylow group and is not isomorphic to $Q$, the latter cannot be a subgroup of $S_4$. The same holds for $S_5$ because $D_4$ is still 2-Sylow in $S_5$. Eliminating $S_6$ is more difficult because one has to generate a 2-Sylow group, determine the order of its elements and check that there are fewer elements of order 4 than in $Q$. $S_7$ can be eliminated by the other Sylow theorems.

The initial goal of this work was to assess the integration of an ATP and a CAS. As expected, a tactical TP would be more powerful. This is illustrated by the computation for $S_7$ in the third example above. Although Dtp possesses databases of theorems grouped in theories to reduce the search space, the search space was too big. We had to use Magma to do some extra computation to control the growth of the search space. This, however, demonstrates the usefulness of such an interface. What is lacking in the approach we have adopted is a knowledge base common to the two integrated systems [6]. For instance, we have to feed some definitions and theorems to both systems. A common knowledge base would make such operations much easier. Such a need was also felt for the other experimental couplings we investigated.

As already pointed out, this interface was designed as an experiment to gain knowledge that can be used in more ambitious projects. It was thus a rather pleasant surprise when experts in group theory pointed out that such an integrated system may provide an approach to proving "real" theorems.
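Returning to the third example, the Sylow argument relies on the order of a 2-Sylow subgroup of $S_n$, which is the largest power of 2 dividing $n!$. The following sketch (in Python, not part of the Dtp and Magma coupling; the function name is hypothetical) computes these orders for $n = 4, \ldots, 8$ with Legendre's formula. It shows why a subgroup of order 8 is already 2-Sylow in $S_4$ and $S_5$, while in $S_6$ and $S_7$ the 2-Sylow subgroups have order 16 and need a closer inspection.

```python
from math import factorial


def two_part_of_factorial(n: int) -> int:
    """Largest power of 2 dividing n!, via Legendre's formula."""
    exponent, power = 0, 2
    while power <= n:
        exponent += n // power   # multiples of 2, 4, 8, ... up to n
        power *= 2
    return 2 ** exponent


# Order of a 2-Sylow subgroup of S_n for the candidate values of n.
sylow_orders = {n: two_part_of_factorial(n) for n in range(4, 9)}

# Sanity check against the definition: the order divides n! and the
# remaining cofactor is odd.
for n, order in sylow_orders.items():
    assert factorial(n) % order == 0
    assert (factorial(n) // order) % 2 == 1
```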
A test problem is as follows, where the word "suitable" is used to avoid an overly long presentation of the problem: Given a "suitable" infinite collection of p-groups, give a formula for the least $n$ such that the $i$-th group in the collection can be embedded in $S_n$ but not in $S_{n-1}$. According to the experts, this is a very long-term project, over 10 years, even when using ATPs and CASs. However, when analyzing the problem, it is possible to identify sub-problems. Many of them are computational. For instance, one must compute determinants of matrices. Depending on the size of these matrices, this is not an easy problem and requires a thorough management of the computation. There are deduction problems as well. One of them is supposed to be simple and can be seen as a test of feasibility: can we prove, by machine, that every subgroup of $Q_{2^n}$, the quaternion group of order $2^n$, is normal?
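For the smallest case only, the quaternion group of order 8, the feasibility question can be settled by exhaustive computation. The sketch below is an illustration in Python under that restriction; it is a brute-force check, not a machine proof in the sense of an ATP, and it says nothing about larger orders. The representation of group elements as integer quaternion tuples and all function names are assumptions of this sketch.

```python
from itertools import combinations


def qmul(p, q):
    """Hamilton product of two quaternions given as (a, b, c, d) tuples."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )


def qinv(p):
    """The inverse of a unit quaternion is its conjugate."""
    a, b, c, d = p
    return (a, -b, -c, -d)


ONE = (1, 0, 0, 0)
UNITS = [ONE, (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)]
# The eight elements +-1, +-i, +-j, +-k.
Q8 = [tuple(s * x for x in u) for u in UNITS for s in (1, -1)]


def generated(gens):
    """Closure of a generating set under multiplication (finite group)."""
    elems = {ONE} | set(gens)
    while True:
        new = {qmul(x, y) for x in elems for y in elems} - elems
        if not new:
            return frozenset(elems)
        elems |= new


# Enumerate all subgroups as closures of all subsets of the group.
subgroups = {
    generated(g) for k in range(len(Q8) + 1) for g in combinations(Q8, k)
}


def is_normal(h):
    """Check g * x * g^-1 in h for every g in Q8 and x in h."""
    return all(qmul(qmul(g, x), qinv(g)) in h for g in Q8 for x in h)


all_normal = all(is_normal(h) for h in subgroups)
```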
The various integrations of systems we performed lead to some general conclusions. Some of them are as follows. The coupled CAS and TP must exchange some of the knowledge they respectively embed. This is best achieved by implementing some sort of common knowledge base. It is possible to classify the types of integration. This is extensively discussed in [6]. Nowadays, one refers to the granularity of the interaction. Coupling two systems is usually technically straightforward. A main benefit for deduction problems is that the availability of computational capabilities makes it possible to control the growth of the search space. In most cases, the results provided by the CAS are trusted by the prover and thus accepted without further checking whether they are correct. This is not satisfactory since CASs are not semantically sound. We investigate this point further in the section dealing with contexts. It is necessary to define a semantically sound general framework for coupling mathematical systems.
3 Open Mechanized Symbolic Computation Systems

Most of the coupling experiments conducted so far followed an ad hoc approach, resulting in solutions tailored to specific problems. Moreover, they led to the cooperation of two systems only. A structured and principled approach is necessary to allow for the sound integration of several systems in a modular way. Designing interfaces and specifications for combining systems has led to the definition of (official or de facto) standards in many areas of research, e.g. hardware interfaces, communication languages and protocols. However, in most cases, no definition of formal semantics has been provided within those standards. When dealing with symbolic computation systems, a formal semantics is required to specify the cooperation in computing the solutions, and to provide the principles for designing new interfaces. We proposed in [4] a framework, OMSCS (Open Mechanized Symbolic Computation Systems), which can be used to specify both ATPs and CASs, and to represent their integration formally. This section summarizes the main ideas.

A symbolic mathematical service is a piece of software able to conduct useful and semantically meaningful two-way interactions with the environment. A symbolic mathematical service should be structurally organized as an open architecture able to provide services such as proving that a formula is a theorem or computing a definite symbolic integral, and able, if and when necessary, to rely on similar services provided by other tools. In [14], the Open Mechanized Reasoning System (OMRS) architecture was introduced as a means to specify and implement reasoning systems (e.g., theorem provers) as logical services. In [6], this approach has been recast for the domain of symbolic computer algebra systems. OMSCS is the result of recasting together the two approaches.
3.1 The OMSCS Framework
The specification of a service must be performed at various levels. At the object level, it is necessary to define formally the objects involved in the service, and the basic operations upon them. E.g., for a theorem prover, one must define the kind of assertions it manipulates, and the basic inference rules that can be applied to them. Then, the control level provides a means to define the implementation of the computational capabilities defined at the object level, and to combine them. The control level must include some sort of "programming language" which is used to describe a strategy for applying the modules that implement basic operations, and therefore to actually define the behavior of the complex system implementing the service. Finally, the way the service is perceived by the environment, e.g. the naming of services and the protocols implementing them, is defined within the interaction level. This leads to the following architectural structure for reasoning and algorithmic services:

Reasoning Theory = Sequents + Rules
Reasoning System = Reasoning Theory + Control
Logical Service = Reasoning System + Interaction

Computation Theory = Objects + Algorithms
Computation System = Computation Theory + Control
Algorithmic Service = Computation System + Interaction

We synthesize these definitions into that of Symbolic Mathematical Service:

Symbolic Computation Theory = Symbolic Entities + Operations
Symbolic Computation System = Symbolic Computation Theory + Control
Symbolic Mathematical Service = Symbolic Computation System + Interaction
We call this architecture Open Mechanized Symbolic Computation Systems (OMSCS).
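The three-layer composition can be mirrored by a small data-structure sketch. It is only an illustration in Python: the class and field names are assumptions, and representing control as named sequences of operations is a deliberate simplification of the control level, not the formal definition given in [4].

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class SymbolicComputationTheory:
    """Object level: symbolic entities plus operations on them."""
    entities: List[str]
    operations: Dict[str, Callable]


@dataclass
class SymbolicComputationSystem:
    """Theory + control: a strategy is a named sequence of operations."""
    theory: SymbolicComputationTheory
    control: Dict[str, List[str]] = field(default_factory=dict)


@dataclass
class SymbolicMathematicalService:
    """System + interaction: public service names mapped to strategies."""
    system: SymbolicComputationSystem
    interaction: Dict[str, str] = field(default_factory=dict)

    def call(self, service, value):
        strategy = self.interaction[service]
        for op_name in self.system.control[strategy]:
            value = self.system.theory.operations[op_name](value)
        return value


# Toy instance (hypothetical): integers with two operations.
theory = SymbolicComputationTheory(
    entities=["integer"],
    operations={"double": lambda x: 2 * x, "succ": lambda x: x + 1},
)
system = SymbolicComputationSystem(
    theory, control={"double_then_succ": ["double", "succ"]}
)
service = SymbolicMathematicalService(
    system, interaction={"compute": "double_then_succ"}
)
result = service.call("compute", 5)
```

Each layer only refers to the one below it, so a different control strategy or a different public interface can be attached to the same theory without touching it.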
3.2 The object level
Actual systems implement a variety of computation paradigms, based on a wide spectrum of classes of entities. The object level allows for the representation of this variety of objects and behaviors. The notion of domain is extended by defining a system of symbolic entities as a means to represent the entities manipulated by a symbolic computation system, and the basic relationships governing them. A system of entities includes a set of symbolic objects, a system of symbolic instantiations and a system of symbolic constraints. Objects, constraints and instantiations are taken as primitive sorts. Objects and constraints may be schematic, to allow for the representation of schematic computations via the instantiation system. Thus a system of symbolic entities is a triple as follows:

$$Esys = \langle O, Csys, Isys \rangle$$

$O$ is the set of symbolic objects.
$Csys$ is a constraint system $\langle C, \models \rangle$, where $C$ is a set of constraints, and $\models\ (\subseteq P_\omega(C) \times C)$ is a consequence relation on constraints.
$Isys$ is an instantiation system $\langle I, \_[\_] \rangle$, where $I$ is the set of instantiation maps (or instantiations), and $\_[\_]$ is the operation for application of instantiations to objects and to constraints, that is $\_[\_] : [O \times I \to O]$ and $\_[\_] : [C \times I \to C]$.

In order to qualify as a system of symbolic entities $Esys$, a number of formal requirements are enforced over $\models$, $I$, and $\_[\_]$. They are described in [4], where theoretical features of the object level are investigated.
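The application of an instantiation to a schematic object can be pictured with a short sketch. The encoding below is an assumption made for illustration only: objects are nested tuples, schematic variables are strings starting with "?", and an instantiation map is a dictionary. The formal requirements on instantiation systems from [4] are not modeled.

```python
def apply_instantiation(obj, inst):
    """Apply an instantiation map to a (possibly schematic) object."""
    if isinstance(obj, str):
        # Schematic variables are replaced; other symbols are kept.
        return inst.get(obj, obj) if obj.startswith("?") else obj
    if isinstance(obj, tuple):
        return tuple(apply_instantiation(part, inst) for part in obj)
    return obj  # numbers and other atoms are left unchanged


# Schematic object ?x + ?y * 2 and an instantiation for ?x and ?y.
schematic = ("plus", "?x", ("times", "?y", 2))
inst = {"?x": "a", "?y": ("plus", "b", 1)}
ground = apply_instantiation(schematic, inst)
```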
3.3 The control level
The control level must specify how a system implements the transformations of entities specified declaratively at the object level, and the strategies adopted by the system to combine them to achieve complex behaviors. OMSCS adopts the tactic-based approach to pursue the first aim. The basic computation abilities of a system are represented using primitive tactics. A primitive tactic provides a particular implementation of an operation defined within a symbolic computation theory. Intuitively, a primitive tactic is defined to be a correct implementation of an operation op if every tuple describing its input/output behavior corresponds to some tuple contained in the definition of op. Primitive tactics implement OMSCS operations directionally. Tactics may fail, representing the partiality of operation applications. It must be possible to control the use of primitive tactics so that they realize specific instances of operations. This is achieved by exploiting two mechanisms, control arguments and control annotations. Control arguments are additional objects manipulated by the tactics (in addition to symbolic entities) in order to generate values for the output entities. Control annotations are meant as a coloring of the symbolic entities manipulated by tactics; they consist of additional information, which can be removed via an "annotation removal mapping". Control arguments and control annotations capture the two forms of control (explicit, or environment-driven, and implicit, or system-driven) available in most systems. These features are taken explicitly into account by redefining the notions provided at the object level accordingly.

Thus a system of annotated symbolic entities is a 5-tuple of the form:

$$Esys_a = \langle O_a, O_c, F, Csys_a, Isys_a \rangle$$

$O_c$ is the set of control objects; $F$ is the set of failures, which contains at least a no-failure element Ok and a generic failure element Fail; $O_a$, $Csys_a = \langle C_a, \models \rangle$ and $Isys_a = \langle I_a, \_[\_] \rangle$ are the annotated counterparts of the object level definitions.

A more complete presentation is given in [4] together with an illustrative example, the coupling of Isabelle and Maple in this framework. In [4] it is also explained why the notion of gluing of symbolic computation theories through bridge rules is well-suited to such a framework.
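A toy primitive tactic may help to fix these notions. The sketch below is an assumption-laden illustration in Python, not the formalization of [4]: the operation is the relation "y is a square root of x" on integers, the control argument sign selects which instance of the operation is realized, annotations are plain strings, and the failure values are the elements Ok and Fail named above.

```python
from math import isqrt

OK, FAIL = "Ok", "Fail"   # the two failure values named in the text


def strip_annotation(annotated):
    """Annotation removal mapping: forget the control annotation."""
    value, _annotation = annotated
    return value


def sqrt_tactic(annotated_x, sign):
    """Primitive tactic for the operation {(x, y) : y * y == x}.

    annotated_x -- annotated entity (integer, annotation string)
    sign        -- control argument (+1 or -1) choosing the output
    Returns (failure value, annotated result or None).
    """
    x = strip_annotation(annotated_x)
    if x < 0:
        return FAIL, None            # operation undefined here
    r = isqrt(x)
    if r * r != x:
        return FAIL, None            # no integer square root exists
    return OK, (sign * r, "computed")


status, result = sqrt_tactic((49, "user input"), -1)
failed_status, _ = sqrt_tactic((50, "user input"), 1)
```

The tactic is correct in the sense described above: whenever it returns Ok, the pair formed by the input and the stripped output belongs to the operation.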
3.4 The interaction level
OMSCS is based upon rigorous proofs showing that the components and features of the object and control levels indeed perform what they are intended to do. No such proofs exist yet for the interaction level. Specifying how a mathematical service interacts with the external world is still an open research problem. One may expect that a solution is specific to a given computational environment. This would rule out a general specification of this component of the architecture. To illustrate this opinion we briefly mention two attempts to investigate this problem. A first approach is based upon the Logic Broker Architecture of Armando et al. [1]. This architecture has many features in common with OMSCS. It is based upon logic. Although its scope appears to be restricted to the cooperation of CASs and ATPs, it looks very promising. A second one is to manage the interaction level through communication protocols. It then becomes possible to define a mathematical software bus that enables cooperation and exchange of "data" among a wide class of software systems used for mathematical computing. A first attempt is described in [7]. The latter approach looks even more meaningful when trying to extend OMSCS to systems such as numerical computation or graphics.
4 Schemata and context

In the previous section a framework was introduced to formalize the cooperation of symbolic systems. It assumes that each integrated system is semantically sound. Theorem provers are such systems; CASs are not. A trivial example is that simplification is often performed through rewrite rules. A consequence is that it is not possible to rule out divisions by zero. We now turn our attention to CASs and show how some previous work leads to revisiting the typing and specification of a CAS. The starting point is to consider that we compute with operators defined on given domains (types) that have specific properties (specifications). We call the triplet (operator, domain, properties) an abstract computational structure (ACS). Setting our approach in the framework of knowledge representation, a mathematical knowledge base consists of type schemata, algorithm schemata, algebraic algorithms, theorems, symbol tables, and normal forms. A schema is a representation paradigm used in artificial intelligence. We adopt the specification language Formal- [8] to represent the mathematical knowledge. It is well-suited to specifying mathematical domains of computation. An algebraic specification introduces constants, operators and properties in their intended interpretation, and enables the reuse of subspecifications within a specification in accordance with the dependencies between particular specification modules of an ACS. It is based upon category theory. This provides a link to the previous section. The next step is to define types, equations and algorithms through schemata. A more detailed presentation and suitable references are found in [9].
A type schema represents such a module and consists of:
Name, a unique identifier;
Based-on, a list of inherited ACS;
Parameters, a list of ACS which are parameters;
Sorts, a list of new sorts;
Operators, declarations of new operators;
InitialProps, initial properties.

These definitions build a based-on hierarchy of the mathematical domains of computation. One also defines equation schemata. Algorithms are also represented in terms of schemata. They allow the representation of meta-knowledge such as:
Name, a unique identifier of the schema with variable bindings;
Signature, which describes the types of input and output;
Constraints, imposed on domain and range;
Definition, a mathematical description of the output;
Subalgs, a list of subalgorithms describing the embedded subtasks;
Theorems, describing properties of the algorithm;
Function, the name of the corresponding executable algebraic function to compute the output.

Similarly to type and equation schemata, algorithm schemata build a hierarchy of specialized versions, and specializations inherit definitions and theorems from more general algorithms. New properties of algorithms can then be derived by a possibly coupled theorem prover. The concepts of specification language and schemata representation are at the core of the Calvin system mentioned in section 2.

A consequence of such an approach is that it makes it possible to define a context for a computation. This methodology is valid both when a CAS stands alone and when it is coupled to a TP. A context aims at making available the mathematical knowledge hidden in algebraic algorithms and in computational procedures. It is a methodology to improve the semantic soundness of symbolic computations. Although this is not straightforward to see, both Formal- and OMSCS are closely related to the formulation of specifications through category theory. To stay as close as possible to OMSCS, we subdivide a context into three levels.
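The algorithm schema and its inheritance of theorems can be sketched as a record type. This is an illustration in Python, not the Formal- representation used in Calvin: the field names follow the list above, while the parent link, the method all_theorems and the two gcd schemata (including their contents) are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class AlgorithmSchema:
    name: str
    signature: str
    constraints: List[str] = field(default_factory=list)
    definition: str = ""
    subalgs: List[str] = field(default_factory=list)
    theorems: List[str] = field(default_factory=list)
    function: str = ""
    # Link to the more general schema (an assumption of this sketch).
    parent: Optional["AlgorithmSchema"] = None

    def all_theorems(self) -> List[str]:
        """Theorems of this schema plus those inherited from ancestors."""
        inherited = self.parent.all_theorems() if self.parent else []
        return inherited + self.theorems


# Hypothetical hierarchy: a general gcd schema and an integer version.
gcd_general = AlgorithmSchema(
    name="gcd(a, b)",
    signature="D x D -> D",
    constraints=["D is a Euclidean domain"],
    theorems=["gcd(a, b) = gcd(b, a)"],
)
gcd_integer = AlgorithmSchema(
    name="gcd(a, b)",
    signature="Z x Z -> Z",
    theorems=["gcd(a, b) >= 0"],
    function="integer_gcd",   # hypothetical executable name
    parent=gcd_general,
)
known = gcd_integer.all_theorems()
```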
The object level context collects the set of specifications linked to an operator and to its domain of definition. More generally, the goal is to access all of the information and knowledge that is either explicitly or implicitly available in the schemata representation of algebraic algorithms. This enables the representation of the meta-knowledge available in a schema. The context at this level is basically static in the sense that it is not linked to a run-time activity of the algorithms.

At the control level the context is partly static and partly dynamic. The static part arises from the hierarchical organization of the object level schemata into equational schemata. At the root of this graphical hierarchy lies the "simplify" function. This thus generates a dynamic component that is associated with the simplification process. This mechanism ought to drastically improve the semantic soundness of symbolic computation. For instance, keeping track of divisions by expressions such as $x - a$ avoids possible divisions by 0 later on in a computation. Technically, this is enforced by adding, when relevant depending on the type, to the theorem part of the schema an equation of the form $x \neq a$. This implies a refined representation of a schema by splitting it into static and dynamic components. The dynamic component can thus be discarded when a computation has ended. Another benefit is to avoid storing the whole history of a computation. Only those operations that could lead to mathematically unsound operations need to be stored. This is determined in agreement with the knowledge available in the schemata. The concept of annotation introduced in the control level of OMSCS could provide a mechanism to exchange contexts among cooperating systems. This is under investigation.

Defining the interaction level part of a context is still an open problem. A possible track is that the schemata approach leads to a concept of protocol to exchange mathematical knowledge and to check its soundness.
For instance, the associativity property of an operator is part of the equations defining the "theorems" in the schema for the relevant operator. This enables a fast check for correctness. Such a protocol follows directly from the hierarchical organization of schemata. It is thus part of the context as defined above.
It must be pointed out that we do not need to run a unification algorithm on the equations defining the theorems in a schema. On the contrary, we wish to have as many expressions as possible for a theorem to avoid any recomputation and to better control the search space when cooperating with a deduction system.
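The dynamic part of a context, as used for tracking divisions by expressions such as $x - a$, can be illustrated by a minimal sketch. It is written in Python under simplifying assumptions: only side conditions of the form $x \neq a$ are recorded, the class and method names are hypothetical, and no actual simplifier is involved.

```python
class Context:
    """Dynamic context: side conditions collected during a computation."""

    def __init__(self):
        self.nonzero = set()   # each entry (var, a) stands for var != a

    def cancel_linear_factor(self, var, a):
        """Record that a factor (var - a) was cancelled, i.e. var != a."""
        self.nonzero.add((var, a))

    def admissible(self, assignment):
        """True if the assignment violates none of the recorded conditions."""
        return all(assignment.get(var) != a for var, a in self.nonzero)


# Simplifying (x**2 - 1) / (x - 1) to x + 1 cancels the factor x - 1,
# so the condition x != 1 is stored instead of the whole history.
ctx = Context()
ctx.cancel_linear_factor("x", 1)


def simplified(x):
    return x + 1


safe_at_two = ctx.admissible({"x": 2})
safe_at_one = ctx.admissible({"x": 1})
```

Only the recorded condition has to be kept, and it can be dropped once the computation has ended, in line with the split into static and dynamic components described above.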
5 A multi-agent approach

Assume that an agent is a piece of software performing a given, well-defined task. When several agents cooperate through communication and negotiation to perform a more complex task, they constitute a multi-agent system. This is a very crude introduction. A much better one is found in [22], where different classes of agents are also defined. This definition is however sufficient to understand that the coupling of CASs and ATPs can be set into this framework. However, this is not yet fully recognized by the relevant communities (an exception is [20]).

Our approach is based upon KOMET (Karlsruhe Open Mediator Technology), a system under development since 1994 [10]. Integrating data and knowledge from multiple heterogeneous sources (each one possibly with a different underlying data model) is not only an important aspect of automated reasoning but also of retrieval systems, in the widest sense, whose queries can span such multiple sources. These sources can be as different as relational or deductive databases, object bases, (constraint) knowledge bases, or even (structured) files and arbitrary program packages encapsulating specific knowledge. The web can be such an information source. With the transition from isolated to cooperating information systems there was an urgent need for sound computational query processing with diverse circumstantial information. Wiederhold et al. [23] have proposed the concept of a mediator, a device which expresses how the integration of different databases is to be achieved. We have generalized the concept of mediator to information sources. A mediator integrates different sources on a semantic level by providing an integrated view spanning heterogeneous information sources. Different languages for building mediatory information systems have been proposed. We have selected (and extended) the Generalized Annotated Logic (GAP) [18]. It allows a seamless integration of temporal, uncertain and inconsistent information. The integration of external data is modeled as functions and relations over some external software packages.
In this section we focus on some implementation details involving the embedding of Mathematica into our Mediator Architecture. First, the basic underlying mediator architecture is described. Then, some steps of the integration process are sketched. Some recent theoretical developments in KOMET are reported in [11], where relevant references are cited.
5.1 The Mediator Architecture
The basic architecture consists of mediators converting queries from a common format into more specialized queries, which are subsequently converted by translators (wrappers) into the query language of the requested knowledge sources. Figure 1 depicts the basic architecture of our system. Such a translator must exist for each mediator-knowledge source combination. A translator also includes other functionalities for utilizing the knowledge source for the mediator, such as caching extracted information or managing remote procedure calls. This approach differs significantly from traditional integration of multiple databases: In general there is no global schema integrating the schemata of the local knowledge sources. The mediator schema is tailored to the specific integration problem. The information sources may be heterogeneous in their structure, for example object-oriented databases, computer algebra systems, flat www-pages or even mediators as well. In contrast, multi-database systems require more or less uniform data models to build a global schema. Integrating heterogeneous information sources does not only focus on the integration of the data-object schema but also on integrating the different access mechanisms.

[Figure 1: Mediator Architecture. Labels: Application, Mediator, Src1, Src2, Src3.]

Building a global schema from a set of local schemata is a bottom-up procedure: one first needs to define the local schemata explicitly, then the global schema is created on top of the local schemata, and finally application views on the global schema are defined. In contrast, a mediatory design methodology is a top-down approach starting from a given application need to access different heterogeneous information sources. The mediators providing this view have (via translators) direct access to the local knowledge sources because, taking the view from the top, no global schema of all the knowledge sources is available. Therefore mediators can be seen as specialized "global" views. They only integrate those parts of the underlying knowledge sources that are necessary for the application needs they serve.

[Figure 2: Bottom-up Integration of Heterogeneous Information Sources. Labels: View 1, View 2, Mediator Schema, KS 1, KS 2, KS 3.]

The KOMET approach differs from other approaches in that the mediator is knowledge-based, i.e. a declarative rule-based language is used for expressing the mediatory knowledge. The need for AI techniques in information system integration is supported by the following arguments: Two information systems may differ in the information they provide. Choosing the better (more reliable or more specialized) source requires expertise. Other knowledge-intensive decisions may be necessary, such as choosing the optimal parameters for a query or consulting the sources in the correct order. Given a predefined application need, it could be the case that neither each local information system nor a global schema built from them provides the whole answer. Instead, the answer to a query may require additional inferences on the external information. These kinds of mediators do not only integrate heterogeneous information systems but also add extra knowledge. The result is a new information system providing more than just the sum of the underlying information systems, but (re-)using and depending on them.
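The division of labor between a mediator and its translators can be sketched in a few lines. The following Python fragment is an illustration only and does not reflect the KOMET implementation: the classes Translator and Mediator, the routing by query kind and the two stand-in sources are assumptions. It shows one translator per source and the caching of extracted information mentioned above.

```python
from math import factorial


class Translator:
    """Wrapper around one knowledge source, with a cache of past answers."""

    def __init__(self, source):
        self.source = source    # any callable standing in for a source
        self.cache = {}
        self.calls = 0          # number of real accesses to the source

    def ask(self, query):
        if query not in self.cache:
            self.calls += 1
            self.cache[query] = self.source(query)
        return self.cache[query]


class Mediator:
    """Routes each kind of query to the translator of the right source."""

    def __init__(self, routes):
        self.routes = routes    # query kind -> translator

    def ask(self, kind, query):
        return self.routes[kind].ask(query)


# Hypothetical sources: a computational package and a small lookup table.
cas = Translator(factorial)
table = Translator({"A5": 60, "S4": 24}.get)
mediator = Mediator({"factorial": cas, "group_order": table})

first = mediator.ask("factorial", 5)
second = mediator.ask("factorial", 5)      # answered from the cache
order = mediator.ask("group_order", "A5")
```

The mediator only knows the parts of each source that the application needs, which corresponds to the top-down view described above.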
5.2 Syntax and Semantics
We sketch here the basic theory behind our approach to mediated systems. More detailed accounts are available in [10, 21]. A domain, D, is an abstraction of databases and software packages and consists of three components: (1) a set whose elements may be thought of as the data-objects that are being manipulated by the package in question, (2) a set F of functions on | these functions take objects in as input, and return, as output, objects from their range (which needs to be speci ed). The functions in F may be thought of as the prede ned functions that have been implemented in the software package being considered, (3) a set of relations on the data-objects in | intuitively, these relations may be thought of as the prede ned relations in the domain, D. 13
A constraint over D is a first-order formula whose symbols are interpreted over D. Such a constraint is either true or false in D, in which case it is said to be solvable or, respectively, unsolvable in D; the reference to D is omitted when it is clear from context. The key idea behind a mediated system is that constraints provide the link to external sources, whether they are databases, object bases, or other knowledge sources. Consider as an example the following clause:
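\[
\begin{array}{l}
\mathit{cool\_spray}(O, X, Y) : [1, T] \;\leftarrow \\
\quad (X_1, Y_1)\ \mathrm{IN}\ \mathrm{RANGE}((X, Y), 2),\ \mathrm{AT}(O, X_1, Y_1),\ \mathrm{TEMP}(O, \mathit{Temp}),\ \mathit{Temp} > 100 \\
\quad \|\ \mathit{robot\_at}(X, Y) : [0.5, T]
\end{array}
\]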
where cool_spray and robot_at denote predicates of the annotated logic, which are defined over an annotation lattice of uncertainty and time, and IN, RANGE, TEMP and AT are relations that are provided by a variety of information sources. The clause states: "If the robot is at location (X, Y) with a certainty of at least 50% and there is an object O within a distance of 2 units from the robot that has a temperature over 100 degrees, then aim a cooling spray at object O." The clause contains:

Uncertainty: the positional uncertainty.

Time: the robot's location changes with time.

Four external constraint domains: a spatial database is accessed when evaluating the RANGE subquery and the IN relation. A relational database is used to evaluate the AT relation. A temperature sensor returns the values for the TEMP relation, and the real number constraint domain is used to evaluate the constraint Temp > 100.

In particular, the full-fledged language involves annotations of predicate symbols according to GAP. Basically, an annotation corresponds to a multi-valued truth value from a complete lattice of truth values. A more detailed description is given in [18]. The annotations play an important role in resolving attribute value inconsistencies. However, we focus here on the integration part and therefore omit the annotations.
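To make the reading of the clause concrete, the following Python sketch evaluates its body against in-memory stand-ins for the external sources. Everything here is assumed for illustration: the dictionaries `OBJECT_AT` and `TEMPERATURE`, the helper `in_range` and the function `cool_spray_targets` are hypothetical, the time annotation is ignored, and the certainty is reduced to a plain number.

```python
import math
from typing import Dict, Iterable, List, Tuple

Point = Tuple[float, float]

# Hypothetical stand-ins for the external sources named in the text.
OBJECT_AT: Dict[str, Point] = {      # relational database: AT relation
    "boiler": (1.0, 1.0),
    "crate": (9.0, 9.0),
    "pipe": (0.0, 1.5),
}
TEMPERATURE: Dict[str, float] = {    # temperature sensor: TEMP relation
    "boiler": 140.0,
    "crate": 20.0,
    "pipe": 80.0,
}


def in_range(center: Point, radius: float, candidates: Iterable[Point]) -> List[Point]:
    """Spatial source: the candidate points within `radius` of `center`."""
    return [p for p in candidates if math.dist(p, center) <= radius]


def cool_spray_targets(robot: Point, certainty: float) -> List[str]:
    """Objects O for which the body of the cool_spray clause is satisfied."""
    if certainty < 0.5:              # robot_at(X, Y) must hold with certainty >= 0.5
        return []
    near = set(in_range(robot, 2.0, OBJECT_AT.values()))
    return sorted(
        obj for obj, pos in OBJECT_AT.items()
        if pos in near and TEMPERATURE[obj] > 100   # real-number constraint
    )
```

Each condition in the comprehension corresponds to one constraint of the clause, and each constraint is answered by a different source. That division of labour is what the mediator coordinates.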
5.3 Embedding Mathematica into a Mediator Architecture

A description of this integration is found in the Master's thesis of Jekutsch [16]. It is, however, only an application there and by no means the main topic of the thesis, so its presentation is very sketchy. MathLink is used to link Mathematica to KOMET.
In contrast to the formalization of constraint domains described above (consisting of relations and functions), this approach relies on the more powerful concept of representing the functionality of the information source only as a set of relations, where for each relation one or more modes are given. Intuitively, a mode describes the permitted binding patterns for the evaluation of a given relation. Note that this is not a limitation, since functions can be represented as relations with appropriate modes. A relation mode is a tuple of argument modes, which specify the binding type of each argument required for the evaluation. The possible argument modes are listed in the following table.

Argument mode   before      after
+               ground      ground
-               arbitrary   ground
?               arbitrary   arbitrary

"+" means that the argument must be ground before testing the constraint predicates, "-" means that the argument must be ground after calling the external function, and "?" means that the variable instantiation is arbitrary. The use of modes implicitly imposes a certain order of evaluation on the constraint set and thus controls the data flow during the evaluation. With this approach, specific functionality of the CAS can be adequately introduced to the mediator. The proper modes ensure valid usage of the CAS functions.

Consider as an illustrating example the interoperation between a relational database containing the coefficients of polynomials $a_1 X^2 + a_0$ over the integers and Mathematica providing useful routines such as factoring polynomials. Although the task itself is rather simple, if not trivial, the example demonstrates how the mediator language can be used to interface the mediator to a CAS in a declarative manner:

\[
\begin{array}{l}
\mathit{Coeff}(A_0, A_1) \leftarrow \mathrm{Oracle{::}Polynomials}(A_0, A_1) \\
\mathit{Factorized\_Poly}(X) \leftarrow \mathrm{Mathematica{::}Factor}(X),\ \mathrm{Mathematica{::}Plus}(X, A_0, Z), \\
\quad \mathrm{Mathematica{::}Times}(Z, A_1, Z),\ \mathrm{Mathematica{::}Power}(Z, X, 2),\ \mathit{Coeff}(A_0, A_1)
\end{array}
\]

with modes Polynomials(+, +), Factor(+), Plus(?, +, +) and Times(?, +, +).
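How modes induce an evaluation order can be sketched in a few lines of Python. This is a minimal illustration, not the KOMET implementation: the function `schedule`, the relation names `Lookup`, `Square` and `Add`, and their modes are hypothetical, the three argument modes are written "+", "-" and "?" (the "-" symbol is assumed here, following common logic-programming notation), and all arguments are treated as variables.

```python
from typing import Dict, List, Sequence, Set, Tuple

Constraint = Tuple[str, Tuple[str, ...]]   # (relation name, argument variables)


def schedule(constraints: Sequence[Constraint],
             modes: Dict[str, str],
             ground: Set[str]) -> List[Constraint]:
    """Order constraints so every '+' argument is ground when its relation is called."""
    bound = set(ground)
    pending = list(constraints)
    ordered: List[Constraint] = []
    while pending:
        for c in pending:
            name, args = c
            if all(a in bound for a, m in zip(args, modes[name]) if m == "+"):
                # '-' arguments are ground after the call; '?' arguments stay as they are.
                bound.update(a for a, m in zip(args, modes[name]) if m == "-")
                ordered.append(c)
                pending.remove(c)
                break
        else:
            raise ValueError(f"no admissible evaluation order for {pending}")
    return ordered


# Hypothetical relations: a database lookup producing A and B, and two
# function-like relations whose last argument is the result.
example_modes = {"Lookup": "--", "Square": "+-", "Add": "++-"}
example_constraints = [
    ("Add", ("S", "B", "R")),
    ("Square", ("A", "S")),
    ("Lookup", ("A", "B")),
]
example_order = schedule(example_constraints, example_modes, ground=set())
```

In this sketch the lookup is scheduled first because it requires no ground arguments, and each later call becomes admissible only once the variables it needs have been bound. This is the sense in which modes control the data flow.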
When a query Factorized_Poly(X) is issued, the translator repeatedly receives an expression Factor(Plus(A0, Times(A1, Power(X, 2)))) with tuples (A0, A1) from the Oracle database, which results in the following sequence of MathLink function calls:

    link = MLStart('math -noinit -mathlink');
    /* MathLink expects prefix order: each head is followed directly by its arguments */
    MLPutFunction(link, 'Factor', 1);
    MLPutFunction(link, 'Plus', 2);
    MLPutSymbol(link, A0);
    MLPutFunction(link, 'Times', 2);
    MLPutSymbol(link, A1);
    MLPutFunction(link, 'Power', 2);
    MLPutSymbol(link, X);
    MLPutInteger(link, 2);
    MLEndPacket(link);
It is exactly the purpose of the translator to generate such a sequence of MathLink function calls. We could support the OpenMath interface as well, but it was not sufficiently well defined when this work was completed. We have sketched in this section how a multi-agent system (KOMET) can be used to query a CAS when it is viewed as an information source. The main goal, which was achieved, was to demonstrate that this approach is technically feasible. More work would be required to turn this experiment into a working tool.
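The translation step can be sketched as a recursive walk over the expression. MathLink receives an expression in prefix order: after a function head announced with n arguments, the next n expressions sent are those arguments. The Python sketch below only produces a list of call records named after the MathLink functions MLPutFunction, MLPutSymbol, MLPutInteger and MLEndPacket. It does not open a link, and the names `to_calls` and `translate` as well as the tuple encoding of expressions are assumptions made for illustration.

```python
from typing import List, Tuple, Union

Expr = Union[int, str, Tuple]   # integer, symbol name, or (head, arg1, arg2, ...)
Call = Tuple                    # e.g. ("MLPutFunction", "Plus", 2)


def to_calls(expr: Expr) -> List[Call]:
    """Flatten a nested expression into put-calls: head first, then each argument."""
    if isinstance(expr, int):
        return [("MLPutInteger", expr)]
    if isinstance(expr, str):
        return [("MLPutSymbol", expr)]
    head, *args = expr
    calls: List[Call] = [("MLPutFunction", head, len(args))]
    for arg in args:
        calls.extend(to_calls(arg))
    return calls


def translate(expr: Expr) -> List[Call]:
    """Call records for one expression, followed by an end-of-packet record."""
    return to_calls(expr) + [("MLEndPacket",)]


# Factor(Plus(A0, Times(A1, Power(X, 2)))) for the database tuple (A0, A1) = (3, 5).
packet = translate(("Factor", ("Plus", 3, ("Times", 5, ("Power", "X", 2)))))
```

Because the walk is generic, the same function serves any relation the mediator exposes. Only the expression built from the query and the database tuples changes.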
6 Conclusion

We have sketched some of the approaches we are pursuing to couple CASs and ATPs. They range from the ad hoc integration of two selected systems to the design of general environments. It must be noted that this research is set in Computer Science (CS) and that modeling (subsets of) Mathematics leads to the design of ambitious systems in CS. Such models are imposed by the structure of the mathematical fields under investigation. They are not trivial and cannot be restricted merely to facilitate the design of such environments.

A goal of the coupling of CASs and ATPs is to increase the capabilities of the stand-alone systems. So far, we have not witnessed any impact of the availability of mechanized reasoning or proving on the ability of CASs to perform computations. The impact has been in the reverse direction.

When two communities start collaborating, a first requirement is that they speak a common language: they must share common standards and agree on the terminology of the concepts they use. This is illustrated by the following comments.

As pointed out above, the CA community does not expect to benefit much from the coupling of ATPs to their systems. But they have realized that CASs can help to improve automated proof techniques. For instance, the Theorema project of Buchberger ought to enable writing a book on Groebner bases, the proofs being established by Theorema (sitting on top of Mathematica). The early history of CA shows that its initial successes were in domains where the computations were not that difficult but much too lengthy to be performed by humans, for instance in celestial mechanics or particle physics. This is why computer algebraists tend to investigate proof problems having similar characteristics. The theorems on p-groups mentioned in section 2 fall into this category.

Computer algebraists are either mathematicians or are very close to them.
A consequence is that they usually share the opinion that a famous theorem is a theorem that brings fame in Mathematics. This implies that the theorem has not been previously proven. The ATP community apparently has a broader definition of what a famous theorem is.

There is a genuine interest in the ATP community in proving that algorithms or procedures used in CASs are correct. Here lies another possible misunderstanding. In CA an algorithm is, hopefully, a constructive procedure that solves all cases of a computational assignment, including the nonexistence of a solution. The algorithm is then almost a "carbon copy" of the mathematical proof on paper. One could illustrate this facet by algorithms for factorization, gcd or integration. Heuristic procedures are avoided as much as possible. The implementations of these algorithms are then polished over the years and fully debugged. In the ATP community an algorithm is mainly viewed as it is in CS: a finite, definite procedure producing unique solutions and not necessarily built upon a constructive mathematical proof. As a result, the ATP community has recently produced several mechanized proofs of algebraic algorithms. Analyzing these proofs shows that they duplicate the contents of the algorithm. A consequence is that an agreement on what needs to be proven is required.

Despite that, coupling deduction and computation is a fascinating task. This was hopefully stressed enough in the previous sections. Some other topics have not been covered. Among them is the use of coupled systems to teach Mathematics in classrooms. CASs are already very successful in this respect. One may now think of tools for teaching how to prove theorems. There may also be an extension towards cognitive science. We mentioned in section 2 that a common knowledge base is required to improve the coupling of systems. It could be extended into a tutoring component for both computation and deduction. Such a component is required for use in the classroom. Designing it could steer the design of a simpler ATP usable by non-experts.

A last remark is that cooperating systems will play an increasingly greater part in our approach to computing as a whole.
A scenario in which users query systems distributed on the web has been mentioned many times. Its feasibility depends partly on making software systems cooperate. Today, it is possible to integrate heterogeneous pieces of hardware through communication buses. But making heterogeneous software systems intercommunicate remains a very challenging task.
References

[1] A. Armando and D. Zini, Interfacing Computer Algebra and Deduction Systems via the Logic Broker Architecture. In Proceedings of FroCoS 2000.

[2] C. Ballarin and L. C. Paulson, A Pragmatic Approach to Extending Provers by Computer Algebra - with Applications to Coding Theory, Fundamenta Informaticae, Vol. 39, No. 1–2, pp. 1–20, 1999.

[3] C. Ballarin, K. Homann and J. Calmet, Theorems and Algorithms: An Interface between Isabelle and Maple. In A.H.M. Levelt (Ed.), Proceedings of International Symposium on Symbolic and Algebraic Computation (ISSAC'95), pp. 150–157, ACM Press, 1995.

[4] P.G. Bertoli, J. Calmet, F. Giunchiglia and K. Homann, Specification and Integration of Theorem Provers and Computer Algebra Systems, Fundamenta Informaticae, Vol. 39, No. 1–2, pp. 39–57, 1999.

[5] W. Bosma and J. Cannon, Handbook of Magma Functions, Sydney, 1994.

[6] J. Calmet and K. Homann, Classification of Communication and Cooperation Mechanisms for Logical and Symbolic Computation Systems. In K.U. Schulz and F. Baader (Eds.), Frontiers of Combining Systems, Proceedings of FroCoS'96, pp. 221–234, Kluwer Series on Applied Logic, 1996.

[7] J. Calmet and K. Homann, Towards the Mathematics Software Bus, Theoretical Computer Science, Vol. 187, pp. 221–230, 1997.

[8] J. Calmet and I.A. Tjandra, A Unified-Algebra-Based Specification Language for Symbolic Computing. In A. Miola (Ed.), Design and Implementation of Symbolic Computation Systems, LNCS 722, pp. 122–133, Springer, 1993.

[9] J. Calmet, K. Homann and I.A. Tjandra, Unified Domains and Abstract Computational Structures. In J. Calmet and J.A. Campbell (Eds.), International Conference on Artificial Intelligence and Symbolic Mathematical Computing, Karlsruhe, August 3–6, 1992, LNCS 737, pp. 166–177, Springer, 1993.

[10] J. Calmet, S. Jekutsch, P. Kullmann and J. Schü, KOMET - A System for the Integration of Heterogeneous Information Sources, 10th International Symposium on Methodologies for Intelligent Systems (ISMIS), Springer, 1997.

[11] J. Calmet, P. Kullmann and M. Taneda, Composite Distributive Lattices as Annotation Domains for Mediators. To appear in Proc. of AISC'2000, Madrid, July 2000. Springer LNAI, 2000.

[12] J.D. Dixon, Problems in Group Theory, Dover Publishing, 1973.

[13] D. Geddis, The DTP Manual, Stanford University, 1994.

[14] F. Giunchiglia, P. Pecchiari and C. Talcott, Reasoning Theories: Towards an Architecture for Open Mechanized Reasoning Systems. In K.U. Schulz and F. Baader (Eds.), Frontiers of Combining Systems, Proceedings of FroCoS'96, Kluwer Series on Applied Logic, 1996.

[15] J. Harrison and L. Théry, A sceptic's approach to combining HOL and Maple, J. of Automated Reasoning, Vol. 21, pp. 279–294, 1998.

[16] S. Jekutsch, Design and Implementation of a Generic Query-Translator for an Integration of Heterogeneous Information Sources in a Mediator Architecture (in German), Diplomarbeit, University of Karlsruhe, 1996.

[17] M. Kerber and M. Kohlhase (Eds.), Proceedings of the 8th Symposium on the Integration of Symbolic Computation and Mechanized Reasoning, St Andrews, August 2000. To appear.

[18] M. Kifer and V.S. Subrahmanian, Theory of Generalized Annotated Logic Programming, Journal of Logic Programming, 12, pp. 335–367, 1992.

[19] H. Kirchner and C. Ringeissen (Eds.), Frontiers of Combining Systems, Proceedings of FroCoS 2000, LNAI Vol. 1794, 2000.

[20] M. Kohlhase, V. Sorge et al., Agent-oriented integration of distributed mathematical services. In Proceedings of FroCoS 2000.

[21] J. Schü, Updates and Query-Processing in a Mediator Architecture, PhD thesis, Shaker, 1996.

[22] G. Weiss (Ed.), Multiagent Systems - A Modern Approach to Distributed Artificial Intelligence, MIT Press, 1999.

[23] G. Wiederhold, Mediators in the Architecture of Future Information Systems, IEEE Computer 25, pp. 38–49, 1992.